Planning and Applying Thesauri in Museums (New)

From TELDAP

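Introduction and Basic Principles

Introduction: What Are Thesauri?

Starting from the challenges facing digital archives and libraries…

- Improve retrieval effectiveness in order to handle large volumes of collection information

- Provide a unified search mechanism for collection information held in different media

- Give users a knowledge-based search mechanism

- Support individuals and work teams in building and maintaining information systems

- Support information seeking as part of problem solving, learning, and intellectual work

- Support collaborative work

- Let scholarly communication take place in computer-supported multilateral dialogue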

Image001索引典.gif


Overview of Knowledge Organization Systems

Term Lists

- Authority files

- Glossaries

- Dictionaries

- Gazetteers


Classification and Categories

- Subject headings

- Classification schemes, taxonomies, and categorization schemes


Relationship Lists

- Thesauri

- Semantic networks

- Ontologies


Authority Files

Authority files are lists of terms that are used to control the variant names for an entity or the domain value for a particular field. Examples include names for countries, individuals, and organizations. Nonpreferred terms may be linked to the preferred versions. This type of KOS generally does not include a deep organization or complex structure. The presentation may be alphabetical or organized by a shallow classification scheme. A limited hierarchy may be applied to allow for simple navigation, particularly when the authority file is being accessed manually or is extremely large. Examples of authority files include the Library of Congress Name Authority File and the Getty Geographic Authority File.
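
As a minimal sketch of this idea, an authority file can be modeled as a lookup from variant names to one preferred form. The entries below are illustrative only and are not taken from any real authority file; `AUTHORITY`, `VARIANT_INDEX`, and `preferred_name` are hypothetical names.

```python
# Hypothetical authority file: each preferred name lists its known variants.
AUTHORITY = {
    "United States": ["USA", "U.S.A.", "United States of America"],
    "United Kingdom": ["UK", "U.K."],
}

# Build a case-insensitive index from every variant (and the preferred
# form itself) to the preferred name.
VARIANT_INDEX = {}
for preferred, variants in AUTHORITY.items():
    for name in [preferred, *variants]:
        VARIANT_INDEX[name.casefold()] = preferred


def preferred_name(name):
    """Return the preferred form of a name, or None if it is not controlled."""
    return VARIANT_INDEX.get(name.strip().casefold())


print(preferred_name("usa"))       # United States
print(preferred_name("Atlantis"))  # None
```

A flat dictionary is enough here because, as noted above, this type of KOS generally has no deep organization or complex structure.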

Image002權威檔.jpg
Figure 1: Authority Files 1
Image003權威檔.jpg
Figure 2: Authority Files 2


Glossaries

A glossary is a list of terms, usually with definitions. The terms may be from a specific subject field or from a particular work. The terms are defined within a specific environment and rarely include variant meanings. Examples include the Environmental Protection Agency (EPA) Terms of the Environment.

Image004詞彙表.jpg
Figure 3: Glossaries


Dictionaries

A dictionary is a listing of words and phrases giving information such as spelling, morphology and part of speech, senses, definitions, usage, origin, and equivalents in other languages (bi- or multilingual dictionary).

Image005字典.jpg
Figure 4: Dictionaries 1
Image006字典.jpg
Figure 5: Dictionaries 2


Gazetteers

A gazetteer is a list of place names. Traditional gazetteers have been published as books or have appeared as indexes to atlases. Each entry may also be identified by feature type, such as river, city, or school. An example is the U.S. Code of Geographic Names. Geospatially referenced gazetteers provide coordinates for locating the place on the earth's surface.

Image007地名詞典.jpg
Figure 6: Gazetteers


Subject Headings

This scheme type provides a set of controlled terms to represent the subjects of items in a collection. Subject heading lists can be extensive and cover a broad range of subjects; however, the subject heading list's structure is generally very shallow, with a limited hierarchical structure.

Image008主題標.jpg
Figure 7: Subject Headings 1
Image009主題標.jpg
Figure 8: Subject Headings 2
Image010主題標.jpg
Figure 9: Subject Headings 3


Classification Schemes

- Classification Schemes, Taxonomies, and Categorization Schemes

These terms are often used interchangeably. Although there may be subtle differences from example to example, these types of KOSs all provide ways to separate entities into "buckets" or broad topic levels. Some examples provide a hierarchical arrangement of numeric or alphabetic notation to represent broad topics. These types of KOSs may not follow the rules for hierarchy required in the ANSI NISO Thesaurus Standard (Z39.19) (NISO 1998), and they lack the explicit relationships presented in a thesaurus. Examples of classification schemes include the Library of Congress Classification Schedules (an open, expandable system), the Dewey Decimal Classification (a closed system of 10 numeric sections with decimal extensions), and the Universal Decimal Classification (based on Dewey but extended to include facets, or particular aspects of a topic).

Subject categories are often used to group thesaurus terms in broad topic sets that lie outside the hierarchical scheme of the thesaurus. Taxonomies are increasingly being used in object-oriented design and knowledge management systems to indicate any grouping of objects based on a particular characteristic. (The science of naming things is called taxonomy).

Image011分類表.jpg
Figure 10: Classification Schemes
Image012分類表.jpg
Figure 11: Taxonomies 1
Image013分類表.jpg
Figure 12: Taxonomies 2


Thesauri

1. A thesaurus is a structure that manages the complexities of terminology and provides conceptual relationships, ideally through an embedded classification/ontology.

2. A thesaurus may specify descriptors authorized for indexing and searching. These descriptors form a controlled vocabulary (authority list, index language).

3. A monolingual thesaurus has terms from one language, a multilingual thesaurus from two or more languages.

Image014索引典.jpg
Figure 13: Thesauri 1
Image015索引典.jpg
Figure 14: Thesauri 2
Image016索引典.jpg
Figure 15: Thesauri 3


Semantic Networks

With the advent of natural language processing, there have been significant developments in semantic networks. These KOSs structure concepts and terms not as hierarchies but as a network or a web. Concepts are thought of as nodes, and relationships branch out from them. The relationships generally go beyond the standard BT, NT, and RT. They may include specific whole-part, cause-effect, or parent-child relationships. The most noted semantic network is Princeton University's WordNet, which is now used in a variety of search engines.

Image017語意網路.jpg
Figure 16: Semantic Network 1
Image018語意網路.jpg
Figure 17: Semantic Network 2


Ontologies

Ontology is the newest label to be attached to some knowledge organization systems. The knowledge-management community is developing ontologies as specific concept models. They can represent complex relationships among objects, and include the rules and axioms missing from semantic networks. Ontologies that describe knowledge in a specific area are often connected with systems for data mining and knowledge management.


The Relationship Between Thesauri and Metadata

Metadata

Image019Metadata.gif


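Functions of a Thesaurus: The Digital Archive Environment as an Example

1. Support learning and the assimilation of information

2. Help researchers and practitioners clarify their problems

3. Support information retrieval

- Provide knowledge-based retrieval support for users

- Support the presentation of information

- Serve as an indexing tool

- Facilitate combining multiple databases, or querying multiple databases through one interface

- Support document processing after retrieval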


Basic Principles of Thesauri

1. General (types of concepts)

- Concrete entities

* Things and their physical parts

* Materials

- Abstract entities

* Actions and events

* Abstract entities, and properties of things, materials or actions

* Disciplines or sciences

* Units of measurement

- Individual entities

2. Forms of terms

- Nouns and noun phrases

* Adjectival phrases

* Prepositional phrases

- Adjectives

- Adverbs

- Verbs

- Abbreviations

3. Homographs and polysemes

4. Choice of terms

- Spelling

- Loan words and translations of loan words

- Transliteration

- Slang terms and jargon

- Common names and trade names

- Popular names and scientific names

- Place names

- Proper names of institutions and persons

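5. Scope notes and definitions

6. Compound terms: basic concepts

- Definition

- Principles

- Considerations

- Characteristics

- Criteria for retaining compound terms

- Criteria for not retaining compound terms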

7. Definition of compound terms

- A compound term is a kind of multiword term

- It combines two or more words to express a single lexical unit

8. Principles for compound terms

- A compound term must be able to express a single concept (a unit of thought) within a hierarchical or tree structure

- Examples

* children and television

* adopted children

* educational television

9. Considerations for compound terms

- Literary warrant

- Managing the size of the thesaurus vocabulary

- Print versus computer-based systems

* Precoordinated

* Postcoordinated

- Avoiding false drops in retrieval

* library science

* science library
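
The false-drop problem can be illustrated with a small sketch. It assumes a toy document collection; the documents and the names `DOCS`, `postcoordinate_search`, and `phrase_search` are made up for illustration. When a compound term is split into single words that are combined only at search time, a document about a science library is retrieved for a query on library science.

```python
# Toy collection: document id -> indexing phrase (illustrative data only).
DOCS = {
    1: "library science",  # a document about library science
    2: "science library",  # a document about a science library
}


def postcoordinate_search(query):
    """Match documents that contain every query word, in any order."""
    words = set(query.split())
    return sorted(doc_id for doc_id, text in DOCS.items()
                  if words <= set(text.split()))


def phrase_search(query):
    """Match documents indexed with the compound term as a single unit."""
    return sorted(doc_id for doc_id, text in DOCS.items() if text == query)


print(postcoordinate_search("library science"))  # [1, 2]; document 2 is a false drop
print(phrase_search("library science"))          # [1]
```

Keeping the compound term as one precoordinated unit avoids the false drop, at the cost of a larger vocabulary.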

10. Characteristics of compound terms

- Focus (head noun)

* Identifies the broader class to which the term belongs

- Difference (modifier)

* Narrows the focus to a subclass

- Examples

* Focus: concrete; compound term: reinforced concrete

* Focus: glass; compound term: stained glass

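11. Criteria for retaining compound terms

- Splitting the term would cause ambiguity or loss of meaning

* plant food, rose windows

- The components are ambiguous when they stand alone

* composite drawings, first aid

- The modifier has lost its original meaning

* trade winds

- The modifier leads to a different meaning

* butterfly valves, tree structure

- The modifier does not define a subclass of the focus

* rubber ducks

- The term is already part of a proper name

* United Nations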

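12. Criteria for not retaining compound terms

- The focus is a property or a part of an object; if the term must depend on it, the compound term is still used

* office management = offices [object] + management [action]

* printed textiles (printed [action] + textiles [object])

- An action that modifies an object; if the action and the object depend on each other, the compound term is still used

* bird migration = birds [agent] + migration [action]

* dancing shoes (dancing [action] + shoes [object])

13. Basic relationships in a thesaurus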

13.索引典的基本關聯屬性.jpg


Equivalence Relationships

1. Reference codes

- USE

- UF

2. Two types of terms are covered

- Synonyms

* Terms of different linguistic origin

* Popular names and scientific names

* Common nouns and trade names

* Variant names for emergent concepts

* Current or favoured terms vs. outdated or deprecated terms

* Variant spellings, including stem variants and irregular plurals

* Terms originating from different cultures sharing a common language

* Abbreviations and full names

* The factored and unfactored form of a compound term

- Quasi-synonyms

Image020二類型的詞.gif

3. USE

4. UF

- Aves USE birds; birds UF Aves

- outline USE shape; shape UF outline
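
Because USE and UF are reciprocal, only one direction needs to be recorded. The sketch below assumes the USE references are stored as a plain dictionary and derives the UF entries from them; `USE`, `build_uf`, and `resolve` are hypothetical names.

```python
from collections import defaultdict

# USE references from the examples above: non-preferred term -> preferred term.
USE = {
    "Aves": "birds",
    "outline": "shape",
}


def build_uf(use_refs):
    """Derive the reciprocal UF (used for) entries from USE references."""
    uf = defaultdict(list)
    for non_preferred, preferred in use_refs.items():
        uf[preferred].append(non_preferred)
    return {term: sorted(variants) for term, variants in uf.items()}


def resolve(term, use_refs):
    """Map a search term to its preferred term; preferred terms map to themselves."""
    return use_refs.get(term, term)


UF = build_uf(USE)
print(UF)                    # {'birds': ['Aves'], 'shape': ['outline']}
print(resolve("Aves", USE))  # birds
```

Deriving one direction from the other keeps the two sets of references from drifting apart during maintenance.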

5. Types of equivalence relationship (synonyms)

a) terms of different linguistic origin

Examples:

cats / felines

freedom / liberty

sodium / natrium

sweat / perspiration

b) popular terms and scientific names

Examples:

aspirin / acetylsalicylic acid

gulls / Laridae

salt / sodium chloride

c) generic nouns and trade names

Examples:

petroleum jelly / Vaseline

photocopies / Xeroxes

refrigerators / Frigidaires

tissues / Kleenex

d) variant names for emergent concepts

Examples:

hovercraft / air cushion vehicles

e) current or favored terms versus outdated or deprecated terms

Examples:

poliomyelitis / infantile paralysis

developing countries / underdeveloped countries

f) common nouns and slang or jargon terms

Examples:

helicopters / whirlybirds

psychiatrists / shrinks

g) dialectal variants

Examples:

elevators / lifts

subways / undergrounds

6. Types of equivalence relationship (quasi-synonyms)

Examples:

wetness / dryness

smoothness / roughness

- Generic posting

Examples:

waxes UF plant waxes; plant waxes USE waxes

furniture UF beds; beds USE furniture

furniture UF chairs; chairs USE furniture

furniture UF desks; desks USE furniture

furniture UF tables; tables USE furniture


1.3.5 Hierarchical Relationships

1. Reference codes

- BT

- NT

Image021 參照符號.gif

2. Four types are covered

- Generic relationship

- Whole-part relationship

* Systems and organs of the body

* Geographical locations

* Disciplines or fields of discourse

* Hierarchical social structures

- Instance relationship

- Polyhierarchical relationship

3. Reference codes

- BT (Broader Term)

- NT (Narrower Term)

4. Example 1

mammals BT vertebrates; vertebrates NT mammals

5. Example 2

- anatomy vs. central nervous system

- central nervous system vs. brain

6. Generic relationship

Image022屬種關係.gif

7. Codes

BTG = Broader term (generic)

NTG = Narrower term (generic)

Example:

rats BTG rodents; rodents NTG rats
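
The reciprocity of broader and narrower terms can be sketched as follows, assuming only the broader-term links are stored and the narrower-term entries are derived from them. The names `BT`, `narrower_terms`, and `broader_chain` are hypothetical, and the link from rodents to mammals is an assumption added here to make the chain longer.

```python
# Broader-term links: narrower term -> broader term.
# "rats -> rodents" and "mammals -> vertebrates" come from the examples above;
# "rodents -> mammals" is an assumption added for this sketch.
BT = {
    "rats": "rodents",
    "rodents": "mammals",
    "mammals": "vertebrates",
}


def narrower_terms(term):
    """Derive the reciprocal NT entries for a term from the BT links."""
    return sorted(nt for nt, bt in BT.items() if bt == term)


def broader_chain(term):
    """Follow BT links upward and return every broader term, nearest first."""
    chain = []
    while term in BT:
        term = BT[term]
        chain.append(term)
    return chain


print(narrower_terms("rodents"))  # ['rats']
print(broader_chain("rats"))      # ['rodents', 'mammals', 'vertebrates']
```

This sketch gives each term a single broader term; a polyhierarchical thesaurus would need a list of broader terms per entry.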

8. Whole-part relationship

- Systems and organs of the body

Example:

nervous system
  central nervous system
    brain
    spinal cord

- Geographic locations

Example:

Canada
  Ontario
    Ottawa
    Toronto

- Disciplines or fields of discourse

Example:

science
  biology
    botany
    zoology

- Hierarchical organizational, corporate, social, or political structures

Example:

countries
  states/provinces
    cities

9. Codes for the whole-part relationship

- BTP = Broader term (partitive)

- NTP = Narrower term (partitive)

Example:

central nervous system BTP nervous system; nervous system NTP central nervous system

10. Instance relationship

Example:

mountain regions (class)
  Alps (instance)
  Himalayas (instance)

state capitals (class)
  Albany (instance)
  Trenton (instance)

11. Codes for the instance relationship

- BTI = Broader term (instance)

- NTI = Narrower term (instance)

Example:

Fairy tales
  NTI Cinderella

12. Polyhierarchical relationships

Image023多層級關係.gif

13. Node labels in polyhierarchical relationships

Example:

Cars
  by purpose
    racing cars
    sports cars


1.3.6 Associative Relationships

1. Reference codes

- RT

2. Two types are covered

- Terms in the same category

- Terms in different categories

* A discipline or field of study and the objects or phenomena studied

* An operation or process and its agent or instrument

* An action and the product of the action

* An action and its patient

* Concepts related to their properties

* Concepts related to their origins

* Concepts linked by causal dependence

* A thing and its counter agent

* A concept and its unit of measurement

* Syncategorematic phrases and their embedded nouns

Image024二類型.gif

3. Reference codes

- RT (related term)

Example:

cells RT cytology; cytology RT cells

Example (same category):

boats
  BT vehicles
  RT ships

ships
  BT vehicles
  RT boats

Image024參照符號.gif
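
Since RT links are symmetric, a thesaurus file can be checked for missing reciprocal entries. The following is a sketch only; it assumes relationships are stored as (term, code, target) triples, and `RELATIONS` and `missing_reciprocals` are hypothetical names.

```python
# Relationship records as (term, code, target) triples, from the examples above.
RELATIONS = [
    ("cells", "RT", "cytology"),
    ("cytology", "RT", "cells"),
    ("boats", "RT", "ships"),
    ("ships", "RT", "boats"),
]


def missing_reciprocals(relations):
    """Return the reverse RT links that are absent from the records."""
    rt = {(a, b) for a, code, b in relations if code == "RT"}
    return sorted((b, a) for a, b in rt if (b, a) not in rt)


print(missing_reciprocals(RELATIONS))  # []

# Adding a one-way link exposes the missing reverse entry.
extended = RELATIONS + [("mules", "RT", "donkeys")]
print(missing_reciprocals(extended))   # [('donkeys', 'mules')]
```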

4. Associative relationships within the same category (derivational relationships)

Image025相同範疇的聯想關係.gif

Example (alphabetical display):

donkeys
  BT equines
  RT mules

equines
  NT donkeys
  NT horses
  NT mules

horses
  BT equines
  RT mules

mules
  BT equines
  RT donkeys
  RT horses
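
An alphabetical display like the equines example can be generated from stored relationships. The sketch below assumes each term maps to a list of (code, target) pairs; `THESAURUS`, `CODE_ORDER`, and `alphabetical_display` are hypothetical names, and the BT, NT, RT ordering within an entry is an assumption of this sketch.

```python
# Relationships for the equines example: term -> list of (code, target).
THESAURUS = {
    "equines": [("NT", "donkeys"), ("NT", "horses"), ("NT", "mules")],
    "donkeys": [("BT", "equines"), ("RT", "mules")],
    "horses": [("BT", "equines"), ("RT", "mules")],
    "mules": [("BT", "equines"), ("RT", "donkeys"), ("RT", "horses")],
}

# Order of codes within one entry (an assumption of this sketch).
CODE_ORDER = {"BT": 0, "NT": 1, "RT": 2}


def alphabetical_display(thesaurus):
    """Render every term in alphabetical order with its sorted relationships."""
    lines = []
    for term in sorted(thesaurus):
        lines.append(term)
        relations = sorted(thesaurus[term],
                           key=lambda rel: (CODE_ORDER[rel[0]], rel[1]))
        for code, target in relations:
            lines.append(f"  {code} {target}")
    return "\n".join(lines)


print(alphabetical_display(THESAURUS))
```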

5. Associative relationships across different categories

- Disciplines or fields of study and the objects or phenomena studied, or the discipline's practitioners

Example:

mathematics RT mathematicians; mathematicians RT mathematics

neurology RT nervous system; nervous system RT neurology

botany RT plants; plants RT botany

- Operations or processes and their agents or instruments

Example:

temperature control RT thermostats; thermostats RT temperature control

hunters RT hunting; hunting RT hunters

- An action and its product

Example:

scientific research
  RT scientific inventions

publishing
  RT music scores

- An action and its patient

Example:

data analysis
  RT data

teaching
  RT students

- Concepts related to their properties

Example:

liquids
  RT surface tension

women
  RT femininity

- Concepts related to their origins

Example:

water
  RT water wells

information
  RT information sources

- Concepts linked by causal dependence

Example:

injury
  RT accidents

infections
  RT pathogens

- A thing or action and its counter agent

Example:

pests
  RT pesticides

corrosion
  RT corrosion inhibitors

- A raw material and its product

Example:

aggregates (for mixing with cement)
  RT concrete

hides
  RT leather goods

- An action and a property associated with it

Example:

precision measurement
  RT accuracy

- A concept and its opposite

Example:

single people
  RT married people

tolerance
  RT prejudice

6. Node labels in polyhierarchical relationships

Example:

books
  RT
    binding
    printing


1.3.7 Structure of a Thesaurus

Concept-term relationships

Two main principles of conceptual structure

- Semantic and facet analysis

- Hierarchy

Application Examples

Linking to Full Text

Linking Sequence No. to Bio-sequence Databanks
Image027Linking to Full Text.jpg
Figure 18: Linking Sequence No. to Bio-sequence Databanks 1
Image028Linking to Full Text.jpg
Figure 19: Linking Sequence No. to Bio-sequence Databanks 2
Image029Linking to Full Text.jpg
Figure 20: Linking Sequence No. to Bio-sequence Databanks 3
Image030Linking to Full Text.jpg
Figure 21: Linking Sequence No. to Bio-sequence Databanks 4
Image031Linking to Full Text.jpg
Figure 22: Linking Sequence No. to Bio-sequence Databanks 5
Image032Linking to Full Text.jpg
Figure 23: Linking Sequence No. to Bio-sequence Databanks 6
Image033Linking to Full Text.jpg
Figure 24: Linking Sequence No. to Bio-sequence Databanks 7
Image034Linking to Full Text.jpg
Figure 25: Linking Sequence No. to Bio-sequence Databanks 8
Image035Linking to Full Text.jpg
Figure 26: Linking Sequence No. to Bio-sequence Databanks 9
Image036Linking to Full Text.jpg
Figure 27: Linking Sequence No. to Bio-sequence Databanks 10
Image037Linking to Full Text.jpg
Figure 28: Linking Sequence No. to Bio-sequence Databanks 11


Linking Individual Industrial Codes to the Full Scheme

Image038the Full Scheme.jpg
Figure 29: Linking Individual Industrial Codes to the Full Scheme 1
Image039the Full Scheme.jpg
Figure 30: Linking Individual Industrial Codes to the Full Scheme 2


Linking to Descriptive Records

Linking Organism Names to Taxonomic Records

Image040Taxonomic Records.jpg
Figure 31: Linking Organism Names to Taxonomic Records 1
Image041Taxonomic Records.jpg
Figure 32: Linking Organism Names to Taxonomic Records 2
Image042Taxonomic Records.jpg
Figure 33: Linking Organism Names to Taxonomic Records 3
Image043Taxonomic Records.jpg
Figure 34: Linking Organism Names to Taxonomic Records 4
Image044Taxonomic Records.jpg
Figure 35: Linking Organism Names to Taxonomic Records 5
Image045Taxonomic Records.jpg
Figure 36: Linking Organism Names to Taxonomic Records 6
Image046Taxonomic Records.jpg
Figure 37: Linking Organism Names to Taxonomic Records 7
Image047Taxonomic Records.jpg
Figure 38: Linking Organism Names to Taxonomic Records 8
Image048Taxonomic Records.jpg
Figure 39: Linking Organism Names to Taxonomic Records 9


Linking Personal Names to Biographical Information

http://authorities.loc.gov

http://catalog.loc.gov/

http://virtua.lib.tku.edu.tw

http://www.lib.ntu.edu.tw/catalog/webpac/webpac.asp


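Planning and Design Principles

Design Prerequisites

1. Avoid duplicating effort

- Adopt an existing thesaurus

- Take an existing thesaurus as the basis and make minor additions, changes, and deletions

- Develop a new thesaurus

2. Decide on the structure and display format of the thesaurus

- Flat

- Hierarchical

3. Development approach

- Committee: a group of subject experts

- Empirical: the required terms are extracted by analyzing the existing literature

- Hybrid

4. Use of computer tools

- Candidate terms and stop lists (e.g., to, the)

- How often each thesaurus term is actually used

- How often each thesaurus term is actually used in queries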
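
As a rough sketch of computer-assisted term selection, the code below counts words and adjacent word pairs in sample texts while skipping anything on a stop list. The stop list, the sample sentences, and the function name `candidate_terms` are all illustrative assumptions; a real project would use a much longer stop list and a larger corpus.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {"to", "the", "a", "an", "of", "and", "in", "for", "is", "are"}


def candidate_terms(texts, min_count=2):
    """Count single words and adjacent word pairs that contain no stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        # Single-word candidates.
        counts.update(w for w in words if w not in STOP_WORDS)
        # Two-word candidates (possible compound terms).
        for first, second in zip(words, words[1:]):
            if first not in STOP_WORDS and second not in STOP_WORDS:
                counts[f"{first} {second}"] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


sample = [
    "The stained glass of the chapel",
    "Restoration of stained glass windows",
]
print(candidate_terms(sample))
```

The output is only a list of candidates with their frequencies; deciding which ones become descriptors remains an editorial task.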

5. Metadata for index terms (Term Records)

- Descriptor, scope note (SN), synonyms, non-displayable variations, NT/BT, RT, category/call no., history note (HN)
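
A term record with the fields listed above could be modeled as a small data class. This is a sketch under stated assumptions: the class name `TermRecord`, its field names, and the `display` layout are illustrative and do not follow any particular standard schema.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One thesaurus term record; field names are illustrative only."""
    descriptor: str
    scope_note: str = ""                                 # SN
    used_for: list = field(default_factory=list)         # UF: synonyms
    hidden_variants: list = field(default_factory=list)  # non-displayable variations
    broader: list = field(default_factory=list)          # BT
    narrower: list = field(default_factory=list)         # NT
    related: list = field(default_factory=list)          # RT
    category: str = ""                                   # category / call no.
    history_note: str = ""                               # HN

    def display(self):
        """Format the record with one tagged line per value."""
        lines = [self.descriptor]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        for code, values in (("UF", self.used_for), ("BT", self.broader),
                             ("NT", self.narrower), ("RT", self.related)):
            lines.extend(f"  {code} {value}" for value in values)
        if self.history_note:
            lines.append(f"  HN {self.history_note}")
        return "\n".join(lines)


record = TermRecord("birds", used_for=["Aves"], broader=["vertebrates"])
print(record.display())
```

The hidden variants are stored but deliberately left out of `display`, matching their description as non-displayable.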

6. Term verification

7. Level of specificity of terms

- Quantity and cost

8. Unassigned descriptors

9. Announcement and deposit of published thesauri


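Planning Principles

1. Scope

- Disciplines and fields

- Core subjects in depth, peripheral subjects in outline

2. Type of material

- Journals in fine detail, books more broadly

3. Volume of material

- Size and growth rate of the collection

- Volume versus cost

4. Users

- Subject specialists vs. the general public

5. Types of questions

- General questions call for broad terms, specific questions for detailed terms

6. Method of combining terms

- Precoordination or postcoordination

7. Level of detail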


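Construction Methods

Thesaurus construction procedure

1. Definition of the subject field

2. Selection of thesaurus characteristics and layout

3. Notification of intent

4. Deductive vs. inductive method

5. Selection of terms

- Terminological sources in standardized form

- Literature scanning

* Manual selection

* Automatic term selection

- Question scanning

- Users' and experts' experience and knowledge

- The compiler's experience and knowledge

6. Recording of terms

7. Finding structure

- Preliminary organization of the subjects covered by the thesaurus

- Analysis and grouping of terms within broad categories

* Analysis using a systematic display

* Analysis using a graphic display

- Editing the systematic display

8. Producing the alphabetical thesaurus from the systematic display

- Producing a conventional alphabetical thesaurus

- Alphabetical display accompanying a classified display

9. Final check with experts

10. Introduction to the thesaurus

11. Editing

12. Testing

13. Production for publication

14. Deposit with a clearinghouse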


Management

Thesaurus Management

1. Maintenance mechanisms and routine operations

2. Modifying the thesaurus

- Amendment of existing terms

- Status of existing terms

* Promoting a non-preferred term to a preferred term

* Moving a preferred term down or up the hierarchy

- Deletion or demotion of existing terms

- Addition of new, or deletion of old, relationships

- Addition of new terms

- Amendment of existing structure

3. Thesaurus management software

4. Physical form of the thesaurus

- Printed

- Electronic thesaurus

* Flat file: has no relationships or structure

* Database structured

* Hypertext
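
To make the database-structured form concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The two-table layout and the names `term`, `relation`, and `related_labels` are assumptions chosen for illustration, not a schema prescribed by any thesaurus standard or software package.

```python
import sqlite3

# In-memory database with one table for terms and one for relationships.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE term (id INTEGER PRIMARY KEY, label TEXT UNIQUE NOT NULL);
    CREATE TABLE relation (
        source INTEGER NOT NULL REFERENCES term(id),
        code   TEXT NOT NULL,  -- e.g. BT, NT, RT, USE, UF
        target INTEGER NOT NULL REFERENCES term(id)
    );
""")
conn.executemany("INSERT INTO term (label) VALUES (?)",
                 [("rats",), ("rodents",)])
conn.execute("""
    INSERT INTO relation (source, code, target)
    SELECT s.id, 'BT', t.id FROM term s, term t
    WHERE s.label = 'rats' AND t.label = 'rodents'
""")


def related_labels(label, code):
    """Return the labels linked from a term by the given relationship code."""
    rows = conn.execute("""
        SELECT t.label FROM relation r
        JOIN term s ON s.id = r.source
        JOIN term t ON t.id = r.target
        WHERE s.label = ? AND r.code = ?
        ORDER BY t.label
    """, (label, code))
    return [row[0] for row in rows]


print(related_labels("rats", "BT"))  # ['rodents']
```

Unlike a flat file, this layout records the relationships themselves, so displays and consistency checks can be produced by queries.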


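Related Issues

1. Multilingual thesauri

2. Reconciling and integrating thesauri

- Reconciliation

* Mapping from a specialized or single-discipline thesaurus to a common or general-purpose one

* AAT→LCSH

* MeSH→LCSH

- Integration

* Macrothesauri vs. microthesauri

  - Incorporating a specialized or single-discipline thesaurus into a common or general-purpose one

* Merging into a single thesaurus: all terms are retained and placed side by side, then broken up and reorganized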


Main References

Soergel, D. (2004). Thesauri and ontologies in digital libraries: tutorial. In ECDL 2004.

ISO 2788-1986(E). Documentation: Guidelines for the establishment and development of monolingual thesauri.

NISO (2003). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-2003.

Aitchison, J., Gilchrist, A. and Bawden, D. (1997). Thesaurus construction and use: a practical manual. London: Aslib.


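Reference Information

Participating R&D unit: Technology R&D Division Project, Metadata Working Group

Provided by: Technology R&D Division Project, Metadata Working Group

Used by: all thematic projects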