CONSTRUCTING A THESAURUS WITH REFERENCE TO AGROVOC
 

                 S. M. Mannan, Suraiya Begum and Minhajuddin Ahmed
 

Sometimes we talk simply for the sake of hearing ourselves talk, and for the same reason we dance, or sing in the bathroom. The activity gives us a pleasant sense of being alive.1 Generally, the noises we thus make are called "noise for noise's sake." But in the age of the information explosion, indexers and information scientists do not make such noises when they try to establish relationships between two concepts or subjects. Rather, they are very much interested in seeing how one word becomes inevitable for another.2 The indexer depends on logic, since establishing a relationship between two terms is a question of judgment. The index is the necessary communication link between information and information users.

There is a current tendency to talk of "Index Language" rather than "Indexing System."

The indexing languages include:

1. Classification,

2. Subject Headings, and
3. Thesaurus.

Classification helps to arrange the subjects in a collection. A subject is the topic treated in a book, video-tape or other work. A subject heading is the word or phrase used in the library catalogue to express this topic.3
 

The word 'thesaurus' derives from Greek and Latin words meaning 'a treasury', and it has been used for several centuries to mean a lexicon or treasury of words. An interesting and entertaining historical account has been given by Karen Sparck Jones, who traces the origins of 'synonymy' in dictionaries and identifies the main difference from natural languages: the thesaurus involves "vocabulary normalisation."4

The modern thesaurus may be said to date from 1852, when the first edition of the Thesaurus of English Words and Phrases was published by Peter Mark Roget.

The thesaurus tries to group words together according to the subject concept. A thesaurus is a "terminological control device used in translating from the natural language of documents, indexers or users into a more constrained system language. It is a controlled and dynamic vocabulary of semantically and generically related terms which cover a specific domain of knowledge."5

The International Organization for Standardization (ISO) defines a thesaurus "as a structured subject of natural language. It (thesaurus) describes the subject content of documents, objects, or collection of data."6

From the above definitions it is clear that a thesaurus is a list of subject headings in which the relationships between the subject headings are clearly stated.

A particular thesaurus should accurately reflect the information content of the body of documents or other items in a collection which the thesaurus addresses. It should contain terms and cross references appropriate to the subject document collection and the language and the information needs of users.

The following thesauri have been developed by various international information systems and services for indexing and retrieval purposes:

(a) INIS thesaurus

(b) Thesaurus of Engineering and Scientific Terms (TEST)

(c) MeSH (Medical Subject Headings)

(d) AGROVOC

(e) Macrothesaurus for information processing in the field of economic and social development.

Thesaurus Structure

The internal form of individual entries and the arrangement of the various entries in relation to one another constitute the structure of a thesaurus. Cross references in a thesaurus make explicit the ways in which entries relate to each other in a network of concepts. The terms permitted by a thesaurus for use in indexing are called 'descriptors.' A descriptor describes or conveys the sense of an individual concept, which may be wide or narrow in meaning. The relationship between one term and others in the thesaurus is indicated by the following abbreviations/symbols:

UF: Used for - Indicates the lead-in term, from which reference is made

USE: Use - Indicates the preferred term

NT: Narrower term - Indicates a more specific term, one level lower in hierarchy

BT: Broader term - Indicates a more general term, one level higher in hierarchy

RT: Related term - Indicates a conceptual relationship between terms not related hierarchically

AT: Additional term - Indicates an alternative term for the same concept.

These entries can be further explained by the following example:

HEDGING PLANTS

UF Fences (living)

UF Living Fences

BT Ornamental Plants

NT Carpinus

RT Protective Plants
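
As a rough illustration of how such an entry might be held in software, the sketch below stores the HEDGING PLANTS entry and prints it in the same display layout. The class name `Entry`, its field names and the `use_references` helper are assumptions made for this illustration; they do not come from AGROVOC or from any particular thesaurus software.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entry:
    """One descriptor and its cross references (hypothetical structure)."""

    descriptor: str
    uf: list[str] = field(default_factory=list)  # non-preferred lead-in terms
    bt: list[str] = field(default_factory=list)  # broader terms
    nt: list[str] = field(default_factory=list)  # narrower terms
    rt: list[str] = field(default_factory=list)  # related terms

    def display(self) -> str:
        """Render the entry: descriptor in capitals, then tagged references."""
        lines = [self.descriptor.upper()]
        tagged = (("UF", self.uf), ("BT", self.bt), ("NT", self.nt), ("RT", self.rt))
        for tag, terms in tagged:
            lines.extend(f"{tag} {term}" for term in terms)
        return "\n".join(lines)

    def use_references(self) -> dict[str, str]:
        """Reciprocal USE references: each UF term points back to the descriptor."""
        return {term: self.descriptor for term in self.uf}


hedging = Entry(
    descriptor="Hedging Plants",
    uf=["Fences (living)", "Living Fences"],
    bt=["Ornamental Plants"],
    nt=["Carpinus"],
    rt=["Protective Plants"],
)
print(hedging.display())
```

Deriving the USE references from the UF list, rather than storing them separately, keeps the two sides of the reference from drifting apart; a searcher who looks up 'Living Fences' is then directed to the descriptor.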

The basic elements in a thesaurus are the individual words, terms or phrases, which are often called 'descriptors' or 'keywords.' Some writers use these two as synonyms; others make distinctions of various types. It may be mentioned here that relationships, and the symbols used to express them, have always been an integral part of lists of subject headings. They have taken two simple forms, nearly always called 'See' and 'See also' references. The 'Use' and 'Used for' symbols in a thesaurus perform almost the same functions as the 'See' and 'See also' references of subject headings. It is important to use descriptors, keywords and terms in a standard way.

Definitions and explanations have to be given whenever there is a need to state the precise meaning of a particular term in the particular context of any thesaurus.

The term 'elevation' has several different meanings in technology, and 'Public School' has more or less opposite meanings in the United Kingdom and the United States. To ensure consistency in use, and in order not to mislead searchers, it is necessary to add a 'Scope Note' (SN) immediately under the term, preferably in parentheses, e.g.:

PUBLIC SCHOOL

SN (In the United Kingdom, an independent foundation which does not receive funds from the state.)
   (In the United States, a school established and supported by the state system of education.)

Dagobert Soergel gives good examples of such homographs:

Seal 1 (Marine mammal)
Seal 2 (Documents)
Drill 1 (Instruction)
Drill 2 (Agriculture)
Drill 3 (Fabric)

Not every term needs a scope note, but their presence is of considerable help in using a thesaurus correctly and, indeed, in reaching a correct understanding of the field of knowledge concerned. The descriptors are usually arranged in word-by-word sequence and appear mainly in capital letters.

Relationships

"The UNISIST Guidelines" list a number of types of relation that may occur in a thesaurus.8 The following major types of relationship generally exist between descriptors in a thesaurus.

i) Hierarchical relation: This includes genus/species and thing/type relationships, e.g.

Dog (genus)

Pekinese (species)

Banks (genus)

Deposit Banks (species)

The most widely used symbols representing the generic relationship in an alphabetical thesaurus are:

BT (Broader term)

NT (Narrower term)

For example:

Paintings
BT Graphic Arts

Graphic Arts
NT Paintings
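
The reciprocity in this pair of entries (each BT reference is answered by an NT reference in the other entry) lends itself to a mechanical check. The following is a minimal sketch, assuming the thesaurus is held as a plain dictionary; the function name `missing_reciprocals` and the data layout are hypothetical and chosen only for this illustration.

```python
# Hypothetical layout: descriptor -> {"BT": [...], "NT": [...]}
thesaurus = {
    "Paintings": {"BT": ["Graphic Arts"], "NT": []},
    "Graphic Arts": {"BT": [], "NT": ["Paintings"]},
}

# Each hierarchical relation and the relation expected in the other direction
INVERSE = {"BT": "NT", "NT": "BT"}


def missing_reciprocals(thes):
    """Return (term, relation, target) triples whose inverse reference is absent."""
    problems = []
    for term, relations in thes.items():
        for relation, targets in relations.items():
            for target in targets:
                # Look up the target entry and the list that should name `term`
                back = thes.get(target, {}).get(INVERSE[relation], [])
                if term not in back:
                    problems.append((term, relation, target))
    return problems


# An empty list means every BT reference has its matching NT reference
print(missing_reciprocals(thesaurus))
```

A check of this kind is useful when entries are edited by hand, because a broader term added to one entry is easily forgotten in the other.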

ii) Part-Whole relation: e.g.

France
NT Paris

iii) Genetic relation: i.e. something is the predecessor of another thing, e.g.

Father
RT   Son

iv) Cause and effect relation: e.g.

Cultivation
RT   Plant production

v) COSSCO relation: This refers to the generic-specific relationship as a part of semantic relationships, namely:

Co-ordinate
Super-ordinate
Subordinate
Collateral

The 'COSSCO' relationships may be represented by a 'family tree' type of structure, in which the various steps in the hierarchical sub-division of a class are shown.
An example is:9
[Figure: 'family tree' example of COSSCO relationships]
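
Independently of the example referred to above, the family-tree idea can be sketched in code. The hierarchy below is invented for illustration and is not taken from AGROVOC or from the cited source. The sketch also assumes one common reading of 'collateral', namely terms at the same level of the tree that do not share a parent; the names `PARENT`, `depth` and `cossco` are hypothetical.

```python
# Hypothetical hierarchy: child -> parent (None marks the top term)
PARENT = {
    "Animals": None,
    "Dogs": "Animals",
    "Cats": "Animals",
    "Pekinese": "Dogs",
    "Poodle": "Dogs",
    "Siamese": "Cats",
}


def depth(term):
    """Number of steps from the term up to the top of the tree."""
    steps = 0
    while PARENT[term] is not None:
        term = PARENT[term]
        steps += 1
    return steps


def cossco(term):
    """Group the other terms by their relationship to `term`."""
    parent = PARENT[term]
    same_level = [t for t in PARENT if t != term and depth(t) == depth(term)]
    return {
        "superordinate": [parent] if parent else [],
        "subordinate": sorted(t for t, p in PARENT.items() if p == term),
        # Same level, same parent
        "coordinate": sorted(t for t in same_level if PARENT[t] == parent),
        # Same level, different parent (the assumption stated in the lead-in)
        "collateral": sorted(t for t in same_level if PARENT[t] != parent),
    }


print(cossco("Pekinese"))
```

In this invented tree, 'Pekinese' has 'Dogs' as its super-ordinate term, 'Poodle' as a co-ordinate term and 'Siamese' as a collateral term, and it has no subordinate terms.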

        

vi) Antonymy, i.e. a concept is the opposite of another concept: e.g.

Hardness
RT Softness

         

© Information Science Today All Rights Reserved
            
info@infosciencetoday.org