International Council for Science 
CODATA - Committee on Data for Science and Technology
13th ICSU - CODATA Conference in collaboration with the ICSU-Panel on World Data Centers, Beijing, October 1992
MATERIALS and ENVIRONMENT Databases and Definition Problems: Workshop M and System Presentation


Bridges and a Masterplan for Islands of Data
in a Labyrinth of 
Environmental and Economic Information: 

The HEMIS Design-Proposal as a Subset and Extension of
Retrieval and Information Management Systems 

Heiner Benking
Ulrich Kampffmeyer
Project Consult, Isestr. 63, 2000 Hamburg 13, Germany

Description of a Directory of Directories Information System, a pathfinder and catalytic system approach, nowadays called a "meta-database information system", addressing Awareness and Transparency by referencing and linking repositories and thereby supporting and easing interaction within and between research, organizations, and management issues.
Besides approaches to bridge old and new data of different qualification and original acquisition scope, emphasis is laid on bridging coded and non-coded data-sources, on the use of faceted thesauri to collect items relevant to defined subjects, and finally on new multi-media and hyper-link technologies, which are indispensable for relating entities.
In addition to the requirements of the extremely complex area of environmental research and management, the language barrier poses a further challenge: access to and incorporation of knowledge is inhibited by different meanings, terminologies and national languages.
Features possible only with multi-media optical information systems offer ways of portraying and referencing different repositories or data-sources, which is a prerequisite for meeting the demand for "interaction along and across hierarchical scales" (di Castri, Hadley 1988).
After the description of the international setting of the project, design considerations are covered in greater detail to share experience from past projects and to set directions for future development and application.

 Large (central) databases lead to disaster in understanding 
if the sources, definitions, or meanings of the data are not known 

Jeffers (1978)

If you want a wise answer you have to ask reasonably.
J.W. Goethe

With the availability of new technologies, new approaches to old problems seem possible. Complementing the very important and ambitious "International Oceanographic Data Archaeology and Rescue" project proposal (Levitus 1992), this paper describes a 3-year design and development effort under the auspices of UNEP, which focusses on additional features for advanced filing and retrieval applications.
The task of combining and orienting all relevant data-sources for environmental measurement, like that for environmental research and management, demands the inclusion of heterogeneous data-sources. This was considered possible only when using advanced linking, archiving, and retrieval technologies (Benking 1990a).

Fig 1: The CINCI - Canyon: The technology canyon between coded- and non-coded information. (Kampffmeyer, ONLINE ´92)
Drawing on lessons learned from the UNEP-HEM Project (United Nations Environment Programme - Center for Harmonization of Environmental Measurement), the authors describe ways of bridging conventional and new approaches and present a design concept which was possible only with the experience gained at large economic, press archive, corporate, and bibliographical information systems.

The Task was set by an initiative of the Environmental Experts of the Economic Summit (EEES 1986/1987) and presents the mission of the UNEP-HEM office "To enhance the capabilities and quality of information on the state of the environment world-wide in order to improve the provision to policy-making bodies, international programmes, and the scientific community, of the harmonized information required for the sound management of environmental resources".

Fig 2: HEM Goal, Objectives, Activities, Output
International Expert Group Meetings (EARTHWATCH 1991) proved that no other organization holds data and information in all the areas touched by such a broad task, and that conventional design approaches could neither handle the inhomogeneities nor provide the flexibility needed to meet the requirements (EARTHWATCH in print).

Besides describing the functions, features and benefits of the tabled design proposal, the paper describes the functionality of a preliminary prototype, which was developed using commercially available software in order to prove that a low-cost, broad-distribution concept is feasible.
A set of different access and selection strategies is seen as another critical step for effective and predictive operations.
The matching of complex relations is seen as most critical. Experience still needs to be gained in the management of links in the production environment and also at the workplace of the users.
Given the complexity and extraordinary requirements of this project, alternative ways have been proposed and developed to strategically identify and select items, to provide a basis for further requirements or objectives.
Another obstacle is the lack of translators between computer systems on the application layer. The authors closely follow the developments of EDI and UN/EDIFACT and feel that the proposed system concept is another step towards homogenizing data and information, perhaps for later use in supra-national environments. How far advanced the European computer industry is in specific applications has been presented impressively (Peeters - EDI 1992).

 "The major source of missed items in information retrieval systems comes from the inability of electronic retrieval systems to establish connections between two different formulations of the same concept and between two different descriptions of the same idea."

                            Alain Bonnet (1992)

The important thing about any word is how you understand it
Publilius Syrus, Sententiae, 43 BC

Harmonization versus Standardization
Alternating Strategies to avoid the sectoral data traps through transcendation (Croze 1983)

There is some discussion regarding the term harmonization and its use in the technical and scientific community, besides its conventional use in the political and legal domains. The authors believe that, besides comparability and compatibility within one field, there is a need to bridge within disciplines and technical approaches and to compare, relate, and globalize activities between different organizations.
Harmonization must be considered a bottom-up approach, which crosses boundaries and requires agreement, which is most easily achieved by acceptance and broad usage. It is a true integrative effort and requires much more concern and sensitivity than physical standardization and the quest for compatibility in physical and homogeneous environments. This definition is an extension of the CODATA definition (CODATA 1990). It is suitable for application in conjunction with other disciplines and supports the demand for interdisciplinary interaction along and across hierarchical scales (di Castri, Hadley 1988).

Fig 3: Standardization and Harmonization
The basic differences are found in the objectives and direction of both activities. Some definitions consider a parallelism in one field as harmonization and restrict the definition of standardization accordingly. The main difference might be found in the homogeneity of the topic to be covered. Due to the fact that the term harmonization is primarily used in legal and federal environments, the authors would like to present a broader meaning for general discussion. Harmonization of measurements without description of the specific area of application is the only use so far encountered (EEES 1986, 1987).

For Figure 52 see: LATE FOOTNOTE 1 (1996):
This figure was later used by Gerhard Budin in his habilitation at the TU Vienna and, in this version, has become "standard" in the literature for the description of HARMONIZATION. See: Wissensorganisation und Terminologie - Komplexität und Dynamik wissenschaftlicher Informations- und Kommunikationsprozesse. Tübingen: Gunter Narr Verlag, 1996 (Forum für Fachsprachenforschung Band 28). Please see also my footnote in: show-schau-postscript

This work of bridging started with the UN Harmonization Projects in the late 1980s (see GeoJournal on Harmonization and Access and Assimilation, and the "Bridges and a Masterplan" presentation at ICSU-CODATA 1992). It was used in a "habilitation" at the Institute of Philosophy of Sciences and Social Studies of Sciences at the TU Vienna, and is now an important activity for terminology groups, which are today increasingly concerned with Languages and Cultures, especially in Europe.
See also the sites and work of Prof. Dr. Gerhard Budin.

Heiner Benking
Paradigm for Harmonization of Environmental Information
It is hardly possible to harmonize, structure and standardize the nomenclature, and thereby the data, already existing in environmental sciences and management, given the hundreds of institutions involved. One possibility for harmonization is access via standardized retrieval of information of different sources, structure, and quality. This axiom was confirmed during the production of the compilation (A Survey of Environmental Monitoring and Information Management Programmes of International Organizations 1991). Any attempt to present the inter-relations in this field in linear or hierarchical form was doomed to failure. Even though tabular presentations or listings provide insights into international activities, the match with reality in two dimensions is poor or even misleading, for example when crossing monitoring programmes with problem areas, or organizations with interactions or contributions. For further review of international structures, co-operations, and linkages see (Hansen 1991), (Judge 1992).

Design Scope and Process
The design phase has been supervised by an International Expert Group and informed by consultations with primary users and other interested parties worldwide. The potential user community identified includes the general and specialised UN agencies, development banks, bilateral aid agencies, national governments, and national and international non-governmental organizations (NGOs). The results of these consultations are summarized in Table 1.
  •  A need for information exchange and a communication platform 
  •  Addressing a subject matter scope which uses a broad interpretation of "environment" with a referral system or meta-database containing information about the main category areas of environmental information and programmes, databases, measurement techniques, models, reference materials etc.
  •  A high-level information system, containing generalized information for planning and project management, and for identifying the sources of more detailed data. HEMIS is not designed to contain detailed primary data, but rather information about data. In this sense, HEMIS will be a meta-database and information system
  •  An information system which shall allow access to multivariate information deriving from different partners and projects with a standardized and harmonized method of access
  •  A thesauri-based information system for multilingual access via standardized nomenclature
  •  Local provision of information
  •  Easy to use multiple access strategies
  •  Export and import functionality: import with indexing, export into different report and open formats. Automatic indexing and transformation of information from partner institutes for harmonized access via HEMIS
  •  Low-cost access to HEMIS leads to implementation on personal computers (PCs) and industry standards like Microsoft Windows or IBM OS/2. 

Tab 1: Governing factors and stipulations as set by the International Expert Group Meetings
Functionality and scope depend on acceptance, distribution channels, and co-operation to share the necessary effort, and on the definition of various levels of user communities, besides the requirements of the HEM office. Specific requirements of the organizations involved or interviewed are summarized in the User Requirement Study (EARTHWATCH, in press).

Project Scope
Possible Cooperation Partners
According to a brochure developed for UNEP-HEM, and forming part of this article (HEMIS Design and Development Status Report 1992), all specialized U.N. agencies hold considerable relevant environmental and related data and other meta-data for their fields, in some cases (such as WMO) in highly automated form. Organizations like UNESCO, OECD, WRI, EEA-TF, ESA, and IOC could provide inventories or cooperate in other ways.

Fig 4: Harmonization and Distribution of Information via HEMIS

Fig 5: Start Screen of the Proposed Prototype

Requirement Summary
The value is seen in added retrieval functionality and in access to information which is otherwise only partially or regionally available. The approach of the current design proposals was approved. Geographic and temporal scope were requested. The open design may be enlarged in future for other tasks. The need for information exchange and communication by electronic media was broadly accepted. The wide interpretation of the term "environment" was seen as the only possible way to match with the latest technologies bridging cross-sectoral subjects. The need for a referral and linking system such as HEMIS was confirmed.
The following detailed concept of the system incorporates the major user requirements. Modules for interchange will be provided along with the co-operation agreements under negotiation, as well as automatic indexing and transformation tools for standardized and harmonized input from external sources. The above list of requirements, without factual, political or budget restrictions, provides a proper view of the task to be tackled when addressing all possible demands of the widest range of potential users. The final design has to take this heterogeneity into account, as well as distribution and pricing policies.
Most of the data in HEMIS is directly accessible via the data fields. Other information is added, e.g. for explanatory purposes, help, additional non-indexed data, or non-coded information like scanned images.
Index information may be created manually or generated automatically via an automatic indexing and translation facility. Files from partner institutions can be converted into HEMIS database access information. Guided tours, hyper-links and access information for non-coded data have to be added manually.
HEMIS, therefore, is in principle a system at the UNEP-HEM office for the gathering, harmonization and preparation of an information base, which will be distributed, together with the access information, on digital media to all interested institutions.
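The automatic indexing step mentioned above can be pictured with a small sketch. It is only an illustration under stated assumptions: the controlled vocabulary, the record layout and the function name `index_record` are hypothetical and are not taken from the HEMIS design. The sketch maps the free keywords of a partner record to controlled identifiers and sets aside the terms that still need manual indexing.

```python
# Hypothetical controlled vocabulary: term (lower case) -> unique identifier.
CONTROLLED = {"water quality": 520, "wasserqualität": 520, "air": 100}

def index_record(record):
    """Convert a partner record into access information.

    Returns the keyword IDs that were recognised and the terms that
    could not be matched and therefore need manual indexing.
    """
    ids, unmatched = [], []
    for term in record["keywords"]:
        ident = CONTROLLED.get(term.strip().lower())
        if ident is None:
            unmatched.append(term)      # left for the scientific staff
        elif ident not in ids:
            ids.append(ident)           # store each identifier once
    return {"doc_id": record["id"], "keyword_ids": ids, "unmatched": unmatched}

# A partner record with an English term, a German synonym and an unknown term.
partner = {"id": "P1", "keywords": ["Water quality", "Wasserqualität ", "noise"]}
access_info = index_record(partner)
```

In this sketch both the English and the German term collapse to the same identifier, while the unknown term is reported rather than silently dropped.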

Design Considerations
There is very little chance of harmonizing, structuring, or standardising the nomenclature and data already existing in environmental science, given the hundreds of different institutes that will be producing information for the HEMIS system. The only possibility is to harmonize access to information from different sources, and of different structure and quality, by standardized access methods. Therefore controlled nomenclature, multilingual thesauri, access via selection lists of keywords, and a context-sensitive help function are essential to fulfil the harmonization task in complex and inhomogeneous fields like materials and environment. A preliminary High Level Entity Relationship Model (Fig 6) was developed, and a final design process has been started but not yet completed.

Fig 6: Example of a High Level Entity Relationship Model

HEMIS is an Information System
It is not the aim of UNEP HEM to create another standard meta-database, since meta-database systems already exist in many different fields. HEMIS is a universal information system which will give scientists and administrative staff, as well as interested members of the public, access to environmental information about original data in databases, publications, reports etc. HEMIS will use the information from other databases and meta-databases and integrate it in a uniformly accessible form. The objective is to bring harmonized information about environmental institutes, programmes, databases, etc. to the fingertips of every PC user, and to raise public awareness of what is being done in environmental monitoring world-wide.

Multilingual Access
HEMIS has to be multilingual to allow access to information which may not originally be available in the native language of the user. It is very important to make sure that users learn which work has already been carried out on similar subjects, even when they are not familiar with the terminology or language used in the original documents. The HEMIS THESAURUS (thesaurus of main subject keywords) has to be implemented in such a way that it acts as an electronic translation, guidance and orientation tool.
Distributed Information versus On-line Access
The world is at the threshold of multi-media information technology: information is no longer presented only as data or text, but as image, video, voice, graphic etc. and as combined electronic documents of all these types. The design of a world-wide accessible information system has to take account of such future developments.
An on-line database system is not able to provide the user with huge masses of non-coded information, due to existing telecommunication transfer rates. The system approach is designed from an interactive point of view. Techniques like guided tours and hyperlinks combined with non-coded information (NCI) and coded information (CI) cannot be used on-line with traditional retrieval systems. The development of an on-line enhancement is therefore a task to be viewed along with the developments of the information and communication industry.
The proposed design of HEMIS is based on two system platforms. The first is an internal system at the UNEP HEM office for in-house use, i.e. building up the information basis, integrating data from partner institutes and producing media for external retrieval. The second consists of external retrieve-only stations, which build up a local system based on the distributed media. The two platforms require different software and hardware modules. All modules are based on industry-standard components for flexibility and future increases in capability. The software system is designed in such a way that only the interaction between different modules, the special thesaurus facility, and the transformation modules for the integration of partner data have to be developed individually.

Keywords versus Fulltext
Some database and information systems use fulltext retrieval software (Kampffmeyer 1991a). This allows searching for every word in every combination. However, a fulltext system cannot assure that the information found is indeed the information that was wanted, or that it is complete. At the present stage of technology a fulltext system cannot be used to do the harmonization and transformation task of HEMIS. Harmonization information has to be structured. The best choice is therefore a standard database system based on keywords largely organized in selection thesauri, which allow referencing of different synonyms, homonyms, translations, acronyms etc. A keyword-oriented system with controlled nomenclature assures that users indeed find all the information they are looking for.
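The difference between the two approaches can be shown with a toy comparison. The sketch below is an illustration only: the two document titles, the identifier 1041 and both function names are invented for the example. A plain substring search stands in for fulltext retrieval, and a small synonym table stands in for the controlled nomenclature.

```python
# Two hypothetical documents describing the same subject in different words.
DOCS = {
    "D1": "Survey of acid rain damage in forests",
    "D2": "Acid precipitation measurements 1985-1990",
}

# Controlled nomenclature: synonyms share one identifier, and documents
# are indexed under that identifier when they enter the system.
SYNONYMS = {"acid rain": 1041, "acid precipitation": 1041}
INDEX = {1041: ["D1", "D2"]}

def fulltext(query):
    """Return documents whose text literally contains the query."""
    return sorted(d for d, text in DOCS.items() if query.lower() in text.lower())

def by_keyword(query):
    """Return documents indexed under the identifier of the query term."""
    return INDEX.get(SYNONYMS.get(query.lower()), [])

literal_hits = fulltext("acid rain")
keyword_hits = by_keyword("acid rain")
```

Here the literal search finds only the document that uses the query wording, while the keyword lookup also returns the document described with the synonym.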
PC-Computer as System Platform
Most computers world-wide are able to run MS-DOS. The internal system at UNEP-HEM as well as the external local stations will be based only on standard PC components. The multi-media database will run under a Windows graphical user interface. The only peripheral which is not yet in widespread use is the CD-ROM drive. The overall investments for both system platforms - internal and external - will be very low. This will help to distribute the information world-wide.

HEMIS leads to closer Co-operation
HEMIS can incorporate information from all institutions engaged in environmental science of international, global, or regional significance - private, public, and industrial. The world-wide distribution of HEMIS information via a standardized medium like CD-ROM is of great interest for every institution and will lead to deeper cooperation not only with UNEP HEM, but also with others who deliver information to HEMIS.
The HEMIS information basis may also be of interest to industrial sponsors, who could not only provide data, but also use the system to disseminate the information promoting their environmental activities. Thus HEMIS is not only a harmonization tool in itself, but also an information platform for all people, institutions, companies and administrations with environmental concerns.
HEMIS Software-System Layout
The software is devised in several layers with different tasks. The layers of the retrieval software are shown below.

Fig 7: System Architecture with User Interface, Data Base, Information Retrieval System, and Document Level of the HEMIS retrieval software.

The User Interfaces
HEMIS includes two different types of user interface. Both are based on the graphical SAA standard. One is designed for information gathering, system maintenance and production of the information base at UNEP HEM. The second is the user interface for local access to the information distributed by UNEP-HEM. It is a subset of the user interface for the production and management system, allowing only retrieval and report capabilities. 
Both types of user interface make use of graphic features like icons for calling complex operations etc. All user interfaces can be switched between the available national languages. Every text on the screen is related to a number which refers to a file with text entries in the specific language. Furthermore, the keywords in the selection and multiple selection lists are also switched to the currently operating language.
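The numbered screen texts can be sketched as a simple lookup. This is a minimal illustration, not the HEMIS implementation: the text numbers, the two language tables and the function name `label` are assumptions made for the example, and the fallback to English is an added convenience that the paper does not describe.

```python
# Hypothetical text tables: every screen text is stored under a number,
# with one table (in practice one file) per national language.
TEXTS = {
    "en": {1: "Search", 2: "Report", 3: "Country"},
    "de": {1: "Suche", 2: "Bericht", 3: "Land"},
}

def label(text_id, language, fallback="en"):
    """Return the screen text for text_id in the active language.

    Falls back to the fallback language when no entry exists.
    """
    table = TEXTS.get(language, {})
    if text_id in table:
        return table[text_id]
    return TEXTS[fallback][text_id]

# The interface layout stores only numbers; switching the language
# changes the table that is consulted, not the layout.
screen = [1, 2, 3]
english = [label(i, "en") for i in screen]
german = [label(i, "de") for i in screen]
```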
The Database - an Object-Oriented Reference System
HEMIS will include different databases and an information management system for the media used (object access database). The databases themselves will run on a relational database system available as a standard product. The stored items of information (data set, text file, image etc.) are objects (referred to as documents), which are linked with the descriptors via the unique document identifier.
The complete system is based on a reference model. All selection lists and thesaurus entries are stored with reference to a unique identifying number. The descriptor database contains only the references between the unique identifiers and related object identifiers for all selection lists and thesaurus fields.

Using independent database modules and access strategies speeds up the system and allows very flexible usage. The processes are split: a database search is prepared with the thesaurus (which is in fact a database of its own) and produces a hitlist of descriptor database entries. Only the documents chosen are loaded from the external storage medium. Changes in the thesaurus, for example, have no effect on the data already stored. The database retrieval runs on fast magnetic media and only the documents themselves need to be transferred from the external media to the user desktop. This is important when slow optical storage media are used.
The amount of data to be managed by the descriptor database is very small compared with conventional database systems. The need for storage capacity is relatively small, especially in comparison with fulltext databases, even if some hundreds of thousands of references have to be managed.
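The split between thesaurus, descriptor database and document storage can be sketched in a few lines. The data, the identifiers and the function names `hitlist` and `load` are hypothetical; in particular, combining several keywords by intersection is an assumption of this sketch, since the paper does not specify how search terms are combined.

```python
# Thesaurus (a database of its own): keyword or synonym -> unique identifier.
THESAURUS = {"soil erosion": 1041, "erosion of soil": 1041, "ozone": 2210}

# Descriptor database: only pairs of (unique identifier, document identifier).
DESCRIPTORS = [(1041, "D7"), (1041, "D9"), (2210, "D9"), (2210, "D12")]

# Stand-in for the slow external medium holding the documents themselves.
STORAGE = {"D7": "erosion map", "D9": "field report", "D12": "ozone series"}

def hitlist(keywords):
    """Resolve keywords to identifiers, then intersect the document sets."""
    result = None
    for word in keywords:
        ident = THESAURUS[word.lower()]           # thesaurus as pre-processor
        docs = {doc for key, doc in DESCRIPTORS if key == ident}
        result = docs if result is None else result & docs
    return sorted(result or [])

def load(doc_id, storage):
    """Fetch a document from the external medium only once it is chosen."""
    return storage[doc_id]

hits = hitlist(["soil erosion", "ozone"])
chosen = load(hits[0], STORAGE)
```

Only the final `load` call touches the document storage; building the hitlist works entirely on the small identifier tables.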

The system will support a great range of field types: standard fields for date and time as well as text fields for individual input. Most of the fields will be organized as a structured one-dimensional selection list or as a multi-dimensional thesaurus for the controlled use of nomenclature. The use of a selection list avoids typing errors, and during retrieval it shows the user the available keywords. Not all selection lists must have the structure of a two-, three- or multi-level thesaurus. Selection lists will be used for countries, sectors, biomes, and other purposes. Selection lists are represented in the database only by a few digits, even if a long text is displayed. Only the referenced entries in selection lists or thesauri can be used for automatic translation purposes. The database is also able to handle large free-text fields, which have some of the abilities of full-text database systems. The free-text fields are not optimized for access.
The information retrieval and object access database (IRS) contains the logical and physical address information of the documents. A separate database and media management system is necessary for access to optical media. This Information Retrieval System (IRS) holds only the document IDs and their references to the objects on the external media, together with management information about the media themselves. Such a module is necessary especially to handle information on multiple optical media.
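One way to picture the object access database is as a table from document IDs to a medium and an address on that medium. The sketch below assumes such a table; the disc labels, the paths and the function name `plan_retrieval` are invented for illustration. It groups a list of requested documents by medium, so that each disc needs to be mounted only once.

```python
from collections import defaultdict

# Hypothetical object access table: document ID -> (medium label, address).
OBJECT_ACCESS = {
    "D7": ("CD-1", "/img/0007.tif"),
    "D9": ("CD-2", "/txt/0009.txt"),
    "D12": ("CD-1", "/dat/0012.csv"),
}

def plan_retrieval(doc_ids):
    """Group the requested documents by medium.

    Returns a mapping from medium label to the addresses to read from it.
    """
    plan = defaultdict(list)
    for doc_id in doc_ids:
        medium, address = OBJECT_ACCESS[doc_id]
        plan[medium].append(address)
    return dict(plan)

plan = plan_retrieval(["D7", "D9", "D12"])
```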

The Hyper-Link Database
A second mode of access is the hyper-link technique. The object-oriented approach of the database (see above) allows linking of all kinds of entries (datasets, files, images etc.; referred to as documents) with hyper-links.
There are two different ways to organize the management of hyper-links. One is to store hyper-links in a dedicated database. A set of predefined link types, represented as digits to save storage space, is connected with a list of document identifiers. This feature enables the creation of "guided tours" for unskilled users, leading them from a start document through a series of related documents without starting a new search action.

The other type of link is part of the document itself. Such links become available only when the document is retrieved. If such a document is selected from the hitlist and brought to the screen for display, the user can select the available links from a special menu. The links available are displayed as a selection list which is comparable to a hitlist. A selected link then leads to the document identifier of a linked document. This feature also allows the creation of links between documents which may at first glance have no connection.
Object links stored together with the data component of documents avoid the storage of all links and their relations in a huge database. In the first phase of the HEMIS system, the links are created manually by scientific staff. When data from partner institutes are integrated, the transformation programme will support the staff with proposal lists to create links more easily (link database for guided tours, and hyper-links integrated in the document header).
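The two ways of managing links described above can be sketched side by side. All names and data in this example are hypothetical: the link type codes, the tour name, the document titles and the two functions are assumptions made for illustration only.

```python
# Link types are stored as small integers to save space (hypothetical codes).
LINK_TYPES = {1: "next in tour", 2: "related measurement"}

# Way 1: a dedicated link database: tour name -> (link type, ordered documents).
LINK_DB = {"intro-tour": (1, ["D1", "D7", "D9"])}

# Way 2: links carried in the header of the document itself.
DOCUMENTS = {
    "D1": {"title": "Overview", "links": []},
    "D7": {"title": "Rainfall chemistry", "links": [(2, "D12")]},
    "D9": {"title": "Monitoring sites", "links": []},
    "D12": {"title": "Soil pH records", "links": [(2, "D7")]},
}

def guided_tour(name):
    """Yield document titles in tour order without running a new search."""
    _link_type, doc_ids = LINK_DB[name]
    for doc_id in doc_ids:
        yield DOCUMENTS[doc_id]["title"]

def available_links(doc_id):
    """List the links of a retrieved document as (description, target)."""
    return [(LINK_TYPES[code], target)
            for code, target in DOCUMENTS[doc_id]["links"]]

tour = list(guided_tour("intro-tour"))
menu = available_links("D7")
```

The tour is read from the link database alone, whereas the link menu exists only after the document carrying it has been retrieved.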

Thesauri and Selection Lists
The use of a standardized nomenclature not only has many advantages, it also leads to several problems, such as the definition of keywords and hierarchical structures, point of view, interpretation of terms, spelling, acronyms, etc. For easy use of the system, and to allow standardized access, a thesaurus structure is used for single selection and multiple selection lists conforming to the SAA and Microsoft Windows standards.
The selection list opens if an entry field is activated. The user is then able to make a choice. There is no chance of typing errors and the user gets information only about the content of data available in the active field.
This thesaurus facility can be linear or hierarchical. Linear means that only one entry is selectable; hierarchical means that after a selection has been made, a new selection list opens, displaying entries which are related or subordinated to the chosen subject. The displayed keywords may be underlaid with synonyms, acronyms, explanations, etc. This information is also available via global search.
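The hierarchical behaviour can be sketched with a two-level list of continents and related countries. The entries and the function name `open_list` are assumptions made for this illustration; the sketch shows only how a selection opens the next list.

```python
# Hypothetical two-level selection list: continents with related countries.
HIERARCHY = {
    "Africa": ["Kenya", "Ghana"],
    "Europe": ["Germany", "Switzerland"],
}

def open_list(selection=None):
    """Return the entries to display in the selection list.

    With no selection the top-level list opens. After a selection the
    list of subordinated entries opens; it is empty for a lowest-level entry.
    """
    if selection is None:
        return sorted(HIERARCHY)
    return HIERARCHY.get(selection, [])

top = open_list()
countries = open_list("Europe")
```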

Fig 8: Internal Structure of Keywords and related Information according to the ISO-Standard

Fig 9:  "Slice" Model of the Thesauri
The keywords and their related information in each "slice" point to the same unique identifier (ID). Only the ID is used for retrieval in the database. The thesaurus acts as a pre-processor.
The thesaurus is organized in a network structure, which is only represented as a hierarchical order. This means different entries may occur at different hierarchical levels due to their meaning in different scientific contexts. The hierarchy is mainly used for the visual organization of the keywords, which helps the user to navigate through the data. Due to the network structure, the same keywords may occur several times at different positions in the thesaurus representation.
Every main keyword is related to one unique identifier. Predecessors, successors, explanations, etc. refer to the same identifier in the chosen language. The position in the network is defined through one or more predecessors and a number of following descriptors (successors). Successors and predecessors are used for modelling the ISO-standard relations like broader term, narrower term, crosslink etc. The thesauri are designed for use as a multilingual tool. Different tables point to the same unique identifier (Kampffmeyer 1992a, 1993 in print).
HEMIS will include a number of selection fields with thesauri. An example of a two-dimensional thesaurus is the list of continents with related countries. The main thesaurus, the HEMIS thesaurus, is the four-dimensional subject keyword thesaurus. This thesaurus is based on the INFOTERRA definitions.

The contents and structure of a "language slice" can be adapted to national requirements. This includes the ranking in the hierarchy, predecessors and followers, number and meaning of synonyms etc.
Only the reference between main keywords and their unique identifiers may not be changed. In this way the thesaurus is not only a translated structure but an interpretation which fits the differences between the languages used. The harmonization effect is that users who access information through the thesaurus will be led by the standardized keywords to information which was originally described with other keywords or in another context.
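The role of the fixed identifier across language slices can be illustrated with a short sketch. The slices, the identifiers and the function name `translate` are hypothetical. Each slice keeps its own keywords and synonyms, and translation goes through the shared identifier rather than from word to word.

```python
# Each language slice maps its own keywords and synonyms to shared identifiers.
SLICES = {
    "en": {"forest": 310, "woodland": 310, "water": 520},
    "de": {"Wald": 310, "Forst": 310, "Wasser": 520},
}

# The main descriptor for each (language, identifier) pair is fixed.
MAIN = {
    ("en", 310): "forest", ("de", 310): "Wald",
    ("en", 520): "water", ("de", 520): "Wasser",
}

def translate(word, source, target):
    """Translate a keyword by way of its unique identifier."""
    ident = SLICES[source][word]
    return MAIN[(target, ident)]

english_term = translate("Forst", "de", "en")
german_term = translate("woodland", "en", "de")
```

A synonym entered in one language thus arrives at the main descriptor of the other, which is the harmonization effect described above in miniature.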

This structure allows the different slices of the thesaurus to be developed separately by different partner institutions. UNEP HEM defines the "main descriptors" and the basic structure. The partner institutions then translate this structure into their native languages. It is even possible to use several slices in the same language. The structure, as a network, will also allow the addition of new categories and main descriptors without changing the structure.

In the first stage UNEP HEM will use existing thesaurus definitions already in use in environmental science (e.g. INFOTERRA and others) for the HEMIS main thesaurus of subjects. The HEMIS thesaurus may be enlarged by the partner institutions at deeper hierarchy levels. For example, if UNEP chooses to create a hierarchy with four levels, partner institutions may add a fifth or sixth level in their language slice to give more details on special subjects. Equally, they may approach UNEP HEM with their requirements for more keywords, which will be added by UNEP HEM in further releases. The thesaurus is based on a reference model which even allows the use of other predecessors and followers in the different slices without any loss of information, and it is realized as an independent SQL database. The thesaurus maps entries and selection list items into unique identifiers, which are entered into the descriptor database to ease the creation of hitlists. The thesaurus is realized as a network, so that descriptors need not be organized only hierarchically. Various selection lists may guide the user through the search process. Alternatively, global search functions may be employed. Most interesting is the harmonization effect and repeatability in various languages or terminologies, created by referencing to the main descriptors. Thesauri range from 4-5 digits for subjects and 2-3 digits for geographic entries to simple one-digit multiple selection lists, and act as pre-processors.
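Since the thesaurus is described as an independent SQL database, a small relational sketch may help. It uses Python's standard `sqlite3` module with an in-memory database; the table names, columns, sample rows and the function name `hits` are all assumptions made for this illustration and are not the HEMIS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keyword (id INTEGER, lang TEXT, term TEXT, is_main INTEGER);
CREATE TABLE relation (parent INTEGER, child INTEGER);
CREATE TABLE descriptor (keyword_id INTEGER, doc_id TEXT);
""")
# Terms of several slices point to the same identifier.
conn.executemany("INSERT INTO keyword VALUES (?, ?, ?, ?)", [
    (310, "en", "forest", 1), (310, "en", "woodland", 0),
    (310, "de", "Wald", 1), (311, "en", "rain forest", 1),
])
# Network relation: 311 is a follower (narrower term) of 310.
conn.execute("INSERT INTO relation VALUES (310, 311)")
conn.executemany("INSERT INTO descriptor VALUES (?, ?)",
                 [(310, "D3"), (311, "D8")])

def hits(term, include_narrower=False):
    """Map a term to its identifier, then read the descriptor table."""
    row = conn.execute("SELECT id FROM keyword WHERE term = ?",
                       (term,)).fetchone()
    if row is None:
        return []
    ids = [row[0]]
    if include_narrower:
        ids += [child for (child,) in conn.execute(
            "SELECT child FROM relation WHERE parent = ?", (row[0],))]
    marks = ",".join("?" * len(ids))
    query = ("SELECT doc_id FROM descriptor "
             f"WHERE keyword_id IN ({marks}) ORDER BY doc_id")
    return [doc for (doc,) in conn.execute(query, ids)]

german_hits = hits("Wald")
wide_hits = hits("woodland", include_narrower=True)
```

Adding a deeper hierarchy level in this sketch means inserting new keyword and relation rows; the descriptor rows already stored are not touched.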

Fig 10: Internal Structure of each Keyword and related Information in the Thesaurus
The unique identifier (A) is used for accessing the reference database. Thesaurus data, designed in a multi-dimensional structure and stored in different tables (slices), point to the same unique identifier. The logical structure of a keyword in the thesaurus is defined in the network by one or more predecessors (B) and the ID numbers of the following descriptors (C). The network allows uni- and bi-directional links. The structure is independent of the original hierarchical position of the predecessors and successors. The selection list on a lower hierarchical level is created individually from the entries marked on the higher level and the previous entries which led to the current position in the thesaurus network. The main descriptor (D) is the keyword which will be displayed inside a multiple selection list when the thesaurus is used for retrieval. The position in the hierarchy (E) defines at which level of the hierarchical ordering the keyword was originally situated (in the paper-based standard hierarchical structure; see also the high-level entity relationship model, Fig 6) and where it will be displayed by a tree overview function. The field (F) contains a list of, e.g., synonyms, homonyms, abbreviations, plurals, Latin definitions, chemical formulas, and acronyms. This feature allows one keyword to be connected with all definitions which are not used in the restricted "main thesaurus". These definitions can be integrated without loss of information in this "synonym" field. The harmonization effect of using the thesaurus for access is that the user is also led by the standardized main keywords to information which was originally described with other keywords or in another context. The additional keywords are not displayed in the hierarchical structure of the thesaurus but are retrievable by a global search. The last field (G) holds a text explanation which is displayed as a context-oriented help function. This offers the possibility of giving a detailed explanation of how a keyword is defined and used.
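The fields (A) to (G) can be pictured as one record per keyword. The following is a minimal sketch, assuming a Python dataclass as the container; the class name, field names, and sample entries are hypothetical illustrations and do not reproduce the actual HEMIS table layout.

```python
from dataclasses import dataclass, field

@dataclass
class ThesaurusEntry:
    uid: int                      # (A) unique identifier
    predecessors: list            # (B) uids of preceding descriptors
    successors: list              # (C) uids of following descriptors
    main_descriptor: str          # (D) keyword shown in selection lists
    level: int                    # (E) original position in the hierarchy
    synonyms: list = field(default_factory=list)  # (F) synonyms, acronyms, formulas
    help_text: str = ""           # (G) context-oriented explanation

# Hypothetical sample entries forming a tiny two-level network.
entries = {
    200: ThesaurusEntry(200, [], [2001], "air pollutants", 2),
    2001: ThesaurusEntry(
        2001, [200], [], "sulphur dioxide", 3,
        synonyms=["SO2", "sulfur dioxide"],
        help_text="Gas released mainly by burning sulphur-bearing fuels.",
    ),
}

def find_by_term(entries, term):
    """Return uids whose main descriptor or synonym field matches the term."""
    term = term.lower()
    return [
        e.uid for e in entries.values()
        if term == e.main_descriptor.lower()
        or term in (s.lower() for s in e.synonyms)
    ]

print(find_by_term(entries, "so2"))
```

A term that appears only in the synonym field still resolves to the identifier of the main descriptor, so nothing outside the restricted main thesaurus is lost.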

Retrieval with Thesaurus and Global Search
When the user starts a search operation in the HEMIS system, he may choose between two possibilities for primary access: the thesaurus and the global search.
The user may open the thesaurus window on its first level by mouse or keyboard action. He may choose one or more keywords from the displayed list. A new window then opens, displaying the next level of the hierarchy. If two or more keywords were chosen on the preceding level, a mixed list is generated that contains all keywords belonging to any of the chosen primary keywords. From the second level he may again choose one or more of the displayed entries, which leads to a third selection list, or he may start the retrieval process. Starting a retrieval is allowed only from the second level or below, to avoid excessively long hitlists. From the second level on, he may also use the global search field for refinement, which is then applied in "AND" mode only. If more than one keyword is chosen, the user may indicate whether he wants to search in "AND" or "OR" mode.
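The mixed selection list and the "AND"/"OR" modes can be sketched as follows. The keyword network, the descriptor index, and the function names are hypothetical; the sketch only illustrates the set logic and makes no claim about how HEMIS implements it.

```python
# Hypothetical network: keyword -> keywords on the next level.
successors = {
    "water": ["rivers", "groundwater"],
    "pollution": ["groundwater", "heavy metals"],
}

# Hypothetical descriptor index: keyword -> referenced objects.
index = {
    "rivers": {"obj1", "obj2"},
    "groundwater": {"obj2", "obj3"},
    "heavy metals": {"obj3", "obj4"},
}

def next_level(chosen):
    """Build the mixed list: successors of all chosen keywords, no duplicates."""
    mixed = []
    for keyword in chosen:
        for successor in successors.get(keyword, []):
            if successor not in mixed:
                mixed.append(successor)
    return mixed

def retrieve(chosen, mode="AND"):
    """Combine the object sets of the chosen keywords in AND or OR mode."""
    sets = [index.get(keyword, set()) for keyword in chosen]
    if not sets:
        return set()
    if mode == "AND":
        return set.intersection(*sets)
    if mode == "OR":
        return set.union(*sets)
    raise ValueError("mode must be 'AND' or 'OR'")

print(next_level(["water", "pollution"]))
print(sorted(retrieve(["groundwater", "heavy metals"], "AND")))
print(sorted(retrieve(["groundwater", "heavy metals"], "OR")))
```

The "AND" mode narrows the hitlist to objects referenced by every chosen keyword, while the "OR" mode widens it to objects referenced by any of them.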

Fig 11: The Retrieval Process

The global search facility is field-independent. It acts as a QBE (Query By Example) facility. The user types in the keyword he is looking for. A box then opens asking him to specify whether he wants to:

 search only in the main thesaurus in the language used
 search in the synonyms, acronyms, etc. as well
 search in the help texts as well
 search in the main thesauri of another language slice as well, if available
 search in the synonyms, acronyms, etc. of another specified language slice as well, if available

Tab 2: Search Modes

Once an option is chosen, the program indicates whether the retrieval will take more or less time. A global search takes more time than a search in thesaurus mode. The user may also use left-hand truncation and/or wildcards.
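A global search with optional synonym and help-text scopes, wildcards, and left-hand truncation might look like the sketch below. It assumes shell-style patterns (`*`, `?`) as provided by Python's `fnmatch` module; the sample entries and the function name are hypothetical, and the pattern syntax actually used in HEMIS is not specified in this paper.

```python
from fnmatch import fnmatchcase

# Hypothetical thesaurus entries of one language slice.
entries = [
    {"uid": 1, "main": "groundwater", "synonyms": ["aquifer water"],
     "help": "Water held underground in soil or rock."},
    {"uid": 2, "main": "wastewater", "synonyms": ["sewage", "effluent"],
     "help": "Used water discharged from homes or industry."},
    {"uid": 3, "main": "soil", "synonyms": ["topsoil"],
     "help": "Upper layer of earth in which plants grow."},
]

def global_search(pattern, synonyms=False, help_texts=False):
    """Match a wildcard pattern against the fields selected by the search mode."""
    pattern = pattern.lower()
    hits = []
    for entry in entries:
        fields = [entry["main"]]
        if synonyms:
            fields += entry["synonyms"]
        if help_texts:
            fields += entry["help"].split()  # match single words of the help text
        if any(fnmatchcase(value.lower(), pattern) for value in fields):
            hits.append(entry["uid"])
    return hits

print(global_search("*water"))               # left-hand truncation
print(global_search("sew*"))                 # main thesaurus only
print(global_search("sew*", synonyms=True))  # synonyms included
```

Each additional scope adds fields to compare, which is consistent with the remark that the wider search modes take more time.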
Every retrieval leads to a hitlist and the main keyword is displayed. Now the user may choose to select one or more objects (see database structure below) for display or print. He may also refine his request, changing to the thesaurus mode (see below). If too many entries are found, the program changes automatically to the refinement mode using the thesaurus.
Creation and Editing of the Information Basis and the Thesaurus Entries
ASCII information delivered as files in a predefined format will be automatically transformed and referenced to the unique identifiers or related fields. This is done by individual transformation programs which are based on the thesauri, selection lists, and field content definitions. For each file format a parameter and transformation file is created once. Another tool is used for the creation and maintenance of the thesaurus and the selection lists themselves.
The thesaurus tool is a creation and editing tool which supports translation of the thesaurus entries. It is an interactive program which, for translation, displays an editing mask with the original entry and asks in a second mask for the corresponding entries. Changes to the existing structure can be made only by UNEP HEM staff. Partners are allowed to add information in specified fields of the main thesauri and to fill in the translation in their language slice.
The thesaurus tool, the selection lists, and the basic HEMIS thesaurus with the main keywords are to be distributed on diskettes for editing and appending purposes.
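A transformation step of this kind can be sketched as follows. The sketch assumes a semicolon-delimited ASCII file and a small dictionary standing in for the parameter file; the column names, lookup tables, and identifier values are hypothetical and serve only to illustrate the idea.

```python
import csv
import io

# Hypothetical lookup tables derived from the thesaurus and selection lists.
subject_ids = {"water quality": 1001, "air pollution": 1002}
country_ids = {"germany": 49, "china": 86}

# Hypothetical parameter definition, created once per delivered file format.
file_format = {
    "delimiter": ";",
    "columns": ["title", "subject", "country"],
    "lookups": {"subject": subject_ids, "country": country_ids},
}

def transform(text, fmt):
    """Convert delivered ASCII records into records that reference unique identifiers."""
    records = []
    reader = csv.reader(io.StringIO(text), delimiter=fmt["delimiter"])
    for row in reader:
        record = dict(zip(fmt["columns"], (value.strip() for value in row)))
        for name, table in fmt["lookups"].items():
            # Unknown terms become None so they can be flagged for manual editing.
            record[name] = table.get(record[name].lower())
        records.append(record)
    return records

delivered = (
    "Rhine survey; Water quality; Germany\n"
    "Beijing smog study; Air pollution; China\n"
)
for record in transform(delivered, file_format):
    print(record)
```

Keeping the format description in data rather than in code mirrors the idea of creating one parameter and transformation file per file format.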

Providing some unconventional new pathways to existing data and perhaps triggering new research and management approaches, the paper deals, besides the previously defined "second generation of environmental information" (Averous 1990), with a third, networked or linked, information generation. Descriptive and contextual information is required for individual numbers in order to understand findings and put them to use, in addition to the recent focus on quality issues (QA/QC) in the environmental management agenda. Studies of the differences in retrieval strategies between leading research- and technology-driven nations show a correlation, for example, between the number of patents registered and the way source and descriptive data are requested and used (Hoetker 1991).
The results envisioned will elicit new information from old data, allow the comparison of old and new data (lineage databases), enable new applications made possible by new realms or comprehensiveness of information, and possibly make better use of human reasoning and association powers. Effective and economic access to the evolving information glut, and a selective approach to it, is the real critical point, especially when qualitative and descriptive information is managed together with the data. Only strategic access and restriction to original data (Benking, Kampffmeyer 1992), not secondary "sources", might help to avoid being drowned by the data and information glut to come. The development of data translators (standardization and harmonization) is a prerequisite for information exchange on the application layer.
The capabilities of commercial off-the-shelf software packages have increased dramatically, especially in the lower price segments, even for complex tasks like filing and communication (Adamik 1992). The main interest naturally stays with the data itself, which can be transferred between packages and applications.
Concepts for exchange and transposition are needed. Inconsistent toolboxes are broadly available, but this only worsens the dilemma. This paper would like to contribute some practical means to the continuing feasibility studies and discussions about the scope and breadth of the envisioned object-oriented reference system.

 Up-to-date technology, available world-wide, based on standard platforms
 Indirect harmonization effect
 Software platform can be used without major changes to internal or external systems
 Easy adaptation and integration of different classifications and nomenclatures
 Multilingual access to multi-media information
 Easy-to-use information system for different user communities: scientific partners, officials in administrations, and unskilled private users interested in environmental science, programs and projects

Tab 3: Highlights of the Design Concept


Abbreviations of Organizations
BMU   Federal Ministry for the Environment (Germany)
CAS   Chinese Academy of Sciences
CEC   Commission of the European Communities
CODATA  Committee on Data for Science and Technology
DARA   German Space Agency (Germany)
EEA-TF   European Environmental Agency- Task Force (CEC)
ESA   European Space Agency 
GEMS   Global Environment Monitoring System (UNEP)
GRID   Global Resource Information Database (UNEP)
HEM   Harmonization of Environmental Measurement (UNEP)
ICSU   International Council of Scientific Unions
IRPTC   International Register of Potentially Toxic Chemicals (UNEP)
IUCN   International Union for the Conservation of Nature and Natural Resources
MARC   Monitoring and Assessment Research Centre (UNEP)
NASA   National Aeronautics and Space Administration (USA)
SRU   German National Environmental Advisors (Germany)
UNEP   U.N. Environment Programme (UNO)
UNESCO  U.N. Educational, Scientific and Cultural Organization (UNO)
WCMC   World Conservation Monitoring Centre (UNEP/IUCN/WWF)
WDC   World Data Centres (ICSU)
WHO   World Health Organization (UNO)
WMO   World Meteorological Organization (UNO)
WWF   World Wide Fund for Nature


benking-budin - please note the change to the article in 2001 after learning about Budin nicely referencing,
using and further elaborating the concept and idea of relating standardization and harmonization in contrast to homogenization.