Bridges and a Masterplan
for Islands of Data
in a Labyrinth of
Environmental and Economic
Information:
The HEMIS Design-Proposal as a Subset and
Extension of
Retrieval and Information Management Systems
Heiner Benking
Ulrich Kampffmeyer
Project Consult, Isestr. 63, 2000 Hamburg
13, Germany
Abstract
This paper describes a Directory of Directories
information system, a pathfinder and catalytic system approach, nowadays
called a "meta-database information system". It addresses awareness and transparency
by referencing and linking repositories, thereby supporting and easing
interaction within and between research, organizations, and management
issues.
Besides approaches to bridging old and new
data with different qualification and original acquisition scope, emphasis
is placed on bridging coded and non-coded data sources, on the use of faceted thesauri
to collect items relevant to defined subjects, and finally on new multi-media
and hyper-link technologies, which are indispensable for relating entities.
In addition to the requirements of the
extremely complex area of environmental research and management, the language
barrier, which inhibits access to and incorporation of knowledge separated by different
meanings, terminologies and national languages, poses an additional challenge.
Features possible only with multi-media
optical information systems offer ways of portraying and referencing different
repositories or data sources, which is a prerequisite for meeting the demand for
"interaction along and across hierarchical scales" (di Castri, Hadley
1988).
After the description of the international
setting of the project, design considerations are covered in greater detail
to share experience from past projects and to set future development and application
directions.
Large (central) databases
lead to disaster in understanding
if the sources, definitions, or meanings of the data are not known. Jeffers (1978)

If you want a wise answer
you have to ask reasonably.
J.W. Goethe
Introduction
With the availability of new technologies,
new approaches to old problems seem possible. Complementing the very important
and ambitious "International Oceanographic Data Archaeology and
Rescue" project proposal (Levitus 1992), this paper describes a 3-year design and
development effort under the auspices of UNEP, focusing on additional
features for advanced filing and retrieval applications.
The task of combining and orienting all
relevant data sources for environmental measurement, like the similar task
for environmental research and management, demands the inclusion of heterogeneous
data sources. This was considered possible only when using advanced linking,
archiving, and retrieval technologies (Benking 1990a).
Fig 1: The CINCI - Canyon: The technology
canyon between coded- and non-coded information. (Kampffmeyer, ONLINE ´92)
Drawing on lessons learned from the
UNEP-HEM Project (United Nations Environment Programme - Center for Harmonization
of Environmental Measurement), the authors describe ways of bridging conventional
and new approaches and present a design concept which was possible only
with the experience gained from large economic, press archive, corporate,
and bibliographical information systems.
The task was set by an initiative of the
Environmental Experts of the Economic Summit (EEES 1986/1987) and represents
the mission of the UNEP-HEM office: "To enhance the capabilities and quality
of information on the state of the environment world-wide in order to improve
the provision to policy-making bodies, international programmes, and the
scientific community, of the harmonized information required for the sound
management of environmental resources".
Fig 2: HEM Goal, Objectives, Activities,
Output
International Expert Group Meetings (EARTHWATCH
1991) showed that no other organization holds data and information in all
the areas touched by such a broad task, and that conventional design approaches
could not accommodate the inhomogeneities or provide the flexibility needed to meet the
requirements (EARTHWATCH in print).
Besides describing the functions, features
and benefits of the tabled design proposal, the paper describes the functionality
of a preliminary prototype, which was developed using commercially available
software in order to prove that a low-cost, broad-distribution concept
is feasible.
A set of different access and selection
strategies is seen as another critical step for effective and predictive
operations.
The matching of complex relations is seen
as most critical. Experience still needs to be gained in the management
of links in the production environment and also at the work-place of the
users.
Given the complexity and extraordinary
requirements of this project, alternative ways have been proposed
and developed to strategically identify and select items, to provide a
basis for further requirements or objectives.
Another obstacle is the lack of translators
between computer systems on the application layer. The authors follow
closely the developments of EDI and UN/EDIFACT and feel that the system
concept proposed is another step towards homogenizing data and information,
perhaps for later use in supra-national environments. How far advanced
the European computer industry is in specific applications has been
presented impressively (Peeters- EDI 1992).
"The major source
of missed items in information retrieval systems comes from the inability
of electronic retrieval systems to establish connections between two different
formulations of the same concept and between two different descriptions
of the same idea."
Alain Bonnet (1992)

The important thing about
any word is how you understand it.
Publilius Syrus
Sententiae, 43 BC
Harmonization versus Standardization
Alternating Strategies to avoid the sectoral
data traps through transcendation (Croze 1983)
There is some discussion regarding the
term harmonization and its use for the technical and scientific community,
besides the conventional use in political and legal domains. The authors
believe that besides comparability and compatibility within one field there
is a need to bridge disciplines and technical approaches and to compare,
relate, and globalize activities between different organizations.
Harmonization must be considered a bottom-up
approach, which crosses boundaries and requires agreement, which is most
easily achieved by acceptance and broad usage. It is a truly integrative
effort and requires much more concern and sensitivity than physical standardization
and the quest for compatibility in physical and homogeneous environments.
This definition is an extension of the CODATA definition (CODATA 1990). It is
suitable for application in conjunction with other disciplines and supports
the demand for interdisciplinary interaction along and across hierarchical
scales (di Castri, Hadley 1988).
Fig 3: Standardization versus Harmonization
The basic differences are found in the
objectives and direction of both activities. Some definitions consider
a parallelism in one field as harmonization and restrict the definition
of standardization accordingly. The main difference might be found in the
homogeneity of the topic to be covered. Because the term harmonization
is primarily used in legal and federal environments, the authors would
like to present a broader meaning for general discussion. Harmonization
of measurements without description of the specific area of application
is the only use encountered so far (EEES 1986, 1987).
LATE NOTE from February
2001:
The figure "Abbildung
52" was developed by Gerhard Budin after Fig 3 for his habilitation (see
pages 69 and 146) at the TU Vienna, and his version is "standard" for the description
of HARMONIZATION in the literature.
Please see also
footnote 2 in: show-schau-postscript
This work of bridging
started with the UN Harmonization Projects in the late 1980s, see GeoJournal
on Harmonization
and Access
and Assimilation. This piece, "Bridges and a Masterplan", was
used for his "habilitation" at the Institute
of Philosophy of Sciences and Social Studies of Sciences at the TU Vienna,
and the work is now an important activity for terminology groups which are today
increasingly concerned with Languages and Cultures, especially in Europe.
Please visit the sites and work of Prof.
Dr. Gerhard Budin
Paradigm for Harmonization of Environmental
Information
It is hardly possible to harmonize, structure
and standardize the nomenclature, and thereby the data, already existing
in environmental sciences and management, given the hundreds of institutions involved.
One possibility for harmonization is access via standardized retrieval
of information of different sources, structure, and quality. This
axiom was confirmed during the compilation of (A Survey
of Environmental Monitoring and Information Management Programmes of International
Organizations 1991). Any attempt to present the inter-relations in this
field in linear or hierarchical form was doomed to failure; even though
the tabular presentations or listings provide insights into international
activities, the match with reality in two dimensions is poor or even misleading,
for example when crossing monitoring programs with problem areas, or organizations
with interactions or contributions. For further review of international
structures, co-operations, and linkages see (Hansen 1991), (Judge 1992).
Design Scope and Process
The design phase has been supervised by
an International Expert Group, by consultations with primary users, and
other interested parties worldwide. The potential user community identified
includes the general and specialised UN agencies, development banks, bilateral
aid agencies, national governments, and national and international non-governmental
organizations (NGOs). The results of these consultations are summarized
in Table 1.
Project Scope
Possible Cooperation Partners
According to a brochure developed for
UNEP-HEM, and forming part of this article (HEMIS Design and Development
Status Report 1992), all specialized U.N. agencies hold considerable relevant
environmental and related data and other meta-data for their fields, in
some cases (such as WMO) in highly automated form. Organizations like
UNESCO, OECD, WRI, EEA-TF, ESA, and IOC could provide inventories or cooperate
in other ways.
Fig 4: Harmonization and Distribution
of Information via HEMIS
Fig 5: Start Screen of the Proposed Prototype
Requirement Summary
The value is seen in added retrieval functionality
and access to information which is only partially or regionally available.
The approach of the current design proposals was approved. Geographic
and temporal scope were requested. The open design may be extended in future
for other tasks. The need for information exchange and communication by
electronic media was broadly accepted. The wide interpretation of the
term "environment" was seen as the only possible way to match the
latest technologies bridging cross-sectoral subjects. The need for a referral
and linking system such as HEMIS was confirmed.
The following detailed concept of the
system incorporates the major user requirements. Modules for interchange
will be provided along with the co-operation agreements under negotiation,
as well as automatic indexing and transformation tools for standardized
and harmonized input from external sources. The above list of requirements,
without factual, political or budget restrictions, provides a proper view
of the task to be tackled when addressing all possible demands of the widest
range of potential users. The final design has to take this heterogeneity
into account, as well as distribution and pricing policies.
Most of the data in HEMIS is directly
accessible via the data fields. Other information is added, e.g. for explanatory
purposes, help, additional non-indexed data, or non-coded information
like scanned images.
Index information may be manually created
or automatically generated via an automatic indexing and translation facility.
Files from partner institutions can be converted into HEMIS database access
information. Guided tours, hyper-links and access information for non-coded
data have to be added manually.
HEMIS, therefore, is in principle a system
at the UNEP-HEM office for the gathering, harmonization and preparation
of an information base, which will be distributed on digital media to all interested institutions,
together with the access information.
Design Considerations
There is very little chance of harmonizing,
structuring, or standardising the nomenclature and data already existing
in environmental science, given the hundreds of different institutes which
will be producing information for the HEMIS system. The only possibility
is to harmonize access to the information from different sources,
and of different structure and quality, by standardized access methods.
Therefore controlled nomenclature, multilingual thesauri, access via selection
lists of keywords, and a context-sensitive help function are essential
to fulfil the harmonization task in complex and inhomogeneous fields like
materials and environment. A preliminary High Level Entity Relationship
Model (Fig 6) was developed and a final Design Process has been started but
is not yet completed.
Fig 6: Example of a High Level Entity
Relationship Model
HEMIS is an Information System
It is not the aim of UNEP HEM to create
another standard meta database, since meta database systems already exist
in many different fields. HEMIS is a universal information system which
will allow scientists and administrative staff, as well as interested members
of the public, access to environmental information which describes
original data in databases, publications, reports etc. HEMIS will use
the information from other databases and meta databases and integrate it
in a uniformly accessible form. The objective is to bring harmonized information
about environmental institutes, programmes, databases, etc. to the fingertips
of every PC user, and to raise public awareness of what is being done in environmental
monitoring world-wide.
Multilingual Access
HEMIS has to be multilingual to allow
access to information which is probably not originally available in the
native language of the user. It is very important to make sure that users
learn which work has already been carried out on similar subjects, even
when they are not familiar with the terminology or language used in the
original documents. The HEMIS THESAURUS (thesaurus of main subject keywords)
has to be implemented in such a way that it acts as an electronic translation,
guidance and orientation tool.
Distributed Information versus On-line
Access
The world is at the threshold of multi-media
information technology: information is no longer presented only as data
or text, but as image, video, voice, graphic etc. and as combined electronic
documents of all these types. The design of a world-wide accessible information
system has to take account of such future developments.
An on-line database system is not able
to handle huge masses of non-coded information and provide them to the user,
because of existing telecommunication transfer rates. The system approach is
designed from an interactive point of view. Techniques like guided tours
and hyperlinks combined with non-coded information (NCI) and coded information
(CI) cannot be used on-line with traditional retrieval systems. The development
of an on-line enhancement is therefore a task to be viewed along with
the developments of the information and communication industry.
The proposed design of HEMIS is based
on two system platforms. One is an internal system at the UNEP HEM office for
in-house use, i.e. building up the information base, integrating data
from partner institutes and producing media for external retrieval. The
external retrieve-only stations build up a local system based on the distributed
media. The two platforms require different software and hardware modules. All modules
are based on industry-standard components for flexibility and future increases
in capability. The software system is designed in such a way that only
the interaction between different modules, the special thesaurus facility,
and the transformation modules for the integration of partner data have
to be developed individually.
Keywords versus Fulltext
Some database and information systems
use fulltext retrieval software (Kampffmeyer 1991a). This allows searching
for every word in every combination. However, a fulltext system cannot
assure that the information found is indeed the information that
was wanted, or that it is complete. At the present stage of technology
a fulltext system cannot be used to do the harmonization and transformation
task of HEMIS. Harmonization information has to be structured. The best
choice is therefore a standard database system based on keywords, largely
organized in selection thesauri which allow referencing of different
synonyms, homonyms, translations, acronyms etc. A keyword-oriented system
with controlled nomenclature assures that the user indeed finds all the
information he is looking for.
PC-Computer as System Platform
Most computers world-wide are able to
run MS-DOS. The internal system at UNEP-HEM as well as the external local
stations will be based only on standard PC components. The multi-media
database will run under a Windows graphic user interface. The only peripheral
which is not used in large numbers at present is the CD-ROM drive. The
overall investments for both system platforms - internal and external -
will be very low. This will help to distribute the information world-wide.
HEMIS leads to Closer Co-operation
HEMIS can incorporate information from
all institutions engaged in environmental science of international, global,
or regional significance - private, public, and industrial. The distribution
of HEMIS information world-wide, via a standardized medium such as CD-ROM,
is of great interest for every institution and will lead to deeper cooperation
not only with UNEP HEM, but also with others who deliver information
to HEMIS.
The HEMIS information base may also be
of interest to industrial sponsors, who could not only provide data, but
also use the system to disseminate information promoting their environmental
activities. Thus HEMIS is not only a harmonization tool in itself, but
also an information platform for all people, institutions, companies and
administrations with environmental concerns.
HEMIS Software-System Layout
The software is devised in several layers
with different tasks. The layers of the retrieval software are shown below.
Fig 7: System Architecture with User Interface, Data Base, Information Retrieval System, and Document Level of the HEMIS retrieval software.
The User Interfaces
HEMIS includes two different types of
user interface. Both are based on the graphical SAA standard. One is designed
for information gathering, system maintenance and production of the information
base at UNEP HEM. The second is the user interface for local access to
the information distributed by UNEP-HEM. It is a subset of the user interface
for the production and management system, allowing only retrieval and report
capabilities.
Both types of user interface make use
of graphic features like icons for calling complex operations etc. All
user interfaces can be switched between the available national languages.
Every text on the screen is related to a number which refers to a file with
text entries in the specific language. Furthermore, the keywords in the
selection and multiple selection lists are also switched to the currently
operating language.
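The number-to-text mechanism described above can be sketched as follows. This is only a minimal illustration: the catalog contents, the function name ui_text, and the fallback to a default language are assumptions and are not part of the HEMIS design.

```python
# Hypothetical text catalogs: one table per language, keyed by the same text number.
CATALOGS = {
    "en": {1: "Search", 2: "Print", 3: "Help"},
    "de": {1: "Suchen", 2: "Drucken"},
}


def ui_text(text_id: int, language: str, fallback: str = "en") -> str:
    """Return the screen text for a text number in the active language."""
    catalog = CATALOGS.get(language, {})
    if text_id in catalog:
        return catalog[text_id]
    # Assumption: fall back to a default language when a translation is missing.
    return CATALOGS[fallback][text_id]
```

Because the screen layout stores only the numbers, switching the language means switching the catalog; no screen definition has to change.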
The Database - an Object-Oriented Reference
System
HEMIS will include different databases
and an information management system for the media used (object access
database). The databases themselves will be a relational program system
available as a standard product. The stored items of information (data set, text
file, image etc.) are objects (referred to as documents), which are linked
with the descriptors via the unique document identifier.
The complete system is based on a reference
model. All selection lists and thesaurus entries are stored with
reference to a unique identifying number. The descriptor database contains
only the references between the unique identifiers and related object identifiers
for all selection lists and thesaurus fields.
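A descriptor database of this kind can be pictured as a pure reference table. The following sketch assumes integer identifiers and uses hypothetical names (DescriptorIndex, add, documents_for); it illustrates the reference model and is not the HEMIS schema.

```python
from collections import defaultdict
from typing import Dict, Set


class DescriptorIndex:
    """Holds only references: keyword identifier -> document identifiers."""

    def __init__(self) -> None:
        self._docs_by_keyword: Dict[int, Set[int]] = defaultdict(set)

    def add(self, keyword_id: int, document_id: int) -> None:
        """Record that a document is described by a keyword."""
        self._docs_by_keyword[keyword_id].add(document_id)

    def documents_for(self, keyword_id: int) -> Set[int]:
        """Return a copy of the document identifiers for one keyword."""
        return set(self._docs_by_keyword.get(keyword_id, set()))


index = DescriptorIndex()
index.add(keyword_id=4101, document_id=17)
index.add(keyword_id=4101, document_id=23)
index.add(keyword_id=52, document_id=23)
```

No keyword text is stored here, so renaming or translating a keyword in the thesaurus leaves these references untouched.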
Using independent database modules and
access strategies speeds up the system and allows very flexible usage.
The search is split into separate processes: preparing the search with the thesaurus
(which is in fact a database of its own) produces a hitlist of descriptor
database entries, and only the documents chosen are loaded from the external
storage medium. Changes in the thesaurus, for example, have no effect on
the data already stored. The database retrieval runs on fast magnetic media
and only the documents themselves need to be transferred from the external
media to the user desktop. This is important when slow optical storage
media are used.
The amount of data to be managed by the
descriptor database is very small compared with conventional database systems.
The need for storage capacity is relatively small, especially in comparison
with fulltext databases, even if some hundreds of thousands of references
have to be managed.
The system will support a wide range of
field types: standard fields for date and time as well as text fields for
individual input. Most of the fields will be organized as a structured
one-dimensional selection list or as a multi-dimensional thesaurus for
the controlled use of nomenclature. The use of a selection list avoids
typing errors. The selection list is also useful in retrieval, showing the user
the available keywords. Not all selection lists need to have the structure
of a two-, three- or more-level thesaurus. Selection lists will be used for
countries, sectors, biomes, and other purposes. Selection list entries are represented
in the database by only a few digits, even if a long text is displayed.
Only the referenced entries in selection lists or thesauri can be used
for automatic translation purposes. The database is also able to handle
large free-text fields which have some of the abilities of full-text database
systems. The free-text fields are not optimized for access.
The information retrieval and object access
database (IRS) contains the logical and physical address information of
the documents. A separate database and media management system is necessary
for access to optical media. This Information Retrieval System (IRS)
holds only the document IDs and their references to the objects on the
external media, together with management information about the media themselves.
Such a module is necessary especially to handle information on multiple
optical media.
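As a rough illustration of such an object access module, the sketch below maps document IDs to a medium and an address and fetches a document only when it is explicitly requested. The class and field names, the disc labels, and the string returned by load are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Location:
    medium: str  # label of the optical disc (hypothetical)
    path: str    # physical address on that medium (hypothetical)


class ObjectAccess:
    """Maps document IDs to their location on external media."""

    def __init__(self) -> None:
        self._locations: Dict[int, Location] = {}
        self.loads: List[int] = []  # documents actually fetched from slow media

    def register(self, document_id: int, location: Location) -> None:
        self._locations[document_id] = location

    def locate(self, document_id: int) -> Location:
        return self._locations[document_id]

    def load(self, document_id: int) -> str:
        """Stand-in for reading the object from the optical medium."""
        location = self.locate(document_id)
        self.loads.append(document_id)
        return f"{location.medium}:{location.path}"


irs = ObjectAccess()
irs.register(17, Location("CD-01", "/docs/17.img"))
irs.register(23, Location("CD-02", "/docs/23.txt"))
chosen = irs.load(17)  # only the document picked from the hitlist is transferred
```

The hitlist itself can be built without touching the media; in this sketch only the call to load stands for a slow transfer.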
The Hyper-Link Database
A second mode of access is the hyper-link
technique. The object-oriented approach of the database (see above) allows
linking of all kinds of entries (datasets, files, images etc.; referred
to as documents) with hyper-links.
There are two different ways to organize
the management of hyper-links. One is to store hyper-links in a dedicated
database. A set of predefined link types, represented as digits to save
storage space, is connected with a list of document identifiers. This feature
enables the creation of "guided tours" for unskilled users, leading
them from a start document through a series of related documents without
starting a new search action.
The other type of link is part of the
document itself. Only when a document is retrieved are those links related
to the document available. If such a document is selected from the hitlist
and brought to the screen for display, the user can select the available
links from a special menu. The links available are displayed as a selection
list which is comparable to a hitlist. A selected link then leads to the
document identifier of a linked document. This feature also allows creation
of links between documents which may at first glance have no connection.
Object links stored together with the
data component of documents avoid the storage of all links and their relations
in a huge database. In the first phase of the HEMIS system, the links are
created manually by scientific staff. When data from partner
institutes is integrated, the transformation programme will support the staff with proposal
lists so that links can be created more easily (link database for guided tours and hyper-links
integrated in the document header).
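The two ways of holding links can be sketched side by side. In this illustration the link type codes, the record layout, and the function names are assumptions; the paper specifies only that link types are stored as digits together with lists of document identifiers, and that other links travel with the document.

```python
from typing import Dict, List, Tuple

GUIDED_TOUR = 1  # hypothetical link type codes, stored as digits
SEE_ALSO = 2

# Dedicated link database: a link type connected with a list of document IDs.
LINK_RECORDS: List[Tuple[int, List[int]]] = [
    (GUIDED_TOUR, [10, 11, 12]),
    (GUIDED_TOUR, [20, 21]),
]

# Links kept in the document header: document ID -> (link type, target ID).
DOCUMENT_HEADERS: Dict[int, List[Tuple[int, int]]] = {
    10: [(SEE_ALSO, 30)],
    12: [],
}


def guided_tour(start: int) -> List[int]:
    """Return the tour that begins at the start document, if one is stored."""
    for link_type, documents in LINK_RECORDS:
        if link_type == GUIDED_TOUR and documents and documents[0] == start:
            return list(documents)
    return [start]


def embedded_links(document_id: int) -> List[Tuple[int, int]]:
    """Links that become available only once the document is retrieved."""
    return list(DOCUMENT_HEADERS.get(document_id, []))
```

A tour is followed without a new search, while the embedded links of a document are read from its header after it has been loaded.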
Thesauri and Selection Lists
The use of a standardized nomenclature
has many advantages, but it also leads to several problems, such as the
definition of keywords and hierarchical structures, point of view, interpretation
of terms, spelling, acronyms, etc. For easy use of the system, and to allow
standardized access, a thesaurus structure is used for single-selection
and multiple-selection lists following the SAA and Microsoft Windows
standards.
The selection list opens if an entry field
is activated. The user is then able to make a choice. There is no chance
of typing errors and the user gets information only about the content of
data available in the active field.
This thesaurus facility can be linear
or hierarchical. Linear means that only one entry is selectable; hierarchical
means that after a selection has been made, a new selection list opens displaying
entries which are related or subordinate to the chosen subject. The displayed
keywords may be backed by synonyms, acronyms, explanations, etc. This
information is also available via global search.
Fig 8: Internal Structure of Keywords and
related Information according to the ISO-Standard
Fig 9: "Slice" Model of the Thesauri
The keywords and their related information
in each "slice" point to the same unique identifier (ID). Only the ID is
used for retrieval in the database. The thesaurus acts as a pre-processor.
The thesaurus is organized in a network
structure, which is only represented as a hierarchical order. This means
different entries may occur at different hierarchical levels due to their
meaning in different scientific contexts. The hierarchy is mainly used
for the visual organization of the keywords, which helps the user to navigate
through the data. Due to the network structure, the same keywords may
occur several times at different positions in the thesaurus representation.
Every main keyword is related to one unique
identifier. Predecessor, successor, explanations, etc. refer to the same
identifier in the chosen language. The position in the network is defined
through one or more predecessors and a number of following descriptors
(successors). Successors and predecessors are used for modelling the ISO-standard
relations such as broader term, narrower term, cross-link, etc. The thesauri
are designed for use as a multilingual tool. Different tables point to
the same unique identifier (Kampffmeyer 1992a, 1993 in print).
HEMIS will include a number of selection
fields with thesauri. An example of a two-dimensional thesaurus is the
list of continents with related countries. The main thesaurus, the HEMIS
thesaurus, is the four-dimensional subject keyword thesaurus. This thesaurus
is based on the INFOTERRA definitions.
The contents and structure of a "language
slice" can be adapted to the national requirements. This includes the
ranking in the hierarchy, predecessors and followers, number and meaning
of synonyms etc.
Only the reference between main keywords
and their unique identifier is not allowed to be changed. In this way the
thesaurus is not only a translated structure but an interpretation which
fits the differences between the languages used. The harmonization effect
is that when the user uses the thesaurus for access he will be led by the
standardized keywords to information which was originally described with
other keywords or in another context.
This structure allows the different slices
of the thesaurus to be developed separately by different partner institutions.
UNEP HEM defines the "main descriptors" and the basic structure. The partner
institutions then translate this structure into their native languages.
It is even possible to use several slices in the same language. The structure,
as a network, will also allow the addition of new categories and main descriptors
without changing the structure.
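The role of the fixed identifier across language slices can be shown with a small sketch. The slice contents, the identifiers, and the function names below are invented for illustration; only the principle that every slice points to the same unique identifier comes from the design described here.

```python
from typing import Dict, Optional

# Hypothetical language slices: each maps the same fixed identifiers
# to the main descriptor used in that language.
SLICES: Dict[str, Dict[int, str]] = {
    "en": {4101: "water pollution", 4102: "air pollution"},
    "de": {4101: "Wasserverschmutzung", 4102: "Luftverschmutzung"},
}


def keyword_id(term: str, language: str) -> Optional[int]:
    """Find the identifier of a main descriptor in one language slice."""
    wanted = term.casefold()
    for identifier, descriptor in SLICES[language].items():
        if descriptor.casefold() == wanted:
            return identifier
    return None


def translate(term: str, source: str, target: str) -> Optional[str]:
    """Translate a main descriptor by going through the shared identifier."""
    identifier = keyword_id(term, source)
    if identifier is None:
        return None
    return SLICES[target].get(identifier)
```

Since retrieval uses only the identifier, a search started in one slice reaches documents indexed through any other slice.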
In the first stage UNEP HEM will use other
existing thesaurus definitions in use in environmental science (e.g. INFOTERRA
and others) for the HEMIS main thesaurus of subjects. The HEMIS thesaurus
may be enlarged by the partner institutions in deeper hierarchy levels.
For example, if UNEP chooses to create a hierarchy with four levels, partner
institutions may add a fifth or sixth level in their language slice to
give more details on special subjects. Equally, they may approach UNEP
HEM with their requirements for more keywords, which will be added by UNEP
HEM in further releases. The thesaurus is based on a reference model
which even allows use of other predecessors and followers in the different
slices without any loss of information, and it is realized as an independent
SQL database. The thesaurus maps entries and selection list items onto
unique identifiers, which are entered into the descriptor database to
ease the creation of hitlists. The thesaurus is realized as a network,
so that descriptors need not be organized only hierarchically. Various
selection lists may guide the user through the search process. Alternatively,
global search functions may be employed. Most interesting is the harmonization
effect and repeatability in various languages or terminologies, created
by referencing the main keywords. Thesauri range from 4-5 digits for subjects,
through 2-3 digits for geographic terms, to simple one-digit multiple selection lists,
and act as pre-processors.
Fig 10: Internal Structure of each Keyword
and related Information in the Thesaurus
The unique identifier (A) is used for
accessing the reference database. Thesaurus data, designed in a multi-dimensional
structure and held in different tables (slices), point to the same unique identifier.
The logical structure of a keyword in the thesaurus structure is defined
in the network by one or more predecessors (B) and the ID numbers of the following
descriptors (C). The network allows uni- and bi-directional links. The
structure is independent of the original hierarchical
position of the predecessors and successors. The selection list on a lower
hierarchical level is individually created from the entries marked
on the higher level and the previous entries which led to the current position
in the thesaurus network. The main descriptor (D) is the keyword which
will be displayed inside a multiple selection list when the thesaurus is
used for retrieval. The position in the hierarchy (E) defines at which
level of the hierarchical ordering the keyword was originally situated (in
the paper-based standard hierarchical structure - see also high-level entity
relationship model Fig 6) and where it will be displayed using a tree overview
function. The field (F) contains a list of e.g. synonyms, homonyms, abbreviations,
plurals, Latin definitions, chemical formulas, acronyms. This feature allows
one keyword to be connected with all definitions which are not used
in the restricted "main thesaurus". These definitions can be integrated
without loss of information in this "synonym" field. The harmonization
effect of using the thesaurus for access is that the standardized
main keywords also lead the user to other information which was originally described
with other keywords or in another context. The additional keywords are
not displayed in the thesaurus hierarchical structure but are retrievable
by a global search. The last field (G) is used to include a text explanation
which is displayed as a context-oriented help function. This offers the
possibility of giving a detailed explanation of how a keyword is defined and
used.
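The fields (A) to (G) can be summarized in a small record type, together with the way a lower-level selection list could be derived from the successors of the chosen entries. This is a sketch under assumptions: the attribute names, the sample keywords, and the function next_selection_list are hypothetical and do not reproduce the actual table layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class KeywordEntry:
    identifier: int              # (A) unique identifier
    predecessors: List[int]      # (B) one or more predecessors
    successors: List[int]        # (C) IDs of the following descriptors
    main_descriptor: str         # (D) keyword shown in selection lists
    level: int                   # (E) original position in the hierarchy
    synonyms: List[str] = field(default_factory=list)  # (F) synonyms, acronyms, ...
    explanation: str = ""        # (G) context-oriented help text


# Hypothetical network: "water pollution" has two predecessors.
THESAURUS: Dict[int, KeywordEntry] = {
    entry.identifier: entry
    for entry in [
        KeywordEntry(1, [], [3, 4], "water", 1),
        KeywordEntry(2, [], [3], "pollution", 1),
        KeywordEntry(3, [1, 2], [], "water pollution", 2, ["aquatic contamination"]),
        KeywordEntry(4, [1], [], "groundwater", 2),
    ]
}


def next_selection_list(chosen_ids: List[int]) -> List[str]:
    """Mixed list of descriptors that follow the chosen keywords, without repeats."""
    seen = set()
    result = []
    for chosen in chosen_ids:
        for successor in THESAURUS[chosen].successors:
            if successor not in seen:
                seen.add(successor)
                result.append(THESAURUS[successor].main_descriptor)
    return result
```

Because an entry may list several predecessors, the same keyword can be reached along different paths while still being a single record with a single identifier.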
Retrieval with Thesaurus and Global
Search
When the user starts a search operation
in the HEMIS system, he may choose between two possibilities for primary
access:
The user may open the thesaurus window
on its first level by mouse or keyboard action. He is allowed to choose
one or more keywords from the displayed list. A new window opens, displaying
the next level of hierarchy. If two or more keywords were chosen on the
preceding level, a mixed list is generated, displaying
all keywords belonging to all chosen primary keywords. From the second
level he may also choose one or more of the entries displayed, which will
lead to a third selection list, or he may start the retrieval process.
Starting a retrieval action is allowed only from the second or one of the
following levels, to avoid excessively long hitlists. From the second level
on, he may also use the global search field for refinement purposes, which
is then used in an "AND" mode only. If more than one keyword is chosen,
the user may indicate whether he wants to search in an "AND" or "OR" mode.
Fig 11: The Retrieval Process
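The navigation rules described above (a mixed list on the next level, retrieval only from the second level on, and the choice of "AND" or "OR") can be sketched as follows. This is a minimal illustration and not the HEMIS implementation: the keyword tree, the index of object IDs, and the function names are all assumptions made for the example.

```python
# Hypothetical toy data: keyword -> keywords on the next level
CHILDREN = {
    "Water": ["Rivers", "Lakes", "Pollutants"],
    "Air": ["Pollutants", "Climate"],
}
# Hypothetical index: keyword -> set of reference-object IDs
INDEX = {
    "Rivers": {101, 102},
    "Pollutants": {102, 103},
    "Climate": {104},
    "Lakes": {105},
}


def next_level(chosen):
    """Mixed list: all keywords belonging to any chosen keyword, without duplicates."""
    merged = []
    for keyword in chosen:
        for child in CHILDREN.get(keyword, []):
            if child not in merged:
                merged.append(child)
    return merged


def retrieve(chosen, level, mode="AND"):
    """Combine the hit sets of the chosen keywords; refuse retrieval on the first level."""
    if level < 2:
        raise ValueError("retrieval may only start from the second level on")
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    hit_sets = [INDEX.get(k, set()) for k in chosen]
    if not hit_sets:
        return set()
    combine = set.intersection if mode == "AND" else set.union
    return combine(*hit_sets)
```

Choosing "Water" and "Air" on the first level yields one merged second-level list in which "Pollutants" appears only once; the first-level restriction is what keeps hitlists short.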
The global search facility is field independent. It acts as QBE (Query By Example). The user types in the keyword he is looking for. A box opens asking him to specify whether he wants to:
search only in the main thesaurus
in the language used, search as well in the synonyms, acronyms etc.
search in the help texts as well
search in the main thesauri of another language slice as well, if available
search in the synonyms, acronyms etc. of another specified language slice as well, if available
Tab 2: Search Modes
If an option is chosen, the program will
indicate whether the retrieval will take more or less time. Global search
takes more time than a search in the thesaurus mode. The user may also
use left-hand truncation and/or wild cards.
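The search modes of Tab 2, together with truncation and wild cards, can be illustrated by a short sketch. It assumes shell-style patterns (`*` and `?`) as the wild-card syntax, which the paper does not specify; the record layout, the sample records, and the function `global_search` are likewise hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical records: one per keyword and language slice
RECORDS = [
    {"lang": "en", "main": "Mercury", "alternates": ["Hg", "quicksilver"],
     "help": "Heavy metal monitored in water and sediment."},
    {"lang": "de", "main": "Quecksilber", "alternates": ["Hg"],
     "help": "Schwermetall."},
    {"lang": "en", "main": "Air pollution", "alternates": ["smog"],
     "help": "Contamination of the atmosphere."},
]


def global_search(pattern, lang, synonyms=False, help_texts=False,
                  other_lang=None, other_synonyms=False):
    """Return main keywords matching `pattern` in the scopes the user selected.

    `*` and `?` act as wild cards; a leading `*` gives left-hand truncation.
    """
    pat = pattern.lower()
    hits = []
    for rec in RECORDS:
        if rec["lang"] == lang:
            texts = [rec["main"]]              # main thesaurus in the language used
            if synonyms:
                texts += rec["alternates"]     # synonyms, acronyms etc.
            if help_texts:
                texts.append(rec["help"])      # help text, matched as one string
        elif rec["lang"] == other_lang:
            texts = [rec["main"]]              # main thesaurus of another slice
            if other_synonyms:
                texts += rec["alternates"]
        else:
            continue
        if any(fnmatchcase(t.lower(), pat) for t in texts):
            hits.append(rec["main"])
    return hits
```

Each additional scope enlarges the set of strings that must be compared, which is consistent with the remark that the wider search modes take more time.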
Every retrieval leads to a hitlist and
the main keyword is displayed. Now the user may choose to select one or
more objects (see database structure below) for display or print. He may
also refine his request, changing to the thesaurus mode (see below). If
too many entries are found, the program changes automatically to the refinement
mode using the thesaurus.
Creating and Editing the Information Basis
and the Thesaurus Entries
ASCII information delivered as files in
a predefined format will be automatically transformed and referenced to
the unique identifiers or related fields. This is done by individual transformation
programs which are based on the thesauri, selection lists, and field contents
definitions. For each file format a parameter and transformation file is
created once. Another tool is used for the creation and maintenance of
the thesaurus and selection lists themselves.
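The transformation step can be pictured with a small sketch. It assumes a delimited ASCII delivery format and a parameter set that maps column positions to field names; the paper does not describe the actual file formats, so `PARAMS`, `THESAURUS_IDS`, and `transform` are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical parameter set, created once per delivery format
PARAMS = {
    "delimiter": ";",
    "columns": {"title": 0, "keyword": 1, "country": 2},
}
# Hypothetical thesaurus lookup: keyword (lower case) -> unique identifier
THESAURUS_IDS = {"mercury": 2, "air pollution": 7}


def transform(ascii_text, params, thesaurus_ids):
    """Convert delimited ASCII rows into records referenced to thesaurus IDs."""
    records, rejected = [], []
    reader = csv.reader(io.StringIO(ascii_text), delimiter=params["delimiter"])
    for row in reader:
        if not row:
            continue  # skip blank lines
        rec = {name: row[pos].strip() for name, pos in params["columns"].items()}
        uid = thesaurus_ids.get(rec["keyword"].lower())
        if uid is None:
            rejected.append(rec)  # unknown keyword: set aside for manual editing
        else:
            rec["keyword_id"] = uid
            records.append(rec)
    return records, rejected
```

For example, a row "Mercury in sediments;Mercury;DE" would be referenced to the identifier of "Mercury", while a row whose keyword is not in the lookup is set aside rather than silently dropped.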
The thesaurus tool is a creation and editing
program which supports translation of the thesaurus entries. It is
interactive: for translation it displays an editing mask with the original
entry and asks in a second mask for the corresponding entries. Changes
in the existing structure can only be made by UNEP HEM staff. Partners
are allowed to add information in specified fields of the main thesauri
and to fill in the translation in their language slice.
The thesaurus tool, the selection lists,
and the basic HEMIS thesaurus with the main keywords are to be distributed
on diskettes for editing and appending purposes.
SUMMARY
Providing some unconventional new pathways
to existing data and perhaps triggering new research and management approaches,
the paper deals, beside the previously defined "second generation of environmental
information" (Averous 1990), with a third, networked or linked, information
generation. Beside the recent focus on quality issues (QA/QC) in the
environmental management agenda, descriptive and contextual information about
individual numbers is required to understand findings and put them to use. Studies
of the differences in retrieval strategies between leading research-
and technology-driven nations show a correlation, for example, between
the number of patents registered and the way source and descriptive data
are requested and used (Hoetker 1991).
The results envisioned will draw new
information from old data, allow the comparison of old and new data
(lineage databases), enable new applications through new realms or comprehensiveness
of information, and possibly make better use of human reasoning and
association powers. Effective and economic access and a selective approach
to the evolving information glut, especially when managing qualitative
and descriptive information together with the data, is the real critical
point. Only strategic access and restriction to original data (Benking,
Kampffmeyer 1992), not secondary "sources", might help to avoid
drowning in the data and information glut to come. The development
of data translators (standardization and harmonization) is a prerequisite
of information exchange on the application layer.
The capabilities of commercial off-the-shelf
software packages have increased dramatically, especially in the lower
price segments, even for complex tasks like filing and communication (Adamik
1992). The main interest naturally stays with the data itself, which can
be transferred between packages and applications.
Concepts for exchange and transposition
are needed. Inconsistent toolboxes are broadly available, but this only
worsens the dilemma. This paper would like to contribute some practical
means to the continuing feasibility studies and discussions about the scope
and breadth of the envisioned object-oriented reference system.
|
| The most productive and
yielding research is that which pleases the thinker and supports mankind
at the same time.
Christian Doppler
|
Mankind consists of two
fractions: The first expresses itself misleadingly, the second
misinterprets it.
Alexander R. Roda
|
Acknowledgement
The authors want to thank the following
organizations and persons who enabled them to participate and present HEMIS
at the CODATA conference: Dr. Hartmut Keune and Prof. Ian Crain for the invitation
to Beijing, and in particular Prof. Hu Yaruo, Vice Chairman of the CAS
Computing Centre and CODATA Executive Secretary, and Prof. Xu Zhihong,
National Delegate and Chairman of the Local Organising Committee, for their particular
interest and preparations on short notice, and last but not least HEWLETT PACKARD
Asia Pacific for excellent and timely services.
The UNEP-HEM office and its work on improvement and harmonization of environmental
information management has been made possible through the generous support
of the German Federal Ministry for the Environment (BMU) which enabled
UNEP to establish the HEM-programme according to the UNEP Governing Council
decisions. During the implementation phase of HEM several international
and national organizations have supported the ongoing development, in particular
the OECD, UNESCO, EEA-TF, ESA, the Government of Norway, DARA, SRU, and
Hewlett Packard. Last but not least we want to express our gratitude for unswerving
support to the Head of the International Expert Group, Dr. David Clark,
who helped with much more than the final polish of this paper, and in particular
to Prof. Jim Doodge, Prof. Edgar Westrum, and Dr. Ernest Merian for their
interest, patience and advice.
Literature
ACCIS Guide to United Nations Information
Sources on the Environment, United Nations, New York, 1988
ACCIS Directory of United Nations Databases
and Information Services, Advisory Committee for the Co-ordination of
Information Systems, New York, 1990
Adamik, P.: Ein Netz voller Informationen
- NETWORLD - Neue Medien, In: cogito, 30-33, 5-92
A Glossary of Terms relating to Data,
Data Capture, Data Manipulation, and Databases: (Westbrook, J.H.,
Grattige, W. eds.), ICSU-CODATA, Paris
A Survey of Environmental Monitoring and
Information Management Programmes of International Organizations, Second
Edition, UNEP-HEM, Munich, April 1991
Anderla, Georges: Information in 1985 -
A Forecasting Study of Information Needs and Resources, OECD, Paris 1973
Averous, M.: 2nd Generation of Environmental
Information.....; OECD, Paris 1990
Benking, H., Kampffmeyer, U.: Information
about Environmental Information; Discussion Paper and Proposal for the
First UNEP-HEM International Expert Group Meeting in Combination with the
Feasibility Studies and Proposals MeDIS and ISET, hand-out and documents
available through UNEP-HEM, see also (EARTHWATCH 1991), Munich-Hamburg,
February-July 1990
Benking, H., Kampffmeyer, U. B.: Harmonization
of Environmental Meta-Information with a Thesaurus-based Multi-Lingual
Multi-Media Information System, International Space Year (ISY) - Earth
and Space Science Information Systems (ESSIS), Proceedings, February
1992
Benking, H., Kampffmeyer, U. B.: Access
and Assimilation: Pivotal Environmental Information Challenges - Linking,
Archiving, and Exploiting Multi-Lingual and Multi-scale Environmental
Information Repositories, GeoJournal, 26.3, 323-334, Kluwer Academic Publishers,
(March) 1992a
Bestougeff, H., Dubois, J.E.: New Perspectives
in Scientific Complex Data Management; CODATA Bulletin, 22.4, 1990
Bonnet, Alain: Journal of the Association
for Global Strategic Information, March 1992
Castri, F. di, Hadley, M.: Enhancing
the Credibility of Ecology: Interaction along and across Hierarchical Scales,
GeoJournal Trilogy, 17.1, 3-35 (1988)
CIESIN: Pathways of Understanding; and
Building a Database, (ISSC, UNESCO, HDGC, CIESIN), Paris, December 1992
CODATA: Directions for Internationally
Compatible Environmental Data, ICSU, Hemisphere Publishing Corporation,
New York, 1990
Committee on Earth and Environmental Sciences
and FCCSET: The U.S. Global Change Data and Information Management Program;
Washington, September 1972
Cleveland, Harlan: People lead their Leaders
in an Information Society
Croze, H.: The Sectoral Data Trap, sub-title;
IN: Global Monitoring and Biosphere Reserves, Chapter 6 Global and Regional
Monitoring, In: Conservation, Science, and Society; Natural Resources,
and Research XXI, Vol. II, First International Biosphere Congress Minsk/USSR,
UNEP/UNESCO, FAO/IUCN, ISBN 92-3-102254-7, Sept/Oct 1983
Executive Office of the President: Data
Management for Global Change Research Policy Statements, Office of Science
and Technology (1991)
Didsbury Jr., Howard F.: Communications
and the future - Prospects, Promises, and Problems, World Future Society,
Bethesda, Maryland, 1982
Directory of Organizations and Institutes
Active in Environmental Monitoring, First Edition, UNEP-HEM, Munich, 1992
Directory of Global and Regional Data
Sets Supporting Global Change Research; Project Summary; National Geophysical
Data Centre (NOAA/NESDIS/EGC1), Boulder April 1989
EEES Report (Environmental Experts of
the Economic Summit) on Current International Scientific Activities in
Improvement and Harmonization of Techniques and Practices of Environmental
Measurement, GSF-PFU Secretariat, Munich, 1986
EEES Report on Priority Areas for Improvement
and Harmonization ... Final Report including Summit Declarations, GSF-PFU
Secretariat, Munich, June 1987
EARTHWATCH, UNEP-GEMS Report Series 8,
Towards the Design for a Meta-database for Harmonization of Environmental
Measurement, Report of the Expert Group Meeting, UNEP-HEM July 26-27, 1990,
Nairobi, June 1991
EARTHWATCH, UNEP-GEMS Report Series, USER
REQUIREMENTS for the Harmonization of Environmental Measurement Information
System HEMIS, in press
Environmental Data Report, UNEP,
Blackwell Reference, London
Environmental Information Statement, International
Forum for Environmental Information for the Twenty-First Century, Montreal
1991
EYLESS IN GAIA: The State of Environmental
Monitoring; World Resources Institute, Washington D.C., March 1990
Feather, Frank, Rashmi, V., Mayour, N.:
Communications for Global Development - Closing the Information Gap
GEMS Meeting Reports No. 2: UNEP Expert
Meeting on Improvement and Harmonization of Environmental Measurement,
Munich, December 1987
Hansen, P.: International Action and Institutional
Measures - New Approaches, Malente Symposium IX, The ECO-Nomic Revolution
- Challenge and Opportunity for the 21st Century, Dräger Foundation
in Co-operation with UNCED, Malente, November 1991
Harley, William G.: The Mc Bride Commission
Report: Issues and Processes in Global Communication, UNESCO 1976
Hoetker, Glenn P.: Why is Japanese scientific
and technical information so hard to find and use? - Warum sind japanische
naturwissenschaftliche und technische Informationen so schwierig zu finden
und zu benutzen?; In: NfD 43, 76-82 (1992) or Proceedings 82. Annual Conference
of the Special Libraries Association, Texas, June 1991
ICSU: An Agenda of Science for Environment
and Development into the 21st Century, Cambridge University Press, Vienna,
November 1991
Infoterm Series 6, Theoretical and methodological
Problems of Terminology K.G. Saur, Munich 1981
Infoterm Series 7, Terminologies for the
Eighties, K.G. Saur, Munich 1982
Jeffers, J.N.R.: An Introduction to System
Analysis, Arnold, London, 1978
Judge, A.: Visualizing Relationship Networks
- International, Interdisciplinary, Inter-Sectoral; Yearbook of International
Organizations, Encyclopaedia of World Problems and Human Potential, Union
of International Associations, Brussels, January 1992
Kampffmeyer, U.: Deskriptoren versus Volltext,
DGD/LID Tagung Frankfurt, Strategien für Optical Filing - Anwendungen
im Pressearchiv, Frankfurt November, 1991
Kampffmeyer, U.: Von der Datenverarbeitung
zur Integrierten Informationsverarbeitung: Welchen Beitrag kann "Optical
Filing" leisten? Congress V, 10.02, ONLINE ´92, 15. International
Congress Fair for Technical Communications, Hamburg, 10.-14. 2. 1992,
ISBN 3-89077-106-8
Kampffmeyer, U.: Multilinguale Retrieval-
und Informationssysteme: Technik und Beispiele, ONLINE ´93 Congress
IV, C430, Hamburg 1993, in print
Keune, H., Murray, A. B., Benking, H.:
Harmonization of Environmental Measurement, GeoJournal, 23.3, 249-255,
Kluwer Academic Publishers, March 1991
Keune, H., Kampffmeyer, U. B., Benking,
H., Theisen, A.: Discussion Paper for the 2. Expert Group Meeting, UNEP-HEM
Meta-Database and Information System HEMIS, hand-out unpublished, UNEP-HEM,
October 1991
Keune, H.; Kampffmeyer, U.: HEMIS - A
Meta-Database and Information System with Multiple and Multi-Lingual Access
Strategies, Presentation and Demonstration hand-out, UN-ISY Conference,
Boulder, August 1992
Kuhns, W.: Twice as Natural - Speculations
on the Emerging Information Culture
Lancaster, F.I.: Vocabulary Control for
Information Retrieval, Resources Press, Arlington, VA, 1986
Landau, S.I.: Dictionaries: The Art and
Craft of Lexicography, Scribner's Sons, New York, 1984
Levitus, S.: Data Archaeology & Rescue
Project, IOC/IODE-XIV/13, Paris July 1992
Lookeren Campagne, van Ir. N.: World Sciences
contribution to N.W. European Policy making on Environmental Issues, Ökologie
Dialog, Bonn, 1992
Man's Dependence on the Earth - The Role
of the Geosciences in the Environment, (Archer, A.A., Lüttig, G.W.,
Snezhko (eds.)), UNEP, Nairobi, UNESCO, Paris, E. Schweizerbarth'sche Verlagsbuchhandlung,
Stuttgart, 1987
Marien, Michael: Non-Communication and
the future, Computopia 1962
SRU - Der Rat von Sachverständigen
für Umweltfragen: Allgemeine Ökologische Umweltbeobachtung,
Der Bundesminister für Umwelt, Naturschutz und Reaktorsicherheit,
Bonn, October 1990
Peeters, Emile: Building Trans-European
Networks TEDIS II, EDI 92, Germany, Hamburg, 1992
The Research Institute for Information
and Knowledge: The List of Multifield-Common and Multilingual Basic Terms
in Science and Technology, Kanagawa University, Tokyo, 1992
UNISIST, Guidelines for the Establishment
and Development of Multi-Lingual Thesauri, UNESCO, Paris, May 1980
Wessel, Andrew, E.: Information Retrieval
und Automatisierung, Luchterhand, Böblingen-Sindelfingen, 1975
Abbreviations of Organizations
BMU
Federal Ministry for the Environment (Germany)
CAS
Chinese Academy of Sciences
CEC
Commission of the European Communities
CODATA
Committee on Data for Science and Technology
DARA
German Space Agency (Germany)
EEA-TF
European Environment Agency Task Force (CEC)
ESA
European Space Agency
GEMS
Global Environmental Monitoring System (UNEP)
GRID
Global Resource Information Database (UNEP)
HEM
Harmonization of Environmental Measurement (UNEP)
ICSU
International Council of Scientific Unions
IRPTC
International Register for Potentially Toxic Chemicals (UNEP)
IUCN
International Union for the Conservation of Nature and Natural Resources
MARC
Monitoring and Assessment Research Centre (UNEP)
NASA
National Aeronautics and Space Administration (USA)
SRU
German National Environmental Advisors (Germany)
UNEP
U.N. Environment Programme (UNO)
UNESCO
U.N. Educational, Scientific and Cultural Organization (UNO)
WCMC
World Conservation Monitoring Centre (UNEP/IUCN/WWF)
WDC
World Data Centres (ICSU)
WHO
World Health Organization (UNO)
WMO
World Meteorological Organization (UNO)
WWF
World Wide Fund for Nature