EUROVOC


 

 

EUROVOC Thesaurus in the National Council of the Slovak Republic

 

The Eurovoc thesaurus is a multilingual automated vocabulary of the European Union. It was created in cooperation between the European Parliament, the European Commission and the Office for Official Publications of the European Communities. Work on creating descriptors and structuring the thesaurus began in 1982, and the first edition was published in 7 languages in 1984. The aim of Eurovoc was to make the information resources of parliaments and information institutions accessible through one common vocabulary and to allow documents to be searched using approved terminology. At present Eurovoc exists in all official languages of the EU member states as well as in some languages of non-member countries.

 

Version 4.1 of Eurovoc is currently in use. A thesaurus is a structured list of expressions intended to represent in unambiguous fashion the conceptual content of the documents in a documentary system and of the queries addressed to that system. A thesaurus is a living thing which evolves as a function of the needs expressed by the indexers and users of the documentary holdings and of the increasingly numerous databases. Eurovoc is a common language for all documentary systems which deal with the activities of the EU and the European Communities, whether in the institutions of the Community, in national or regional institutions or in the private sector. Only descriptors can be used to represent the conceptual content when indexing documents and formulating queries.

 

At the generic level the Eurovoc thesaurus has a two-tier hierarchical classification:

  • fields, identified by two-digit numbers and titles in words,

  • microthesauri, identified by four-digit numbers – the first two digits being those for the field containing the microthesaurus.
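
Because the first two digits of a microthesaurus number repeat the number of its field, the field can be derived from the microthesaurus code alone. The following is a minimal sketch of that rule; the function names and the sample codes are illustrative assumptions and are not taken from the Eurovoc data.

```python
from collections import defaultdict


def field_of(microthesaurus_code: str) -> str:
    """Return the two-digit field code embedded in a four-digit microthesaurus code."""
    code = microthesaurus_code.strip()
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a four-digit code, got {microthesaurus_code!r}")
    return code[:2]


def group_by_field(microthesaurus_codes):
    """Group microthesaurus codes under the field they belong to."""
    groups = defaultdict(list)
    for code in microthesaurus_codes:
        groups[field_of(code)].append(code)
    return dict(groups)


# Illustrative codes only; they are not claimed to match the real Eurovoc numbering.
print(group_by_field(["1206", "1211", "0406"]))
```

Keeping the codes as strings rather than integers preserves leading zeros, which matter for fields numbered below 10.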

 

The Eurovoc thesaurus thus comprises:

  • descriptors, i.e. words or expressions which denote in unambiguous fashion the constituent concepts of the field covered by the thesaurus /e.g. implementation of the law/,

  • non-descriptors, i.e. words or expressions which in natural language denote the same concept as a descriptor /e.g. application of the law/ or equivalent concepts /e.g. enforcement of the law, validity of the law/ or concepts regarded as equivalent in the language of the thesaurus,

  • semantic relationships, i.e. relationships based on meaning, firstly between descriptors and non-descriptors and secondly between descriptors.
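
The relationship between descriptors and non-descriptors can be pictured as a lookup from any known expression to the descriptor that must be used for indexing. The sketch below uses the examples quoted above; the class and method names are hypothetical and do not describe any actual Eurovoc software.

```python
from typing import Dict, Optional


class Thesaurus:
    """Maps every known expression to its preferred descriptor."""

    def __init__(self) -> None:
        # lower-cased expression -> descriptor to be used for indexing
        self._preferred: Dict[str, str] = {}

    def add_descriptor(self, descriptor: str, non_descriptors=()) -> None:
        """Register a descriptor together with its non-descriptors."""
        self._preferred[descriptor.lower()] = descriptor
        for term in non_descriptors:
            self._preferred[term.lower()] = descriptor

    def resolve(self, term: str) -> Optional[str]:
        """Return the descriptor for a term, or None if the term is unknown."""
        return self._preferred.get(term.lower())

    def is_descriptor(self, term: str) -> bool:
        """True only if the term itself is a descriptor, not a non-descriptor."""
        preferred = self.resolve(term)
        return preferred is not None and preferred.lower() == term.lower()


thesaurus = Thesaurus()
thesaurus.add_descriptor(
    "implementation of the law",
    non_descriptors=[
        "application of the law",
        "enforcement of the law",
        "validity of the law",
    ],
)
print(thesaurus.resolve("enforcement of the law"))
```

An indexer or a search form built on such a lookup would accept either kind of expression but always store or query the descriptor.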

 

The Eurovoc thesaurus covers all fields which are of importance for the activities of the European institutions: politics, international relations, European Communities, law, economics, trade, finance, social questions, education and communications, science, business and competition, employment and working conditions, transport, environment, agriculture, forestry and fisheries, agri-foodstuffs, production, technology and research, energy, industry, geography, international organizations.

 

The Eurovoc thesaurus is used for documentation mainly:

  • by the Office for Official Publications of the European Communities for indexing the texts published in the Official Journal and other documents and publications; it is also used for the CATEL /electronic catalogue/ database and the EUROCAT catalogue on CD-ROM,

  • by the European Parliament for indexing references in the EPOQUE database and indexing the documentary holdings in its library. In the EPOQUE database Eurovoc can be consulted in any of the available languages by using the RELATE command in the Common Command Language. The European Parliament also inserts into CELEX the Eurovoc descriptors associated with parliamentary resolutions and questions,

  • by the European Centre for Parliamentary Research and Documentation for indexing parliamentary studies,

  • by several libraries in the institutions and other Community bodies for indexing their documentary holdings,

  • at the national and regional levels, by parliaments and governments in a number of European countries.

 

The Eurovoc thesaurus has been compiled in accordance with the standards of the International Organization for Standardization:

  • ISO 2788-1986: Guidelines for the establishment and development of monolingual thesauri,

  • ISO 5964-1985: Guidelines for the establishment and development of multilingual thesauri.

 

All language versions of the Eurovoc thesaurus comprise:  

  • 21 fields

  • 127 microthesauri

  • 5933 descriptors /of which 504 are top terms/

  • approximately 5877 reciprocal hierarchical relationships /5877 BT - broader terms and 5877 NT - narrower terms/

 

The fields, microthesauri, descriptors, hierarchical relationships and associative relationships are strictly equivalent in all languages. On the other hand, the numbers of non-descriptors and scope notes vary from language to language.

 

 

Application of the Eurovoc Thesaurus in the National Council of the Slovak Republic

The idea of creating a Slovak version of Eurovoc arose in 2000. It was obvious that the system should also be implemented in Slovakia, given the integration ambitions of the Slovak Republic and the need to cooperate in this area of information exchange.

 

At that time the licence for distributing the Czech and English versions of the Eurovoc thesaurus to non-member states of the European Union was already held by the parliament of the Czech Republic /namely the Parliamentary Library of the Czech Chamber of Representatives/. Thanks to the good partnership between the Czech and Slovak parliamentary libraries, the Slovak Parliamentary Library received version 3.1 of Eurovoc with the TAT software as a gift from the Czech parliament.

 

In 2002 an agreement was signed between the Office for Official Publications of the European Communities /EUR-OP/ and the Chancellery of the National Council of the Slovak Republic. It granted the licence to create the Slovak version of Eurovoc and provided for handing this version over to EUR-OP afterwards.

 

The translation of version 3.1 of the Eurovoc thesaurus into Slovak was completed by the end of 2002. It was the first step necessary for its application in the National Council of the Slovak Republic. Version 4 of the Eurovoc thesaurus was also translated in 2002, in accordance with the agreement signed between the Chancellery of the National Council of the Slovak Republic and EUR-OP.

 

In February 2003 we sent to EUR-OP:

  • 10 printed copies of version 3.1 of Eurovoc,

  • 10 copies on CD-ROM,

  • an on-line version in PDF format.

 

The application of the Eurovoc Thesaurus in Slovakia was carried out in several steps:

 

I. Books and Periodicals Indexing

At the beginning of 2003 version 3.1 was incorporated into T-Series, the automated library system then in use in our Parliamentary Library. We started to index new books and periodicals with EUROVOC version 3.1 in cooperation with the T-Series system librarian, the EUROVOC system librarian and the librarians preparing our library catalogue. We started to use version 4.1 in 2003, in harmony with the complex software being prepared for the National Council of the Slovak Republic by the EXE IT Company.

 

The indexing of library documents can be characterized as follows. The T-Series System (since 2009 the new PROFLIB System) uses these subject viewpoints for indexing:

  1. thesaurus /subject groups, subject categories, thesaurus meanings, geographical names/

  2. names and corporations

  3. classification signs

  4. key words /words from the document titles/

  5. document abstracts.

 

A thesaurus is a dictionary of terms dedicated to indexing and searching documents. It contains 2 types of terms – descriptors /preferred terms/ and non-descriptors /non-preferred terms/. Descriptors are the terms used to describe documents. Non-descriptors are not used to describe documents; they serve as references only and are marked with stars in the validation fields.

 

The T-Series System uses four thesauri altogether. Three of them are used for recording subject viewpoints /subject groups, subject categories, thesaurus terms/ and the fourth is used for recording geographical names of cities and regions. Although the terminology differs slightly, it is possible to create the same types of relationships among the individual terms.

 

The basic characteristics of the thesauri in the PROFLIB System are as follows. When a thesaurus is created, a specific editing window is chosen within the menu Subject Groups, Thesaurus and Other Subject Approaches /these are subject groups, subject categories, thesaurus terms, geographical names and classification signs/. A thesaurus has to be built from the highest level downwards, that is, the superior terms of the thesaurus are assigned first and the subordinate terms are developed afterwards.

 

The relationships among terms are reciprocal. If a term contains a reference to another term /superior, subordinate or related/, the system automatically adds the corresponding reference back from that term to the original one.
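
The automatic completion of the reverse reference can be sketched as follows. This is only an illustration of the principle, not the library system's actual implementation; the class name is hypothetical, and BT, NT and RT are used as the conventional labels for superior, subordinate and related terms.

```python
from collections import defaultdict

# Each relationship type and the type stored on the other term.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT"}


class TermRelations:
    """Stores term relationships and keeps both directions in step."""

    def __init__(self) -> None:
        # term -> relationship type -> set of linked terms
        self._links = defaultdict(lambda: defaultdict(set))

    def link(self, term: str, relation: str, other: str) -> None:
        """Record the relationship and its inverse on the other term."""
        self._links[term][relation].add(other)
        self._links[other][INVERSE[relation]].add(term)

    def unlink(self, term: str, relation: str, other: str) -> None:
        """Remove the relationship in both directions."""
        self._links[term][relation].discard(other)
        self._links[other][INVERSE[relation]].discard(term)

    def linked(self, term: str, relation: str):
        """Return the terms linked to `term` by `relation`, sorted."""
        return sorted(self._links[term][relation])


relations = TermRelations()
relations.link("law", "NT", "civil law")   # entered on the superior term only
print(relations.linked("civil law", "BT"))  # the reverse reference exists too
```

Routing every change through `link` and `unlink` is what keeps the two directions from drifting apart.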

 

If a record is changed or deleted, the system automatically updates the references in all connected documents.

 

The system automatically stamps each record with the details of its last update /the date and the signature of the librarian/.

 

The system makes it possible to search by these update details. If we enter e.g. a date or the signature of a librarian, the result is the list of terms entered /updated/ in the given time period or by the given person.
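
A search of this kind amounts to filtering term records by their last update. The following minimal sketch assumes a simple record with a date and a librarian signature; the field names, signatures and sample terms are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class TermRecord:
    term: str
    updated_on: date   # date of the last update
    updated_by: str    # signature of the librarian who made it


def find_terms(
    records: Iterable[TermRecord],
    updated_from: Optional[date] = None,
    updated_to: Optional[date] = None,
    updated_by: Optional[str] = None,
) -> List[str]:
    """Return the terms whose last update matches every criterion given."""
    matches = []
    for record in records:
        if updated_from is not None and record.updated_on < updated_from:
            continue
        if updated_to is not None and record.updated_on > updated_to:
            continue
        if updated_by is not None and record.updated_by != updated_by:
            continue
        matches.append(record.term)
    return matches


records = [
    TermRecord("civil law", date(2003, 2, 10), "JN"),
    TermRecord("tax law", date(2003, 3, 5), "MK"),
    TermRecord("criminal law", date(2003, 3, 20), "JN"),
]
print(find_terms(records, updated_from=date(2003, 3, 1)))
print(find_terms(records, updated_by="JN"))
```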

 

For every term the system shows the number of records /books, articles and periodicals/ that refer to it.

 

PROFLIB is also able to import data from external resources /e.g. the EUROVOC Thesaurus/ and to store them in its database. This means the user can take information from another database and import it into the PROFLIB database. This requires a conversion table /import profile/, which can easily be set up in the system administration. The records must be in a specific format suitable for import into the PROFLIB database.
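
The idea of a conversion table can be illustrated as a mapping from the field names of the external source to the field names of the target database. The sketch below is an assumption about the general principle only; the field names are invented and do not reflect the real PROFLIB import profiles or the Eurovoc export format.

```python
from typing import Dict


def apply_import_profile(record: Dict[str, str], profile: Dict[str, str]) -> Dict[str, str]:
    """Rename the fields of an external record according to an import profile.

    `profile` maps external field names to target field names. Fields not
    named in the profile are dropped; a missing source field is an error.
    """
    missing = [source for source in profile if source not in record]
    if missing:
        raise ValueError(f"record lacks fields required by the profile: {missing}")
    return {target: record[source] for source, target in profile.items()}


# Hypothetical profile and record, for illustration only.
profile = {"descriptor_id": "term_code", "label_sk": "term_text"}
external_record = {
    "descriptor_id": "D-001",
    "label_sk": "example term",
    "note": "not imported",
}
print(apply_import_profile(external_record, profile))
```

Rejecting records that lack a mapped field, instead of importing them partially, makes format problems visible before they reach the catalogue.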

 

Once the data have been imported in a format suitable for the PROFLIB system, descriptors of the thesaurus can be transferred from the open validation field into the corresponding field in the editing mask.

 

 

II. Parliamentary Prints and Parliamentary Debate Indexing

 

Starting point situation

 

The concept and strategy for implementing the Eurovoc thesaurus within the parliamentary information system were shaped by the following facts:

 

  1. There are no generally accepted EU standards or recommendations for implementing the thesaurus in the information systems of national parliaments.

  2. We were not aware of any EU concept for the future development of the Eurovoc thesaurus with regard to backward compatibility and possible future structural changes.

  3. The IT infrastructure of the Slovak Parliament is based on the Microsoft Windows platform (OS, Office, BackOffice, .NET framework).

  4. exe IT maintains the majority of parliamentary documents that are in electronic form, such as bills, resolutions, joint committee reports, parliamentary prints etc.

 

Solution goals

The strategy for implementing the Eurovoc thesaurus for indexing and retrieval of the various parliamentary documents has been developed in close cooperation between exe IT and key members of the IT department, the Parliamentary Library, the Administrative department and the Department for Legislation and Law Approximation, with the following goals in mind:

 

  1. The software components needed for implementing the thesaurus had to be integrated within the existing Parliamentary information system.

  2. The Eurovoc deployment process had to be outlined in such a manner that we would be able to pick up feedback from parliamentary, as well as public users as early in the process as possible.

 

Solution progression proposal

The following key phases were defined for the implementation of the Eurovoc thesaurus in the Slovak Parliament:

 

Phase 1. Eurovoc database implementation

Deadline: 3/2003

Deliverables: Thesaurus implementation within the MS SQL Server 2000 environment. A data model for the thesaurus has been designed and developed that covers all the specifics of the thesaurus data structures and adds support for versioning, management and comparison of thesaurus versions (see point 2 in the "Starting point situation" section). An import procedure for importing the 3.1 thesaurus data from the TAT program's database has also been developed.

Phase 2. Intranet browser

Deadline: 4/2003

Deliverables: A simple intranet application has been developed which allows browsing and searching within the thesaurus database created in the previous phase. The application has been made available to a select group of people in order to get early feedback.

Phase 3. Eurovoc management

Deadline: 7/2003

Deliverables: A tool for thesaurus management is planned that will support the following key features:

  • Import of new thesaurus versions in the EU XML format.

  • Thesaurus export into the T-Series library system.

  • Tree-like thesaurus visualization in terms of the BT/NT relationships.

  • A comprehensible and easy-to-use display of all the information pertaining to a descriptor (history of changes, associative relationships, non-descriptors, languages…).

  • Easy-to-use and comprehensible thesaurus editing with integrated change tracking.

  • Simple and intuitive term search with wildcard support.

  • Support for controlling the application exclusively with the keyboard.

  • Thesaurus versioning.

  • Support for easy thesaurus translation into Slovak; support for offline translations (division of labor between several translators).

Phase 4. Indexing

Deadline: 7/2003

Deliverables: A clear indexing tool for indexing a selected subset of the parliamentary documents (that is, parliamentary prints and parliamentary debate), along with an intuitive intranet application for retrieval of the indexed documents. This phase should run in parallel with phase 3 so as to contribute to the goal defined in point 2 of "Solution goals", i.e. "… to pick up feedback from parliamentary, as well as public users as early in the process as possible."

Phase 5. Evaluation and planning next steps

Deadline: 10/2003

Deliverables: Evaluation of the results and feedback from the previous phases in the form of an intranet-based survey and recommendations from a selected subset of thesaurus users; planning of further system development, such as:

  • Indexing the whole set of parliamentary documents.

  • Ensuring continuous indexing while parliamentary documents are created and modified, by means of extensibility hooks implemented in exe IT's applications that create and manage documents.

  • Streamlining indexing with the help of the full-text technology licensed by the Forma Company.

  • Continuous improvement of the retrieval tools; publishing a selected subset of the parliamentary documents on the parliamentary website (http://www.nrsr.sk) with support for searching and retrieval using Eurovoc descriptors.
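
The comparison of thesaurus versions mentioned under the database implementation phase can be sketched in a few lines. This is an illustration under simplifying assumptions, not the data model built on MS SQL Server: each version is reduced to a mapping from a descriptor identifier to its label, and the identifiers and labels shown are invented.

```python
from typing import Dict, List


def compare_versions(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two thesaurus versions given as {descriptor id: label}.

    Returns the identifiers that were added, removed or relabelled
    between the old and the new version.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "relabelled": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


# Hypothetical data standing in for two successive thesaurus versions.
older_version = {"D1": "civil law", "D2": "tax law", "D3": "customs"}
newer_version = {"D1": "civil law", "D2": "taxation law", "D4": "data protection"}
print(compare_versions(older_version, newer_version))
```

A report of this shape tells indexers which descriptors need attention in already indexed documents when a new thesaurus version is adopted.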

 

Implementation notes

State-of-the-art technologies and tools were chosen for the implementation of the Eurovoc thesaurus in the Slovak Parliament. The tools allow for rapid prototyping, implementation, deployment and integration of the software components needed to meet the goals mentioned in the section "Solution goals":

  • Microsoft SQL Server 2000 for the database engine.

  • Microsoft .NET Framework for the development and deployment environment.

  • Microsoft Visual Studio .NET for a set of development and maintenance tools.

Full indexing of the parliamentary prints (draft laws etc.) started in 2008. All parliamentary prints indexed with the EUROVOC thesaurus can be found on the National Council website.