Anyone working with research these days can confirm that scarcity of research outcomes is no longer an issue; thematic repositories and aggregators provide access to a wealth of research publications and data, as well as related information such as projects working in a specific thematic area, profiles of researchers and organizations active in a specific research context, funding opportunities, tools and services. However, one major issue remains: the difficulty of retrieving such information from heterogeneous content sources that are scattered worldwide, isolated and incompatible with each other. This is largely due to the lack of interoperability between the different systems hosting such information.
Knowledge Organization Systems (KOSs) are controlled vocabularies, such as thesauri, topic maps and ontologies, that are used for the classification of content. KOSs also function as the backbone of Linked Open Data systems, as they provide the means for linking different (and sometimes heterogeneous) data sources. There are numerous KOSs available nowadays, serving different research communities (including the agricultural one); in fact, it seems that there are way too many of them: smaller and larger, domain-specific and domain-agnostic, in various types and flavors.
The good news for the agricultural information and knowledge management community is that the three biggest and most widely used thesauri, namely AGROVOC of UN FAO, the National Agricultural Library Thesaurus of USDA and the CAB Thesaurus (CABI), decided to join forces and give birth to a child named the Global Agricultural Concept Scheme, or simply GACS. You can read more about GACS in our previous blog post.
GACS in numbers
Out of the 32,000 concepts in AGROVOC, 140,000 in CAB Thesaurus, and 53,000 in NAL Thesaurus, the 10,000 most commonly used concepts from each were automatically mapped. Experts then manually validated these mappings, discussing and correcting any problematic ones. The validated mappings were used to generate a draft concept scheme (“GACS Beta”), which currently features 15,406 concepts and 398,216 labels in 28 languages. You can read more about the process in the corresponding blog post by Thomas Baker.
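To give a feel for what an automatic mapping step might look like, here is a minimal Python sketch. It is not the method the GACS team used (the post does not describe it); it simply assumes that candidate mappings are proposed wherever two or more thesauri share a preferred label after basic normalization. The function names and the toy concept identifiers are hypothetical and invented for illustration.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivially different labels compare equal."""
    return " ".join(label.lower().split())


def propose_mappings(vocabularies: dict) -> dict:
    """Propose candidate mappings between vocabularies by exact normalized-label match.

    vocabularies: source name -> {concept_id: preferred label}
    Returns: normalized label -> {source name: concept_id}, keeping only
    labels that occur in at least two sources.
    """
    by_label = defaultdict(dict)
    for source, concepts in vocabularies.items():
        for concept_id, label in concepts.items():
            # Keep the first concept seen per source for a given label.
            by_label[normalize(label)].setdefault(source, concept_id)
    return {label: ids for label, ids in by_label.items() if len(ids) >= 2}


# Toy data, invented for illustration only; these are not real thesaurus identifiers.
toy = {
    "AGROVOC": {"a1": "Maize", "a2": "Soil  fertility"},
    "CAB": {"c1": "maize", "c2": "Irrigation"},
    "NAL": {"n1": "Soil fertility", "n2": "maize"},
}

candidates = propose_mappings(toy)
for label, ids in sorted(candidates.items()):
    print(label, ids)
```

Candidates produced this way would still need the kind of expert validation described above, since identical labels do not guarantee identical meaning.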