Text mining in agriculture: The AgroTagger keyword extractor

The use of keywords is crucial for the description, organization, indexing, retrieval and sharing of research in every scientific field, and agriculture is no exception. However, manual annotation of research outcomes is time-consuming and error-prone, so automatic methods for metadata annotation are continually being explored. AgroTagger is one of the tools facilitating the work of information and knowledge managers (among others) in the agri-food sector, by applying text mining to agri-food research outcomes.

AgroTagger is a keyword extractor that uses a subset of the AGROVOC thesaurus (about 2.5K of AGROVOC's more than 40K concepts) as the set of allowable keywords for indexing information resources.… Click to read the full post
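
To illustrate the general idea of extracting keywords against a controlled vocabulary, here is a minimal sketch. It is not AgroTagger's actual algorithm: the four-term vocabulary, the function name `extract_keywords` and the simple frequency ranking are all assumptions made for illustration, standing in for the AGROVOC subset and whatever matching logic the real tool uses.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary standing in for the AGROVOC subset;
# the real thesaurus and AgroTagger's matching logic are not shown here.
VOCABULARY = {"maize", "crop rotation", "soil fertility", "irrigation"}


def extract_keywords(text, vocabulary=VOCABULARY, top_n=3):
    """Return up to top_n vocabulary terms found in text, most frequent first."""
    # Lowercase and collapse whitespace so multi-word terms match across line breaks.
    normalized = re.sub(r"\s+", " ", text.lower())
    counts = Counter()
    for term in vocabulary:
        # Whole-word match so that "maize" does not match inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[term] = hits
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


sample = (
    "Crop rotation improves soil fertility. In maize systems, crop rotation "
    "also reduces the need for irrigation."
)
keywords = extract_keywords(sample)
print(keywords)
```

Because only terms from the vocabulary can ever be returned, the output stays consistent across documents, which is the main appeal of a controlled list over free-text keywords.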

Text Mining and Agriculture: The AgroNLP projects

The TETIS team (Territoires, Environnement, Télédétection et Information Spatiale, or Land, Environment, Remote Sensing and Spatial Information in English) is a Joint Research Unit of some of the major French actors in text mining and Natural Language Processing, namely AgroParisTech, Irstea (National Research Institute of Science and Technology for Environment and Agriculture) and Cirad (the International Cooperation Centre for Agricultural Development Research). It is located in Montpellier. The TETIS Unit has been very active in projects that apply text mining to use cases in the agricultural sector.

These AgroNLP projects (Natural Language Processing applied to AGRicultural dOmain) aim to address different challenges faced by stakeholders in the agri-food sector, such as:

  • Animal Disease Surveillance: The project proposes a new methodology in the domain of epidemic intelligence in animal health in order to discover knowledge in web documents dealing with animal disease outbreaks.
Click to read the full post

Working on text mining – this is what we do

Our participation in the OpenMinTeD Horizon 2020 project allowed us to learn a little more about text mining, the communities around it, the various types of stakeholders and the issues that they face.

Our job covered everything around user requirements elicitation: we were responsible for defining and applying a methodology for creating the profiles of various types of text mining stakeholders, understanding their content-related needs, identifying their issues and proposing the optimal solutions to address them, focusing on those related to text mining. Pretty demanding, right?

The project encompasses four (4) different communities:

  1. Scholarly Communication
  2. Agriculture / Biodiversity
  3. Life Sciences
  4. Social Sciences

Our methodology consisted of two rounds of requirements elicitation:

i) A general online questionnaire was prepared and project partners were asked to adapt it for their communities.… Click to read the full post

Open Access & Text Mining: Moving Things Forward

Text mining refers to “the process or practice of examining large collections of written resources in order to generate new information” (source). I am not an expert in text mining, but I understand that it is about applying specialized software, algorithms and techniques to existing textual information so that it can be read and analyzed by machines, allowing them to extract more meaningful information for us humans. Of course, text mining is nothing new to the research community, as it seems that it all started back in the ’80s with a methodology titled CAVE (Content Analysis of Verbatim Explanations), but its background goes beyond the scope of this article.… Click to read the full post

What happens in Turin stays in Turin? We have a FREME report

Have you ever used services like Open Calais? If so, you were probably amazed by how easy it is to automatically extract semantic information directly from the submitted text. If you are not familiar with Open Calais, it is a service that extracts events, people, organisations, topics and social tags from unstructured data. I suggest trying it. A lot of developers and companies are using such services to enrich their content, to improve discovery services, to build recommendation services and to extract analytics from content.

But could such a great service work with any open dataset that you have for your domain, and not only with specific, closed data?
Click to read the full post

Text and data mining (TDM) in agri-food research

A wealth of published research outcomes is now publicly available, mostly thanks to the very active Open Access initiatives and mandates that keep finding ways to open up even more research data. At the same time, researchers still struggle to find specific elements of a research publication that would support their own work, such as an image, a diagram or a dataset on a particular topic (crop disease, for example) within the scientific literature. Indeed, such components are currently embedded in various types of publications and cannot be identified, described and retrieved as individual entities.… Click to read the full post