Natural Language Processing

Copyright & Related rights

This section is an introduction to copyright notions and related rights:

Tools for named entity recognition

This is a list of tools for named entity recognition that are available as part of the CLARIN Resource Families initiative.

Named entity recognition (NER) is an information extraction task which identifies mentions of named entities in unstructured text and classifies them into predetermined categories, such as person names, organisations, locations, dates and times, and monetary values. NER tools can, for example, help with the classification of news content, content recommendations and search algorithms.
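As a rough illustration of the task described above, the following Python sketch tags entities with a tiny lookup list and two regular expressions. It is a toy, rule-based approach and does not represent how any tool in the CLARIN list works. The gazetteer entries, the patterns, the category labels and the function name `recognise_entities` are all assumptions made for this example.

```python
import re

# Hypothetical toy gazetteer; real NER tools rely on much larger
# resources or on trained statistical models.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "CLARIN": "ORGANISATION",
    "Utrecht": "LOCATION",
}

# Illustrative patterns for two of the categories mentioned in the text.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")),
    ("MONEY", re.compile(r"(?:€|\$|£)\s?\d+(?:[.,]\d+)*")),
]


def recognise_entities(text):
    """Return (start, end, mention, category) tuples sorted by position."""
    found = []
    # Step 1: look up known names from the gazetteer.
    for name, category in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), match.end(), match.group(), category))
    # Step 2: apply the pattern-based rules for dates and money.
    for category, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), match.group(), category))
    return sorted(found)


text = "Ada Lovelace visited CLARIN in Utrecht on 12 March 2021 and paid €250."
entities = recognise_entities(text)
for start, end, mention, category in entities:
    print(f"{mention!r:20} {category}")
```

A lookup-based sketch like this only finds names it already knows. The tools in the list generally aim to recognise unseen names as well, which is why many of them rely on trained models rather than fixed lists.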

Tools for normalization

This is a list of tools for text normalization that are available as part of the CLARIN Resource Families initiative.

Text normalization is the process of transforming parts of a text into a single canonical form. It is one of the key stages of linguistic processing for texts in which spelling varies widely or deviates from the contemporary norm, such as historical documents or social media posts. After text normalization, standard tools can be used for all further stages of text processing. Another important advantage of text normalization is improved search: a query for a single, standard variant also retrieves all of its spelling variants, be they historical, dialectal, colloquial or slang.
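The search benefit described above can be sketched in a few lines of Python. This is a minimal, dictionary-based illustration and does not represent the method of any tool in the CLARIN list. The variant table, the sample documents and the function names are assumptions made for this example.

```python
import re

# Hypothetical variant-to-canonical table mixing historical spellings
# and social media abbreviations; real tools use far richer resources.
VARIANTS = {
    "vniuersitie": "university",
    "universitie": "university",
    "uni": "university",
    "2moro": "tomorrow",
    "tmrw": "tomorrow",
}


def normalise_token(token):
    """Map a token to its canonical form, or lowercase it if unknown."""
    lowered = token.lower()
    return VARIANTS.get(lowered, lowered)


def normalise(text):
    """Split text into word tokens and normalise each one."""
    return [normalise_token(token) for token in re.findall(r"\w+", text)]


def search(query, documents):
    """Return indices of documents containing the normalised query term."""
    target = normalise_token(query)
    return [i for i, doc in enumerate(documents) if target in normalise(doc)]


documents = [
    "The Vniuersitie of Oxford",
    "see u at uni 2moro",
    "The library opens tmrw",
]
hits = search("university", documents)
print(hits)
```

Because both the query and the documents pass through the same normalization step, a query for the standard form also matches documents that contain only a variant spelling.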

CLARIN Knowledge Sharing

The aim of the CLARIN Knowledge Sharing Initiative is to ensure that the knowledge and expertise provided by CLARIN consortia does not exist as a fragmented collection of unconnected bits and pieces, but is made accessible in an organized way to the CLARIN community and to the Social Sciences and Humanities research community at large.

One central step in building the Knowledge Sharing Infrastructure is the establishment of Knowledge Centres. Most existing CLARIN centres can obtain the status of a Knowledge Centre right away: K-Centre status formalizes and centrally registers existing expertise and usually does not require much additional effort from an institute, except that the knowledge-sharing services have to be reliable and their scope has to be made explicit on a dedicated web page of the respective institute(s).

The list of CLARIN Knowledge Centres is available here: