Items

Research Data Management Toolkit

This toolkit includes a number of resources on research data management: courses, videos, infographics, books and other materials. Because of its broad scope, however, the toolkit is not structured as an online course.

Guidelines on FAIR Data Management in Horizon 2020

Guidelines to help researchers make their research data findable, accessible, interoperable and reusable (FAIR), to ensure sound management. Good research data management is not a goal in itself, but rather the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse.

Digital Curation 101

Digital Curation 101 uses the sections of the curation lifecycle model to present content to students. DC 101 was developed because the DCC, in its role as a source of expert advice and guidance to the community, identified a need for a contextual, theoretical introduction to the basics of digital curation with practical examples and exercises. The target audience is new grant holders with Research Council curation mandates to fulfil. The course indicates what should be considered in planning and implementing projects.

easySHARE

easySHARE is a simplified HRS-adapted dataset for student training, and for researchers who have little experience in quantitative analyses of complex survey data. While the main release of SHARE is stored in more than 100 single data files, easySHARE stores information on all respondents and all currently released data collection waves in a single dataset. Moreover, for the subset of variables covered in easySHARE, the complexity was considerably reduced. For example, the information collected from only one person of a couple or household was transferred to all respective respondents; time-constant information collected only in the first interview was transferred to all later interviews; and the coding of missing values was enriched to make the routing and filtering of the interviews easier to understand. In addition, several ready-to-analyse variables have been added, such as health indexes, demographic information, or economic measures. Where possible, measures have been selected or recoded to facilitate comparative analyses with the US Health and Retirement Study (HRS).
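The idea of transferring time-constant information from the first interview to all later interviews can be illustrated with a minimal Python sketch. The record layout and the field names (`person`, `wave`, `birth_country`) are hypothetical and chosen only for illustration; they are not the actual easySHARE variable names, and this is not the procedure used to build the dataset.

```python
# Hypothetical long-format records: one row per respondent per wave.
# The time-constant field is only filled in for the first interview.
rows = [
    {"person": "A", "wave": 1, "birth_country": "DE"},
    {"person": "A", "wave": 2, "birth_country": None},
    {"person": "B", "wave": 1, "birth_country": "IT"},
    {"person": "B", "wave": 3, "birth_country": None},
]


def carry_forward(rows, key, field):
    """Fill missing values of `field` with the last value seen for the
    same `key` in an earlier wave."""
    last_seen = {}
    filled = []
    # Process each respondent's interviews in wave order.
    for row in sorted(rows, key=lambda r: (r[key], r["wave"])):
        row = dict(row)  # copy so the input is left untouched
        if row[field] is None:
            row[field] = last_seen.get(row[key])
        else:
            last_seen[row[key]] = row[field]
        filled.append(row)
    return filled


filled = carry_forward(rows, "person", "birth_country")
```

After this step every row carries the value recorded in the respondent's first interview, so an analysis of any single wave no longer needs to look the value up in another file.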

Tools for named entity recognition

This is a list of tools for named entity recognition that are available as part of the CLARIN Resource Families initiative.

Named entity recognition (NER) is an information extraction task which identifies mentions of various named entities in unstructured text and classifies them into predetermined categories, such as person names, organisations, locations, date/time, monetary values, and so forth. NER tools can, for example, help with the classification of news content, content recommendations and search algorithms.
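As a toy illustration of the task, the following Python sketch tags mentions using two regular expressions and a small name list. It is rule-based and deliberately simplistic; the tools in the list typically rely on statistical or neural models. The patterns, the gazetteer entries and the example sentence are all invented for illustration.

```python
import re

# Hypothetical patterns for two of the categories mentioned above.
PATTERNS = {
    "MONEY": re.compile(r"[€$£]\s?\d+(?:[.,]\d+)*"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

# Hypothetical gazetteer: known names mapped to their category.
GAZETTEER = {"Acme": "ORGANISATION", "Lisbon": "LOCATION"}


def tag_entities(text):
    """Return (mention, category) pairs in order of appearance."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append((match.start(), match.group(), label))
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            entities.append((match.start(), match.group(), label))
    # Sort by start offset, then drop the offset from the result.
    return [(mention, label) for _, mention, label in sorted(entities)]


entities = tag_entities("Acme opened an office in Lisbon on 2019-05-06 for €1,500.")
```

Here `entities` pairs each mention with its category, for example `("Lisbon", "LOCATION")`.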

Tools for normalization

This is a list of tools for text normalization that are available as part of the CLARIN Resource Families initiative.

Text normalization is the process of transforming parts of a text into a single canonical form. It represents one of the key stages of linguistic processing for texts in which spelling variation abounds or deviates from the contemporary norm, such as in texts published in historical documents or on social media. After text normalization, standard tools for all further stages of text processing can be used. Another important advantage of text normalization is improved search: a query for a single, standard variant also takes into account all its spelling variants, be they historical, dialectal, colloquial or slang.
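A minimal Python sketch of the idea, assuming a small hand-written mapping from spelling variants to canonical forms. The mapping and the function names `normalize` and `search` are hypothetical; the tools in the list use far larger lexicons or trained models.

```python
import re

# Hypothetical variant-to-canonical mapping (historical and colloquial forms).
VARIANTS = {
    "olde": "old",
    "shoppe": "shop",
    "u": "you",
    "gr8": "great",
}

TOKEN = re.compile(r"\w+|[^\w\s]+")


def normalize(text):
    """Lowercase and tokenize `text`, replacing known variants
    with their canonical form."""
    tokens = TOKEN.findall(text.lower())
    return [VARIANTS.get(token, token) for token in tokens]


def search(query, documents):
    """Return the indices of documents that contain the query word
    once both sides have been normalized."""
    target = normalize(query)[0]
    return [i for i, doc in enumerate(documents) if target in normalize(doc)]


docs = ["Ye olde shoppe", "The old shop on the corner", "A new bakery"]
hits = search("shop", docs)
```

Because both the query and the documents are normalized first, the query `shop` finds the document that only contains the variant `shoppe` as well as the one with the standard spelling.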

Wordlists

This is a list of wordlists that are available as part of the CLARIN Resource Families initiative.

Wordlists are lexical resources which only provide alphabetical or frequency-based lexical inventories. In the vast majority of cases, the wordlists can be directly downloaded from CLARIN national repositories or queried through easy-to-use online search environments.
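The two kinds of inventory can be sketched in a few lines of Python. The function name `build_wordlists` and the sample text are made up for illustration; published wordlists are derived from large corpora with more careful tokenization.

```python
import re
from collections import Counter


def build_wordlists(text):
    """Return an alphabetical inventory and a frequency-ranked
    inventory of the words in `text`."""
    # Keep runs of letters only; digits and punctuation are dropped.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    alphabetical = sorted(counts)
    # Most frequent first; ties are broken alphabetically.
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency


alphabetical, by_frequency = build_wordlists("The cat saw the dog. The dog ran.")
```

The alphabetical list contains each distinct word once, while the frequency list pairs each word with its number of occurrences.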

Glossaries

This is a list of glossaries that are available as part of the CLARIN Resource Families initiative.

Glossaries are specialised dictionaries that contain domain-specific terminology and/or expressions. In the vast majority of cases, the glossaries can be directly downloaded from CLARIN national repositories or queried through easy-to-use online search environments.

Conceptual resources

This is a list of conceptual resources that are available as part of the CLARIN Resource Families initiative.

Concept-based resources include onomasiological lexical resources such as wordnets, framenets, thesauri and ontologies. Such resources are typically interlinked with semantic relations (e.g. hypernymy, hyponymy). In the vast majority of cases, the conceptual resources can be directly downloaded from the national repositories or queried through easy-to-use online search environments.

Dictionaries

This is a list of dictionaries that are available as part of the CLARIN Resource Families initiative.

Dictionaries were primarily created for human use (e.g., language learning/teaching, translation, lexicology) and are typically semasiological, which means that they are organized around words and contain information on their meanings, definitions, pronunciation, etc.