The SSH Training Discovery Toolkit provides an inventory of training materials relevant for the Social Sciences and Humanities.
|Computer-mediated communication corpora||
This is a list of computer-mediated communication corpora that are available as part of the CLARIN Resource Families initiative.
Computer-mediated communication (CMC) comprises public and private communication online, such as posts on blogs and forums, comments on online news sites, social media and networking sites such as Twitter and Facebook, instant chat rooms, mobile phone applications such as WhatsApp, and e-mail. Because corpora that compile computer-mediated communication often include very informal styles of writing, they are interesting for a wide range of research fields, such as language variation, pragmatics, and media and communication studies. They are also very important for the development of robust NLP tools that can deal with non-standard spelling, vocabulary and grammar. The compilation and dissemination of such corpora are hindered by the unclear legal status of CMC data when distributed as a resource to the scientific community, which is further exacerbated by content providers' rapidly changing terms of service.
|CLARIN Legal Information Platform||
The platform aims to introduce researchers to the basic notions of the European legislative and licensing framework on copyright and data protection:
It also includes proposals for:
|CLARIN Depositing Services||
One of the fundamental services of the CLARIN infrastructure is making sure that language resources can be archived and made available to the community in a reliable manner. To help researchers to store their resources (e.g. corpora, lexica, audio and video recordings, annotations, grammars, etc.) in a sustainable way, many of the CLARIN centres offer a depositing service. They are willing to store the resources in their repository and assist with the technical and organisational details. This has a wide range of advantages:
|CLARIN Resource Families||
The aim of the CLARIN Resource Families initiative is to provide a user-friendly overview of the language resources available in the CLARIN infrastructure for researchers from digital humanities, social sciences and human language technologies. The overviews are organised according to the types of data in the resources and include listings sorted by language.
The listings include the most important metadata and brief descriptions, such as resource size, text sources, time periods, annotations and licences, as well as links to download pages and concordancers, whenever available. In addition to the resources found in the CLARIN infrastructure, CLARIN Resource Families provides an overview of other valuable language resources that have not yet been integrated into the infrastructure.
CLARIN Resource Families also provides hyperlinks to other relevant materials, such as the thematic CLARIN workshops and tutorials and their accompanying video lectures, as well as a list of key publications on the resources surveyed.