Oral Archives for Sociolinguistic Research

The goal of this course in sociolinguistics is to show students the possibilities and challenges that oral history archives offer for (socio)linguistic research. The course is intended as a research framework that will guide students in their future research work. The lectures acquaint students with the CLARIN infrastructure and present software tools that will allow them to carry out their own thesis research independently. The course offers guidance on the following steps, which must be addressed in any (research) project dealing with oral archives: i) reviewing the ethical and legal issues arising from using and reusing legacy data; ii) using metadata to provide the appropriate level of description for the dataset; iii) transcribing the speech material automatically and manually, using the CLARIN infrastructure; iv) selecting and using the appropriate CLARIN software and tools depending on the research goals (phonetic, lexical, discourse analysis, etc.).


Taken from: Teaching with CLARIN:

Introduction to Speech Analysis

This course offers a general picture of managing speech corpora and of the methods that are available for the acoustic-phonetic study of speech. During the course, students use a speech analysis program called Praat and learn to apply the main features of the program in their own work with speech recordings. In addition, students will learn the basics of another program called ELAN that can be used for transcribing and annotating audio as well as video material.

Taken from: Teaching with CLARIN:

Introduction to Digital Humanities

The aim of the course is to introduce digital humanities and to describe various aspects of digital content processing. The practical aims consist of introducing current data sources, annotation, pre-processing methods, software tools for data analysis and visualisation, and evaluation methods.

We have found that students are somewhat aware of digital humanities but find it difficult to dive in and, above all, to anticipate what they should learn for their future research. A more detailed goal of this course is to present some current projects, show the datasets and technologies behind them, and encourage students to explore the datasets and use the technologies on data they already know. A high-level goal is to place knowledge of the technologies and available datasets within the research iteration loop (create hypotheses -> design instruments -> collect data -> analyze and evaluate).


Taken from: Teaching with CLARIN:

GATE Training Course

The training materials are all based around teaching the use of GATE, a freely available open-source toolkit for Natural Language Processing that has been widely used in both academia and industry for many different tasks.

The modules provide instruction on how to get to grips with the GATE toolkit for basic language processing as well as more advanced techniques, and cover a number of scenarios, such as social media processing, hate speech detection, and misinformation detection. There are modules both for programmers who want to further develop their own tools within the toolkit and for non-programmers who just want to make use of existing tools. The modules teach not only the use of GATE itself but also how to adapt it to one’s own needs (for example, how to adapt English tools to a different language or customise existing tools). They also cover the basic concepts behind a number of language processing tasks, ranging from low-level ones (tokenisation, POS tagging, parsing) to more sophisticated ones (information extraction, social media analysis, hate speech detection, misinformation detection), as well as how to interpret and integrate the results of the processing. Finally, they teach programmers how to extend the toolkit itself, by adding new tools or integrating it into other systems.


Taken from: Teaching with CLARIN:

Computational Morphology with HFST

The course demonstrates how HFST tools can be used for generating finite-state morphologies. Through practical exercises, students will learn how to use finite-state methods to develop a morphology for a language. This online course is suitable as a complement to a more theory- or linguistics-oriented course on morphology.

After successfully completing the course:

- you can explain the basic theory of finite-state automata and transducers,

- you can design morphological lexica using finite-state technology,

- you know how to write morpho-phonological rules in a finite-state framework,

- you understand the diversity of morphological structure in different languages and you know how to take these differences into account when designing computational models of morphology.


Taken from: Teaching with CLARIN:

Archilochus of Paros: Elegiac Fragments – XML Archive

Goals and objectives of the training materials:

  • to improve textual criticism of ancient Greek fragmentary texts (research skills and data acquisition skills: research data management + text analytics);
  • to improve competence in text annotation (research skills and data acquisition skills: analytical thinking + text annotation);
  • to arrive at a reliable corpus for a digital scholarly edition of an ancient Greek poet (research skills and data handling: research methods + data repositories + data formats and standards).

Taken from Teaching with CLARIN:

Bringing synergy to better data management and research in Europe

The course includes a series of recorded videos, quizzes, and practical assignments that allow you to go through the course at your own pace. It is aimed at researchers, students, trainers, data professionals, and anyone else looking to gain basic knowledge of Open Science, EOSC, and best practices for FAIR data.

ORION Open Science Train-the-Trainer MOOC

The ORION Open Science Train-the-Trainer course is intended to guide you in how to facilitate and run training on Open Science. The course covers the theoretical underpinnings of adult education as well as practical methods and techniques to use in training events. From didactics to video creation, and from audience profiles to Brainwalking, there is a range of materials, media, and activities intended to strengthen your abilities as a training facilitator, both face-to-face and online.

ORION MOOC for Open Science in the Life Sciences!

This course is an introduction to Open Science principles in biomedicine, life sciences and other related research fields. It is intended to help scientists to share their research with the world more effectively.

Data Skills Modules

These introductory level interactive modules are designed for users who want to get to grips with key aspects of survey, longitudinal and aggregate data.

Modules can be completed in your own time, and you can dip in and out as needed. The modules introduce key aspects of the data using short instructional videos, interactive quizzes, and activities that use open access software where possible.

Each module stands alone but those with little experience of surveys may find it useful to start with the Survey Data Module before moving on to the Longitudinal Data Module.

Modules include: Survey Data, Longitudinal Data, Aggregate Data