The SSH Training Discovery Toolkit provides an inventory of training materials relevant for the Social Sciences and Humanities.

Tools for normalization


This is a list of tools for text normalization that are available as part of the CLARIN Resource Families initiative.

Text normalization is the process of transforming parts of a text into a single canonical form. It is a key stage of linguistic processing for texts in which spelling varies widely or deviates from the contemporary norm, such as historical documents or social media posts. After normalization, standard tools can be used for all further stages of text processing. Another important advantage is improved search: a query for a single, standard variant can also retrieve all of its spelling variants, whether historical, dialectal, colloquial or slang.
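To make the idea concrete, here is a minimal Python sketch of dictionary-based normalization and the search benefit it brings. It is not taken from any of the listed tools: the variant lexicon, the sample documents, and the function names (normalize_token, normalize_text, build_index, search) are all hypothetical, and real normalizers typically rely on much larger lexicons, rules, or statistical models.

```python
import re
from collections import defaultdict

# Hypothetical variant lexicon (an assumption for illustration only):
# spelling variant -> canonical form.
VARIANTS = {
    "vniuersitie": "university",  # historical spelling
    "universitie": "university",  # historical spelling
    "uni": "university",          # colloquial
    "u": "you",                   # social media
    "gr8": "great",               # social media
}

TOKEN_RE = re.compile(r"\w+")


def normalize_token(token):
    """Lowercase a token and map it to its canonical form if one is known."""
    lowered = token.lower()
    return VARIANTS.get(lowered, lowered)


def normalize_text(text):
    """Split text into word tokens and normalize each one."""
    return [normalize_token(t) for t in TOKEN_RE.findall(text)]


def build_index(documents):
    """Map each canonical form to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for form in normalize_text(text):
            index[form].add(doc_id)
    return index


def search(index, query):
    """Look up a single-word query by its canonical form."""
    return sorted(index.get(normalize_token(query), set()))


# Made-up sample documents mixing historical, colloquial and standard forms.
DOCS = [
    "The vniuersitie of Oxford",
    "c u at uni, gr8 news",
    "A letter from the university",
    "Nothing relevant here",
]
INDEX = build_index(DOCS)
```

With this index, a query for the standard form "university" returns documents 0, 1 and 2, even though only document 2 contains that exact spelling.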

Free access
Access conditions
Some resources may require registration and/or personal access rights in addition to signing in