Resources
The special interest group for Uralic languages hosts an up-to-date list of resources for Uralic languages. This page tries to capture the state of computational resources for Uralic languages, using linkable, downloadable and usable resources as references. For a full list of available resources, users are advised to turn to services such as meta-share.
Please help us keep the list up to date: send information about new resources and fixes to current ones to our ticket tracking system on GitHub.
By Language
Finnish
Finnish keyboard
- Kotoistus keyboard layout, for national (SFS 5966) and international standards
Comes with all common OSes and systems: Microsoft’s, Linux, Apple’s and Android-based.
Finnish Morphology
- Omorfi (see also: apertium-fin, giella-fin)
- Voikko (also: suomi-malaga, vfst morphology)
- GF Finnish
- UralicNLP (uses Omorfi)
Finnish Treebanks
- Universal Dependencies Finnish (see also: Turku dependency treebank)
- Universal Dependencies Finnish FTB (see also: FinnTreeBanks)
Finnish Machine Translation
- Apertium Finnish-English (high coverage, low quality)
- GF Finnish to any
North Saami
North Saami keyboards
- Official keyboard layout, for national and international standards
Comes with all common OSes and systems: Microsoft’s, Linux, Apple’s and Android-based.
Hungarian
Hungarian morphologies
By Resource
Morphology
- Omorfi (see also: apertium-fin, giella-fin)
- Voikko (also: suomi-malaga, vfst morphology)
- hunmorph
- GF (Available: Finnish, Hungarian…)
- UralicNLP (Available models/data: Finnish, North Saami…)
Larger collections:
- Universal Dependencies, treebanks and dependency syntax conventions for Finnish, Estonian and Hungarian plus other world languages (includes Uralic guidelines)
- Akusanat, an online dictionary
- UralicNLP resources, different resources for Uralic languages
- OPUS, open source parallel corpora for most of the world’s languages
- Giellatekno, a repository of Uralic analysers and tools for most Uralic languages
- Apertium, machine translation dictionaries including some Uralic languages
- Grammatical Framework, Haskell descriptions of linguistic data, including a few Uralic languages
- Korp at CSC, a corpus search interface for CSC.fi-managed corpora
- Wanca, corpora from the SUKI project on harvesting the internet for Uralic texts
- Voikko, spell-checking for many Uralic languages
- Language Bank of Finland, Finland’s central repository of language resources
- Centre for Estonian Language Resources
- Divvun, writers’ tools for Saami languages and many others