Language resources
On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
All (1397)
Collections (32)
Corpora (1236)
Lexicons (84)
Training and evaluation data (27)
Models (50)
Resource
Type
Language
Access
Dependency parsing model: Stanza
Pretrained models for dependency parsing.
Model
Swedish
Dataset:
synt_stanza_eval.zip
2020-12-09 – 99.05 MB – CC-BY-4.0
Dataset:
synt_stanza_full2.zip
2020-12-09 – 99.17 MB – CC-BY-4.0
Dataset:
stanza_pretrain.zip
2025-02-20 – 91.7 MB – CC-BY-4.0
Collection
Kubord-fasttext
A collection of fasttext models trained on modern newspaper texts from the National Library of Sweden
Model
Swedish
See 12 collected resources
Kubord-fasttext - Aftonbladet 2010–2022 - lemma
Fasttext model trained on Aftonbladet 2010–2022
Model
Swedish
Dataset:
kubord-fasttext-afb-2010-2022-lemma.zip
2024-08-05 – 2.94 GB – CC-BY-4.0
Kubord-fasttext - Aftonbladet 2010–2022 - token
Fasttext model trained on Aftonbladet 2010–2022
Model
Swedish
Dataset:
kubord-fasttext-afb-2010-2022-token.zip
2024-06-11 – 3.18 GB – CC-BY-4.0
Kubord-fasttext - Aftonbladet 2010–2024 - lemma
Fasttext model trained on Aftonbladet 2010–2024
Model
Swedish
Dataset:
kubord-fasttext-afb-2010-2024-lemma.zip
2025-06-18 – 3 GB – CC-BY-4.0
Kubord-fasttext - Aftonbladet 2010–2024 - token
Fasttext model trained on Aftonbladet 2010–2024
Model
Swedish
Dataset:
kubord-fasttext-afb-2010-2024-token.zip
2025-06-18 – 3.17 GB – CC-BY-4.0
Kubord-fasttext - Dagens Nyheter 2010–2022 - lemma
Fasttext model trained on Dagens Nyheter 2010–2022
Model
Swedish
Dataset:
kubord-fasttext-dn-2010-2022-lemma.zip
2024-08-05 – 2.81 GB – CC-BY-4.0
Kubord-fasttext - Dagens Nyheter 2010–2022 - token
Fasttext model trained on Dagens Nyheter 2010–2022
Model
Swedish
Dataset:
kubord-fasttext-dn-2010-2022-token.zip
2024-06-11 – 3.1 GB – CC-BY-4.0
Kubord-fasttext - Dagens Nyheter 2010–2024 - lemma
Fasttext model trained on Dagens Nyheter 2010–2024
Model
Swedish
Dataset:
kubord-fasttext-dn-2010-2024-lemma.zip
2025-06-18 – 2.9 GB – CC-BY-4.0
Kubord-fasttext - Dagens Nyheter 2010–2024 - token
Fasttext model trained on Dagens Nyheter 2010–2024
Model
Swedish
Dataset:
kubord-fasttext-dn-2010-2024-token.zip
2025-06-18 – 3.1 GB – CC-BY-4.0
Kubord-fasttext - Göteborgsposten 2013–2022 - lemma
Fasttext model trained on Göteborgsposten 2013–2022
Model
Swedish
Dataset:
kubord-fasttext-gp-2013-2022-lemma.zip
2024-08-05 – 2.69 GB – CC-BY-4.0
Kubord-fasttext - Göteborgsposten 2013–2022 - token
Fasttext model trained on Göteborgsposten 2013–2022
Model
Swedish
Dataset:
kubord-fasttext-gp-2013-2022-token.zip
2024-06-11 – 2.84 GB – CC-BY-4.0
Kubord-fasttext - Göteborgsposten 2013–2024 - lemma
Fasttext model trained on Göteborgsposten 2013–2024
Model
Swedish
Dataset:
kubord-fasttext-gp-2013-2024-lemma.zip
2025-06-18 – 2.74 GB – CC-BY-4.0
Kubord-fasttext - Göteborgsposten 2013–2024 - token
Fasttext model trained on Göteborgsposten 2013–2024
Model
Swedish
Dataset:
kubord-fasttext-gp-2013-2024-token.zip
2025-06-18 – 2.89 GB – CC-BY-4.0
Lemmatization model: Stanza
Pretrained model for lemmatization.
Model
Swedish
Dataset:
lem_stanza.zip
2020-11-19 – 3.74 MB – CC-BY-4.0
POS-tagging model: Flair
Pretrained models for POS-tagging.
Model
Swedish
Dataset:
flair_eval.zip
2020-06-18 – 1.37 GB – CC-BY-4.0
Dataset:
flair_full.zip
2020-06-18 – 1.37 GB – CC-BY-4.0
POS-tagging model: Marmot
Pretrained models for POS-tagging.
Model
Swedish
Dataset:
marmot_eval.marmot
2020-06-29 – 108.59 MB – CC-BY-4.0
Dataset:
marmot_full.marmot
2020-06-29 – 113.41 MB – CC-BY-4.0
Dataset:
saldo_marmot.txt
2020-06-29 – 46.33 MB – CC-BY-4.0
POS-tagging model: Stanza
Pretrained models for POS-tagging.
Model
Swedish
Dataset:
morph_stanza_eval.zip
2020-12-09 – 19.94 MB – CC-BY-4.0
Dataset:
morph_stanza_full2.zip
2020-12-09 – 20.19 MB – CC-BY-4.0
Dataset:
stanza_pretrain.zip
2025-02-20 – 91.7 MB – CC-BY-4.0
Pretrained embeddings
A list of pretrained embeddings for Swedish
Model
Swedish
sbx/KB-bert-base-swedish-cased_PI-detection-basic
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-basic
113.22 KB – GPL-3.0
sbx/KB-bert-base-swedish-cased_PI-detection-basic-iob
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-basic-iob
113.4 KB – GPL-3.0
sbx/KB-bert-base-swedish-cased_PI-detection-detailed
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-detailed
113.68 KB – GPL-3.0
sbx/KB-bert-base-swedish-cased_PI-detection-detailed-iob
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-detailed-iob
113.92 KB – GPL-3.0
sbx/KB-bert-base-swedish-cased_PI-detection-general
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-general
113.4 KB – GPL-3.0
sbx/KB-bert-base-swedish-cased_PI-detection-general-iob
A model based on KB/bert-base-swedish-cased, trained to detect personal information, particularly in student essays.
Model
Swedish
Dataset:
KB-bert-base-swedish-cased_PI-detection-general-iob
113.73 KB – GPL-3.0
Swedish Diachronic Word Embeddings
Swedish Diachronic Word Embedding Models Trained on Historical Newspaper Data
Model
Swedish
Dataset:
HENGCHEN-TAHMASEBI_-_2020_-_Kubhist2_diachronic_embeddings.zip
2024-01-25 – 15.13 GB – CC-BY-4.0
Word Embeddings trained on English Wikipedia
Word Embeddings trained on English Wikipedia
Model
English
Dataset:
wiki_300_5_word2vec.model
2024-01-25 – 112.01 MB – CC-BY-4.0
Dataset:
wiki_300_5_word2vec.model.syn1neg.npy
2024-01-25 – 3.75 GB – CC-BY-4.0
Dataset:
wiki_300_5_word2vec.model.wv.vectors.npy
2024-01-25 – 3.75 GB – CC-BY-4.0
Dataset:
wiki_300_50_word2vec.model
2024-01-25 – 28.04 MB – CC-BY-4.0
Dataset:
wiki_300_50_word2vec.model.syn1neg.npy
2024-01-25 – 949.26 MB – CC-BY-4.0
Dataset:
wiki_300_50_word2vec.model.wv.vectors.npy
2024-01-25 – 949.26 MB – CC-BY-4.0