
Voices of formerly Enslaved Corpus

Data citation Information

Språkbanken Text (2026). Voices of formerly Enslaved Corpus (updated: 2026-03-11). [Data set]. Språkbanken Text. https://doi.org/10.23695/p5hw-dr52
A corpus of transcribed and annotated narratives of formerly enslaved people, mainly in the U.S.A.

The Voices of formerly Enslaved Corpus (Pilot)

A corpus of transcribed and annotated narratives from informants who were formerly enslaved. Texts are partly in standard English and partly in vernacular English [eng]. There are two main parts:

  • Selection of narratives collected by the Federal Writers' Project (FWP) in the late 1930s and published in 1941.
  • Selection of narratives collected as part of the Documenting the American South Collection (DocSouth).

The corpus has been developed in collaboration with Språkbanken Text.

For more on annotation, preparation of data, and acknowledgements see:

  • IE, LJO, KR. (202x). Voices of the formerly Enslaved Corpus, Pilot v0.1. [forthcoming][Where], pp nn--nn.

For questions about the corpus:
Klas Rönnbäck klas.ronnback@econhist.gu.se

If you notice any errors or inconsistencies in annotations, please report them to this email address.

Main contributors:

  • Irene Elmerot
    Researcher, University of Gothenburg
  • Leif-Jöran Olsson
    Senior Research Engineer, University of Gothenburg
  • Klas Rönnbäck
    PI and Senior Researcher, University of Gothenburg

Download

File Size Modified Licence
corpus-of-the-formerly-enslaved-Pilot-v0.1.tar.bz2
Voices of formerly Enslaved Corpus, Pilot v0.1 Information (XML)
CC-BY-SA-4.0
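
The download is a bzip2-compressed tar archive. As a minimal sketch, assuming the file has already been saved locally, the following Python snippet lists the first few entries without unpacking the whole archive. The function name `list_archive_members` and the `limit` parameter are illustrative only and are not part of the corpus distribution; the internal layout of the archive is not described on this page, so the snippet prints whatever names it finds.

```python
import tarfile


def list_archive_members(archive_path, limit=10):
    """Return the names of up to `limit` members of a .tar.bz2 archive.

    Hypothetical helper for illustration; not shipped with the corpus.
    """
    names = []
    # "r:bz2" opens a bzip2-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


# Example call, assuming the archive sits in the working directory:
# for name in list_archive_members(
#         "corpus-of-the-formerly-enslaved-Pilot-v0.1.tar.bz2"):
#     print(name)
```

Inspecting the member names first makes it easier to decide which files to extract before unpacking the full archive.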

Type

  • Corpus

Language

English

Size

Tokens: 10,323,284
Sentences: 405,274

Created

2025-07-24

Updated

2026-03-11

Contact

klas.ronnback@econhist.gu.se