This readme file was created on 2020-06-26 by Sune Gregersen.

===========================
== GENERAL INFORMATION ==
===========================

This dataset, entitled "Corpus data for Early English modals" (DOI: 10.21942/uva.12568559), contains material analysed for the PhD dissertation "Early English modals: Form, function, and analogy" (University of Amsterdam, 2020). The research project was carried out at the University of Amsterdam from 2015 to 2019 by Sune Gregersen (ORCID: 0000-0002-3387-4340) under the supervision of Prof. Olga Fischer. The research was supported by the Dutch Research Council (NWO) under project number 326-70-001.

The dissertation consists of a theoretical and methodological introduction followed by four interconnected studies on the early history of the English modals, in particular their morphosyntactic and semantic properties in the Old English (c. AD 800–1100) and Middle English periods (c. AD 1100–1500). The corpus material in this dataset was used for these four studies.

====================
== DATA SOURCES ==
====================

The data were gathered from a number of electronic corpora and repositories of Old and Middle English as well as historical and contemporary Danish texts. Material from the following corpora and repositories was used:

ADL = Arkiv for Dansk Litteratur. 2001–2017. Copenhagen: Det Danske Sprog- og Litteraturselskab.
Burnley, David, and Alison Wiggins (eds.). 2005 [1977]. The Auchinleck Manuscript. Originally published by Scolar Press in association with the National Library of Scotland. Downloaded from the Oxford Text Archive on 2016-04-12.
CMEPV = Corpus of Middle English Prose and Verse. 2006. Humanities Text Initiative, University of Michigan.
DOEC = di Paolo Healey, Antonette (ed.). 2005 [2000]. Dictionary of Old English Corpus in Electronic Form. Originally published on CD-ROM by the Dictionary of Old English Project. Downloaded from the Oxford Text Archive on 2016-07-26.
HC = Helsinki Corpus of English Texts (Diachronic Part). 1991. Helsinki. Downloaded from the Oxford Text Archive on 2016-03-04.
ICEL = Markus, Manfred (ed.). 2009. Innsbruck Corpus of English Letters, version 2.1 [Part of the Innsbruck Computer Archive of Machine-Readable English Texts]. Institut für Anglistik, Universität Innsbruck. CD-ROM.
ICMEP = Markus, Manfred (ed.). 2010. Innsbruck Corpus of Middle English Prose, version 2.4 [Part of the Innsbruck Computer Archive of Machine-Readable English Texts]. Institut für Anglistik, Universität Innsbruck. CD-ROM.
KorpusDK, version 1512. 2019. Copenhagen: Det Danske Sprog- og Litteraturselskab.
PPCME2 = Kroch, Anthony, and Ann Taylor (eds.). 2000. Penn-Helsinki Parsed Corpus of Middle English, 2nd edition. Department of Linguistics, University of Pennsylvania. CD-ROM.
Project Gutenberg. 1971–2020.
Tekstnet = Nielsen, Marita Akhøj (ed.). 2018. Tekstnet: Tekster fra Danmarks middelalder og renæssance 1100-1550. Copenhagen: Det Danske Sprog- og Litteraturselskab.

================
== SOFTWARE ==
================

The sources were queried with the following software:

AntConc = Anthony, Laurence. 2014. AntConc, version 3.4.3m.
CoREST = Asmussen, Jørg. 2010–2019. CoREST – Corpus Retrieval System & Tools.
CorpusSearch = Randall, Beth. 2005–2007. CorpusSearch 2: A tool for linguistic research.

The concordances were annotated in Microsoft Excel for Mac, version 16.16.23.

================
== CONTENTS ==
================

Each file in the dataset is included in both .xlsx and .csv format. The files are organized into five folders. One of these (corpus) contains metadata. The remaining four folders (impersonal-modals, dare, can-may, mot) contain one or more files with concordances used for the four main chapters of the dissertation. For details about the analytical categories and information about the selection of material, please refer to the published dissertation.
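As a rough aid for checking a downloaded copy, the folder and file names given in this section can be collected into a small Python sketch. The names are taken from this readme. The flat "<folder>/<name>.<extension>" layout, the reading of oe-corpus, eme-corpus and lme-corpus as file names, and the helper names expected_paths and missing_files are assumptions made for illustration only. They are not part of the dataset.

```python
from pathlib import Path

# Folder and file names as listed in this readme. That each name maps to
# "<folder>/<name>.<ext>" is an assumption, not something the readme states.
DATASET_FILES = {
    "corpus": ["oe-corpus", "eme-corpus", "lme-corpus"],
    "impersonal-modals": ["ppcme2-behove", "ppcme2-mot", "ppcme2-ought"],
    "dare": ["korpusdk-turde"],
    "can-may": ["can-oe", "can-eme", "can-lme", "may-oe", "may-eme", "may-lme"],
    "mot": ["mot-oe", "mot-eme", "mot-lme", "maa-mda"],
}


def expected_paths(root, ext="csv"):
    """Return the paths where each listed file would be expected under root."""
    base = Path(root)
    return [
        base / folder / f"{name}.{ext}"
        for folder, names in DATASET_FILES.items()
        for name in names
    ]


def missing_files(root, ext="csv"):
    """Return the expected paths that do not exist on disk."""
    return [path for path in expected_paths(root, ext) if not path.exists()]
```

Calling missing_files with the path of the unpacked dataset (and ext="xlsx" for the Excel versions) would list any files that are not where this sketch expects them. An empty list means that all listed files were found under the assumed layout.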
corpus
======

Contains a list of sources used for the investigations in Chapters 7 and 8 (see "can-may" and "mot" below). The Old English sources (oe-corpus) are drawn from the DOEC. The Early Middle English (eme-corpus) and Late Middle English (lme-corpus) sources come from several corpora listed above under "Data sources".

impersonal-modals
=================

Material for Chapter 5, "Morphosyntactic changes in Middle English". Consists of three files (ppcme2-behove, ppcme2-mot, ppcme2-ought) containing annotated concordances from the PPCME2 corpus.

dare
====

Material for Chapter 6, "Reconsidering the history of DARE". Consists of a single file (korpusdk-turde) with annotated concordances from the Present-Day Danish corpus KorpusDK.

can-may
=======

Material for Chapter 7, "The development of CAN and MAY". Consists of six files containing annotated concordances from the corpora listed in the folder "corpus". Three of these files contain material relating to the development of CAN (can-oe, can-eme, can-lme). The other three files contain material relating to the development of MAY (may-oe, may-eme, may-lme).

mot
===

Material for Chapter 8, "The development of MOT". Consists of four files with annotated concordances. Three of these four files (mot-oe, mot-eme, mot-lme) contain data from the corpora listed in the folder "corpus". One file (maa-mda) contains data drawn from early Danish sources from ADL and Tekstnet.

===============
== CONTACT ==
===============

Please direct any questions or suggestions concerning this dataset to:

Sune Gregersen
Amsterdam Center for Language and Communication
University of Amsterdam
E-mail: s.gregersen[apenstaartje]uva.nl or sune.gregersen[apenstaartje]gmail.com
ORCID: 0000-0002-3387-4340