R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart
Dataset posted on 10.03.2019, 06:18 by Gede Primahadi Wijaya Rajeg
Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387
Description of R codes and data files in the repository
This repository is imported from its GitHub repo, and the versioning of this figshare repository follows the GitHub repo's Releases. Check the Releases page for updates (the next version is planned to include a unified version of the first release's codes using the tidyverse).
The raw input data consists of two files (one of them is go_INF.txt). They represent the co-occurrence frequency of the top-200 infinitival collocates for will and be going to, respectively, across the twenty decades of the Corpus of Historical American English (COHA), from the 1810s to the 2000s.
These two input files are used in the R code file 1-script-create-input-data-raw.r. The codes preprocess and combine the two files into a long-format data frame of four columns, including coll (for "collocate"), BE going to (for the frequency of the collocates with be going to) and will (for the frequency of the collocates with will); the resulting data frame is available in the repository.
Then, the second script uses input_data_raw.txt for normalising the co-occurrence frequency of the collocates per million words (the COHA size and the normalising base frequency are available in coha_size.txt). The output from the second script is input_data_futurate.txt, which contains the relevant input data for generating (i) the static motion chart as an image plot in the publication (using the script 3-script-create-motion-chart-plot.R), and (ii) the dynamic motion chart (using a separate script).
The repository adopts the project-oriented workflow in RStudio; double-click on the Future Constructions.Rproj file to open an RStudio session whose working directory is associated with the contents of this repository.
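The two preprocessing steps described above (combining the collocate frequencies into one long-format data frame, then normalising them per million words) can be sketched in base R as follows. This is only an illustration under stated assumptions and not the code in the repository's scripts: the toy data frames stand in for the raw input files and for coha_size.txt, whose actual layouts are not described here, and the names decade, freq, size, per_million, will_pmw and going_to_pmw are hypothetical. All numbers are invented.

```r
# Toy stand-ins for the two raw input files. The layout (decade, coll, freq)
# and every value below are invented for illustration; they are not COHA data.
will_raw <- data.frame(
  decade = c("1810s", "1810s", "1820s"),
  coll   = c("be", "go", "be"),
  freq   = c(200, 50, 300),
  stringsAsFactors = FALSE
)
go_raw <- data.frame(
  decade = c("1810s", "1820s", "1820s"),
  coll   = c("be", "be", "say"),
  freq   = c(2, 6, 3),
  stringsAsFactors = FALSE
)

# Step 1: combine into one long-format data frame with one row per
# decade-collocate pair. merge() puts the key columns first, then the
# frequency column of go_raw, then that of will_raw.
combined <- merge(go_raw, will_raw, by = c("decade", "coll"), all = TRUE)
names(combined) <- c("decade", "coll", "BE going to", "will")

# A collocate unattested with one construction in a decade gets frequency 0.
for (col in c("BE going to", "will")) {
  combined[[col]][is.na(combined[[col]])] <- 0
}

# Toy stand-in for the corpus size per decade (hypothetical values).
sizes <- data.frame(
  decade = c("1810s", "1820s"),
  size   = c(1e6, 2e6),
  stringsAsFactors = FALSE
)

# Step 2: normalise raw frequencies to a rate per `base` words.
per_million <- function(freq, size, base = 1e6) {
  freq / size * base
}

# Look up each row's decade size, then normalise both constructions.
idx <- match(combined$decade, sizes$decade)
combined$will_pmw     <- per_million(combined$will, sizes$size[idx])
combined$going_to_pmw <- per_million(combined[["BE going to"]], sizes$size[idx])

print(combined)
```

Normalising by decade size matters because the decades of a diachronic corpus differ in size, so raw frequencies from different decades are not directly comparable in a chart that moves through time.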
English Department, Faculty of Arts, Udayana University, Indonesia
- Programming Languages
- Digital Humanities
- Data Communications
- English Language
- Language Studies not elsewhere classified
- Language in Time and Space (incl. Historical Linguistics, Dialectology)
- Natural Language Processing
- Linguistic Structures (incl. Grammar, Phonology, Lexicon, Semantics)
- Computational Linguistics
- Linguistics not elsewhere classified