References
Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu & Mark Ziemann.
2021. Gene name errors: Lessons not learned. PLOS Computational
Biology. Public Library of Science 17(7). e1008984. https://doi.org/10.1371/journal.pcbi.1008984.
Acheson, Daniel J., Justine B. Wells & Maryellen C. MacDonald. 2008.
New and updated tests of print exposure and reading abilities in college
students. Behavior Research Methods 40(1). 278–289. https://doi.org/10.3758/brm.40.1.278.
Alhazmi, Fahd. 2020. A visual interpretation of the standard deviation.
Medium. https://towardsdatascience.com/a-visual-interpretation-of-the-standard-deviation-30f4676c291c.
Almeida, Alexandre, Adam Loy & Heike Hofmann. 2018. ggplot2
compatible quantile-quantile plots in R. The R
Journal 10(2). 248–261. https://doi.org/10.32614/RJ-2018-051.
alvinashcraft, alexbuckgit, ArcticLampyrid & bearmannl. 2022.
Maximum path length limitation. Learn Microsoft. https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation.
Baayen, R. Harald. 2008. Analyzing linguistic data: A practical
introduction to statistics using R. Cambridge
University Press.
Barrett, Malcolm. 2018. Why should I use the here package
when I’m already using projects? https://malco.io/articles/2018-11-05-why-should-i-use-the-here-package-when-i-m-already-using-projects.
Ben-Shachar, Mattan, Daniel Lüdecke & Dominique Makowski. 2020.
Effectsize: Estimation of effect size indices and standardized
parameters. Journal of Open Source Software 5(56). 2815. https://doi.org/10.21105/joss.02815.
Berez-Kroeker, Andrea L., Bradley McDonnell, Eve Koller & Lauren B.
Collister. 2022. The Open Handbook of
Linguistic Data Management.
MIT Press. https://doi.org/10.7551/mitpress/12200.001.0001.
Bochynska, Agata, Liam Keeble, Caitlin Halfacre, Joseph V. Casillas,
Irys-Amélie Champagne, Kaidi Chen, Melanie Röthlisberger, Erin M.
Buchanan & Timo B. Roettger. 2023. Reproducible research practices
and transparency across linguistics. Glossa Psycholinguistics
2(1). https://doi.org/10.5070/G6011239.
Breheny, Patrick & Woodrow Burchett. 2017. Visualization of
Regression Models Using visreg. The R Journal 9(2). 56. https://doi.org/10.32614/RJ-2017-046.
Bryan, Jennifer. 2018. Let’s Git started: Happy
Git and GitHub for the useR. Open
Education Resource. https://happygitwithr.com/.
Bryan, Jenny. 2017. Project-oriented workflow. Tidyverse.org.
https://www.tidyverse.org/blog/2017/12/workflow-vs-script/.
Busterud, Guro, Anne Dahl, Dave Kush & Kjersti Faldet Listhaug.
2023. Verb placement in L3 French and L3 German: The role of
language-internal factors in determining cross-linguistic influence from
prior languages. Linguistic Approaches to Bilingualism. John Benjamins
13(5). 693–716. https://doi.org/10.1075/lab.22058.bus.
Center for Open Science. 2025. Choosing the Right Preregistration
Template: A Guide for Researchers. https://www.cos.io/blog/choosing-preregistration-template-guide-for-researchers.
Çetinkaya-Rundel, Mine & Johanna Hardin. 2021. Introduction to
modern statistics. Second. Leanpub. https://openintro-ims.netlify.app/.
Cleveland, William S. & Robert McGill. 1987. Graphical perception:
The visual decoding of quantitative information on graphical displays of
data. Journal of the Royal Statistical Society: Series A
(General) 150(3). 192–210. https://doi.org/10.2307/2981473.
Cohen, Jacob. 1988. Statistical power analysis for the behavioral
sciences. 2. ed., reprint. New York, NY: Psychology Press.
Consortium, TEI. 2025. TEI P5: Guidelines for electronic text encoding
and interchange. Zenodo. https://doi.org/10.5281/zenodo.17161156.
Dąbrowska, Ewa. 2019. Experience, aptitude, and individual differences
in linguistic attainment: A comparison of native and nonnative speakers.
Language Learning 69(S1). 72–100. https://doi.org/10.1111/lang.12323.
Dauber, Daniel. 2024. R for non-programmers: A guide for social
scientists. Open Education Resource. https://bookdown.org/daniel_dauber_io/r4np_book/.
Douglas, Alex, Deon Roos, Francesca Mancini & David Lusseau. 2024.
An introduction to R. https://intro2r.com/.
Ekman, Paul & Wallace V Friesen. 1978. Facial action coding system.
Environmental Psychology & Nonverbal Behavior.
Few, Stephen. 2007. Save the pies for dessert. http://www.perceptualedge.com/articles/08-21-07.pdf.
Field, Andy P., Jeremy Miles & Zoë Field. 2012. Discovering
statistics using R. Sage.
Fox, John & Sanford Weisberg. 2019. An R companion
to applied regression. Third edition. SAGE.
Fricke, Lea, Patrick G Grosz & Tatjana Scheffler. 2024. Semantic
differences in visually similar face emojis. Language and
Cognition. Cambridge University Press 1–15. https://doi.org/10.1017/langcog.2024.12.
Fugate, Jennifer MB & Courtny L Franco. 2021. Implications for
emotion: Using anatomically based facial coding to compare emoji faces
across platforms. Frontiers in Psychology. Frontiers Media SA
12. 605928. https://doi.org/10.3389/fpsyg.2021.605928.
Garnier, Simon, Noam Ross, BoB Rudis, Antoine Filipovic-Pierucci, Tal
Galili, Timelyportfolio, Alan O’Callaghan, et al. 2023.
Sjmgarnier/viridis: CRAN release v0.6.3. Zenodo. https://doi.org/10.5281/ZENODO.4679423.
Gelman, Andrew. 2018. Ethics in statistical practice and communication:
Five recommendations. Significance 15(5). 40–43. https://doi.org/10.1111/j.1740-9713.2018.01193.x.
Gelman, Andrew. 2019. Embracing variation and accepting uncertainty:
Implications for science and metascience. https://www.youtube.com/watch?v=VQCcMP4A5Ks.
Good, Jeff. 2022. The scope of linguistic data. In Andrea L.
Berez-Kroeker, Bradley McDonnell, Eve Koller & Lauren B. Collister
(eds.), The open handbook of linguistic data management, 27–47.
MIT Press. https://doi.org/10.7551/mitpress/12200.001.0001.
Gries, Stefan Th. & Nick C. Ellis. 2015. Statistical measures for
usage-based linguistics. Language Learning 65(S1). 228–255. https://doi.org/10.1111/lang.12119.
Gries, Stefan Thomas. 2021. Statistics for linguistics with
R: A practical introduction (De Gruyter Mouton
Textbook). 3rd revised edition. de Gruyter Mouton.
Grömping, Ulrike. 2006. Relative Importance for Linear Regression
in R: The Package relaimpo. Journal of Statistical
Software 17(1). https://doi.org/10.18637/jss.v017.i01.
Grosz, Patrick Georg, Gabriel Greenberg, Christian De Leon & Elsi
Kaiser. 2023. A semantics of face emoji in discourse. Linguistics
and Philosophy. Springer 46(4). 905–957. https://doi.org/10.1007/s10988-022-09369-8.
Haroz, Steve. 2022. Comparison of preregistration platforms. https://doi.org/10.31222/osf.io/zry2u.
Harrell, Frank E. 2015. Regression modeling strategies: With
applications to linear models, logistic and ordinal regression, and
survival analysis (Springer Series in Statistics). Springer
International Publishing. https://doi.org/10.1007/978-3-319-19425-7.
Horst, Allison & Julie Lowndes. 2020. Openscapes - Tidy
data for efficiency, reproducibility, and collaboration. https://openscapes.org/blog/2020-10-12-tidy-data/.
Hvitfeldt, Emil. 2021. Paletteer: Comprehensive collection of color
palettes. https://github.com/EmilHvitfeldt/paletteer.
iris-database.org. 2011. IRIS. https://iris-database.org/.
Kaufman, Allison B. & James C. Kaufman (eds.). 2018. The illusion of
causality: A cognitive bias underlying pseudoscience. In
Pseudoscience. The MIT Press. https://doi.org/10.7551/mitpress/10747.003.0007.
Kung, Susan Smythe. 2022. Developing a data management plan. In Andrea
L. Berez-Kroeker, Bradley McDonnell, Eve Koller & Lauren B.
Collister (eds.), The open handbook of linguistic data
management, 101–115. MIT Press. https://doi.org/10.7551/mitpress/12200.001.0001.
Lakens, Daniël. 2022. Improving your statistical inferences.
Zenodo. https://doi.org/10.5281/ZENODO.6409077.
Lausberg, Hedda & Han Sloetjes. 2009. Coding gestural behavior with
the NEUROGES-ELAN system. Behavior Research Methods 41(3).
841–849. https://doi.org/10.3758/BRM.41.3.841.
Le Foll, Elen. 2022. Textbook English: A
corpus-based analysis of the language of EFL textbooks used in secondary
schools in France, Germany and
Spain. Osnabrück University PhD thesis. https://doi.org/10.48693/278.
Lenth, Russell V. 2025. Emmeans: Estimated marginal means, aka
least-squares means. https://rvlenth.github.io/emmeans/.
Levshina, Natalia. 2015. How to do linguistics with R:
Data exploration and statistical analysis. John Benjamins.
Levshina, Natalia. 2022. Comparing Bayesian and Frequentist Models of
Language Variation: The Case of Help + (to-)Infinitive. In Ole Schützler
& Julia Schlüter (eds.), 224–258. 1st edn. Cambridge University
Press. https://doi.org/10.1017/9781108589314.009.
Lindeman, Richard Harold, Peter Francis Merenda & Ruth Z. Gold.
1980. Introduction to bivariate and multivariate analysis.
Scott, Foresman.
Lüdecke, Daniel. 2020. sjPlot: Data visualization for statistics in
social science. https://CRAN.R-project.org/package=sjPlot.
Lüdecke, Daniel, Mattan S. Ben-Shachar, Indrajeet Patil, Philip Waggoner
& Dominique Makowski. 2021. performance:
An R package for assessment, comparison and testing of
statistical models. Journal of Open Source Software 6(60).
3139. https://doi.org/10.21105/joss.03139.
Lüdecke, Daniel, Indrajeet Patil, Mattan S. Ben-Shachar, Brenton M.
Wiernik, Philip Waggoner & Dominique Makowski. 2021. See: An
R package for visualizing statistical models. Journal
of Open Source Software 6(64). 3393. https://doi.org/10.21105/joss.03393.
Maier, Emar. 2023. Emojis as pictures. Ergo 10. https://doi.org/10.3998/ergo.4641.
Matejka, Justin & George Fitzmaurice. 2017. Same stats, different
graphs: Generating datasets with varied appearance and identical
statistics through simulated annealing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 1290–1294. New York, NY, USA:
Association for Computing Machinery. https://doi.org/10.1145/3025453.3025912.
Matute, Helena, Fernando Blanco, Ion Yarritu, Marcos Díaz-Lago, Miguel
A. Vadillo & Itxaso Barberia. 2015. Illusions of causality: How they
bias our everyday thinking and how they could be reduced. Frontiers
in Psychology. Frontiers 6. https://doi.org/10.3389/fpsyg.2015.00888.
Mertzen, Daniela, Sol Lago & Shravan Vasishth. 2021. The benefits of
preregistration for hypothesis-driven bilingualism research.
Bilingualism: Language and Cognition 24(5). 807–812. https://doi.org/10.1017/S1366728921000031.
Mizumoto, Atsushi. 2023. Calculating the relative importance of multiple
regression predictor variables using dominance analysis and random
forests. Language Learning 73(1). 161–196. https://doi.org/10.1111/lang.12518.
Mizumoto, Atsushi & Luke Plonsky. 2016. R as a lingua franca:
Advantages of using R for quantitative research in applied linguistics.
Applied Linguistics 37(2). 284–291. https://doi.org/10.1093/applin/amv025.
Neuwirth, Erich. 2022. Package “RColorBrewer.”
ColorBrewer palettes 991. https://cran.r-project.org/web/packages/RColorBrewer/RColorBrewer.pdf.
Nicenboim, Bruno, Daniel Schad & Shravan Vasishth. 2026.
Introduction to Bayesian Data Analysis for cognitive science
(Chapman & Hall/CRC Statistics in the social and behavioral sciences
series). Boca Raton London New York: CRC Press, Taylor & Francis
Group. https://doi.org/10.1201/9780429342646.
Nimon, Kim F. 2012. Statistical assumptions of substantive analyses
across the general linear model: A mini-review. Frontiers in
Psychology 3. https://doi.org/10.3389/fpsyg.2012.00322.
Ou, Jianhong. 2021. colorBlindness: Safe color set for color
blindness. https://CRAN.R-project.org/package=colorBlindness.
Paquot, Magali, Alexander König, Egon W. Stemle & Jennifer-Carmen
Frey. 2024. The core metadata schema for learner corpora (LC-meta):
Collaborative efforts to advance data discoverability, metadata quality
and study comparability in L2 research. International Journal of
Learner Corpus Research 10(2). 280–300. https://doi.org/10.1075/ijlcr.24010.paq.
Parsons, Sam, Flávio Azevedo, Mahmoud M. Elsherif, Samuel Guay, Owen N.
Shahim, Gisela H. Govaart, Emma Norris, et al. 2022. A community-sourced
glossary of open scholarship terms. Nature Human
Behaviour. Nature 6(3). 312–318. https://doi.org/10.1038/s41562-021-01269-4.
Pedersen, Thomas Lin & Maxim Shemanarev. 2024. Ragg: Graphic
devices based on AGG. https://ragg.r-lib.org.
Pfadenhauer, Katrin & Evelyn Wiesinger (eds.). 2024. Romance
motion verbs in language change: Grammar, lexicon, discourse. De
Gruyter. https://doi.org/10.1515/9783111248141.
Pfeifer, Valeria A, Emma L Armstrong & Vicky Tzuyin Lai. 2022. Do
all facial emojis communicate emotion? The impact of facial emojis on
perceived sender emotion and text processing. Computers in Human
Behavior. Elsevier 126. 107016. https://doi.org/10.1016/j.chb.2021.107016.
Plonsky, Luke & Frederick L. Oswald. 2014. How big is
“big”? Interpreting effect sizes in L2 research.
Language Learning 64(4). 878–912. https://doi.org/10.1111/lang.12079.
Prat, Chantel S., Tara M. Madhyastha, Malayka J. Mottarella &
Chu-Hsuan Kuo. 2020. Relating natural language aptitude to individual
differences in learning programming languages. Scientific
Reports. Nature 10(1). 3817. https://doi.org/10.1038/s41598-020-60661-8.
R Core Team. 2024. R: A language and environment for statistical
computing. R Foundation for Statistical Computing. https://www.R-project.org/.
Roettger, Timo B. 2021. Preregistration in experimental linguistics:
Applications, challenges, and limitations. Linguistics. De Gruyter
59(5). 1227–1249. https://doi.org/10.1515/ling-2019-0048.
Scheffler, Tatjana & Ivan Nenchev. 2024. Affective, semantic,
frequency, and descriptive norms for 107 face emojis. Behavior
Research Methods. Springer 1–22. https://doi.org/10.3758/s13428-024-02444-x.
Schimke, Sarah, Israel de la Fuente, Barbara Hemforth & Saveria
Colonna. 2018. First language influence on second language offline and
online ambiguous pronoun resolution. Language Learning 68(3).
744–779. https://doi.org/10.1111/lang.12293.
Schweinberger, Martin. 2022. Data management, version control, and
reproducibility. https://ladal.edu.au/repro.html.
Seibold, Heidi & Rabea Müller. BERD course: Make your research
reproducible. https://doi.org/10.17605/OSF.IO/RUPT7.
Silge, Julia. 2022. Janeaustenr: Jane Austen’s complete
novels. https://CRAN.R-project.org/package=janeaustenr.
Smith, Gary. 2018. Step away from stepwise. Journal of Big
Data. SpringerOpen 5(1). 1–12. https://doi.org/10.1186/s40537-018-0143-6.
Sonderegger, Morgan. 2023. Regression modeling for linguistic
data. The MIT Press.
Sóskuthy, Márton. Generalised additive mixed models for dynamic analysis
in linguistics: A practical introduction. https://doi.org/10.48550/arXiv.1703.05339.
Stefanowitsch, Anatol & Susanne Flach. 2017. The corpus-based
perspective on entrenchment. In Hans-Jörg Schmid (ed.), Entrenchment
and the psychology of language learning: How we reorganize and adapt
linguistic knowledge, 101–127. De Gruyter. https://doi.org/10.1037/15969-006.
Tabachnick, Barbara G. & Linda S. Fidell. 2014. Using
multivariate statistics (Always Learning). Pearson new
international edition, sixth edition. Pearson.
The Turing Way Community. 2022. The Turing
Way: A handbook for reproducible, ethical and collaborative
research (1.0.2). Zenodo. https://doi.org/10.5281/zenodo.3233853.
Thompson, Bruce. 1995. Stepwise regression and stepwise discriminant
analysis need not apply here: A guidelines editorial. Educational
and Psychological Measurement. SAGE 55(4). 525–534. https://doi.org/10.1177/0013164495055004001.
Van Hulle, Sven & Renata Enghels. 2024a. The category of throw verbs
as productive source of the Spanish inchoative construction. In Katrin
Pfadenhauer & Evelyn Wiesinger (eds.), Romance motion verbs in
language change, 213–240. De Gruyter. https://doi.org/10.1515/9783111248141-009.
Van Hulle, Sven & Renata Enghels. 2024b. TROLLing replication data
for: “The category of throw verbs as productive source of the
Spanish inchoative construction.” DataverseNO, V1. https://doi.org/10.18710/TR2PWJ.
Vasishth, Shravan & Andrew Gelman. 2021. How to embrace variation
and accept uncertainty in linguistic and psycholinguistic data analysis.
Linguistics 59(5). 1311–1342. https://doi.org/10.1515/ling-2019-0051.
Wickham, Hadley. 2016. ggplot2: Elegant graphics for data
analysis. New York: Springer. https://ggplot2.tidyverse.org.
Wickham, Hadley, Mine Çetinkaya-Rundel & Garrett Grolemund. 2023.
R for data science: Import, tidy, transform, visualize, and model
data. 2nd edition. O’Reilly. https://r4ds.hadley.nz/.
Wickham, Hadley, Romain François & Lucy D’Agostino McGowan. 2024.
Emo: Easily insert ’emoji’. https://github.com/hadley/emo.
Wickham, Hadley, Davis Vaughan & Maximilian Girlich. Tidy messy
data. https://tidyr.tidyverse.org/.
Wilkinson, Leland. 2005. The Grammar of
Graphics (Statistics and Computing). New York:
Springer. https://doi.org/10.1007/0-387-28695-0.
Williams, Matt N., Carlos Alberto Gómez Grajales & Dason Kurkiewicz.
2013. Assumptions of multiple regression: Correcting two misconceptions.
Practical Assessment, Research, and Evaluation 18(11).
Windhouwer, Menzo & Twan Goosen. 2022. Component metadata
infrastructure. In Darja Fišer & Andreas Witt (eds.), CLARIN:
The infrastructure for language resources, 191–222. De Gruyter. https://doi.org/10.1515/9783110767377-008.
Winter, Bodo. 2020. Statistics for linguists: An introduction using
R. Routledge. https://doi.org/10.4324/9781315165547.
Withers, Peter. 2012. Metadata management with arbil. In V. D. Arranz,
B. Broeder, M. Gaiffe, M. Gavrilidou & M. Monachini (eds.),
Proceedings of the workshop describing LRs with metadata: Towards
flexibility and interoperability in the documentation of LR, 72–75.
European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2012/workshops/11.LREC2012%20Metadata%20Proceedings.pdf#page=79.
Ye, Jiachu, Xiaoyan Lai & Gary Ka-Wai Wong. 2022. The transfer
effects of computational thinking: A systematic review with
meta-analysis and qualitative synthesis. Journal of Computer
Assisted Learning 38(6). 1620–1638. https://doi.org/10.1111/jcal.12723.
Ziemann, Mark, Yotam Eren & Assam El-Osta. 2016. Gene name errors
are widespread in the scientific literature. Genome Biology
17(1). 177. https://doi.org/10.1186/s13059-016-1044-7.