15  What’s next? AI-assisted reseaRch?

If you are reading the last chapter of what is by no means a short textbook, you are likely someone who is eager to acquire new skills and knowledge. As such, you are well aware that learning is a process that requires intrinsic motivation, time, and a great deal of effort and patience. Many tech companies, by contrast, are keen for us to believe that this tiring, time-consuming experience of human learning is no longer necessary: So-called ‘artificial intelligence’ (AI) systems are designed to spare us the burden of thinking for ourselves. If the answers to our (research) questions are just a few effortless prompts away, it is legitimate to ask: Is it still worth devoting so much time and effort to acquiring and further developing knowledge and skills in data analysis, programming, and statistics?

This concluding chapter explains why we can answer this question with a decisive “yes”. To this end, we consider the role of AI in research and learning before recapping what we have learnt so far and where to go from there.

15.1 On the technology behind AI 🤖

Contrary to popular belief, ‘artificial intelligence’ is not a technology. It is, and always has been, a marketing term used to sell (or secure research funding for) a wide, diverse, and shifting array of ideas, research projects, systems, and technologies (see Figure 15.1 from Guest et al. (2025); also Rooij et al. (2024)). Despite marketing claims, these systems and technologies are not ‘intelligent’; they cannot think, understand, or reason (see e.g. Quattrociocchi, Capraro & Perc 2025). This is not to say that they do not have valuable applications, but rather that we need to critically evaluate promotional claims and anthropomorphising language (e.g. AI says / thinks / knows / understands).

Figure 15.1: A cartoon set theoretic view of terms commonly used when discussing the superset AI from Guest et al. (2025): Figure 1 (CC BY 4.0)

Popular, commercial chat-based AI products such as ChatGPT, Claude, CoPilot, and Gemini are powered by Large Language Models (LLMs). LLMs are statistical models fitted to huge amounts of training data. Given a prompt, they generate a probable response based on statistical patterns found in that training data, on reinforcement learning from human feedback, and on additional opaque (usually profit-driven) criteria (see e.g. Bender & Hanna 2025). At first sight, the underlying principle of next-token prediction is comparable to the autocomplete functions on our phones and in the Google search bar (see Figure 15.2).

A looping GIF of a screen capture showing someone typing "LLMs are not" into Google and various automated completions pop up like "LLMs are not ai".
Figure 15.2: Autocomplete function on web search query (google.com on 8 February 2026)
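To make the autocomplete analogy concrete, here is a deliberately tiny, frequency-based ‘next-word predictor’ written in base R. This is a toy illustration only: real LLMs operate on sub-word tokens, billions of parameters, and learned attention weights, not raw bigram counts, and the three-sentence ‘corpus’ below is invented for this sketch.

```r
corpus <- c("llms are not ai",
            "llms are not intelligent",
            "llms are statistical models")

# Collect all consecutive word pairs (bigrams), sentence by sentence
bigrams <- unlist(lapply(strsplit(corpus, " "), function(w) {
  paste(w[-length(w)], w[-1])
}))

# 'Predict' the most frequent continuation of a given word
predict_next <- function(word) {
  candidates <- bigrams[startsWith(bigrams, paste0(word, " "))]
  if (length(candidates) == 0) return(NA_character_)
  continuations <- sub("^\\S+ ", "", candidates)
  names(sort(table(continuations), decreasing = TRUE))[1]
}

predict_next("are")  # "not" ("are not" occurs twice, "are statistical" once)
```

The predictor has no notion of meaning whatsoever: it simply returns the continuation that was most frequent in its training data, which is the shared principle behind next-token prediction.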

It goes without saying that modern LLMs are much more powerful predictive text generators than the models that once offered to help us draft our text messages. This is due to four key factors:

  1. The development of a novel algorithmic architecture known as a transformer that is based on attention mechanisms (Vaswani et al. 2017).

  2. The availability and largely unethical, if not outright illegal (see e.g. Samuelson 2023; Lucchi 2024) scraping of huge amounts of training data from the internet, including (academic) books and articles, but also vast amounts of blog posts, social media data, forum discussions, Wikipedia articles, and YouTube videos.

  3. The availability of (relatively) cheap, large-scale computational power (which, however, still comes at a high environmental cost, see e.g. Luccioni et al. 2025; Schön, Hoffmann & Becker 2025).

  4. Reinforcement learning from human feedback, a process whereby crowdsourced human workers mostly from low-income countries (see e.g. Perrigo 2023) provide extensive feedback on LLM outputs to fine-tune models for what humans want to obtain when they query a model (see Bender & Hanna 2025: Ch 3).

Despite its name, OpenAI — the main company behind ChatGPT — does not develop open-source LLMs, nor is the company a not-for-profit initiative. So why have OpenAI and other large tech firms been offering their AI products for free or at prices well below actual running (let alone development) costs? Two reasons are worth considering. First, because one major bottleneck to improving current LLMs is access to new data. Human-generated data is highly valuable and chatbot users are providing lots of it. AI companies are harvesting this data to train the next generation of LLMs. Second, because free or cheap access to AI products encourages us to become reliant on them for all kinds of work-related tasks and personal activities. As they become ubiquitous in our everyday lives, we are naturally inclined to rely on and trust their outputs. It is only a matter of time until subscription prices are hiked up and/or promoted content is (more or less transparently) integrated in model outputs (see e.g. Mühlhoff 2025: 91-97).

Tip: Your turn!

Q15.1 Which of the following analogies have (rightly or wrongly) been used by different scholars to refer to Large Language Models (LLMs)? You may have heard of one or a few of these, but can you guess them all?









🦜 Click on the parrot for a hint.

The term ‘stochastic parrot’ 🦜 is probably the best known LLM metaphor. It was introduced by computational linguist Emily M. Bender and colleagues in 2021 to characterise LLMs as systems that mimic text without true understanding, highlighting their limitations in processing meaning. This metaphor emphasises that LLMs generate outputs based on statistical patterns in their training data, similar to how parrots mimic sounds without comprehension.

In 2022, the cognitive scientist Iris van Rooij published a blog post in which she succinctly explains why she believes that LLMs cannot legitimately be used for academic writing because they essentially “automate plagiarism”.

The ‘spicy autocomplete’ metaphor 🌶️ is difficult to trace back to one or more specific author(s). It suggests that LLMs are just fancy versions of a smartphone’s predictive text, which predicts the next word based on what came before, except that LLMs add some randomness, i.e. spice, to the output. This framing also implies that LLMs are pattern-matching algorithms without real understanding or problem-solving ability (see Groß 2024).

In 2024, Hicks, Humphries & Slater published a paper in the journal ‘Ethics and Information Technology’ entitled ‘ChatGPT is bullshit’, in which they argue that the output of LLMs is best understood as ‘bullshit’ in the philosophical sense described by Frankfurt (2005) because LLMs are indifferent to the truth of their outputs.

‘Synthetic text extruding machines’ is a term that Emily M. Bender and Alex Hanna like to use, e.g. in their 2025 book entitled ‘The AI Con: How to Fight Big Tech’s Hype and Create the Future We Want’. They describe the process of LLM-generated texts by explaining that, “[l]ike an industrial plastic process, language corpora are forced through complicated machinery to produce a product that looks like communicative language, but without any intent or thinking mind behind it.”

As an alternative to well-established metaphors such as the ones listed above, classical philologist Gyburg Uhlmann proposed ‘kitsch’ as a new metaphor to describe the output of LLMs. She argues that ‘kitsch’ “is particularly suitable for analytically illuminating a previously neglected feature of LLM-based images and texts: their tendency to produce homogeneous and average content, which is […] leading to the equalisation of language, style and argument” (Uhlmann 2025).

15.2 On the value of critical thinking

In educational contexts, the challenges brought about by AI have been compared to the introduction of pocket calculators in the 1960s. The comparison is an interesting one. Although calculators are now widely available and perform school-level arithmetic to perfection, we still first teach school pupils to do mathematics without a calculator because we know that this is necessary to develop an understanding of what a calculator does. Following this analogy, we should still teach and learn data analysis, statistics, and programming, even if AI were to generate error-free solutions to these tasks. The crux of the problem is how to convince ourselves that it is worth learning to do things the hard way given how convenient and effortless AI products seem. More on that in Section 15.3 but, for now, let’s return to our calculator analogy and consider another, crucial issue: the reliability of LLM outputs.

A calculator is a deterministic system that will always provide the same, correct answer to a mathematical operation. By contrast, an LLM is a stochastic model that is fitted to generate probable outputs. These are generated by highly complex black-box algorithms that are based on (often illegally and/or unethically acquired) training data, reinforcement learning from human feedback, and additional company-internal fine-tuning, as well as a degree of randomness. This means that, unlike calculators, their outputs are irreproducible and therefore unreliable.
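The contrast between deterministic and stochastic systems can be illustrated in a few lines of R. This is a toy sketch: commercial chatbots do not expose a random seed to their users, so their outputs cannot be ‘fixed’ in the way shown below.

```r
# Deterministic, calculator-style computation: same input, same output
sqrt(2) == sqrt(2)  # always TRUE

# Stochastic generation: sampling yields different outputs on each run
vocab <- c("cat", "sat", "on", "the", "mat")
sample(vocab, size = 3)  # varies from run to run
sample(vocab, size = 3)  # very likely different again

# Reproducibility requires explicitly fixing the random seed
set.seed(42)
run1 <- sample(vocab, size = 3)
set.seed(42)
run2 <- sample(vocab, size = 3)
identical(run1, run2)  # TRUE
```

In our own analyses, `set.seed()` is precisely what makes simulations and resampling procedures reproducible; the black-box nature of commercial LLMs offers no such guarantee.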

A marketing term that tech companies have been pushing to describe this inherent lack of reliability is “hallucination”. However, as LLMs have no way of representing what is true or false, it is misleading to speak of LLMs “hallucinating”. LLMs are more likely to generate outputs that are truthful when they have been trained on a lot of reliable data on the subject but, by definition, they cannot evaluate their sources — which can range from high-quality, peer-reviewed academic journals to satirical Reddit comments written by cheeky teenagers — and, crucially, have no understanding of their contents.

Given that their outputs are both irreproducible and unreliable, commercial AI tools are unsuitable for most research-related activities. For writing, LLMs not only output texts without any form of fact-checking, they also automate plagiarism as they fail to credit the authors whose texts they were trained on. In this context, it is worth noting that any bibliographic references output by LLMs are also randomly generated text strings. They may or may not correspond to real sources. When the sources exist, they may or may not contain (some of) the information regurgitated by the LLMs. We have no way of knowing from the model output.

The same goes for literature reviews: having no access to the training data, this is not a task that we can responsibly delegate to an LLM. To make matters worse, AI systems are known to perpetuate and exacerbate biases (e.g. the ‘Matthew effect,’ see Pooley 2025). High-quality research requires us to put in the intellectual effort of searching, reading, and critically evaluating the literature ourselves. Like writing, this is an integral part of the research process.

Tip: Your turn!

Q15.2 Read Jeff Pooley’s short commentary ‘The Matthew Effect in AI Summary’ (archived version). What does the Matthew Effect refer to?





🦜 Click on the parrot for a hint.

Q15.3 Which term describes the phenomenon whereby the contributions of marginalised female scientists are overlooked or attributed to their male colleagues?





🦜 Click on the parrot for a hint.

Q15.4 According to Pooley (2025), which of the following biases in academia are likely to be aggravated by the use of Large Language Models (LLMs) to summarise academic literature and write research articles?





🦜 Click on the parrot for a hint.

Writing code for data (pre)processing, analysis, visualisation, and statistical modelling is also increasingly becoming part and parcel of research processes in the language sciences. Whilst it may be tempting to outsource (part of) the data analysis process to a machine learning algorithm, an LLM, or an AI agent, time spent examining our data (e.g. cleaning, wrangling, and visualising them) allows us to gain in-depth knowledge of our data, which later helps us spot issues in complex statistical analyses further down the road. It is much easier to critically interpret the outputs of complex models if we have a good intuitive sense of what is plausible based on our data. This is not something that we can responsibly delegate to even the most sophisticated AI.
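A single line of exploratory code is often all it takes to catch a problem that would silently distort a later model. A minimal sketch with invented reaction-time data (the variable names and the data-entry error below are hypothetical):

```r
# Hypothetical reaction-time data (in milliseconds) with one data-entry error
rt_data <- data.frame(
  participant = c("P1", "P2", "P3", "P4", "P5"),
  rt_ms       = c(512, 487, 530, -450, 501)
)

summary(rt_data$rt_ms)        # the negative minimum immediately flags a problem
rt_data[rt_data$rt_ms < 0, ]  # inspect the offending row before modelling
```

A reaction time cannot be negative, so the `summary()` output tells us at a glance that something went wrong at data entry — exactly the kind of plausibility check that gets skipped when the whole pipeline is delegated to an AI system.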

Comic showing a stick person holding a canoe paddle at his side and standing on top of a "big pile of linear algebra" containing a funnel labeled "data" and box labeled "answers". A second stick person stands to the left side of the panel. Person 2: This is your machine learning system?  Person 1: Yup! You pour the data into this big pile of linear algebra, then collect the answers on the other side.  Person 2: What if the answers are wrong?  Person 2: Just stir the pile until they start looking right.

An xkcd comic depicting a pile of algebra as a machine learning algorithm (xkcd CC BY-NC 2.5)

15.3 On the value of human learning

All this is not to say that AI products are never useful. Many people report successfully using AI for designing experiments, writing, and coding — including for (academic) research. Here, two factors are worth considering. First, the less we know about a domain, the more we tend to overestimate the quality of LLM outputs in this domain (see also Tully, Longoni & Appel 2025). When it comes to learning how to code and do statistics, political scientist, R package developer, and educator Andrew Heiss1 (2024) goes as far as saying that “using ChatGPT and other LLMs when learning R is actually really detrimental to learning, especially if you just copy/paste directly from what it spits out.” He goes on to explain that:

Using ChatGPT with R requires a good baseline knowledge of R to actually be useful. A good analogy for this is with recipes. ChatGPT is really confident at spitting out plausible-looking recipes. A few months ago, for fun, I asked it to give me a cookie recipe. I got back something with flour, eggs, sugar, and all other standard-looking ingredients, but it also said to include 3/4 cup of baking powder. That’s wild and obviously wrong, but I only knew that because I’ve made cookies before (Heiss 2024: n.p.).

This raises the question of how novices can reach the level of expertise necessary to reliably assess LLM outputs in a specific domain. Researchers are often short on time and LLMs are sold to us as a convenient way to take shortcuts. Indeed, AI products are designed to give us the illusion that we are more efficient and productive when we use and trust them. The risk is that, once we have become dependent, we are no longer able to compare how long we would have needed to complete the same task had we not relied on a third-party AI system to do so.

Tip: Your turn!

Read through Google’s “AI overview” displayed below.

Google query: 'cheese not sticking to pizza' The AI Overview reads: Here are some tips to help cheese stick to pizza: Mix about 1/8 cup of non-toxic glue, like Elmer's school glue, into the sauce to add some tackiness. Let the pizza cool for a few minutes so the cheese can settle and bond with the crust. Cook the cheese until it just melts, but don't overcook it.

Google AI overview to the query “cheese not sticking to pizza” from 2024

Reddit post by user "fucksmith" dating from 13y ago. The post reads: To get the cheese to stick I recommend mixing about 1/8 cup of Elmer's glue Q in with the sauce. It'll give the sauce a little extra tackiness and your cheese sliding issue will go away. It'll also add a little unique flavor. I like Elmer's school glue, but any glue will work as long as it's non-toxic. The post has 197 replies.

Satirical Reddit post from 2013
Figure 15.3: Screenshots from r/Pizza illustrating the infamous ‘pizza glue’ AI fail (see also Notopoulos 2024)

Q15.5 Among a number of sensible-sounding suggestions, we find a recommendation for adding non-toxic glue to pizza sauce. This mention is thought to have come from an old Reddit post (see screenshot above). Which aspect(s) of the AI overview point to this theory?







🦜 Click on the parrot for a hint.

When it comes to programming, the use of LLMs may seem less risky than when conducting literature reviews, writing prose, or analysing data. After all, we can test that generated code runs as expected. If it does not (which is often the case, although we might not immediately spot this), we can debug the code with the support of one or more LLMs until it does. Whilst this might feel like the path of least resistance, data scientist, R developer, and educator Mine Çetinkaya-Rundel explains that debugging someone else’s code (e.g. what an LLM produced) is considerably harder than debugging our own code (Chow, Wickham & Mckinney 2025). This is because, when it is our own code, we had an idea and followed a certain example or a specific strategy, and this knowledge helps us debug in a systematic way. Crucially, each problem or error is an opportunity to learn. Relying on LLM-generated code to solve these errors robs us of these opportunities.

AI can’t learn from its mistakes—it doesn’t understand why something failed. It just pattern-matches from training data (Stetskov 2025: n.p.).

Research on the mid- to long-term impact of AI usage on cognitive abilities such as writing, coding, and critical thinking is still in its infancy; however, a number of studies point towards a genuine risk of deskilling (see e.g. Ferdman 2025) in many domains of use (e.g. medicine, see Natali et al. 2025). In the context of AI-assisted coding, researchers from Anthropic (the company behind the Claude family of AI products) conducted a pre-registered experiment (Tamkin & Shen 2026) in which 52 (mostly junior) software developers completed a programming task using a Python library that they were not familiar with and were subsequently tested on their understanding of the code. Tamkin & Shen (2026) compared:

  1. how quickly the programmers completed the task with and without AI assistance and
  2. whether using AI made them less, more, or equally likely to understand the code they had just written.

Two plots: the first shows the average time taken to complete the experimental task. The average with AI was 23 minutes, and without AI nearly 25. But the confidence intervals overlap considerably. The second plot shows average test scores. Here the difference is much clearer: the AI group scores 50% on average, whilst the non-AI control group scores around 65%.
Figure 15.4: Difference in means (and 95% CI error bars) of overall task time and test scores between the treatment (AI Assistant) and control (No AI) groups (n = 52) (Source: Tamkin & Shen 2026: Fig. 1)

On average, the participants assigned to the AI assistance group completed the programming task about two minutes faster than those who worked without AI — a difference which was not statistically significant (p = 0.391). There was, however, a significant difference in their code comprehension test scores: the AI group averaged 50% on the test, compared to 67% in the group that did not have access to AI (Cohen’s d = 0.738, p = 0.01). Tamkin & Shen (2026) also report that the largest gap in scores between the two groups was on debugging questions. They attribute the gains in skill development of the group that did not use AI to “the process of encountering and subsequently resolving errors independently” (Tamkin & Shen 2026: 2-3).

Senior developers don’t grow out of thin air [… They] develop intuition through thousands of small failures (Stetskov 2025: n.p.).

Tip: Your turn!

As part of an exploratory (not pre-registered) analysis, Tamkin & Shen (2026) from Anthropic (see Section 15.3) decomposed the quiz scores into sub-areas and question types (see Figure 8 from Tamkin & Shen (2026), reprinted below). Each question in the quiz belonged to exactly one task (e.g., Task 1 or Task 2) and exactly one question type (e.g., Conceptual, Debugging, or Code Reading). Figure 8 shows that, for both tasks, the control (no AI) group performed better than the AI group.

A plot showing participants' average scores in each type of quiz questions together with 95% CI error bars. Across all subareas of the quiz (task 1, task 2, conceptual, debugging, and code reading), the control, no AI group performed better on average than the treatment, AI group. However, the 95% CI intervals of the two groups always overlap to a smaller or larger degree. The difference in average scores is largest for the debugging questions and smallest for the code reading.

Figure 8 from Tamkin & Shen (2026): Score breakdown by question type relating to each task and skill area

Q15.6 Looking at the results displayed in Figure 8 above, which question type shows the largest difference in average quiz scores between the treatment and control groups?




Q15.7 Why might the control group have, on average, performed better on debugging questions compared to the AI group?





🦜 Click on the parrot for a hint.

Q15.8 What do the results displayed in Figure 8 from Tamkin & Shen (2026) (see above) suggest about the impact of AI assistance on code reading skills?





15.4 On the value of the commons

It is worth pointing out that the programmers in the control, no AI group in the aforementioned Anthropic study (Tamkin & Shen 2026) did not write code with no help whatsoever. Instead, they had access to regular web searching and code documentation. It is a common misconception that learning to code involves memorising lots of functions and commands. In practice, programming always involves looking things up. Up until very recently, researchers would typically search the web to find answers to their coding problems instead of prompting an AI product. These searches led us to interesting forum discussions (e.g., this Q&A on justifying text in Quarto), blog posts by fellow researchers and developers (e.g. Andrew Heiss’ blog), and helpful documentation files (e.g. in the form of so-called ‘vignettes’). If our web search did not help us solve our problem, we prepared a reprex, short for minimal reproducible example (see Wickham, Çetinkaya-Rundel & Grolemund 2023: Ch. 8 on getting help), posted it on a dedicated forum and community members would typically provide helpful answers within days, hours, or even minutes.
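What might such a reprex look like in practice? A minimal sketch in base R: the code creates its own small dataset and uses only functions that ship with R, so that anyone on the forum can run it as-is. The data and the question are invented for illustration.

```r
# A self-contained reprex: it creates its own toy dataset, so helpers can
# run it without access to the original (possibly confidential) data.
scores <- data.frame(
  group = c("L1", "L1", "L2", "L2"),
  score = c(14, 13, 18, 17)
)

# The (hypothetical) question posted to the forum: is this the right way
# to compute per-group means in base R?
aggregate(score ~ group, data = scores, FUN = mean)
```

Self-containedness is the key design principle: a reprex that depends on files or objects only you have forces would-be helpers to guess, whereas one like the above can be copy-pasted, run, and answered within minutes.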

All of these contents were human-generated. Not all were 100% reliable, but high-quality answers to problems on StackOverflow, for example, were upvoted by readers and errors in the documentation of open-source packages were quickly identified and corrected by the community of users. The same principle was true of statistical questions. Today, this strategy is still viable but, unfortunately, there are a lot of AI-generated webpages (‘AI slop’) that one first needs to ignore.

Tip: Your turn!

Q15.9 What is Stack Exchange?






🦜 Click on the parrot for a hint.

Q15.10 Open this archived version of a Q&A about confidence intervals hosted on Cross Validated, Stack Exchange’s statistics forum. In which year did Eliott originally ask their question?

🦜 Click on the parrot for a hint.

Q15.11 As of 9 February 2026 when the page was archived, how many Cross Validated members had upvoted the top answer?





🦜 Click on the parrot for a hint.

In Section 11.2, we conducted the following t-test to find out whether the observed difference between the L1 and L2 speakers’ non-verbal IQ ‘Blocks’ test scores was significant or not:

t.test(formula = Blocks ~ Group, 
       data = Dabrowska.data)

    Welch Two Sample t-test

data:  Blocks by Group
t = -4.4084, df = 152.46, p-value = 1.956e-05
alternative hypothesis: true difference in means between group L1 and group L2 is not equal to 0
95 percent confidence interval:
 -5.239791 -1.996693
sample estimates:
mean in group L1 mean in group L2 
        13.84444         17.46269 
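Incidentally, the numbers in this printout can also be accessed programmatically rather than read off the console: the result of t.test() is a list (of class "htest") whose components we can inspect and reuse. The sketch below uses simulated toy data so that it is self-contained (the real Dabrowska.data is not re-created here).

```r
# Simulated stand-in for the Blocks-by-Group data (illustration only)
set.seed(2026)
toy_data <- data.frame(
  Group  = rep(c("L1", "L2"), each = 50),
  Blocks = c(rnorm(50, mean = 14, sd = 4), rnorm(50, mean = 17, sd = 5))
)
toy_test <- t.test(Blocks ~ Group, data = toy_data)

toy_test$p.value   # the p-value as a plain number
toy_test$conf.int  # the 95% confidence interval of the difference in means
toy_test$estimate  # the two group means
```

Extracting these components directly (rather than retyping them) avoids transcription errors and makes reports reproducible, e.g. when inlining results in a Quarto document.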

The output of the t.test() command above was fed into an LLM (DeepSeek R1 Distill Llama 70B) with a prompt requesting a short paragraph interpreting these results. The model’s response is printed below:

The Welch two‑sample t‑test shows a significant difference between the groups (t = ‑4.41, df ≈ 152.5, p = 1.96 × 10⁻⁵). Group L1 has a mean of 13.84 blocks, while group L2’s mean is 17.46 blocks, a mean difference of ‑3.62 blocks. The 95 % confidence interval for this difference (‑5.24 to ‑1.99) means we can be 95 % confident that the true average in L2 exceeds that in L1 by roughly 2 to 5 blocks. Because the interval does not include zero, the null hypothesis of equal means is rejected.

Q15.12 Aside from the fact that “blocks” is not the correct measurement unit, which major statistical error does the LLM’s interpretation of the t-test output include?






In addition to the flood of AI slop making it difficult to find reliable information, human content creators are finding that their intellectual property is being scraped without consent to be used as training data for commercial AI products. In the era of Open Scholarship, researchers, software developers, and educators have been sharing their work with the world for the benefit of scientific progress, typically with only authorship attribution as a reward. However, this may change as AI companies appropriate their work and LLMs automate plagiarism (Rooij 2022).

This may sound like an entirely dystopian situation but, up until fairly recently, researchers had to pay to use programming languages for statistical analyses and scientific computing. FORTRAN compilers, MATLAB, S, and SPSS were (and still are) proprietary software which were inaccessible to many researchers and students. It is no exaggeration to say that open-source, community-led programming languages such as R, Python, and Julia revolutionised data analysis, making state-of-the-art methods accessible to far more people. However, as high-quality open-access resources become rarer due to the contamination of AI slop and illegal scraping, we may be returning to an era of restricted access to scientific computing. Hence, it is worth remembering that it is communities of humans who have been developing programming languages such as R and Python, their many extensions such as the {tidyverse} packages (see also selected list of linguistics-specific R packages), open-source software such as RStudio, and high-quality documentation and Open Educational Resources.

These communities depend on collaboration, interactions, and mutual support. Yet, we have entered an age where human interactions are marketed as unproductive, time-consuming, and burdensome. We are told that they can easily be replaced by more efficient and “objective” chatbots.

It can’t be overstated the extent to which part of why these [AI] tools exist is to isolate us, to insert these [AI] companies in between us and all the relationships that we have (Chris Gilliard speaking on Tech Won’t Save Us on 16 October 2025).

Aside from the fact that LLMs are known to be prone to all kinds of very serious biases and that their efficiency is far from proven (see e.g. Loker 2025 on how AI code tends to create more problems than human-generated code), we should not lose sight of the value of subjective, human interactions. This is not to say that LLMs are never useful, but rather that I strongly encourage you to (also) devote time to learning from reliable, human-generated resources (see e.g. Next-step resources), join a course with other human beings to continue your learning journey, and/or find a learning buddy to discuss and solve problems together. This textbook was entirely human-generated and benefited greatly from countless rounds of revisions thanks to interactions with and feedback from (human!) students and colleagues (see Acknowledgements). Of course, it would have been quicker to write the textbook without asking for feedback or to get an LLM to generate first drafts of sections, code, and/or quiz questions. But in linguistics research, teaching, and learning, quality matters more than velocity.

Check your progress 🌟

Congratulations, you have reached the end of this textbook! In this concluding chapter, you tackled 12 ‘Your turn!’ questions. Check out the Next-step resources to continue your learning journey!

Note: More food for thought 🍏 🍎
Caution: Using LLMs in R

For us linguists, there is no doubt that LLMs are fascinating objects of study and that they can be useful for some research-related tasks, provided that their outputs be thoroughly evaluated. Once you have developed a sound understanding of how LLMs and AI products function and have critically reflected on their potential, limitations, and societal impact (see recommendations above), you may want to explore non-commercial, open-source LLMs (see Liesenfeld, Lopez & Dingemanse 2023 and the European Open Source AI Index) in R.

A useful starting point to this end is Luis D. Verde Arregoitia’s curated list of R packages and other resources: ‘Large Language Model tools for R’ (also available in Spanish). I also highly recommend reading the “Read first” and “Further reading” sections.

Figure 15.5: Selection of hex logos of LLM-related R packages from Arregoitia (2026)

15.5 What’s next? 🧭

This textbook has taken you on a journey: first introducing Open Scholarship (Chapter 1), which forms the backbone of this textbook’s approach to doing science, then consolidating knowledge about file formats, file naming, and project organisation (Chapter 2 and Chapter 3), which are often major hurdles for successful data analysis pipelines. Having installed and set up R and RStudio (Chapter 4), you took your first steps in learning to code in R (Chapter 5). Next, you learnt to import real research data into an R project (Chapter 6). From Chapter 7 onwards, you learnt how to analyse data in R using descriptive and inferential statistics. This entailed wrangling data to prepare them for statistical analyses (Chapter 9) and data visualisation (Chapter 10), as well as developing an understanding of key statistical concepts such as measures of central tendency and variability (Chapter 8), distributions, effect sizes, confidence intervals, and p-values (Chapter 11). Chapter 12 introduced statistical modelling with simple linear regression and Chapter 13 expanded this concept to multiple predictor variables and interactions between predictors. Having mastered this foundational knowledge, you are now ready to tackle more advanced statistical methods such as mixed-effects and non-linear regression models, logistic and other classification models, machine learning algorithms, and much more (see Next-step resources). In Chapter 14, you learnt how to apply literate programming skills to conduct and publish reproducible research in Quarto. Finally, in this concluding chapter, you reflected on the values of critical thinking, human learning, and of nurturing communities in the day and age of AI.

Learning statistical theory and practice are inseparable from scientific reasoning (Vasishth & Gelman 2021: 1312).

I personally believe that it will always be worth investing in learning complex matters, developing critical thinking skills, as well as in building meaningful human relationships. No matter how powerful and efficient future AI products may be, the skills that will be valued in the future will not be “advanced” prompting techniques, but rather social skills, critical (statistical) literacy, and computational thinking. Crucially, these are also the skills that are essential to make the most of AI tools.

The resources listed in the Appendix are an excellent starting point for continuing your learning journey. In addition, the online version of this textbook features an expanding collection of case study chapters co-authored by students from my ‘Introduction to Data Analysis in R’ class. Each chapter attempts to computationally reproduce the findings of a published linguistics study using the authors’ original data. The student authors document all the steps necessary to reproduce the results and discuss the success (or not!) of their reproductions. Retracing the steps outlined in a case study that interests you, or that covers a method you’d like to learn more about, is a great way to consolidate and expand the skills and knowledge you have acquired so far.

Two cute little monsters decorating the R logo with hearts, flowers, and smilies.

Happy learning and good luck with all your data analysis pRojects!
Artwork by Allison Horst (CC BY 4.0)

  1. I highly recommend Andrew Heiss’ blog and his beautiful teaching resources. Fun fact for language students and linguists: Andrew majored in Arabic and Italian and didn’t learn about statistics or R until he started his second master’s!↩︎