13 Writing and publishing in Quarto
Chapter overview
So far, we have seen how we can export the outputs of our analyses conducted in R, both in the form of tables (Section 9.9) and graphics (Section 10.3). However, in most situations, we want to communicate our research in the form of a document that combines both text and analysis outputs. This is where literate programming comes into play!
In this chapter, you will learn about:
- the concept of literate programming
- why reproducibility matters and how to aim for reproducible research
- using Quarto for writing research reports, theses, and papers and
- sharing and publishing your research in different formats including HTML, PDF, LibreOffice Writer, and Microsoft Word.
13.1 Literate programming
The basic idea of literate programming is that we combine text, code, and code outputs (i.e. tables, statistics, and plots) within a single document that can be exported into different formats for sharing and publishing.
Literate programming can be implemented in different authoring formats. Up until very recently, the most common format for R projects was R Markdown. For Python projects, Jupyter Notebooks remain the standard to date. In this chapter, we will focus on Quarto, a relatively new open-source scientific and technical authoring and publishing system that has the advantage of supporting many different programming languages. This means that chunks of code in R, Python, Julia, and other languages can be combined into one document, making project management and collaboration much easier. Quarto allows us to easily export (or render) our documents to HTML, PDF, Word, and many more formats (see Figure 13.1 and Section 13.11).
Literate programming is particularly useful for scientific research and data science. Did you know that this entire textbook was written in Quarto? I chose this format because it allows for the seamless combination of explanations with nicely formatted R code chunks and code outputs (i.e., all of the textbook’s tables, data visualisations, quiz questions, etc.) with consistent section and figure numbering, cross-references, links, bibliographic references, and much more. By the end of this chapter, you’ll be ready to start writing your own term paper, dissertation, thesis, journal article, or book in Quarto.
13.2 Reproducible research
Not only is using Quarto (or any other literate programming format, see Section 13.1) very convenient for the author(s), it also helps us make our research more reproducible. Unfortunately, the terms reproducible, replicable and repeatable are often confused and, not helping matters, some definitions in the literature contradict each other. In this textbook, we will adopt the terminology of The Turing Way. We thus define reproducibility as the ability of an independent researcher or team to obtain the same results as in a published study using the same data and methods that were used in the original study (see Figure 13.2).
This is in contrast to replicability, where the same methods but different data are used, and robustness, where the same data but different methods are used. Finally, if a finding can be reliably observed across different datasets with different methods, then we can say that the finding is generalisable.
Given this definition, reproducibility might seem like a low bar to pass. You might be thinking: Shouldn’t it be obvious that we’ll get the same results if we repeat a study using exactly the same data and method? Well, yes, it should be. But it very often isn’t! For a start, to be able to even attempt to reproduce the results of a study, the underlying data must be available. Linguists sharing their raw data, as Ewa Dąbrowska did as part of her 2019 paper (Dąbrowska 2019), remain the exception rather than the norm (see Bochynska et al. 2023)1. Second, the open data must be in an accessible format and must be published with enough documentation to be understandable to an independent researcher. Third, the author(s) of the original study need to have very diligently documented all their data wrangling and analysis steps. The best way to do this is undoubtedly to use code that does not require closed-source software (e.g., a researcher without a license for SPSS or Stata will not be able to run SPSS or Stata scripts, see Section 1.2). This open code must be shared in an accessible format, too. Fourth, independent researchers need to be able to run these scripts. To this end, it is important that they know exactly which tools were used. Thus, if the analyses were conducted in R, they need to know which R version and which packages and package versions were used (Section 13.10). They also need to know in which order the scripts were run and, finally, the scripts must run on their own computers without any errors. So now, reproducibility doesn’t sound quite so easy, right? Luckily, if we apply the principles of literate programming in Quarto, we can go a long way towards ensuring that our research is reproducible.
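As a small illustration of the fourth point, base R can already report which R version and which package versions are in use. The following is only a minimal sketch: the {stats} package is an arbitrary example (it ships with R), and the object name session.summary is made up for this illustration.

```r
# Print the version of R that is currently running.
R.version.string

# Look up the installed version of a single package
# ({stats} ships with R and is used here purely as an example).
packageVersion("stats")

# Summarise the R version, the operating system, and all attached packages.
session.summary <- sessionInfo()
session.summary
```

Including the output of sessionInfo() at the end of a report is one simple way of letting readers know exactly which tools were used.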
To find out more about best practices for reproducible research, check out The Turing Way’s excellent Guide for Reproducible Research.
Watch this video from 2019, in which Garrett Grolemund (data scientist and instructor at RStudio) explains why literate programming is key to improving medical science, data science, and ultimately all empirical research endeavours. Be aware that everything that Garrett says about R Markdown is also true of Quarto.
Q13.1 What is meant by the replication crisis?2
Q13.2 Which stages of the research process are potential sources of uncertainty?
Q13.3 What would it take for a linguist to fully understand the conclusions of another linguist’s quantitative study?
13.3 Getting started with Quarto
Quarto documents are designed to:
Help you collaborate with other researchers (including your future self!) who are interested in both reproducing your results and understanding how you reached them (i.e. the code).
Provide you with a convenient environment in which to do research - a kind of “modern-day lab notebook where you can capture not only what you did, but also what you were thinking” (Wickham, Çetinkaya-Rundel & Grolemund 2023).
Communicate your analyses to others, including those who are not familiar with any programming language.
13.3.1 Installation
We will be writing Quarto documents from the RStudio IDE3; however, Quarto itself requires a separate installation. Follow these steps to install Quarto and check that everything is working as expected:
Go to https://quarto.org/docs/get-started/ and download the latest Quarto version that is compatible with your operating system.
Once the download is completed (which may take several minutes), double-click on the installer file that you downloaded and click your way through the installation process.
In RStudio, create a new Project by selecting File > New Project… in the main menu, or by clicking on the “new project” button (see Section 6.3). You can choose the first option to create a new project directory if you’ve not yet got one, or the second option to select an existing project directory.
Then, create a new Quarto document by navigating to File > New File > Quarto Document…, or clicking on the “new document” button and selecting “Quarto Document…”. A dialogue menu will appear (Figure 13.3). Leave everything as is and simply click on “Create” at the bottom.
- RStudio has now opened a new, untitled Quarto file (.qmd). Change the title of the document (which is not the same as its filename!) and add the following three lines to the document header by copying and pasting them into the top of the document.4 Quarto document headers are written in YAML which, I kid you not, originally stood for Yet Another Markup Language (and now officially stands for YAML Ain’t Markup Language)! 😅
---
title: "Learning Quarto"
subtitle: "by reproducing the descriptive statistics of Dąbrowska's (2019) study"
author: "Write your name here"
date: last-modified
---
To check your Quarto installation, render your document by either selecting File > Render Document in the main menu, or clicking on the “Render” button in the Quarto menu bar (see Figure 13.4). You will first be prompted to give your .qmd file a name (e.g., LearningQuarto.qmd) and save it. Once you have saved it, your .qmd file will automatically be rendered to HTML (Quarto’s default rendering format).
Navigate to the folder where you saved your .qmd file to find the rendered HTML file. It will have the same filename as your Quarto document but with the file extension .html (e.g., LearningQuarto.html). If you double-click on the file, it will open up in your default web browser (e.g., Firefox, Google Chrome, Safari). You should see that the HTML document features the title of your document, your name as the author, and today’s date (see Figure 13.5).
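If rendering fails, it can help to check from the R Console whether the Quarto command-line tool can be found at all. The following is a minimal sketch using base R only; the object names quarto.path and quarto.version are made up for this illustration. Note that RStudio may use its own bundled copy of Quarto, so an empty result here does not necessarily mean that rendering from RStudio will fail.

```r
# Look for the Quarto command-line tool on the system PATH.
quarto.path <- unname(Sys.which("quarto"))

if (nzchar(quarto.path)) {
  # Quarto was found: ask it to report its version number.
  quarto.version <- system2(quarto.path, "--version", stdout = TRUE)
  message("Quarto ", quarto.version, " found at ", quarto.path)
} else {
  # An empty string means that no executable called "quarto" is on the PATH.
  message("Quarto was not found on the PATH.")
}
```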
For now, the document is empty. In the next sections, you will learn how to add text, code, and code outputs to your Quarto document.
13.3.2 RStudio’s visual editor
You may have noticed that RStudio proposes two different modes in which Quarto documents can be edited: Source and Visual (see Figure 13.6).
The Visual mode offers a WYSIWYM authoring experience. This means that you can use the Quarto editing toolbar (see Figure 13.6) for formatting and that you’ll immediately see the effect of your formatting on screen. For example, to format a word in italics, you can click on the corresponding button in the toolbar or use the keyboard shortcut (⌘/Ctrl + I) - just like you would in text-processing software. In the background, however, RStudio automatically converts your formatted text to Markdown in the underlying source code of your .qmd file. Markdown is a plain-text format. In Markdown, words in italics are enclosed in asterisks like this: *italics*.
You can toggle back and forth between these two modes by clicking on Source and Visual in the editor toolbar (or using the keyboard shortcut ⌘/Ctrl ⇧ F4).
In this task, you will learn to use RStudio’s Visual mode to format text in a Quarto document.
- In a new line beginning after the final --- of the YAML header, paste the introduction text below.
- Using the Quarto editing toolbar, format the text so that, in the Visual mode, it looks like the text displayed in Figure 13.7.
- Render the document and compare how it is formatted in the HTML version.
Introduction
The aim of this report is to reproduce the descriptive statistics reported in Dąbrowska (2019: 5-6) using the original datasets (Dąbrowska 2019: Appendix S4):
Method
Participants
Ninety native speakers (42 male and 48 female) and 67 nonnative speakers of English (21 male and 46 female) were recruited through personal contacts, church and social clubs, and advertisements in local newspapers. Participants were told that the purpose of the study was to examine individual differences in native and nonnative speakers’ knowledge of English and whether these differences are related to their linguistic experience and abilities. All participants signed a written consent form before the research commenced.
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from 17 to 65 years (M = 38, SD = 16). Twenty-two percent of the participants held manual jobs, 24% held clerical positions, and 28% had professional-level jobs or were studying for a degree; the remaining 26% were occupationally inactive (i.e., unemployed, retired, or homemakers). In terms of education, participants’ backgrounds ranged from no formal qualifications to Ph.D., with corresponding differences in the number of years spent in full-time education (from 10 to 21; M = 14, SD = 2). Six participants reported a working knowledge of another language; the rest described themselves as monolinguals.
In the Visual mode (see Figure 13.8 (a)), you will need to click on the “Normal” drop-down menu (see Figure 13.8 (b)) to change the formatting of the word Introduction to the “Header 1” style.
To format the long citation, choose the “Blockquote” option from the “Format” drop-down menu (see Figure 13.8 (c)).
13.4 Markdown text
As you discovered in Task 13.1, writing and formatting text in RStudio’s Visual editor is very similar to writing in word-processing software such as LibreOffice Writer or Microsoft Word. In the background, however, RStudio converts all formatting to Markdown.
To make writing in Quarto more convenient and less error-prone, you can switch on a spell-checker within RStudio. To do so, go to Tools > Global Options… > Spelling.
Switch to the Source mode to view the text that you formatted in the Visual editor in Task 13.1 in Markdown format.
Q13.4 How is text highlighted in bold displayed in Markdown?
Q13.5 How is a first-level heading displayed in Markdown?
Q13.6 How are block quotes formatted in Markdown?
Q13.7 How will the word ~~mystery~~ be formatted in Markdown?
There are many more formatting options in Markdown. Below are a few more for you to try out.
## Lists
- Bulleted list item 1
- Item 2
  - Item 2a
  - Item 2b

1. Numbered list item 1
2. Item
   The numbers are incremented automatically in the output.
## Links and images
<http://example.com>
[linked phrase](http://example.com)
{fig-alt="Quarto hex logo and the word quarto spelled in small case letters"}
## Tables
| First Header | Second Header |
|--------------|---------------|
| Content Cell | Content Cell |
| Content Cell | Content Cell |
The best way to get the hang of Markdown is simply to try things out. You will also find a handy cheatsheet under Help > Markdown Quick Reference. Remember that you can always go back to the Visual mode to format your text, if that’s easier for you. When it comes to debugging any Quarto syntax errors, however, it’s usually easier to catch these in plain text, so you’ll typically want to use the Source mode for that.
13.5 Code chunks
To run code inside a Quarto document, you need to insert a code chunk. There are three ways to do so:
- Using the keyboard shortcut ⌘ + Option + I (macOS) or Ctrl + Alt + I (Windows/Linux)
- Clicking on the green “Insert chunk” button icon in the editor toolbar
- Manually typing the chunk delimiters ```{r} and ```
It is definitely worth learning the keyboard shortcut as it will save you a lot of time in the long run!
In the code chunk below, {r} tells Quarto that this chunk is written in the programming language R. If you want to embed a chunk of Python code, you must begin it with ```{python} instead.
Using one of the three aforementioned options, insert the following R code chunk in your document.
```{r}
library(here)
library(tidyverse)
```
As you are working on your code within a Quarto document, you can either run:
- each individual line of code using the keyboard shortcut ⌘/Ctrl ⏎ or
- the entire code chunk either by clicking the “Run” icon or using the shortcut ⇧ ⌘/Ctrl ⏎.
RStudio will execute the code and display the results either within your document (below each chunk) or in the Console, depending on your RStudio settings.5
Chunk output can be customised with chunk options. There are many options to choose from, but the most important ones control whether your code chunk is executed when you render your Quarto document and which results are inserted in the rendered version:
- eval: false prevents code from being evaluated. And obviously, if the code is not run, no code outputs will be generated.
- include: false runs the code, but does not show the code or its outputs in the rendered document. This option is useful for code chunks that are not informative to the readers of your document.
- echo: false prevents the code, but not the results, from appearing in the rendered document. This option is useful when you want to present the results of your analyses to people who are not interested in the underlying code.
- message: false or warning: false prevents messages or warnings from appearing in the rendered document.
It is also possible to label code chunks using the label option. This can help to navigate long Quarto documents and to quickly identify code chunks generating errors during rendering. Chunk labels should be short but meaningful. They should not contain spaces or any other special characters except hyphens (-).
The easiest way to choose a chunk option is by clicking the gear icon on the chunk you want to modify. This way, you can both choose a label and set up your chunk. If you want to write the code yourself, chunk options are placed at the top of the corresponding chunk following #|, as in the chunk below. As you can see, the eval: false chunk option means that the rendered document includes the mathematical operation 13 * 13, but not the result because the chunk was not executed during the rendering process.
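The chunk described in the previous paragraph does not appear in this version of the text. Going by the description alone, its contents would presumably look something like the following sketch (in a .qmd file, these two lines would sit between the ```{r} and ``` chunk delimiters):

```r
#| eval: false
13 * 13
```

With eval: false, the rendered document shows the code 13 * 13 but not its result. If the option is removed, the result (169) is printed below the code.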
In your Quarto document, add a label to your first R chunk and render your document to HTML.
```{r}
#| label: setup
library(here)
library(tidyverse)
```
Q13.8 What is the output of the setup chunk in your rendered .html document?
Q13.9 Which code chunk option can you use to remove the two messages from the rendered version of your Quarto document, whilst still ensuring that the setup chunk is displayed and executed (so that the libraries can be used in future code chunks)?
Q13.10 Which code chunk option can you use to remove both the setup chunk and its outputs from the rendered version of your Quarto document, whilst still ensuring that the libraries are loaded so that their functions can be used further down in the document?
13.6 Inline code
So far, we have seen how we can insert and format text in Quarto and how we can add code chunks with various options. But, to make the most of literate programming, we want to combine the two.
This chapter assumes that you are familiar with the following research article (which was first introduced in Section 6.1):
Dąbrowska, Ewa. 2019. Experience, Aptitude, and Individual Differences in Linguistic Attainment: A Comparison of Native and Nonnative Speakers. Language Learning 69(S1). 72-100. https://doi.org/10.1111/lang.12323.
Our starting point for this chapter are the author’s original datasets, which are linked in the article’s Appendix S4.
Appendix S4: Datasets
Dąbrowska, E. (2018). L1 data [Data set]. Retrieved from https://www.iris-database.org/iris/app/home/detail?id=york:935513
Dąbrowska, E. (2018). L2 data [Data set]. Retrieved from https://www.iris-database.org/iris/app/home/detail?id=york:935514
You will only be able to reproduce the analyses and answer the quiz questions from this chapter if you have successfully imported the two datasets from Dąbrowska (2019). To import the datasets, follow the instructions from Section 6.3 to Section 6.5 and complete Task 6.1.
In your Quarto document, insert the following R chunk to load the Dąbrowska (2019) data.
```{r}
#| label: import-data
#| include: false

L1.data <- read.csv(file = here("data", "L1_data.csv"))
L2.data <- read.csv(file = here("data", "L2_data.csv"))
```
As this new import-data chunk requires the here() function, make sure that it comes after the setup chunk because, when the document is rendered, code chunks will be executed in the order that they appear. If the {here} package is not loaded before the data is imported, the rendering process will be aborted and an error message will be displayed in the Console.
To begin, we will reproduce the following basic descriptive statistics about the two datasets:
Ninety native speakers (42 male and 48 female) and 67 nonnative speakers of English (21 male and 46 female) were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
As you may recall from Chapter 7, the number of native and non-native participants corresponds to the number of rows in the corresponding dataset:
nrow(L1.data)
[1] 90
nrow(L2.data)
[1] 67
In Quarto, we can use inline code to dynamically insert these numbers in our paragraph. Inline code in R begins with `{r} and ends with a single backtick `. It is best to use the Source mode to insert inline code.
Using the Source mode, add the following section to your Quarto document and render it to HTML.
## Descriptive statistics about the participants
`{r} nrow(L1.data)` native speakers and `{r} nrow(L2.data)` nonnative speakers of English were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
The rendered version should read like this (if you obtain different numbers, this means either that you have tampered with the original data files or that they have been corrupted)6:
Descriptive statistics about the participants
90 native speakers and 67 nonnative speakers of English were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
Inline code should only be used for very simple code, ideally with no more than one function, as in `{r} nrow(L1.data)`. To insert the output of more complex operations, it is best to write the code and save its output(s) to the local environment in a hidden code chunk (using the option #| include: false).
```{r}
#| label: L1-gender
#| include: false

L1.males <- L1.data |> 
  filter(Gender == "M") |> 
  count()

L1.females <- L1.data |> 
  filter(Gender == "F") |> 
  count()
```
The saved objects (L1.males and L1.females) each contain one number. They can therefore be directly called within the text as inline code.
`{r} nrow(L1.data)` native speakers (`{r} L1.males` male and `{r} L1.females` female) and `{r} nrow(L2.data)` nonnative speakers of English were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
When rendered, the paragraph will read:
90 native speakers (42 male and 48 female) and 67 nonnative speakers of English were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
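Note that count() returns a one-row data frame rather than a bare number. If you would rather store plain numbers, one option is to count the matching rows with base R. The following is a minimal sketch on a made-up data frame: toy.data and its five rows are hypothetical and merely stand in for L1.data.

```r
# Hypothetical toy data standing in for L1.data (not the real dataset).
toy.data <- data.frame(Gender = c("M", "F", "F", "M", "F"))

# Comparing the column to "M" gives TRUE/FALSE values; sum() counts the TRUEs.
toy.males <- sum(toy.data$Gender == "M")
toy.females <- sum(toy.data$Gender == "F")

toy.males
toy.females
```

Objects created in this way are single numbers and can be inserted inline in exactly the same way as the objects created with count().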
In your Quarto document, add a code chunk called L2-gender in which you compute the values necessary to complete the missing descriptive statistics in the sentence above. When rendered, your sentence should read:
90 native speakers (42 male and 48 female) and 67 nonnative speakers of English (21 male and 46 female) were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
To save the number of male L2 participants as an R object, we can follow the same procedure as above.
L2.males <- L2.data |> 
  filter(Gender == "M") |> 
  count()
For the number of female L2 participants, however, it’s not so simple because some are labelled f, while others are labelled F (see Task 9.1 in Section 9.4.2).
table(L2.data$Gender)
f F M
6 40 21
Below are four possible methods to solve this issue (and there are many more still!):
# Method 1:
L2.Females <- L2.data |> 
  filter(Gender == "F") |> 
  count()
L2.females <- L2.data |> 
  filter(Gender == "f") |> 
  count()
L2.allfemales <- L2.Females + L2.females

# Method 2:
L2.allfemales <- L2.data |> 
  filter(Gender == "F" | Gender == "f") |> 
  count()

# Method 3:
L2.allfemales <- L2.data |> 
  filter(Gender %in% c("F", "f")) |> 
  count()

# Method 4:
L2.allfemales <- L2.data |> 
  mutate(Gender = toupper(Gender)) |> 
  filter(Gender == "F") |> 
  count()
Some of these methods are perhaps more elegant than others, but they are all acceptable. After all, they all work! 🙃
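One way to convince yourself that such methods agree is to try them out on a small made-up vector first. The sketch below uses base R equivalents of Methods 2 to 4; toy.gender is hypothetical data invented for this illustration, not an extract from L2.data.

```r
# Hypothetical toy vector with inconsistent capitalisation.
toy.gender <- c("f", "F", "M", "F", "m", "f")

# Equivalent of Method 2: two conditions combined with OR.
n.method2 <- sum(toy.gender == "F" | toy.gender == "f")

# Equivalent of Method 3: membership test with %in%.
n.method3 <- sum(toy.gender %in% c("F", "f"))

# Equivalent of Method 4: harmonise the case first, then compare.
n.method4 <- sum(toupper(toy.gender) == "F")

# All three counts should be identical.
c(n.method2, n.method3, n.method4)
```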
Once they are saved to the local environment, the values can be inserted inline in the usual way:
`{r} nrow(L1.data)` native speakers (`{r} L1.males` male and `{r} L1.females` female) and `{r} nrow(L2.data)` nonnative speakers of English (`{r} L2.males` male and `{r} L2.allfemales` female) were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
If we want to start our paragraph with 90 written out as a word rather than in digits, we can use the numbers_to_words() function from the {xfun} package. First, you’ll need to install the {xfun} package and then add a line to your setup chunk to load it.7
```{r}
library(xfun)
```
First, you can test that it works by running this code:
numbers_to_words(nrow(L1.data))
[1] "ninety"
To start our paragraph with a capital letter, we’ll need to set the function’s cap argument to TRUE.
`{r} numbers_to_words(nrow(L1.data), cap = TRUE)` native speakers (`{r} L1.males` male and `{r} L1.females` female) and `{r} nrow(L2.data)` nonnative speakers of English (`{r} L2.males` male and `{r} L2.allfemales` female) were recruited through personal contacts, church and social clubs, and advertisements in local newspapers.
Next, we want to reproduce the following descriptive statistics about the L1 participants:
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from 17 to 65 years (M = 38, SD = 16).
We can use the base R functions min(), max(), mean(), and sd() to compute these values.
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from `{r} min(L1.data$Age)` to `{r} max(L1.data$Age)` years (*M* = `{r} mean(L1.data$Age)`, *SD* = `{r} sd(L1.data$Age)`).
The rendered document will read:
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from 17 to 65 years (M = 37.5444444, SD = 16.148998).
Whilst these values are correct, in practice, we want to round them off to the nearest integer. To this end, we can wrap the round() function around the mean() and sd() functions (see Section 7.5.1).
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from `{r} min(L1.data$Age)` to `{r} max(L1.data$Age)` years (*M* = `{r} round(mean(L1.data$Age))`, *SD* = `{r} round(sd(L1.data$Age))`).
The rendered document will read:
The L1 participants were all born and raised in the United Kingdom and were selected to ensure a range of ages, occupations, and educational backgrounds. The age range was from 17 to 65 years (M = 38, SD = 16).
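By default, round() rounds to the nearest integer; its digits argument can be used to keep a given number of decimal places instead. The following minimal sketch illustrates this with made-up ages: toy.ages is a hypothetical vector, not an extract from L1.data.

```r
# Hypothetical ages used for illustration only.
toy.ages <- c(17, 25, 38, 52, 65)

# Unrounded mean.
toy.mean <- mean(toy.ages)
toy.mean

# Mean rounded to the nearest integer (the default, digits = 0).
toy.mean.rounded <- round(toy.mean)
toy.mean.rounded

# Mean rounded to one decimal place.
round(toy.mean, digits = 1)

# Standard deviation rounded to the nearest integer.
toy.sd.rounded <- round(sd(toy.ages))
toy.sd.rounded
```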
For more complex computations, it is much better to compute the values in a dedicated code chunk. This also allows you to add code annotation which is important to ensure that other researchers (and your future self!) understand the reasoning behind the code.
For example, the following chunk containing annotated code can be used to reproduce the descriptive statistics concerning L1 participants’ professional occupations and foreign language skills.
```{r}
#| label: L1-jobs

# Counting manual job participants using a tidyverse solution:
L1.manualjobs <- L1.data |>  # Select data frame
  count(OccupGroup) |>  # Tally each level of OccupGroup
  mutate(proportion = n/sum(n)) |>  # Calculate proportion
  filter(OccupGroup == "M") |>  # Select only the manual occupations
  pull(proportion) |>  # Select just the proportion value
  round(2)  # Round to two decimal places

# Alternative: Counting manual job participants using a base R solution:
L1.manualjobs <- round(proportions(table(L1.data$OccupGroup))["M"], digits = 2)

# Counting clerical job participants
L1.clerical <- round(proportions(table(L1.data$OccupGroup))["C"], digits = 2)

# Counting professional job participants
L1.pro.num <- L1.data |> 
  filter(OccupGroup %in% c('PS', 'PS ')) |> 
  count()
L1.pro <- round((L1.pro.num / nrow(L1.data)), digits = 2)

# Counting professionally inactive participants
L1.inactive <- round(proportions(table(L1.data$OccupGroup))["I"], digits = 2)

# Counting participants who speak at least one language other than English
L1.otherlgs <- L1.data |> 
  filter(OtherLgs != "None") |> 
  count()
```
The values saved to the local environment as R objects can then be inserted inline within the Markdown text as follows:
`{r} numbers_to_words((L1.manualjobs*100), cap = TRUE)` percent of the participants held manual jobs, `{r} L1.clerical*100`% held clerical positions, and `{r} L1.pro*100`% had professional-level jobs or were studying for a degree; the remaining `{r} L1.inactive*100`% were occupationally inactive (i.e., unemployed, retired, or homemakers). In terms of education, participants’ backgrounds ranged from no formal qualifications to Ph.D., with corresponding differences in the number of years spent in full-time education (from `{r} min(L1.data$EduYrs)` to `{r} max(L1.data$EduYrs)`; *M* = `{r} round(mean(L1.data$EduYrs))`, *SD* = `{r} round(sd(L1.data$EduYrs))`). `{r} L1.otherlgs` participants reported a working knowledge of another language; the rest described themselves as monolinguals.
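Because proportions are stored as floating-point numbers, multiplying them by 100 does not always yield an exact whole number. A defensive variant, sketched below with a made-up proportion (toy.proportion is hypothetical and not taken from the dataset), is to round the percentage before it is inserted inline or converted to words.

```r
# Hypothetical proportion used for illustration only.
toy.proportion <- 0.28

# Convert the proportion to a percentage.
toy.percent <- toy.proportion * 100

# Printing with many significant digits reveals any floating-point noise.
print(toy.percent, digits = 17)

# Rounding gives a clean whole number that is safe to insert inline.
toy.percent.clean <- round(toy.percent)
toy.percent.clean
```

In the paragraph above, this would amount to writing, for example, round(L1.clerical*100) instead of L1.clerical*100 in the inline code.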
🧑💻 Add this section to your Quarto document and render it to HTML. Compare the values in your rendered document with the original ones from the published study.
Twenty-two percent of the participants held manual jobs, 24% held clerical positions, and 28% had professional-level jobs or were studying for a degree; the remaining 26% were occupationally inactive (i.e., unemployed, retired, or homemakers). In terms of education, participants’ backgrounds ranged from no formal qualifications to Ph.D., with corresponding differences in the number of years spent in full-time education (from 10 to 21; M = 14, SD = 2). Six participants reported a working knowledge of another language; the rest described themselves as monolinguals (Dąbrowska 2019: 6).
Q13.11 Compare the rendered version of your document with the original descriptive statistics reported in Dąbrowska (2019: 6). Could you successfully reproduce these descriptive statistics? Which values are different?
13.7 Tables
The easiest way to manually construct a table in a Quarto document is to switch to Visual mode and click on Insert > Table or use the shortcut Option + Cmd + T. You can choose how many rows and columns you need and then simply complete your table in the Visual editor.
| | Same data | Different data |
|---|---|---|
| Same analysis method | Reproducible | Replicable |
| Different analysis method | Robust | Generalisable |
When you switch to the Source mode, you will see that, in Markdown (see Section 13.4), your table has been converted to a pipe table. Pipe tables allow for column alignment and captions.
| | Same data | Different data |
|-------------------------------|--------------|----------------|
| **Same analysis method** | Reproducible | Replicable |
| **Different analysis method** | Robust | Generalisable |
: Terminology used in this chapter
Most of the time, however, you will want to display tabular results based on data that you have imported, manipulated, and/or analysed in R. If the output of a code chunk within your Quarto document is a table, it will automatically be displayed in your rendered document (unless you specify a chunk option to hide its output, see Section 13.5).
L1.data |> 
  count(OtherLgs, 
        sort = TRUE)
OtherLgs n
1 None 84
2 German 3
3 French 2
4 Spanish 1
However, this output is not particularly nicely formatted. There are several R packages designed to create tables that are “presentation-ready”. One of these is the {gt} package. Beyond its main function gt(), it offers many more functions to further style tables such as cols_label() to change the column headers.
```{r}
#| label: tbl-L1-languages
#| tbl-cap: "Example of a {gt} table"
#| tbl-cap-location: top
#| tbl-colwidths: [80,20]

#install.packages("gt")
library(gt)

L1.data |> 
  count(OtherLgs, 
        sort = TRUE) |> 
  gt() |> 
  cols_label(
    OtherLgs = "Additional language",
    n = "N")
```
Additional language | N |
---|---|
None | 84 |
German | 3 |
French | 2 |
Spanish | 1 |
In addition, Quarto has a range of chunk options to customise the display of tables, including `tbl-cap` to add a table caption and `tbl-cap-location` to determine where the caption is placed. In the above chunk, the table’s `label` chunk option begins with `tbl-`. This allows for in-text cross-referencing to the table with the insertion of `@tbl-L1-languages` in this Quarto document, which is rendered as the following linked cross-reference: Table 13.1.
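The chunk options and {gt} functions described above can be combined with further styling functions. The following is only a minimal sketch: the chunk label, the caption, and the source note are made up for illustration, and the counts are typed in by hand from the output shown earlier so that the example runs without `L1.data`. It uses `cols_align()` and `tab_source_note()`, two further {gt} functions, to right-align the counts and add a note below the table.

```{r}
#| label: tbl-L1-languages-styled
#| tbl-cap: "Additional languages spoken by the L1 participants"
#| tbl-cap-location: top

library(gt)

# Counts copied by hand from the output shown above
# (assumption: this stands in for L1.data |> count(OtherLgs, sort = TRUE))
other_lgs <- data.frame(
  OtherLgs = c("None", "German", "French", "Spanish"),
  n = c(84, 3, 2, 1)
)

styled_table <- other_lgs |>
  gt() |>
  # Replace the variable names with reader-friendly column headers
  cols_label(OtherLgs = "Additional language", n = "N") |>
  # Right-align the column of counts
  cols_align(align = "right", columns = n) |>
  # Add a note underneath the table (hypothetical wording)
  tab_source_note(source_note = "Counts based on L1.data")

styled_table
```

Because the label of this hypothetical chunk also begins with `tbl-`, the resulting table could be cross-referenced in the text with `@tbl-L1-languages-styled`.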
The Quarto guide provides further information about formatting tables: https://quarto.org/docs/authoring/tables.html.
13.8 Figures
In Quarto documents, figures can either be embedded (e.g., as a `.png` or `.jpeg` image file) or generated as a result of a code chunk.
13.8.1 Images
To embed an image from an external file, you can use the “Insert” menu in RStudio’s Visual editor and select “Figure / Image” (see Figure 13.9). This will open up a menu where you can select the image that you want to insert, as well as add alt-text (see Section 10.1.5) and a caption. The easiest way to adjust the size of an embedded image is to click on the image and then adjust the size of the image with the blue circle in the bottom-right corner of the image (see Figure 13.9).
Below is the source code for Figure 13.10 in Markdown. The code includes the path to the image file (see Section 3.3), relative to the project directory (see Section 6.3). In the example below, the image file `BERD_pipeline-real.jpg` is located in a subfolder called `images`.
![A more realistic research pipeline [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) @seiboldBERDCourseMake2023](images/BERD_pipeline-real.jpg){#fig-RealisticPipeline fig-alt="Cartoon drawing of a complex set of pipes with various entry points for \"data\" and a single output: a research paper with text, a table, and a plot. Sections of the pipe are coloured according to the processes that they correspond to. These include data cleaning, overview, figures, modelling, and text." width="480"}
This example embedded image includes a caption (that, itself, includes a link), an alt-text (see Section 10.1.5), and a custom width in pixels. Note that, in the source code, special characters such as quotation marks need to be escaped using a backslash `\`. Tags beginning with `#fig-` can be used to cross-reference images by replacing the `#` with `@`. Hence, in this chapter, `@fig-RealisticPipeline` in Markdown is rendered as Figure 13.10.
Figures can be arranged in many ways. The example below uses the `:::` div syntax to display two images side-by-side. This syntax also allows for subcaptions, as shown in Figure 13.11.
::: {#fig-Pipelines layout-ncol="2"}
{#fig-IdealisedPipeline}

{#fig-RealisticPipeline2}

Research workflows as pipelines [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) @seiboldBERDCourseMake2023
:::
To find out more about inserting and arranging figures, check out the detailed Quarto guide: https://quarto.org/docs/authoring/figures.html.
13.8.2 Plots
If your Quarto document includes code chunks that generate plots, they will automatically be integrated in your rendered document. Plots will either appear immediately after the corresponding code chunk or where the code chunk would be, if you choose to hide the code chunk that generated the plot with the `echo: false` option.
As with computed tables (see Section 13.7), various code chunk options can be added to customise the look of computed figures in rendered documents. Compare the code chunk options below and the generated output in Figure 13.12.
```{r}
#| label: fig-scatterplot
#| fig-cap: "L2 participants' lexical proficiency in English and their professional occupational group"
#| fig-height: 5
#| fig-asp: 0.618
#| message: false
L2.data |>
  ggplot(mapping = aes(x = VocabR,
                       y = CollocR)) +
  geom_point(aes(colour = OccupGroup),
             size = 2) +
  geom_smooth(method = "lm") +
  scale_colour_viridis_d() +
  labs(x = "Vocabulary test scores",
       y = "Collocation test scores",
       colour = "Occupational\ngroups") +
  theme_bw()
```
According to the authors of “R for Data Science”, figure sizing and scaling is “an art and science and getting things right can require an iterative trial-and-error approach” (Wickham, Çetinkaya-Rundel & Grolemund 2023). This is because there are five main options that control figure sizing: `fig-width`, `fig-height`, `fig-asp`, `out-width`, and `out-height`. The first three control the size of the figure created by R, whereas the latter two control the size at which it is inserted in the rendered document.
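To make the difference between these two groups of options more concrete, here is a minimal sketch. The chunk label and caption are hypothetical, and the `mpg` dataset that ships with {ggplot2} stands in for `L2.data` so that the chunk runs on its own. With `fig-width: 6` and `fig-asp: 0.618`, R draws the plot 6 inches wide and about 3.7 inches high (the height is calculated as width × aspect ratio), while `out-width` shrinks the inserted image to 70% of the available text width.

```{r}
#| label: fig-sizing-sketch
#| fig-cap: "A plot drawn at 6 x 3.7 inches and displayed at 70% of the text width"
#| fig-width: 6
#| fig-asp: 0.618
#| out-width: "70%"

library(ggplot2)

# mpg is an example dataset included in {ggplot2};
# it is used here only as a stand-in for L2.data
sizing_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  theme_bw()

sizing_plot
```

A reasonable starting point is to fix `fig-width` and `fig-asp`, and then to adjust only `out-width` until the plot looks right in the rendered document.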
If you are sharing your research analyses and results in HTML format, you can embed interactive plots (see Section 10.2.8) in your Quarto documents. Hover over Figure 13.13 to start exploring the data interactively.
Show R code to generate the interactive plot below.
#install.packages("plotly")
library(plotly)
L2.scatter2 <- L2.data |>
  ggplot(mapping = aes(x = VocabR,
                       y = CollocR,
                       text = paste("L1:", NativeLg, "</br>Age:", Age, "</br>Years in formal education:", EduTotal, "</br>Job:", Occupation))) +
  geom_point(aes(colour = OccupGroup),
             size = 2) +
  scale_colour_viridis_d() +
  labs(x = "Vocabulary test scores",
       y = "Collocation test scores",
       colour = "Occupational\ngroups") +
  theme_bw()
ggplotly(L2.scatter2)
13.9 References
An important aspect of academic writing is the inclusion of in-text bibliographic references (citations) and a well-formatted list of references (the bibliography). RStudio’s Visual editor makes inserting bibliographic references extremely simple. To insert a reference, simply click on “Insert” and then select “Citation” or use the keyboard shortcut ⌘/Ctrl ⇧ F8. This opens up a menu (see Figure 13.14) giving you the option to search for the source that you’d like to cite on your own computer (e.g., in your own Zotero database, if you use Zotero) or on the web via the Crossref database or directly using a DOI.
Alternatively, if you start typing `@` in the Visual editor, a quick reference menu will appear. Either way, any references that you add will be displayed as `@` followed by a reference identifier. For example, in the source code of this Quarto document, every reference to Dąbrowska (2019) is indicated as `@DabrowskaExperienceAptitudeIndividual2019`.
For more information on how to format your in-text citations, see the Quarto guide.
When you insert your first reference in a Quarto document, RStudio will automatically create a `references.bib` file in your project folder. All references are automatically added to this new BibLaTeX file. As shown below, `.bib` files contain entries that begin with `@` followed by the type of reference (`article`, `book`, `manual`, `url`, etc.) and the reference identifier (e.g., `DabrowskaExperienceAptitudeIndividual2019`, `wickhamDataScienceImport2023`). The rest of each entry contains structured information about the reference, including its title, date of publication, and DOI or ISBN.
references.bib
@article{
DabrowskaExperienceAptitudeIndividual2019,
title={Experience, Aptitude, and Individual Differences in Linguistic Attainment: A Comparison of Native and Nonnative Speakers},
volume={69},
ISSN={1467-9922},
url={https://onlinelibrary.wiley.com/doi/abs/10.1111/lang.12323},
DOI={10.1111/lang.12323},
number={S1},
journal={Language Learning},
author={Dąbrowska, Ewa},
year={2019},
pages={72–100}
}
@book{
wickhamDataScienceImport2023,
place={Beijing, Boston, Farnham, Sebastopol, Tokyo},
edition={2},
title={R for Data Science: Import, tidy, transform, visualize, and model data},
ISBN={978-1-4920-9740-2},
url={https://r4ds.hadley.nz/},
publisher={O’Reilly},
author={Wickham, Hadley and Çetinkaya-Rundel, Mine and Grolemund, Garrett},
year={2023}
}
In order to connect this `references.bib` file with our Quarto document, we need to add a `bibliography` key to our YAML header. If our `references.bib` file is located in the same folder as our Quarto document (which is what RStudio does by default), we can simply add the following line to our document header:
---
title: "Learning Quarto"
subtitle: "by reproducing the descriptive statistics of Dąbrowska's (2019) study"
author: "Elen Le Foll"
date: last-modified
bibliography: references.bib
---
With this modified YAML header, when the document is rendered, a bibliography will automatically be added to the end of the document. This means that, if you have citations in your document, it is a good idea to include a header section `# References` at the end of the document.
References
Dąbrowska, Ewa. 2019. “Experience, Aptitude, and Individual Differences in Linguistic Attainment: A Comparison of Native and Nonnative Speakers.” Language Learning 69 (S1): 72-100. https://doi.org/10.1111/lang.12323.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd ed. O’Reilly. https://r4ds.hadley.nz/.
By default, Quarto will use the Chicago Manual of Style author-date citation format (as above). However, you can point to a different citation stylesheet in the form of a `.csl` (Citation Style Language) file in the YAML header. This allows us to determine exactly how our bibliography and in-text citations should be formatted. Many institutions, publishers, and journals have their own (sometimes annoyingly specific!) requirements. Luckily, the research community has put together a large repository of citation stylesheets for you to choose from: https://www.zotero.org/styles. You can download any of these stylesheets (as a `.csl` file), place the file in your project folder, and then link it to your Quarto document by adding a `csl` key to your header.
---
title: "Learning Quarto"
subtitle: "by reproducing the descriptive statistics of Dąbrowska's (2019) study"
author: "Elen Le Foll"
date: last-modified
bibliography: references.bib
csl: international-journal-of-learner-corpus-research.csl
---
For example, if you wanted to submit your paper to the International Journal of Learner Corpus Research, you could find the corresponding CSL stylesheet in the Zotero styles database, save it in your project folder, and link to it in your YAML header as above. When rendered, your document’s bibliography would then read:
References
Dąbrowska, E. (2019). Experience, Aptitude, and Individual Differences in Linguistic Attainment: A Comparison of Native and Nonnative Speakers. Language Learning, 69(S1), 72-100. https://doi.org/10.1111/lang.12323.
Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science: Import, tidy, transform, visualize, and model data (2nd ed.). O’Reilly. Retrieved from https://r4ds.hadley.nz/.
Using any of the methods described above, add an in-text bibliographic reference to the following article in your Quarto document:
In’nami, Yo, Atsushi Mizumoto, Luke Plonsky & Rie Koizumi. 2022. Promoting computationally reproducible research in applied linguistics: Recommended practices and considerations. Research Methods in Applied Linguistics 1(3). 100030. https://doi.org/10.1016/j.rmal.2022.100030.
Specifically, we want to cite this passage from page 8:
As implementing these steps may seem daunting, we recommend that researchers engage in reproducible research incrementally. That may be one small step for a researcher, but it will represent a giant leap for the field of applied linguistics when consolidated and accumulated in the long run.
a. Which in-text citation cites this specific page?
Go to the Zotero style repository and download the `.csl` citation stylesheet to format references according to the American Psychological Association (APA) 7th edition. Link this stylesheet to your Quarto document and render to HTML.
b. Now that your document includes references formatted in APA7, how are the authors’ names listed in your bibliography?
One of the major challenges of doing research is actually managing the large number of references that you need to consult, read, and cite in your projects. The good news is that reference managers are there to help you overcome this challenge. Whether you are working on a term paper, your Master’s dissertation, your PhD thesis, or a post-doctoral project, it is always worth investing the time to learn to use a reference manager!
Zotero is a free and open-source bibliographic reference manager that will help you organise all your sources and generate beautifully formatted bibliographies for all your projects. It offers a browser extension that enables you to quickly add references to your library directly from your web browser.
What’s more, Zotero can be integrated in RStudio, making it very easy to include BibTeX-formatted references in your Quarto documents. Find out more in the RStudio documentation.
13.10 Computing environment
In addition to referencing academic papers, it is also very important that we report which R version we used for our analyses, as well as which packages and package versions. This serves two purposes:
- Independent researchers (and our future selves!) know exactly what they need to be able to reproduce our analyses (see Section 13.2).
- We give credit to the kind people who spent time and effort developing and sharing the R packages that we used for our analyses (see Section 1.2).
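Even without any additional packages, base R can report this information. The chunk below is a minimal sketch of how this could be done: the object names `r_version` and `tools_version` are made up for illustration, and {tools} is used as the example package only because it is installed with every version of R.

```{r}
# Base R only: no additional packages are needed for this sketch

# The R version used in the current session, as a character string
r_version <- R.version.string

# The installed version of a specific package (here: {tools})
tools_version <- as.character(packageVersion("tools"))

# Print a one-line summary that could be adapted for a methods section
cat("Analyses were run with", r_version,
    "and {tools} version", tools_version, "\n")

# Full record of the session, including all attached packages
sessionInfo()
```

Reporting versions by hand in this way quickly becomes tedious, and it does not generate any bibliographic references for the packages.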
The easiest way to “give credit where credit is due” to R package developers is to use the {grateful} package. Its `cite_packages()` function will scan your project for all the R packages that are used and generate a BibTeX file called `grateful-refs.bib` that contains the package references.
You will first need to add a reference to the BibTeX file generated by {grateful} in your YAML header. This means that your Quarto document will now have two bibliography files, which is fine as long as you use the following YAML syntax to reference them both.
---
bibliography:
- references.bib
- grateful-refs.bib
---
Then, load the library (which you will have to install first, of course) and call the `cite_packages(output = "paragraph")` function. This will generate a paragraph that mentions all the packages used (see below) and add their references to the bibliography (either at the bottom of your rendered Quarto document or, in the case of this textbook, in the corresponding chapter, see Section 13.9).
```{r}
#install.packages("grateful")
library(grateful)
cite_packages(output = "paragraph", out.dir = ".")
```
We used R version 4.5.0 (R Core Team 2025a) and the following R packages: checkdown v. 0.0.12 (Moroz 2020), colorBlindness v. 0.1.9 (Ou 2021), ggpattern v. 1.1.4 (FC, Davis & ggplot2 authors 2025), ggrepel v. 0.9.6 (Slowikowski 2024), ggwordcloud v. 0.6.2 (Le Pennec & Slowikowski 2024), gt v. 1.0.0 (Iannone et al. 2025), here v. 1.0.1 (Müller 2020), janeaustenr v. 1.0.0 (Silge 2022), kableExtra v. 1.4.0 (Zhu 2024), knitcitations v. 1.0.12 (Boettiger 2021), knitr v. 1.50 (Xie 2014; Xie 2015; Xie 2025a), paletteer v. 1.6.0 (Hvitfeldt 2021), patchwork v. 1.3.0 (Pedersen 2024), plotly v. 4.10.4 (Sievert 2020), report v. 0.6.1 (Makowski et al. 2023), rmarkdown v. 2.29 (Xie, Allaire & Grolemund 2018; Xie, Dervieux & Riederer 2020; Allaire et al. 2024), scales v. 1.4.0 (Wickham, Pedersen & Seidel 2025), tidyverse v. 2.0.0 (Wickham et al. 2019), tools v. 4.5.0 (R Core Team 2025b), truncnorm v. 1.0.9 (Mersmann et al. 2023), viridis v. 0.6.5 (Garnier et al. 2024), xfun v. 0.52 (Xie 2025b).
Alternatively, `cite_packages()` can generate a table with all the package names, versions, and references. Table 13.2 lists all of the packages used in the making of this textbook.
cite_packages(output = "table", out.dir = ".")
Package | Version | Citation |
---|---|---|
base | 4.4.1 | R Core Team (2025a) |
checkdown | 0.0.12 | Moroz (2020) |
colorBlindness | 0.1.9 | Ou (2021) |
ggpattern | 1.1.4 | FC, Davis & ggplot2 authors (2025) |
ggrepel | 0.9.5 | Slowikowski (2024) |
ggwordcloud | 0.6.2 | Le Pennec & Slowikowski (2024) |
gt | 0.11.1 | Iannone et al. (2025) |
here | 1.0.1 | Müller (2020) |
janeaustenr | 1.0.0 | Silge (2022) |
kableExtra | 1.4.0 | Zhu (2024) |
knitcitations | 1.0.12 | Boettiger (2021) |
knitr | 1.49 | Xie (2014); Xie (2015); (knitr2024?) |
paletteer | 1.6.0 | Hvitfeldt (2021) |
patchwork | 1.2.0 | Pedersen (2024) |
plotly | 4.10.4 | Sievert (2020) |
report | 0.5.9 | Makowski et al. (2023) |
rmarkdown | 2.29 | Xie, Allaire & Grolemund (2018); Xie, Dervieux & Riederer (2020); Allaire et al. (2024) |
scales | 1.3.0 | Wickham, Pedersen & Seidel (2025) |
tidyverse | 2.0.0 | Wickham et al. (2019) |
tools | 4.4.1 | R Core Team (2025b) |
truncnorm | 1.0.9 | Mersmann et al. (2023) |
viridis | 0.6.5 | Garnier et al. (2024) |
xfun | 0.51 | Xie (2025b) |
Tracking the versions of the packages that your code relies on is important if you want your analysis code to be reproducible in the long run (i.e., so that you or a colleague can still run it next month or next year). However, manually installing each of these packages in exactly the right version is hardly feasible. To simplify the process of re-creating your project environment, I recommend using {renv}.
The {renv} package keeps track of the exact package versions that your project depends on, and ensures that those exact versions are installed whenever and wherever your project is opened. {renv} provides each project with its own isolated package library, ensuring that you can update packages in new projects without risking breaking older projects. To create project-specific environments that additionally include system dependencies, I recommend {rix}. The aim of these packages is to make your R projects more isolated, more portable, and therefore more reproducible.
For an accessible introduction to stabilising your computing environment, I recommend reading: https://berd-nfdi.github.io/BERD-reproducible-research-course/3-3-stabilize.html and https://the-turing-way.netlify.app/reproducible-research/renv.html.
13.11 Publishing formats
So far, we have only tried rendering our Quarto document to HTML, which is the default publishing format for Quarto documents. HTML has many advantages and is great for publishing online, but the beauty of Quarto is that you can publish in many other formats, too.
13.11.1 Text-processing formats
For example, if your supervisor or colleague would like a Microsoft Word document, this is no problem. All you need to do is add the following line to your YAML header and then render your document again. This will generate a `.docx` file that includes your text, any code that you wanted to show in your document, and all of the code outputs that you wanted to share, such as your statistics, graphs, and tables.
---
format: docx
---
Note, however, that dynamic code outputs (such as the interactive {plotly} plot displayed in Figure 13.13) cannot meaningfully be rendered to static formats such as Microsoft Word documents or PDF. Attempting to do so can cause rendering errors. To prevent these code chunks from being rendered to formats other than HTML, you can enclose them in the following div block:
::: {.content-hidden unless-format="html"}
```{r}
#| label: fig-scatterplot-plotly
ggplotly(L2.scatter2)
```
:::
To share your work with LibreOffice and OpenOffice users, use the `format: odt` option, which will generate an OpenDocument file, an open standard file format that can be opened in any text-processing software.
If you completed Task 8.1 (Section 8.3.1), you copied-and-pasted a paragraph with gaps into LibreOffice Writer or Microsoft Word and then manually inserted descriptive statistics that you had calculated in R. This method of copying-and-pasting across different programmes is very error-prone! What if you accidentally paste the wrong number in the wrong place? And what if there is an update to the dataset or you make some changes to the data cleaning procedure? You’d have to manually change all the numbers again. This is time-consuming and, more worryingly, very likely to result in errors!
Now that you know about literate programming in Quarto, rewrite the following paragraph describing the `GrammarR` variable in `L1.data` and `L2.data` in Quarto. Use in-text code chunks to fill the gaps and then render your paragraph to `.docx` or `.odt` format to check the results.
On average, English native speakers performed only marginally better in the English grammatical comprehension test (median = ______) than English L2 learners (median = ______). However, L1 participants’ grammatical comprehension test results ranged from ______ to ______, whereas L2 participants’ results ranged from ______ to ______.
13.11.2 PDF
It is also possible to render Quarto documents to PDF; however, this requires you to have a working LaTeX distribution on your computer. If you already have a recent LaTeX installation, you shouldn’t need to do anything else to generate PDF documents. Quarto developers recommend that you use the TinyTeX distribution. To install (or update) TinyTeX, go to the Terminal pane in RStudio and run the following command:
Terminal
quarto install tinytex
This is likely to take several minutes but you will only need to do it once. Afterwards, you can add the following line to your Quarto YAML header and you’re ready to render to PDF!
---
format: pdf
---
Quarto’s different publishing formats have different options. For example, if you want your PDF document to have numbered sections and to include a table of contents (TOC), you can add the following lines to your YAML header:
---
format:
  pdf:
    number-sections: true
    toc: true
---
13.11.3 Slides
In research, it’s quite common that you will be working on a project that will be submitted as a paper or thesis (e.g., in PDF format) and that you’ll also want to present in class, to your research group, or at a conference. You can easily turn your Quarto document into presentation slides, too! There are currently three presentation formats to choose from:
Revealjs | Revealjs is an open-source HTML presentation framework. | format: revealjs |
PowerPoint | PowerPoint is Microsoft Office’s presentation editing software. | format: pptx |
Beamer | Beamer is a LaTeX class for producing presentations and slides in PDF format. | format: beamer |
13.12 Conclusion
In this chapter, we’ve only just scratched the surface of what’s possible in Quarto. The Quarto documentation is very detailed and well worth exploring to find out what else you can do in Quarto from books to blogs and interactive dashboards: https://quarto.org/docs/guide/.
➡️ For those of you who want to dive a little deeper, I heartily recommend the final chapter of “An Introduction to Quantitative Text Analysis for Linguistics: Reproducible Research Using R” by Jerid Francom: https://qtalr.com/book/part_5/11_contribute.html.
➡️ Quarto has many functionalities that are particularly attractive to those of us involved in higher education teaching and academic research. Watch Quarto for Academics (20 minutes) by Mine Çetinkaya-Rundel to find out more.
➡️ Finally, the latest edition of “R for Data Science” also has a great chapter on communicating the results of data science projects using Quarto: https://r4ds.hadley.nz/communicate.
Check your progress 🌟
Well done! You have successfully completed this chapter on literate programming using Quarto.
Although there is ground for optimism here, as more and more linguists and language education scholars are beginning to make their data open in repositories such as the Open Science Framework (OSF), IRIS, and TROLLing (see Section 2.4).↩︎
The above definitions are all from the community-sourced FORRT glossary (Parsons et al. 2022).↩︎
You can, of course, write Quarto documents using any other IDE (Integrated Development Environment, see Section 4.2.1) that supports Quarto such as VS Code, Jupyter, or Neovim.↩︎
Note that, in YAML syntax, character strings such as the document title and your name must be enclosed in quotation marks. By contrast, the date is not enclosed in quotation marks because it is a dynamic variable that will be adjusted to your computer’s system date so that, every time you render the document, the date will be updated.↩︎
You can change this behaviour in your RStudio preferences under Tools > Global Options > R Markdown by selecting or unselecting the option: “Show output inline for all R Markdown documents”.↩︎
Using Microsoft Excel to open these `.csv` files can corrupt them. This can happen even if you did not open the files in Excel yourself (e.g., on some Windows computers, this is sometimes done automatically as part of the download process). To find out more, see Section 2.6.↩︎
To make your Quarto document even more reproducible, you can replace your `setup` chunk with the following code, which automatically checks whether each package needs to be installed before it is loaded:
```{r}
#| label: improved-setup

# List of packages necessary in this Quarto document:
packages <- c("here", "tidyverse", "xfun")

# Install the packages that are not yet installed:
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages])
}

# Load the packages without printing any messages:
invisible(lapply(packages, library, character.only = TRUE))
```
Alternatively, consider using {renv} or {rix} for your project (see Section 13.10).↩︎