3 Synthesis
This page displays Chapter 11 from the Author Accepted Manuscript (AAM) version of the book: https://osf.io/jpxae/. Please cite the Version of Record: https://benjamins.com/catalog/scl.116.
3.1 Summary
This study has provided a systematic, empirical account of the kind of English that secondary school EFL learners interact with via their textbooks as compared to the kind of English that they can be expected to encounter outside the EFL classroom. This new understanding of Textbook English is important because textbooks constitute one of the major vectors, if not the most important vector, of English language input that EFL learners encounter in the first four to five years of their secondary education. Although it is common knowledge that the language portrayed in EFL textbooks somehow “feels” different from how English is generally used outside the classroom, this study is the first to attempt to model the nature of these linguistic peculiarities across different registers and textbook proficiency levels by accounting for a broad range of linguistic features and their co-occurrences. Specifically, it set out to describe the language that secondary school pupils in France, Germany, and Spain are exposed to via their coursebooks and their accompanying audio and audio-visual materials.
To this end, the Textbook English Corpus (TEC) was compiled. It comprises nine series of secondary school EFL textbooks (42 textbook volumes) used at lower secondary level in France, Germany, and Spain and was manually annotated for register. Three reference corpora (Spoken BNC2014, Info Teens, and Youth Fiction) were used as baselines for comparisons between the language of the TEC and the kind of naturally occurring English that learners can be expected to encounter, engage with, and produce themselves outside the EFL classroom.
From the literature review (Chapter 3), we concluded that, to date, a multitude of studies have focused on the representations of individual linguistic features in EFL and ESL textbooks. Some of these studies were described as ‘intra-textbook analyses’ because they seek to explore and describe the language of textbooks without relying on any comparison benchmarks. By contrast, ‘comparative textbook language analyses’ draw on reference corpora or corpus-driven lists to infer what is special about the language of textbooks. In this second paradigm, we identified three recurrent issues. First, previous research has failed to consider interactions between the individual linguistic features examined. Thus, whilst some influential studies have helped us to understand how English learners can be misled by their textbooks into making unidiomatic use of specific lexico-grammatical features (e.g., Römer 2005 on the progressive aspect), we concluded that only a multivariable approach can paint the full picture as to how Textbook English – as a whole – differs from the English that language learners are likely to encounter outside the EFL classroom. Second, we saw that prior scholarship has mostly ignored register differences between the various types of texts typically included in school foreign language textbooks. Given that school EFL textbooks frequently feature, for example, extracts from short stories, dialogues, instructions, and exercises on a single double page, we argued that a meaningful analysis of Textbook English requires a register-based approach. Third, previous quantitative corpus-based studies have usually been undertaken at the corpus level, e.g., comparing the occurrences of a linguistic feature across an entire textbook corpus with those from a reference corpus, and have therefore often failed to account for the effects of varying textbook proficiency levels or the potential idiosyncrasies of individual textbook authors, editors, or publishers. 
Thus, prior to the present study, much textbook language research had (often implicitly) assumed that Textbook English constitutes a homogenous variety of English with no (systematic) sources of internal variation.
This study set out to test this assumption and uncover the linguistic specificities of Textbook English. Specifically, it examined the extent to which the language of current EFL textbooks used in secondary schools in France, Germany, and Spain is representative of ‘real-world’ English as used by native/proficient English speakers in similar communicative situations. It asked whether some textbook registers are more faithfully represented than others and whether textbooks’ portrayal of different registers becomes more natural-like as the textbooks’ targeted proficiency level increases. Finally, the study also sought to identify the clusters of linguistic features that characterise Textbook English across different registers and learner proficiency levels.
To answer these research questions, Biber’s (1988; 1995) multi-feature/multi-dimensional analysis (MDA) framework was chosen as a method capable of summarising the patterns of co-occurrences of many linguistic features across different groups of texts. In a preliminary study, the texts of the TEC were compared against the dimensions of Biber’s (1988) seminal model of variation in general spoken and written registers of English (Le Foll 2021; 2022: chap. 6). On this basis, the present study identified a number of potential methodological issues linked to both the use of Biber’s (1988) model as a baseline and the MDA framework as it is traditionally applied. Consequently, a modified MDA framework was developed and implemented for the present study. This modified framework relies on a stringent selection of linguistic features, the normalisation of feature counts to linguistically informed baselines, the application of a computationally stable dimension reduction method (Principal Component Analysis; PCA), the use of mixed-effects linear regression modelling to tease out the potential mediating effects of various variables, and the interpretation of the results using multi-dimensional graphs that expose, rather than obscure, the full breadth of linguistic variation.
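As a rough illustration of three of the steps named above (normalisation to feature-specific baselines, standardisation, and dimension reduction with PCA), the following minimal Python sketch may be helpful. It is not the study's published code: the toy counts, the three features, the `BASELINES` mapping, and all function names are assumptions made purely for illustration, and the first principal component is obtained by simple power iteration rather than a full PCA routine.

```python
import math

# Toy counts invented for illustration; they are NOT data from the TEC.
TEXTS = {
    "conv_a": {"words": 400, "finite_verbs": 80, "contractions": 20, "past_tense": 8, "nouns": 60},
    "conv_b": {"words": 500, "finite_verbs": 90, "contractions": 30, "past_tense": 9, "nouns": 80},
    "fict_a": {"words": 600, "finite_verbs": 100, "contractions": 6, "past_tense": 70, "nouns": 150},
    "fict_b": {"words": 300, "finite_verbs": 50, "contractions": 3, "past_tense": 40, "nouns": 72},
}

# Hypothetical mapping: each feature is normalised against the unit it
# could plausibly occur in (e.g. past tense per 100 finite verbs).
BASELINES = {"contractions": "words", "past_tense": "finite_verbs", "nouns": "words"}


def normalise(counts):
    """Return each feature as a rate per 100 units of its own baseline."""
    return [100 * counts[feat] / counts[base] for feat, base in BASELINES.items()]


def standardise(rows):
    """Convert each feature column to z-scores (sample standard deviation)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def first_component(z_rows, iterations=200):
    """Approximate the first principal component by power iteration
    on the correlation matrix of the standardised features."""
    n, k = len(z_rows), len(z_rows[0])
    corr = [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(k)]
            for i in range(k)]
    vec = [1.0] + [0.0] * (k - 1)  # start on the first feature's axis
    for _ in range(iterations):
        vec = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


rates = {name: normalise(counts) for name, counts in TEXTS.items()}
z_rows = standardise(list(rates.values()))
loadings = first_component(z_rows)
# Dimension score of each text: its z-scores projected onto the loadings.
scores = {name: sum(l * v for l, v in zip(loadings, row))
          for name, row in zip(rates, z_rows)}

for name, score in scores.items():
    print(f"{name}: {score:+.2f}")
```

In this toy example, the two conversation-like texts receive positive scores on the first component and the two fiction-like texts negative ones. The mixed-effects modelling step that follows in the framework is not sketched here.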
In applying the modified MDA framework, the results of the study have convincingly debunked the long-held assumption that the language of school EFL textbooks can meaningfully be considered a homogenous variety of English. Mode and register emerge as significant drivers of intra-textbook linguistic variation, making it impossible to adequately describe Textbook English without considering situationally determined, functional variation. Despite few significant differences between the language of EFL textbooks used in France, Germany, and Spain or between the nine different textbook series of the TEC, this study did uncover noteworthy interactions between the different text registers and target proficiency levels. The clusters responsible for these interactions underwent close examination. The study also explains and illustrates the key linguistic differences that distinguish stereotypically textbook-like texts from situationally similar ‘real-world’ texts.
Corroborating the findings of previous Textbook English studies, notably Mindt (1987; 1992; 1995) and Römer (2004; 2005), the present study identified a wide gap between conversational English as it is presented in contemporary secondary school EFL textbooks and ‘real-world’ conversation that learners can be expected to be involved in outside the EFL classroom. Whilst we are not claiming that all textbook dialogues should resemble the everyday, casual conversations of English L1 speakers (as represented, e.g., in the reference Spoken BNC2014 corpus), it is somewhat disconcerting that, across all nine textbook series of the TEC, textbooks’ representations of conversational spoken English become less authentic as learners are expected to become more proficient in English.
By contrast, and more reassuringly, as the target proficiency levels of the textbooks increased, so did the observed similarities between the informative and fiction subcorpora of the TEC and their respective reference corpora. This latter trend likely points to well-intended pedagogical progressions aimed at scaffolding the development of learners’ linguistic competences. Despite this general trend towards more authentic informative texts as the textbooks’ target proficiency level increases, the results also highlighted potentially problematic textbook texts, even at the highest proficiency levels represented in the TEC (B2). We concluded that some informative texts featured in B2 textbooks were characterised by a lack of register coherence, e.g., pairing words and phrases typical of formal, written English with others more commonly found in informal, (pseudo-)spoken registers. Although this descriptive study makes no claim as to any potential causal links between Textbook English and EFL learners’ production, we did note that a lack of register awareness is an issue that has also been observed in learner corpus research (e.g., Gilquin & Paquot 2008).
We acknowledged that not all textbook texts are designed to reflect naturally occurring English. However, when it is the aim, the results of the present study, along with the use of corpus tools, can be used to adapt or create textbook texts that better reflect the kind of English learners can expect to encounter outside the EFL classroom. The results of the present study support the adoption of a “register approach” to ELT, which entails exposing learners to lexico-grammatical patterns of use in the form of situationally contextualised, meaningful constructions and texts, as proposed by Rühlemann (2008). Section 9.4 spelt out the wide-reaching pedagogical implications of such a register approach for teacher education and materials design.
Although the modified MDA framework (see 5.3) was originally conceived with the analysis of Textbook English in mind, it is hoped that many of the changes it implements will be of interest to corpus linguists working on a wide range of research questions and language varieties. Indeed, many of the issues raised in Chapter 5 are not by any means confined to the analysis of textbook language. For instance, the solutions proposed to overcome issues such as the comparison of texts of radically different lengths (see 5.3.1), the lack of punctuation in transcriptions of spoken language (see 5.3.2), and the non-independence of texts/text samples from the same textbook series, web domain or novel (see 5.3.8) are relevant to many other research areas. These include the study of many e-language registers (e.g., social media posts, blogs, forums, product reviews) and texts produced by young L1 users and L2 learners of all ages and proficiency levels.
The publication of the full code and data used to perform the analyses presented in this study is by no means claimed to be fail-safe, but it is intended to allow for the computational reproducibility of the results. Crucially, it also allows for additional, independent replications. The Online Supplements exemplify how quantitative (corpus-)linguistic methods can, with relatively simple means, be made more transparent, robust, and replicable. Thus, it is hoped that this study may serve as a springboard for further methodological innovations in the multivariate analysis of linguistic data.
3.2 Future directions
The present study is descriptive and exploratory in nature. As such, it opens many avenues for future research. It has contributed some methodological innovations to the MDA framework that may be further explored and tested in future MDA studies on diverse language varieties and registers. Regarding the analysis of school EFL textbooks, it has shown how Textbook English can be examined across a broad range of linguistic features both as a variety of English in its own right and in comparison to various target reference varieties. Future studies could apply the method to study the language of different EFL, ESL and ESP textbooks and other pedagogical materials (e.g., online e-learning courses) used in different educational systems and/or at different proficiency levels.
Another avenue to be explored concerns the quality and quantity of the lexical input provided by EFL textbooks. For each textbook volume and series, the word and phraseme types can be extracted and their rates of repetition across each textbook volume and series can be calculated. The lexical input of the 42 textbook volumes and nine textbook series of the TEC could then be compared to examine the extent to which they share a common core EFL lexical syllabus. In addition, the textbooks’ lexical range may be compared to corpus-based lists such as the new General Service List (Brezina & Gablasova 2015) and the PHRASE List (Martinez & Schmitt 2012). Given the TEC’s register annotation, it would also be possible to compare the words and phrasemes of an individual textbook register, e.g., the Conversation subcorpus of the TEC with corpus-derived lists of the most frequent words and phrasemes in spoken English (e.g., Fankhauser-Kahmeier 2024).
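A minimal Python sketch of how such a lexical comparison might begin is given below. It is an illustration under stated assumptions only: the two miniature "volumes" are invented, the tokeniser is deliberately naive, "repetition rate" is operationalised here (hypothetically) as the proportion of word types that occur more than once, and `REFERENCE_LIST` is a made-up stand-in for a corpus-based word list. Phraseme extraction is not covered.

```python
import re
from collections import Counter

# Invented miniature "volumes"; real input would be full textbook texts.
VOLUMES = {
    "series_A_vol1": "My name is Tom. This is my dog. My dog is big.",
    "series_B_vol1": "This is my school. My school is big. I like my school.",
}

# Hypothetical stand-in for a corpus-based word list.
REFERENCE_LIST = {"is", "my", "this", "i", "like", "big", "name"}


def tokenise(text):
    """Naive tokeniser: lower-cased runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def type_counts(text):
    """Map each word type to its number of occurrences."""
    return Counter(tokenise(text))


def repetition_rate(counts):
    """Proportion of types that occur more than once (an assumed definition)."""
    repeated = sum(1 for freq in counts.values() if freq > 1)
    return repeated / len(counts)


def coverage(counts, reference):
    """Proportion of a volume's types that also appear in the reference list."""
    return len(set(counts) & reference) / len(counts)


counts_by_volume = {name: type_counts(text) for name, text in VOLUMES.items()}

# Types shared by every volume: a first approximation of a "common core".
shared_core = set.intersection(*(set(c) for c in counts_by_volume.values()))

for name, counts in counts_by_volume.items():
    print(name, len(counts), "types;",
          f"repetition rate {repetition_rate(counts):.2f};",
          f"list coverage {coverage(counts, REFERENCE_LIST):.2f}")
print("shared core:", sorted(shared_core))
```

With real data, the same logic could be applied per register subcorpus by grouping texts according to the TEC's register annotation before counting.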
The modified MDA framework could also be applied to analyses of secondary school textbooks of other languages. Indeed, it would be most interesting to compare the present multi-feature/multi-dimensional models of Textbook English with those of other “textbook languages”. Such comparisons may reveal that, cross-linguistically, some of the observed characteristics of Textbook English are in fact universal features of foreign language textbooks – representative of what we might then call: ‘(School) Textbook Language’.
It is important to stress that, on the basis of the present study, we can only speculate as to the impact of Textbook English in and outside the EFL classroom. As vividly put by Cook (2002: 268), “[i]t may be better to teach people how to draw with idealised squares and triangles than with idiosyncratic human faces. Or it may not. The job of applied linguists is to present evidence to demonstrate the learning basis for their claims […]”. Whilst a large body of evidence from usage-based linguistic studies and related disciplines has consistently highlighted the strong connection between input exposure and L2 learners’ developmental patterns (e.g., Achard & Niemeier 2004; Pérez-Paredes, Mark & O’Keeffe 2020; Tyler 2012; Tyler & Ortega 2018), the extent to which “bring[ing] textbooks for teaching English as a foreign language into closer correspondence with actual English” (Mindt 1996: 247) will facilitate or hamper learners’ progress remains unclear. Crucially, we must remember that, as insightful as these multi-dimensional descriptions of Textbook English have been, textbooks do not exist in a vacuum. Yet surprisingly few empirical studies have looked into how textbooks – i.e., not only their language, but also their structures, tasks, and activities – mediate classroom interactions and learning outcomes (Rösler & Schart 2016: 490). In addition, much research remains to be done on how teachers and students actually use textbooks in the classroom. Empirical data on the status quo in secondary EFL classrooms is urgently needed to a) understand the real impact of textbooks and b) develop research-informed recommendations for materials designers and new pre- and in-service teacher training courses that genuinely address current problems and meet teachers’ and learners’ needs.
In addition to classroom-based investigations into textbook use and learning outcomes, the results of the present study and follow-up corpus-based textbook language studies may be triangulated with findings from learner corpora to gain new insights into L2 learning processes. Such research could test McEnery and Kifle’s (1998) hypothesis that “[w]here textbooks are included in an exploration of L2 learning, they can explain differences between NS [native speaker] and NNS [non-native speaker] usage” (as cited in Tono 2004: 52). In such endeavours, robust models of textbook language are potentially very useful because few large-scale research projects will realistically be able to investigate both the language of the textbooks that learners use and the language production of these same learners (though see Möller 2020 for such a research design in the context of Content and Language Integrated Learning). The hope is that, if the models of Textbook English elaborated in the present study are shown to be generalisable to further EFL textbooks, they may be used as a means of better understanding certain usage patterns that are more frequent in the language of instructed EFL learners than in that of naturalistic ESL learners (for first attempts in this direction, see Winter & Le Foll 2022 on EFL learners’ use of if-conditionals; and Le Foll 2023 on periphrastic causative constructions).
In sum, there is still much to be learnt from “pedagogically-driven corpus-based research” (Gabrielatos 2006: 1). In this study, we have seen how MDA can be applied to describe the language of textbooks on multiple dimensions of variation and to point to potential pedagogical issues. These corpus-based findings highlight the need for greater consideration of register in language teaching and learning. The findings were used to point to the benefits of using freely available corpora and tools to create more meaningful, content-rich learning contexts. In other words, this study has not only demonstrated how multivariable corpus-linguistic methods can be used to analyse Textbook English, but it has also outlined ways in which corpora and corpus tools can be used to boost the representativeness of ‘real-world’ language use in school EFL textbooks. As such, this pedagogically-driven corpus-based study can be said to have “corpused” full circle.