Terms in context: A corpus-based analysis of the terminology of the European Union's development cooperation policy

Judith Kast-Aigner

Fachsprache | 3–4 / 2009 Articles / Aufsätze



This paper is concerned with the terminology the European Union (EU) has created and used with regard to its development cooperation policy during its existence. The idea of fostering the development of less privileged countries by means of preferential trade agreements and financial aid was already incorporated in the Treaty of Rome in 1957, making development cooperation one of the Union’s oldest policy areas. Both the concrete concepts and the respective terms applied have been subject to continuous change, the more so as they were strongly influenced by the political and economic situation at the time. The purpose of the paper is to illustrate how tools and techniques developed in corpus linguistics can assist terminologists in compiling terminological information. It presents some of the results of a detailed diachronic study of the English terminology of the EU’s development cooperation policy, aiming to describe the conceptual and terminological changes in this field over time. The analysis is based on a corpus of EU texts and supported by linguistic software, viz. WordSmith Tools. The generation of key words and word clusters is complemented by the establishment of terminological domains, which represents a helpful way of structuring the terms, thus facilitating the identification of the main topics of the underlying texts. The use of corpora in terminology opens up the possibility to gather both conceptual and linguistic as well as usage information about the terminological units. It also allows the analysis of concordances that can help to reveal ideological aspects of the terminology involved. The findings may contribute to the knowledge and understanding of European development cooperation among professionals in European and national bodies as well as scholars and teachers in the field of development cooperation.

Keywords: European Union, development cooperation, EU-specific terminology, economic terms, corpus-based terminology, corpus linguistics, diachronic corpus, terminological domains, collocations, word clusters, lexical change

1 Introduction

This paper presents some of the results of a diachronic analysis of the English terminology that the European Union (EU) has created and used in the field of development cooperation since 1957, aiming at portraying the conceptual and terminological changes in this field over time. It is based on a corpus of English language EU documents and assisted by linguistic software, viz. WordSmith Tools. The paper is structured as follows: First, the subject area under investigation is outlined and the research questions to be dealt with in the study are posed. Second, the methodology adopted in the study, i.e. a corpus-based approach to terminology, is described and the rationale for using corpora in terminology work is explained. Third, three examples are presented in order to illustrate the ways in which a corpus-based approach can be employed to generate and structure terminological information. Finally, some summarizing and concluding remarks are made, both with regard to the findings of the research and the methodological approach adopted.

The EU’s development cooperation policy has been described as an “understudied area of European politics, despite its economic and political significance” (Arts and Dickson 2004: 3). This view is confirmed by Lister, who points out that books dealing with the EU tend to focus on its internal organisation and neglect its development cooperation policy, although the latter represents one of the Union’s first common policies (Lister 1997: 22). Indeed, research in this area appears to be less extensive than one would expect, given that development cooperation has been part of the European project since the very beginning. This applies even more to the terminological and linguistic aspects of the EU's development cooperation policy. To date, no work in this field has provided more than a superficial and random compilation of terms, let alone an account of how the terminology has evolved since the establishment of a common European development cooperation policy.

2.1 Overview

The EU's development cooperation policy dates back as far as 1957, when the Treaty of Rome was signed to create the European Economic Community (EEC). Whereas nowadays the EU is active in virtually every part of the world, the relationship with a group of countries in Africa, the Caribbean and the Pacific, which would later become the so-called ACP group of states, was incorporated in the Community's founding treaties, thus representing the EU's oldest relationship in terms of development cooperation. The EU's close ties with this group of states were maintained via several Conventions, viz. the Yaoundé Conventions of 1963 and 1969 (referred to as Yaoundé I and II), when the signatory states in the South were still colonies of the European countries, and the Lomé Conventions of 1975, 1979, 1984 and 1989 (referred to as Lomé I, II, III and IV), signed by a continuously growing number of both European and ACP states. At present, the relations between the EU and the ACP group are governed by the Cotonou Partnership Agreement (referred to as Cotonou), which was signed in 2000 (Frisch 2008, Grilli 1993, Hewitt and Whiteman 2004).

2.2 Research questions

The analysis is part of a larger study, which is meant to provide a comprehensive and systematic overview of the terminology the EU has created and used in the field of development cooperation over time, in particular addressing the following research questions. First, it aims at identifying the key concepts that constitute the EU's development cooperation policy along with the terms in which these concepts manifest themselves. Second, it examines the concepts and terms that were used in the various stages of the EU's development cooperation policy, trying to reveal conceptual and terminological changes over the years. Third, it aims to shed light on the ideological agenda which may have had an impact on the terminology used in this field.

Hence, the synchronic perspective providing a record of current terminology is to be complemented by an analysis of the language of earlier eras of European development cooperation policy. According to Sager, such a diachronic view enables the study of language development as it allows for revealing changes in the meaning of lexical items. In addition, it facilitates the identification of conceptual changes which otherwise may have been difficult to recognise (Sager 1990: 132). The need for a diachronic perspective is particularly evident considering the plurality of approaches the EU has followed in development cooperation policy since 1957, which are likely to have found their way into language. Furthermore, looking at the evolution of terminology over the last 50 years may provide insight into the ideological forces behind term formation and terminological choices. The latter issue is addressed by Temmerman (2000: 62), who points out that language is used to express human world perception and conception, and Cabré (1999: 23), arguing that terms convey the culture of a people and reflect a certain view of the world.

The paper is meant to outline the findings of the above-mentioned study on the basis of three carefully selected examples. On the one hand, these examples are intended to highlight the major research results, considering that it is not possible to present the entire spectrum within the scope of this contribution. On the other, they aim at reporting on the methodological approach adopted in the study and addressing the research questions outlined above. Two examples are based on the text of the First Lomé Convention of 1975, the so-called Lomé I corpus, the first one focussing on the identification and elaboration of key words, the second one showing how word clusters are classified into terminological domains. The third example deals with changes of key words and word clusters over time, tracking the word cooperation from 1957 to date.

3 Methodological issues

The methodology used in this research may be referred to as corpus-based terminology, which has been defined as “a working method which explores a collection of domain-specific language materials (corpus) to investigate terminological issues” (Gamper and Stock 1998/1999: 149). Ahmad and Rogers have identified three main tasks for which electronic text corpora may be used to assist terminologists: viz. to capture, validate and elaborate data (Ahmad and Rogers 2001: 740). Thus, corpora support the terminologist throughout a terminology project: in the early stages when the key issues are to identify term candidates (i.e. to capture data) and to provide evidence for and about term candidates (i.e. to validate data) as well as in the core stages when the main tasks are to compile definitions and to select contextual examples (i.e. to elaborate data).

3.1 Benefits of corpus-based terminology

While machine-readable corpora have been accepted in lexicography and language for general-purpose work for some time, their use and popularity in terminography or language for special-purpose work has been lagging behind. Arguing for their use, Bowker (1996: 30–31) points out three main advantages of corpora in terminology, an approach which she refers to as a corpus-based approach to terminography or simply corpus-based terminography.

Firstly, machine-readable corpora enable terminologists to increase both the speed and the scope of their research. Not only can larger quantities of data be processed more rapidly, thereby exposing terminologists to a larger number of conceptual descriptions, but corpora also allow them to leave out the sections of a text that are terminologically irrelevant and to focus on those parts which are of interest from a terminological point of view (Bowker 1996: 31–32). The latter parts may be referred to as knowledge-rich contexts, containing “at least one item of domain knowledge that could be useful for conceptual analysis” (Meyer 2001: 281).

Secondly, a machine-readable corpus makes it easier to investigate syntactic and semantic information as well as linguistic patterns which are difficult to discover when scanning texts manually. The classic example is the study of concordances, also referred to as key words in context (KWIC), in order to identify collocations that may help to improve the use of terms immensely (Bowker 1996: 32).

This argument ties in with the third key strength of corpus-based terminography. In contrast to conventional term banks, which contain hardly any examples of terms in context, corpora present a variety of contexts as well as more extensive contexts which not only provide valuable supplementary information but also help to understand and thus to use terms more effectively (Bowker 1996: 32–33).

The second aspect discussed by Bowker is of particular importance as it highlights the fundamental idea of working with corpora, best described in the words of John Sinclair: “The ability to examine large text corpora in a systematic manner allows access to a quality of evidence that has not been available before” (Sinclair 1991: 4). This ability is relevant for my study in two respects. As mentioned above, information on collocations shows how terms can be used effectively. In addition, concordances enable the study of the relation between language and ideology. In his analyses of texts and text corpora, Stubbs shows that they can reveal patterns of language that institutions use to build up our linguistic, conceptual and ideological view of the world (Stubbs 1996: 59). As Hunston points out, such patterns may convey messages implicitly without the reader being intuitively or consciously aware (Hunston 2002: 109).
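By way of illustration, a concordance of the kind discussed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and does not reproduce the output or the internals of WordSmith Tools: the function name `kwic`, the `width` parameter and the sample sentences are invented for illustration and are not taken from the corpus of EU texts.

```python
import re


def kwic(text, node, width=30):
    """Return one key-word-in-context line per occurrence of `node`.

    `width` is the number of characters of context shown on each side
    (a hypothetical parameter chosen for this sketch).
    """
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the node word lines up.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample sentences, NOT quotations from any Convention text.
sample = (
    "The Community shall support financial and technical cooperation. "
    "Industrial cooperation shall promote the development of the ACP States. "
    "Trade cooperation is based on preferential access."
)

for line in kwic(sample, "cooperation"):
    print(line)
```

Aligning the node word in a central column is what makes recurrent left-hand collocates (here *technical*, *industrial*, *trade*) visible at a glance.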

3.2 Analysing corpora

The analysis of the corpus compiled for the purpose of this study is assisted by concordance software, viz. WordSmith Tools, a programme for looking at how words are used in texts.

According to Hunston (2002: 67), key words are a valuable starting point for analysing specialised corpora. The keyness of a word in a text or collection of texts may be characterised in terms of importance and “aboutness” (Scott 2007: 3–4), in the sense that it indicates that the word is important and shows what the text is about, respectively. Most accurately described by Scott and Tribble, “what the text ‘boils down to’ is its keyness, once we have steamed off the verbiage, the adornment, the blah blah blah” (Scott and Tribble 2006: 56). By comparing the relative frequencies of words in two corpora, viz. a smaller, more specialised corpus and a larger, more general one, WordSmith generates the key words for the former (Hunston 2002: 68). More precisely, WordSmith compares the word list of the corpus under investigation, i.e. the corpus of EU texts, with the word list of the British National Corpus (BNC), which is used as a reference corpus. For every word in the corpus under investigation, WordSmith contrasts the patterns of frequency and calculates a keyness score. It is advisable to work through the initial list of key words in order to remove noise as well as words which are clearly not relevant from a terminological point of view, viz. grammatical words (e.g. articles, conjunctions, prepositions) and words that are characteristic of the text type under investigation (e.g. shall, article, paragraph).
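The comparison of relative frequencies described above may be sketched as follows. The sketch rests on several assumptions: the statistic behind the keyness score is not specified here, so Dunning's log-likelihood, a measure commonly used for this purpose, is assumed; the function names, the `min_freq` threshold, the stoplist and the toy token lists are all hypothetical and merely stand in for the corpus of EU texts and the BNC.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """Log-likelihood score for a word occurring `a` times in a study
    corpus of `n1` tokens and `b` times in a reference corpus of `n2`."""
    expected1 = n1 * (a + b) / (n1 + n2)
    expected2 = n2 * (a + b) / (n1 + n2)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected1)
    if b > 0:
        score += b * math.log(b / expected2)
    return 2 * score


def key_words(study_tokens, reference_tokens, stoplist=frozenset(), min_freq=2):
    """Rank words that are unusually frequent in the study corpus."""
    study = Counter(study_tokens)
    reference = Counter(reference_tokens)
    n1, n2 = sum(study.values()), sum(reference.values())
    scored = []
    for word, a in study.items():
        # Skip rare words, grammatical words and text-type words.
        if a < min_freq or word in stoplist:
            continue
        b = reference.get(word, 0)
        # Keep only words relatively MORE frequent in the study corpus.
        if a / n1 > b / n2:
            scored.append((word, log_likelihood(a, b, n1, n2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data invented for illustration only.
study = ("the acp states shall receive aid the acp states "
         "and the community shall cooperate").split()
reference = ("the cat sat on the mat and the dog shall sleep "
             "while the states of nature change").split()
STOPLIST = frozenset({"the", "and", "of", "on", "shall"})

ranked = key_words(study, reference, STOPLIST)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

Passing the stoplist to the scoring step mirrors the manual clean-up recommended above: grammatical words and words typical of the text type (such as *shall*) are excluded before the ranked list is inspected.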

While the resulting list of key words does not represent a final list of terms that require terminological definitions or that are suited for inclusion in a terminological dictionary, it can be useful as it offers an overview of the main subjects covered in the texts and provides the starting point for further analysis, in particular in connection with the calculation of word clusters. As terms are frequently compound words and not single words, it is necessary to identify

