«Terms in context: A corpus-based analysis of the terminology of the European Union's development cooperation policy Judith Kast-Aigner Abstract This ...»
Judith Kast-Aigner Fachsprache | 3–4 / 2009 Articles / Aufsätze

word clusters, which may be defined as “words which are found repeatedly together in each others' company, in sequence” (Scott 2004–2007: 225). While forming a tighter relationship than collocates, clusters merely represent repeated strings which may or may not turn out to be true multi-word units (Scott 2007: 19). Biber et al. (2000: 989), who refer to clusters as lexical bundles, describe them as sequences of words that show a statistical tendency to co-occur in a particular register. The identification of word clusters requires the user to specify the cluster size (between two and eight words) and a minimum frequency, i.e. the minimum number of occurrences required for a cluster to appear in the results.
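The two settings described above, cluster size and minimum frequency, can be illustrated with a minimal sketch. It is not WordSmith's implementation: the function name `find_clusters`, the toy token list and the parameter values passed to it are assumptions made for illustration, and the sketch ignores sentence boundaries.

```python
from collections import Counter

def find_clusters(tokens, min_size, max_size, min_freq):
    """Count every contiguous word sequence of min_size..max_size tokens
    and keep those that occur at least min_freq times."""
    counts = Counter()
    for n in range(min_size, max_size + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {cluster: freq for cluster, freq in counts.items() if freq >= min_freq}

# Hypothetical toy input: one invented sentence repeated five times.
tokens = "the national authorizing officer shall decide".split() * 5

# Illustrative settings: clusters of two to six words, at least five occurrences.
clusters = find_clusters(tokens, min_size=2, max_size=6, min_freq=5)
```

The result still contains overlapping strings such as national authorizing and national authorizing officer side by side, which is why the raw output can only be treated as a list of candidates.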
In this analysis, the key parameters are a cluster size of two to six words and a minimum frequency of five. As the calculation of clusters merely yields sequences of words that tend to co-occur, the results have to be revised. This step includes the elimination of those clusters that are clearly nothing more than repeated strings. For example, WordSmith identifies the word strings national authorizing, national authorizing officer and national authorizing officer shall, the relevant term clearly being national authorizing officer. Likewise, the term products originating in the acp states is of interest from a terminological viewpoint, whereas the word strings products originating, products originating in, products originating in the and products originating in the acp are not. Furthermore, related clusters, described by Scott as clusters “which overlap to some extent with others” (2004–2007: 89), have to be identified. Related clusters that form part of more comprehensive clusters are removed unless they are considered to have a meaning that is independent of the meaning of the latter and occur in the corpus at least five times. For example, WordSmith identifies the word clusters interest rate and interest rate subsidy, both of which represent independent terms. By contrast, the word strings european coal, european coal and, european coal and steel and european coal and steel community represent a case where some related clusters have to be removed, with european coal and steel community as the only meaningful word cluster. The aim of this procedure is to generate a list of multi-word units which represent term candidates in the sense that they are relevant from a terminological point of view and considered to have a separate meaning.
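The treatment of related clusters can be sketched as follows. This is a simplified illustration, not the procedure applied in WordSmith: the frequencies are invented, the helper names are hypothetical, and the set `independent_terms` stands in for the analyst's judgement as to which shorter clusters have a meaning of their own.

```python
def is_part_of(shorter, longer):
    """True if the words of `shorter` form a contiguous run inside `longer`."""
    s, l = shorter.split(), longer.split()
    if len(s) >= len(l):
        return False
    return any(l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1))

def prune_related(clusters, independent_terms, min_freq=5):
    """Drop clusters contained in a longer cluster unless they were judged
    independent by the analyst and reach the minimum frequency."""
    kept = {}
    for cluster, freq in clusters.items():
        contained = any(is_part_of(cluster, other) for other in clusters)
        if not contained or (cluster in independent_terms and freq >= min_freq):
            kept[cluster] = freq
    return kept

# Hypothetical frequencies, for illustration only.
raw = {
    "european coal": 7,
    "european coal and": 7,
    "european coal and steel": 7,
    "european coal and steel community": 7,
    "interest rate": 9,
    "interest rate subsidy": 6,
}
# Manual decision: interest rate is a term in its own right.
terms = prune_related(raw, independent_terms={"interest rate"})
```

A purely mechanical rule of this kind always prefers the longest string, so a case such as national authorizing officer shall would still have to be rejected by hand.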
Along the lines of Mahlberg (2007: 198–199), who establishes groups in order to categorise concordances, the resulting word clusters are divided into several categories, each of which characterises a particular theme prevailing in the corpus texts. Despite being a rough approach to analysing clusters, this step facilitates the identification of the main characteristics and themes of the underlying texts and makes it easier to grasp the plurality of terms which include the key words identified before. Moreover, the establishment of groups enables a focused view of the various word clusters and assists in raising issues and questions that otherwise might not have come to mind. Mahlberg refers to these groups as functional groups, admitting that these categories are neither watertight nor absolutely clear-cut (2007: 199). She also points out that the labels introduced for the functional groups represent so-called ad hoc labels, which aim at nothing more than showing the typical characteristics of the group (Mahlberg 2007: 199–200). Unlike in Mahlberg's work, which is concerned with the discursive features of texts rather than the terminology used, only those multi-word units are categorised here that can be considered to have a separate meaning and appear – to varying extents – useful from a terminological angle. Thus, the term functional groups will be replaced with the expression terminological domains, referring to groups of word clusters that represent the key topics of the underlying texts.
As outlined in Section 2.1., the relationship between the EU and the ACP group was established in the Treaty of Rome and maintained via several Yaoundé and Lomé Conventions.
The current development regime is laid down in the Cotonou Agreement, signed in 2000 and revised in 2005. The texts of these agreements form the corpus, on which the analysis of the terminology of the EU's development cooperation policy is based, with the individual agreements representing different subcorpora. An overview of the corpus and the subcorpora is provided in Table 1.
The following three examples are meant to illustrate the methodological approach used in the research as well as some of the key findings of the study. After discussing the identification and analysis of key words in Section 4.1., the grouping of word clusters into terminological domains is dealt with in Section 4.2., with both examples being based on the Lomé I corpus.
The third example examines changes in the usage and frequency patterns of one of the most important key words, viz. cooperation, and its collocates from 1957 to date.
4.1 Dealing with key words in the Lomé I corpus
The Lomé I corpus contains the complete text of the First ACP-EEC Convention of Lomé as well as the attached agreements, protocols and declarations, adding up to approximately 32,000 words. Using WordSmith, the word list of the Lomé I corpus is generated, which results in 2,918 entries. Subsequently, the key words are computed by comparing the Lomé I word list with the word list of the reference corpus. The preliminary set of 348 key words is reduced to 276 key words as irrelevant words and noise have to be removed. The top 60 key words are listed in Table 2.
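The comparison of a word list with a reference word list can be illustrated with a keyness score. The text does not state which statistic was used; the sketch below assumes the log-likelihood measure that is commonly used for key word analysis. The word frequencies and the size of the reference corpus are hypothetical, and only the approximate size of the Lomé I corpus is taken from the text.

```python
import math

def log_likelihood(freq_study, freq_ref, size_study, size_ref):
    """Log-likelihood keyness of one word: compares its observed frequency
    in the study and reference corpora with the frequencies expected if the
    word were spread evenly across both."""
    total = size_study + size_ref
    expected_study = size_study * (freq_study + freq_ref) / total
    expected_ref = size_ref * (freq_study + freq_ref) / total
    score = 0.0
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score

SIZE_STUDY = 32_000      # approximate size of the Lomé I corpus
SIZE_REF = 1_000_000     # hypothetical size of the reference corpus

# Hypothetical counts: one word strongly over-represented in the study
# corpus, one word used at roughly the same rate in both corpora.
score_high = log_likelihood(300, 50, SIZE_STUDY, SIZE_REF)
score_low = log_likelihood(5, 150, SIZE_STUDY, SIZE_REF)
```

Words are then ranked by score. Since the statistic alone does not show the direction of the difference, a word counts as a positive key word only if its relative frequency is higher in the study corpus than in the reference corpus.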
What may appear at first glance to be nothing more than a list of words turns out to contain a great deal of useful information on the key elements of the Lomé I Convention.
First, the table gives insight into the key actors and main issues of ACP-EEC cooperation. Several key words refer to the parties involved in the Convention, including both general expressions (e.g. states, member and territories), specific countries and groupings (e.g. acp, which is short for african, caribbean and pacific; european, Guinea and Dahomey), legal entities and functionaries of the signatory states (e.g. president, ministers and authorities) and of the EEC (community, council and commission). The words development and cooperation are both among the top 60 key words of Lomé I and can probably be considered the most significant words when it comes to describing the purpose of the Convention: cooperation was deemed an appropriate means of promoting development in the developing countries. The main instruments used to this end were trade and aid, both of which left their mark in the key words of Lomé I. Not only do the terms trade and aid appear in the top 60, but also numerous other key words are related to trade (e.g. customs, goods, originating and exporting) and assistance for projects, i.e. aid (financing, microprojects, implementation and execution).
Second, it is sensible to compare the key words of Lomé I with the key words of its predecessors, viz. the Treaty of Rome, Yaoundé I and Yaoundé II, in order to differentiate between terms that had also been used in the pre-Lomé era and terms that first appeared in Lomé I. The comparison shows that almost 40 per cent of the key words of Lomé I (i.e. 109 out of 276) had already been identified as key words in one of the earlier agreements, whereas the remaining ones (i.e. 167 out of 276) emerged as key words in the Lomé I corpus. It is striking that the ratio of old to new words is different among the top 60 key words of Lomé I, with only 17 of 60 (28 per cent) representing key words that had not appeared in any of the former agreements. Apparently, the top key words
include more of those words that cannot be attributed to a particular Convention or stage in the relationship between the Community and the ACP group, but endure more than one generation of agreements, thus being characteristic of ACP-EEC cooperation in general.
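The proportions quoted above can be checked with a short calculation. The counts are those given in the text; the helper `share` is a hypothetical name introduced for this sketch.

```python
def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# All key words of Lomé I
all_key_words = 276
old_all = 109                        # already key words in an earlier agreement
new_all = all_key_words - old_all    # first appear as key words in Lomé I

# Top 60 key words of Lomé I
top = 60
new_top = 17                         # not key words in any earlier agreement
old_top = top - new_top

print(new_all)                       # 167
print(share(old_all, all_key_words)) # 39.5
print(share(new_all, all_key_words)) # 60.5
print(share(new_top, top))           # 28.3
print(share(old_top, top))           # 71.7
```

Roughly 72 per cent of the top 60 key words had thus already been key words before Lomé I, compared with about 40 per cent of the full list.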
Those key words that appeared in Lomé I for the first time (marked with an asterisk in Table 2) may provide information about the new features of the Lomé I Convention, including innovative concepts and instruments, new provisions that are necessary to understand the nature and dynamics of ACP-EEC cooperation and, as in the case of the term acp, references to milestones in the relationship between the Community and the developing countries.
The fact that the term acp had not been a key word in the former agreements – in fact, it had not been used at all – may appear unusual, considering that the Lomé I Convention represented the fourth in a series of cooperation agreements between the Community and the developing countries. Only by looking into the texts of the Conventions and learning more about the origin of the ACP group is it possible to discover the reasons why this is the case.
This step involves the search for words and word clusters referring to the Community's contracting parties prior to Lomé, which is based on data from the respective subcorpora, viz.
Rome, Yaoundé I and Yaoundé II. Part Four of the EEC's founding treaty, the Treaty of Rome, signed in 1957, established an “Association of the Overseas Countries and Territories”. At the time, the developing countries covered by the Association were still colonies of the European countries and were referred to as those “non-European countries and territories which have special relations with Belgium, France, Italy and the Netherlands” (European Communities 1957: Article 131). Accordingly, the word clusters countries and territories and special relations represent two of the most frequent multi-word units in the Rome corpus. The expression special relations clearly represents a euphemistic label for the colonial ties that four of the six European founding members had (Hewitt and Whiteman 2004: 133). The Yaoundé Conventions were Conventions of Association, with the newly independent signatory states in Africa being referred to as the associated states. This is reflected in the Yaoundé I corpus, where associated states and associated state appear among the most frequent word clusters and the word association ranks fourth in terms of keyness. The entry of the United Kingdom (UK) into the EEC in 1973 produced a massive extension of the geographical scope of the Community's development cooperation policy, as the former British colonies had to be taken into consideration. Twenty independent Commonwealth countries in Africa, the Caribbean, the Indian and the Pacific Ocean were invited to participate in the Community's negotiations with the Associated States on a new Convention of Association, namely the Convention that was to follow Yaoundé II (European Commission 1973: 5–6). According to Dieter Frisch, former Director General for Development at the European Commission, the English-speaking African states disliked and rejected the word association since, in their opinion, it clearly indicated “a second class membership of a post-colonial nature” (Misser 2008: 12). 
The fact that it was finally dropped was seen as a major step forward and paved the way for the conclusion of Lomé I. Hewitt and Whiteman note that “the linguistic change was significant here; that is, the elimination of the hated expression ‘association’, so redolent of neo-colonialism” (2004: 140). After the Convention of Lomé I was signed by nine EEC members and an enlarged group of 46 developing countries in February 1975, the non-European signatories to Lomé I joined forces and entered into the Georgetown Agreement, creating the African, Caribbean and Pacific, or acp, group of states as such (Percival 2008: 10). This brief digression on the history of ACP-EEC cooperation is meant to illustrate the idea of using the results of the corpus analysis as a starting point for a detailed investigation of words and word clusters and, consequently, of the concepts and issues involved. The emergence of the term acp has given rise to a more comprehensive analysis of the terms that were used to refer to the Community's contracting parties prior to Lomé, which in turn reveal information on the relationship between the EEC and these countries at the time.
4.2 Grouping word clusters into terminological domains
The analysis of the Lomé I corpus includes the generation and study of its word clusters. On the basis of the key words of Lomé I and considering the parameters (i.e. 2–6-word clusters with a minimum frequency of five), WordSmith generates a myriad of word strings, which, by eliminating noise and repeated strings, are reduced to 118 multi-word units. Table 3 shows the top 35 word clusters, in the order of frequency of occurrence.
These word clusters can be categorised into several terminological domains, each of which represents a particular theme of the Lomé I Convention. The establishment of terminological domains not only facilitates the processing of the data but also, and primarily, assists in working out the main topics prevailing in the corpus and, thus, in ACP-EEC cooperation.
Several word clusters are clearly related to the contracting parties, including the terms acp states and member states. Others refer to the institutions that were involved in the Convention (e.g. council of the european communities and council of acp ministers) as well as the joint institutions set up by Lomé I (e.g. council of ministers and committee of ambassadors). A number of word clusters are associated with the key elements of the Convention, viz. trade and aid. For example, the terms customs authorities, movement certificate and originating products are related to trade; the expressions projects and programmes, invitations to tender and exceptional aid are linked to the provision of financial resources. Furthermore, various types of cooperation (e.g. technical cooperation and industrial cooperation) and forms of development (e.g. industrial development and economic and social development) are frequently mentioned in the Convention.
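The grouping described above can be represented as a simple mapping from terminological domains to word clusters. The clusters are those mentioned in the text, whereas the domain labels are ad hoc labels chosen for this sketch and the function name is hypothetical; the assignment itself remains a manual, interpretative step.

```python
# Ad hoc domain labels (assumed for illustration) and the clusters named above.
domains = {
    "contracting parties": ["acp states", "member states"],
    "institutions": ["council of the european communities", "council of acp ministers"],
    "joint institutions": ["council of ministers", "committee of ambassadors"],
    "trade": ["customs authorities", "movement certificate", "originating products"],
    "aid": ["projects and programmes", "invitations to tender", "exceptional aid"],
    "cooperation": ["technical cooperation", "industrial cooperation"],
    "development": ["industrial development", "economic and social development"],
}

# Reverse lookup: cluster -> domain.
cluster_to_domain = {c: d for d, cs in domains.items() for c in cs}

def group_by_domain(clusters, lookup):
    """Sort a list of clusters into domains; clusters that have not yet
    been categorised are collected under 'unassigned'."""
    grouped = {}
    for cluster in clusters:
        grouped.setdefault(lookup.get(cluster, "unassigned"), []).append(cluster)
    return grouped

grouped = group_by_domain(
    ["acp states", "exceptional aid", "movement certificate", "interest rate"],
    cluster_to_domain,
)
```

Keeping an explicit "unassigned" group makes it visible which clusters still await a decision, which reflects the fact that the categories are neither watertight nor clear-cut.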