«Andrés Gregor Zelman The University of Amsterdam 2002 ii Mediated Communication and the Evolving Science System: Mapping the Network Architecture of ...»
From the analysis of the four query words (Firms, Organization, Project, and Task), additional information concerning their distribution over the dataset was gained. The keyword Firms occurs with a high frequency in both the second and final time periods, and this distribution is mirrored in its standardized mean (occurrences per 1000 words). The distribution of the other three keywords differs slightly. Organization, for example, occurred with a similar frequency across the first three document sets and increased in the fourth, whereas its standardized distribution shows the term to be more central to the first and third document sets. By contrast, the raw frequencies of the keywords Project and Task vary across the document sets, while their standardized means show a different distribution. Project appears to be a central term in the first and final document sets, but when standardized it is shown to be used more in the first and third. Likewise, the keyword Task appears to be central to the first and final document sets, but when standardized it proves particularly central to the first and third time periods. These distributions give an immediate sense of the cognitive priorities of each time period.
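The difference between a raw count and a standardized mean (occurrences per 1000 words) can be illustrated with a minimal sketch. The tokenizer, the function name `keyword_stats`, and the two toy document sets are assumptions made for illustration; they are not the SOEIS data or the procedure actually used.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase ASCII word tokens; a deliberately simple, assumed tokenizer."""
    return re.findall(r"[a-z]+", text.lower())

def keyword_stats(text, keywords):
    """Return {keyword: (raw_count, count_per_1000_words)} for one document set."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    stats = {}
    for kw in keywords:
        raw = counts[kw.lower()]
        per_1000 = 1000.0 * raw / total if total else 0.0
        stats[kw] = (raw, per_1000)
    return stats

# Hypothetical toy document sets (not the SOEIS texts).
period_1 = "the project task is defined and the task is assigned"
period_4 = "firms " * 3 + "and the project reports on the task " * 5

stats_1 = keyword_stats(period_1, ["Project", "Task"])
stats_4 = keyword_stats(period_4, ["Project", "Task"])
```

In this toy case Task has a higher raw count in the longer fourth set but a lower rate per 1000 words, which is the kind of divergence between raw and standardized distributions described above.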
In what follows, a contextualization of these words using their central neighbourhood collocates will serve to isolate the ways that each of these terms was used. As argued in the Research Questions & Expectations section of this chapter, textual analysis computing tools alone cannot locate associate words (words which are linked to keywords by cognitive association), but it is possible to use such tools to locate neighbourhood keywords. Here the isolation of keywords and their central collocates aimed to locate the ‘sub-symbolic’ meaning of words, which remains contingent on the association between them. The key collocates of each query word were isolated using the WordSmith program. From these occurrences, ten to twelve words were selected for display below. Once isolated, a further selection was made of two key collocates to identify which words are associated with the query words.11 Again, the full results can be viewed in Appendix A.6. The selected collocates for each keyword, and its successive co-collocates, are shown below in four figures: Figure 4.1: ‘Firms’ Collocate Analysis, Figure 4.2: ‘Organization’ Collocate Analysis, Figure 4.3: ‘Project’ Collocate Analysis, and Figure 4.4: ‘Task’ Collocate Analysis.
10 The visual comparison between the four query words can be viewed in Appendix A.6.
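The two-step collocate procedure can be sketched in a few lines. This is not the WordSmith implementation: the fixed token window, the function names `collocates` and `co_collocates`, and the reading of a joint query as "contexts in which both words fall within the same window" are all assumptions, and no stop-list is applied.

```python
from collections import Counter

def collocates(tokens, query, window=5, top=10):
    """Count words occurring within `window` tokens of any query term."""
    query = {q.lower() for q in query}
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok in query:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] not in query:
                    counts[tokens[j]] += 1
    return counts.most_common(top)

def co_collocates(tokens, keyword, collocate, window=5, top=10):
    """Count neighbours of `keyword` only where `collocate` is also in the window."""
    keyword, collocate = keyword.lower(), collocate.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        span = tokens[max(0, i - window): i + window + 1]
        if collocate in span:
            counts.update(w for w in span if w not in (keyword, collocate))
    return counts.most_common(top)

# Hypothetical toy text (not the SOEIS corpus), already lowercased.
tokens = ("small firms and institutions form networks "
          "while firms build networks with institutions").split()

first_level = collocates(tokens, ["firms"], window=2)
second_level = co_collocates(tokens, "firms", "institutions", window=2)
```

The first call corresponds to listing the neighbourhood collocates of a single query word; the second narrows the contexts to those in which a chosen collocate also appears, which is one plausible reading of using the keyword and a collocate together as search terms.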
This collocate analysis isolated an interesting set of neighbourhood keywords for the word Firms, including Institutions and Network(s). When the keyword and these collocate words were used together as the search terms, an additional level of specification was achieved. When Firms and Institutions were used as the query words, only the words Network and Networks were identified. When Firms, Network, and Networks were searched together, only the words Small and Institutions were found. This approach gives us a clear sense of the structure of the discourse surrounding the use of the term Firms in the print communication of the SOEIS project.
When the keyword Organization was examined for its neighbourhood collocates, a self-evident grouping of words was identified. From these collocates, the words Project and Task were selected for the deeper analysis. Not surprisingly, when Organization and Project were queried together the words Self, Information, and Society were identified. These words were central to the project and are indeed title words of the research project itself. When Organization and Task were queried together, the words Self and Modeling arose. The word Self is to be expected given its link with the word Organization. The word Modeling, while not exhibiting a high frequency as a collocate of the search terms Organization and Task, was identified as key, suggesting that the word played a particular role in the print communications of the SOEIS project. Modeling is understood here to represent a link between the words Organization and Task that might otherwise escape unnoticed. All three words connote a processual concern with the carrying out of the project tasks.
11 The words in italics are the keywords selected from each collocate list for the deeper analysis.
When the keyword Project was analyzed for its collocates, a wide range of neighbourhood words was identified. From these, the words Results and SOEIS were selected for the deeper analysis, as they concern the SOEIS project and its motivation.
When a search was performed on the keyword Project and the word Results in tandem, the words Entire, Research and Dissemination were identified as the central neighbourhood collocates. While the significance of the word Entire in this distribution remains conjectural, the words Research and Dissemination are clearly cognitively connected with the keyword Project and this analysis thereby reveals the importance of these terms to each other – to the operation of the SOEIS project, and to the composition of the print communication dataset itself. When Project and SOEIS were sought together the words Organization, Task, and HTTP arose. The relationship between these words and the keyword Project is clear – here the orientation of these words to each other reveals a collective cognitive orientation of the SOEIS members to the function words: Task and Organization.
Finally, the fourth collocate query performed in this analysis used Task as the query keyword. Of the resulting neighbourhood collocates, the words Project and Policy were selected for further analysis. When Task and Project were used as the query terms, the words SOEIS, work, and deliverable arose. The relationship between these types of words is clear – each is oriented toward the processual aspect of the carrying on of the research project. Similarly, when Task and Policy were queried together, the words Implications, Research and European were revealed. These latter words are also processually, rather than conceptually, oriented.
The collocate analysis of the four keywords (Firms, Organization, Project and Task) has revealed a networked interconnectivity between keywords and their neighbourhood collocates. Interestingly, the words isolated as important to the SOEIS project (Firms and Organization) tend to have conceptual words as their collocates, whereas the words selected on the basis of importance for the functioning of the SOEIS project (Project and Task) tend to have process-oriented words as their collocates.
Combined, these analyses reveal a unity across the sets in that most words occur in the final document set, providing another indicator of an ‘aggregating text’. The emergent text is shown to accumulate over the time periods. The network dimensions of the SOEIS research group and their collective print output can thereby be understood to serve an archival function. With respect to the theoretical triad, the architecture of the print communications has been sketched in with a complex network of interrelations.
When the fluctuations of keyword use were compared as a time series, discernible networks of keyword usage were revealed. Similarly, and in light of Actor Network Theory, the collocate analysis revealed interrelationships between networked keywords. The analysis of the SOEIS print communication has revealed networked keyword use over the document set, but whether these are properties particular to the print medium will be more easily discernible once compared with the electronic communication database.
System
The analysis of the systemic properties of the SOEIS print communication is described below and provides an interesting juxtaposition to the network analysis in that the print behaviour of the SOEIS participants is conceptualized differently. Here the communications are viewed collectively as an overall system. Both the linear and non-linear relationships between the wordlists for each time period are compared.
The examination of the dataset for system transformation entails an assessment of the texts for critical transitions or path dependencies over the four datasets.12 Here we measure the expected information content of each time period, as each relates to the previous period or state of the communication. The analysis was performed on two levels. The first made the comparison on the shared occurrence of all words present in the full reference corpus – Print All: 7627 words. The print document set was first checked for linear transitions between P1-P2, P2-P3, and P3-P4, and was then compared for non-linear associations; namely, P1-P3, P1-P4, and P2-P4. The results are presented below in Figure 4.5: Four Time Periods Compared Linearly (7627 words) and Figure 4.6: Four Time Periods Compared Non-Linearly (7627 words).
12 Callon 1986 – Path dependency is interpreted as ‘obligatory passage points’ (Leydesdorff, 1995:100). An obligatory passage point from A through B to C is found if $AB + BC < AC$; in contrast, no path dependency is found if $AB + BC > AC$.
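[Figure 4.5: Four Time Periods Compared Linearly (7627 words); Figure 4.6: Four Time Periods Compared Non-Linearly (7627 words). Surviving figure values: .614, .355; (1355), (1070).]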
The linear and non-linear relationships are compared in the following way. Linearly, 0.509 bits of information were shared between the first and second periods, and 0.453 bits between the second and third. To determine whether the pathway between period one and period three involved a critical revision of the information (via period two), we look to the bits of information shared non-linearly between periods one and three (0.355 bits). If the total bits of information shared between periods one and two, and between two and three, is less than the bits of information shared non-linearly between periods one and three, then period two can be understood to have entailed a critical revision of the information, and was thereby critical for the development of the shared information (by boosting the previous signal). When compared for the degree of continuity between the time periods, no significant difference was found between the linear relationships (P1-P2, P2-P3, P3-P4) and the non-linear relationships (P1-P3, P1-P4, P2-P4). However, note that since the time periods do not share the same number of words (for example, 1828 words were shared between periods one and two, and 1214 words between periods two and three), the calculation is confronted with a division by zero.13 The solution to this problem is found by comparing only the words shared between the four time periods. Using this additional level of specification, the word lists were compared on the basis of the words that were shared by all four texts – Print Shared: 1009 words. The results are presented below in Figure 4.7: Four Time Periods Compared Linearly (1009 words) and Figure 4.8: Four Time Periods Compared Non-Linearly (1009 words).
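The comparison for periods one to three can be worked through with the three values reported above. The sketch below simply applies the stated criterion (the two linear steps summed, compared against the direct non-linear step); the function name `is_critical_revision` is a hypothetical label, and the test says nothing about statistical significance.

```python
def is_critical_revision(i_ab, i_bc, i_ac):
    """True if the route A -> B -> C totals fewer bits than the direct A -> C step."""
    return i_ab + i_bc < i_ac

# Values reported in the text for the 7627-word comparison (in bits).
p1_p2 = 0.509   # linear: period one to period two
p2_p3 = 0.453   # linear: period two to period three
p1_p3 = 0.355   # non-linear: period one to period three

via_p2 = p1_p2 + p2_p3                                  # 0.962 bits
critical = is_critical_revision(p1_p2, p2_p3, p1_p3)    # False
```

Since 0.509 + 0.453 = 0.962 bits exceeds 0.355 bits, the criterion is not met for this triple, so period two does not qualify as a critical revision on these figures.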
13 Zero creates a problem because it doesn’t occur as a predictor. Leydesdorff argues that given the formula for the expected information value ($I = \sum_i q_i \log_2 (q_i / p_i)$), we are “confronted with the division by zero in the case of the emergence of a ‘new’ occurrence in the a posteriori text” (1995:94). For Information Theory this means that the ‘appearance of something which was predicted with certainty not to occur (p = 0) comes as a total surprise, so that this message has infinite expected information value.’ (ibid:94).
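A minimal sketch of the expected information value, and of why restricting the comparison to shared words avoids the division by zero, is given here. The function name, the `shared_only` flag, and the toy word counts are assumptions for illustration; the sketch compares two word lists, whereas the analysis itself restricts to words shared by all four texts.

```python
import math

def expected_information(prior, posterior, shared_only=False):
    """I = sum_i q_i * log2(q_i / p_i) in bits (p: a priori, q: a posteriori).

    `prior` and `posterior` map words to raw counts. With shared_only=True
    both distributions are renormalized over the words present in both lists.
    """
    vocab = set(prior) | set(posterior)
    if shared_only:
        vocab = {w for w in vocab
                 if prior.get(w, 0) > 0 and posterior.get(w, 0) > 0}
    p_total = sum(prior.get(w, 0) for w in vocab)
    q_total = sum(posterior.get(w, 0) for w in vocab)
    if vocab and (p_total == 0 or q_total == 0):
        raise ValueError("both word lists must contain at least one count")
    info = 0.0
    for w in vocab:
        q = posterior.get(w, 0) / q_total
        p = prior.get(w, 0) / p_total
        if q == 0:
            continue            # a term with q = 0 contributes nothing
        if p == 0:
            return math.inf     # a 'new' word: division by zero, infinite surprise
        info += q * math.log2(q / p)
    return info

# Hypothetical toy word lists (not the SOEIS counts).
earlier = {"project": 2, "task": 2}
later = {"project": 3, "task": 1, "firms": 4}

all_words = expected_information(earlier, later)                       # inf
shared = expected_information(earlier, later, shared_only=True)        # about 0.189
```

The word firms is absent from the earlier list, so the unrestricted comparison is infinite; once only the shared words are kept, the value is finite and comparable across pairs of periods.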
Again, when compared for the degree of continuity between time periods no significant difference was found between the linear and non-linear relationships.
Thus, contrary to the original expectation, the results do not show any evidence of critical revision or transition in the collective word use of the authors. This suggests that there was no fundamental reorganization of the word usage during the course of the research project.14 However, this may also be interpreted to reflect a fundamental quality of collaborative research projects: that they are well organized and codified into particular discourses. As indicated by the standardized ratio of the percentage of unique words in the architecture analysis, the SOEIS print communications appear to have considerable continuity.
Since no critical transitions were found in this analysis, a more detailed examination of the four respective wordlists for overall word distribution was performed in order to further test the expectation that self-organizational properties could be discerned.
Where the previous analysis concerned the overall distribution of collective (and thereby systemic) patterns of word use, here we look again to the architectural parameters of information flow present in this mode of communication to aid the interpretation of the systemic analysis. The analysis measures both Specificity and Transmission. Specificity is understood as the specificity of total word distribution, or more accurately: the ratio of the expected information content of the distribution relative to the maximum information content. Specificity is therefore a measure of the degree to which some words occur differently across the print dataset along the time dimension. By contrast, Transmission is the coupling of word distribution and the time dimension; it is the mutual information between word distributions over the time periods and is understood to reflect a reduction of the uncertainty (specificity): it is thereby a measure of the flow of mutual information across the dataset. Table 4.6: Print System Dynamics, below, shows the overall word Occurrence, Unique Words, Specificity, and Transmission of the print dataset.15
14 The path dependencies tested within the print database do differ from the electronic; in the next chapter we compare the print and electronic databases for relative similarities and differences.
15 The occurrence and unique word counts shown in Table 4.6: Print System Dynamics differ from the occurrence and unique word rates found in the architecture analysis above because, for this analysis, the texts were pre-filtered using the stop-list.
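Transmission, described above as the mutual information between word distribution and the time dimension, can be sketched with the standard identity T = H(words) + H(time) − H(words, time). The function names and the toy count tables are assumptions; the sketch covers Transmission only, since the exact formula used for Specificity is not stated in the text.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability cells are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def transmission(counts):
    """Mutual information (bits) between words and time periods.

    `counts` maps each word to a list of its counts per time period;
    all lists are assumed to have the same length.
    """
    rows = list(counts.values())
    total = sum(sum(row) for row in rows)
    n_periods = len(rows[0])
    joint = [n / total for row in rows for n in row]
    word_marginal = [sum(row) / total for row in rows]
    time_marginal = [sum(row[t] for row in rows) / total
                     for t in range(n_periods)]
    return entropy(word_marginal) + entropy(time_marginal) - entropy(joint)

# Hypothetical toy tables (not the SOEIS counts), two words over two periods.
uncoupled = {"project": [1, 1], "task": [1, 1]}   # word use identical in both periods
coupled = {"project": [2, 0], "task": [0, 2]}     # each word confined to one period

t_uncoupled = transmission(uncoupled)   # 0 bits
t_coupled = transmission(coupled)       # 1 bit
```

When word use is the same in every period the transmission is zero; when each word is confined to a single period, knowing the period removes one full bit of uncertainty about the word, which is the sense in which Transmission reflects a reduction of uncertainty.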