

«Creativity Support for Computational Literature By Daniel C. Howe A dissertation submitted in partial fulfillment of the requirements for the degree ...»


Notice that the “split” operation maintains the default features that are still relevant (e.g., part-of-speech, stress, phonemes, syllables, etc.) and deletes those that are not (e.g., “chunk-type”, as the individual word is no longer part of a “noun-phrase”). Such “smart” decomposition works similarly for custom features (e.g., the trivial “letters-per-word” feature used in the example), assuming that the feature in question contains the same number of items, when split on the current word-delimiter, as the total number of words in the phrase. Once we have a reference to a RiString with a single word (which can be checked via the method RiString.getWordCount()), we can add a range of additional features via various other RiTa objects. For example, we might use RiWordNet to add synonyms, meronyms, or hypernyms, or perhaps the RiLexicon object to add rhymes, “soundex” matches, and alliterations.

… a feature called “sentence-id”, identifying it as being part of a sentence represented by another RiString.
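The “smart” split described above can be illustrated with a minimal sketch. It assumes features are stored as whitespace-delimited strings in a map; the class FeaturedText, its methods, and the sample feature values are hypothetical stand-ins for illustration, not the actual RiString API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical feature-carrying text; a stand-in, not the RiTa RiString class. */
class FeaturedText {
    private final String text;
    private final Map<String, String> features = new LinkedHashMap<>();

    FeaturedText(String text) { this.text = text; }

    String text() { return text; }
    void setFeature(String name, String value) { features.put(name, value); }
    boolean hasFeature(String name) { return features.containsKey(name); }
    String getFeature(String name) { return features.get(name); }

    /**
     * Splits the text into one FeaturedText per word. A feature is carried
     * over only if it has exactly one item per word when split on the same
     * delimiter; other (phrase-level) features are dropped.
     */
    List<FeaturedText> splitWords() {
        String[] words = text.trim().split("\\s+");
        List<FeaturedText> parts = new ArrayList<>();
        for (String word : words) {
            parts.add(new FeaturedText(word));
        }
        for (Map.Entry<String, String> feature : features.entrySet()) {
            String[] items = feature.getValue().trim().split("\\s+");
            if (items.length != words.length) {
                continue; // does not align with the words, e.g. "chunk-type"
            }
            for (int i = 0; i < words.length; i++) {
                parts.get(i).setFeature(feature.getKey(), items[i]);
            }
        }
        return parts;
    }
}

class FeatureSplitDemo {
    public static void main(String[] args) {
        FeaturedText phrase = new FeaturedText("the white dog");
        phrase.setFeature("pos", "dt jj nn");              // one item per word
        phrase.setFeature("letters-per-word", "3 5 3");    // one item per word
        phrase.setFeature("chunk-type", "noun-phrase");    // phrase-level only
        for (FeaturedText word : phrase.splitWords()) {
            System.out.println(word.text() + " pos=" + word.getFeature("pos")
                + " hasChunkType=" + word.hasFeature("chunk-type"));
        }
    }
}
```

Note that a count-based rule like this one cannot tell a phrase-level feature from a word-level one when the phrase has only a single word, so a real implementation would likely need additional bookkeeping.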

This approach provided several advantages: it was simple and easy to learn; it was easy to customize and extend; and it facilitated data sharing both between RiTa objects and with external libraries, programs and web-services. Further, as Java strings were one of the few basic variable types (along with the primitives) used in the Processing API, the approach was intuitive to new users and required little additional documentation or explanation. The lack of support in Java for operator overloading prevented the use of this augmented String from being truly transparent; that is, instead of typing (in typical Java syntax) String s = "hello", RiTa users had to type RiString s = new RiString("hello"). One could argue, though, that this is more consistent with Java's (itself somewhat inconsistent) distinction between primitives and objects: all other Java objects (except for String) require the use of the “new” keyword, while the Java primitives do not, a fact that often causes some confusion for new users.

As might be expected, the usual Java String methods (in fact the entire CharSequence interface) could be invoked on a RiString object just as one would on a String. As mentioned above, RiString.split(), for example, would split the instance described above into three new RiString objects, one for each word. This allowed RiString objects to be passed along a pipeline (similar to Unix command-line tools), and for lookups to be performed in “lazy” fashion, if and as needed. For example, if a core component (say the RiGrammar object) needed to know the part-of-speech for a RiString with which it was generating a phrase, it might ask whether the “pos” feature was present via the following line:

if (theRiString.hasFeature("pos")) {...}

If not, it might invoke the RiPosTagger object to first add the part-of-speech feature before proceeding with generation. Objects in different states of analysis could thus be exchanged with features added only as needed, enabling, through a sort of ad-hoc introspection mechanism, a weak version of polymorphism (the lack of which is one critique of such a non-hierarchical approach). In fact, it was this feature that later enabled the use of RiTa objects in a real-time drag-and-drop environment for language processing (see Chapter 6: Future Work).
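The “lazy” lookup pattern described above (check for a feature, and compute it only when it is absent) might be sketched as follows. The class LazyText, the method featureOrCompute, and the toy tagger are assumptions made for illustration; they are not RiString or RiPosTagger.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical feature-carrying text; a stand-in, not the RiTa RiString class. */
class LazyText {
    private final String text;
    private final Map<String, String> features = new HashMap<>();

    LazyText(String text) { this.text = text; }

    boolean hasFeature(String name) { return features.containsKey(name); }

    /** Returns the feature, running the annotator only if the feature is missing. */
    String featureOrCompute(String name, Function<String, String> annotator) {
        if (!hasFeature(name)) {
            features.put(name, annotator.apply(text));
        }
        return features.get(name);
    }
}

class LazyLookupDemo {
    /** Toy tagger for illustration only; a real tagger would use a trained model. */
    static String toyPosTag(String word) {
        return word.endsWith("ly") ? "rb" : "nn";
    }

    public static void main(String[] args) {
        LazyText word = new LazyText("quickly");
        System.out.println(word.hasFeature("pos"));   // false: not yet analyzed
        System.out.println(word.featureOrCompute("pos", LazyLookupDemo::toyPosTag)); // rb
        System.out.println(word.hasFeature("pos"));   // true: cached for later stages
    }
}
```

Because the result is stored on the object, later stages of a pipeline can reuse it without knowing which component originally added it.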

As mentioned above, extending RiString objects with custom feature types was trivial, as one needed only to add a new key-value pair that could later be checked for and accessed (rather than creating a new subclass, which required an understanding of inheritance, overloading, or overriding). Many students used this facility to add features that were specific to their projects, such as a feature denoting a “semantic-link” between word pairs, or a “source” tag during web-parsing to later identify the page on which a word or phrase was found. This structure also eliminated the need for many external data structures that would otherwise have been necessary. A potential downside was the aggregate expense for programs with many RiString objects. In some programs, for example, a large number of feature maps might be allocated (perhaps even one per word in a large text), each containing as few as one feature. Obviously, in such cases a single map (or hashtable) with words as keys and the desired feature as values would be more efficient. As one might expect, such optimizations were not encouraged for students in the early stages of their projects, but were easily implemented if and as necessary in later stages.

Statistical Crossover

A second design tension involved the constraint of (perceptual) real-time performance and the desire of students to work with very large amounts of text. For example, when working with n-gram models, students often wanted to encourage frequent “crossover” (defined below) among a large number of texts by using a relatively low n-value. While all methods on RiTa's n-gram object, RiMarkov, returned very quickly no matter the size of the model, there were two related problems. First, large models require large amounts of memory, and this directly conflicted with our constraint of having programs run as applets in standard web browsers. Second, large models can take significant time to load, which (a) may result in unacceptable delays for users and (b), more importantly, makes development and debugging a tedious process, as each run of the program might require up to 30 seconds to load, a situation that directly conflicted with the design constraint of facilitating rapid (or micro) iteration, as described below.

Thus we see that a single unexpected use-case threatened three related but distinct design constraints (perceptual real-time, micro-iteration, and web-based execution). We should note that this particular set of issues, the memory requirements of large textual models and corpora, is one reason why creativity support tools for this context have been so difficult to realize in the past [Howe and Soderman 2009]. Perhaps for the first time however, the computational power of consumer-level computers is near the point where this difficulty can be successfully mitigated.

Our attempt to resolve this design tension resulted in two (parallel) strategies, one involving the creation of a new mechanism (and package) in the toolset, presented to library users via the RiTaServer object described above, and one involving a change to the RiMarkov class itself. As computational writers often have very different goals than typical natural language researchers, it should come as no surprise that their perspective on a specific process may be vastly different from that of a researcher interested in, say, machine translation, even when they are using the very same algorithmic method. The use of n-grams (or Markov models) presented just such a case, as, over the course of the semester, it became clear that the methods typically found in an n-gram package were not sufficient for students’ needs. One example became evident through conversations with several students attempting to use large text models for n-gram based generation who, unlike in the translation case, were most interested in a property we came to call “crossover”.

As an example, we can take two texts, A and B, with the number of sentences in each being sA and sB respectively. From the difference of the sets of sentences in these texts, there will be some number of unique sentences in each, uA and uB, and some number d = (sA - uA) = (sB - uB) of sentences present in both. Depending on the uniqueness of the texts at the sentence level, as represented by

[equation not preserved in the source text]

some percentage of all n-grams in the joint model will contain words unique to each text. It was just these “crossover” sentences, present in neither text A nor text B, that were of primary interest for the subset of users attempting to add more and/or longer texts. While the typical application of n-grams would be to find sentences that were most likely to occur, the goal of these users, rather interestingly, was the opposite: specifically, to find novel sentences, those that could logically occur according to the constraints of the model but were less (or even least) likely to do so. This inverted use pattern, related to the artistic strategy of misuse, proved to be a recurring theme when algorithmic techniques were borrowed from existing areas of research for use in creative practice, a topic discussed further in chapter 3.

For this group of users, however, a simple constraint on the generation process achieved the same goal (specifically, increased “crossover”), giving higher probabilities to those sentences not existing in any of the input texts from which the model was built. Since texts were processed sequentially, when this option was specified (via a simple method call), it was easy to build a compact lookup table of the input sentences (again assuming the number of input texts is not too vast) and then use the table to generate higher percentages of sentences not already existing in the model [21]. Sentences generated by the component could also be added to the lookup table, thus ensuring that no duplicate generations occurred.

RiGrammarView

A third design tension involved the RiGrammar object and a number of our design constraints, specifically transparency, open exchange, and rapid (or micro) iteration. To support the first two of these constraints, we designed the RiGrammar object to read grammars from plain-text files stored in the user’s resource (or data) folder, along with any required fonts, images, sound files, etc. When exported via the RiTa plugin, these files were linked, along with the project’s source files, and displayed in the HTML tags for the page. Thus viewers of the piece interested in its inner workings could access at least two additional layers below its surface representation. In addition to facilitating transparency regarding the workings of the piece, this also enabled open exchange, in that students could download and experiment with each other's grammar files. Further, as it provided a relatively clean separation of concerns between process (the source) and data (the grammar), it satisfied the generic design principle of modularity.

[21] Note that this simple constraint does not guarantee “true” crossover, only generation of phrases or sentences not previously seen in any of the input texts.
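The lookup-table constraint on n-gram generation discussed under Statistical Crossover can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not RiMarkov itself: it uses a bigram (n = 2) word model, the class and method names are hypothetical, and it applies the strictest form of the constraint by rejecting any sentence already in the table rather than merely lowering its probability.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical bigram generator that rejects previously seen sentences. */
class NoveltyMarkov {
    private static final String START = "<s>";
    private static final String END = "</s>";

    private final Map<String, List<String>> successors = new HashMap<>();
    private final Set<String> seen = new HashSet<>(); // the lookup table
    private final Random random;

    NoveltyMarkov(long seed) { this.random = new Random(seed); }

    /** Adds one input sentence to both the model and the lookup table. */
    void addSentence(String sentence) {
        String normalized = sentence.trim().replaceAll("\\s+", " ");
        seen.add(normalized);
        String previous = START;
        for (String word : normalized.split(" ")) {
            successors.computeIfAbsent(previous, k -> new ArrayList<>()).add(word);
            previous = word;
        }
        successors.computeIfAbsent(previous, k -> new ArrayList<>()).add(END);
    }

    /** Unconstrained generation: walk the chain until END or maxWords. */
    private String generateAny(int maxWords) {
        StringBuilder out = new StringBuilder();
        String previous = START;
        for (int i = 0; i < maxWords; i++) {
            List<String> options = successors.get(previous);
            if (options == null) break; // empty model
            String word = options.get(random.nextInt(options.size()));
            if (word.equals(END)) break;
            if (out.length() > 0) out.append(' ');
            out.append(word);
            previous = word;
        }
        return out.toString();
    }

    /**
     * Returns a sentence found neither in the input texts nor among earlier
     * outputs, or null if none is found within maxTries attempts. Each
     * returned sentence is added to the table so it cannot be repeated.
     */
    String generateNovel(int maxWords, int maxTries) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            String candidate = generateAny(maxWords);
            if (!candidate.isEmpty() && seen.add(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}

class CrossoverDemo {
    public static void main(String[] args) {
        NoveltyMarkov model = new NoveltyMarkov(42L);
        model.addSentence("the cat sat on the mat"); // stands in for text A
        model.addSentence("the dog sat on the rug"); // stands in for text B
        System.out.println(model.generateNovel(12, 1000));
        System.out.println(model.generateNovel(12, 1000));
    }
}
```

As the footnote above observes, a filter of this kind only guarantees output not previously seen; it does not guarantee that a given sentence actually mixes material from more than one input text.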

Figure 2: Screenshot of the RiGrammarView tool.

Unfortunately, this setup significantly limited students' ability to iterate rapidly during the potentially long phase of grammar development. With the original setup (depending slightly on the environment used), each modification to the grammar required closing the program, modifying and re-saving the grammar file, recompiling the source, then re-launching the program. Over hundreds or thousands of small grammar changes, this time accumulated and caused significant frustration. Much as in the case of the RiTaServer above, our resolution to this tension involved the implementation of an auxiliary tool, namely the RiGrammarView component (as shown in Figure 2). By adding a single line to their program (see Figure 3), users could invoke a custom editor that loaded (by default) the contents of the current grammar file. Then, when the user pushed a ‘refresh’ button, the RiGrammarView would dynamically swap out the grammar rules in the associated RiGrammar object, replacing them with those found in the editor text.

Figure 3: Code invoking the RiGrammarView editor.

With this setup users needed only to make changes and hit refresh to immediately see the results in their running program. If the changes were satisfactory the grammar file could be saved to disk. If one desired to test a program with multiple grammars, each could be loaded from separate files at runtime, and the results compared without restarting the program. When satisfied with the grammar as written, one could simply comment out the single line of code invoking the editor and publish their sketch as usual. In this way we were able to maintain a clean separation of concerns between code and grammar files, facilitate open exchange and transparency, and still enable rapid micro-iterations, here within a single execution cycle.
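The refresh mechanism described above, in which the rules of a live grammar object are replaced from editor text without restarting the program, might look something like the following sketch. The class SwappableGrammar, its methods, and the one-rule-per-line "name -> alternative | alternative" text format are all assumptions made for illustration; they are not the RiGrammar or RiGrammarView API, nor RiTa's grammar file format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical grammar whose rules can be replaced while the program runs. */
class SwappableGrammar {
    private volatile Map<String, List<String>> rules = Collections.emptyMap();
    private final Random random = new Random();

    /** Parses the editor text and swaps in the new rule set in one step. */
    void replaceRules(String editorText) {
        Map<String, List<String>> parsed = new HashMap<>();
        for (String line : editorText.split("\\R")) {
            String[] sides = line.split("->", 2);
            if (sides.length < 2) continue; // blank or malformed line
            List<String> alternatives = new ArrayList<>();
            for (String alternative : sides[1].split("\\|")) {
                alternatives.add(alternative.trim());
            }
            if (!alternatives.isEmpty()) {
                parsed.put(sides[0].trim(), alternatives);
            }
        }
        rules = parsed; // old rules are discarded; no restart needed
    }

    /** Expands a symbol using whichever rule set is current at call time. */
    String expand(String symbol) {
        return expand(rules, symbol, 0);
    }

    private String expand(Map<String, List<String>> snapshot, String symbol, int depth) {
        List<String> alternatives = snapshot.get(symbol);
        if (alternatives == null || depth > 20) {
            return symbol; // terminal word, or recursion limit reached
        }
        String choice = alternatives.get(random.nextInt(alternatives.size()));
        StringBuilder out = new StringBuilder();
        for (String token : choice.split("\\s+")) {
            if (token.isEmpty()) continue;
            if (out.length() > 0) out.append(' ');
            out.append(expand(snapshot, token, depth + 1));
        }
        return out.toString();
    }
}

class GrammarRefreshDemo {
    public static void main(String[] args) {
        SwappableGrammar grammar = new SwappableGrammar();
        grammar.replaceRules("<start> -> hello <who>\n<who> -> world");
        System.out.println(grammar.expand("<start>")); // hello world

        // Simulates pressing 'refresh' after editing the grammar text.
        grammar.replaceRules("<start> -> goodbye <who>\n<who> -> moon");
        System.out.println(grammar.expand("<start>")); // goodbye moon
    }
}
```

Building the new rule map completely before assigning it means that a running expansion sees either the old rules or the new ones, never a half-updated mixture.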

2.3 Component Descriptions

Researchers also need to distinguish between software features that are merely novel and those that are demonstrably effective in enabling users to produce creative outcomes. [Schneiderman et al. 2006]

Since its creation, the RiTa toolkit has developed on a number of parallel tracks. New objects have been defined and new methods added to accommodate functionalities that arose during real-world use. Similarly, new mechanisms were implemented to better address elements of the project in which design tensions were identified. At the same time, the RiTa tools were continually re-factored to increase consistency, clarity, efficiency, and usability.

This section describes the functionality provided by the objects in the “core” RiTa library. A brief description of each object follows, highlighting common usage patterns, and specific literary augmentations. Where applicable, explanations of design concerns leading to specific implementation decisions are presented.

The RiTa toolkit is implemented in Java, optionally integrates with the Processing language environment, and runs on all the major platforms including Windows, Mac OS X, and Linux/Unix. It is freely available under an open source Creative Commons license at http://www.rednoise.org/rita/. For further detail, see the complete API available at http://rednoise.org/rita/documentation/docs.html.
