
«Creativity Support for Computational Literature By Daniel C. Howe A dissertation submitted in partial fulfillment of the requirements for the degree ...»


RiMarkov

RiMarkov provides an implementation of a Markov chain (or n-gram) based generator with specific extensions for literary output. A language model is a statistical model used to analyze or generate elements of a text or texts via probability. N-gram models (or Markov chains) are statistical models in which the next item in a sequence is predicted based upon the frequency of that sequence in a set of input texts. The “n” in n-grams refers to the number of items (letters or words, for example) in each sequence considered in our estimate. If n = 1 we have a unigram model in which the probability of a single letter, for example, is estimated based only on the frequency of that letter in the input. In a bigram model, where n = 2, we would estimate the likelihood of a two-letter sequence, “Qu” for example, based on its frequency in the input compared to all other two-letter sequences in the input.[30] RiMarkov provides this functionality (by default) for words, though alternative tokenization strategies are supported via regular expressions.
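The bigram estimate described above can be sketched in a few lines of plain Java. This is a minimal illustration and not RiMarkov's implementation: the class name `BigramSketch` and its two methods are hypothetical, and the estimate is the simple relative frequency of one two-letter sequence among all two-letter sequences in the input.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: letter-bigram counts and relative frequencies. */
final class BigramSketch {

    /** Counts every two-character sequence in the input. */
    static Map<String, Integer> countBigrams(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            counts.merge(text.substring(i, i + 2), 1, Integer::sum);
        }
        return counts;
    }

    /** Frequency of one bigram relative to all bigrams seen in the input. */
    static double probability(Map<String, Integer> counts, String bigram) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        return counts.getOrDefault(bigram, 0) / (double) total;
    }

    public static void main(String[] args) {
        // 17 characters give 16 bigrams, 3 of which are "qu".
        Map<String, Integer> counts = countBigrams("quick quiet queen");
        System.out.println(probability(counts, "qu"));
    }
}
```

Replacing `substring` windows over characters with windows over a token array would give the word-level behavior that the text describes as the default.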

Additionally, RiMarkov provides a range of extensions specific to natural language and literature, including the recognition of abbreviations, elisions, and sentence boundaries to better facilitate sentence generation. Other such features include the weighting of inputs (e.g., for combining texts of different lengths), constraints on repetition, custom tokenization, feature compression (letter-case, synonyms, etc.), and the following methods: getCompletions(), getProbabilities(), and getProbabilityMap(), each of which allows for some degree of interactive control (or steering) of the model during generation. Additionally, RiMarkov supports dual-mode operation in which each word is augmented by part-of-speech information that can later be used to constrain the output to grammatical patterns based on part of speech.

[30] N-gram models were formalized by Claude Shannon in "A Mathematical Theory of Communication" [Shannon, 1949] and are a specific instance of the more general class of Markov chains.
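To make the idea of steering a model through its completions and probabilities concrete, here is a small word-level sketch. It does not reproduce the signatures of RiMarkov's getCompletions() or getProbabilityMap(), which are not given in the text; `WordModelSketch`, `train`, `completions`, and `probabilityMap` are hypothetical names, and the model is a plain bigram table over whitespace-separated words.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a word-bigram table that can be queried during generation. */
final class WordModelSketch {
    // For each word, how often each following word was observed.
    private final Map<String, Map<String, Integer>> next = new HashMap<>();

    /** Records every adjacent word pair in the text. */
    void train(String text) {
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            next.computeIfAbsent(words[i], k -> new TreeMap<String, Integer>())
                .merge(words[i + 1], 1, Integer::sum);
        }
    }

    /** Words observed after the given word, in alphabetical order. */
    List<String> completions(String word) {
        Map<String, Integer> counts = next.get(word);
        return counts == null ? new ArrayList<String>() : new ArrayList<>(counts.keySet());
    }

    /** Probability of each word observed after the given word. */
    Map<String, Double> probabilityMap(String word) {
        Map<String, Double> probs = new TreeMap<>();
        Map<String, Integer> counts = next.get(word);
        if (counts == null) {
            return probs;
        }
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / (double) total);
        }
        return probs;
    }
}
```

A program could inspect `probabilityMap` after each generated word and, for instance, reject low-probability continuations or prefer words drawn from user input, which is the kind of interactive control the paragraph describes.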

As noted by Wardrip-Fruin [2006], n-grams are remarkably effective, given both the simplicity of the technique and its lack of specificity to the domain of language. One could imagine models substantially more specific to language, and in fact there has been a growing call for researchers to “put language back into language modeling” [Rosenfeld 2000]. This would require more complex models, like RiMarkov’s dual-mode operation mentioned above, that work with features other than small groupings of undifferentiated word tokens. Researchers have experimented with approaches such as decision trees and link-grammars, but the majority of such attention has focused on “maximum-entropy” modeling [Berger et al. 1996]. Maximum-entropy is another statistical technique, like Markov chains, that has seen use in a range of contexts, but its flexibility allows for the selection of a wide range of features, including those specific to language and literature. The RiChunker and RiParser, described below, make use of maximum-entropy modeling techniques.[31]

[31] For more information, see “A Maximum Entropy Approach to Natural Language Processing” [Berger et al. 1996], which provides a good introduction to maximum-entropy techniques.

RiChunker

Based closely on the OpenNLP implementation, this object provides a simple implementation of a maximum-entropy chunker for finding non-recursive syntactic “chunks” such as noun-phrases, using the Penn conventions. Table 4 presents a list of phrase-tags used according to the Penn Treebank conventions.


Table 4: Alphabetical list of phrase-tags used in the Penn Treebank Project.

RiParser

Based closely on the OpenNLP implementation, this object provides an implementation of a maximum-entropy parser for finding recursive (or nested) syntactic “chunks” such as noun-phrases, using the Penn conventions as listed in Table 4.

RiGrammar

RiGrammar provides an implementation of a context-free grammar with specific extensions for generating literary texts. The implementation allows users to specify the rules and productions for a grammar in a local or remote plain-text file. A simple example grammar, following the RiTa conventions, is presented below.

# An Example CFG
######################################
# s - np vp
# np - det n
# vp - v | v np
# det - 'a' | 'the'
# n - 'woman' | 'man'
# v - 'shoots'
######################################

Figure 5: A simple RiTa grammar file.

While grammars are often used in natural language analysis, RiGrammar, like most RiTa objects, is implemented (and optimized) specifically for generation, and contains a range of features specific to the literary context. For one, all rules are dynamic; that is, they can be interrogated and modified at runtime. This is particularly important when one wants to change elements of the generation process based on what has been generated thus far.

Similarly, RiGrammar supports probabilistic rules in which probabilities can be modified dynamically at runtime.
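The two properties just described, rules that can be changed at runtime and probabilities that can be adjusted between expansions, can be sketched as follows. This is an assumption-laden illustration rather than RiGrammar's code: `WeightedGrammarSketch`, `Production`, `addRule`, and `expand` are hypothetical names, weights are relative (not normalized) probabilities, and any symbol without a rule is treated as a terminal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative only: a grammar whose rules and weights can change between expansions. */
final class WeightedGrammarSketch {

    /** One alternative for a rule, with a mutable relative weight. */
    static final class Production {
        final String[] symbols;
        double weight;

        Production(double weight, String... symbols) {
            this.weight = weight;
            this.symbols = symbols;
        }
    }

    private final Map<String, List<Production>> rules = new HashMap<>();
    private final Random random;

    WeightedGrammarSketch(long seed) {
        this.random = new Random(seed);
    }

    /** Adds an alternative to a rule; may be called at any time. */
    Production addRule(String name, double weight, String... symbols) {
        Production p = new Production(weight, symbols);
        rules.computeIfAbsent(name, k -> new ArrayList<Production>()).add(p);
        return p;
    }

    /** Picks one alternative with probability proportional to its weight. */
    private Production choose(List<Production> options) {
        double total = 0;
        for (Production p : options) {
            total += p.weight;
        }
        double r = random.nextDouble() * total;
        for (Production p : options) {
            r -= p.weight;
            if (r < 0) {
                return p;
            }
        }
        return options.get(options.size() - 1); // reached only if all weights are zero
    }

    /** Expands a symbol recursively; symbols without rules are emitted as terminals. */
    String expand(String symbol) {
        List<Production> options = rules.get(symbol);
        if (options == null) {
            return symbol;
        }
        StringBuilder out = new StringBuilder();
        for (String s : choose(options).symbols) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(expand(s));
        }
        return out.toString();
    }
}
```

Because `addRule` returns the `Production` it created, calling code can keep a reference and change its `weight` after inspecting what has been generated so far, for example shifting weight from `vp - v` to `vp - v np` to lengthen later sentences.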

In addition, specific extensions for generation have been implemented at the method level. A simple example of such a method is expandFrom. While the expand() method simply performs generation (or expansion) from the “start” symbol, expandFrom(String from) begins with the symbol contained in the “from” argument (which can consist of either terminals, non-terminals or both), and performs an expansion starting from there, so that partial expansions can be triggered without adjusting the grammar itself. Another example, perhaps the most specific to literature, is expandWith which takes two String arguments.

During generation the first argument, a terminal, is guaranteed to be substituted for the second, a non-terminal. Once this substitution is made, the algorithm then works backwards (up the tree representing the grammar from the leaf), ensuring that the first argument, the terminal, will appear in the output string. For example, with the grammar fragment above, one might call:

grammar.expandWith("woman", "n");

assuring not only that the n rule will be used at least once in the expansion process, but that when it is, it will be replaced by the terminal "woman". Further, expandWith can be used with terminals that are not present in the grammar. So the following would also work:

grammar.expandWith("child", "n");

though the String "child" is not present in the grammar. This algorithm enables generation that can adjust to the current state of the application. For example, if we want to generate a new sentence linking to one previously generated, we might pass one or more keywords from that sentence to expandWith, guaranteeing that the pair of generations will be linked, at least minimally, by that keyword. Similarly, when accepting input from a user, a program can generate phrases based on that input. This is particularly useful in conversational applications or for those with interactive characters.
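One way to obtain the guarantee described for expandWith is sketched below, together with a simple expandFrom. This is not RiGrammar's algorithm, which the text describes as working backwards from the leaf; the sketch instead works forwards from the start symbol and, at each step, restricts the choice to alternatives that can still lead to the requested non-terminal. The class `ExpandWithSketch`, its constructor, and `addRule` are hypothetical, only one occurrence of the non-terminal is forced, and symbols without rules are treated as terminals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Illustrative only; not RiGrammar's implementation. */
final class ExpandWithSketch {
    private final Map<String, List<String[]>> rules = new HashMap<>();
    private final Random random;
    private final String start;

    ExpandWithSketch(String start, long seed) {
        this.start = start;
        this.random = new Random(seed);
    }

    /** Adds one alternative for a rule. */
    void addRule(String name, String... symbols) {
        rules.computeIfAbsent(name, k -> new ArrayList<String[]>()).add(symbols);
    }

    /** Expands from the start symbol. */
    String expand() {
        return expandFrom(start);
    }

    /** Expands a space-separated mix of terminals and non-terminals. */
    String expandFrom(String from) {
        return expandAll(from.trim().split("\\s+"), -1, null, null);
    }

    /** Expands from the start symbol, forcing one nonTerminal to become terminal. */
    String expandWith(String terminal, String nonTerminal) {
        if (!reaches(start, nonTerminal, new HashSet<String>())) {
            throw new IllegalArgumentException(nonTerminal + " is unreachable from " + start);
        }
        return expandSymbol(start, terminal, nonTerminal, true);
    }

    /** True if target can appear in some derivation of symbol. */
    private boolean reaches(String symbol, String target, Set<String> seen) {
        if (symbol.equals(target)) return true;
        List<String[]> options = rules.get(symbol);
        if (options == null || !seen.add(symbol)) return false;
        for (String[] production : options) {
            for (String s : production) {
                if (reaches(s, target, seen)) return true;
            }
        }
        return false;
    }

    /** Index of the first symbol in production that can lead to target, or -1. */
    private int firstReaching(String[] production, String target) {
        for (int i = 0; i < production.length; i++) {
            if (reaches(production[i], target, new HashSet<String>())) return i;
        }
        return -1;
    }

    private String expandSymbol(String symbol, String terminal, String target, boolean forced) {
        if (forced && symbol.equals(target)) return terminal; // the guaranteed substitution
        List<String[]> options = rules.get(symbol);
        if (options == null) return symbol; // a terminal is emitted unchanged
        List<String[]> viable = options;
        if (forced) { // keep only alternatives that can still lead to the target
            viable = new ArrayList<>();
            for (String[] p : options) {
                if (firstReaching(p, target) >= 0) viable.add(p);
            }
        }
        String[] chosen = viable.get(random.nextInt(viable.size()));
        int forcedIndex = forced ? firstReaching(chosen, target) : -1;
        return expandAll(chosen, forcedIndex, terminal, target);
    }

    private String expandAll(String[] symbols, int forcedIndex, String terminal, String target) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < symbols.length; i++) {
            if (out.length() > 0) out.append(' ');
            out.append(expandSymbol(symbols[i], terminal, target, i == forcedIndex));
        }
        return out.toString();
    }
}
```

With the rules of the example grammar loaded through `addRule`, a call such as `expandWith("child", "n")` always yields a sentence containing "child", while `expandFrom("the n shoots")` expands only the embedded non-terminal. Deeply recursive grammars would need a depth limit, which this sketch omits.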

To further support these dynamic generation modes, callbacks, from the grammar into user code, are supported. To generate such a callback, users include method calls within their grammar, surrounded (by default) with back-ticks. Three examples of this type of callback are presented in the rule below.

[Rule listing with three callback examples not preserved.]

The first line provides a simple way of embedding external data sources within a grammar, while the second and third lines provide functionality that is dynamically applicable to the generation context (all three represent functionality not available in strictly context-free grammars). Any number of arguments may be passed in a callback, but for each call, there must be a corresponding method (with the same number and type of arguments) in the user's program, e.g.,

...
  return myRiPluralizer.pluralize(noun);
}

An additional tool provided with RiTa, and facilitated by RiGrammar's dynamic grammar rules (as described above), is an interactive application that allows users to experiment with one or more grammars in real time. One panel of the RiGrammarView application provides the current grammar file definitions in an editable window. The other panel allows users to generate from that grammar and store the results. Thus the typical process of experimentation is greatly simplified. Rather than stopping the program, opening the grammar file in a text editor, making changes, saving, and then re-opening the program, the user does all of this within the live environment provided by the tool.
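Returning to the callback mechanism described earlier (method calls embedded in a grammar between back-ticks, each matched by a method in the user's program), one plausible way to resolve such calls in Java is reflection. The sketch below is an assumption, not RiTa's implementation: `CallbackSketch`, `resolve`, and `Handlers` are hypothetical names, every argument is passed as a String, and the naive `pluralize` merely stands in for a real pluralizer.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: resolves back-ticked calls against methods of a handler object. */
final class CallbackSketch {
    // Matches a back-ticked call such as `pluralize(dog)`.
    private static final Pattern CALL = Pattern.compile("`(\\w+)\\(([^)`]*)\\)`");

    /** Replaces each back-ticked call in text with the result of invoking it on handler. */
    static String resolve(String text, Object handler) {
        Matcher m = CALL.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String argText = m.group(2).trim();
            String[] args = argText.isEmpty() ? new String[0] : argText.split("\\s*,\\s*");
            // Assumption of this sketch: every parameter is a String.
            Class<?>[] types = new Class<?>[args.length];
            Arrays.fill(types, String.class);
            try {
                Method method = handler.getClass().getMethod(m.group(1), types);
                Object result = method.invoke(handler, (Object[]) args);
                m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(result)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No usable callback for " + m.group(), e);
            }
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Hypothetical user code: one public method per callback name used in the grammar. */
    public static final class Handlers {
        public String pluralize(String noun) {
            return noun + "s"; // naive stand-in for a real pluralizer
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("two `pluralize(dog)` bark", new Handlers()));
    }
}
```

Looking methods up by name and parameter types is what makes the requirement in the text necessary: a call with no matching method of the same arity and types fails at lookup time, here surfaced as an `IllegalStateException`.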

2.4 Documentation

RiTa is accompanied by extensive documentation that explains the toolkit and describes how to use and extend it. This documentation is divided into four primary [...]

• Examples are provided for each of the core RiTa objects, clearly illustrating basic uses. Examples are posted as both downloadable and web-executable programs on the RiTa web-site and accompanied by links to carefully commented source code that explains the purpose of each line. Example assignments are also provided to assist [...]

• Reference Documentation provides precise definitions for each interface, class, method, function, and variable in the toolkit. It is automatically generated in two formats for each version via custom comments embedded in the source code: a simplified single-page “procedurally-oriented” HTML reference[32] and a standard [...]

• Tutorials teach students to use the toolkit incrementally by focusing on a single task, e.g., tagging, generation, or classification. The tutorials include a high-level discussion that explains and motivates the domain, followed by a code-level walkthrough showing how RiTa would be used to perform the task in question.

• The Project Gallery provides students with a wide range of existing projects (implemented in RiTa) by other students and artists, all with linked source code. Students can access this archive either for inspiration on projects or for assistance in addressing particular issues. Most of the projects also include direct contact links for the authors that can be used if more specific questions arise. These projects also demonstrate proper documentation strategies, a particularly important element for those working with rapidly evolving technologies. Finally, students may, with instructor approval, add their own projects to the gallery, a goal that inspired some students and helped others to feel part of a community of practicing artists.

[32] See http://www.rednoise.org/rita/documentation/docs.htm.
[33] See http://www.rednoise.org/rita/javadocs/.

2.5 Using RiTa

As consistency was one of our primary design criteria, all RiTa objects follow the same basic pattern for object instantiation, as follows:

[Listing not preserved.]

This syntax employs the conventional, if less than perfect, “new” operator[34] and represents the most generic mode of object creation in Java. Although restricting object creation to “new” significantly complicates the coding of some objects (see the section on the RiTaServer below), it was judged to be more intuitive for new users and does not require an understanding of the class/object distinction, nor of static methods (e.g., main() or createX()).

[34] See Jonathan Amsterdam’s article on the topic at http://www.ddj.com/java/184405016;jsessionid=5GYK2EDVKVLMUQSNDLPSKHSCJUNN2JVN?_requestid=205339.

When used in Processing, object creation is performed like this:

RiObject objName = new RiObject(this);

To clarify this syntax, Processing “sketches” are, by default, Java applets that subclass the processing.core.PApplet class (which in turn subclasses java.applet.Applet), so the “this” keyword passed to the constructor represents an instance of PApplet and provides the RiTa object with a reference back to the core Processing methods implemented within. This is the recommended syntax for Processing libraries and has been almost universally adopted by library developers. This back-reference to the PApplet is necessary in only a few cases within RiTa, the RiText object being a primary example, for which Processing is required, at least for now, to perform the supported 2D and 3D drawing functions. As all other core classes can be used with or without Processing, either of the syntaxes above is acceptable.
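The back-reference pattern described above can be shown without Processing itself. In the sketch below every name is a hypothetical stand-in: `Host` plays the role of processing.core.PApplet, `TextObject` the role of a library object such as RiText, and `MySketch` the role of a user's sketch. None of these reflect the actual RiTa or Processing signatures.

```java
/** Illustrative only: a library object that draws through a reference to its host. */
final class BackReferenceSketch {

    /** Stand-in for the host applet; records draw calls instead of rendering them. */
    static class Host {
        final StringBuilder canvas = new StringBuilder();

        void text(String s, int x, int y) {
            canvas.append(s).append('@').append(x).append(',').append(y);
        }
    }

    /** Stand-in for a library object that needs the host's drawing methods. */
    static final class TextObject {
        private final Host host;
        private final String content;

        TextObject(Host host, String content) {
            this.host = host;
            this.content = content;
        }

        void draw(int x, int y) {
            host.text(content, x, y); // delegates to the host via the back-reference
        }
    }

    /** A "sketch" subclasses the host and passes itself to the library object. */
    static final class MySketch extends Host {
        final TextObject label = new TextObject(this, "hello");
    }

    public static void main(String[] args) {
        MySketch sketch = new MySketch();
        sketch.label.draw(10, 20);
        System.out.println(sketch.canvas); // prints hello@10,20
    }
}
```

Objects that never draw have no need for the `Host` argument, which matches the observation in the text that the back-reference is required only for a few classes.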

Figure 6: A simple RiTa sketch in Processing.

In practice, however, RiText is generally the first object students encounter, as they often wish to display immediately some piece of text on which they are working. In fact, only one additional line of code is required to add visible text to an existing sketch. To illustrate, a very basic RiTa sketch is shown in Figure 6.

Figure 6 also illustrates the simplicity of the basic Processing environment. Creating this “sketch” (to use Processing terminology) required the user a) to select RiTa from the “tools” menu to generate the import line at top, and b) to type or paste in the single line shown that creates a single RiText object. To run the program, the user presses the yellow “play” button. For new or inexperienced students, this provides a significant usability gain over the complexity of a similar first program in raw Java (see Figure 7).

import java.awt.Graphics;
import java.applet.Applet;

[...]
  }
}

Figure 7: A minimal Java applet.

Notice the number of concepts present in this program, from classes, to functions with and without parameters, to inheritance via the “extends” keyword. Each of these concepts must either be learned by a new user, or ignored (as is unfortunately more often the case). Further, this program still requires several steps before any output can be seen, beginning with the separate compilation step that is required in Java (which generally requires knowledge of where the “javac” compiler is located, as well as the correct path for the Java runtime classes to be set in the CLASSPATH variable). Running the program is also not trivial, as Sun's “AppletViewer” tool requires an additional HTML document to be created and saved before one can even test the code. An example version of such a file is shown in Figure 8.
