
«Creativity Support for Computational Literature By Daniel C. Howe A dissertation submitted in partial fulfillment of the requirements for the degree ...»


RiText The basic object for displaying strings of text on screen in 2D and 3D. RiText contains a variety of utility methods for manipulating typography, fonts, and images, controlling animations, and handling audio and simple text-to-speech playback. In early versions of RiTa, each RiText needed to be explicitly drawn to the screen by the user. In a simulation-based environment like Processing, however, where the draw() loop is called automatically every frame, the cognitive load for new users is decreased and code reads more clearly when objects “draw themselves”. Thus, in later versions all RiText objects that have not been explicitly hidden (via calls to riText.setVisible(false)) are automatically rendered on screen at each frame. This default behavior is not only consistent with the Processing paradigm; it also appears to be generally intuitive for users with backgrounds in Flash, Max/MSP, or Silverlight. Although some experienced Java users (and one QuickTime user) expressed initial confusion over this behavior, the confusion did not appear to persist beyond one explanation. For more complex situations, where complex layering or a specific ordering of affine transformations was required, users could disable this behavior by calling RiText.disableAutoDraw() and then drawing each object manually in the order required.
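The auto-draw behavior described above can be illustrated with a small registry pattern. This is a minimal sketch under stated assumptions, not RiTa's implementation: the class names StageText and TextStage and the methods create() and drawFrame() are hypothetical, and "drawing" is reduced to returning the strings that would be rendered in a frame.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not RiTa classes.
class StageText {
    final String text;
    boolean visible = true;

    StageText(String text) { this.text = text; }

    // A real implementation would render glyphs; here we just return the string.
    String draw() { return text; }
}

class TextStage {
    private final List<StageText> texts = new ArrayList<>();
    private boolean autoDraw = true;

    // Creating a text through the stage registers it for automatic drawing.
    StageText create(String text) {
        StageText t = new StageText(text);
        texts.add(t);
        return t;
    }

    void disableAutoDraw() { autoDraw = false; }

    // Called once per frame by the host loop; returns what was drawn, in creation order.
    List<String> drawFrame() {
        List<String> drawn = new ArrayList<>();
        if (!autoDraw) return drawn; // the caller now draws manually, in any order
        for (StageText t : texts) {
            if (t.visible) drawn.add(t.draw());
        }
        return drawn;
    }
}
```

With auto-draw enabled, only hidden objects are skipped; once it is disabled, the frame loop draws nothing and the caller decides the order of the draw() calls.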

RiSpeech This object provides basic cross-platform text-to-speech facilities with control over a range of parameters including voice selection, pitch, speed, rate, etc. RiSpeech is based on FreeTTS's [22] implementation of the Java Speech API (JSAPI) [23] specification. Due to a range of extensions to the FreeTTS components, RiSpeech works online in an applet/browser context without requiring Java Web Start [24] technology or even signed applets. Further, RiSpeech is tightly integrated with the RiTa lexicon for phonemic and syllabification data, generating this data in real time (via procedural rules) when it is not available via the lexicon.

When needed, multiple RiSpeech objects can be created, each with its own parameters, for concurrent speech. The RiSpeech object is also compatible with the MBROLA [25] voice set with installation of a platform-specific binary. These (optional) voices provide higher quality but sacrifice cross-platform compatibility due to their use of native libraries. Additionally, on the Mac platform RiSpeech provides programmatic access to Apple's built-in high-quality speech synthesis library with over 20 voices.

[22] See http://freetts.sourceforge.net/.

[23] See http://java.sun.com/products/java-media/speech/.

[24] See http://java.sun.com/javase/technologies/desktop/javawebstart/.

RiLexicon A user-customizable lexicon equipped with implementations of a variety of matching algorithms (min-edit-distance, soundex, anagrams, alliteration, rhymes, looks-like, etc.) based on combinations of letters, syllables, stresses, and phonemes. For each word entry, RiLexicon provides syllabification, stress, and pronunciation information (for TTS) following the conventions of the CMU Pronouncing Dictionary. Additionally, a set of part-of-speech tags is provided for use in the (default) transformation-based POS tagger.
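As an illustration of one of the matching algorithms named above, the following sketch computes minimum edit (Levenshtein) distance and uses it to find near matches in a word list. It is a generic textbook formulation, not RiLexicon's code; the class name EditDistance, the method names, and the example word list are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class EditDistance {
    // Dynamic-programming Levenshtein distance, keeping only two rows of the table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j; // cost of inserting j characters
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // cost of deleting i characters
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + substitution);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the rows
        }
        return prev[b.length()];
    }

    // Returns the lexicon words within maxDist edits of the query, in lexicon order.
    static List<String> similar(String query, List<String> lexicon, int maxDist) {
        List<String> result = new ArrayList<>();
        for (String word : lexicon) {
            if (distance(query, word) <= maxDist) result.add(word);
        }
        return result;
    }
}
```

A linear scan like similar() is adequate for a sketch; a lexicon of tens of thousands of entries would likely call for length-based pruning or an index.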

Users can also modify or customize the lexicon (e.g., add words or change pronunciations) by editing the plain-text “rita_addenda.txt” file, an example of which comes as part of the core RiTa download. An example is presented in Figure 4.

Figure 4: An example rita-addenda file.

[25] See http://tcts.fpms.ac.be/synthesis/mbrola.html.

RiTokenizer A simple tokenizer for word and sentence boundaries, with regular-expression support for custom tokenizing. As with all RiTa tools, RiTokenizer uses the Penn Treebank conventions and rules as its default behavior.

RiTextBehavior An extensible set of text-behaviors including a variety of interpolation algorithms for moves, fades, color-changes, scaling, rotating, and morphing text. By implementing the RiTextBehavior interface, students can add their own simple text behaviors whose lifecycles (creation, frame-by-frame updates, clean-up) are managed transparently by the library.

RiStemmer A simple stemmer for extracting base roots from words by removing prefixes and suffixes. For example, the words “run”, “runs”, “ran”, and “running” all have “run” as the root. Based on Martin Porter's stemming algorithm [van Rijsbergen et al. 1980].

RiPluralizer A simple rule-based pluralizer for nouns. When passed a stemmed noun (see RiStemmer), it returns the plural form. It uses a combination of letter-based pluralization rules and a lookup table of exceptions for irregular nouns, e.g., appendix → appendices.
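The combination of an exception table and letter-based rules can be sketched as follows. This is a minimal illustration, assuming a handful of common English rules; the class name SimplePluralizer, the entries in the exception table (other than appendix → appendices, which comes from the text), and the specific rules are assumptions, not RiPluralizer's actual rule set.

```java
import java.util.Map;

class SimplePluralizer {
    // Hypothetical exception table; a practical one would be far larger.
    private static final Map<String, String> IRREGULAR = Map.of(
        "appendix", "appendices",
        "child", "children",
        "mouse", "mice",
        "sheep", "sheep");

    static String pluralize(String noun) {
        // 1. Irregular nouns are resolved by table lookup before any rule fires.
        String irregular = IRREGULAR.get(noun);
        if (irregular != null) return irregular;
        // 2. Sibilant endings take "es" (box -> boxes, church -> churches).
        if (noun.matches(".*(s|x|z|ch|sh)$")) return noun + "es";
        // 3. Consonant + "y" becomes "ies" (city -> cities), but vowel + "y" does not (day -> days).
        if (noun.matches(".*[^aeiou]y$")) return noun.substring(0, noun.length() - 1) + "ies";
        // 4. Default rule.
        return noun + "s";
    }
}
```

Checking the table first keeps the rules simple: each rule only has to be right for regular nouns, and every known exception is handled by a single lookup.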

RiSearcher A utility object for obtaining real-time unigram, bigram, and weighted-bigram counts for words and phrases via online search engines, e.g., Google (described in further detail below).

RiWordNet Installed as a separate component or used in conjunction with the rest of the toolkit, RiWordNet provides straightforward access to the WordNet ontology, supporting all the common WordNet relation types, including synonyms, antonyms, hypernyms, hyponyms, holonyms, meronyms, coordinates, similars, nominalizations, verb-groups, derived-terms, glosses, “see-alsos”, examples, and descriptions, as well as distance metrics between terms in the ontology. RiWordNet supports WordNet version 2.x-3.0x across all platforms and, at the time of this writing, is the only public library that provides direct access to WordNet via browser-based web applets, with no need for special downloads, memory-configuration, “Java Web Start”, or applet-signing.

Additionally, RiWordNet provides each term in the ontology with a unique ID for the combination of sense and part-of-speech, facilitating direct reference to the sense in question.

All methods take simple Strings or ints and return String arrays, greatly simplifying the complex pointer hierarchy found in the original implementation. In most cases, three methods are provided for each relation type (e.g., for hyponyms: getHyponyms(int uniqueId), getHyponyms(String word, String pos), and getAllHyponyms(String word, String pos)), where the first returns hyponyms for a specific sense (as specified by its unique ID), the second returns hyponyms for the most common sense of the specified part-of-speech, and the third returns hyponyms for all senses of the word/part-of-speech pair.

Additionally, several literary-specific extensions are provided. These include part-of-speech and random iterators [26], which allow users to query for random words, glosses, and descriptions that match a specific part-of-speech, facilitating simple substitutions on existing phrases. Similarly, each call to getX(), where X is one of the relations listed above, returns an array of Strings in random order (this default can be disabled for unit testing), thus enabling users to continually explore the parameter space of a generative piece during the development cycle. Lastly, RiWordNet adds a range of specific literary relations to the standard set, including “soundex”, “letterex”, anagrams, and others.

[26] See non-deterministic iteration in Chapter 3: Pedagogy.

RiTextField A simple text field widget to handle user keyboard input. When user input is completed, a RiTaEvent callback is triggered as described in the ‘Events and Dynamic Callbacks’ section below.

RiSample Provides intuitive library-agnostic audio support, handling playback of wav, aiff, and mp3 samples and server-based streaming of compressed mp3s.

RiPosTagger Provides a standard interface for implementations of part-of-speech taggers. The current version of RiTa includes two such implementations, both using the Penn conventions (see Table 3 below): a faster and lighter-weight transformation-based tagger based on an optimized version of the Brill algorithm [Brill 1992], and the generally more accurate maximum-entropy tagger based on the OpenNLP [27] package. The transformation-based tagger is closely tied to the lexicon provided with RiTa, which contains the set of part-of-speech tags for each word entry. This allows lookups to run in constant time, after which a set of context-specific rules is applied to select the appropriate set element for the specific context in which the word appears. Words not found in the lexicon default to the most likely part-of-speech, a singular noun, and are then run through a similar set of transformational rules which, based on spelling, phonemic data, and context (the surrounding words), create a “best guess” for the part of speech, again in constant time once the lexicon has been loaded. Table 3 contains the full set of tags (following the Penn conventions) returned by the RiPosTagger (regardless of …).

[27] See http://opennlp.sourceforge.net/.


Table 3: Alphabetical list of part-of-speech tags used in the Penn Treebank project.
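The two-pass idea behind a transformation-based tagger (a lexicon lookup followed by contextual rewrite rules) can be sketched in a few lines. This is a toy illustration, not RiTa's tagger: the class name TinyTagger, the five-word lexicon, the single contextual rule, and the spelling heuristics are all assumptions chosen for the example, and the tags are written in the usual upper-case Penn style.

```java
import java.util.List;
import java.util.Map;

class TinyTagger {
    // Hypothetical mini-lexicon: each word maps to its possible tags, most likely first.
    private static final Map<String, List<String>> LEXICON = Map.of(
        "the", List.of("DT"),
        "to", List.of("TO"),
        "dog", List.of("NN"),
        "wants", List.of("VBZ"),
        "run", List.of("NN", "VB"));

    static String[] tag(String[] words) {
        String[] tags = new String[words.length];
        // Pass 1: hash lookup of the most likely tag; unknown words get a spelling-based guess.
        for (int i = 0; i < words.length; i++) {
            List<String> options = LEXICON.get(words[i].toLowerCase());
            tags[i] = (options != null) ? options.get(0) : guessFromSpelling(words[i]);
        }
        // Pass 2: one contextual transformation: NN becomes VB after TO, if the lexicon permits it.
        for (int i = 1; i < words.length; i++) {
            if (tags[i].equals("NN") && tags[i - 1].equals("TO") && allows(words[i], "VB")) {
                tags[i] = "VB";
            }
        }
        return tags;
    }

    // Known words may only take tags listed in the lexicon; unknown words may take any tag.
    private static boolean allows(String word, String tag) {
        List<String> options = LEXICON.get(word.toLowerCase());
        return options == null || options.contains(tag);
    }

    // Unknown words default to singular noun unless the spelling suggests otherwise.
    static String guessFromSpelling(String word) {
        if (word.endsWith("ing")) return "VBG";
        if (word.endsWith("s")) return "NNS";
        return "NN";
    }
}
```

Both passes do a fixed amount of work per word, which is the property the text refers to as constant-time tagging once the lexicon is loaded.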

RiHtmlParser Provides various utility functions for fetching and parsing text data from web pages using either the Document-Object-Model (DOM) or regular expressions. Also provides a base implementation so that subclasses can override the handleText(), handleSimpleTag(), handleStartTag(), and handleEndTag() methods to define custom behavior (as in RiSearcher).

Examples of basic functionality include the fetching of HTML pages as plain text, with or without the HTML tags stripped, and the ability to define custom parsing behavior, as in the fetchLinks() and fetchLinkText() methods which respectively fetch all the anchor links on a page and all the linked text (contained within an anchor) on a page.
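The regular-expression route to link extraction can be sketched as below. This is an illustration under stated assumptions rather than the RiHtmlParser code: the class name LinkExtractor and its method names are hypothetical, the input is an HTML string rather than a fetched page, and the pattern only handles quoted href attributes, so malformed markup would need a proper parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkExtractor {
    // Matches <a ... href="URL" ...>INNER</a>; group 1 is the URL, group 2 the inner HTML.
    private static final Pattern ANCHOR = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"'][^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the href value of every anchor, in document order.
    static List<String> links(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) result.add(m.group(1));
        return result;
    }

    // Returns the visible text of every anchor, with nested tags stripped.
    static List<String> linkText(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) result.add(m.group(2).replaceAll("<[^>]+>", "").trim());
        return result;
    }
}
```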

RiTravesty Represents a Markov chain (or n-gram) model that treats each character as a separate token (as in Kenner and O'Rourke's original Travesty program [28]). Provides a range of methods to query the model for probabilities and/or phrase completions. As RiTravesty's functionality overlapped, to a large extent, with that of the RiMarkov object (also n-grams, but at the word level), this component was used largely as a teaching tool. One of the students' assignments was to implement the Travesty interface on their own, to demonstrate their conceptual grasp of n-grams without needing to deal with the complexities and edge cases found in word-based models.
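A character-level n-gram model of this kind fits in a few dozen lines, which is part of what makes it a workable student exercise. The sketch below is one possible formulation, not the RiTravesty interface: the class name CharMarkov, its constructor, and the probability() and generate() methods are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class CharMarkov {
    private final int order;
    // Maps each context of `order` characters to every character observed after it.
    private final Map<String, List<Character>> followers = new HashMap<>();

    CharMarkov(int order, String text) {
        this.order = order;
        for (int i = 0; i + order < text.length(); i++) {
            String context = text.substring(i, i + order);
            followers.computeIfAbsent(context, k -> new ArrayList<>()).add(text.charAt(i + order));
        }
    }

    // Relative frequency of `next` after `context` in the training text (0 if unseen).
    double probability(String context, char next) {
        List<Character> seen = followers.get(context);
        if (seen == null) return 0.0;
        long hits = seen.stream().filter(c -> c == next).count();
        return (double) hits / seen.size();
    }

    // Extends the seed one character at a time until `length` is reached or no follower exists.
    String generate(String seed, int length, Random rng) {
        if (seed.length() < order) throw new IllegalArgumentException("seed shorter than order");
        StringBuilder out = new StringBuilder(seed);
        while (out.length() < length) {
            List<Character> seen = followers.get(out.substring(out.length() - order));
            if (seen == null) break; // dead end: this context never occurred in the text
            out.append(seen.get(rng.nextInt(seen.size())));
        }
        return out.toString();
    }
}
```

Storing repeated followers in a list means a uniform random pick from the list already samples in proportion to the observed counts.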

RiAnalyzer The RiAnalyzer object allows users to easily access micro-level features (e.g., syllables, part-of-speech, phonemes, and stress) for arbitrary strings of text. Analysis involves a combination of lookup and algorithmic rules. First, RiAnalyzer performs a lookup in RiTa's custom lexicon (~35,000 words) for this data. If the data is not found, it is created at runtime via procedural rules, after which it is cached. Part-of-speech data is also provided (see RiPosTagger). Once a text has been analyzed, it is “annotated” with features for each of the elements mentioned (see RiString below for a full description), which can then be used in the generation of larger features. This range of micro-level features provides users with the necessary infrastructure with which to address a range of specifically literary techniques, from rhyme and alliteration to line-break and enjambment, puns and visible plays-on-words, rhythm and musicality.

[28] Originally published in Kenner, H. and O'Rourke, J. 1984. BYTE Magazine. Volume 9, Issue 12 (November, 1984): New chips.

RiString The RiString object augments the basic Java String with support for “features”, key-value pairs that contain annotations about the String in question. Features, implemented via lazily instantiated String-to-String Hashtables, allow an arbitrary number of annotations to be attached to a given String, with a default set provided by the system itself. Support for new user-defined “features” is enabled (and encouraged) by the design.

RiString additionally provides implementations for all the usual Java String methods (in fact the entire CharSequence interface), allowing methods to be invoked on a RiString object just as they would be on a String. RiString.split(), for example, splits a RiString object into some number of new RiStrings (just as String.split() does), one for each word, maintaining the features that remain relevant (e.g., part-of-speech, stress, phonemes, syllables, etc.) and deleting those that do not (e.g., “chunk-type”, as the individual words are no longer part of a “noun-phrase”).

This allows RiString objects to be passed along a pipeline (similar to Unix command-line tools) and lookups to be performed in lazy fashion, if and as needed. For example, if a core object (RiGrammar, for example) needs to know the part-of-speech for a RiString from which it is generating a phrase, it might ask whether the "pos" feature is present via a call to RiString.hasFeature("pos"). If not, it might invoke the RiPosTagger object to first add the part-of-speech feature before proceeding with generation. Objects in different states of analysis can thus be exchanged, with features added only when needed, enabling, through a sort of ad-hoc introspection mechanism, a weak version of polymorphism (the lack of which is one critique of such a non-hierarchical approach). In fact, it was this feature that later enabled the use of RiTa processing objects in a real-time drag/drop environment for language processing (see the RiTa visual interface in the Future Work section of chapter 6).
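The feature mechanism and the check-then-annotate pattern described above can be sketched as follows. This is a simplified illustration, not the RiString class: FeaturedString and LazyPipeline are hypothetical names, a HashMap stands in for the Hashtable mentioned in the text, and the "tagger" is a placeholder that labels everything as a noun. Only the hasFeature("pos") call mirrors a method named in the source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical string wrapper carrying key-value annotations ("features").
class FeaturedString {
    private final String text;
    private Map<String, String> features; // stays null until the first feature is added

    FeaturedString(String text) { this.text = text; }

    String text() { return text; }

    boolean hasFeature(String name) {
        return features != null && features.containsKey(name);
    }

    String getFeature(String name) {
        return features == null ? null : features.get(name);
    }

    void addFeature(String name, String value) {
        if (features == null) features = new HashMap<>(); // lazy instantiation
        features.put(name, value);
    }
}

class LazyPipeline {
    // Placeholder for a real tagger: annotates every string as a noun.
    static void tag(FeaturedString s) { s.addFeature("pos", "nn"); }

    // A consumer asks for the feature and triggers analysis only if it is missing.
    static String posOf(FeaturedString s) {
        if (!s.hasFeature("pos")) tag(s);
        return s.getFeature("pos");
    }
}
```

Because consumers only test for the presence of a feature, a string that was annotated earlier in the pipeline is passed through untouched, and one that was not is analyzed on demand.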

RiKWICker RiKWICker provides an efficient implementation of a KWIC-model generator. KWIC is an acronym for “Key-Words-in-Context”, a common format for sentence-based concordances [29]. A KWIC model may also be referred to as a permuted index, referring to the fact that the model contains all cyclic permutations of each sentence in a text. The RiKWICker implements this model by sorting and aligning all the words within a text so that each word provides a hash key for a list of all sentences that contain that word. Thus, one can retrieve the sentences containing a word in constant time, O(1). The time to create the model is O(n), where n represents the number of words in the text. Since each word must be viewed at least once when creating the model, this performance is (asymptotically) optimal.

Additionally, RiKWICker provides options to ignore stop-words and letter-case, each potentially decreasing the memory requirements of the model for longer texts.
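The hash-keyed model described above, including the stop-word and letter-case options, can be sketched with a map from words to sentence lists. This is a minimal illustration, not RiKWICker's implementation: the class name KwicIndex, its constructor parameters, and the simple split on non-word characters are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KwicIndex {
    private final Map<String, List<String>> index = new HashMap<>();
    private final boolean ignoreCase;

    // Builds the index in a single pass over the words of every sentence.
    KwicIndex(List<String> sentences, Set<String> stopWords, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (String sentence : sentences) {
            Set<String> seen = new HashSet<>(); // list a sentence once per word, even if repeated
            for (String token : sentence.split("\\W+")) {
                if (token.isEmpty()) continue;
                String key = normalize(token);
                if (stopWords.contains(key)) continue; // skipped words take no memory
                if (seen.add(key)) {
                    index.computeIfAbsent(key, k -> new ArrayList<>()).add(sentence);
                }
            }
        }
    }

    private String normalize(String word) {
        return ignoreCase ? word.toLowerCase() : word;
    }

    // Expected constant-time lookup of all sentences containing the word.
    List<String> sentencesContaining(String word) {
        return index.getOrDefault(normalize(word), Collections.emptyList());
    }
}
```

Ignoring case merges keys such as "The" and "the", and dropping stop-words removes the longest sentence lists, which is where the memory savings for longer texts would come from.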

[29] The term “Key Words In Context” (KWIC) was coined by Hans Peter Luhn, as described in Manning [1999].
