Creativity Support for Computational Literature. By Daniel C. Howe. A dissertation submitted in partial fulfillment of the requirements for the degree ...
The term language model originates in probabilistic models of language generation developed for automatic speech recognition systems in the early 1980s. Speech recognition systems use a language model to complement the results of the acoustic model, which models the relation between words (or parts of words, called phonemes) and the acoustic signal. The history of language models, however, goes back to the beginning of the 20th century, when Andrei Markov used language models (Markov models) to model letter sequences in works of Russian literature. Another famous application of language models is Claude Shannon's models of letter sequences and word sequences, which he used to illustrate the implications of coding and information theory. In the 1990s, language models were applied as a general tool in several natural language processing applications, such as part-of-speech tagging, machine translation, and optical character recognition.
Language models were applied to information retrieval by a number of research groups in the late 1990s [4, 7, 14, 15], and have since become quite popular in information retrieval research.
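The core mechanism these models share, estimating the probability of a word given its preceding context from observed counts, can be illustrated with a minimal sketch. The tiny corpus and the add-alpha smoothing choice below are illustrative assumptions for exposition, not drawn from any of the systems discussed above:

```python
from collections import Counter

def train_bigram_model(tokens):
    """Count unigrams and bigrams from a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, w1, w2, alpha=1.0):
    """P(w2 | w1) with add-alpha smoothing over the observed vocabulary."""
    v = len(unigrams)
    return (bigrams[(w1, w2)] + alpha) / (unigrams[w1] + alpha * v)

corpus = "the cat sat on the mat the cat ran".split()
uni, bi = train_bigram_model(corpus)

# "cat" follows "the" in 2 of 3 occurrences; smoothing gives (2+1)/(3+6)
print(bigram_prob(uni, bi, "the", "cat"))  # prints 0.3333333333333333
```

A speech recognizer or retrieval system would multiply such conditional probabilities along a word sequence to score candidate sentences; the smoothing term keeps unseen bigrams from zeroing out the whole product.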
common computers could handle word-level n-gram models for non-trivial inputs. Here we see how an implementation's efficiency (e.g., with memory, disk access, or processing power) and the need, or lack thereof, for certain kinds of correctness (e.g., in statistical distribution) can determine whether a technique is applicable in an artistic context. In fact, the modern availability of computing power has made practical, and demonstrated the power of, an entire area of research, namely "statistical" natural language processing, that was previously out of reach. Shannon's initial experiments, manually generating sentences that approximated those found in literature, were the first step in this trajectory.
Parenthetically, it is interesting to note that the first application of Markov models was also linguistic and literary, modeling letter sequences in Pushkin's poem “Eugene Onegin”, though this was presented from a mathematical, rather than communication-oriented, perspective [Markov 1913].
For almost three decades after the publication of Shannon's paper, the techniques he outlined for text generation were barely explored. While their application to analysis was somewhat further investigated, there seems to have been little interest in using them to generate text, and as Wardrip-Fruin states, they appear to have had no actual literary use until much later. In part this may have been due to the effort involved in building texts by hand, as Shannon did, combined with the fact that, even for severely restricted versions of the method, computers that could handle the amount of data required for accurate statistical models (at least when employing the most obvious approaches to the problem) were unavailable until the 1970s. How n-grams were later employed in actual literary practice is explored below in our discussion of Charles O. Hartman’s “Monologues of Body and Soul”.
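The generation procedure Shannon carried out by hand is straightforward to mechanize today. A minimal word-level sketch follows; the seed text, the bigram order, and the random walk are illustrative choices, not a reconstruction of Shannon's exact method:

```python
import random
from collections import defaultdict

def build_chain(tokens, n=2):
    """Map each (n-1)-word context to the words observed to follow it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        chain[context].append(tokens[i + n - 1])
    return chain

def generate(chain, start, length=10, seed=None):
    """Random walk over the chain, starting from a known context."""
    rng = random.Random(seed)
    out = list(start)
    for _ in range(length):
        followers = chain.get(tuple(out[-len(start):]))
        if not followers:  # dead end: no observed continuation
            break
        out.append(rng.choice(followers))
    return " ".join(out)

text = "the sun rose and the sun set and the moon rose".split()
chain = build_chain(text, n=2)
print(generate(chain, ("the",), length=8, seed=1))
```

Because continuations are sampled in proportion to their observed frequency, the output approximates the statistics of the source, which is precisely the "approximation to English" effect Shannon demonstrated by hand.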
4.3.3 Joseph Weizenbaum
The use of Markov models in the language experiments above gave researchers new tools for the analysis and generation of natural language, and laid the groundwork for the range of more sophisticated probabilistic methods to come. Joseph Weizenbaum approached natural language processing from a more elementary perspective, but with no less interesting results. Weizenbaum was a professor of computer science at MIT when, in 1966, he published a comparatively simple program called ELIZA, which used pattern-matching and transformation to simulate human conversation (programs like this are now generally called “chatterbots”, or “chatbots”).
Driven by a script named DOCTOR, it engaged a human user in conversation with a simulated psychotherapist. Weizenbaum modeled the program’s conversational style on Rogerian therapy, which uses open-ended questions to encourage patients to communicate more effectively with their therapists. The results were surprisingly engaging: Eliza/Doctor used simple rules to turn the user’s typewritten statements back to them in the form of open-ended questions and prompts to talk further. Weizenbaum was shocked that his program was taken seriously by many users, who would open their hearts to it. In Computer Power and Human Reason, Weizenbaum describes how quickly and deeply people became emotionally involved with the program, e.g., taking offence when he asked to view the transcripts, saying it was an invasion of their privacy, even asking him to leave the room while they were working with the program, a phenomenon Weizenbaum found quite disconcerting. An example of Weizenbaum's man-machine conversation was a chat between a simulated therapist and a patient. A segment follows:
At one point in such a conversation, when he tested the program in his office, Weizenbaum’s secretary asked him to leave, because the conversation was getting too personal [Wallace 2009].
(In addition to widespread literary use, n-grams have also been used extensively in musical composition. One example is the contemporary Austrian composer Karlheinz Essl, who reassembled a Bach violin sonata via n-grams, calling it “Bach sausage.”)
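ELIZA's pattern-matching-and-transformation mechanism is simple enough to sketch. The rules and pronoun reflections below are illustrative stand-ins, not Weizenbaum's actual DOCTOR script:

```python
import re

# Ordered (pattern, response-template) pairs; \1 reuses the captured text.
# These rules are invented for illustration, not taken from DOCTOR.
RULES = [
    (r"i need (.*)", r"Why do you need \1?"),
    (r"i am (.*)", r"How long have you been \1?"),
    (r"my (.*)", r"Tell me more about your \1."),
    (r"(.*)", r"Please go on."),
]

# Pronoun "reflection" applied to the captured fragment ("my" -> "your", etc.)
REFLECT = {"my": "your", "i": "you", "me": "you", "am": "are"}

def reflect(fragment):
    return " ".join(REFLECT.get(w, w) for w in fragment.split())

def respond(statement):
    """Return the first matching rule's template, with the capture reflected."""
    s = statement.lower().strip(".!?")
    for pattern, template in RULES:
        m = re.fullmatch(pattern, s)
        if m:
            return template.replace(r"\1", reflect(m.group(1)))
    return "Please go on."

print(respond("I need my mother's approval"))
# prints: Why do you need your mother's approval?
```

As the catch-all final rule suggests, the program never understands anything; it merely transforms surface strings, which is exactly the gap between appearance and comprehension that Weizenbaum later argued against so forcefully.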
Eliza/DOCTOR was considered by many to be a forerunner of the “thinking machines” trumpeted by the press and members of the AI community: computers able to simulate human cognitive processes. Interestingly, it was Weizenbaum himself who most strongly argued, in his book Computer Power and Human Reason, against this interpretation, explaining the limits of computers and arguing that any anthropomorphic view of the computer represents a reduction of the human being. David Gardner refers to Weizenbaum as “the brilliant MIT researcher who threw water on some of the wildest predictions about computers as ‘thinking machines’”. (When Weizenbaum informed his secretary that he, of course, had access to the logs of all the conversations, she reacted with outrage at this invasion of her privacy [Wallace 2009].)
The system had a tremendous impact on a number of subfields of computer science: natural language processing, artificial intelligence, interactive narrative, conversational agents, the relationship of computing and psychotherapy, the ethical use of computers, and computer gaming, to name just a few. Janet Murray identifies Eliza/Doctor as the “moment in the history of the computer that demonstrated its representational and narrative power with the same startling immediacy as the Lumières’ train did for the motion picture camera.” She calls Weizenbaum “the earliest, and still perhaps the premier, literary artist in the computer medium” [Murray 1997]. While not all writers would be prepared to recognize Eliza/Doctor as literature, most can accept (as Wardrip-Fruin argues) the idea that presenting a character through conversation with the audience, guided by previously-authored texts and rules rather than by recitation of unvarying text, has the potential to be literary.
4.3.4 Selmer Bringsjord
Perhaps less well known than the previous researchers, Selmer Bringsjord (1958–) is the current director of the Rensselaer Artificial Intelligence and Reasoning (RAIR) Laboratory and a professor of computer science, as well as of philosophy, logic, and cognitive science. Bringsjord is perhaps best known for his meta-level proofs of contentious issues in computer science, e.g., his modal argument using analog computation to show that P=NP [Bringsjord and Taylor 2005]. (See: http://www.informationweek.com/news/globalcio/showArticle.jhtml?articleID=206903443. Accessed 7/01/09.) Of particular interest here is his refutation of the Church-Turing thesis via what he refers to as “literary creativity”. While the specifics of the proof, and the arguments of his many critics, detailed in Artificial Intelligence and Literary Creativity (AILI), his book with David Ferrucci, are beyond the scope of this research, his reasoning for selecting this specific domain is relevant. He argues, citing research by a range of scholars, that “literary creativity” may represent the best measure of intelligence at our disposal. In fact, he argues for a literary alternative to the Turing test. He writes:
Though the Turing test is currently out of reach of the smartest of our machines, there may be a simpler way of deciding between the strong and weak forms of AI – one that highlights creativity…. The test I propose is simple: Can a machine tell a story?
He goes on to describe his test in more detail.
But what would the story game look like? In the story game, we would give both the computer and a master human storyteller a relatively simple sentence, say “Gregor woke to find his abdomen was as hard as a shell, and that where his right arm had been, there now wiggled a tentacle.” Both players must then fashion a story designed to be truly interesting; the more literary in nature (in terms of rich characterization, lack of predictability, and interesting language), the better. We could then have a human judge the stories so that, as in the Turing Test, when such a judge cannot tell which response is coming from the mechanical muse and which from the human, we say that the machine has won the game.
Since he suggests that the creation of a novel is so far beyond the capabilities of today’s AI techniques that its existence “simply can’t be conceived”, he restricts the test described above to 500 words. Then, in an attempt to answer the question, he sets out, with the help of Ferrucci and senior scientists at IBM’s Watson Research Center, to build a system, called “Brutus”, that generates short fiction within that constraint. Their efforts make up the majority of the chapters of AILI, by the end of which they have created Brutus.1, an instantiation of the more generic Brutus architecture, specializing in the literary theme of ‘Betrayal’ (for which he provides a definition in formal logic) and operating over a narrow range of character types, with an ontology that includes Professors, Students, Dissertations, Classes, etc. After 7 years of the 10-year project, he writes, “though I expect to make headway… first-rate story-telling will always be the sole province of human masters”.
Whether or not he has successfully refuted the Church-Turing thesis in the process, or even provided a more accurate version of the Turing Test via his ‘story-telling game’, his choice of the literary context for his experiments again demonstrates its unique characteristics and continued utility in computer science research.
4.4 Procedural Writing: Tools and Practice
Having presented (in section 4.2) an overview of educational tools for computer science students and writers wishing to explore computational methods, and (in section 4.3) a survey of research by computer scientists addressing literary language, we present here, in rough chronological order, a range of procedural writing experiments undertaken by practicing artists. Unlike Strachey and Shannon, for instance, who did not take creative writing as their starting point, the work that follows is integrally tied to literary production.
Theo Lutz, for example, used stochastic methods to generate poetry, while Brion Gysin leveraged combinatory techniques to create new works, and Nanni Balestrini constructed poems by procedurally ‘mashing up’ a number of different texts. In addition to these examples of procedural techniques, we also mention a number of individual works (e.g. House of Dust), exhibitions (e.g. Cybernetic Serendipity), and programs (e.g. Auto-Beatnik) which further explore computational writing in the context of artistic practice. The section concludes with an investigation of the Dada and Oulipo movements which, though not always utilizing computational tools, were committed to procedural methods that link them closely to the other work discussed.
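As a rough illustration of the combinatory idea, and emphatically a simplified sketch rather than a reconstruction of Gysin's or Balestrini's actual procedures, one can slice several source texts into short phrases and splice randomly chosen phrases into new lines (the source strings below are invented placeholders):

```python
import random

def cut_up(texts, n_phrases=6, phrase_len=3, seed=None):
    """Slice each source text into fixed-length word runs ('phrases'),
    then splice randomly chosen phrases into a single new line."""
    rng = random.Random(seed)
    phrases = []
    for text in texts:
        words = text.split()
        for i in range(0, len(words) - phrase_len + 1, phrase_len):
            phrases.append(" ".join(words[i:i + phrase_len]))
    return " / ".join(rng.choice(phrases) for _ in range(n_phrases))

sources = [
    "in the beginning was the word and the word was with",
    "the sun rose slowly over the silent grey harbor water",
]
print(cut_up(sources, seed=7))
```

The point of such procedures, for the writers discussed below, was less the algorithm itself than the surrender of local word choice to an external process, with the author's control shifted to selecting sources and rules.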
Since long before the invention of computers, artists and creative writers have experimented with the use of procedural techniques in their art practice.
As Florian Cramer  puts it:
Executable code existed centuries before the invention of the computer, in magic, Kabbalah, musical composition and experimental poetry. These practices are often neglected as a historical pretext of contemporary software culture and electronic arts. Above all, they link computations to a vast speculative imagination that encompasses art, language, technology, philosophy and religion.
As Cramer notes, it is unlikely a coincidence that the Gospel of John was one of the first texts manipulated in the early computational poetry experiments of Brion Gysin and William S. Burroughs, discussed below [Funkhouser 2007].
Due to the vast scope of this work, however, we are only able to touch on a handful of examples in the categories below. For a more detailed history of the topic, we recommend the full-length monographs “Prehistoric Digital Poetry: An Archaeology of Forms” by C. Funkhouser and “Words Made Flesh: Code, Culture, Imagination” by F. Cramer, both of which present a wealth of in-depth information on the topic.
As one might expect, as soon as computer technologies became accessible to artists working with procedural methods, they were put to immediate use, sometimes in quite surprising and productive ways. This section presents a brief history of artistic experiments by those working in, or at the borders of, the “literary”, with particular focus on those whose work led, directly or indirectly, to techniques employed in the RiTa toolkit.