
THE UNIVERSITY OF CHICAGO

GRAMMATICAL METHODS IN COMPUTER VISION

A DISSERTATION SUBMITTED TO

THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES

IN CANDIDACY FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

DEPARTMENT OF COMPUTER SCIENCE

BY

ERIC PURDY

CHICAGO, ILLINOIS

MARCH 2013

Copyright © 2013 by Eric Purdy

All Rights Reserved

To my beloved wife

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1 INTRODUCTION
  1.1 Examples of Grammatical Approaches
    1.1.1 Curve Grammars
    1.1.2 Visual Chalkboard
    1.1.3 Visual Search Engine
    1.1.4 Other Visual Grammars
  1.2 Grammatical Vision is Important
    1.2.1 Soft Decisions
    1.2.2 Modeling Clutter with Object Sub-parts
    1.2.3 Whole Scene Parsing
  1.3 Hierarchical Decomposition and Rich Description
    1.3.1 Training on rich annotations
    1.3.2 Some Problems with Rich Description, and Solutions
    1.3.3 Rich Description and XML
  1.4 Grammars and Statistical Models
    1.4.1 Statistical Modules
    1.4.2 Independence and the Poverty of Stimulus
  1.5 Grammatical Vision is Difficult

LIST OF FIGURES

3.1 Experimenting with the Watson distribution, part 1. In the first row, the original triangle T on the left, and the mode of the estimated Watson distribution on the right. Subsequent rows are samples from Watson(T, 30.00). The estimated concentration was 63.48.

–  –  –

4.1 A parse exhibiting unintended reuse.

4.2 Motivation for local constraints. Grey regions represent areas where there is evidence for edges.

4.3 The angle between q − p and D(q) should be close to 90 degrees for interior p and close to 270 degrees for exterior p.

4.4 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.5 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.6 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.7 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.8 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.9 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.10 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.11 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.12 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

4.13 Output of detection algorithm. Final detection shown top left, segmentation shown top right, orientation assignments shown in four bottom images. The top left of the orientation images gives a key to the orientation labels.

–  –  –

6.1 With a θ-flexible decomposition family, any parse can be adjusted to an allowable parse by moving the midpoint of each binary rule slightly. On the left, a parse before adjustment. On the right, after the adjustment. Vertical lines denote allowable midpoint choices.

6.2 Illustrating the construction from Theorem 6.4.1, with k = 4. Rectangles denote portions of the string between members of the index set. A selection of intervals that live at each level are shown.

ABSTRACT

In computer vision, grammatical models are models that represent objects hierarchically as compositions of sub-objects. This allows us to specify rich object models in a standard Bayesian probabilistic framework. In this thesis, we formulate shape grammars, a probabilistic model of curve formation that allows for both continuous variation and structural variation. We derive an EM-based training algorithm for shape grammars. We demonstrate the effectiveness of shape grammars for modeling human silhouettes, and also demonstrate their effectiveness in classifying curves by shape.

We also give a general method for heuristically speeding up a large class of dynamic programming algorithms. We provide a general framework for discussing coarse-to-fine search strategies, and provide proofs of correctness. Our method can also be used with inadmissible heuristics.

Finally, we give an algorithm for doing approximate context-free parsing of long strings in linear time. We define a notion of approximate parsing in terms of restricted families of decompositions, and construct small families which can approximate arbitrary parses.

1 INTRODUCTION

We want to study grammatical methods in vision (also called compositional methods). Much past work has been done on this, including Amit and Trouvé [2007], Bienenstock et al. [1997], Fu [1986], Geman et al., Grenander and Miller [2007], Han and Zhu [2009], Jin and Geman [2006], Potter [1999], Tu et al. [2005], Zhu et al. [2009], Zhu and Mumford [2006], Felzenszwalb and McAllester [2010], but many fundamental questions remain unanswered.

Grammatical methods are characterized by the following:

• Decomposing images hierarchically, often in a semantically meaningful way. This is often referred to as parsing the image. Ideally, we would like an explanation for an entire scene. For example, in Figure 1.1, each image pixel belongs to at least one object (such as a car), and some objects are sub-parts of other objects (a wheel is part of a car).

• Part-based object models whose parts are other object models. For example, a model of a car would contain a model for a wheel, which could also be used to model wheels on their own.

• Models that contain reusable parts. For example, a model of a face could use a single model for both eyes. The reusable parts could also use themselves recursively, as is seen in fractal shapes.

• Modeling some object classes as mixtures of sub-class models. This is important because some conceptually meaningful classes such as “chair” contain wildly disparate elements (such as easy chairs and lawn chairs) that cannot be captured by a single model.

• Models that exhibit large amounts of structural variation. For example, trees of a single species will have wildly different shapes, which share some common properties. In particular, this includes object models that exhibit choice between different models of various object parts. These choices may be nested; for example, the number of branches in a tree is not fixed in advance, and neither is the number of sub-branches of each branch.
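The nested structural choices described in the last item can be illustrated with a very small stochastic grammar. The following is a minimal sketch, not a model from this thesis: the function names, the splitting probability `p_split`, and the bounds `max_children` and `max_depth` are all hypothetical choices made for illustration. A single reusable "branch" symbol either stops (becomes a leaf) or expands into a random number of sub-branches, each of which is expanded by the same rule.

```python
import random


def sample_branch(rng, depth=0, max_depth=4, p_split=0.6, max_children=3):
    """Expand the reusable nonterminal 'branch'.

    With probability p_split (an illustrative value) the branch splits into
    2..max_children sub-branches, each expanded recursively with the same rule.
    Otherwise, or once max_depth is reached, it becomes a leaf.
    """
    if depth >= max_depth or rng.random() >= p_split:
        return "leaf"
    n_children = rng.randint(2, max_children)  # randint includes both endpoints
    return [
        sample_branch(rng, depth + 1, max_depth, p_split, max_children)
        for _ in range(n_children)
    ]


def count_leaves(tree):
    """Number of leaves in a sampled tree."""
    if tree == "leaf":
        return 1
    return sum(count_leaves(child) for child in tree)


def tree_depth(tree):
    """Depth of a sampled tree; a bare leaf has depth 0."""
    if tree == "leaf":
        return 0
    return 1 + max(tree_depth(child) for child in tree)


if __name__ == "__main__":
    for seed in range(3):
        tree = sample_branch(random.Random(seed))
        print(seed, count_leaves(tree), tree_depth(tree))
```

Different seeds produce trees with different numbers of branches and sub-branches, even though every tree is generated by the same single rule. This is the sense in which one compact model can cover large structural variation.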

This is partly motivated by consideration of the human visual system:

• Humans use context to resolve ambiguities (Bar [2004]). This means that we can only identify some objects by modeling their relationship to other objects. This can be seen in Amit and Trouvé [2007].

• Humans interpret some groupings of objects as a larger level object or activity, such as crowds of people or flocks of birds. Gestalt research demonstrates that perception has definite, repeatable grouping rules (Wertheimer [1938]).

• Humans seem to interpret whole scenes even when answering simpler visual questions, such as edge detection and segmentation. This can be seen in Figure 1.2, which shows an example from the Berkeley Segmentation Database (Martin et al. [2001]).

Grammatical methods are also motivated by theoretical considerations. They are a natural choice for modeling large structural variations (see Section 5.1). Grammatical models give a principled way to avoid hard decisions for low-level visual tasks (see Section 1.2). They allow strong models of background clutter (see Section 1.2). They allow whole scene parsing (see Section 1.2). They make it easy to integrate models from other domains into visual models (see Section 1.4.1). Finally, there are theoretical reasons why grammars may provide better generalization than other models.

–  –  –

Figure 1.2: Even on low-level visual tasks such as segmentation, humans give answers based on an interpretation of the whole scene. Figure adapted from Martin et al. [2001].

classes. The shape of a boundary is invariant to many photometric effects, in particular illumination (Canny [1986]).

1.1.2 Visual Chalkboard

As an example throughout this document, we discuss a system which would take images of a classroom chalkboard and attempt to parse them into lecture notes. This is an application with lots of noise. It is also an application where different levels of interpretation are required, since lectures can contain both sentences (which should be interpreted thoroughly, as text) and drawings (which could be interpreted partially as collections of lines, but which may contain unsummarizable elements which must be included verbatim).

In Section 1.4.1, we argue that grammars are a natural choice for such an application, because they make it possible to integrate statistical models from other domains in a straightforward and principled manner.

1.1.3 Visual Search Engine

Google image search is useful, but it relies partially on images being associated with relevant text on web pages. We would like to be able to find raw or under-described images, and to base search results more on the contents of the image. We might also submit images as queries, rather than text. The LabelMe dataset is a good challenge for this task.

In Section 1.3, we argue that hierarchical decomposition would allow the necessary rich understanding of relationships between objects.

1.1.4 Other Visual Grammars

–  –  –

Huttenlocher [2003], Felzenszwalb and McAllester [2010]. Recognizing these as a special case means that we may be able to improve upon this work by specifying richer models in some cases, or models that are better mathematically founded (and thus potentially trainable) in other cases. There are two tricks for turning a model into a grammar model:

• Mixture models are a special case of grammar models. If we have mixture components M1, ..., Mk with mixture weights p1, ..., pk, then we can build a grammar model M in which we have rules:

    M → Mi   with probability pi,   for i = 1, ..., k

The same is true of nearest neighbor models. This is exciting because mixture models and nearest neighbor models are often very powerful, but do not generalize well from a small amount of data.

• Deformable parts-based models are a special case of grammar models (Felzenszwalb and McAllester [2010]). Let M(x) denote the hypothesis that model M appears at image location x. If we have part models P1, ..., Pk, then we can build a grammar model M which has the rules:

    M(x) → P1(x + δ1) ... Pk(x + δk)
    Pi(x) → Pi(x + ∆)
    Pi(x) → I(x1 ± wi, x2 ± hi)

The δi represent the ideal displacement of each part Pi from the object model. The model is deformable because the second kind of rule allows the parts to be randomly displaced. The probability of Pi(x) → Pi(x + ∆) will depend on ∆. The third kind of rule gives the probability for placing a part, which can be thought of as the probability that a part Pi would produce the image data under it, I(x1 ± wi, x2 ± hi).
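The two constructions above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the implementation used in this thesis. The mixture is encoded as rules M → Mi with probability pi, which is the natural reading of the mixture-weight description. The displacement rule Pi(x) → Pi(x + ∆) is scored with an unnormalized isotropic Gaussian penalty, which is an assumption: the text says only that the probability depends on ∆. All function names and the parameter `sigma` are hypothetical, and the third kind of rule (the image data term) is left out.

```python
import random


def mixture_rules(components, weights):
    """Encode a mixture as grammar rules M -> M_i with probability p_i."""
    assert abs(sum(weights) - 1.0) < 1e-9, "mixture weights must sum to 1"
    return [("M", (name,), p) for name, p in zip(components, weights)]


def sample_rule(rules, rng):
    """Pick the right-hand side of one rule according to the rule probabilities."""
    r = rng.random()
    acc = 0.0
    for _lhs, rhs, p in rules:
        acc += p
        if r < acc:
            return rhs
    return rules[-1][1]  # guard against floating-point round-off


def expand_object(x, ideal_offsets):
    """Rule M(x) -> P_1(x + d_1) ... P_k(x + d_k): ideal part locations."""
    return [(x[0] + dx, x[1] + dy) for dx, dy in ideal_offsets]


def displacement_log_score(delta, sigma=2.0):
    """Assumed score for P_i(x) -> P_i(x + delta): unnormalized Gaussian log-penalty."""
    dx, dy = delta
    return -(dx * dx + dy * dy) / (2.0 * sigma * sigma)


def placement_log_score(x, ideal_offsets, placed, sigma=2.0):
    """Sum the displacement scores of the placed parts relative to their ideal locations."""
    total = 0.0
    for ideal, actual in zip(expand_object(x, ideal_offsets), placed):
        delta = (actual[0] - ideal[0], actual[1] - ideal[1])
        total += displacement_log_score(delta, sigma)
    return total


if __name__ == "__main__":
    rules = mixture_rules(["M1", "M2"], [0.3, 0.7])
    print(sample_rule(rules, random.Random(0)))
    offsets = [(-3, 0), (3, 0)]
    print(placement_log_score((10, 10), offsets, [(8, 10), (13, 10)]))
```

Placing every part at its ideal location gives the best possible score of zero, and the score decreases as parts move away from those locations. A full model would add the data term for each part and could replace the Gaussian penalty with any distribution over ∆.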


