# «THE UNIVERSITY OF CHICAGO GRAMMATICAL METHODS IN COMPUTER VISION A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES IN ...»


Finding the optimal such grammar is a hard, unsolved problem even for standard CFGs, as discussed in Section 1.6.1. In particular, we do not know of any optimal way to search over grammatical structures, so we propose to start from the initial grammars described in Section 2.7 and proceed by greedily applying transformations such as replacement (discussed in Section 5.4), merging (discussed in Section 5.5), and rule deletion. We could also try a slightly more flexible search algorithm, such as the beam search used in Stolcke [1994].

In light of Stolcke [1994], one should choose an initial structure and a set of transformations such that any grammatical structure can be reached from the initial one.

To assess such a search strategy, a simple task would be to build a shape grammar by hand, take samples from it, and try to recover the grammar from the samples. It is unlikely that we will recover exactly the same grammar, so we would instead try to show that we can improve some measure of closeness, for instance the cross-entropy on unseen samples.

–  –  –

for some binary encoding function enc(·) : {G} → {0, 1}*. The function enc must be uniquely decodable, i.e., enc(N, R, S) = enc(N′, R′, S′) =⇒ N = N′, R = R′, S = S′.

The encoding function enc should be as concise as possible, so that simple grammars are rewarded for their simplicity. We choose to represent N, R, and S as

–  –  –

where the iX denote some consistent numbering of the nonterminals. We assume that every nonterminal X ∈ N has the rule [X → ] ∈ R(X), and thus do not explicitly encode this information.

Each component of this representation can be encoded in binary:

• |N| is represented as a standard binary integer, taking up kN = ⌈log2 |N|⌉ bits

• Each (iX, iY, iZ ) is represented as three binary integers, each zero-padded to take up exactly kN bits.

• Each (iY, iZ ) is represented as two binary integers, each zero-padded to take up exactly kN bits.

We thus use up kN (1 + 3|R| + 2|S|) bits. We also need 2 log2 kN + 2 bits to specify how to break the stream of bits into “words” of kN bits: it suffices to encode kN itself, in a way that makes clear where the encoding of kN stops and the rest of the encoding begins. One relatively simple scheme is to write kN in binary and preface every digit with a 1; the pair 00 then marks the end of kN. Thus our total description length is
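As a concrete sketch of this scheme (function names are our own, not from the dissertation), the following encodes kN self-delimitingly and computes the resulting description length:

```python
def encode_self_delimiting(k):
    """Encode the positive integer k so the decoder knows where it ends:
    each binary digit of k is prefaced with a 1, and '00' marks the end."""
    bits = format(k, "b")                      # binary digits of k
    return "".join("1" + b for b in bits) + "00"

def description_length(k_n, num_rules, num_midpoints):
    """Total bits: the self-delimiting header for kN, plus one kN-bit word
    for |N|, three words per rule triple, and two words per midpoint pair."""
    header = len(encode_self_delimiting(k_n))  # about 2 log2(kN) + 2 bits
    return header + k_n * (1 + 3 * num_rules + 2 * num_midpoints)
```

For example, kN = 5 has binary digits 101, so its header is 11 10 11 00, which a decoder can split unambiguously before reading the fixed-width words.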

|enc(G)| = kN (1 + 3|R| + 2|S|) + 2 log2 kN + 2 bits.

Given a grammar G, we wish to create a simpler grammar that still explains our training data. We would also like to create grammars that have reusable parts, as explained in Bernstein and Amit [2005]. This can be done using the operation of replacing Y with X: whenever Y appears on the right-hand side of a rule, we replace it with X, and whenever Y appears on the left-hand side of a rule, we delete that rule. This may cause two rules [Z1 → X Z2], [Z1 → Y Z2] to merge, in which case we combine the multinomial probabilities and average the midpoint distributions.
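A minimal sketch of the replacement operation, assuming rules are stored as a map from (left-hand side, right-hand side) pairs to weights; summing weights here stands in for the actual combination of multinomial probabilities and midpoint distributions:

```python
def replace(rules, Y, X):
    """Apply Replace(Y, X): substitute X for Y on right-hand sides,
    delete rules whose left-hand side is Y, and pool the weights of
    rules that become identical after the substitution."""
    merged = {}
    for (lhs, rhs), w in rules.items():
        if lhs == Y:
            continue                              # delete rules expanding Y
        new_rhs = tuple(X if sym == Y else sym for sym in rhs)
        merged[(lhs, new_rhs)] = merged.get((lhs, new_rhs), 0.0) + w
    return merged
```

For instance, [Z1 → X Z2] and [Z1 → Y Z2] collapse into a single rule whose weight is the pooled mass of the two originals.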

Why does this operation create reusable parts? Ideally, before the replacement, Y and X are nonterminals that are used to parse similar curves (i.e., they have similar yields), but which ﬁt diﬀerently into the larger grammar. After the replacement, these curves will be parsed only with X, and X will ﬁt into the larger grammar in two diﬀerent ways. X has thus become a reusable part.

Note that replacement is asymmetric. This can be an advantage: since approximate similarity is not transitive, we would like to replace nonterminals X1, . . . , Xk with whichever Xi is most representative.

This transformation should be applied when the nonterminals X and Y have similar internal structure. Intuitively, this means that any curve generated by X can be generated by Y with a similar probability, and vice versa.

Suppose that we have two nonterminals X and Y, and we want to decide whether to replace Y with X. We want a simple model that still explains our data, which we formalize as maximizing the posterior probability of the chosen grammar.

Performing the replacement will simplify the grammar, which will be reﬂected in the MDL prior over structure. In order to make sure that we are still explaining the data, we need to consider the ratio of the likelihood under the current grammar to the likelihood under the grammar after performing the replacement.

Consider the quantity

–  –  –

The last two terms correspond to the reward for simplifying the grammar. The ﬁrst term measures how much better or worse the new grammar is at explaining the data.

We can make the simplifying assumption or approximation (used in Stolcke [1994]) that the distribution T′ ∼ T | C, G′ is arrived at by sampling T ∼ T | C, G and performing the operation Replace(Y, X) on each parse tree; we then have

–  –  –

We can also arrive at a similar but slightly different formula by empirically estimating the KL divergence between the inside distributions of X and Y. The inside distribution of X is the distribution over curves obtained by sampling from X in the standard way. We call it the inside distribution because its probabilities are calculated in the first pass of the Inside-Outside Algorithm, as explained in Section 2.6.

The Kullback-Leibler divergence is a popular choice for measuring the distance between two distributions. It is defined as

D(P ∥ Q) = Σx P(x) log [ P(x) / Q(x) ].

Let X be a nonterminal. For every subcurve C of any training curve, we can use reweighting to consider C as a sample from X’s inside distribution. In this case, P, Q are the inside distributions of X and Y, respectively. This approximation is especially appealing because the Inside-Outside algorithm gives us a very eﬃcient way to simultaneously compute PX (Ca [i : j]) for every nonterminal X and every subcurve Ca [i : j] of every training curve Ca. Speciﬁcally, we can compute all these in O(kn3 |N |) time, where k is the number of training curves, and n is the maximum length of a training curve.
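A rough sketch of this empirical estimate, assuming we have already collected the inside probabilities PX(Ca[i : j]) and PY(Ca[i : j]) for the same list of subcurves; the reweighting is approximated here by normalizing X's inside probabilities into sample weights, and we assume PY is positive wherever PX is:

```python
import math

def empirical_kl(p_inside_x, p_inside_y):
    """Estimate D(P_X || P_Y) from the two nonterminals' inside
    probabilities on a shared collection of subcurves. Each subcurve is
    treated, after reweighting, as a sample from X's inside distribution."""
    z_x = sum(p_inside_x)          # normalize X's scores into sample weights
    kl = 0.0
    for px, py in zip(p_inside_x, p_inside_y):
        if px > 0:                 # assumes py > 0 whenever px > 0
            kl += (px / z_x) * math.log(px / py)
    return kl
```

When the two inside distributions agree on every subcurve the estimate is zero, and it grows as X's mass concentrates where Y's does not.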

When dealing with the initial grammar, this approximation should be reasonable, since our initial grammars don’t generate any curves that are too dissimilar from our training curves. We hope that generalization will take place as the grammar is simpliﬁed; thus, the approximation will get worse as we simplify the grammar.

Given training data C1, . . . , Cn, we can evaluate any transformation that takes G to G′ by examining the posterior

P(G′ | C1, . . . , Cn) ∝ P(G′) P(C1, . . . , Cn | G′),

which is computable via parsing. This gives a general search strategy: simply pick the transformation that increases the value of the posterior the most.
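The resulting search loop can be sketched as follows; the callables for proposing transformations and scoring the posterior are hypothetical stand-ins for the machinery described above:

```python
def greedy_structure_search(grammar, data, candidate_transforms, log_posterior):
    """Greedy search over grammar structures: repeatedly apply whichever
    transformation (replacement, merge, rule deletion) most increases
    log P(G | data), stopping when no transformation helps."""
    current = log_posterior(grammar, data)
    improved = True
    while improved:
        improved = False
        best_score, best_grammar = current, grammar
        for transform in candidate_transforms(grammar):
            g2 = transform(grammar)
            score = log_posterior(g2, data)   # computable via parsing
            if score > best_score:
                best_score, best_grammar = score, g2
        if best_score > current:
            grammar, current = best_grammar, best_score
            improved = True
    return grammar
```

A beam-search variant would keep the top few grammars at each step instead of only the single best one.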

–  –  –

We would like to show that grammars can capture an exponential amount of variability through factoring, as discussed in Section 5.1.

The easiest task would be to correctly apply merging (Section 5.5) to the n-armed shapes of Section 5.1, so that we can generate all of the shapes in the class after having seen a small number of them. We can make this easier by having diﬀerent arm variants in each of the n arm slots. We can also start on an easier version by training on bracketed samples.

A harder merging task is to use merging to get accurate models of the Romer and Weizmann Horse datasets.

5.5.2 Creating Choice with Merging

We want to create grammars that have choice, as described in Section 5.1.1. This can be done by the operation of merging X and Y: we replace every instance of Y on either the left- or right-hand side of a rule with X. This may lead to a merging of two rules [X → Z1 Z2], [Y → Z1 Z2] (and some other cases). When this happens, we combine the multinomial probabilities and average the midpoint distributions of the rules.

As a heuristic to guide merging, we plan to apply the approach of the last section, but to use the outside probability instead of the inside probability. If two nonterminals X and Y have similar outside probabilities on every subcurve, then X and Y can be said to have similar outer structure, since the grammar is placing them in similar contexts. Intuitively, this means that X and Y are interchangeable, so that for any curve in which X is used to explain a subcurve, there is another equally likely curve where that subcurve has been replaced with a subcurve that can be explained by Y.

5.6 How do we Identify Natural Constituents?

5.6.1 Goal: Learning Constituents

The EM algorithm with a sparsifying prior will hopefully be able to recover grammatical structure. Therefore, if we start from a grammar with choice (see Section 3.7), we hope that doing EM iterations will give us a grammar with fewer choices that correspond to natural decompositions of shapes. Note that this method can only work with many training examples, and cannot be used to identify natural decompositions of a single shape.

Easy tasks here would be finding constituents in stars, polygons, and the n-armed shape. We could also try to get intuitively reasonable constituents for silhouettes of people and horses.

–  –  –

In Basri et al. [1998], Richards et al. [1985], and Kimia et al. [2003], it is argued that some shapes have natural constituents, and some shapes do not. Some cues that we believe would give us clues to structure:

• Large-scale curvature. If we smooth the points of a curve with a Gaussian, then we can measure curvature at different scales. We believe that large-scale curvature is likely to occur only at constituent boundaries.

–  –  –
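As an illustration of the large-scale curvature cue above (our own sketch, not code from the dissertation), one can smooth a closed curve with a truncated Gaussian along the point index and use the discrete turning angle as a curvature proxy:

```python
import math

def smooth_closed_curve(points, sigma):
    """Smooth a closed curve of (x, y) points with a truncated Gaussian
    along the point index; larger sigma probes larger-scale structure."""
    n = len(points)
    radius = max(1, int(3 * sigma))
    offsets = list(range(-radius, radius + 1))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    total = sum(weights)
    out = []
    for i in range(n):
        x = sum(w * points[(i + k) % n][0] for w, k in zip(weights, offsets)) / total
        y = sum(w * points[(i + k) % n][1] for w, k in zip(weights, offsets)) / total
        out.append((x, y))
    return out

def turning_angle(points, i):
    """Discrete curvature proxy: exterior (turning) angle at point i."""
    n = len(points)
    (ax, ay), (bx, by), (cx, cy) = points[i - 1], points[i], points[(i + 1) % n]
    t1 = math.atan2(by - ay, bx - ax)
    t2 = math.atan2(cy - by, cx - bx)
    return math.atan2(math.sin(t2 - t1), math.cos(t2 - t1))  # wrap to (-pi, pi]
```

Points whose turning angle on the heavily smoothed curve remains large are candidate constituent boundaries.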

We use explicit correspondences to learn the statistically best set of constituents when building a grammar from a single example. (This experiment is closely related to an experiment in Felzenszwalb [2003].) Specifically, since the points on each curve from the hand-annotated Romer dataset correspond semantically, we can ask which decomposition of the curves yields the model that assigns the highest log-likelihood to the curves seen. This can be done with a simple dynamic program:

–  –  –

Minimizing the sum of negative log-likelihoods of the Watson distribution is simply maximum likelihood estimation for the Watson distribution, which is explained in Subsection 3.1.5.
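The dynamic program above can be sketched as a standard interval DP; here seg_cost is a hypothetical stand-in for fitting a constituent model (such as a Watson distribution) to a segment and returning its negative log-likelihood:

```python
def best_decomposition(n, seg_cost):
    """Split points 0..n into contiguous constituents minimizing the total
    cost. seg_cost(i, j) scores the segment covering points i..j."""
    INF = float("inf")
    best = [0.0] + [INF] * n          # best[j] = cheapest cover of 0..j
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + seg_cost(i, j)
            if c < best[j]:
                best[j], back[j] = c, i
    cuts, j = [], n
    while j > 0:                      # recover chosen constituent boundaries
        cuts.append((back[j], j))
        j = back[j]
    return best[n], cuts[::-1]
```

The DP runs in O(n²) evaluations of seg_cost, so it is practical whenever fitting one segment model is cheap.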

The constituents selected by the algorithm are shown in Figure 5.5. The constituents that seemed most intuitive to me are shown in Figure 5.6.

Figure 5.5: Finding optimal constituents.

This ﬁgure generated by the experiment experiments/6.structure/constituents.

Figure 5.6: Hand-picked constituents.

## APPROXIMATE PARSING OF ONE-DIMENSIONAL SIGNALS

–  –  –

Recall that the problem of parsing curves with a shape grammar is closely related to the problem of parsing strings with a weighted context-free grammar. The running time of context-free parsing with a standard algorithm like CKY is cubic in the length of the string, and it is not known how to do better in practice. This makes parsing prohibitively slow for long strings, and prevents the use of context-free grammars in modeling long one-dimensional signals, which are instead commonly modeled with Markov models.

The main contribution of this chapter is to introduce an approach for eﬃcient context-free parsing of very long strings. This would allow fast curve parsing with our shape grammars, and also the use of context-free models for other problems that involve long one-dimensional signals.

6.1 CKY parsing

We now recall the CKY algorithm for parsing strings with a weighted context-free grammar in Chomsky Normal Form, which is shown in Algorithm 7. Let s be a string of length n.

Let (X, R) be a weighted context-free grammar, with X a set of symbols and R a set of weighted rules. Let w(X → λ) be the weight of the rule (X → λ) ∈ R.

Let [i, j) be the interval starting with i and ending with j − 1. The algorithm works by computing a cost COST [X, [i, j)] for each symbol X and each interval [i, j), corresponding to the cost of parsing the substring s[i]... s[j − 1] with the nonterminal X.

The O(n³) runtime of the CKY algorithm arises because there are O(n²) substrings, and each substring can be decomposed in O(n) ways. If we could limit the number of constituents and decompositions, we could speed up the algorithm. This brings us to the next section, where we define a framework for considering restricted families of constituents and decompositions.

Algorithm 7 CKY parsing. Returns cost of best parse.
Input: string s, context-free grammar (X, R)
for X ∈ X do
  for i = 0 to n − 1 do
    COST[X, [i, i + 1)] ← w(X → s[i])
  end for
end for
for ℓ = 2 to n do
  for i = 0 to n − ℓ do
    COST[X, [i, i + ℓ)] ← ∞ for all X ∈ X
    for j = i + 1 to i + ℓ − 1 do
      for X → Y Z ∈ R do
        COST[X, [i, i + ℓ)] ← min{COST[X, [i, i + ℓ)], w(X → Y Z) + COST[Y, [i, j)] + COST[Z, [j, i + ℓ)]}
      end for
    end for
  end for
end for
return COST[S, [0, n)]
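For reference, a runnable version of weighted CKY over intervals [i, j); the rule representation (a map from each symbol to weighted right-hand sides, which are either a terminal or a pair of symbols) is our own choice:

```python
def cky(s, rules, start):
    """Min-cost weighted CKY. `rules` maps X -> list of (weight, rhs),
    where rhs is a terminal symbol or a pair (Y, Z) of nonterminals.
    Returns the cost of the best parse of s from `start`, or inf."""
    n = len(s)
    INF = float("inf")
    cost = {}
    for i in range(n):                        # base case: terminal rules
        for X, prods in rules.items():
            for w, rhs in prods:
                if rhs == s[i]:
                    key = (X, i, i + 1)
                    cost[key] = min(cost.get(key, INF), w)
    for length in range(2, n + 1):            # longer intervals [i, i+length)
        for i in range(n - length + 1):
            j = i + length
            for X, prods in rules.items():
                for w, rhs in prods:
                    if isinstance(rhs, tuple):
                        Y, Z = rhs
                        for k in range(i + 1, j):   # try every midpoint
                            c = w + cost.get((Y, i, k), INF) + cost.get((Z, k, j), INF)
                            if c < cost.get((X, i, j), INF):
                                cost[(X, i, j)] = c
    return cost.get((start, 0, n), INF)
```

The three nested loops over length, start position, and midpoint give exactly the cubic behavior discussed above.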

–  –  –

Deﬁnition 6.2.1. Let s be a string of length n. A decomposition family for s is a pair (I, D), where I is a set of intervals [i, j), and D is a set of decompositions of the form

[i, j) → [i, k) + [k, j),

where [i, j), [i, k), [k, j) ∈ I. I must contain the interval [0, n).

The simplest decomposition family is the set of all intervals [i, j) such that 0 ≤ i < j ≤ n, together with all possible decompositions of each [i, j) into [i, k) + [k, j). We will call this the complete decomposition family.
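A small sketch enumerating the complete decomposition family, which makes the O(n²) interval count and O(n³) decomposition count concrete:

```python
def complete_family(n):
    """Enumerate the complete decomposition family for a string of length n:
    all intervals [i, j) with 0 <= i < j <= n, and every split
    [i, j) -> [i, k) + [k, j)."""
    intervals = [(i, j) for i in range(n) for j in range(i + 1, n + 1)]
    decomps = [((i, j), (i, k), (k, j))
               for (i, j) in intervals
               for k in range(i + 1, j)]
    return intervals, decomps
```

For n = 3 this yields 6 intervals and 4 decompositions; in general |I| = n(n + 1)/2 while |D| grows cubically, matching the CKY runtime discussion. A θ-flexible family aims to shrink both counts while still approximating any parse.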

We would like small decomposition families that can approximate arbitrary parses reasonably well. We capture this notion with θ-ﬂexibility:

Figure 6.1: With a θ-ﬂexible decomposition family, any parse can be adjusted to an allowable parse by moving the midpoint of each binary rule slightly.

On the left, a parse before adjustment. On the right, after the adjustment. Vertical lines denote allowable midpoint choices.

