FREE ELECTRONIC LIBRARY - Dissertations, online materials

Pages:     | 1 |   ...   | 5 | 6 || 8 | 9 |   ...   | 19 |

«Learning Implicit User Interest Hierarchy for Web Personalization by Hyoung-rae Kim A dissertation submitted to Florida Institute of Technology in ...»

-- [ Page 7 ] --

–  –  –

Negative correlation has a negative log CPR value. P4 is that F can distinguish positive and negative correlation of A and B. Since positive correlation is much more important than negative correlation in finding phrases, we only measured the change of correlation values over positive correlation.

Tan et al. (2002) illustrated those properties and extended them to each of the existing measures to determine if the existing measure satisfies the properties required (Kamber and Shinghal, 1996). Some of these properties have been extensively investigated in the data mining literature (Hilderman and Hamilton, 2001; Tan and Kumar, 2000; Tan et al., 2002). We examined 32 correlation functions of properties and cooperated them with phrase-finding algorithms. A complete list of the correlation functions to be examined in this study is given in Appendix 1.

3.2. Experiments 3.2.1. Experimental Data and Procedures We use five New York Times articles and five web pages collected from our department server. We use web pages to test, because contents in a web page differ from the content found in normal article. The data used in this study is accessible at http://cs.fit.edu/~hkim/dissertation/dissertation.htm. The article size was about 2 pages and we asked 10 human subjects, other than the authors, to read the 10 articles. Each article contains about 1,300 words - 720 words after removing stop-words. Since we assigned 6 as a threshold of maximum phrases length for Ahonen’s and Chan’s, the total possible number of phrases for each article is approximately 3600 (=720×5). Of the 10 subjects, 4 were graduate students from a department of computer sciences and 6 were undergraduates with various majors.

We asked the 10 human subjects to choose their top 10 meaningful phrases for each article or web page. One might insist that the results will be different depending on how 10 humans are chosen. If all volunteers have the same background the matching rate will be higher than the normal case. However, since we are not comparing the algorithm with humans, but comparing among algorithms, it does not matter how we chose the 10 volunteers. Furthermore, since the algorithm finds phrases statistically that cover general human meaningfulness, we choose 10 human subjects arbitrarily.

The instruction that we gave them were:

Identify the top 10 "meaningful/important" phrases for each article.

Phrases are defined as two or more adjacent words that are meaningful, for example, "computer science," "florida institute of technology,"...

The definition of meaningful is up to you.

We will measure the number of matches between the human subjects’ selections and different correlation functions’ selections as well as different phrase-finding algorithms.

We also count average matching of humans – in this case, we divided the sum by

9. There are cases for the human or algorithm to select less than 10 phrases. In order to be fair in these cases, we use an additional adjustment function. We also attempt to prevent a measure from being scored 1 by finding one phrase and having one matched phrase by chance – the results are too sensitive. We, therefore, decided to give a lower incentive as a measure finds fewer phrases 20% at the most. For example, if there are 5 matches out of 10, the number of matching is 5×1/10. If there are 5 matches out of 9, then we assigned 5×1/9. But, if there are 5 matches out of 8, then we assigned 5×1/8.5. The generalized

formula is:

–  –  –

, where m is the number of matched words and f is the number of selected words.

We also applied different correlation functions to Ahonen’s algorithm to see if the difference of the performance depended on the correlation functions. Ahonen used two different correlation functions: conditional probability (Confidence, F11) for filtering phrases and mutual confidence (F32) for ordering the collected phrases determining which phrase is more important than the other. Since he used fixed user-defined threshold (0.2) for filtering the phrases, we only varied the correlation function used for ordering phrases.

3.2.2. Evaluation Criteria We evaluate the meaningfulness of phrases. We believe the closer a match comes to our set of human-selected phrases, the better the phrase-finding algorithm is in terms of finding meaningful phrases. To evaluate the correlation functions for each phrase-finding algorithm, we have two evaluation criteria: the number of exact matches and the number of simple matches. We have 128 methods (4 algorithms × 32 correlation functions) – the 4 algorithms are VPF, Chan’s, Ahonen’s, and Ahonen’s with gap, and the correlation functions F1 through F32 are in Appendix A.

–  –  –

The number of exact matches of a method is measured by the percentage of the matches between the human’s and a method’s. We count each match with a human’s and then average the 10 compared results. This counting approach assigns more weights to the more meaningful words – more meaningful word means that they were selected by more human subjects. If a phrase is selected by several human subjects, every match is counted.

Therefore, finding more popular phrases increases the matching average. The number of matches will be very low, because only 10 phrases selected by a method and a human respectively are going to be compared.

The number of simple matches counts the matched phrases against the list collected by all human (i.e., the union of the words from the 10 human subjects). The list will be less then 100 because some phrases can overlap. Simple match is not directly related to finding more meaningful phrases, because this count removes the popularity information. We count this only to support the result of the exact match.

Comparison of the results is done using a matched-pair design (Robertson, 1981).

In this design, the top ten phrases in the ranking generated are compared. The comparison, which simply identifies if one group of ten phrases is better than the other, is on the basis of precision in other words the number of matched phrases. This type of evaluation has the following advantages: It is realistic in the sense that many information retrieval systems are interested in the top group. Traditional recall/precision tables are very sensitive to the initial rank positions and evaluate entire rankings (Crotf and Das, 1989). Another advantage is that significance measures can be readily used.

3.3. Results and Analysis Before comparing our algorithm with existing methods we need to decide whether to use pruning or not. After that we will be able to perform the comparison. In evaluating our method against related algorithms we use different scoring methods: exact match and simple match.

3.3.1. With-pruning vs. Without-pruning The VPF algorithm has a pruning function. The results differed whether we used the pruning function or not. We compared them by comparing the top 10 best methods with exact match (Sec 6.2). By composing the algorithm and 32 correlation functions (in Appendix A), we generated 32 methods. We ranked the top 10 methods using “with pruning” and presented the corresponding results of “without-pruning” next in Table 10.

The values are the average of matches across 10 human subjects and 10 articles. Most methods yielded improved results when they had been pruned. The top method VPF with F25 had improved its performance by 5.7%. With pruning is statistically significantly better than without-pruning with a 95% confidence interval (P=0.004).

3.3.2. Analysis with Exact Match Because “with-pruning” achieves a higher matching rate than “without-pruning”, we decided to use pruning in our algorithms for the rest of our experiment. Top 10 Methods The main purpose of the analysis in this section is to choose the best method.

Which method is the best is the most interesting question. We averaged the results from 10 articles and 10 human subjects and sorted by the average to rank all 128 methods. We presented the results in Table 11 and included the rank, methods used, and the average.

Each method was composed of an algorithm and a correlation function. Notice that, we also presented the results of previous methods. Ahonen used correlation function F32. He also introduced a method with gaps. The row Ahonen_gap represented the results using Ahonen’s method allowing gaps within a phrase.

The best method was the combination of VPF and correlation functions F25 followed by F16 and F28 – all those three correlation functions satisfied PiatetskyShapiro’s three desirable properties and distinguish positive from negative correlations.

The best method VPF with F25 matched 0.93 phrases on average with the phrases selected by a human subject. In the next section we measured the average number of matching phrases between human subjects and compared those results to the results from methods.

Interestingly, VPF won the top 3. Chan’s algorithm occupied the next ranks.

Another observation was that the correlation functions F25, F16, and F28 that marked high rank with VPF also marked high rank with Chan’s. This observation implied that the performance also depends on the correlation functions.

Table 11. Ranked by average across humans and articles – Exact match

–  –  –

Ahonen’s algorithm ranked 24 and Ahonen_gap 105. These methods matched

0.767 and 0.452 numbers of phrases with human subjects respectively. The low performance with gap is the same phenomenon as shown in (Ahonen et al., 1998). We conducted t-Test (paired two sample for means) between VPF with F25 and Ahonen with F32. There was a clear statistically significant difference between the two methods with 95% confidence (P=0.016). Therefore, we can conclude that VPF with F25 found statistically significantly more meaningful phrases than Ahonen’s previous algorithm.

Ahonen’s algorithm with other correlation functions received higher ranks such as F6, F10, F11, F12, F17, F26, and F20 as shown in Table 11. They all ranked 13, 15, and 20, which are higher than Ahonen’s original method (24). This indicates Ahonen’s algorithm can be improved upon by using different correlation functions. Comparing with Human Subjects To see the average number of matches among human subjects is interesting and also provides insight into interpreting the average number of matching by the algorithm.

For instance, if an algorithm matches 1 on average and the human matches 7, then the performance of the algorithm is almost negligible no matter how much higher its performance is compared to others.

We presented the best, average, and worst matching human results in Table 3. The results told us that only 1.3 phrases out of 10 picked by a human subject matched with the phrases picked by the others on average. This is not an unrealistic result. Considering that each document has approximately 1,300 words, more than 7779 possible phrase combinations exist and each person has a different background, matching 1.3 phrases out of 10 on average is a reasonable result. Our method achieved a result (0.93), which was close to the typical human result. We also conducted a t-Test with the human average and VPF with F25. The human subjects’ average was statistically significantly better than the best result obtained by the algorithm with a 95% confidence interval (P=0.02). It would be interesting to see if the worst case of human matching was higher than the algorithm’s. The answer was no. It was not statistically significantly better than the machine’s. This result indicates that human matching is better than the matching of algorithms in general but not always.

Table 13. Ranked by average across humans and articles – Simple match

–  –  –

3.3.3. Analysis with Simple Match This simple match count showed similar ranking to the exact match. VPF with F28 followed by F25 and F13 had the top matching rates: 3.70, 3.69, and 3.67 respectively as shown in Table 13. Since simple match uses a list of meaningful phrases by taking the union of phrases selected by the 10 human subjects, average number of matching phrases is higher than the average by exact match. Chan’s with F25 ranked 22 (3.05 matching rate), Ahonen without gap ranked 33 (2.93), and Ahonen with gap ranked 118 (1.76) out of 128.

Chan’s original method ranked 22 (3.053). These results also told us VPF found more phrases than Ahonen’s and Chan’s. The result from simple match also indicated that Ahonen’s algorithm could be improved by incorporating different correlation functions.

We also attempted to compare the methods’ results with the results from humans.

Human matched the list 5.6 out of 10 on average; the best and worst cases are 6.3 and 4.7 as shown in Table 14. The result 3.69 of method VPF with F25, which was the highest score with exact match, was quite significant considering that we only used the statistical information.

3.4. Summary We proposed a variable-length phrase-finding algorithm, which find more meaningful phrases – VPF – than older methods – Ahonen’s and Chan’s algorithms. We also coordinated these algorithms with 32 different correlation functions. They regenerate sequences recursively with the words selected in the previous stage and search for increased length of phrases in time O(nw), where nw is the number of distinct words in a page. Since our algorithm uses average as a threshold and stops when the length of phrases does not increase, no user-defined parameter is required.

In order to choose the best method, we conducted an experiment by asking 10 human subjects to select 10 phrases from 10 different articles. We compared the number of matching phrases chosen by a method to those phrases chosen by 10 human subjects. By comparing the top 10 best measures (matched-pair design (Robertson, 1981)), we observed that when we add pruning, the algorithm (VPF) had improved performance.

Pages:     | 1 |   ...   | 5 | 6 || 8 | 9 |   ...   | 19 |

Similar works:

«MULTI-PLATFORM STRATEGY AND PRODUCT FAMILY DESIGN Yanfeng Li Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy In Industrial and Systems Engineering Dr. Janis Terpenny (Chair) Dr. Patrick Koelling Dr. Asli Sahin-Sariisik Dr. Subhash Sarin February 22, 1010 Blacksburg, VA Keywords: Product Design, Platform, Optimization, Costing     Multi-platform strategy and...»

«PRODUCTS OF TOPOLOGICAL MODAL LOGICS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF PHILOSOPHY AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Darko Sarenac May 2006 c Copyright by Darko Sarenac 2006 All Rights Reserved ii Products of Topological Modal Logics ILLC Dissertation Series DS-2006-08 For further information about ILLC-publications, please contact Institute for Logic, Language and Computation...»


«3D DIGITAL RELIEF GENERATION MEILI WANG July 2011 National Centre for Computer Animation Bournemouth University This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and due acknowledgement must always be made of the use of any material contained in, or derived from, this thesis. i 3D DIGITAL RELIEF GENERATION MEILI WANG A thesis submitted in partial fulfilment of the requirements of the Media...»

«Safe sisters: limitations of sister city relationships for international peace building Barbara Teresa Lloyd B.A. (Hons.) Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy September 2010 School of Sociology and Social Work University of Tasmania Declaration of originality I declare that this thesis is my own work and has not been submitted in any form for another degree or diploma at any university or other institute of tertiary education. Information derived...»

«DIRECT MEASUREMENT OF THICKNESSES, VOLUMES OR COMPOSITIONS OF NANOMATERIALS BY QUANTITATIVE ATOMIC NUMBER CONTRAST IN HIGHANGLE ANNULAR DARK-FIELD SCANNING TRANSMISSION ELECTRON MICROSCOPY by BIAO YUAN B.S., Inorganic Materials Science & Engineering, Zhejiang University, 1984 M.S., Inorganic Nonmetallic Materials Science & Engineering, Zhejiang University, 1987 M.C.I.S., Computer & Information Science, Cleveland State University, 2003 A dissertation submitted in partial fulfillment of the...»

«SURFACES, SCALES, AND SYNTHESIS SCIENTIFIC REASONING AT THE NANOSCALE by Julia R. Bursten B.A., Philosophy, Rice University, 2008 M.A., Philosophy, University of Pittsburgh, 2010 Submitted to the Graduate Faculty of the Kenneth P. Dietrich School of Arts and Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh UNIVERSITY OF PITTSBURGH KENNETH P. DIETRICH SCHOOL OF ARTS AND SCIENCES This dissertation was presented by Julia R....»

«Sex ratios of sea turtle hatchlings: direct and indirect estimates by Matthew Howland Godfrey A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Zoology University of Toronto ©Copyright by Matthew Howland Godfrey 1997 i Sex ratios of sea turtle hatchlings: direct and indirect estimates ©1997 by Matthew Howland Godfrey. A thesis submitted for the degree of Doctor of Philosophy Graduate Department of Zoology, University of...»

«Educational Philosophy and Theory, Vol. 33, Nos. 3 & 4, 2001 Questioning as an Epistemic Process of Critical Thinking POLYCARP IKUENOBE Department of Philosophy, Kent State University Introduction The idea of questioning one’s idea is regarded by many as an affront. This attitude towards questioning suggests that it is rude, especially when it is persistent. Questioning is considered a way of casting aspersions on one’s ability or the reasonableness of one’s view. Questioning is often...»

«Rick Grush Introduction to emulation 1 An introduction to the main principles of emulation: motor control, imagery, and perception Rick Grush Department of Philosophy 0119 rick@mind.ucsd.edu UC San Diego http://mind.ucsd.edu 9500 Gilman Drive La Jolla, CA 92093-0119 0. Introduction 1. Motor control: forward models and Kalman filters 1.1 Feed-forward and feedback control 1.2 Emulators (forward models) 1.3 Kalman filters 1.4 Kalman filtering and control 1.5 Motor control 2. Motor Imagery 2.1 The...»

«3D MODELING WITH DATA-DRIVEN SUGGESTIONS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Siddhartha Chaudhuri August 2011 © 2011 by Siddhartha Chaudhuri. All Rights Reserved. Re-distributed by Stanford University under license with the author. This work is licensed under a Creative Commons AttributionNoncommercial-No Derivative Works 3.0...»

«The Experience and Self-Management of Fatigue in Adult Hemodialysis Patients by Ann Elizabeth Horigan Nursing Duke University Date:_Approved: _ Julie Barroso, Supervisor _ Diane Holditch-Davis _ Susan Schneider _ Sharron Docherty Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Nursing in the Graduate School of Duke University ABSTRACT The Experience and Self-Management of Fatigue in Adult Hemodialysis Patients by Ann Elizabeth Horigan...»

<<  HOME   |    CONTACTS
2016 www.dissertation.xlibx.info - Dissertations, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.