FREE ELECTRONIC LIBRARY - Dissertations, online materials

Pages:     | 1 |   ...   | 6 | 7 || 9 | 10 |   ...   | 19 |

«Learning Implicit User Interest Hierarchy for Web Personalization by Hyoung-rae Kim A dissertation submitted to Florida Institute of Technology in ...»

-- [ Page 8 ] --

We concluded that VPF with F25 found a statistically significantly greater number of meaningful phrases than Ahonen’s previous method. We suspect the filtering stage of Ahonen’s algorithm filtered many meaningful phrases out or their weighting scheme using the length of a phrase and tightness (Ahonen et al., 1998) distracted the correlation value of a phrase. Interestingly, the correlation functions F25 and F28 were both included in the top 10 in both exact match and simple match. This result indicates the correlation functions F25 and F28 had higher matching rates than the other correlation functions. These two correlation functions both satisfied desirable properties for phrases. We can also improve Ahonen’s algorithm by incorporating correlation functions F10, F11, F12, F17, F26, F6, and F20. Those functions resulted in a higher match of average scores for both exact match and simple match experiments.

The performance of our method varied depending on the articles selected. We currently do not understand the reason for the variance in performance over different articles. We assume it is due to the intrinsic characteristics of an article, because the human subjects’ results are also different depending on the articles. Phrases in VPF grow backwards; however, in the future we will devise an algorithm that grows both forwards

–  –  –

Before selecting a correlation function, it is important to check whether a correlation function satisfies basic desirable properties of a correlation function. Likewise knowing the characteristics of a correlation function is important. We propose two new desirable properties that a correlation function should satisfy. These new properties will help understand the general characteristics of correlation functions and help to choose or devise a right correlation function for an application domain. Correlation functions can be used for finding phrases, building profile, and devising scoring functions in our system as shown in Figure 14. We tested with 32 correlation functions to see which one satisfies

–  –  –

Figure 14. Desirable properties in the framework of web personalization

4.1. Desirable Properties of a Correlation Function In this section, we describe several key properties of a correlation function.

Correlation functions differ in their ability to capture the dependencies between variables, because each correlation function has its own bias in preferring a set of diagrams to another. The dependencies between variables can be described in a Venn diagram as shown in Figure 15 – A, B, A∩B, and A∪B.

–  –  –

Piatetsky-Shapiro (1991) proposed three key properties that a good correlation function, F,

should satisfy:

P1: if A and B are statistically independent, then F is 0;

P2: F monotonically increases with P(A,B) when P(A) and P(B) remain the same;

P3: if P(A) (or P(B)) increases when the rest of the parameters, P(A,B) and P(B) (or P(A)), remain unchanged, then F monotonically decreases.

Tan et al. (2002) illustrated those three properties and the extent to which each of the existing measure satisfies the properties. Some of these properties have been extensively investigated in the data mining literature (Hilderman and Hamilton, 2001; Tan and Kumar, 2000).

4.1.1. Enhancing Property 1 The detection of statistical independence is a desirable property – PiatetskyShapiro (1991) specified this as his first desirable property (P1). However, should the correlation function return only 0?

When two events, A and B, are statistically independent, some functions return 0, but some functions return a different value such as 1. Each column in Table 19 is for each correlation function and we used “o” mark as satisfaction of P1 in the P1 row. The “value” row signifies the return value. For instance, the test functions Odds, Conv, Inte, and Coll in Appendix 1 return 1 when two events are statistically independent as shown in Table 19.

Therefore, we argue that the definition of P1 is too strict. We revise the definition of P1 to


P1: F can distinguish statistical independence of A and B.

The revised definition of P1 is used for the duration of this Chapter.

4.1.2. Additional Desirable Properties We propose and investigate additional properties desirable for a correlation function. When the universe is fixed, three events – A, B, and A∩B – can enumerate all possible cases. We list seven possible cases in Table 15. The notation “↑” indicates increase, “↓”indicates decrease, and “-”indicates no change in the value of a correlation function. We add two other possible cases – two variables increase at the same ratio (P4 and P5 in Table 15).

Case 1 has variable A fixed, B fixed, and A∩B increasing. Piatetsky-Shapiro (1991) refers to this case as P2, so we follow his designation. We show the proper change of a correlation function in the “Proper change” column. Other cases can be interpreted in the same way. The proper change of case 1 is increasing (Piatetsky-Shapiro, 1991). Case 2 and 3 have the same property name (P3) by similarity – variable A and B could be alternated with each other. The proper change of P3 is decreasing (Piatetsky-Shapiro, 1991).

–  –  –

Case 4 appears to be the same as the inverse of P2. But not all correlation functions return the same results (e.g., Supp in Appendix 1 increases for case 1 but no change for case 4). The value of a correlation function should decrease as the value of A∪B increases

–  –  –

A∩ B decreases when A∪B=A+B-A∩B. Therefore, property 4 (P4) states that if P(A,B) A∪ B increases when P(A) and P(B) remain unchanged, then F monotonically decreases.

An example of case 4 is shown in Table 16. The union is 100. Variable A and B increase with the same incremental ratio; A∩B remains unchanged. Intuition suggests as A and B increase, the value of a correlation function is expected to decrease.

–  –  –

Case 5 and 6 have the same property name (P5) due to their similarity. The proper change in a correlation value for P5 is monotonically increasing. In the formula of

–  –  –

A) and A∩B increase together at the same ratio, the whole value should increase. The P5 states that if P(A) (or P(B)) and P(A,B) increase at the same ratio, then F monotonically increases.

We illustrate cases 5 and 6 with an example below in Table 17. Since both cases are the same, we will only use case 5. Cases can have two possible sub-scenarios: A is smaller than B, or A is bigger than B. The union is 100. Variable A and A∩B increase with the same incremental ratio; B remains unchanged. In the case of AB, the value of correlation function is expected to increase, because the ratio between B and A∩B increases, while the ratio between A and A∩B remains the same. In the case of AB, A seemingly subsumes B gradually, because B remains the same while A∩B grows.

Consider the cases where A1=5.0, A2=10.0, B=10.0, A1∩B=3.0, A2∩B=6.0 as in Table 17. The correlations of the two cases (F(A1,B,A1∩B) and F(A2,B,A2∩B)) may have different values. The log CPR (Rosenfeld, 1994) values for the two cases are 4.2 and 5.0 respectively, which assess the significance of the correlation between A and B. However, for example Inte in Appendix 1 satisfies all previous 3 desirable properties (P1 – P3); but, it returns the same value, 6, for both cases (case of (A1,B,A1∩B) and case of (A2,B,A2∩B)).

Our P5 is to verify whether a correlation function can tell the difference between the two cases. This determines that the P5 is critical for measuring the characteristics of correlation functions.

–  –  –

Case 7 is defined as the values of A, B, and A∩B increase monotonically together at the same ratio. It is difficult to define what is a desirable change of the correlation function for this case. This property is somewhat related to the frequency. For example, too frequent and too rare words are usually removed in some researches (Croft et al., 1991;

Zamir and Etzioni, 1998). Due to the uncertainty inherent in case 7, we are unable to determine any proper changes to this property.

4.2. Experiments 4.2.1. Experimental Data and Procedures If all those correlation functions that satisfy our new 2 desirable properties also satisfy the previous 3 desirable properties, our proposed properties would not be a substantial improvement in characterizing correlation functions. Therefore, we collected 32 correlation functions to see which correlation functions only satisfy the previous 3 desirable properties. A complete listing of the correlation functions examined for further properties in this Chapter was given in Appendix 1. In order to verify whether a correlation function satisfies desirable properties, we systematically generated 923 different test cases (combinations of A, B, and A∩B). For P2 we generated 76 test cases, P3 had 399 test cases, P4 had 35 test cases, and P5 had 413 test cases. Generating test cases P3 and P5 were more complicated, because both had two different sub cases. We wrote our program with Excel macro and ran on Intel Pentium 4 CPU. The detailed test cases and source code are available on the web: http://cs.fit.edu/~hkim/dissertation/dissertation.htm.

We added dMAX, AEMI3, dMIN, and dMIN2 correlation functions to an experiment. The dMAX and the dMIN were cooperating with the Supp. The dMIN2 emphasized the original MIN value. The AEMI3 function was similar to AEMI4 except that AEMI3 did not include the negative relation part, A′B′ × log(A′B′/A′×B′). For example, the two matrixes in Figure 16 had different value assignments – basically the positive and negative intersection values were exchanged. An AEMI4 weighed two tables the same, even though Matrix A had a value of 5 for A∩B and Matrix B had a value of 75 for A∩B.

However, in some application domains the positive relation, A∩B× log(A∩B/(A×B)) is more important than the negative relation, which prefers Matrix B to Matrix A. Therefore, AEMI3 removed the negative relation part from AEMI4.

–  –  –

4.2.2. Evaluation Criteria In addition to the proposed properties (P1-P5), another important property was the ability to distinguish between positive and negative correlations (P6). Tan et al. (2002) described this property in detail. Positive correlation is more important than negative correlation in some application domains. Therefore, we also checked correlation functions whether they do or do not satisfy P6.

Statistical independence (P1) can be measured by the determinant operator, Det(A,B) = A∩B×A′∩B′ − A∩B′×A′∩B. Thus, a singular Venn diagram is independent when whose determinant is equal to zero (Tan et al., 2002). Measuring their cross product ratio (CPR) can assess the significance of the correlation between A and B (Rosenfeld,

1994) and is defined as:

–  –  –

Negative correlation function would have a negative log CPR value. The definition of P6 was that F can distinguish between positive and negative correlations of A and B. We also tested whether the result of each correlation function was normalized or not. P7 is that F returns normalized results. The total properties that we summarized are listed in Table 18.

Normalized functions have return values within certain boundaries such as between 0 and 1.

–  –  –

We measured mainly whether a correlation function shows consistent increasing pattern or decreasing pattern under a condition for each property. The changes with negative relations are disregarded. Since we can detect positive and negative correlations it would be more practical to check over positive correlations.

4.3. Results and Analysis The purpose of this experiment is to test whether those correlation functions that satisfy our new 2 desirable (new properties) properties also satisfy 3 established desirable properties (old properties). If there are correlation functions that satisfy only one of either the previous 3 properties or our 2 new properties, then the experiment provides evidence that our new properties will be useful for distinguishing properties of correlation functions.

This analysis based upon 3 categories. First, we compared old properties and new properties based upon the number of correlation functions that satisfies those properties.

Second, old properties and new properties were compared based upon the number of functions that satisfied the two categories of properties upon the satisfaction of P1. Third was regarding the satisfaction of P6. After that, we summarized whether correlation functions return normalized results.

4.3.1. Comparing Properties: Old verses New We compared how many correlation functions satisfied old properties and how many satisfied new properties. Each property of a correlation function and the desirable changes in the function for the properties P2 though P5 were given in Table 18. Table 19 contained the detailed results of our experiment designed to test correlation functions and the five desirable properties of correlation functions. If a given function satisfies a particular property, it receives an “o” notation, and “×” if the function does not satisfy the property. Some properties could have 2 conditions (AB or AB) depending on which variable was greater.

Twenty correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Conv, Inte, Piat, Certa, Added, Coll, Klos, MI, EMI, AEMI4, dMI, and AEMI3 – satisfied old properties. Nineteen correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Piat, Coll, Jacc, Klos, EMI, AEMI4, dMAX, dMI, AEMI3, dMIN, and MuConf – satisfied new properties. Fifteen correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Piat, Coll, Klos, EMI, AEMI4, dMI, and AEMI3 – satisfied all five desirable properties.

Pages:     | 1 |   ...   | 6 | 7 || 9 | 10 |   ...   | 19 |

Similar works:

«JETS 57/4 (2014) 703–13 RECONSIDERING THE ROLE OF DECEPTION IN SOLOMON’S ASCENT TO THE THRONE MATTHEW NEWKIRK* I. INTRODUCTION 1. Defining deception. The study of deception in the Bible has become increasingly popular in recent years.1 Simply defined, “to deceive” means intentionally to cause someone to believe something one knows to be false. Deception is thus distinct from, though related to, lying. “To lie” means to communicate intentionally and explicitly that something is true...»

«NORTHWESTERN UNIVERSITY An Integrated Architecture for Engineering Problem Solving A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS for the degree DOCTOR OF PHILOSOPHY Field of Computer Science By Yusuf Pisan EVANSTON, ILLINOIS December 1998 © Copyright by Yusuf Pisan 1998 All Rights Reserved ii ABSTRACT An Integrated Architecture for Engineering Problem Solving Yusuf Pisan Problem solving is an essential function of human cognition. To build...»

«UNIVERSITY OF CALIFORNIA, SAN DIEGO Gravity changes associated with underground injection of CO2 at the Sleipner storage reservoir in the North Sea, and other marine geodetic studies. A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Earth Sciences By Scott L. Nooner Committee in charge: Mark A. Zumberge, Chair Donna K. Blackman Ahmed-Waeil Elgamal Robert L. Parker Glenn S. Sasagawa Fred N. Spiess 2005 Copyright Scott L. Nooner, 2005 All...»

«International Journal of Humanities and Social Science Invention ISSN (Online): 2319 – 7722, ISSN (Print): 2319 – 7714 www.ijhssi.org ||Volume 4 Issue 10 || October. 2015 || PP.40-43 The Style of Gita Govinda Recital and Odissi Music Dheeraj Kumar Mohapatra Guest Faculty, Odissi Vocal Department, Utkal University Of Culture, India Abstract : In Gita Govinda, Sri Jayadeva has embellished the philosophy, metaphysics, ontology and mysticism in erotic words, melodious versifications, ardent...»

«Investigaciones Fenomenológicas, vol. Monográfico 4/I (2013): Razón y vida, 351-360. e-ISSN: 1885-1088 THE MELODY OF LIFE. MERLEAU-PONTY, READER OF JACOB VON UEXKÜLL LA MELODÍA DE LA VIDA. MERLEAU-PONTY, LECTOR DE JACOB VON UEXKÜLL Luís António Umbelino Associação Portuguesa de Filosofia Fenomenológica (APFFEN)/ Universidade de Coimbra, Portugal luis.um@megamail.pt Abstract: This paper aims to meditate on the Resumen: Este artículo trata de meditar acerca importance of Jacob von...»

«Marginal Freedoms: Journalism, Participation and Moral Multiplicity in Odisha, India by Katherine B. Martineau A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Anthropology) in the University of Michigan Doctoral Committee: Professor Webb Keane, Chair Professor Emeritus Thomas Trautmann Associate Professor Matthew Hull Associate Professor Aswin Punathambekar © Katherine B. Martineau 2014 Dedication This dissertation is dedicated to my...»


«The Representation of Animals and the Natural World in Late-Medieval Hagiography and Romance Thesis submitted for the degree of Doctor of Philosophy at the University of Leicester by David Salter BA (Leicester) Department of English University of Leicester September 1998 UMI Number: U113229 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and...»

«FRANCES YATES SELECTED WORKS FRANCES YATES Selected Works VOLUME I The valois Tapestries VOLUME II Giordano Bruno and the Hermetic Tradition VOLUME III The Art of Memory VOLUME IV The Rosicrucian Enlightenment VOLUME V Astraea VOLUME VI Shakespeare's Last Plays VOLUME VII The Occult Philosophy in the Elizabethan Age VOLUME VIII Lull and Bruno VOLUME IX Renaissance and Riform: The Italian Contribution VOLUME X Ideas and Ideals in the North European Renaissance FRANCES YATES Selected Works Volume...»

«ABSTRACT SENSING SMALL CHANGES Title of dissertation: IN A WAVE CHAOTIC SCATTERING SYSTEM AND ENHANCING WAVE FOCUSING USING TIME REVERSAL MIRRORS Biniyam Tesfaye Taddese, Doctor of Philosophy, 2012 Dissertation directed by: Professor Steven M. Anlage Department of Electrical and Computer Engineering Wave-based motion sensors, such as radar and sonar, are designed to detect objects within a direct line-of-sight of the sensor. The presence of multiple reflections from surrounding objects usually...»

«ABSTRACT DESIGN AND PERFORMANCE PREDICTION Title of dissertation: OF SWASHPLATELESS HELICOPTER ROTOR WITH TRAILING EDGE FLAPS AND TABS Jaye Falls, Doctor of Philosophy, 2010 Professor Inderjit Chopra Dissertation directed by: Department of Aerospace Engineering This work studies the design of trailing edge controls for swashplateless helicopter primary control, and examines the impact of those controls on the performance of the rotor. The objective is to develop a comprehensive aeroelastic...»

«ABSTRACT Title of dissertation: MODELING OF SEASONAL TRACE GAS AND PARTICULATE EMISSIONS FROM VEGETATION FIRES IN SOUTHERN AFRICA Stefania Korontzi, Doctor of Philosophy, 2004 Dissertation directed by: Professor Christopher O. Justice Department of Geography Fire is widespread in southern African savannas with important implications for tropical and global atmospheric chemistry. However, previous regional emission studies have not fully accounted for the variability of the emissions throughout...»

<<  HOME   |    CONTACTS
2016 www.dissertation.xlibx.info - Dissertations, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.