«Learning Implicit User Interest Hierarchy for Web Personalization by Hyoung-rae Kim A dissertation submitted to Florida Institute of Technology in ...»
We concluded that VPF with F25 found a statistically significantly greater number of meaningful phrases than Ahonen’s previous method. We suspect the filtering stage of Ahonen’s algorithm filtered many meaningful phrases out or their weighting scheme using the length of a phrase and tightness (Ahonen et al., 1998) distracted the correlation value of a phrase. Interestingly, the correlation functions F25 and F28 were both included in the top 10 in both exact match and simple match. This result indicates the correlation functions F25 and F28 had higher matching rates than the other correlation functions. These two correlation functions both satisfied desirable properties for phrases. We can also improve Ahonen’s algorithm by incorporating correlation functions F10, F11, F12, F17, F26, F6, and F20. Those functions resulted in a higher match of average scores for both exact match and simple match experiments.
The performance of our method varied depending on the articles selected. We currently do not understand the reason for the variance in performance over different articles. We assume it is due to the intrinsic characteristics of an article, because the human subjects’ results are also different depending on the articles. Phrases in VPF grow backwards; however, in the future we will devise an algorithm that grows both forwards
Before selecting a correlation function, it is important to check whether a correlation function satisfies basic desirable properties of a correlation function. Likewise knowing the characteristics of a correlation function is important. We propose two new desirable properties that a correlation function should satisfy. These new properties will help understand the general characteristics of correlation functions and help to choose or devise a right correlation function for an application domain. Correlation functions can be used for finding phrases, building profile, and devising scoring functions in our system as shown in Figure 14. We tested with 32 correlation functions to see which one satisfies
Figure 14. Desirable properties in the framework of web personalization
4.1. Desirable Properties of a Correlation Function In this section, we describe several key properties of a correlation function.
Correlation functions differ in their ability to capture the dependencies between variables, because each correlation function has its own bias in preferring a set of diagrams to another. The dependencies between variables can be described in a Venn diagram as shown in Figure 15 – A, B, A∩B, and A∪B.
Piatetsky-Shapiro (1991) proposed three key properties that a good correlation function, F,
P1: if A and B are statistically independent, then F is 0;
P2: F monotonically increases with P(A,B) when P(A) and P(B) remain the same;
P3: if P(A) (or P(B)) increases when the rest of the parameters, P(A,B) and P(B) (or P(A)), remain unchanged, then F monotonically decreases.
Tan et al. (2002) illustrated those three properties and the extent to which each of the existing measure satisfies the properties. Some of these properties have been extensively investigated in the data mining literature (Hilderman and Hamilton, 2001; Tan and Kumar, 2000).
4.1.1. Enhancing Property 1 The detection of statistical independence is a desirable property – PiatetskyShapiro (1991) specified this as his first desirable property (P1). However, should the correlation function return only 0?
When two events, A and B, are statistically independent, some functions return 0, but some functions return a different value such as 1. Each column in Table 19 is for each correlation function and we used “o” mark as satisfaction of P1 in the P1 row. The “value” row signifies the return value. For instance, the test functions Odds, Conv, Inte, and Coll in Appendix 1 return 1 when two events are statistically independent as shown in Table 19.
Therefore, we argue that the definition of P1 is too strict. We revise the definition of P1 to
P1: F can distinguish statistical independence of A and B.
The revised definition of P1 is used for the duration of this Chapter.
4.1.2. Additional Desirable Properties We propose and investigate additional properties desirable for a correlation function. When the universe is fixed, three events – A, B, and A∩B – can enumerate all possible cases. We list seven possible cases in Table 15. The notation “↑” indicates increase, “↓”indicates decrease, and “-”indicates no change in the value of a correlation function. We add two other possible cases – two variables increase at the same ratio (P4 and P5 in Table 15).
Case 1 has variable A fixed, B fixed, and A∩B increasing. Piatetsky-Shapiro (1991) refers to this case as P2, so we follow his designation. We show the proper change of a correlation function in the “Proper change” column. Other cases can be interpreted in the same way. The proper change of case 1 is increasing (Piatetsky-Shapiro, 1991). Case 2 and 3 have the same property name (P3) by similarity – variable A and B could be alternated with each other. The proper change of P3 is decreasing (Piatetsky-Shapiro, 1991).
Case 4 appears to be the same as the inverse of P2. But not all correlation functions return the same results (e.g., Supp in Appendix 1 increases for case 1 but no change for case 4). The value of a correlation function should decrease as the value of A∪B increases
A∩ B decreases when A∪B=A+B-A∩B. Therefore, property 4 (P4) states that if P(A,B) A∪ B increases when P(A) and P(B) remain unchanged, then F monotonically decreases.
An example of case 4 is shown in Table 16. The union is 100. Variable A and B increase with the same incremental ratio; A∩B remains unchanged. Intuition suggests as A and B increase, the value of a correlation function is expected to decrease.
Case 5 and 6 have the same property name (P5) due to their similarity. The proper change in a correlation value for P5 is monotonically increasing. In the formula of
A) and A∩B increase together at the same ratio, the whole value should increase. The P5 states that if P(A) (or P(B)) and P(A,B) increase at the same ratio, then F monotonically increases.
We illustrate cases 5 and 6 with an example below in Table 17. Since both cases are the same, we will only use case 5. Cases can have two possible sub-scenarios: A is smaller than B, or A is bigger than B. The union is 100. Variable A and A∩B increase with the same incremental ratio; B remains unchanged. In the case of AB, the value of correlation function is expected to increase, because the ratio between B and A∩B increases, while the ratio between A and A∩B remains the same. In the case of AB, A seemingly subsumes B gradually, because B remains the same while A∩B grows.
Consider the cases where A1=5.0, A2=10.0, B=10.0, A1∩B=3.0, A2∩B=6.0 as in Table 17. The correlations of the two cases (F(A1,B,A1∩B) and F(A2,B,A2∩B)) may have different values. The log CPR (Rosenfeld, 1994) values for the two cases are 4.2 and 5.0 respectively, which assess the significance of the correlation between A and B. However, for example Inte in Appendix 1 satisfies all previous 3 desirable properties (P1 – P3); but, it returns the same value, 6, for both cases (case of (A1,B,A1∩B) and case of (A2,B,A2∩B)).
Our P5 is to verify whether a correlation function can tell the difference between the two cases. This determines that the P5 is critical for measuring the characteristics of correlation functions.
Case 7 is defined as the values of A, B, and A∩B increase monotonically together at the same ratio. It is difficult to define what is a desirable change of the correlation function for this case. This property is somewhat related to the frequency. For example, too frequent and too rare words are usually removed in some researches (Croft et al., 1991;
Zamir and Etzioni, 1998). Due to the uncertainty inherent in case 7, we are unable to determine any proper changes to this property.
4.2. Experiments 4.2.1. Experimental Data and Procedures If all those correlation functions that satisfy our new 2 desirable properties also satisfy the previous 3 desirable properties, our proposed properties would not be a substantial improvement in characterizing correlation functions. Therefore, we collected 32 correlation functions to see which correlation functions only satisfy the previous 3 desirable properties. A complete listing of the correlation functions examined for further properties in this Chapter was given in Appendix 1. In order to verify whether a correlation function satisfies desirable properties, we systematically generated 923 different test cases (combinations of A, B, and A∩B). For P2 we generated 76 test cases, P3 had 399 test cases, P4 had 35 test cases, and P5 had 413 test cases. Generating test cases P3 and P5 were more complicated, because both had two different sub cases. We wrote our program with Excel macro and ran on Intel Pentium 4 CPU. The detailed test cases and source code are available on the web: http://cs.fit.edu/~hkim/dissertation/dissertation.htm.
We added dMAX, AEMI3, dMIN, and dMIN2 correlation functions to an experiment. The dMAX and the dMIN were cooperating with the Supp. The dMIN2 emphasized the original MIN value. The AEMI3 function was similar to AEMI4 except that AEMI3 did not include the negative relation part, A′B′ × log(A′B′/A′×B′). For example, the two matrixes in Figure 16 had different value assignments – basically the positive and negative intersection values were exchanged. An AEMI4 weighed two tables the same, even though Matrix A had a value of 5 for A∩B and Matrix B had a value of 75 for A∩B.
However, in some application domains the positive relation, A∩B× log(A∩B/(A×B)) is more important than the negative relation, which prefers Matrix B to Matrix A. Therefore, AEMI3 removed the negative relation part from AEMI4.
4.2.2. Evaluation Criteria In addition to the proposed properties (P1-P5), another important property was the ability to distinguish between positive and negative correlations (P6). Tan et al. (2002) described this property in detail. Positive correlation is more important than negative correlation in some application domains. Therefore, we also checked correlation functions whether they do or do not satisfy P6.
Statistical independence (P1) can be measured by the determinant operator, Det(A,B) = A∩B×A′∩B′ − A∩B′×A′∩B. Thus, a singular Venn diagram is independent when whose determinant is equal to zero (Tan et al., 2002). Measuring their cross product ratio (CPR) can assess the significance of the correlation between A and B (Rosenfeld,
1994) and is defined as:
Negative correlation function would have a negative log CPR value. The definition of P6 was that F can distinguish between positive and negative correlations of A and B. We also tested whether the result of each correlation function was normalized or not. P7 is that F returns normalized results. The total properties that we summarized are listed in Table 18.
Normalized functions have return values within certain boundaries such as between 0 and 1.
We measured mainly whether a correlation function shows consistent increasing pattern or decreasing pattern under a condition for each property. The changes with negative relations are disregarded. Since we can detect positive and negative correlations it would be more practical to check over positive correlations.
4.3. Results and Analysis The purpose of this experiment is to test whether those correlation functions that satisfy our new 2 desirable (new properties) properties also satisfy 3 established desirable properties (old properties). If there are correlation functions that satisfy only one of either the previous 3 properties or our 2 new properties, then the experiment provides evidence that our new properties will be useful for distinguishing properties of correlation functions.
This analysis based upon 3 categories. First, we compared old properties and new properties based upon the number of correlation functions that satisfies those properties.
Second, old properties and new properties were compared based upon the number of functions that satisfied the two categories of properties upon the satisfaction of P1. Third was regarding the satisfaction of P6. After that, we summarized whether correlation functions return normalized results.
4.3.1. Comparing Properties: Old verses New We compared how many correlation functions satisfied old properties and how many satisfied new properties. Each property of a correlation function and the desirable changes in the function for the properties P2 though P5 were given in Table 18. Table 19 contained the detailed results of our experiment designed to test correlation functions and the five desirable properties of correlation functions. If a given function satisfies a particular property, it receives an “o” notation, and “×” if the function does not satisfy the property. Some properties could have 2 conditions (AB or AB) depending on which variable was greater.
Twenty correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Conv, Inte, Piat, Certa, Added, Coll, Klos, MI, EMI, AEMI4, dMI, and AEMI3 – satisfied old properties. Nineteen correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Piat, Coll, Jacc, Klos, EMI, AEMI4, dMAX, dMI, AEMI3, dMIN, and MuConf – satisfied new properties. Fifteen correlation functions – Coef, Odds, YulQ, YulY, Kapp, Mutu, JMea, Gini, Piat, Coll, Klos, EMI, AEMI4, dMI, and AEMI3 – satisfied all five desirable properties.