8.2.2. Confidence on the Results

This section briefly discusses the extent to which the profile generated by the DHC algorithm can be trusted. For example, if the profile indicates that a user is interested in “laptop computer”, how much can we trust that result? The user profile is built from a set of web pages that are interesting to the user. As previously explained, bookmarks or web pages detected by an implicit interest indicator can serve as this set. Since the set of interesting web pages can change over time, the user profile can change as well, so profiles should be rebuilt periodically. This raises a question: are interests that appear in several consecutive periods more trustworthy than interests that occur only once? The results may also depend on how reliable the input data set is; to answer the question, we would need a way to measure the reliability of the set of interesting web pages. We cannot answer these questions at this time and leave them as future work.
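To make the question about recurring interests concrete, the following is a minimal sketch of one possible persistence measure. It is not part of the DHC algorithm or of our evaluation. It assumes that each periodic rebuild of the profile can be flattened into a set of interest labels; the function names `interest_persistence` and `longest_consecutive_run` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def interest_persistence(profiles):
    """Map each interest to the fraction of periods in which it appears.

    `profiles` is a chronologically ordered list of sets of interest labels,
    one set per rebuild period (an assumed representation).
    """
    if not profiles:
        return {}
    counts = Counter()
    for interests in profiles:
        # Count each interest at most once per period.
        counts.update(set(interests))
    n_periods = len(profiles)
    return {interest: count / n_periods for interest, count in counts.items()}


def longest_consecutive_run(profiles, interest):
    """Return the longest run of consecutive periods containing `interest`."""
    best = 0
    run = 0
    for interests in profiles:
        if interest in interests:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


if __name__ == "__main__":
    # Hypothetical labels for three consecutive periods (illustration only).
    example = [
        {"laptop computer", "hiking"},
        {"laptop computer"},
        {"laptop computer", "cooking"},
    ]
    print(interest_persistence(example))
    print(longest_consecutive_run(example, "laptop computer"))
```

Under this sketch, an interest that recurs in every period receives a persistence of 1.0, while one that appears in a single period receives a lower value. Whether such a score actually reflects confidence in an interest remains an open question, and it does not address the reliability of the input set itself.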

8.3. Limitation and Future Work

In our system there are several limitations.

• We did not analyze differences among the UIHs obtained from various users because of the large number of web pages used in our experiments.

• The performance of the DHC algorithm varied depending on the articles selected. We believe this is because of the intrinsic characteristics of a document.

• The performance of VPF varied depending on the articles selected. We do not yet understand the reason for this variance. We assume it is due to the intrinsic characteristics of an article, because the human subjects’ results also differed across articles.

• Our experiment on the desirable properties of a correlation function was limited to positive correlations, as used in our web personalization, since many applications depend on positive correlation. We will extend our analysis to negative correlations as well.

• The improvement of WS was not statistically significant because the precision values of Google had large variance.

• The low performance for some search terms might arise because there is no relation between the user’s bookmarks and the search terms. We may be able to alleviate this problem by incorporating interesting web pages identified by implicit interest indicators.

• Our approach of penalizing index pages did not yield much improvement in our initial experiments. We will examine this approach further in the future.

• Since WS showed higher performance than Google for links ranked below the top 5, we expect that our method may achieve higher performance with clustered search engines.

• A longer evaluation would give more accurate results for the LookAtIt indicator, since users would act more naturally after more than 1 or 2 hours of surfing.

• We can combine this indicator with an application for personalized web search results in the future. The interesting web pages collected for a user can be used to build a user interest hierarchy.

