FREE ELECTRONIC LIBRARY - Dissertations, online materials

Pages:     | 1 |   ...   | 7 | 8 || 10 | 11 |   ...   | 12 |

«by MICHAŁ WRÓBLEWSKI Supervisor JERZY STEFANOWSKI, Assistant Professor Referee ROBERT SUSMAGA, Assistant Professor MASTER THESIS Submitted in ...»

-- [ Page 9 ] --

• Java language itself enforces object-oriented-programming and good software practices.

• Java Virtual Machine performs runtime checks for common and often difficult-totrack errors such as null pointers, array indices running out of bounds, etc.

• Many libraries supporting Web technologies (particularly XML) written in that language.

A number of utilities for the Carrot2 system (component frameworks etc.) was already • implemented in this language.

Serious drawback of Java is unfortunately speed of execution of code written in this language (much lower than in case of natively-compiled code, written for instance in C++).

However, as we have stated before in Chapter 5-1-1, due to character of the system, speed was not the most important criterion in the choice of solutions. Efficiency issues are discussed in details further in the next subchapter.

5.3.2. EFFICIENCY ISSUES Unfortunately, speed of execution of our AHC module is quite slow. Performed tests showed that for more than just 200 snippets system's time of response exceeds several seconds, making it too slow for more impatient users. Exact results of these tests, with values for individual parts of modules are given below in Figure 5-3-2-1 (please have in mind fact

that these are just times of the AHC module, not the whole Carrot2 system):

–  –  –

It is easy to see that the most laborious element of the module is implementation of the AHC algorithm (only for small input data sets, containing less than 200 snippets extraction of terms and phrases lasts longer than this task), due to its computational complexity. Major goal of implementation of this algorithm in the Carrot2 system was to make it as simple and as flexible as possible. We have created a general framework, which enables easy replacing of stop conditions and linkage methods. Unfortunately, chosen solution has also the biggest complexity possible for this algorithm – O(n3) (see Chapter 4-2-1), being definitely too slow to meet Zamir and Etzioni's key requirements for Web documents clustering methods (presented in Chapter 2-3).

Another element that takes up bigger and bigger part of the whole running time with increase of the size of data is calculation of inter-document similarities. Usually it is said that this process takes O(n2) time for n documents [Salton 89]. This formula doesn't take into

- 65 account the fact that in case of all commonly used similarity measures time needed for their calculation also depends linearly on the number of terms in the input document set. So exact computational complexity of calculating document similarities is O(n2·m), where m is the number of terms, which increases with the number of documents (a fact not concerned by most of the authors). However, this increase is slower than linear (there is no exact formula – at least we haven't found any), so overall complexity of this process is between O(n2) and O(n3), being just a little bit smaller than that of the AHC algorithm.

- 66 EVALUATION OF THE AHC ALGORITHM Purpose of this chapter is to present an assessment of our version of the AHC algorithm. In order to achieve this goal, we have carried out several experiments, whose results are also presented. In this part of the thesis we refer often to the former chapters, showing influence of described there techniques on the quality of the algorithm's results. We have also performed a comparison between our implementation of AHC and several other important clustering algorithms.

Our experiments may be distinguished into two classes on the ground of the way of results


• user evaluation (presented in Chapter 6-2)

• expert evaluation (presented in Chapters 6-3 and 6-4) Major goal of the first type of experiment was to determine the quality of results given by the system from the common user's point of view. In order to achieve this goal, we have performed a survey on a group of several users, asking them to judge simple aspects of the algorithm's results. We have also created few simple metrics (presented in Chapter 6-1-2 and Chapter 6-1-3) that give exact, numerical values based on these users assessments.

On the other hand, experiments of the second kind consisted of comparing the results of different algorithms (or one algorithm for different parameters / input data) by a single person (expert) in order to find differences between them. In this part of evaluation we have concentrated our attention rather on finding and describing the properties of the algorithm's results than on attempt to calculate some exact metrics. Therefore, this part of the evaluation experiment was more research- than user-oriented, trying to find reasons behind the final results, not to judge them definitely.


It turns out that evaluation of clustering results (especially in case of hierarchical algorithms) is quite a difficult task. It is particularly hard to create objective, numerical

measures of their quality. Here we point out a few most important reasons of this fact:

• subjectivity of evaluation The primary criterion in clustering evaluation should be usefulness of the given results.

However, for such a definition of quality it is hard to find objective measures. Different end users may have different preferences (e.g. some may be interested in small, precisely described clusters while others prefer large, more general groups), therefore there may

- 67 exist many equivalent ways of partitioning the input documents set. This is a common problem in the Web Mining, as this domain is very user-oriented.

It turns out that there are at least two benefits of clustering – enabling the user to find faster the relevant information and "search refinement", consisting in creating a good partitioning of the input documents set, enabling to find different subtopics in the query results. The used similarity measures should be different depending on which application is more important.

• multi-criteria nature of the problem Apart from major goal of our evaluation, the quality of results is influenced by a number of factors such, as conciseness of created clusters or eligibility of their descriptions, and it is hard to aggregate them all to one factor. This issue will be discussed also in Chapter 6-1-2.

• difficulty in comparing results given by different algorithms As we have mentioned before in Chapter 4-1, results of different clustering algorithms may have different properties (i.e. be hierarchical or flat, overlapping or not). It is difficult to create quality measures that could be equally well applied to different algorithms and therefore fairly compare their results.

6.1.1. EXISTING WEB MINING METHODS Problem of the results evaluation is quite often addressed in the Web Mining literature.

During our studies of publications from this domain we have come upon several different approaches and in this subchapter we make a brief overview of them, pointing out both their strong and weak points.

• standard Information Retrieval metrics Most commonly used quality measures in the Information Retrieval domain are precision and recall [Salton 89]. These two metrics were originally defined for search

results evaluation problem as following:

–  –  –

- 68 Both metrics are different and opposing quality criterions, as usually the more precise the search results are, the less (also relevant) documents we get. However, these quality measures in their basic version don't satisfy our needs stated before.

• merge-then-cluster approach Technique that seems to be a natural approach to our problem is performing clustering on a specially prepared set of documents. This input set is created by collecting documents concerning different topics. Then the results of clustering algorithms performed on this set are compared with its original partition [Zamir 99]. The most important advantage of this approach is ability of automatic repeating of multiple tests of different methods (or one method with different parameters) after establishment of the input set.

This approach requires the construction of the input documents set to be done very carefully. In particular, one cluster appearing in results of some algorithm may be split by other into several more specific ones, and decision which partition is correct is often arbitrary. One may also criticize the "artificial" way of creating the test documents set by merging documents concerning different topics instead of using real data (ranked lists), in which case finding coherent clusters may be more difficult.

• ground truth hypothesis Another, but somehow similar approach to this problem is using a quality measure based on entropy ([Maarek et al. 00], [Stefanowski and Weiss 03]). In this approach, the deviation of the algorithm's results from ground truth set, representing some "ideal" partition is calculated. Constructing the ground truth set may be performed for instance manually by experts who try to find clusters in the input data set [Weiss and Stefanowski 03]. The deviation is calculated basing on a information theory measure presented in [Dom 2001]. This method is similar to merge-than-cluster as the subject of evaluation in both approaches is difference between some reference set and algorithm's results.

It also enables us to quickly perform many different tests after creating reference clustering results (the ground truth). Its main disadvantage is that used quality measure is defined only for flat partitioning (so neither hierarchical nor overlapping results may be correctly evaluated using this method).

• user evaluation There are many evaluation methods based on the examination of user's behavior. These methods may either make use of direct feedback from users (e.g. some kind of survey), or attempt to analyze automatically their reaction to the system's response for instance by logs tracing. Very interesting and detailed example of user's evaluation is presented in [Zamir and Etzioni 99], where authors of this paper present application of user behavior analysis to evaluation of the Grouper system.

This approach seems to be very promising as the quality of results is decided by the users themselves. However, there are several problems associated with performing of this type of evaluation such as different level of users experience or finding a statistically

- 69 significant number of experiment participants [Weiss 01]. Moreover, such experiments take quite a lot of time and cannot be easily, automatically repeated.


During the work on the Carrot2 system we have decided to make use of the last approach presented in the former subchapter and base system evaluation on user survey. The same method was also used to evaluate results of the LINGO algorithm (see [Osiński 03]), also a part of the Carrot2 system.

The survey has been created taking into account key requirements for Web search results clustering, such as browsable summaries of the clusters or relevance (refer to Chapter 2-3 for more details). Users taking part in the evaluation were asked to answer the following


• usefulness of created clusters For each of the created groups (except the "Other Topics" group), the users were asked to assess its usefulness, defined as user's ability to deduce topics of documents contained in it on the basis of its description. It is easy to see that the most important factor influencing evaluation of this criterion is the algorithm's ability to create meaningful, accurate cluster labels.

We have to add here that not only the clusters corresponding to topics, in which the user is interested are useful. Provided their descriptions are concise and accurate, also clusters "unexpected" in the context of the query that contain documents irrelevant to the user's interests are also helpful, as they enable him to quickly separate these documents from the interesting ones. This is exactly the way, in which clustering improves the search results.

In case of hierarchy also usefulness of subclusters is evaluated, independently of their parent clusters, as top-level clusters that are too general and therefore meaningless may be split into more detailed, useful ones (we continue this topic in the next subchapter).

Scale: useful cluster | useless cluster

• precision of assignment of documents to their respective clusters For each document assignment to a group, the users were asked to judge if this assignment is correct (i.e. whether its snippet fits this group's description). Here we have decided to use a more diverse than binary scale, as documents may fit to some degree in their clusters (e.g. be more or less specific than respective clusters, but still somehow related to their descriptions).

Precision of assignment is assessed only if the cluster to which the document has been assigned has been judged as useful. There are two rationales behind this decision. First of all, if group's description is nonsense, it is also nonsense to judge if a document fits it.

Secondly, if user cannot understand the cluster's title, he will probably discard it and therefore he won't browse documents contained in it. Moreover, the "Other Topics" group with its documents is also excluded from this process.

- 70 Here we have to make an important remark as in case of both overlapping and hierarchical results one document may be contained in many clusters. Therefore its assignment to all the clusters where it is contained should be independently judged.

Scale: well matching | moderately matching | not matching Having collected assessments of single groups and documents from users, we may try to define general quality measures for the whole clustering results. We have decided to use several metrics evaluating different aspects of the results due to the mentioned earlier multicriteria nature of the problem. Some of measures used in AHC evaluation were also used to assess the quality of LINGO and are presented already in [Osiński 03]. Here we will focus rather on presentation of AHC-specific solutions, to which we dedicate whole next subchapter.

After collecting of the forms from users we are able to calculate the following statistics:

Pages:     | 1 |   ...   | 7 | 8 || 10 | 11 |   ...   | 12 |

Similar works:

«Colenbrander – Aspects of Vision Loss – Visual Functions and Functional Vision ASPECTS OF VISION LOSS – Visual Functions and Functional Vision August Colenbrander, MD – San Francisco This paper is based on a draft for the paper that appeared in Volume 5, Issue 3 of “Visual Impairment Reasearch”, the journal of the International Society for Low Vision Research. That issue is devoted to papers presented at the Conference on “Functional Vision – Assessment and Outcomes”, held in...»

«Welcoming Children with Special Needs A Guidebook for Faith Communities Sally Patton UNITARIAN UNIVERSALIST ASSOCIATION BOSTON Copyright © 2004 by the Unitarian Universalist Association of Congregations. All rights reserved. Cover design by Kathryn Sky-Peck Text design by Communicáto, Ltd. ISBN 1-55896-479-7 978-1-55896-479-2 10 9 8 7 6 5 4 3 2 11 10 09 08 We are grateful for permission to reprint the following: Page xiii: “We Are,” Words and music by Dr. Ysaye M. Barnwell, Barnwell’s...»

«These Guidelines were originally developed for Mercy Hospice Auckland (formerly St Joseph’s Mercy Hospice), New Zealand, but demand from other palliative care providers and a substantial grant from the Genesis Oncology Trust (www. genesisoncology.org.nz) has enabled them to be produced in this convenient and easy to read book. The Guidelines have been independently reviewed to ensure the information presented within them is accurate and up-to-date. This review was undertaken by Jenny Phillips...»

«SEGMENTAL EFFECTS ON SYLLABLE SELECTION: EVIDENCE FROM HEBREW GALIT ADAM AND OUTI BAT-EL 1. Introduction The selection of the syllable from the target word is governed by two prosodic constraints: PROMINENCE for the selection of the stressed syllable, and POSITION, for the selection of the final syllable. As shown in Ben-David (2001) and Adam (2002), the preference is evident during the stage at which the children produce monosyllabic forms for targets with final stress (e.g. ka for kadúr...»

«Bollettino della Comunità Scientifica in Australasia Ambasciata d’Italia Dicembre 2008 CANBERRA Snoring: Obstructive sleep apnea syndrome Daniele Francioli, Grazia Ruggiero Introduction Obstructive Sleep Apnea Syndrome (OSAS) is a disorder with an increasing clinical relevance, characterized by recurrence of sleep apnea obstructive episodes. As early as 1889, William Hill recognised OSAS symptoms and considered it a real disease. (4) Bertini (1) says that the main OSAS symptoms are: snoring...»

«1 SORCERER KILLINGS IN 1998 EAST JAVA: AN ANALYSIS OF PRESS AND ACADEMIC REPORTS Nicholas Herriman Monash Asia Institute nicholasherriman@hotmail.com In 1998, large numbers of alleged sorcerers were killed in East Java, Indonesia.1 With around 100 fatalities peaking in September, Banyuwangi regency seems to have experienced the most killings. Hundreds of articles were published about this ‘outbreak’ in East Java’s two main newspapers—Surabaya Post and Jawa Pos. Most articles maintained...»

«1_40 Conference Program 2016.qxp_Layout 1 14/08/16 10:52 Pagina 1 8th International Conference on the Tear Film & Ocular Surface: Basic Science and Clinical Relevance Conference Program & Abstract Book Montpellier, France Title Sponsor: September 7-10, 2016 1 Tear Film & Ocular Surface Society 1_40 Conference Program 2016.qxp_Layout 1 14/08/16 10:52 Pagina 2 TFOS Conference Director David A. Sullivan (USA) TFOS Conference Organization & Management Amy Gallant Sullivan (USA) TFOS Conference...»

«Easter Island Statue Project Conservation Initiative The Explorers Club Dr. Jo Anne Van Tilburg UCLA, Project Director Cristián Arévalo Pakarati Rapa Nui, Project Co-Director Explorers Club Flag #50 Report Easter Island Statue Project Conservation Initiative Field Season IV July 7 to August 3, 2011 Jo Anne Van Tilburg, Ph.D Location Rano Raraku Quarry Rapa Nui National Park Easter Island (Rapa Nui), Chile Latitude 27.1247222 Longitude 109.2886111 Expedition Leaders Jo Anne Van Tilburg, Ph.D....»


«COVENANT THEOLOGICAL SEMINARY ONLINE RESOURCES A Heaven’s Eye View of Christmas 1 www.covenantseminary.edu/resource Spreading the Gospel With Confidence: Paul Among the Pagans By Jerram Barrs Professor of Christian Studies and Contemporary Culture, Covenant Theological Seminary Resident Scholar, Francis Schaeffer Institute Even though the Apostle Paul communicated the Christian message about nineteen and a half centuries ago, the world he spoke to dealt with many of the same issues we face...»

«Malmö högskola Lärarutbildningen Lärarkunskap Examensarbete 10 poäng Hårklyverier om betygsättning En studie om betygsättning av praktiska moment inom den gymnasiala frisörutbildningen Hair splitting while marking A study concerning marking of practical issues within the hairdresser education at upper secondary level. Författare Barbro Dahlin-Volsing Tina Rydén Kurs 6 Examensarbete 51-60 poäng Handledare: Björn Lundgren Lärarutbildning 60p Slutseminariet 2007-10-15 Examinator:...»

«Originalveroffentlichung in: B. Le Guen (ed.), De la scene aux gradins. Theatre et representations dramatiques apres Alexandre le Grand dans les cites hellenistiques. Actes du Colloque, Toulouse 1997 (Pallas, 41), Toulouse 1997, 219-259. De la scene aux gradins (ed. B. Le Guen), PALLAS, 47, 1997, pp. 219-259. Theatricality Beyond the Theater. Staging Public Life in the Hellenistic World* Angelos C H A N I O T I S (New York University) ecopaiccbs TrdvTa TO: EKET TrpdyuaTct T p a y c p S l a v...»

<<  HOME   |    CONTACTS
2016 www.dissertation.xlibx.info - Dissertations, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.