Chapter 6: Classification    Last revised: September 17, 2010 

Chapter 6. Classification

Chapter author: Jess Hemerly


Table of Contents 


6.1 Overview 

6.2 Classification Theory 

6.2.1 What is Classification? 

6.2.2 Purpose of a Classification 

6.2.3 Classification is Principled 

6.2.4 Spectrum of Classification 

6.3 Faceted Classification 

6.3.1 What are Facets? 

6.3.2 Faceted Classification as a Controlled Vocabulary 

6.3.3 Facets in Information Retrieval 

6.3.4 Designing a Faceted Classification for the Web 

6.4 Social/Distributed Classification 

6.4.1 What is Tagging? 

6.4.2 Folksonomy versus Tagsonomy 

6.4.3 Tagsonomies and Personal Information Management 

6.5 Computational Classification 

6.5.1 What is Computational Classification? 

6.5.2 Machine Learning 

6.5.3 Clustering 

6.5.4 Discriminant Approaches 


6.1 Overview

Imagine how kitchen items are organized in a brick-and-mortar department store (think WalMart or Macy’s) that sells a variety of goods, from clothing to furniture. Within the store, the kitchen goods will be grouped together in a few aisles or on a single floor. Signs above the aisles or on the department store directory serve as descriptions pointing you to the section of the store that contains the items fitting that description. Within the kitchen area, you may see blenders grouped together on one shelf or a section of shelves; wooden spoons, tongs, and spatulas arranged by type and hung neatly; and rows of dishes unpacked and laid out in place settings to help you imagine how the different styles might look on your kitchen table. In this scenario, the department store comprises any number of kinds of items, grouped together in a location; the kitchen section comprises only kitchen supplies; and each shelf or area comprises a specific kind of kitchen item.

Next, imagine you’re shopping for kitchen items in a specialty kitchen store, like Williams-Sonoma or a wholesale kitchen supply store. As you walk in the door, you immediately find that the store sells one grouping of items: things to be used in your kitchen. Because this store is devoted only to this one type, or class, of items, the arrangement is somewhat different.

The specialization allows more variety within each class, expanding classes across aisles and displays. Instead of one or a few aisles devoted to kitchen items, you find one or a few aisles devoted to utensils, to blenders, to coffee makers, to knives, etc. Items within classes may be grouped by brand, size, or price, with labels on the shelf describing these specific attributes.

Because the contents of the store come from a narrow or specialized class, the selection and its organization differ from the organization in a department store. Here, the set of instances is more refined, catering to a specific clientele looking for a more specialized selection of goods.

Now, think about how you might shop for kitchen items online at either a department store or a specialty store’s website. Online, there are many different ways to locate items. You can enter a query and search for a generic term, like “knife,” a more specific term like “paring knife,” or a very specific term, like “Wüsthof Classic 9cm Paring Knife.” On many websites, search results will display a list of related terms or descriptors somewhere on the page, especially if you have entered a generic term. These descriptors are usually things like price, brand, and type, allowing you to browse and narrow down your results based on different characteristics you desire in the item. Maybe you didn’t know you wanted a Wüsthof Classic 9cm Paring Knife, but by narrowing down your search results using the characteristics, called facets, you end up discovering that this is just the knife you need. If you entered a very specific term, your item may not be available, but the system may suggest similar items that you might find just as desirable based on your terms and objects that have been assigned to categories sharing certain characteristics. The system takes your query and matches it to items available in the database to come up with recommendations that might work for you.
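The narrowing process described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular store's website works: the catalog entries, the facet names (`type`, `brand`, `price_band`), and the helper functions `search`, `facet_counts`, and `narrow` are all hypothetical, and the price bands are invented.

```python
from collections import Counter

# Hypothetical catalog: every entry and facet value is invented for illustration.
CATALOG = [
    {"name": "Classic 9cm Paring Knife", "type": "paring knife",
     "brand": "Wüsthof", "price_band": "high"},
    {"name": "Everyday Paring Knife", "type": "paring knife",
     "brand": "Brand A", "price_band": "low"},
    {"name": "Chef's Knife 20cm", "type": "chef's knife",
     "brand": "Brand A", "price_band": "mid"},
    {"name": "Wooden Spoon", "type": "utensil",
     "brand": "Brand B", "price_band": "low"},
]


def search(items, query):
    """Return items whose name or type contains the query (case-insensitive)."""
    q = query.lower()
    return [i for i in items if q in i["name"].lower() or q in i["type"].lower()]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items)


def narrow(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [i for i in items
            if all(i.get(facet) == value for facet, value in selected.items())]


results = search(CATALOG, "knife")            # generic query
brands = facet_counts(results, "brand")       # descriptors shown beside the results
chosen = narrow(results, type="paring knife", brand="Wüsthof")
```

A generic query returns several items, the facet counts tell the shopper which values are available to choose from, and each selected facet value shrinks the result set further.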

Let’s turn to the classification of the books in a kitchen or cooking store. Perhaps they’d be organized at the topmost level by topic or type—cookbooks, equipment, and techniques.

Looking at the subclass cookbooks, there are a few ways the subclasses could be arranged. They could be organized by cultural cuisine, like French, Indian, and Chinese; by main ingredient (fish, poultry, vegetarian); or alphabetically by author or title, within the class cookbooks overall or within each of the subclasses.

But looking for books on topics related to food and cooking in a library that uses the Dewey Decimal System is another story. You might first want to look under “700 – Arts and recreation,” since cooking is referred to as “the culinary arts” and many people enjoy cooking, and eating, as a recreational activity. But you wouldn’t find it there. Instead, food-related books live under “600 – Technology” in “640 Home economics & family living.” This classification may have made sense when the system was established, but “home economics” has become a dated term, and cooking is more than an activity relegated to the home and family living space along with child rearing and sewing (historically, “women’s work”).

Now think about your own kitchen. How have you grouped items in your kitchen? Silverware is usually kept together in a single drawer, often separated by type with a silverware organizer. Pots may be stored in the same cabinet, baking items on the same shelf, and coffee next to or very near to your coffee maker. Items you use frequently may be in more accessible areas (on top in drawers, lower cabinets, and shelves) than items you use infrequently, which end up on high shelves or pushed to the back of cabinets. Containers in your freezer or fridge may be labeled, or tagged, with dates and names of items, in the same format. If you have a roommate or roommates, things could be labeled in a different format depending on who put them away: item name, item name and date, date only, etc. Or, worse, maybe you just have a collection of unlabeled mystery containers and wish you had taken the time to label things when you put them away. And maybe your kitchen isn’t organized at all, and every time you want to cook anything that involves the stove you spend some time searching for a specific item you need for the job. In a disorganized or unorganized kitchen, finding items is often difficult.

Finally, let’s turn to an example tied to technology. Let’s say you have an online collection of recipes and you want to figure out, without going through them individually, which ones are similar: for example, which of these recipes are vegetarian or contain ingredients in common? We can use computational classification to analyze these recipes, mine them for terms and combinations of terms (ingredients in this case), and cluster them based on the similarity of their term distributions. If the algorithm doesn’t start with a set of recipe categories, its approach is called “unsupervised” machine learning. In contrast, you might have created a set of recipe categories and used an algorithm that sorts recipes into them. Or you could run the document analysis and then use a service like Mechanical Turk to have people read through the sorted documents and apply metadata. Both of these latter approaches are examples of “supervised” learning.
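To make the contrast concrete, here is a minimal Python sketch under several simplifying assumptions. It compares recipes by the overlap of their ingredient sets (Jaccard similarity) rather than by full term distributions, uses a simple greedy grouping rather than any specific clustering algorithm from later in the chapter, and uses a nearest-labeled-example rule for the supervised case. The recipes, the threshold of 0.3, the category labels, and the function names are all hypothetical.

```python
def jaccard(a, b):
    """Overlap of two ingredient lists: shared ingredients / all ingredients."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(recipes, threshold=0.3):
    """Unsupervised: group recipes without any predefined categories.

    Each recipe joins the first existing cluster that holds a sufficiently
    similar recipe; otherwise it starts a new cluster.
    """
    clusters = []
    for name, ingredients in recipes.items():
        for members in clusters:
            if any(jaccard(ingredients, recipes[m]) >= threshold for m in members):
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def assign(ingredients, labeled_examples):
    """Supervised: return the category of the most similar labeled example."""
    best = max(labeled_examples, key=lambda ex: jaccard(ingredients, ex[0]))
    return best[1]


# Hypothetical recipe collection, invented for illustration.
RECIPES = {
    "tomato pasta": ["pasta", "tomato", "garlic", "olive oil"],
    "garlic pasta": ["pasta", "garlic", "olive oil", "chili"],
    "roast chicken": ["chicken", "lemon", "thyme", "butter"],
    "lemon chicken": ["chicken", "lemon", "garlic", "butter"],
}

# Hypothetical categories supplied in advance by a person.
LABELED = [
    (["pasta", "tomato", "basil"], "vegetarian"),
    (["chicken", "butter", "lemon"], "poultry"),
]

groups = cluster(RECIPES)
category = assign(["chicken", "thyme", "lemon"], LABELED)
```

In `cluster`, the groups emerge from the data alone; in `assign`, the categories exist before the algorithm runs and it only decides which one a new recipe belongs to.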

What does all this talk of kitchens have to do with classification? Everything. Each of the above examples includes concepts that we will discuss in summary here and in depth in the topics within this chapter, from search, browsing, and retrieval to rules for arrangement and tagging.

Classification is, in a sense, applied categorization, but while categories are equivalence classes (sets of material and things and processes we treat as the same), a classification is a system of categories, called classes, ordered using a predetermined set of principles. The terms “classification” and “categorization” are often used interchangeably, but they are not the same. Having a set of categories is not sufficient to create a classification. A classification must be principled so that we know where to place new items and entities in accordance with our system.

We apply principles to a set of instances or entities (concepts, objects, tasks, activities, etc.) within a domain in an attempt to sort and group the things that go together. In the kitchen examples, principles dictate how and why kitchen items are put in certain places or arranged in certain ways. But as we saw in Chapter 3, there are many different kinds of properties that can be used to describe things, and organizing principles can be based on any of them, not just those that are intrinsic or inherent. For example, credit bureaus classify borrowers by analyzing their purchasing and repayment history and assigning credit scores; insurance companies use accident and citation records to classify drivers and compute rate quotes.

The fundamental purpose of a classification is to help us make sense of relationships between concepts or objects within a domain or a set. A classification provides a reference model or a “semantic road map” to these concepts or objects within a domain, improving learning and communication. When we talk about a classification being “principled,” we use three key terms: lawful, systematic, and arbitrary. A classification is lawful because it follows a set of defined principles that determine the structure of categories and relationships; it’s systematic because the principles must be followed; and since it is designed by people or machines with a specific perspective for a specific purpose, a classification scheme is also arbitrary.

Classifications have perspectives and purposes, whether they are task-oriented or, at a higher level, serve individual, cultural, or institutional purposes. A classification may be as structured and widely used as the Dewey Decimal system in libraries, or as individual as a personal system for labeling genres in one’s music collection. Decisions must be made about what characteristics will be used to define the classes, and it is in these decisions that perspectives emerge. For example, a superordinate, or parent, class of “beer” could be divided into its first subdivision in a number of ways: color (light, dark), yeast (wild, lager, ale), style (lager, stout, porter), etc. The characteristics we choose for classification determine the shape of the ontology tree: the number of branches, sub-branches, and leaves at the endpoints. A different choice about classes and arrangements of those classes will create a very different ontology.
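The effect of choosing different characteristics can be illustrated with a small Python sketch. The four beers and their attribute values below are hypothetical placeholders, and `build_tree` is an illustrative helper, not part of any standard library; the point is only that the same set of instances produces differently shaped trees depending on which characteristics are applied, and in what order.

```python
# Hypothetical instances: names and attribute values are invented for illustration.
BEERS = {
    "Beer A": {"color": "light", "yeast": "lager", "style": "lager"},
    "Beer B": {"color": "dark", "yeast": "ale", "style": "stout"},
    "Beer C": {"color": "dark", "yeast": "ale", "style": "porter"},
    "Beer D": {"color": "dark", "yeast": "lager", "style": "lager"},
}


def build_tree(beers, characteristics):
    """Divide the beers by each characteristic in turn.

    Returns nested dicts, one level per characteristic, with a sorted
    list of beer names at each leaf.
    """
    if not characteristics:
        return sorted(beers)
    first, rest = characteristics[0], characteristics[1:]
    groups = {}
    for name, attrs in beers.items():
        groups.setdefault(attrs[first], {})[name] = attrs
    return {value: build_tree(members, rest) for value, members in groups.items()}


by_color = build_tree(BEERS, ["color"])
by_yeast_then_style = build_tree(BEERS, ["yeast", "style"])
```

Dividing by color first gives two branches of unequal size, while dividing by yeast and then style gives two branches with sub-branches beneath them, even though the underlying set of beers has not changed.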

We classify largely to find things more easily later. That is, classification is as much for the organization of information as it is for retrieval. The development of the Dewey Decimal system provided multiple libraries a single standard of organization so that things could be found the same way in different libraries. In short, classification standards for books allow people to learn only one system to find books in many libraries. Without such a standard, we would need to learn a new system for every library—or worse, for every subject or type of publication.

Further, every system makes distinctions, either implicitly or explicitly, between “standard” and “nonstandard” ways of understanding things. These are often accompanied by the value judgments of “good” or “bad,” respectively. The politics of classification often show themselves in the labels or descriptors used to identify the class or its characteristics. For example, in the United States, people who have given up their job searches are not classified as “unemployed” even though in the literal sense of the word that is what they are. Here, the government has made a conscious decision to define the term in such a way as to exclude a group of entities from a class because it lowers the unemployment rate. How things are assigned to classes within a classification can even be politically motivated, as we’ll see with the example of NASA’s risk classification for space flight. A group of people may resist classifying an item in a certain way because of the implications or actions required by placing an item in that class.

An even more striking example of classification can be found in the ethnic classifications of the United States Census and the classes in which census respondents have been forced to place themselves. We’ll discuss this in greater detail below.

Second, decisions must also be made about what to classify: do we design our classification system based on characteristics of a given set of items, or do we design it from a philosophical standpoint, with universal classes intended to classify all knowledge? A justification for our order and selection of classes is known as warrant, and it takes several forms. In the case of the Library of Congress Classification (LCC), the collection of books within the Library of Congress is the literary warrant. The taxonomic classification system used to classify all living organisms relies on scientific warrant.

Third, we must decide if we want the classification to be flexible based on new information. Let’s return to the census example. The terms for ethnic background have changed dramatically over the years, as census administrators adjust classifications to align with changes in what is and is not culturally acceptable. The census benefits from a 10-year delay between surveys and has the ability to adjust these classifications according to new information about the cultural climate regarding race identification (see Figure 6.1).

Figure 6.1: US Census Race Classification Changes
