
«Keyword-based searching and clustering of news articles have been widely used for news analysis. However, news articles usually have other attributes ...»

— System configuration. Our system is highly configurable. The size of the wheel and belt, the order of the documents in the belt, and the encoding schemes are all configurable. Our system allows users to deploy multiple keyword wheels in the display. Each wheel may represent a separate group of keywords.

— Speed control. Users can pause, fast forward, or rewind the belt. Users can also set a time for the whole stream and then the speed will be automatically computed.

— Coordinated view. Users can link document glyphs in the TextWheel view with the real documents by simply clicking the glyph icons. Users can also click on a single edge connecting two keyword glyphs, then a line chart will pop up to show the correlation evolution between these two keywords over time.

— Filtering. Filtering is available to show a small set of articles that users are interested in. There is a control panel beside the visualization display for users to filter out some news articles by date, size, source, etc.

— Clustering and highlighting. Users can also cluster the documents or cluster the chains in the window region to reduce clutter. If users still cannot see individual chains in the focus region, they could also click a document glyph, and all the related chains and keywords will be highlighted accordingly.
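The speed control and filtering options above can be illustrated with a small sketch. It assumes the belt advances at a constant rate of glyphs per second and that each article carries a date, a source, and a size. The names `Article`, `belt_speed`, and `filter_articles` are hypothetical and are not taken from the TextWheel implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    published: date
    source: str
    length: int  # article size in words (assumed unit)


def belt_speed(num_glyphs: int, total_seconds: float) -> float:
    """Glyphs per second needed to play the whole stream in total_seconds."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return num_glyphs / total_seconds


def filter_articles(articles, start=None, end=None, sources=None, min_length=0):
    """Keep only articles inside the date range, source set, and size bound."""
    kept = []
    for a in articles:
        if start is not None and a.published < start:
            continue
        if end is not None and a.published > end:
            continue
        if sources is not None and a.source not in sources:
            continue
        if a.length < min_length:
            continue
        kept.append(a)
    return kept
```

For instance, playing 180 daily glyphs in 60 seconds would require a rate of 3 glyphs per second under this assumption.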

ACM Transactions on Intelligent Systems and Technology, Vol. 3, No. 2, Article 20, Publication date: February 2012.

Watch the Story Unfold with TextWheel: Visualization of Large-Scale News Streams 20:11

6.6. Alternative Designs

We considered several other designs for the TextWheel system. For example, the transportation belt could also be a wheel or a straight belt, and the keywords could be placed on another straight or U-shaped belt instead of a wheel. A wheel-shaped transportation belt does not work in our system, since not all documents can be positioned on the wheel. A straight belt is a better choice, but the U-shaped belt looks more consistent with the keyword wheels. Moreover, since we may occasionally want to explore relations between document glyphs, it is much easier to draw edges inside a U-shaped belt than on a straight belt. As for placing the keywords on a belt, the movement of the keywords looks strange compared with the rotation of a keyword wheel, and the wheel can also encode the micro relations between keywords more effectively. In addition, using different primitives for keywords and documents may help avoid confusing these two kinds of entities.

For the dynamic system, we considered connecting the keyword glyphs directly to the document glyphs, or applying a physical spring model directly to the keyword wheel. However, both options are too distracting and make it hard for users to focus on what they are really interested in. We therefore simplified the design by grouping all the chains before they connect to the wheel and by fixing all the keyword glyphs on the wheel before further exploration.

7. CASE STUDY

We have applied our system to the news streams mentioned in Section 3.1. The news streams are related to six major topics: Microsoft, Sony, NYSE, Merck, China, and Verizon. Each topic contains thousands of news articles from various sources. We first ran our system for each topic and did some initial screening. Once we found interesting patterns, we configured the system and fine-tuned the visual displays to bring out more details for analysis.

In this section, we describe two findings. The following encoding schemes were used in all experiments: the keyword size encodes the frequency (a larger size means a higher frequency); the document glyph represents all the articles in one day; the document glyph height encodes the average length of all the articles in that day (a larger height means longer articles); the document glyph width encodes the number of articles (a larger width means more articles); the document glyph color encodes the average number of keywords mentioned in all the articles (a darker color means a larger number); the width of the lines in the keyword wheel encodes the strength of the co-occurrence (a thicker line means higher co-occurrence); the arcs in the significance chart indicate the places having the documents most similar to the document at the sliding bar (a thicker arc means higher similarity); the color of a line linking a wheel and a document encodes which keyword the document has sentiment towards, and the attachment point encodes the polarity (connecting to the upper part of a wheel means the sentiment is negative, and connecting to the lower part means the sentiment is positive). All the experiments were conducted on a MacBook Pro with a 2.2 GHz Intel Core 2 Duo CPU and 2 GB of memory. With some preprocessing, our system can handle thousands of news articles in real time.
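As a rough illustration of the per-day document glyph encoding described above, the following sketch groups articles by day and derives the three glyph attributes. It assumes article length is measured in whitespace-separated words and that keyword mentions are counted by case-insensitive substring matching; the paper does not specify either choice, and `DayGlyph` and `build_day_glyphs` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class DayGlyph:
    day: date
    width: int     # number of articles published that day
    height: float  # average article length in words (assumed unit)
    color: float   # average number of keyword mentions per article


def build_day_glyphs(articles, keywords):
    """articles: iterable of (date, text) pairs; keywords: lowercase strings."""
    by_day = defaultdict(list)
    for day, text in articles:
        by_day[day].append(text)

    glyphs = []
    for day in sorted(by_day):
        texts = by_day[day]
        lengths = [len(t.split()) for t in texts]
        # Substring counting is a simplification; real systems would tokenize.
        mentions = [sum(t.lower().count(k) for k in keywords) for t in texts]
        glyphs.append(
            DayGlyph(
                day=day,
                width=len(texts),
                height=sum(lengths) / len(texts),
                color=sum(mentions) / len(texts),
            )
        )
    return glyphs
```

A renderer would then map `width`, `height`, and `color` to pixel sizes and a color scale; that mapping is left out here because the source does not describe it.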

7.1. Verizon’s Acquiring MCI

During the initial screening of the news streams related to Verizon, we noticed some interesting relations between Verizon and MCI. Sometimes these two keywords are quite large in the display and dominate the view, with a thick line linking them (see Figures 3(b) and 3(d)). At other times, the MCI keyword disappears entirely from the documents in the focus window (i.e., the MCI background color disappears) (see Figures 3(a) and 3(c)). We therefore decided to focus on this situation and try to figure out what happened. Figure 3 shows some screenshots of the exploration.

From Figure 3(a), we can see that the MCI keyword does not appear in the keyword wheel before January 2005. Around February 2005, MCI starts to appear in the keyword wheel, and the line linking the Verizon glyph and the MCI glyph is quite thick (see Figure 3(b)). Among the document glyphs linked to both the MCI and Verizon keywords, we noticed one that is green, tall, and very narrow (highlighted by a red rectangle in Figure 3(b)), which indicates that it contains only one long article mentioning both keywords many times. We opened that glyph and, after quickly going through the article, found this paragraph: “MCI Inc. confirmed an agreement to be acquired by Verizon Communications in a deal with a total value of $6746000000.” This explains why these two companies became hot topics and were frequently mentioned together.

Then, around October 2005, the link connecting MCI and Verizon disappears (see Figure 3(c)). We assumed that the merger of these two companies was no longer a hot topic. However, around December 2005, the thick link between these two glyphs shows up again (see Figure 3(d)). We paused the belt and chose the only glyph that has links to both MCI and Verizon (highlighted by a red rectangle in Figure 3(d)). There are only two documents in that glyph, and one of them mentioned that “Verizon Communications, Inc., has completed its acquisition of MCI. MCI’s assets will be folded into a new Verizon unit called Verizon Business.” From this case study, we can see that our system can help users quickly identify the relations between keywords and then narrow down to a few articles to find the reasons for these relations.

7.2. Merck and Its Troublesome Drug Vioxx

For the news streams related to Merck, we divided the keywords into two groups: company and drug. During the initial exploration, we noticed that a drug called Vioxx appears frequently in the display and that the sentiment towards it changes dramatically in 2004, from neutral to negative. We then reran the system and paid special attention to this drug. Figure 4 shows some screenshots.

According to Figure 4(a), Vioxx has not yet appeared in the drug keyword wheel around August 23, 2004. On August 30, Vioxx shows up, and the sentiment towards it is quite negative (see Figure 4(b)). We followed the thickest link from the Vioxx glyph and identified a document glyph representing articles on August 26 (highlighted by a red rectangle in Figure 4(b)). This document glyph is slim because it contains only one article, from AFX UK Focus, on that day. This article says: “Analysis of a study on the safety of COX-2 inhibitors found that Vioxx doses above 25 milligrams per day tripled the risk of cardiovascular events...”. As a result, the negative sentiment towards Merck, the maker of Vioxx, is slightly stronger than the positive sentiment (highlighted by blue rectangles in Figure 4(b)).

Both Merck and Vioxx remain stable until September, when the sentiment towards Vioxx becomes much more negative (see Figure 4(c)). We found the glyph with the thickest links to Merck and Vioxx (highlighted by a red rectangle in Figure 4(c)) and retrieved the corresponding articles (two documents in total). One article from AFX International Focus mentioned that “Merck said Merck was withdrawing its Vioxx arthritis drug from shelves worldwide, resulting in a 50 cent to 60 cent reduction in per-share earnings.” We then followed the thickest arc on the significance trend chart to reveal a similar news article from the same source (highlighted by a blue rectangle in Figure 4(c)). This article also said something negative about Merck: “Merck slumped more than 5 percent and was the biggest percentage loser among Dow Jones Industrial Average components.” Finally, Vioxx is no longer a hot topic: its glyph becomes small and often has no background color (see Figure 4(d)), which means that no documents in the focus window mention it.

Fig. 5. User study result for evaluating the efficiency of the encoding schemes.

This case study demonstrates that the sentiments expressed by the keyword glyphs are very useful in news analysis and that, with the macro/micro relation information provided by our system, we can quickly identify the sources of these sentiments.

8. USER STUDY

In addition to the case study, we conducted an informal user study with 12 college students. All were second-year students with no prior knowledge of information visualization. Each was asked to use our system and answer three questions. Before the study, the users were briefly introduced to our system and were encouraged to try it with different configurations so that they could become more familiar with our visual encodings.

For the user study, we chose a fraction of the data used in the second case study (718 documents spanning half a year) as the testing data. We then processed the data and picked the 12 most frequently mentioned companies from the documents. After that, the document corpus, along with all 12 company names, was loaded into our system. The documents on the transportation belt were grouped by day, and all 12 companies were placed in the same wheel. We then presented the system to the users and asked them to complete three tasks. These tasks are mainly designed to test the efficiency of the encoding schemes for the macro/micro relations. (The effectiveness of the macro/micro relation encoding scheme is demonstrated in the case study.)

The first task tests the efficiency of our sentiment encoding scheme: the users need to find the time when a specific company on the wheel reaches its largest positive sentiment. The second task tests the efficiency of the relation encoding scheme: the users are asked to discern which two companies have the strongest relationship in the whole document stream. The last task combines the two previous tasks: we challenge the users to find the company that has a very positive sentiment and still has a relatively strong relation with a specific keyword.
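To make the second task concrete, the sketch below shows one way the "strongest relationship" could be computed offline, assuming relation strength is simply the number of documents in which both company names occur. This is an assumption for illustration: the text states only that line width encodes co-occurrence strength. The function names are hypothetical, and substring matching stands in for proper entity detection.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, keywords):
    """Count, for each keyword pair, the documents that mention both keywords."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(k for k in keywords if k.lower() in lowered)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def strongest_pair(documents, keywords):
    """Return ((keyword_a, keyword_b), count) for the most frequent pair, or None."""
    counts = cooccurrence_counts(documents, keywords)
    if not counts:
        return None
    return counts.most_common(1)[0]
```

Computing the counts per day instead of over the whole stream would give the kind of time series that the correlation line chart in the coordinated view displays.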

In each task, we recorded the users' answers and response times. The results are shown in Figure 5. For the first task, ten users found approximately the correct time (with an average error of 5.5 days), while the remaining two users found a time that was also a local peak of positive sentiment, with the second-largest value in the whole stream. The average response time is 35 seconds. However, we also noticed that the standard deviation is as large as 21 seconds (see Figure 5(a)), probably because some of the users were still not quite familiar with our system. The first task is designed to warm up the subjects. It does not demonstrate the advantages of our system, since a classical line chart may be better for this task. Therefore, we asked the subjects to complete two more complicated tasks, in which our system may better show its advantages.

The second task is a little harder than the first one, since users may need to track multiple keywords at the same time. This time, eight users found the correct pair, with an average response time of 49 seconds, which is a little longer than in the first task. However, the standard deviation is reduced to 18 seconds. The final task is the most difficult one, because it requires users to synthesize two different visual encodings and keep the intermediate results in mind so that they can find the most appropriate company in the whole stream. Nevertheless, the result seems quite satisfactory. Since our system includes a sliding bar for quickly rolling the transportation belt forward and backward, users can freely examine the documents at any speed they like. Ten users found the correct company, with an average response time of 68 seconds, and the standard deviation is further reduced to 15 seconds.

All three tasks demonstrate that, with a little training, most users can use our system to explore large news streams and correctly find patterns in the testing data. In contrast, classical line charts are not suitable for the second and third tasks: since a curve must be generated for each pair of keywords, users may easily be overwhelmed when there are many keywords to explore.
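The scaling problem with per-pair line charts can be made explicit with a short calculation. The sketch assumes one curve per unordered keyword pair; `pairwise_curves` is a hypothetical helper name.

```python
from math import comb


def pairwise_curves(num_keywords: int) -> int:
    """Number of unordered keyword pairs, i.e. curves a per-pair line chart needs."""
    return comb(num_keywords, 2)


# The user study placed 12 companies on one wheel.
print(pairwise_curves(12))  # 66
```

Because the count grows quadratically, doubling the number of keywords roughly quadruples the number of curves a user would have to compare.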
