Interpretation Gone Wrong

by Alex Rosenblat, Tamara Kneese, and danah boyd

A workshop primer produced for:

The Social, Cultural & Ethical Dimensions of “Big Data”

March 17, 2014 - New York, NY


Brief Description

Just because data can be made more accessible to broader audiences does not mean that those people are equipped to interpret what they see. Limited topical knowledge, statistical skills, and contextual awareness can prompt people to read inferences into, be afraid of, and otherwise misinterpret the data they are given. As more data is made more available, what other structures and procedures need to be in place to help people interpret what's available?

Detailed Topic Description:

Data is increasingly being made available for public consumption. Expressions like “information is power” and “information wants to be free” have gained enormous rhetorical traction, and concepts like “transparency” and “open access” dominate discussions of the governance of all types of data: public, private, commercially- and academically-generated, and scientific. But what are the ramifications of broad information accessibility? Who is collecting, structuring, analyzing, and distributing information? Who is interpreting what is made available, for what purpose, and to what end?

Just because data is more accessible to broader audiences does not mean that its recipients are sufficiently equipped to interpret what they receive. Even when people know that the data has a bias, they make decisions based on what it seems to represent about an item, object, or issue. They may place their trust in the institutions or organizations that disseminate it in the hopes that they have looked at the complicated data in some objective and decisive way. Most people do not have experience creating or structuring datasets. Many lack the statistical skills, or even basic fluency in probabilities, to draw meaningful inferences from the data at hand. And even among those who have these skills, only a few have sufficient knowledge or expertise to properly contextualize their findings and apply them appropriately. As a result, people can easily misinterpret data that they are given, leading to confusion, anxiety, and suboptimal decision-making. This affects individuals in a wide range of domains, including knowing which goods to purchase, understanding personal health risks, making smart personal financial decisions, or even evaluating how best to receive, question, or concur with news items.

Data & Society Research Institute, datasociety.net

Organizations may also face challenges in making sense of the data they have access to, especially as information becomes available in huge quantities from a multiplicity of sources. They may not anticipate potential interpretations of the data they collect or use.

Despite a certain sense of inevitable disaster resulting from information overload, both individuals and companies regularly receive, analyze, and use lots of information from divergent sources to make successful decisions every day. How do we reconcile the potential (and actual) harms of informational abundance with some of these positive outcomes? What are the right structures to put into place to limit potential misinterpretations? Curtailing individual or organizational access to data does not seem to be the right approach.

People who are accustomed to accessing information as a right get upset when the state intervenes to curtail access. Yet there can be serious individual and social consequences when people misinterpret information because they lack the skills, knowledge, and context to interpret it adequately. For example, Reddit users mistakenly identified a missing person, Sunil Tripathi, as the Boston bomber, Dzhokhar Tsarnaev, by comparing grainy photos of the suspect released by the FBI with photos released by the Tripathi family in their search for Sunil. The media ran with the story, despite the FBI’s assurances to the Tripathi family that Sunil was not a suspect, effectively derailing the search for the deceased Sunil by both the public and a private missing-persons agency, and upsetting his family with media attention and false accusations. The crowdsourced search was spurred on by the notion that anyone with access to the ‘big data’ of publicly available photos, municipal video feeds, and other sources of information could properly identify the Boston bomber. Crowdsourced engagement with publicly available data can have serious consequences beyond the targeted ideal outcome. Access to sensitive or highly fraught information can be problematic, as not all data interpreters are made equal, whether they are researchers or unqualified internet users. Put another way, access to information is not the same thing as access to knowledge.

The challenges of data interpretation raise issues about who should (and should not) have access to data in the first place. For example, New York state law requires professional genetic counseling for anyone who gets access to their genetic information. Are such moves valuable educational interventions or paternalistic governance? Do individuals have moral rights to their own data? To what extent, and under what circumstances? When do such rights trump society’s right to intervene in order to allay fears, preserve the public trust, or achieve desired outcomes? When, if ever, should individuals be denied access to their own data? Based on what principles, and with what limits?

As more data becomes readily available, how do we collectively address the challenges of interpretation? What other educational structures and procedures need to be put in place to help people interpret information that affects their interests? By whom? For example, to give the public a better framework for data interpretation, Google is offering a free online course on making sense of data. Should specialists with knowledge in particular fields (social scientists, physicians, or genetic counselors) also offer short courses so that data can be properly contextualized? Is education enough? Should collectors and purveyors of data be required to disclose facts relevant to data interpretation, such as the population sampled, characteristics measured, and the presence of any systematic bias that would skew results?

In medicine, there is a well-established principle that the lower the prevalence of any given condition, the higher the likelihood that a positive result is a false positive, even for the most accurate diagnostic test. This is one reason why physicians may hesitate to run diagnostic tests for patients who don’t meet the right criteria for them. An educated data or information mediator is considered necessary in the medical domain as a barrier between a consumer and access to services. Physicians act as ‘knowledge brokers’, and they are trusted to act and disseminate information in good faith because they subscribe to strong ethical principles as part of their regulated professional ethos. How does data accountability work in other domains? How would this model of data arbitration apply in other sectors? How would data brokers be regulated?
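The prevalence principle above can be made concrete with Bayes' rule. The following is a minimal sketch, not a clinical tool: the function name and the 99% sensitivity and specificity figures are assumptions chosen for illustration, not values from any real test.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 99% sensitive and 99% specific (illustrative values only).
for prevalence in (0.10, 0.01, 0.001):
    ppv = positive_predictive_value(prevalence, 0.99, 0.99)
    print(f"prevalence {prevalence:.1%}: {ppv:.1%} of positive results are real")
```

With these assumed figures, a condition affecting 1 in 100 people yields positives that are right only about half the time, and the share falls further as the condition becomes rarer, even though the test itself has not changed.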

The philosopher Nassim Nicholas Taleb makes a similar comment about the “Big Data” phenomenon, asserting that more information results in more false information: he writes, “big data means anyone can find fake statistical relationships, since the spurious rises to the surface. This is because in large data sets, large deviations are vastly more attributable to variance (or noise) than to information (or signal).” How receptive are people to the notion that more information is not necessarily better, or that it can lead to wider misinterpretation? How does this notion affect how resources should be directed to making sense of the data?
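One way to see the point about spurious relationships is to search pure noise for correlations. The sketch below is an illustration under stated assumptions, not Taleb's own analysis: the variable counts, the 30 observations per variable, and the helper names are all hypothetical. Every column is random, so any correlation found is spurious by construction.

```python
import itertools
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(columns):
    """Largest absolute correlation found among all pairs of columns."""
    return max(abs(pearson(a, b)) for a, b in itertools.combinations(columns, 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_observations = 30     # hypothetical sample size per variable
noise = [[rng.gauss(0, 1) for _ in range(n_observations)] for _ in range(80)]

# Search ever larger subsets of the same unrelated variables.
for k in (5, 20, 80):
    pairs = k * (k - 1) // 2
    best = strongest_spurious_correlation(noise[:k])
    print(f"{k:>2} variables, {pairs:>4} pairs: strongest |r| = {best:.2f}")
```

Because each larger subset contains the smaller ones, the strongest correlation found can only stay the same or grow as more variables are searched, which is the sense in which "the spurious rises to the surface."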

Case Study 1: Personal Genetics

The company 23andMe offers direct-to-consumer genetic testing that gives personalized information on the consumer’s risk for various diseases based on a spit sample of (presumably) their DNA that they can mail in to the company. In return, consumers receive results that convey information about probabilities compared to the larger population. For example, a user might be told that she has a 30% higher probability than average of having a rare lung disease. All too often, an ill-informed individual may interpret this to mean that she has a 30% likelihood of developing the disease. Even if she understands that this is not the case, she may not realize that, for a rare disease, a 30% higher probability than average still leaves her absolute risk so small that the difference is, for all intents and purposes, meaningless.
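The gap between relative and absolute risk is easy to check with arithmetic. This is a minimal sketch: the baseline risk of 1 in 10,000 is a hypothetical figure chosen for illustration, not a number from 23andMe or from any real disease, and the function name is likewise an assumption.

```python
def absolute_risk(baseline_risk, relative_increase):
    """Absolute risk after applying a relative increase to a baseline risk."""
    return baseline_risk * (1.0 + relative_increase)


baseline = 1 / 10_000                      # hypothetical baseline: 1 in 10,000
elevated = absolute_risk(baseline, 0.30)   # "30% higher than average"

print(f"average risk:  {baseline:.4%}")
print(f"elevated risk: {elevated:.4%}")
print(f"extra cases per 100,000 people: {(elevated - baseline) * 100_000:.0f}")
```

Under this assumed baseline, "30% higher than average" moves the risk from 1 in 10,000 to 1.3 in 10,000, which is nowhere near a 30% chance of developing the disease.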

Unofficial diagnostic tests like 23andMe can heighten anxieties about one’s genetic risk for a range of diseases, without explaining those risks properly, or putting statistical information in layman's terms. Consumers may lack the proper education or tools to understand how that risk is computed, and how the information they are given might be reasonably disputed. Some ethicists and officials are concerned that women will preemptively seek mastectomies if they have a heightened awareness of their ‘risk factors’ for breast cancer. Indeed, the reason that New York State requires genetic counseling for all who seek genetic tests is that so few people understand how to meaningfully interpret genetic information.

The FDA recently ordered 23andMe to stop offering personalized genetic health data to consumers amidst fears that users would react unreasonably to receiving alarming medical information that wasn’t delivered or curated by a trained medical professional. In addition to other quality control issues, the logic is that medical professionals have a different type of ethical obligation to their patients than a commercial enterprise has to its consumers with regard to the information it disseminates. These developments in health tracking and health data access raise a number of questions, such as: Is health data generated by commercial self-tracking entities different from other kinds of health data generated by physicians or other experts? Who has the right to disseminate personalized health information? What interpretive tools should those entities provide to users? Who has the right to receive personalized health information? Who should be denied access? Could elevated health concerns from (mis)information lead patients to use health care resources unnecessarily?

Case Study 2: Obsessing Over Metrics

Any metric that is valued is gamed. An entire cottage industry exists to help with search engine optimization because companies want their sites to appear at the top of search queries. Authors and publishers try to game best seller lists by buying their own books because those lists signal quality. Each year, cheating scandals break out as teachers help students game assessment tests because their teaching is interpreted through their students’ performance.

Although gaming metrics is nothing new, the “big data” phenomenon has amplified this issue because more and more is dependent on results produced or indicated by data. This dependency reflects a certain idolatry of numbers, as though anything worth valuing can be quantified, and anything that can’t be quantified isn’t worth much. In other words, if a data-oriented solution to a problem isn’t available, people who are eager to verify their solutions with data-credibility can reframe the question because numbers are viewed as ‘scientific’ or trustworthy. To some degree, ‘the data’ has become a universally accepted rationale for choosing one course of action over another, or for seeing meaning or patterns that appear significant because they are attached to a numerical or statistical value, regardless of how well understood that data is, or how those numbers are produced.

Consider, for example, Klout scores and social media followers. These scores, intended to measure the influence that individuals have on a 1-100 scale, are now baked into search engine results. One factor in shaping Klout scores is follower count on major social media like LinkedIn, Facebook, and Twitter. The number of Twitter followers one has also shapes search engine results directly, both within Twitter and on third-party sites. As data scientist Gilad Lotan learned, purchasing bots and fake followers is neither difficult nor expensive. Although services like StatusPeople exist to help people determine if an account is primarily followed by fake people, most people don’t notice; they see the follower count and presume that the account holder is influential.

Services like Klout don’t seek to verify followers, so when they pick up on these signals and pass them on to search engines, they reinforce inaccurate interpretations and bake them further into systems. People with high Klout scores receive perks, including free swag and access to key opportunities. Of course, it’s not just marketing companies and search engines that rely on these numbers. When journalists cover people, they also often refer uncritically to people’s follower counts when discussing the importance of those people. Thus, even if everyone knows that metrics are gamed, they are repeatedly used, creating the appearance of truth or fact, even where none exists.

In some sense, the more closely we are networked into a high-accessibility society, the greater our echo chamber is, even though the concepts of “open access” and “greater transparency” suggest the opposite: that truth is within reach because our browsing ability is engineered in tech-savvy ways. However, your ability to find a good answer, or to ask a good question, is not replaced by Google’s algorithmically generated autocomplete function. How do we remain conscientious about information in light of greater access to it? How do we analyze metrics, or contextualize the data to avoid errors of misinterpretation?

Case Study 3: When Algorithms Imply Interpretation
