«Anil Dagar, Yasuhiro Endo, Abhay Gupta, Yan Li, Kuldip Pabla, Sridhar Ramaswamy, Ikhlaq Sidhu College of Engineering University of California, ...»
2.3 Drawback of “Do Not Track”
The latest Consumer Insights Survey reveals that 68 percent of the Internet population across 11 countries would select a "do not track" (DNT) feature if it were easily available. This would create a 'data black hole' in the Internet.
Industry claims that data collection is used to enhance the user experience and to target advertising based on user activity. As regulation tightens, restrictions on data collection could "diminish personal data supply lines and have a considerable impact on targeted advertising, CRM, big data analytics, and other digital industries". The survey found that only 14 percent of respondents believe Internet companies are honest about how they use consumers' personal data, suggesting that it will be a challenge for online companies to change consumers' perceptions.
2.4 Business Approach to Government Regulations
Online publishers and advertisers are wary of the new regulations. Many large businesses have a group of lawyers who dedicate most of their time to fighting privacy litigation and lobbying against stricter regulations.
Companies have also created their own privacy policies and obtain explicit user consent by making it a prerequisite for using their services. In addition, they allow users to view and manage the information collected by the service provider.
3. Privacy Landscape
3.1 Data Ecosystem
Figure 2: Internet Data Ecosystem (Source: FTC)
As we see from the above Federal Trade Commission (FTC) diagram, there is a large ecosystem around data. The User sits in the middle of the ecosystem. There are three main players:
• Data Collectors – Typically these are the players who are responsible for collecting the data from the users. Internet Websites are big Data Collectors from the users visiting their websites.
• Data Brokers – These players get Data from the Data Collectors, analyze the data, generate some intelligence and make it available to Data Users. Credit Bureaus are Data Brokers who get data from different Data Collectors and make it available to banks for their business.
• Data Users – These players buy data from the Data Brokers and use this information in their business.
3.2 Technology Landscape
The advent of the HTTP cookie in 1994 enabled data collectors to profile users. A cookie is a special piece of information, usually opaque, that servers can send down to users’ browsers. The browser then sends that information back to the server on subsequent accesses. By generating a unique cookie that identifies a user, the servers can now track users’ behavior. Many cookies are persistent and survive multiple browser restarts and computer reboots. This enables the server to identify the user even if the accesses are weeks apart.
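The cookie mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular server's implementation: the cookie name `uid`, the one-year lifetime, the in-memory `visits` table and the `handle_request` function are all hypothetical; only the standard-library `http.cookies` and `uuid` modules are real.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "uid"              # hypothetical cookie name
ONE_YEAR = 365 * 24 * 60 * 60    # assumed lifetime, in seconds

visits = {}  # cookie value -> list of paths requested under that cookie


def handle_request(path, cookie_header=None):
    """Simulate one request; return (user id, Set-Cookie value or None)."""
    jar = SimpleCookie(cookie_header or "")
    if COOKIE_NAME in jar:
        # Returning browser: it sent back the identifier issued earlier.
        uid = jar[COOKIE_NAME].value
        set_cookie = None
    else:
        # First visit: mint an opaque identifier and make it persistent.
        uid = uuid.uuid4().hex
        out = SimpleCookie()
        out[COOKIE_NAME] = uid
        out[COOKIE_NAME]["max-age"] = ONE_YEAR
        out[COOKIE_NAME]["path"] = "/"
        set_cookie = out[COOKIE_NAME].OutputString()
    visits.setdefault(uid, []).append(path)
    return uid, set_cookie


if __name__ == "__main__":
    uid, header = handle_request("/home")
    # A browser would store the cookie and echo "uid=<value>" on later requests.
    echoed = header.split(";")[0]
    handle_request("/news", echoed)
    print(visits[uid])
```

Because the identifier carries a `Max-Age`, a real browser would keep returning it after restarts, which is what lets the server link accesses that are weeks apart.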
Many Web advertising companies attach cookies to their ad imprints (often pictures). This enables the advertising companies to track users’ browsing behavior across all the sites on which the ad company places its advertisements. This form of cookie usage is viewed as problematic, as tracking can occur without the knowledge of the data provider (the owner of the main content of the web page). It is also nearly impossible for users to recognize that a third party is tracking them.
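To illustrate how an ad company can follow one browser across unrelated sites, here is a hedged sketch of the bookkeeping on the ad server's side. It assumes the ad request carries the ad company's own cookie value and a `Referer` header naming the page that embedded the ad; the function name `record_ad_request`, the `profiles` table and the example hostnames are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Ad-company cookie value -> set of publisher hosts where that browser was seen.
profiles = defaultdict(set)


def record_ad_request(tracker_uid, referer):
    """Log which publisher page triggered this request for an ad image."""
    host = urlparse(referer).hostname
    if host:
        profiles[tracker_uid].add(host)
    return profiles[tracker_uid]


if __name__ == "__main__":
    # The same browser loads the ad on two different publishers' pages.
    record_ad_request("abc123", "https://news.example/article/1")
    record_ad_request("abc123", "https://shop.example/cart")
    print(sorted(profiles["abc123"]))
```

Neither publisher in this sketch sees the other's traffic; only the third party that serves the ad on both pages can join the two visits.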
Social network services take tracking to another level by using people’s desire to connect with others online to entice them to disclose more information. Users’ names, education, job history, place of birth, etc. are often public or accessible to the service provider irrespective of privacy settings. This reckless practice has raised controversy and fear among users. New technologies on the horizon can help the public take privacy into their own hands.
Proxy servers or anonymizers have been available since the early 1990s. This technique routes users’ web access to (untrusted) web services via one, often trusted, proxy server. To untrusted web servers, the access will appear as though it originated from the proxy server, as the proxy hides the IP address of the original user. Unfortunately, this technique does very little against cookies and web services that explicitly require users to log in (e.g. social networking sites).
Data brokers’ ability to analyze massive amounts of raw tracking data has improved significantly in recent years, which has enabled them to produce more accurate and higher-value data. One obvious and expected reason for this change is the technology curve: computers have been getting both faster and cheaper. More noteworthy is the advent of cloud computing and how it frees data brokers from making massive up-front capital investments. For example, instead of buying and managing 1000 computing nodes and allowing them to depreciate over 3 years, brokers can “rent” or buy the use of 2000 nodes for 1 year for the same price or maybe even rent 10,000 nodes for 1 month. This new service model is providing brokers with greater capabilities and flexibility to produce more valuable data.
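The trade-off in the node example can be made explicit with a little arithmetic. The sketch below takes the three options exactly as stated in the text and, assuming (as the text does, purely for illustration) that each costs the same total amount, computes how much node-time each buys and the implied price premium per node-month relative to ownership. The function and variable names are hypothetical, and no real cloud pricing is implied.

```python
def node_months(nodes, months):
    """Total computing capacity bought, measured in node-months."""
    return nodes * months


# The three options from the text, assumed to cost the same total amount.
options = {
    "own 1000 nodes for 3 years": node_months(1000, 36),
    "rent 2000 nodes for 1 year": node_months(2000, 12),
    "rent 10,000 nodes for 1 month": node_months(10_000, 1),
}

baseline = options["own 1000 nodes for 3 years"]

# For an equal budget, the price per node-month scales inversely with capacity.
premiums = {label: baseline / capacity for label, capacity in options.items()}

if __name__ == "__main__":
    for label, capacity in options.items():
        print(f"{label}: {capacity} node-months, "
              f"{premiums[label]:.1f}x the per-node-month price of owning")
```

Under these assumptions the broker pays more per node-month when renting, but gains peak capacity: ten times as many nodes working at once for a short, intensive analysis.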
As discussed above, the trend of technology and industry tends to favor data collectors, data brokers and data users, and makes it increasingly harder for online users to protect privacy and anonymity.
However, due to increased user awareness and legislative pressure, a plethora of solutions looking to balance the needs of the businesses and consumers is emerging.
Figure 3: Competitive Landscape
Current data privacy solution providers can be grouped into the following categories:
1. Personal Data Store Solutions
2. Data Privacy Assurance Solutions
3. Data Privacy Scanning and Cleanup Solutions
3.3.1 Personal Data Store Solutions
Personal Data Store Solutions take an end-user-centric approach by enabling users to create “data vaults” and have control over what they share, and with whom. One example is MyDex (mydex.org). It allows users to manage, analyze and share personal data in a controlled manner. MyDex allows end users to consume different services, such as government services, energy utility accounts and telecom accounts, through MyDex instead of consuming these services directly from the web. End users will be attracted to MyDex primarily due to privacy concerns and the ability to “take control” over their personal data. MyDex makes money by charging various organizations to share the personal data with them. Essentially, MyDex brings to these services privacy-conscious end users who might otherwise shun them. In addition, this solution provides these services with presumably higher quality information about the specific end users and provides a channel for directed marketing.
Evidon is on a mission to reveal the “invisible web” while promising privacy for the general public and high quality data analytics for Web publishers. It specializes in global tracking of various tracking technologies, and sits at the intersection between the demand for advanced advertising technology and public awareness of privacy concerns. Evidon has a product called Ghostery that enables customers to take total control over what information is shared on the web.
3.3.2 Privacy Assurance Solutions
Companies like TRUSTe and Trust Guard allow websites to gain credibility with their visitors. They verify the privacy and security policies of websites, audit them at regular intervals, and then provide a seal of approval. These companies also act as watchdogs by taking customer complaints and having the ability to take the trust seal away from companies. They also specialize in understanding government regulations and incorporating them into their assurance programs. TRUSTe is the first organization to join the US-EU Safe Harbor, which is the de facto framework for companies to comply with US-EU data and privacy standards.
3.3.3 Scanning and Cleanup Solutions
While companies like MyDex and Evidon try to prevent users’ private data from leaking online, Data Privacy Scanning and Cleanup solutions, e.g. Reputation.com, help users whose data has already leaked and who need to clean up those records. These solutions scan the online world for references to a user’s private data and help to:
1) Map out where a user’s data is referenced
2) Update any discrepancy
3) Remove unwanted references to a user’s data
4. Current Value Chains
In the series “What They Know”, the Wall Street Journal (WSJ) documented how the information economy of the Internet tracks people’s behavior, activities, interests and data over several years, and the privacy concerns associated with this tracking. Below, we look at the value chains of this economy as described in the WSJ, how they are taking shape, and what opportunities arise from these trends.
Companies like Google, Facebook and Yahoo have spent billions of dollars to provide a long list of free services like blogs, news sites, search engines, email, and mapping tools that are largely taken for granted. These companies are counting on online advertising to fund and profit from these free
services. The following picture shows the ecosystem supporting the information economy – a web of tracking companies, data brokers, and advertising networks.
Figure 4: Ecosystem of the Internet economy (Source: WSJ)
In the 1990s, online advertising focused on the websites with the most visitor traffic. Over the last decade, that focus has shifted to personalized advertising. According to a study sponsored by the ad industry in 2009, the average cost of a targeted ad was $4.12 per thousand viewers versus $1.98 per thousand viewers for an untargeted ad. This potential revenue motivated advertising companies to start using tracking tools like cookies, Flash cookies and beacons to collect data, mine data, and generate user profiles. Companies like Google, Microsoft, Adobe, and Apple not only have a big say in how much information can be collected about users, but also have big stakes in online advertising. Microsoft bought aQuantive for $6 billion, Google bought DoubleClick Inc for $3.1 billion, and Adobe bought Omniture for $1.8 billion.
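The size of the incentive follows directly from the two per-thousand prices quoted above. The short calculation below is only a worked example: the function name `ad_revenue` and the figure of one million viewers are hypothetical, chosen to make the gap concrete.

```python
TARGETED_CPM = 4.12     # USD per thousand viewers, from the 2009 study cited above
UNTARGETED_CPM = 1.98   # USD per thousand viewers, same study


def ad_revenue(viewers, cpm):
    """Revenue in USD for a given number of viewers at a per-thousand price."""
    return viewers / 1000 * cpm


if __name__ == "__main__":
    viewers = 1_000_000  # hypothetical audience size
    targeted = ad_revenue(viewers, TARGETED_CPM)
    untargeted = ad_revenue(viewers, UNTARGETED_CPM)
    print(f"targeted:   ${targeted:,.2f}")
    print(f"untargeted: ${untargeted:,.2f}")
    print(f"ratio:      {TARGETED_CPM / UNTARGETED_CPM:.2f}x")
```

At these prices a targeted ad earns roughly twice as much as an untargeted one for the same audience, which is the revenue gap that motivates profile building.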
eMarketer estimates that the digital ad spending market will increase to $55 billion by 2016 in the US alone.
Figure 5: Internet Revenues (Source: eMarketer)
With the advent of location awareness in mobile devices and increased usage of smart devices like smartphones and tablets, targeted advertising is going one step further. Now an ad can be sent to a consumer at the desired time, at the desired coordinates. Data is collected not only through online activities but also through apps downloaded and used on these smart devices. Google operates its AdMob network for Android devices, and Apple runs its iAd network for Apple devices.
5. New Opportunity
As companies begin to aggressively track consumers through online and mobile technologies, privacy advocates, consumers and governments have begun to engage in efforts to limit and control this activity. There is an obvious erosion of privacy and there are legitimate concerns about this user data falling into the wrong hands – a situation very similar to that of the 1960s and 1970s, when credit agencies had extensive information on customers and were willing to sell it to anybody. That invasion of privacy led to the eventual regulation of credit agencies through the 1970 Fair Credit Reporting Act, which allowed consumers to access and correct or remedy their information.
Privacy and the information economy are going through a similar phase.
Privacy protection has become a commodity. Many startups selling monthly or annual services to “take control of your privacy” have emerged. These startups essentially monitor and disable tracking, mask users when they access the Internet, delete personal information from websites, etc. Tightening regulations in Europe and a similar trend in the US and elsewhere are also moving to limit access to user data. These solutions have the potential to limit the rich Internet experience and its myriad and extensive uses.
In order to create a healthy and free Internet Services ecosystem, the following needs to happen:
• Government regulation should bar tracking with user identity, but allow the collection of consumer information for targeted service. This will require auditing web sites to ensure compliance.
• New privacy tools and messaging campaigns must be developed by publishers to convince consumers that they can be trusted. Improving the transparency of data collection and use will help to build trust, and that will increasingly become a sustainable competitive advantage.
There are substantial gaps in the current technologies and service offerings, which cannot support this new environment. New services and additional or improved technologies are needed in the following spaces to help create a healthy web services ecosystem:
1) As privacy regulations allow collection of consumer information without violating user privacy, it will become harder to ensure that all companies are complying with regulations. Ensuring compliance will require independent auditors, similar to those in the finance industry, to check the behavior of data collectors. Certification by the auditors will help companies prove they are accountable to the government as well as to end users, so that users can trust the web services providers and are more open to doing business with them.
2) In the new environment, companies will still need to comply with the five core principles of privacy: Notice/Awareness, Choice/Consent, Access/Participation, Integrity/Security, and Enforcement/Redress. Even though companies are addressing these principles individually, end users cannot access this information in a consistent and uniform way using a single standard taxonomy. There is therefore an opportunity to make this information easily accessible to end users across multiple publishers without (a) sharing information from one publisher to another, which preserves individual publishers’ competitive advantage, or (b) storing user information in a single place, which would become a target for hackers.