Long-Term Issues and Insights
In addition to the near-term measures of establishing data taxonomies and measuring intended impact, a focus on longer term strategic issues will also contribute to strengthening trust and sustainability of the personal data ecosystem.
Strategic technological and business innovations
A core set of longer term issues will require business, legal and technical systems to interoperate more effectively at the pace and complexity of today’s socio-technical world. Given the challenges of effectively preventing and managing harmful uses of data, technological innovation can be applied to address some of these concerns. Progress needs to occur within three layers: infrastructure, data management and user interaction.34 The dynamic and distributed way in which the world is evolving requires a focus on the technological enablers, which will play a critical role in meaningful transparency, accountability and more effective engagement of individuals.
From a technology perspective, a key innovation that will need to be developed (and scaled) is “smart data”, where policies for using data are logically bound to the data so that they travel with it when it crosses trust boundaries.35 Entities that touch the data are required to add a signature to the metadata for the purposes of auditing and provenance.
With such capabilities in place, new contexts of data usage can be explored, as the overarching code of conduct for legitimate purposes prevents uses which are legally prohibited or were not agreed upon through the codes of conduct.
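The “smart data” idea described above, in which a policy travels with the data and each entity that touches it adds a signature to the metadata, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names SmartData, sign_handoff and verify_chain are hypothetical, and a shared-secret HMAC stands in for the public-key signatures a real deployment would more plausibly use.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field


@dataclass
class SmartData:
    """Hypothetical envelope: data, its usage policy and a provenance trail."""
    payload: dict
    policy: dict
    provenance: list = field(default_factory=list)


def _message(record: SmartData, entity: str, previous: str) -> bytes:
    # Each signature covers the payload, the policy, the signer's name and
    # the previous signature, so entries form a chain.
    body = {
        "payload": record.payload,
        "policy": record.policy,
        "entity": entity,
        "previous": previous,
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")


def sign_handoff(record: SmartData, entity: str, key: bytes) -> None:
    """Append this entity's signature to the metadata (assumed HMAC-SHA256)."""
    previous = record.provenance[-1]["signature"] if record.provenance else ""
    signature = hmac.new(key, _message(record, entity, previous),
                         hashlib.sha256).hexdigest()
    record.provenance.append({"entity": entity, "signature": signature})


def verify_chain(record: SmartData, keys: dict) -> bool:
    """Audit step: recompute every signature in order and compare."""
    previous = ""
    for entry in record.provenance:
        key = keys.get(entry["entity"])
        if key is None:
            return False
        expected = hmac.new(key, _message(record, entry["entity"], previous),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["signature"]):
            return False
        previous = entry["signature"]
    return True
```

In this sketch, an auditor holding the entities' keys can confirm who handled the record and in what order, and any later change to the payload or the policy causes verification to fail.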
The infrastructure layer contains the technology, services and applications required to assure the availability, confidentiality, security and integrity of the data, both while in transit and at rest. Seven areas of interoperable technical innovation have been identified as key enablers from a technology perspective: personal clouds, semantic data interchange, trust frameworks, identity and data portability, data-by-reference (and subscription), accountable pseudonyms, and contractual data anonymization.36 Additional technical considerations at this layer would include measures needed to protect against unintended data breaches and system attacks. Authentication services can verify identities. Federated identity services can operate across trust boundaries and provide claims assurances including anonymous and pseudonymous identities which are essential elements. Establishing trustworthiness at this layer may utilize a combination of market-based mechanisms (reputation, brands, price), codes of conduct with enforcement, or via direct forms of governance.37
The data management layer focuses on the flow and use of personal data based on specified permissions and policies (established via legal contracts). Metadata technology can be utilized to create an architecture that links permissions and provenance with the data to provide the means for upholding the enforcement and auditing of the agreed upon policies. This interoperable architecture for sharing claims involves associating data with a “tag” containing relevant information such as provenance, permissions and policies which remain attached throughout the data lifecycle. Sticky policy and “privacy by design” can ensure that individual privacy settings remain embedded in the data or metadata as these are processed.38 Policies will need to support context to enable context-aware user experience and data use. In addition to allowing for a dynamic level of individual control over data uses, this approach can provide regulators with the opportunity to focus upon principles and outcomes.
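As a rough illustration of the “tag” architecture described above, the following sketch attaches provenance and permissions to a piece of data, checks each proposed use against the tag, and records the decision for auditing. The class and function names (PolicyTag, TaggedData, use, derive) and the simple purpose/context model are hypothetical and are not drawn from any particular standard or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyTag:
    """Hypothetical sticky-policy tag; frozen so it cannot be edited in place."""
    owner: str
    source: str                      # provenance: where the data came from
    permitted_purposes: frozenset
    permitted_contexts: frozenset


@dataclass(frozen=True)
class TaggedData:
    value: object
    tag: PolicyTag


def use(data: TaggedData, purpose: str, context: str, audit_log: list):
    """Release the value only if both purpose and context are permitted."""
    allowed = (purpose in data.tag.permitted_purposes
               and context in data.tag.permitted_contexts)
    # Every request is logged, allowed or not, to support later auditing.
    audit_log.append({"owner": data.tag.owner, "purpose": purpose,
                      "context": context, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{purpose!r} in {context!r} is not permitted")
    return data.value


def derive(data: TaggedData, new_value) -> TaggedData:
    """Processed data keeps the original tag, so the policy stays attached."""
    return TaggedData(new_value, data.tag)
```

For example, a tag permitting the purpose "treatment" in the context "healthcare" would let a clinic application read the value, while a request for "marketing" in a "retail" context would be refused and logged, including for values derived from the original.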
To address the coordination and accountability of various stakeholders, trust frameworks – which document the specifications established by a particular community – can serve as an effective means to govern the laws, contracts and policies of the system. It is in this capacity that the ability of actors not only to prevent harm but also to respond to it (and provide restitution for the impacted individuals) can be strengthened. If individuals are well protected and processes for restitution are defined, this could become the seed for greater innovation where there is a commercial incentive for delivering privacy and trust. Combined with a new “social stack”39 at the identity level, this new data and policy management layer could enable data flows across jurisdictional boundaries. In this sense, the confidence of individuals would be strengthened knowing that in whatever jurisdiction things went wrong, the individual would be assured of restitution.40
To deliver these types of systems which can prevent, detect and respond to the misuse of data, an area of focus brought up multiple times is the need for “smart contracts”: integrating legal code with digital code. For centuries, contract law has been essential for establishing sustainable markets where the interests of all stakeholders can be equitably represented through legal agreements. The notion that has been suggested for further exploration is to automate the execution of legal code so it can uphold and enforce principles with contextually-based data usage at faster cycle times. Just as iron served as the “combustion chamber” for putting fire to work in the engines of the industrial era, automating the provisions contained within contracts can serve that purpose in the digital era.41
Significant development is needed for this approach to be technologically feasible. Security mechanisms are required to ensure that the provisions specified by individuals in a tag are not altered without permission. Further, there are challenges to creating a system with sufficient scale to be relevant in an internet economy context. These issues cannot be solved independently by either industry or government and will require a multistakeholder approach to gain traction.
The user interaction layer includes the elements that enable individuals to have a meaningful interaction with service providers regarding the permissions and policies associated with the use of their personal data. Individual, cultural and local/national legal jurisdictional considerations would need to be addressed to ensure that the richness of personal choice and autonomy could be addressed.
The user interaction layer is clearly the area where further research is needed to gain better insight on some of the underlying issues that make personal data so unique, complex and contradictory. Contributions from the fields of economics, behavioural decision research, psychology, usability, human-computer interaction and many others would be valuable.42 How users define an “acceptable” use and the contextual elements surrounding that decision, the role of trust in establishing that context, what individuals expect of other institutions for maintaining trust, and an array of cultural and regional norms are just some of the areas that have been identified for further research.43
Looking ahead: Accountable algorithms and the post-digital world
In many ways, the world is now post-digital. The achievements of digitizing and connecting everyone (and every thing) are largely taken for granted. The discourse is no longer about technology but how it is applied for socioeconomic change.44 The focus is on a new nexus of control and influence: the algorithm.
A sociological analysis must not conceive of algorithms as abstract, technical achievements, but must unpack the warm human and institutional choices that lie behind these cold mechanisms.
– Tarleton Gillespie, The Relevance of Algorithms 2013
Complex and opaque, algorithms generate the predictions, recommendations and inferences for decision-making in a data-driven society. While easily dismissed as abstract empirical processes, algorithms are deeply human. They reflect the intentions and values of the individuals and institutions which design and deploy them.45 The ability of algorithms to augment existing power asymmetries gives rise to an emerging set of questions on their influence over data-driven policy-making.46 “At some point, you’re in the hands of the algorithm,” notes John Clippinger, Chief Executive Officer of the Institute for Data Driven Design, a non-profit research and educational organization. “You’re whistling in the dark if you don’t think that day is coming.”47
As the socioeconomic impact of predictive machine learning and algorithms grows stronger, long-term concerns are emerging about the concentrated set of stakeholders (who both mediate communications and have access to powerful algorithms) and their influence over individuals. These conversations centre on how data could be abused to proactively anticipate, persuade and manipulate individuals and markets. The debates are complex and value-laden, and they give rise to some fundamental societal choices. Questions of individual autonomy, the sovereignty of individuals, digital human rights, equitable value distribution and free will are all a part of these conversations. There are no easy answers.
Through this long-term lens on the impact of proactive computing, the focal point for discussion begins to shift away from personal data, per se, to computer-based profiles of individuals and groups of individuals.48 These profiles, fuelled by fine-grained behavioural and sensor data, make it possible to monitor, predict and instrument social phenomena at the micro and macro levels. Noted legal scholar Mireille Hildebrandt writes: “What we need is a complementary focus on the dynamically inferred group profiles that need not be derived from one’s personal data at all, but may nevertheless contain knowledge about the probability of one’s intentions, affiliations, risk taking and behaviours.”49 Alexander Pentland of the MIT Media Lab also notes: “Individuals are largely determined by their social context. One can tell all sorts of things about a person, even though it’s not explicitly in the data, because people are so enmeshed in the surrounding social fabric.”50
Accountable algorithms: Key questions for strengthening trust
How significant and likely are the intended consequences of the algorithm? How many people might be affected (or perceive an effect)? Who holds the risk if things go wrong?
Are there errors that may be acceptable to the algorithm creator, but not the public? If so, who decides what’s fair? Why was the algorithm tuned that way?
How might the algorithm steer public attention and perceptions in meaningful ways?
Is the algorithm’s output lawful and consistent with social norms? If not, what’s driving that inconsistency: a bug, an incidental programming decision, or a deep-seated design intent?
What are the risks of transparency? Would publishing an algorithm negatively affect any individuals? Would it help those looking to game the system and establish an unfair advantage?
Source: Nicholas Diakopoulos, Algorithmic Accountability Reporting, Tow Center for Digital Journalism, 2013.
The world of “smart” environments, where cars, toothbrushes, toasters, eyeglasses and just about everything else coalesce into the Internet of Things, creates a sea change in how data will be processed. Rather than being based on “interactive” human-machine computing, smart environments rely upon “proactive computing”.51 By design, these proactive environments are one step ahead of individuals. Connected cars need to anticipate accidents before they happen. Alerting systems for public health need to spot the spread of infectious diseases before they reach scale. Evacuating flood-prone areas needs to occur before major storms hit.
The emphasis on proactive computing will change the role of human intervention from a governance perspective. Because humans lack a full understanding of how complex systems work, their ability to understand, make decisions and adapt can be too slow, incomplete and unreliable. In this brave new world, building trust from the “principles up” will be essential and require new forms of governance that are open, inclusive, self-healing and generative.52
From a community and societal perspective, as civil “regulation-by-algorithm” begins to scale,53 incumbent interests and power asymmetries will play an increasing role in establishing who gets access to an array of commercial and governmental services. As such, there is a need to ensure that the algorithms driving proactive and anticipatory decisions will be lawful, fair and can be explained intelligibly. Meaningful responses must be given “when individuals are singled out to receive differentiated treatment by an automated recommendation system”.54
As Viktor Mayer-Schoenberger and Kenneth Cukier note in their 2013 book Big Data: A Revolution That Will Transform How We Live, Work and Think, a new class of professional is needed who can act as reviewers of big-data analysis and predictions. Part mathematician, statistician, computer scientist and data ethicist, these impartial individuals would function much like accountants and evaluate such things as the selection of data sources, the algorithms and the intended impact on identified individuals or communities.
One emerging set of concerns is the institutional ability “to discover and exploit the limits of an individual’s ability to pursue their own self-interest.”55 Given that a majority of consumer interactions in the future will be mediated via devices and commercially oriented communications platforms, data-centric institutions will have the means and incentives to trigger “predictable irrationality” from individuals.56
With a vast trail of “digital breadcrumbs” accessible for companies to mine and use to tailor highly personalized experiences, a growing set of concerns is arising about how individuals could be profiled and targeted at moments of key vulnerability (decision fatigue, information overload, etc.) in ways that limit their ability to act with agency and in their own self-interest.57 With the lives of individuals becoming increasingly mediated by algorithms, a richer understanding is needed of how people adapt their behaviours to empower themselves and gain more control over how profiles and algorithms shape their lives in areas such as credit scores, retail experiences, differential pricing, reputational currencies, insurance rates, etc. As the New York Times R&D Lab has noted: “As algorithmic systems become more ubiquitous and impactful, what behaviours and strategies emerge to optimize, control, obscure, or otherwise manipulate the data that we emit?”58
In this light, one of the most provocative and strategic insights on strengthening trust that emerged from the global dialogue was the concept of exploring ways to share the intended consequences of data usage with individuals. Participants cited language in the 2012 Draft European Data Protection Act (section 20), which calls for “the obligation for data controllers to provide information about the envisaged effects (emphasis added) of such processing on the data subject”.59
To address this emerging set of concerns, establishing a cross-disciplinary community of forward-looking experts, complexity scientists, biologists, policy-makers and business leaders with an appreciation of the long-term societal impact was identified as a priority. This group would proactively help design and test systems that balance the commercial, legal, civil and technological incentives shaping outcomes at the individual and social level. They would need to develop some form of legal protection to limit liabilities and provide a safe space to explore complex issues in a real-world setting. One attribute of this safe space would be for it to be governed by an institutional review board where ethics and the interests of individuals could have a meaningful and relevant voice (similar to how such boards are used in the biomedical and behavioural science sectors). Institutions concerned about legal uncertainties, regulatory action or civil lawsuits could have a richer means for assessing ethical concerns using these approaches.60