Privacy concerns are raised more often as applications built on the Web platform have access to more sensitive data – including location, health and social network information – and users’ activity on the Web is ubiquitously tracked. – Web Privacy (WWW2017)
Managing big data with privacy in mind
BDE = Big Data Europe project
What is big data?
Data too big to process with a single computer
The three v’s
Big Data is often described by the three v’s:
- Velocity – lots of data coming in fast (real time traffic sensors)
- Volume – lots of data – interstitial data from satellites
- Variety – data comes in a wide variety of formats. Variety is often ignored.
Handling volume and velocity
- use specialized software: HDFS, Spark, Flume
- parallel computing
BDE provides these and other software as Docker containers -> easy to deploy on a swarm of machines -> scalable to many nodes.
Variety is often ignored, but the Big Data Europe project addresses it.
Metadata (RDF) is used to describe the various data types and their purpose. RDF and SPARQL (the query language for RDF) allow the data to be converted and translated: a SPARQL query is translated into a query that works with each individual data set.
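The idea of metadata-driven query translation can be illustrated with a toy sketch. It uses plain Python dictionaries rather than RDF and SPARQL, and every dataset name, field name and the `translate_query` helper are assumptions made up for illustration; this is not how BDE implements it.

```python
# Hypothetical metadata: maps shared vocabulary terms to each dataset's own field names.
DATASET_METADATA = {
    "traffic_sensors": {
        "format": "csv",
        "fields": {"timestamp": "ts", "location": "sensor_pos"},
    },
    "satellite_images": {
        "format": "json",
        "fields": {"timestamp": "acquired_at", "location": "footprint"},
    },
}


def translate_query(shared_fields, dataset):
    """Rewrite a list of shared vocabulary terms into one dataset's local field names."""
    mapping = DATASET_METADATA[dataset]["fields"]
    return [mapping[field] for field in shared_fields]


# One query, phrased in the shared vocabulary, translated for every dataset.
query = ["timestamp", "location"]
translated = {name: translate_query(query, name) for name in DATASET_METADATA}
print(translated)
```

The point of the sketch is only that the person asking the question writes it once, and the metadata carries the knowledge needed to rephrase it per data set.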
Who owns the data
Metadata about permissible actions on data could be expressed in ODRL (Open Digital Rights Language), which is a W3C Recommendation.
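As a rough illustration of what such metadata might look like, the sketch below builds a minimal ODRL policy in its JSON-LD form and checks a request against it. The `example.com` identifiers are placeholders, and `is_permitted` is a hypothetical helper that only does exact matching; a real ODRL evaluator would also handle prohibitions, duties, constraints and the action hierarchy.

```python
import json

# Minimal ODRL "Set" policy in JSON-LD form; the example.com identifiers are placeholders.
policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Set",
    "uid": "http://example.com/policy:1010",
    "permission": [
        {"target": "http://example.com/asset:9898.movie", "action": "use"}
    ],
}


def is_permitted(policy, target, action):
    """Hypothetical helper: True if a permission rule matches target and action exactly."""
    return any(
        rule["target"] == target and rule["action"] == action
        for rule in policy.get("permission", [])
    )


print(json.dumps(policy, indent=2))
print(is_permitted(policy, "http://example.com/asset:9898.movie", "use"))
print(is_permitted(policy, "http://example.com/asset:9898.movie", "distribute"))
```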
SPECIAL (Scalable Policy-awarE Linked Data arChitecture for prIvacy, trAnsparency and compLiance) is a new project within W3C.
- Builds on BDE
- Privacy added by means of metadata
- A model of the EU General Data Protection Regulation (consent, anonymization, erasure, etc.), which should come into force in 2018
It will also include tools to import privacy policies, anonymize data, visualize and explain policies, and version policies.
- Special Consortium
- Big Data Europe
- Slides for this presentation: Managing Big Data with privacy in mind (.pdf)
Privacy concerns related to verifiable claims
By David Wood
- Prove age to purchase alcohol
- Prove income to buy house
Showing your ID at a bar proves your identity and age. You’ve also given them your blood type, address, etc. They don’t need that information. They only needed verification of age and identity.
Handing over a credit card could also leak bank account and credit card details. Together, these could be used to begin identity theft.
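The bar example can be sketched as minimal disclosure: the holder derives a yes/no answer from the full record and shares only that. This is an illustrative sketch under stated assumptions, not the Verifiable Claims data model. The record fields and the `minimal_age_claim` helper are hypothetical, and a real claim would also carry an issuer's signature so the verifier can trust it.

```python
from datetime import date

# Hypothetical full record held on an ID card; field names are illustrative only.
id_card = {
    "name": "Alice Example",
    "birth_date": date(1990, 5, 17),
    "address": "1 Example Street",
    "blood_type": "O+",
}


def age_on(birth_date, today):
    """Full years elapsed between birth_date and today."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def minimal_age_claim(record, minimum_age, today):
    """Return only the yes/no answer the verifier needs, not the underlying record."""
    return {
        "age_at_least": minimum_age,
        "satisfied": age_on(record["birth_date"], today) >= minimum_age,
    }


claim = minimal_age_claim(id_card, 21, today=date(2017, 4, 6))
print(claim)
```

The bar learns that the threshold is met, but never sees the birth date, address or blood type.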
Failure is less attributable to either insufficiency of means or impatience of labours than to a confused understanding of the thing actually to be done.
There is no such thing as absolute privacy in America
James Comey, FBI Director
Law-abiding citizens value privacy. Terrorists require invisibility. The two are not the same, and they should not be confused
Configuration options are where arguments go to die
Verifiable claims should not make the situation worse but make it better where it can.
- Cross-site tracking of credentials
- Verifiable claims are identifier-agnostic
- Use local identifiers when possible
- Accept the risks where agencies must collude
- A single “identity” per person?
How does an “identity” relate to a Verifiable Claims “entity”?
- Are “identity profiles” sufficient protection?
- How do identity profiles relate to an individual “entity”?
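The "use local identifiers when possible" point above could be approached with pairwise identifiers: the holder presents a different but stable identifier to each verifier, so two sites cannot match records on the identifier alone. The sketch below is one possible approach and not something prescribed in the talk. It derives identifiers with HMAC-SHA256, and the secret and verifier names are placeholders.

```python
import hashlib
import hmac


def local_identifier(holder_secret: bytes, verifier: str) -> str:
    """Derive an identifier that is stable per verifier but differs between verifiers."""
    digest = hmac.new(holder_secret, verifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder secret; in practice this would be random and kept private by the holder.
secret = b"holder-secret-kept-private"

id_for_bar = local_identifier(secret, "bar.example")
id_for_bank = local_identifier(secret, "bank.example")

print(id_for_bar != id_for_bank)                              # differs per verifier
print(id_for_bar == local_identifier(secret, "bar.example"))  # stable for one verifier
```

Local identifiers do not stop verifiers from correlating on other shared attributes, such as a name or address, which is why they only help alongside minimal disclosure.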
by Katryna Dow, Meeco – The API of me
We are on the edge of a [r]evolution
It’s not just about privacy, it’s about power. Privacy is not dead or over. It’s under threat.
What we have:
- complex experiences
- limited accessibility
- broken trust
What we want:
- direct exchange
- shared access
- More powerful
- … but also morphing to get closer to the person: on the body and in the body
Current business models: collecting, storing, and monetizing personal information
We need personas to represent the multiple versions of our identity.
Digital birth is on average 6 weeks from birth for today’s children.
We are moving from products and services to outcomes and experiences.
We want: Minimal Viable Collection while achieving Maximum Viable Access.
The moment of entering a night club should not exist beyond the brief exchange of information from customer to bouncer.
- smart contracts
- if this then that
Meeco products and services
Meeco Labs: CIAM customer centric, context driven, consent based
- Linking bank and telco to create id eco-system for faster on-boarding in minutes. Building trust and creating mutual value.
- Point of sale protection in less than 3 minutes. Integrating retail + insurance + data collection + warranty
Is there a private future?
Privacy is not anonymity
Privacy is where you are known, but you get something for it.
Privacy online is where you can be anonymous, but you don’t get the desired services. So the average person doesn’t do it.
- Don’t tell anyone… until it isn’t important, about that bit, that I told you, who doesn’t understand me, who gossips
- I’ll tell if… the police ask, you’ll give me something, you won’t tell anyone, I can trust you
- Hard to express concisely, clearly, and accurately in natural language, let alone to design and successfully use appropriate formalisms
Privacy is not free
Your information – your privacy – your information – has value
Where are you?
Location data was one of the first things the W3C put into the web that triggered great privacy concerns. Where I am could reflect my associates and affect my safety.
Semantic Web Developer Map: representing locations of people, research groups and projects: foaf:nearestAirport
Who are you?
We know your:
- age, sex, location
- are you a bot
- income, employer, family status
- professional and personal interest
What could we do?
- Systems people understand
- What information identifies you?
- Forget some of that, please. But you already got services for it. You’ve gotten the value, do you owe your privacy?