GDPR and Big Data


The information contained in this article is not to be understood as legal advice and should not be interpreted as such. Companies subject to the GDPR should not rely on the information contained herein and should seek legal advice from their own legal counsel or another professional legal services provider.

How the GDPR affects Big Data Analytics

The main objective of the GDPR (the new EU law governing the processing of personal data, not just its storage) is to give individuals in the EU back control over their personal data. Important points in this legislation include the need, in many cases, for individuals to consent to the use of their personal data, their right to have personal data erased, and the obligation on companies and other parties to notify those affected in the event of a personal data breach. Violations can carry hefty sanctions: fines of up to €20,000,000 or up to 4% of a company's annual worldwide turnover, whichever is higher. These fines represent a significant financial risk for companies.
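To make the scale of that risk concrete, the upper limit of the fine can be worked out with simple arithmetic. The following is a minimal sketch, assuming the higher of the two limits applies (as set out in Article 83(5) GDPR); the function name and the example turnover figures are hypothetical and chosen only for illustration. It gives the ceiling of the fine, not the amount a supervisory authority would actually impose.

```python
# Illustrative only: upper limit of a GDPR fine in the higher tier.
# The function name and sample figures are assumptions, not part of the regulation.

FIXED_CAP_EUR = 20_000_000      # fixed ceiling of EUR 20 million
TURNOVER_SHARE_PERCENT = 4      # 4% of annual worldwide turnover


def max_gdpr_fine(annual_worldwide_turnover_eur: float) -> float:
    """Return the upper limit of the fine: the higher of the two ceilings."""
    turnover_based_cap = annual_worldwide_turnover_eur * TURNOVER_SHARE_PERCENT / 100
    return max(FIXED_CAP_EUR, turnover_based_cap)


if __name__ == "__main__":
    # Hypothetical companies with EUR 100 million and EUR 1 billion turnover.
    for turnover in (100_000_000, 1_000_000_000):
        print(f"Turnover EUR {turnover:,}: ceiling EUR {max_gdpr_fine(turnover):,.0f}")
```

For smaller companies the fixed ceiling dominates; the turnover-based ceiling only becomes the relevant figure once annual worldwide turnover exceeds €500,000,000.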

Against this backdrop, managers and their analytical systems face technical, functional and organisational challenges such as:

  • Clarification of what constitutes personal data
  • 'Privacy by design' and 'privacy by default'
  • Pseudonymisation and anonymisation
  • Data quality as required by the GDPR
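Of the challenges listed above, pseudonymisation is the one most easily illustrated in code. The following is a minimal sketch of one common approach: replacing direct identifiers with a keyed hash (HMAC-SHA256) so that records can still be joined for analysis without exposing the original values. The field names, the sample record and the key are hypothetical, and this is not a statement of what the GDPR requires in any particular case. Note that pseudonymised data is still personal data under the GDPR, because anyone holding the key can re-link it; only genuinely anonymised data falls outside the regulation.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded).

    The same value and key always give the same pseudonym, so records
    can still be linked, but the pseudonym cannot be recomputed without the key.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_record(record: dict, id_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the named identifier fields pseudonymised."""
    return {
        field: pseudonymise(str(value), secret_key) if field in id_fields else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Hypothetical key and record; in practice the key would be stored
    # separately from the analytical data, with restricted access.
    key = b"example-key-kept-outside-the-warehouse"
    customer = {"email": "jane.doe@example.com", "country": "DE", "basket_value": 42.5}
    print(pseudonymise_record(customer, {"email"}, key))
```

A keyed hash is used rather than a plain hash because unkeyed hashes of e-mail addresses or customer numbers can often be reversed by simply hashing a list of candidate values.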

GDPR and data science

In practice, the GDPR affects data science and data warehousing in four areas. Firstly, it sets tighter limits on the processing of personal data and on the creation of consumer profiles. Secondly, companies that use automated decision-making technologies must give consumers a "right to explanation", meaning they must be able to explain how such decisions are reached. Thirdly, the GDPR holds companies responsible for any bias or discrimination in their automated decision-making processes. Fourthly, companies must bear in mind that existing analyses using personal data could become unlawful when the GDPR comes into effect, so they should be reviewed rather than assumed to be compliant.