blog

git clone https://git.ce9e.org/blog.git

commit
eb0cbd174247492eeec48206911f2336f8ec9327
parent
3d5dac8871280a6d6075afb3ec30cb4568ba94da
Author
Tobias Bengfort <tobias.bengfort@posteo.de>
Date
2024-03-23 14:00
post: beyond gdpr

Diffstat

A _content/posts/2024-03-22-beyond-gdpr/index.md 201 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

1 files changed, 201 insertions, 0 deletions


diff --git a/_content/posts/2024-03-22-beyond-gdpr/index.md b/_content/posts/2024-03-22-beyond-gdpr/index.md

@@ -0,0 +1,201 @@
   -1     1 ---
   -1     2 title: Beyond GDPR
   -1     3 date: 2024-03-22
   -1     4 tags: [privacy]
   -1     5 ---
   -1     6 
   -1     7 When the General Data Protection Regulation (GDPR) came into effect throughout
   -1     8 the EU in 2018, it pushed the boundaries of privacy regulation worldwide. It
   -1     9 enshrined principles such as *data minimisation* or the right to *data
   -1    10 portability* into law.
   -1    11 
   -1    12 In my work I often deal with the GDPR. And while I honestly think it is a great
   -1    13 step forward, I also have some grievances. So in this article I am trying to
   -1    14 explore what I would like to see in the next iteration of privacy regulation.
   -1    15 
   -1    16 Obvious disclaimer: I am not a lawyer and have no clue what I am talking about.
   -1    17 
   -1    18 ## What is personal data?
   -1    19 
   -1    20 [Art. 4](https://gdpr-info.eu/art-4-gdpr/) and [Recital
   -1    21 26](https://gdpr-info.eu/recitals/no-26/) define that data is *personal* data
   -1    22 if it can be linked to a natural person. [Art.
   -1    23 9](https://gdpr-info.eu/art-9-gdpr/) defines what "special categories" of
   -1    24 personal data are.
   -1    25 
   -1    26 I have several issues with this definition:
   -1    27 
   -1    28 -	Whether data can be linked to a natural person is not always clear cut. For
   -1    29 	example, consider IP addresses. Those are usually handed out by ISPs who have
   -1    30 	contracts with their users. So by checking the ISP's database you could link an
   -1    31 	IP address to a natural person. But is it reasonable to assume that you can
   -1    32 	access that database? To quote Recital 26:
   -1    33 
   -1    34 	> To ascertain whether means are reasonably likely to be used to identify
   -1    35 	> the natural person, account should be taken of all objective factors, such
   -1    36 	> as the costs of and the amount of time required for identification, taking
   -1    37 	> into consideration the available technology at the time of the processing
   -1    38 	> and technological developments.
   -1    39 
   -1    40 	I understand why this is so wishy-washy, but that doesn't change the fact
   -1    41 	that it is.
   -1    42 
   -1    43 -	Data about one person can have implications for another person. Example:
   -1    44 	Relatives share parts of their DNA. Should a single person even be allowed to
   -1    45 	decide about their own data if it affects others?
   -1    46 
   -1    47 -	The decision whether something is personal data is binary. If data can be
   -1    48 	linked to a single person, it is personal data. If it can be linked to a
   -1    49 	small group of people, say two, all of these rules no longer apply. That
   -1    50 	doesn't feel right. For example, this definition does nothing to prevent
   -1    51 	micro-targeting. An alternative approach could be
   -1    52 	[k-anonymity](https://en.wikipedia.org/wiki/K-anonymity).
   -1    53 
   -1    54 -	The sensitivity is also binary in this framework. Data either contains
   -1    55 	special categories or not. A single record of a heart attack or 50TB of MRI
   -1    56 	imagery, it's all the same to the GDPR.[^1]
   -1    57 
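To make the k-anonymity idea from the list above a bit more concrete, here is a minimal Python sketch. The record layout and the choice of quasi-identifier fields (`zip_code`, `birth_year`) are assumptions I made up for illustration; neither the GDPR nor anything else in this post defines them. A dataset is k-anonymous if every combination of quasi-identifier values is shared by at least k records, so the function simply reports the size of the smallest such group.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifier fields (the dataset's k)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical example data, not taken from any real dataset.
records = [
    {"zip_code": "48143", "birth_year": 1990, "diagnosis": "flu"},
    {"zip_code": "48143", "birth_year": 1990, "diagnosis": "asthma"},
    {"zip_code": "48145", "birth_year": 1985, "diagnosis": "flu"},
]

print(k_anonymity(records, ["zip_code", "birth_year"]))  # 1
```

The third record is alone in its group, so k is 1 and that person can be singled out. A rule based on k-anonymity could demand a minimum k instead of asking the binary question of whether a record points to exactly one person.
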
   -1    58 I don't have the perfect definition for personal data either. But the GDPR has
   -1    59 pushed the envelope once. I hope that it can do it again and introduce an even
   -1    60 better model.
   -1    61 
   -1    62 [^1]: The GDPR does have a more nuanced perspective on data sensitivity when it
   -1    63 	comes to fines (see [Art. 83](https://gdpr-info.eu/art-83-gdpr/)).
   -1    64 
   -1    65 ## Easy to understand
   -1    66 
   -1    67 I really like how the GDPR tries to be easy to understand. But I quickly found
   -1    68 things I didn't understand or that seemed outright contradictory to me. Let me
   -1    69 give you some examples:
   -1    70 
   -1    71 For most of my use cases, [Art. 6](https://gdpr-info.eu/art-6-gdpr/) boils down
   -1    72 to: "If you have a contract with someone, you can safely process their data as
   -1    73 long as it is required for the contract. For anything else, you need consent
   -1    74 that was freely given and can be revoked at any time." Clear guidelines, easy
   -1    75 to understand.
   -1    76 
   -1    77 [Art. 9](https://gdpr-info.eu/art-9-gdpr/) explains that actually, there are
   -1    78 "special categories" that follow a slightly different set of rules. Basically,
   -1    79 a contract is not enough and you always need consent.
   -1    80 
   -1    81 It would have been nice if this exception had been mentioned in Art. 6. But
   -1    82 there is also a contradiction here, right? How can I "freely give" consent that
   -1    83 is required for a contract? Say you are caught in a kafkaesque legal battle and
   -1    84 your sleazy lawyer wants to know all of your secrets. Do you really have a
   -1    85 choice in that situation? It cannot be required and freely given at the same
   -1    86 time, or am I missing something?
   -1    87 
   -1    88 [Art. 17](https://gdpr-info.eu/art-17-gdpr/) defines the *right to be
   -1    89 forgotten*. "You are allowed to demand the deletion of all your data from
   -1    90 anyone." That sounds nice, doesn't it? But it's not what that article actually
   -1    91 says. It just repeats that data processing is only allowed under specific
   -1    92 conditions, and that your data must be deleted if those conditions are no
   -1    93 longer met. I honestly don't know why this article exists; it just seems so
   -1    94 redundant.
   -1    95 
   -1    96 Maybe this article is meant to clarify some gaps in the previous rules, e.g.
   -1    97 that withdrawing your consent by default only affects future data processing,
   -1    98 and that you can demand deletion of already existing data in addition to that.
   -1    99 But even then I find it weird that these clarifications come several articles
   -1   100 later instead of simply providing a complete definition of consent from the
   -1   101 start.
   -1   102 
   -1   103 [Chapter 9](https://gdpr-info.eu/chapter-9/) then goes on to list a whole lot
   -1   104 of additional exceptions. Or rather, it lists cases in which national law might
   -1   105 override the GDPR. So in order to know whether any of this applies you have
   -1   106 to check the entire national law.
   -1   107 
   -1   108 I am sure there are explanations for everything I don't understand. I guess
   -1   109 that regulation like this has some degree of inherent complexity. But there are
   -1   110 also some obvious improvements that could be made, either by changing the
   -1   111 structure of the text or by providing auxiliary material.
   -1   112 
   -1   113 ## Restrictions on data propagation
   -1   114 
   -1   115 GDPR contains plenty of restrictions for processing data. But once someone has
   -1   116 your data, there are next to no restrictions on who can access it. If you
   -1   117 give your data to a company with 10,000 employees, all of them can now legally
   -1   118 access that data. Heck, the company can also pass the data to subcontractors.
   -1   119 
   -1   120 One of the [principles](https://gdpr-info.eu/art-5-gdpr/) of the GDPR is "data
   -1   121 minimisation", which is super important just to limit the attack surface. But
   -1   122 to my knowledge there are basically no concrete rules that actually enforce
   -1   123 this.
   -1   124 
   -1   125 As an example: A local film festival recently started to sell their tickets
   -1   126 exclusively via Eventim. Before that, it was possible to buy tickets
   -1   127 anonymously in cash. Now you have to tell Eventim what movie you want to see. It
   -1   128 is reasonable to assume that they are hosting their databases on AWS, so the
   -1   129 whole of Amazon can probably also see that. And the GDPR doesn't protect you
   -1   130 from any of it.
   -1   131 
   -1   132 ## Focus on principles instead of compliance
   -1   133 
   -1   134 The GDPR is based on some truly great principles, for example:
   -1   135 
   -1   136 -	[data minimisation](https://gdpr-info.eu/art-5-gdpr/): You can only process
   -1   137 	data if it is required for a given purpose, must not use it for anything but
   -1   138 	that purpose, and need to delete it once that purpose has been fulfilled.
   -1   139 -	[data portability](https://gdpr-info.eu/art-20-gdpr/): You can freely migrate
   -1   140 	from one platform to another and take all your data with you.
   -1   141 -	a very progressive [definition of consent](https://gdpr-info.eu/art-7-gdpr/)
   -1   142 	that requires plain language and even considers an imbalance of power.
   -1   143 
   -1   144 Unfortunately, none of that really materialized. The GDPR should have smashed
   -1   145 targeted advertising and centralized social media. Instead, companies were told
   -1   146 that they can continue as before as long as they fill out some paperwork and
   -1   147 add cookie banners to their websites.
   -1   148 
   -1   149 Some time ago I saw a website that had been built by a young colleague of mine
   -1   150 (I won't name names). It had no cookies. It had a cookie banner. They had come
   -1   151 up in a world where every "respectable" website had a cookie banner, so they
   -1   152 thought that having one was a legal *and aesthetic* requirement.[^2]
   -1   153 
   -1   154 I am not sure what exactly went wrong here. The power of advertising companies
   -1   155 such as Google and Facebook certainly played a role. But I also blame the EU.
   -1   156 With the benefit of hindsight, I hope that they can come up with a better
   -1   157 communication strategy next time around.
   -1   158 
   -1   159 [^2]: I understand that cookie banners are often not actually required by the GDPR, but
   -1   160 	by the [ePrivacy directive](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A02002L0058-20091219).
   -1   161 	But the point that the underlying principles got lost somewhere still holds.
   -1   162 
   -1   163 ## Wild idea: make it a tax
   -1   164 
   -1   165 Imagine if companies had to pay taxes on the size of their database.
   -1   166 
   -1   167 I can easily come up with a justification that contains enough buzzwords to
   -1   168 sway your average politician: *In these trying times full of ransomware and
   -1   169 cyber terrorism, storing any kind of data is a public security hazard. The
   -1   170 companies that are most likely to leak data should also pay the biggest part of
   -1   171 the cleanup-bill.*
   -1   172 
   -1   173 So far the GDPR concentrates on [fines](https://gdpr-info.eu/art-83-gdpr/)
   -1   174 instead of taxes. I am not well versed in the discourse around these two
   -1   175 options. But maybe it's not even that important whether this is a fine or a
   -1   176 tax. The juice is in how it is calculated:
   -1   177 
   -1   178 The fines in the GDPR can be high and they are also supposed to consider the
   -1   179 "number of data subjects affected and the level of damage suffered by them".
   -1   180 But I want something more specific. I want something like this:
   -1   181 
   -1   182 ```
   -1   183 tax
   -1   184 = base value
   -1   185 * number of unique datasets
   -1   186 * sum of sensitivity for each field
   -1   187 * number of natural people with access
   -1   188 ```
   -1   189 
   -1   190 This would explicitly incentivize corporations to keep datasets small, throw
   -1   191 away historic data, avoid highly sensitive fields, and restrict the pool of
   -1   192 users. Also note that looking at *unique* datasets would encourage a high
   -1   193 k-anonymity, something that the GDPR doesn't even consider.
   -1   194 
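As a rough sketch, the formula above could be computed like this in Python. Everything concrete here is an assumption for illustration: the function name `data_tax`, the sensitivity weights, and the base value are made up, and I read "unique datasets" as the number of distinct records.

```python
def data_tax(base_value, records, sensitivity, people_with_access):
    """Compute the hypothetical data tax sketched above.

    records: list of dicts mapping field names to values
    sensitivity: made-up weight per field name
    people_with_access: number of natural people who can read the data
    """
    # Identical records are counted once, so generalising data until
    # many rows look the same (high k-anonymity) lowers the tax.
    unique_records = {tuple(sorted(r.items())) for r in records}

    # Sum the sensitivity of every field that is actually stored.
    fields = set()
    for r in records:
        fields.update(r.keys())
    total_sensitivity = sum(sensitivity[f] for f in fields)

    return base_value * len(unique_records) * total_sensitivity * people_with_access


# Hypothetical example data and weights.
records = [
    {"email": "a@example.org", "diagnosis": "flu"},
    {"email": "b@example.org", "diagnosis": "flu"},
    {"email": "a@example.org", "diagnosis": "flu"},  # duplicate, counted once
]
sensitivity = {"email": 1, "diagnosis": 5}

print(data_tax(2, records, sensitivity, 10))  # 2 * 2 * 6 * 10 = 240
```

Dropping the `diagnosis` field would cut the result to 40, and halving the number of people with access would halve it again, which is exactly the kind of incentive I am after.
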
   -1   195 There are clearly still a lot of details that need to be worked out. I also
   -1   196 have no clue how much administrative work this would cause. But it is an idea.
   -1   197 
   -1   198 ## Conclusion
   -1   199 
   -1   200 GDPR is great, but it could be better. It especially suffers from a lack of
   -1   201 enforcement of its principles. Maybe a tax could help with that.