A conversation about the General Data Protection Regulation (GDPR)
The GDPR is a new(ish) piece of European Union legislation that regulates the processing of personal data when the person processing or controlling the data is in the EU, even if the actual processing occurs outside of the EU. Further, the GDPR also sometimes regulates the processing of personal data of people who are in the EU, even if the persons doing the processing are outside of the EU.
How does this affect neuroimaging? We sit down with neuroimaging expert and Open Brain Consent co-author Dr Cyril Pernet (CP) and technology law expert Dr Mireille Caruana (MC) to discuss the implications of this law on our work.
The article alternates between the terms “participants” and “data subjects”: “data subject” is the term used in the GDPR, but for the purposes of this article you can think of the two as equivalent.
What follows is a summary of our conversation, edited for conciseness and clarity.
Who are our experts?
CP: I do a lot of method development in neuroimaging and in a clinical context. Data sharing is something that I have always been happy to work towards. Data sharing is like code sharing, we need it for good science. With the advent of the GDPR, we've got some extra constraints on what to share and how to share.
In the clinical context, the typical thing to say is: “Oh, you know, we have patients’ data, therefore, privacy issues,” and people don't even try to share. This really annoys me because there are ways we can do it. It doesn't have to be completely open on the web so that everybody can download it. I've been working on all sorts of open science related projects and the Open Brain Consent is part of that line of work.
MC: I am the head of the Media, Communications and Technology Law Department within the Faculty of Laws at the University of Malta and my research has, since before the GDPR, focused on privacy and data protection issues. I would not contradict you that the GDPR is a relatively new law that has, from the start, been subject to a lot of uncertainty and difficulty in implementation and application. It's well worth working our way through the legislation to seek correct interpretations of it.
Why is it important to discuss GDPR across disciplines?
CP: We are scientists; when we read the GDPR text, we don’t understand the implications. We do not know how judges will interpret the law. This means that we need lawyers to guide us on how to interpret what is written there.
MC: The problem is that in many instances there aren’t clear answers. In fact, while a lawyer may give legal advice, it may eventually be contradicted by a court. Nevertheless, scientists should behave as diligently and carefully as possible. If the perception of the GDPR ends up restricting research or not allowing researchers to do their work, that's a problem. It shouldn't be that way. But achieving this balance is very difficult.
Anonymous data are not governed by the GDPR. Do you think there's anything within neuroimaging that can be considered anonymous?
CP: In my opinion, one of the key points in the GDPR that is relevant to neuroimaging is that neuroimagers are able to single out individuals from datasets, which makes the data identifiable. And I'm not just talking about brain structure data, I am also talking about EEG data, MEG data, etc. With connectome matrices and a few tasks you can single out individuals, so any imaging data should be considered identifiable. Others disagree with me and argue that singling out is not strictly identifiability, while I contend the opposite because the GDPR indicates that singling out is a prerequisite to identification.
This is a key difference between North American legislation and the GDPR. While North American legislation differentiates between anonymised data, pseudonymised data and identifiable data, the GDPR only distinguishes between anonymised data and identifiable data. Pseudonymisation is just a process. Data can go through that process without changing their status as identifiable. That is, we can remove the face, ID, etc., but brain imaging data remain identifiable, in that we can potentially distinguish between individuals and, even if we don’t have the metadata, link those data to someone by name.
Can we have an example?
CP: Imagine, for instance, that we have two independent datasets consisting of connectome matrices and tasks. There may be individuals who have been participating in each of those datasets. So we can now think about linking them and studies have indeed shown that it is possible to say that the same individual belongs to both datasets, because of the way connectomes look. Not only can we single out people within datasets, but we can also link datasets, and possibly by adding associated metadata we are getting even closer to identifying that person in the real world.
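The linking idea described above can be illustrated with a minimal sketch in Python. It is not the method of any particular study: it assumes simulated data, in which each participant has a stable connectivity pattern plus independent noise per dataset, and it matches connectomes by simple Pearson correlation of their edges. The function names (edge_vector, link_datasets, simulate) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def edge_vector(connectome):
    """Return the unique off-diagonal edges of a symmetric matrix as a vector."""
    rows, cols = np.triu_indices(connectome.shape[0], k=1)
    return connectome[rows, cols]


def link_datasets(dataset_a, dataset_b):
    """For each connectome in dataset_a, find the most similar one in dataset_b."""
    a = np.array([edge_vector(c) for c in dataset_a])
    b = np.array([edge_vector(c) for c in dataset_b])
    # Z-score every edge vector so the scaled dot product is a Pearson correlation.
    a = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    b = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    similarity = a @ b.T / a.shape[1]
    return similarity.argmax(axis=1)


def simulate(n_subjects=20, n_regions=30, noise=0.5, seed=0):
    """Simulated data only (an assumption): stable pattern + per-dataset noise."""
    rng = np.random.default_rng(seed)

    def random_symmetric(scale):
        m = rng.normal(scale=scale, size=(n_regions, n_regions))
        return (m + m.T) / 2

    traits = [random_symmetric(1.0) for _ in range(n_subjects)]
    dataset_a = [t + random_symmetric(noise) for t in traits]
    # The second dataset stores the same people in a different, unknown order.
    order = rng.permutation(n_subjects)
    dataset_b = [traits[i] + random_symmetric(noise) for i in order]
    return dataset_a, dataset_b, order


dataset_a, dataset_b, order = simulate()
matches = link_datasets(dataset_a, dataset_b)
true_matches = np.argsort(order)  # position of each participant in dataset_b
accuracy = float(np.mean(matches == true_matches))
print(f"Linked {accuracy:.0%} of simulated participants across datasets")
```

No names or metadata are used anywhere in this sketch; the match relies on the imaging-derived matrices alone, which is the point CP makes about singling out.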
Are there any proposed solutions for this problem?
CP: The solutions that we have come up with are detailed in Open Brain Consent and involve two consent forms as well as a data user agreement for data collected in the EU. Of the two consent forms, one is the consent for the study and the other one is consent for people to share their data. The way you can legally share this is through a data user agreement, not through a licence, which means we ‘control’ who has access and to a lesser extent what can be done with the data. Now the control can be done in a way where people register to use specific datasets. For example, the Netherlands has a good system because every researcher is registered on a database. So for instance, if you log into the system of a particular institute, they know who you are, which institution you are affiliated with, and you can just download data, even if you're not part of the data-holding institute. This is possible because they can identify you. You can sign the data user agreement with a simple click.
A user agreement also helps researchers share data outside of the European Union. The GDPR refers to this as “standard contractual clauses.” This allows you to get to a point where non-EU researchers can download the data and become the data controller. With the data user agreement, the downloader agrees with the terms of the GDPR. This way you can share data anywhere in the world, even outside the EU. But you cannot just put your data up on OpenNeuro. This is important since OpenNeuro servers reside in the US, and the US is special because it is not considered to be a “safe country” by the EU. Institutions can sign an agreement with the EU to become a safe repository. But that also means OpenNeuro would have to change their infrastructure to support data user agreements.
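As a rough illustration of the registered-access model described above, the following Python sketch gates downloads on two conditions: the repository knows who the researcher is, and the researcher has accepted the current data user agreement. The classes, field names and version string are all hypothetical; this is not how any existing repository is implemented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Researcher:
    user_id: str
    institution: str
    identity_verified: bool  # e.g. confirmed through an institutional login


@dataclass
class Dataset:
    name: str
    agreement_version: str
    # user_id -> (agreement version accepted, time of acceptance)
    accepted: dict = field(default_factory=dict)

    def accept_agreement(self, researcher):
        """Record the 'simple click' that signs the data user agreement."""
        self.accepted[researcher.user_id] = (
            self.agreement_version,
            datetime.now(timezone.utc),
        )

    def may_download(self, researcher):
        """Allow access only to identified users who signed the current agreement."""
        if not researcher.identity_verified:
            return False
        record = self.accepted.get(researcher.user_id)
        return record is not None and record[0] == self.agreement_version


dataset = Dataset(name="example-study", agreement_version="v1")
alice = Researcher("alice", "Example University", identity_verified=True)

before = dataset.may_download(alice)  # agreement not yet signed
dataset.accept_agreement(alice)
after = dataset.may_download(alice)   # identified and agreement signed
print(before, after)
```

Storing the accepted agreement version alongside a timestamp means that access lapses automatically if the agreement text is revised, which is one possible design choice rather than a requirement stated in the interview.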
Where does consent come into all of this? Could I just get consent from my participant to share all of my data in the US, and the rest of the world?
MC: In the GDPR, sharing or transferring data is considered to be a type of processing. Let's forget about how the original data were collected and focus on the sharing of these data. In this case, you should still have a legal basis for processing in the GDPR. I am also assuming that they're sensitive personal data, since I am assuming that they say something about an individual’s health status.
Article 9 of the GDPR has a legal basis specifically for research data processing. So perhaps you don't need to rely on consent to share data because there is another legal basis which speaks about the necessity for scientific research. However, this legal basis is somewhat unclear in its application because it speaks about individual member states laying down a law that provides appropriate safeguards.
With regard to data transfers to a third country such as the US, Chapter V of the GDPR concerns transfers of personal data to third countries or international organisations. According to Article 45, transfer of personal data to a third country may take place where the EU Commission has decided that the third country, or one or more specified sectors within that third country, ensures an adequate level of protection. Such a transfer does not require any specific authorisation. In the absence of an adequacy decision, a controller or processor may transfer personal data to a third country only if the controller or processor has provided appropriate safeguards, and on condition that enforceable data subject rights and effective legal remedies for data subjects are available.
Under Article 49, in the absence of an adequacy decision, or of appropriate safeguards, a transfer or a set of transfers of personal data to a third country may take place only on one of a set of stated conditions, which include that “the data subject has explicitly consented to the proposed transfer, after having been informed of the possible risks of such transfers for the data subject due to the absence of an adequacy decision and appropriate safeguards”.
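The order in which these transfer conditions are considered can be sketched as a small Python function. This is a simplified reading of the passage above, not legal advice or a complete model of Chapter V: the function and enum names are hypothetical, the safeguards route is attributed to Article 46 (which the text above describes without naming), and explicit consent is the only Article 49 condition represented.

```python
from enum import Enum


class TransferBasis(Enum):
    ADEQUACY_DECISION = "Article 45: adequacy decision"
    APPROPRIATE_SAFEGUARDS = "Article 46: appropriate safeguards"
    EXPLICIT_CONSENT = "Article 49: explicit consent after being informed of the risks"
    NONE = "no transfer basis identified"


def transfer_basis(has_adequacy_decision,
                   has_appropriate_safeguards,
                   has_explicit_informed_consent):
    """Return the first applicable basis, checked in the order given in the text."""
    if has_adequacy_decision:
        return TransferBasis.ADEQUACY_DECISION
    if has_appropriate_safeguards:
        return TransferBasis.APPROPRIATE_SAFEGUARDS
    # Only one of the Article 49 conditions is modelled here (an assumption).
    if has_explicit_informed_consent:
        return TransferBasis.EXPLICIT_CONSENT
    return TransferBasis.NONE


# Example: no adequacy decision, but a signed agreement providing safeguards.
basis = transfer_basis(
    has_adequacy_decision=False,
    has_appropriate_safeguards=True,
    has_explicit_informed_consent=False,
)
print(basis.value)
```

Whether a given agreement or consent form actually satisfies one of these conditions is a legal question for a data protection officer; the sketch only captures the fallback order.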
How do we deal with requests for deletion of data?
MC: Article 17(2) of the GDPR states that “Where the controller has made the personal data public and is obliged pursuant to paragraph 1 to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure by such controllers of any links to, or copy or replication of, those personal data.” It talks about reasonable steps that would, by way of good practice, mean keeping a record of the people who accessed the data and contacting them to inform them about the request.
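The record-keeping practice mentioned above can be sketched in Python as a download log that is queried when an erasure request arrives. Everything here is hypothetical (the Download record, the field names, the example addresses and the wording of the notice); it only illustrates how such a log would make it possible to contact the right people.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Download:
    dataset: str
    participant_ids: frozenset  # pseudonymous IDs included in this download
    contact_email: str


def erasure_notices(download_log, dataset, participant_id):
    """Build one notice per contact who downloaded the participant's data."""
    recipients = sorted({
        entry.contact_email
        for entry in download_log
        if entry.dataset == dataset and participant_id in entry.participant_ids
    })
    message = (
        f"Participant {participant_id} in dataset '{dataset}' has requested "
        "erasure. Please delete any copies of their data that you hold."
    )
    return [(email, message) for email in recipients]


log = [
    Download("example-study", frozenset({"sub-01", "sub-02"}), "a@example.org"),
    Download("example-study", frozenset({"sub-02"}), "b@example.org"),
    Download("other-study", frozenset({"sub-01"}), "c@example.org"),
]

notices = erasure_notices(log, "example-study", "sub-01")
for email, text in notices:
    print(email, "->", text)
```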
How long can we store data?
CP: You are required to set a time frame within which you must review the need for continued storage of the data. However, if the data keep being necessary, the data can be kept indefinitely.
Is it true that under the GDPR, legally, you're not allowed to reuse your own data in your own lab to answer different questions than what it was originally collected for?
MC: The GDPR speaks about purpose limitation (“personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes”) and ‘specific’ consent (“‘consent’ of the data subject means any freely given, specific, informed and unambiguous indication of the data subject’s wishes…”). So ideally, I think even ethically, your research participants should understand how you're going to use their personal data; but no, research is treated in a particular manner under the GDPR. Research is not considered to be incompatible with the original purpose for data collection (“further processing for ... scientific ... research purposes ... shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes”).
Furthermore, recital 33 of the GDPR clarifies “It is often not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection. Therefore, data subjects should be allowed to give their consent to certain areas of scientific research when in keeping with recognised ethical standards for scientific research. Data subjects should have the opportunity to give their consent only to certain areas of research or parts of research projects to the extent allowed by the intended purpose.” So, legally, you may be covered, even though the debate surrounding so-called ‘broad consent’ is not conclusive (cf. for example the Article 29 Working Party’s Guidelines on consent under Regulation 2016/679).
CP: In my opinion, “research” as a purpose is not specific enough. But if you say the purpose is “memory”, that's too specific, because that way you could not even use a T1w image to create a template. So, we came up with a compromise. If you look at the Open Brain Consent GDPR edition, our solution is to say that, for instance, the purpose of conducting the study is one thing, but also that the data may be used for future research projects in the field of medicine and cognitive neuroscience, which strikes the balance.
MC: Article 5 (1) (b) of the GDPR states that “personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes; further processing for ... scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes”
This gives researchers quite a bit of flexibility. This is maybe one area where law and ethics overlap. The debate within research on genetic data that I have come across when dealing with biobanks, is that people speak of dynamic consent and they want to use dynamic consent to have more granular consent for specific projects. The thinking behind this is that certain people might object morally to particular research. So of course, you're being more respectful to the data subject if you don't use the data in ways that they would not approve of. Specific, granular consent is in line with the spirit of GDPR, but I don't think that the GDPR excludes such broader consent for scientific purposes.
The GDPR refers to data minimisation. How do you guarantee that we don't collect data that are unnecessary?
CP: This is something that we also struggled with. On one hand, you would want to be able to collect participants' data and typically, in my lab, we go through a bunch of health questionnaires, handedness, medical history, language, etc., because, of course, you can then reuse these data in a larger dataset. You've got 100 different studies, but for each participant, you have the common six questions, and you can do a nice big analysis. You could possibly connect these studies and perform richer analyses. What is the balance? We know that this may be the only way to aggregate enough data from multiple studies to then have a study that is powerful enough to look at the effect of some type of medication.
MC: Unfortunately, I think that this is an outstanding difficulty or problem because as a researcher you may not know exactly what you're looking for; for example, what analysing the patterns may reveal. It is a known tension in the GDPR that may also go against the purpose specification principle. So I think it's a tension that is real. I would however always emphasise in such contexts that the sole purpose for processing these data is in fact scientific research, that there may be uncertainty associated with research, but that there is also an important public good to be gained from such research that affects the balance to be achieved between the different competing interests, including the privacy and data protection rights of the data subjects.
What are the next steps?
CP: I think the next steps are twofold. One is for neuroimagers to engage with their own institutional repositories. We need to work with them and with data protection officers to come up with solutions for sharing data. You need to explain what systems need to be in place and how to implement them. We do have power because we do receive money from funders who often actively ask us to share the data. And it is the university’s job to provide us with the tools to be successful in funding applications and to comply with funders.
The other aspect is more ambitious. There are systems that work under the architecture of any repository to index them, so that for instance, every university in Europe could very well have their information connected. But this would necessitate that all universities cooperate with each other. It's more like a dream.
I am also very keen on making sure that everyone reading this interview knows about all the efforts of the Open Brain Consent project. I would like to highlight all of the hard work put in by many, in particular Chris Gorgolewski and Yaroslav Halchenko who started the project, Stephan Heunis and Peer Herholz who organize work on this during the Organisation for Human Brain Mapping (OHBM) hackathon, and all the people who helped by sharing their consent forms and experience and by proposing translations (now available in 12 languages) thanks to the COST association support (GliMR2). Note that we are keen on having more people involved, in particular having and sharing more information about how these issues are dealt with in countries from the Global South that are currently under-represented.
You can find more details on the Open Brain Consent website.