Irish Government projects and the Data Protection Rake

The more I see the mindset of the Irish Civil Service around data and its potential for use (and misuse and abuse), the harder I find it to get this video out of my mind. Over the past two years, literally at every turn, an initiative has been launched which has, within a short period of time, raised questions about the fairness of obtaining personal data, the legitimacy of the purpose for processing, the scope and scale of data sharing, retention periods for data, and the governance of the data once it has been obtained.

Government Departments seem intent on continuing with poorly planned, inappropriately executed, and ill-advisedly governed initiatives. This happens even in the face of valid comment and concern from an increasingly informed and aware citizenry, and in some cases in the face of question and comment from experts in the field who are raising valid concerns based on little more than practical experience and deep professional knowledge. Questions or requests for less haste and more analysis are met with a grim determination to hit specific timelines. “This is a data protection disaster waiting to happen” is greeted with a continued roll-out of the initiative that gives rise to concern.

While Sideshow Bob illustrates the inevitable public fallout of not engaging with concerns in a constructive manner, it is the Marx Brothers who give the most apposite quote.

In Duck Soup, the following exchange takes place between the President of Freedonia (Groucho Marx) and the Ambassador of a neighbouring country on the eve of war…

Ambassador Trentino: I am willing to do anything to prevent this war.

Rufus T. Firefly: It’s too late. I’ve already paid a month’s rent on the battlefield.

On Irish Government data projects, the all too oft-repeated script now reads:

Data Protection Expert: I think this raises significant issues and may be illegal.

Government Representative: It’s too late. I’ve already paid a month’s rent on the PR agency project.

Last year it was Irish Water. This year it will be eircodes and the Primary Online Database. Both are things that have potentially great benefits for society, but both are becoming hallmarked with the rake-mark of poor planning and execution, especially when the questions of Data Protection and Privacy are considered. If the investment in PR agencies to spin the projects and manage the media once questions are asked was matched by an investment in proper design and planning for Data Protection and Data Privacy issues, there would be fewer blogs, tweets, column inches and broadcast minutes devoted to discussing the issues and asking awkward questions for the media consultants to spin.

Leaving eircodes to one side for a moment (that’s a big bucket of fish to discuss from a data quality and data privacy perspective), the ongoing roll-out of the Primary Online Database project is a classic example of valid and legitimate purposes and objectives in processing data being undermined by poor planning, execution, design, and governance around the fundamental rights issues of Data Protection and Privacy.

The Good

Our education system is broken. Scarce resources are not applied or allocated effectively. Schools have resources rationed from the Department, but underprivileged schools are unable to supplement those resources (such as psychological assessments, SNA hours, other classroom supports) to the same level as schools in middle class or more privileged areas. Children drop out of the system and drop off the radar. Having data about outcomes in education, and about social or demographic issues that might affect those outcomes, is valuable to identifying causal factors and prioritizing investment in education services and interventions in an ‘evidence based’ policy framework. Questions like: Does Timmy start primary? Does he go to secondary? Does he go straight to University or do a PLC or Further Education course? What schools did he attend? Did Timmy drop out and then re-enter as a mature student, either at 3rd level or re-entering 2nd level?

Of course, this longitudinal data is valuable. And at the granular level of the individual it is a deeply personal snapshot of the life, trials, and tribulations of little Timmy from the age of four.

This data is to be held in the Revenue Commissioners’ data centre. This is a good thing. The Revenue Commissioners have a very secure data centre. I would not automatically assume a nefarious intent in putting data that requires a high level of protection in a location that has been designed, built, and resourced to have a high level of technical security protection.

The Bad

There are three bits of bad that concern me.

Bad #1

The first bad arises where the planning and execution of this data gathering fails to consider the data subject and the context. It’s data about children. It’s data about medical and psychological conditions that a child might have (that’s Sensitive Personal Data even though the Department of Education appears to think that it isn’t). It’s data about their ethnicity, their family make-up, their socio-economic status, and a range of other factors. It’s data that is tied uniquely to them by their PPSN. It’s data that includes comments written about the child by administrators in the school, which will be written to a database in the Department of Education’s control. And that data will be held until the child is 30 years old.

Of course, documentation tells us that access to that field is going to be restricted so Department of Education staff can’t access it and only people in the school that the child is in will be able to see it. Of course, that means that anyone with administrator rights to that database can access that data. And that means that it will almost inevitably be looked at. This is despite the Department having no statistical reason for having detailed notes about students.

Bad #2

The bad goes to worse when the means for gathering the data is looked at. The data is being obtained from schools, with only a subset being asked for from parents. The schools have obtained data for a particular purpose. The Department’s purpose is a new purpose, and it is the Department’s purpose, not the school’s. So it is incompatible with the purpose for which the school originally obtained the data. Schools are being asked to provide data based on their own records, or their own guesswork about ethnicity or religion or other socio-demographic data.

Upshot: data will either not be returned, or will be inaccurate. So statistical analyses based on that data will have skew and bias that will need to be controlled for. The Department’s own pilot programme only had a 50% response rate.

  • A better option: Invest time and effort in a proper strategy for educational data management. Educate parents and guardians and school management about the purposes, benefits, and strategic objective. Seek the data from the parents of the children. (Difficulty: requires budget, means you need to have a load of key decisions made and documented up front, and you need to take time to engage with the citizenry… even the tin-foil hat wearers).

Bad #3

Another level of bad arises in the context of the sharing of the data. What data will be shared, with whom, and why, and under what controls? These are basic questions that need clear and intelligible answers. And the answers need to be understandable. And the sharing needs to be necessary and proportionate, with defined governance controls over changes to the use of that data or changes to the sharing of that data. If data is being obtained for a statistical analysis purpose, there is no operational data management purpose that would permit the sharing of that data with another entity. If the data is being obtained for both statistical analysis and planning purposes and for day-to-day operational purposes, it means that the question of who actually has access to the data on a day-to-day basis arises – notwithstanding the assurances that only a small number of people in the Statistics unit of the Department of Education would be able to access the data.

  • What data will be shared? Will it be identifiable data or will it be aggregated statistical data?
  • If identifiable data will be shared, on what basis and in what format? Will it be on a record by record basis for specific intervention in a specific case where there may be a risk to the health or welfare of the data subject? Or will it be possible to request the data for other purposes such as the investigation of alleged criminal offences?
  • If the scope of sharing changes, either in terms of the entity that data will be shared with or the format and scale of sharing, what controls are in place at the time the data is gathered to ensure that those changes are subject to an appropriate Privacy Impact Assessment?

The Ugly

There are three levels of Ugly that emerge.

Ugly #1

The first is the traditional fig leaf that is dangled on projects like this: “We have consulted with the Office of the Data Protection Commissioner”. This is the Public Sector data project equivalent of waving a hand to dismiss an inquisitive stormtrooper: “These aren’t the droids you’re looking for; move along.”

But… that is NOT the role of the Data Protection Commissioner. Their role is not to advise that an organisation is compliant. Their role in the context of a Prior Consultation process is to flag any glaring issues of non-compliance that would need to be addressed. All too often their advice is ignored by organisations. Their role then becomes one of investigation and prosecution should the mechanisms of processing that are implemented breach the Data Protection Acts.

In a prior consultation process, the DPC’s comments are made based on the information provided to them at that time. Their assessment is based on the quality of the information, the detail of the proposed processing, the assessment of risk, and their ability to follow the proposal that they have been given. And they can get it wrong based on that information. And the Data Controller who goes to them for a prior consultation process might misunderstand what is being asked of them or implement a system that doesn’t match what is actually needed. So, on foot of a complaint, the DPC may find that a particular instance of processing does actually breach the requirements of the Data Protection Acts even if their prior consultation didn’t find a specific thing that would be a breach.

Take the retention until 30 years of age. The DPC may have advised the Department that a retention period for personal data that is necessary and proportionate to the purpose for which the data was obtained is required under the Acts. The Department may not have had any retention period in mind and simply pulled a figure that gave a long range of data for longitudinal analysis and study (I call that “The Anglo-Irish Bank approach to critical data”).

The DPC will not have determined if that is necessary and proportionate. It is the Department of Education’s job to determine and justify the necessity and proportionality.

The new Data Protection Commissioner, Helen Dixon, has made it very clear that it is NOT the role of the Office of the Data Protection Commissioner to do the homework of public sector departments for them. They need to own the decisions they take about the processing of personal data.

Ugly #2

The second strand of ugly that arises:

  • There is no standard communication of purpose, or of the data that is being processed, to the parents of children.

So far this month I’ve seen at least three different versions of letters that have gone home to parents. Clients of mine in other sectors have been ending meetings with questions about this database and showing me the letters. They are all different.

There is a standard letter from the Department website. Some schools are using this. It attempts valiantly to explain the purposes for processing, the length of retention, and who data will be shared with, but it fails in that regard.

Other schools have taken just the questions that the Department has identified as requiring explicit consent (the ones about ethnicity and religion) and have included them in a letter that says that the Department wants this information. No further explanation. And no mention of all the other sensitive personal data such as data about physical or mental health that the Department is getting from the school directly without explicit consent. That’s another Data Protection #fail.

Ugly #3

The third strand of ugly that arises is this:

Part of the defence raised by the Department to the processing of data in the Primary Online Database is that it is being done already for pre-school, post-primary, and beyond.

That’s a line of argument that presumes there is no breach of fundamental rights in the design and execution of data processing or data sharing in relation to any other database about participants in the education system that is under the control of the Department of Education. And the data in these databases is different (the post-primary database is more focussed on academic achievement and results on courses – particularly as it encompasses Further Education courses such as those accredited by QQI/FETAC).

It’s like arguing that you haven’t broken the law by stealing a car because you were never arrested for stealing a motorbike and a truck in the past.

Or like a child insisting to their parent that their misbehaviour is justified because all the other kids are doing it too.

But then… everyone else is stepping on Data Protection rakes, why not the Department of Education?
