Census and Data Protection

My significant other has acted as an enumerator for the Irish Census of Population in the past, and has applied to do it again.

Every census season, I see lots of ill-informed comment about the nature of the census, what the data can or will be used for, and who it will be shared with. This ill-informed comment actually highlights the importance of trust in government when obtaining personal data, something which the former Chairman of one of my company’s clients (a very large Government agency) was obsessed with – in their view, loss of trust was directly linked to a loss of the agency’s ability to conduct its primary function, which is a very important one.

So, what is the legal position regarding data provided in the Census?

  1. Data that is obtained for a statistical purpose (i.e. obtained for a purpose under the Statistics Act 1993) is subject to a specific exemption under the Data Protection Acts 1988 and 2003.
  2. However, that exemption is justified largely by the fact that the Statistics Act 1993 prohibits the use of data obtained under that Act for any purpose other than “statistical compilation and analysis purposes” (section 32). It is also an offence under Section 33 to disclose data obtained under the Statistics Act which may be related to an identifiable individual without their consent (or the consent of their representative if they are deceased), except in specific circumstances, pretty much all of which relate to the functions of the Central Statistics Office:
    • For the purposes of prosecuting an offence under the Act (you need to be able to identify the records that were the subject of the offence in order to prosecute it, so s33(1)(a) allows for them to be disclosed for that purpose)
    • For the purposes of actually doing the statistical analysis functions of “officers of statistics” so that data can be aggregated and reported on (you need to have access to raw data to do the analysis and aggregation, so this is an obvious use of the data that has a very clear statistical basis)
    • For processing data for the purposes of the CSO in a form and manner governed by a contract in writing. This covers the use of third-party analysis tools, services, or data enrichment, but ONLY for the purposes of the CSO, which is ONLY concerned with the publication of AGGREGATED statistical analysis.
  3. These restrictions do not apply to census data over 100 years old. However, the Data Protection Acts would still apply to data relating to any individual in those records who is still living. Statistically, that is currently a small population which is reasonably easy to check, with a low probability of impact on fundamental rights from any disclosure. But as life spans increase, this would need to be kept under review.
  4. It is arguable that, should the CSO provide raw data to other government Departments for matching against their databases to append data for the CSO’s purposes, the recent CJEU ruling in Bara would require it to disclose the fact of providing data to such Departments. The Statistics Act 1993 would prevent those Departments from making use of the CSO data for their own purposes, but this would likely need to be flagged by the “other side” of such a data enrichment process, along the lines of “We get data from the CSO and append information to it for statistical purposes but do not retain any CSO data at any time”.
  5. Regarding the actual census forms themselves, there is a very clear requirement under Section 42 of the Statistics Act 1993 that any records held by “officers of statistics” (which includes enumerators) be kept safe and secure “in such manner as to ensure that unauthorised persons will not have access thereto”, and that non-return of records constitutes an offence. Of course, the penalties on summary conviction (a prosecution taken by the Director General of the CSO, not the DPC) are pretty paltry (up to €1,000 per offence), so might not be a sufficiently dissuasive penalty under the forthcoming General Data Protection Regulation.

It’s important to note that breaches of data security or misuse of statistical data are prosecuted not by the DPC but by the Director General of the CSO. To my mind this is not ideal, but it reflects the fact that the Data Protection Acts didn’t cover paper records in 1993; that only came within their scope under the 2003 Act (which transposed the 1995 Directive). It does, however, make clear that there are offences, sanctions, and a prosecuting body for breaches of the 1993 Act.

But of course, none of this will placate the tinfoil hat brigade who act on the default assumption that any data you give to the Government is shared willy-nilly. This highlights the importance of proactive data protection controls and data privacy considerations on the part of Government agencies and the legislature.

While it is tempting to build ‘databases o’ the people’, every instance of non-transparent and inadequately controlled sharing of data creates a threat to trust. When trust expires, key data simply becomes unavailable or unreliable as people cease to provide it or provide misleading information (which is an offence under the Statistics Act). Trust is fragile, and ‘mushroom management’ approaches and “bit of an oul’ law” fig leaves are no longer sustainable when the tinfoil hat can become a fashion trend before the facts and truth of a process have their boots on (to mangle Churchill).

So: Census data is very strongly protected (albeit with sanctions that could and should be higher), and it is census data that underpins the priorities in government strategy, investment, and expenditure. It’s important for people to fill out the census accurately so that accurate data drives appropriate strategic decisions in Government.

However, Government needs to realise the impact that damaged trust in public sector data management and respect for data protection has on the willingness of people to trust the government with large amounts of data in the form of a census. From POD to Health Identifiers to Irish Water, there is a litany of error and misstep. Trust is fragile. Government needs to learn how not to step on it, or get used to tinfoil hat fashion shows and policy decisions grounded on statistical quicksand.

One route to restoring trust would be for our independent Data Protection regulator to regulate independently and take decisive action against public sector organisations that breach the Data Protection Acts. Enforcing the law is a key step towards ensuring that people trust the law will be enforced.

 

 

The General Data Protection Regulation and “Mental Discounting”

The General Data Protection Regulation (GDPR) is now in the home straight, with publication of final, final text expected in Q1 of 2016 (expect something to happen towards the end of January).

One of the small and subtle changes buried in the 209 pages of text in the most recent copy I have come into possession of is the apparent removal from the Regulation of any specific reference to personal liability of officers, directors, or managers of bodies corporate where their actions (or inactions) cause an offence to be committed. This is a power that the Irish DPC has used judiciously over the past few years under current legislation (it is a power of the DPC under Section 29 of the Data Protection Acts and Section 25 of SI 336, the ePrivacy Regulations), and it has served to focus the minds of managers and directors of recidivist offending companies when the sanction has been threatened or applied. The potential knock-on impact of such a personal prosecution can affect career prospects in certain sectors, as parties found guilty of such offences may struggle to meet fitness and probity tests for roles in areas such as Financial Services.

The omission of this power from the GDPR weakens the enforcement tools that a Regulator has available, weakens the ability for Regulators to influence the internal organisational ethic of a body corporate when it comes to personal data, and invites officers, directors, and managers (particularly in larger organisations) to engage in “Mental Discounting” because the worst case scenario that can occur is a loss of “Other People’s Money”, not a direct impact to them.

I’ve written about this before on this blog in the context of organisations weighing up “worst case scenarios” and assessing whether the financial or other penalties are greater or lesser than the value derived from breaching rules (search for “mental discounting”). However, the absence of a personal risk to the personal money of officers, directors, or managers also creates a problem when we consider the psychology of risk, given that our risk assessment faculties are among the oldest parts of our brain:

  1. We are really bad at assessing abstract risk (we evolved to understand direct physical risks, not the risks associated with abstract and intangible concepts, like fundamental rights, data, and suchlike).
  2. We tend to downplay risks that are not personalised (if there isn’t a face to it, the risk remains too abstract for our primitive brain. This is also the difference between comedy and tragedy… comedy is somebody falling off a ladder; tragedy is me stubbing my toe).

So, when faced with a decision about the processing of personal data that has a vague probability of a potentially significant, but more probably manageable, financial penalty to an abstract intangible entity (the company we work for), with no impact of any kind on a tangible and very personal entity (the individual making the decision), people will invariably do the thing that they are measured against and that their bonus or promotion is based on.

The absence of an “individual accountability” provision in the GDPR means that decision makers will be gambling with Other People’s Data and Other People’s Money with no immediate risk of tangible sanction. If the internal ethic of a company culture is to take risks and ‘push the envelope’ with personal data, and that is what people are measured and rewarded on, that is what will be done.

In a whitepaper I co-authored with Katherine O’Keefe for Castlebridge, we discussed the role of legislation in influencing the internal organisational ethic. The potential for personal sanctions for acting contrary to the ethical standards expected by society creates a powerful lever for evolving risk identification, risk assessment, risk appetite, and balancing the internal ethic of the organisation against that of society. Even if only used judiciously and occasionally, it focuses the attention of management teams on data and data privacy as business critical issues that should matter to them. Because it may impact their personal bottom line.

Absent such a means of sanction for individuals, I fear we will see the evolution of a compliance model based around “fail, fail fast, reboot” where recidivist offender decision makers simply fold the companies that have been found to have committed an offence and restart with the same business model and ethic a few doors down, committing the same offences. Regulators lacking a powerful personal sanction will be unable to curtail such an approach.

After all, it’s just other people’s money when you get it wrong with other people’s data.

 

 

Farewell Caspar

Over the course of my career I’ve been lucky to meet and become friends with many of the pioneers in the fields of Information Quality, Data Governance, and Data Protection.  I have been doubly fortunate that some of these people have also become mentors – helping me to figure out what I wanted to do, and more importantly what I stood for, in the world of Information Management.

I had hoped one day to make the same connection with Caspar Bowden. Sadly that will not be possible now. This saddens me.

However, over the past few years, twitter has allowed me some level of contact with Caspar. It was often affirming to see him retweet one of my rants or rambles, or engage with me to clarify some point I was making or question I was raising.  At times it felt like I was getting a gold star from teacher… “10/10 for effort… keep paying attention to the details”.

I have no doubt that, had we met, we’d probably have wound up arguing about something. I’m sure it would have been an argument I’d have lost. But it would have been fun (and educational) to have argued.

The world has lost a true pioneer, a prophet of the dark consequences of unfettered digital privacy invasion, and a staunch advocate for finding better ways to do things.

It is never easy to be an advocate swimming against the tide, as Caspar often seemed to be.  However, sometimes the fight is worth fighting so that the pendulum finds a balance between rights, duties, and obligations in society, and so that people become more aware of the erosion of their privacy rights through legislative or technological changes.

So, if anyone in Ireland wants to remember Caspar Bowden, I can think of no better way than donating to Digital Rights Ireland or any of the other digital rights advocacy groups who fight the same fight that Caspar fought.

He may be gone but his spirit, and the fight, remain.

 

Me, speaking and teaching in 2015

Elderly data jedi imparts cryptic wisdom (film at 11)

So, a bunch of people have asked me to speak at events this year. And this is ON TOP of events and training I’m doing with my company (Castlebridge Associates).

Due to client commitments I’m unable to make it to my usual Californian summer conference DGIQ this year, but my colleague Katherine will be presenting there in June.

Not a bad diary! Now, to fit the big client engagements in around that…

We might be in a bit of a #gemalto

Gemalto is a manufacturer of mobile phone SIM cards based in the Netherlands. If you have a mobile phone, there is a good chance you have a SIM card manufactured by Gemalto. They also manufacture smart cards and identity validation solutions for financial services and government.

It has been revealed that Gemalto has been hacked by US and British intelligence agencies (GCHQ and NSA) and the encryption keys that encrypt the communication between your phone and the mobile phone network have been taken. This means that messages and calls can be intercepted and decrypted with ease by intelligence agencies. And anyone else who has these keys.

This arguably (in my view definitely) represents a particular risk of a breach of the security of the public telecommunications network.

In Ireland, Section 4(4) of SI 336, the legislation that transposed the 2009 ePrivacy Directive (the “cookies law”, as it has incorrectly become known), places a specific requirement on telecommunications companies to inform their customers of such a risk without delay and, where the phone company isn’t in a position to fix the issue themselves, to advise on steps that can be taken to minimise the risk.

(4) In the case of a particular risk of a breach of the security of the public communications network, the undertaking providing the publicly available electronic communications service shall inform its subscribers concerning such risk without delay and, where the risk lies outside the scope of the measures to be taken by the relevant service provider, any possible remedies including an indication of the likely costs involved.

That Section transposes verbatim the text of Article 4.2 of the original 2002 ePrivacy Directive.

Irish telcos have been required by the Data Protection Commissioner in the past to provide blanket notification on their website regarding smishing (SMS-based phishing) threats and similar risks to the security of data on their networks. This is a whole level of complexity higher again.

The threat of unauthorised interception of GSM calls was perceived as relatively low risk because calls are encrypted between the device and the network. Some threat vectors were identified, but in general the view was that the encryption on any given call would need to be cracked on a case-by-case basis. Now that encryption cannot be relied on. There is a particular risk.
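To make that “particular risk” concrete: in GSM, the SIM and the operator’s authentication centre share a long-term secret key (Ki), the network sends a random challenge (RAND) over the air in the clear, and both sides derive the per-call session key (Kc) from Ki and RAND. Real SIMs use COMP128-family algorithms for this derivation; the HMAC below is an illustrative stand-in, not the actual cipher, but the structure of the exchange is what matters:

```python
import hmac
import hashlib
import os

# ILLUSTRATIVE ONLY: real GSM SIMs use COMP128 variants for A3/A8,
# not HMAC-SHA256. The shape of the key exchange is the point here.

def a3_a8(ki: bytes, rand: bytes):
    """Derive the signed response (SRES) and session key (Kc) from
    the SIM's long-term secret Ki and the network's challenge RAND."""
    digest = hmac.new(ki, rand, hashlib.sha256).digest()
    sres, kc = digest[:4], digest[4:12]
    return sres, kc

# Ki lives in the SIM and in the operator's authentication centre.
# It is exactly the kind of key reported taken from Gemalto.
ki = os.urandom(16)

# RAND travels over the air unencrypted during authentication.
rand = os.urandom(16)

# Handset and network independently derive the same session key Kc...
_, kc_network = a3_a8(ki, rand)

# ...but so can anyone holding the stolen Ki who observed RAND.
_, kc_eavesdropper = a3_a8(ki, rand)

assert kc_network == kc_eavesdropper  # no per-call cracking required
```

Once Ki is out, every call encrypted under a Kc derived from it is readable by whoever holds the key and captured the traffic: the case-by-case cracking assumption collapses entirely.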

My view is that telcos in Ireland, and potentially in other EU countries, need to inform their customers, and should ideally be looking for a solution that reinstates the security of the SIM-to-network link and issues new SIM cards to their subscribers. While National Security is outside the remit of the Data Protection laws and ePrivacy directives, that exemption should be interpreted narrowly to relate to the actions of the intelligence services in their spying. Hacking Gemalto may have been just on the right side of the line (I’m not saying that it was). However, it creates a problem for telecoms companies: the day-to-day operation of their networks is not a National Security or intelligence service activity, and those networks are now compromised if the telecoms company uses Gemalto SIM cards.

That will be costly and complex and, inevitably, telecoms companies will pass the cost on to their customers (it’s a tight margin business at the best of times, and reinstating a chunk of your customers with new SIMs is not to be undertaken lightly).

Of course, this requires EU Data Protection Authorities to engage with the companies in their jurisdictions to ensure they are acting in compliance with the relevant legislation. And that means ALL EU Data Protection Authorities, not just the one that everyone likes to beat up on for being “light touch”.

[Update: What about National Security and Criminal investigation exemptions?]

The Data Protection Acts in Ireland, and equivalent legislation across EU, has limited exemptions for activities of law enforcement and intelligence services relating to National Security and the investigation of criminal offences. This is being relied on by the UK ICO in relation to the Gemalto hack (see https://twitter.com/lisafleisher/status/569482404521496576/photo/1)

And I agree. In the context of the specific action of an intelligence service, the Data Protection Authorities have little authority due to the exemptions given under current legislation (Note: the exemptions are still subject to the Article 8 ECHR provision around a right to personal data privacy, which has been ruled on by the CJEU in the context of mass surveillance). So, in relation to the actual accessing of a company network and taking encryption keys, there is no role for a Data Protection Authority. In the conduct of intelligence service and law enforcement activities, Data Protection Authorities have very limited roles.

However, the fact that the keys are no longer under the control of Gemalto creates a “particular risk of a breach of security” in a communications network. So, telcos would still, in my view, need to give serious consideration to their obligations under Article 4.2 of the ePrivacy Directive. Yes, it is an intelligence agency (or two) that has the keys. Yes, they may have, in certain circumstances, a legitimate national security or criminal investigation purpose and associated exemption. But a risk to security of a public telecommunications network exists, and telcos are required to do something about it under Article 4.2. And that is something that national Data Protection Authorities are entitled to enforce.

In effect, the action that a telco needs to take should be no different than if a criminal organisation had executed a similar attack on a SIM card manufacturer, because Article 4.2 doesn’t include an “…unless the particular risk arises from an action of an authorised intelligence agency or law enforcement body” exemption. And, as I’ve said earlier in this post, the Irish DPC has previously required telecommunications companies to provide blanket notifications about the risk of smishing as a security issue in the public telecommunications network.

I believe that telcos need to give their customers some form of alert about the risk that has been created.

For example, any telco that uses Gemalto SIMs could use a notice like this on its website:

It has been reported that the encryption keys for SIM cards manufactured by our supplier Gemalto have been taken by intelligence services acting, as we understand it, within their legal remit. These keys keep your calls and messages private and secure in our network in the normal course of activities, and this action creates a risk that calls and messages which would otherwise be encrypted between your device and our network can now be intercepted by anyone in possession of the correct encryption key without our knowledge. While we have no reason to believe the keys will be misused by the intelligence agencies in question or any other entities, a risk to security in the network does exist. We continually examine our options to keep your data safe and secure in our network and will provide updates on this situation as they arise.

Wording along these lines would meet the requirement of Article 4.2, and doesn’t take away from the legitimate access to telecoms network traffic and call data by intelligence services and law enforcement for the investigation of crimes or national security purposes. It has the added bonus of showing that the telco takes data security seriously enough to at least try to comply with the letter of the law.

It doesn’t get around the mass surveillance issues that arise when any call from any device using a Gemalto SIM can be decrypted, which almost certainly raises issues under Article 8 of the Charter of Fundamental Rights. But that is not the telecommunications companies’ issue to address, nor is it a matter for Data Protection Authorities. It’s one for Governments.

Data Protection Rake: WHACK!!

So, the Minister for Education is fighting a rear-guard action to justify the method of execution of the Primary Online Database. Get ready for the rakes.

Correctly, she is stressing the need for a means to track education outcomes as children move from primary to secondary education, where there is a drop-out rate which is rightly concerning. It has been concerning since 2006, when Barnardos highlighted the mystery of what was happening to the 1,000 children a year who didn’t progress from primary to secondary education.

She has stated that the Data Protection Commissioner has been consulted “and that office is satisfied with what we are doing”. The Data Protection Commissioner has commented that the Department has presented “a legitimate and proportionate purpose for requesting to be provided with the data it is seeking”. Now… that’s not the same thing as being “satisfied with what we are doing”, as the Minister has said. It also depends very much on what purpose was communicated to the Office of the Data Protection Commissioner in 2013.

Even in an ideal world scope creep occurs, particularly when the objective for processing the data seems to be a bit confused. Is it for purely statistical purposes (which is implicit in the statements that the data would only be accessed by a small number of people in the statistics unit of the Department of Education), or is it for more day-to-day operational decision making purposes (which is implicit in comments made by the Minister that school funding could be at risk if data was not returned)? Those are two different categories of purpose.

[Whack]

But what about the DPC’s position?

The Data Protection Commissioner’s statement to the Irish Times actually limits its comment to the legitimacy and proportionality of the purpose that the Department may have for seeking to process this data: ensuring children move from primary to secondary education, ensuring that the State has data available to help identify any trends in drop-out rates, ensuring that limited resources are deployed as efficiently as possible to provide equality of access to education (here’s a link to some more stuff from Barnardo’s on that), and supporting children in getting the best education outcomes possible.

Legitimacy and proportionality are linked to the purpose for which the data is being obtained. And the need to ensure that data is “obtained fairly and processed for a specified and lawful purpose” covers just the first two of eight Data Protection principles. So what is the purpose the DPC was told about? Are there new purposes?

So, when the Minister comments on the retention of data about primary school children until they are 30 years old, and says that

“I did say I would examine it but it looks to me that up to the 30th birthday is probably appropriate and it satisfies the Data Commissioner as well which is obviously very important,”

it is really important to ask: What is the purpose for which this long a retention period is required?

[Whack]

It’s actually more than that: it’s essential that the Minister is able to say categorically what the purpose of this retention is and why a 25-to-26-year retention period for personal and sensitive personal data is required. “Probably appropriate” is not the test; “retention for no longer than is necessary for the purpose for which the data is being processed” is the test under the Data Protection Acts. It is also important to assess whether the purpose and requirement can be met by less personally identifying data: would anonymised or pseudonymised data support the objective? If yes, then it ceases to be necessary to hold the raw data, so it is no longer “probably appropriate”.

[Whack]
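For what it’s worth, the pseudonymisation question above is not a technically difficult one to answer. The sketch below is a hypothetical illustration (the names, key handling, and field choices are mine, not the Department’s design): a keyed hash turns a PPSN into a stable token that supports longitudinal tracking of progression without retaining the raw identifier, provided the secret key is held separately and access-controlled.

```python
import hmac
import hashlib

# Hypothetical sketch only: names and key handling are illustrative,
# not the Department's actual design.

# A secret 'pepper' held separately (e.g. by the statistics unit only);
# without it, tokens cannot be linked back to a PPSN.
PEPPER = b"held-separately-and-access-controlled"

def pseudonymise(ppsn: str) -> str:
    """Map a PPSN to a stable, non-reversible token. The same child
    always yields the same token (so progression and drop-out trends
    can be tracked across years), but the raw identifier is never
    stored in the statistics database."""
    return hmac.new(PEPPER, ppsn.encode("utf-8"), hashlib.sha256).hexdigest()

# Records submitted in different years for the same (illustrative,
# not real) PPSN link up without the PPSN ever being retained.
token = pseudonymise("1234567A")
```

If the statistical purpose can be met this way, retaining raw identifiable data until a child’s 30th birthday becomes very hard to justify as “necessary”.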

So… what is the specific purpose for which a retention period of “until 30th birthday” is required? State it. Assess it. Compare against other alternative methods. And then make a clear decision based on the Privacy impact and the necessity and proportionality of the processing. “Probably appropriate” is not a form of words that fills me with confidence. “Assessed to be necessary and proportionate against other options, which were rejected because of X, Y, Z reasons” would be more illustrative and evidential of a proper Privacy Impact Assessment and Privacy by Design thinking at work.

[Whack]

For other purposes it might not be appropriate to allow access to the identifiable data even 90 seconds after it is recorded. Those purposes need to be identified, and appropriate governance and controls defined and put in place, to ensure that only data which is adequate, relevant, and not excessive is disclosed, and that any subsequent use is consistent with and not incompatible with the original purpose. The Data Protection Commissioner doesn’t appear to have actually commented on that. So the standard protocol of a clear statutory basis and an appropriate system of Governance still needs to be considered and put in place for any sharing of data or subsequent use of data to be compliant with the Data Protection Acts (and, just in case we forget, Article 8 of the EU Charter of Fundamental Rights).

[Whack]

Disturbingly, the Minister seems to imply that it is irrelevant whether parents provide their PPSN to the Department or not, as it will be able to obtain that data from the Department of Social Protection. It is true that name, address, date of birth, and mother’s maiden name can be used to validate a PPSN. However, I would question the basis on which the passing of that data to obtain the PPSN would be valid, given that the Dept of Education’s registration with Client Identity Services in the DSP seems to presume the Department already has the PPSN it needs. The rent has been paid up on the battlefield, it appears, and there is no going back.

[Whack. Whack]

(Name, address, date of birth, and mother’s maiden name could form a composite key to identify a child uniquely on the database where no PPSN is available. In which case, what is the purpose for the PPSN?)

[Whack]

What does the Minister’s statement mean?

In my opinion, the Minister’s statement means that the Department is misunderstanding the role of the Data Protection Commissioner and what it means for the DPC to give an opinion on the appropriateness of processing. The DPC will determine if there is a risk of non-compliance with a proposed purpose for processing and will give guidance and feedback based on the information that is provided to them.

If that information is incomplete, or doesn’t match the final implementation of a system, then the DPC can (and does) change their position. It’s also not the role of the DPC to correct the homework of a Government Department, and the new Commissioner, Helen Dixon, has made that exceptionally clear to public sector representatives in at least two forums since November. Her role is to enforce the legislation, support the protection of the fundamental data privacy rights of individuals, and be independent of Government (that independence has been a Treaty obligation since 2009, by the way… and towards the end of his term the former Commissioner, Billy Hawkes, exercised it by, for example, prosecuting the Minister for Justice).

It also means that the Minister is at risk of having to dig herself out of an entrenched position. The road to heck is paved with good intentions. This scheme, and all the other education outcome tracking databases that the Department has, are valid and valuable as part of a coherent information strategy for the design and implementation of education services and the delivery of education outcomes in Ireland. But the design and execution of the systems of processing (not just the technology systems but the wider scheme of stakeholder engagement, controls, governance, and impact assessments) leave a lot to be desired.

It means, unfortunately, that rather than display their homework around Privacy Impact Assessment, Governance controls, and Privacy by Design, the Minister and her Department are reacting exactly as I described in yesterday’s blog post:

Data Protection Expert: I think this raises significant issues and may be illegal

Government Representative: It’s too late. I’ve already paid a month’s rent on the PR agency project.

So far the report card reads:

  • Intention: 10/10
  • Effort: 4/10
  • Execution: 2/10 (and negative marking applies here)

“Trust us, we’re the Government” doesn’t work any more because the Government has failed spectacularly to build and engender trust on previous data gathering and data sharing initiatives. So, laudable as the goals are, there was already a mountain to climb to put this data gathering inside the “circle of trust”.

My €0.02

Having reviewed a range of documentation around the Primary Online Database (including the specifications for the drop-down fields in the database), here is my assessment:

  1. The project has mis-identified as “non-sensitive” a range of questions which capture sensitive personal data about medical or psychological assessments.
  2. The system has a notes field which can currently be accessed by users of the system in the Department. It is proposed that access will be restricted to just schools, but in reality the data is still being stored on a system designed and controlled by the Department, and it would be accessible by anyone with administrator access to the underlying database.
  3. The communication of the purpose for processing, and the explanation of the retention period, is bordering on the unintelligible to me. And I read and write those kinds of things for a living; I teach this stuff to lawyers. “It’s based on the Department circular” is not a defence. The requirement under the Data Protection Acts is that data be fairly obtained for a specified purpose, which requires that the statement of purpose be comprehensible (I advise clients to apply adult literacy standards to their text and aim for a reading age of 12 to 15). If the circular is incomprehensible, write a ‘friendly version’ or get the circular redone.
  4. The project has gone to the wrong source for the data. The schools do not have a lot of this data, and even then they have obtained it for a different specified purpose. Schools guessing at ethnicity or religion or other aspects of the data being gathered makes little sense and creates an admin burden for the schools. The 50% response rate in the pilot project should have been a warning that the execution method was not appropriate.
  5. The use of “local” versions of the questionnaire by schools (where schools have modified the Department’s form and sent it out to parents) means that the Department (as Data Controller) has lost control of the statement of and explanation of purposes and processing. That means that no assumptions can be made now about what parents understood they were agreeing to because the ‘official’ form of communication may not have been used.
  6. There is no clear justification for a retention period of raw, identifiable data until a child’s 30th year.
  7. The stance adopted by the Minister is not good. In the face of valid criticism she has adopted an entrenched position, clinging to the DPC as a shield when it is at best a fig leaf. Given the narrative arc in the Irish Water debacle that is, as Sir Humphrey Appleby would say, “Courageous Minister, very courageous”. (Data relating to children, “all cleared by the DPC”, challenge in public by knowledgeable experts, public disquiet, “DPC said it was OK”, immediate reverse ferret after a reshuffle… [we are at stage 3 now].)
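The reading-age test in point 3 is easy to automate. A minimal sketch of the Flesch-Kincaid grade formula, using a crude syllable counter and two invented sample texts for illustration (real readability tools use better syllable dictionaries):

```python
import re

def flesch_kincaid_grade(text):
    """Approximate US grade level of a text (Flesch-Kincaid formula).
    Reading age is roughly grade level plus five."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word):
        # Crude approximation: count runs of vowels, minimum of one.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    total_syllables = sum(syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (total_syllables / len(words))
            - 15.59)

# Invented examples: a plain-language purpose statement vs. circular-speak.
plain = "We collect your name. We use it to plan school places."
dense = ("Pursuant to the aforementioned statutory instrument, the data subject's "
         "personal information shall be processed for administrative purposes "
         "notwithstanding any countervailing contractual obligations heretofore agreed.")

print(flesch_kincaid_grade(plain))   # low grade: passes the reading-age test
print(flesch_kincaid_grade(dense))   # high grade: fails it
```

A statement of purpose scoring well above grade 7-10 (reading age 12-15) is a signal to rewrite, not to cite the circular.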

Pausing, then assessing and defining an appropriate strategy for strategic use of data in education for statistical planning and centralisation of operational data, combined with an appropriate Privacy Impact Assessment that takes into account recent rulings on necessity and proportionality by the CJEU, would be advisable at this time.

Anything else is simply courageous, Minister.

Irish Government projects and the Data Protection Rake

The more I see the mindset of the Irish Civil Service around data and its potential for use (and misuse and abuse), the harder I find it to get this video out of my mind. Over the past two years, literally at every turn, an initiative has been launched which has, within a short period of time, raised questions about the fairness of obtaining of personal data, the legitimacy of the purpose for processing, the scope and scale of data sharing, retention periods for data, and the governance of the data once it has been obtained.

Government Departments seem intent on continuing with poorly planned, inappropriately executed, and ill-advisedly governed initiatives. This happens even in the face of valid comment and concern from an increasingly informed and aware citizenry, and in some cases in the face of question and comment from experts in the field who are raising valid concerns based on little more than practical experience and deep professional knowledge. Questions or requests for less haste and more analysis are met with a grim determination to hit specific timelines. “This is a data protection disaster waiting to happen” is greeted with a continued roll out of the initiative that gives rise to concern.

While Sideshow Bob illustrates the inevitable public fallout of not engaging with concerns in a constructive manner, it is the Marx Brothers who give the most apposite quote.

In Duck Soup, the following exchange takes place between the President of Freedonia (Groucho Marx) and the Ambassador of a neighbouring country on the eve of war…

Ambassador Trentino: I am willing to do anything to prevent this war.

Rufus T. Firefly: It’s too late. I’ve already paid a month’s rent on the battlefield.

On Irish Government data projects, the all too oft-repeated script now reads:

Data Protection Expert: I think this raises significant issues and may be illegal

Government Representative: It’s too late. I’ve already paid a month’s rent on the PR agency project.

Last year it was Irish Water. This year it will be eircodes and Primary Online Database. Both are things that have potentially great benefits for society, but both are becoming hallmarked with the rake-mark of poor planning and execution, especially when the questions of Data Protection and Privacy are considered. If the investment in PR agencies to spin the projects and manage the media once questions are asked was matched by an investment in proper design and planning for Data Protection and Data Privacy issues, there would be fewer blogs, tweets, column inches and broadcast minutes devoted to discussing the issues and asking awkward questions for the media consultants to spin.

Leaving eircodes to one side for a moment (that’s a big bucket of fish to discuss from a data quality and data privacy perspective), the on-going roll out of the Primary Online Database project is a classic example of valid and legitimate purposes and objectives in processing data being undermined by poor planning, execution, design, and governance around the fundamental rights issues of Data Protection and Privacy.

The Good

Our education system is broken. Scarce resources are not applied or allocated effectively. Schools have resources rationed from the Department, but underprivileged schools are unable to supplement those resources (such as psychological assessments, SNA hours, other classroom supports) to the same level as schools in middle class or more privileged areas. Children drop out of the system and drop off the radar. Having data about outcomes in education, and about social or demographic issues that might affect those outcomes, is valuable for identifying causal factors and prioritising investment in education services and interventions in an ‘evidence based’ policy framework. Questions like: Does Timmy start primary school? Does he go to secondary? Does he go straight to university or do a PLC or Further Education course? What schools did he attend? Did Timmy drop out and then re-enter as a mature student, either at 3rd level or at 2nd level?

Of course, this longitudinal data is valuable. And at the granular level of the individual it is a deeply personal snapshot of the life, trials, and tribulations of little Timmy from the age of four.

This data is to be held in the Revenue Commissioners’ data centre. This is a good thing. The Revenue Commissioners have a very secure data centre. I would not automatically assume a nefarious intent in putting data that requires a high level of protection in a location that has been designed, built, and resourced to have a high level of technical security protection.

The Bad

There are three bits of bad that concern me.

Bad  #1

The first bad arises where the planning and execution of this data gathering fails to consider the data subject and the context. It’s data about children. It’s data about medical and psychological conditions that a child might have (that’s Sensitive Personal data even though the Department of Education appears to think that it isn’t). It’s data about their ethnicity, their family make-up, their socio-economic status, and a range of other factors. It’s data that is tied uniquely to them by their PPSN. It’s data that includes comments written about the child by administrators in the school, which will be written to a database in the Department of Education’s control. And that data will be held until the child is 30 years old.

Of course, documentation tells us that access to that field is going to be restricted so Department of Education staff can’t access it and only people in the school that the child is in will be able to see it. Of course, that means that anyone with administrator rights to that database can access that data. And that means that it will almost inevitably be looked at. This is despite the Department having no statistical reason for having detailed notes about students.
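The point about administrator access is an architectural one: application-layer restrictions bind only the application’s users, not whoever administers the database underneath. A minimal sketch, using an in-memory SQLite database and an invented schema (the real POD schema is not public):

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pupils (ppsn TEXT, school_id TEXT, notes TEXT)")
conn.execute("INSERT INTO pupils VALUES ('1234567A', 'SCH-001', 'confidential note')")

def app_read_notes(user_school_id, ppsn):
    """Application-layer rule: a user may only read notes for their own school."""
    row = conn.execute(
        "SELECT notes FROM pupils WHERE ppsn = ? AND school_id = ?",
        (ppsn, user_school_id)).fetchone()
    return row[0] if row else None

# The application rule works as advertised: a user from another school sees nothing.
assert app_read_notes("SCH-002", "1234567A") is None

# But the rule lives in the application, not the database. Anyone with
# administrator access to the underlying database queries the table directly.
admin_view = conn.execute("SELECT notes FROM pupils").fetchall()
print(admin_view)  # [('confidential note',)]
```

Restricting the notes field “to schools only” in the application does nothing about this second path; that is why where the data is stored, and who administers it, matters.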

Bad #2

The bad goes to worse when the means for gathering the data is looked at. The data is being obtained from schools, with only a subset being asked for from parents. The schools have obtained data for a particular purpose. The Department’s purpose is a new purpose, and it is the Department’s purpose not the school’s. So it is incompatible with the purpose for which the school originally obtained the data. Schools are being asked to provide data based on their own records, or their own guesswork about ethnicity or religion or other socio-demographic data.

Upshot: data will either not be returned, or will be inaccurate. So statistical analyses based on that data will have skew and bias that will need to be controlled for. The Department’s own pilot programme only had a 50% response rate.
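The skew from differential nonresponse can be sketched numerically. A toy simulation, with entirely invented rates chosen only to reproduce a roughly 50% overall response rate like the pilot’s: if the schools least likely to return the form are also those serving pupils the data is most needed about, the returned data understates the true picture.

```python
import random

random.seed(42)

# Hypothetical population: 20% of pupils have a support need.
population = [{"support_need": random.random() < 0.20} for _ in range(100_000)]

def responds(pupil):
    # Assumed differential nonresponse: 40% return rate where there is a
    # support need, 55% otherwise -- about 52% overall.
    return random.random() < (0.40 if pupil["support_need"] else 0.55)

respondents = [p for p in population if responds(p)]

true_rate = sum(p["support_need"] for p in population) / len(population)
observed_rate = sum(p["support_need"] for p in respondents) / len(respondents)

print(f"response rate: {len(respondents) / len(population):.0%}")
print(f"true rate:     {true_rate:.1%}")
print(f"observed rate: {observed_rate:.1%}")  # understates the true rate
```

Under these assumed rates the observed rate comes out around 15% against a true 20%: any analysis built on the returned data would need to model and correct for exactly this kind of bias.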

  • A better option: Invest time and effort in a proper strategy for educational data management. Educate parents and guardians and school management about the purposes, benefits, and strategic objective. Seek the data from the parents of the children. (Difficulty: requires budget, means you need to have a load of key decisions made and documented up front, and you need to take time to engage with the citizenry… even the tin-foil hat wearers).

Bad #3

Another level of bad arises in the context of the sharing of the data. What data will be shared, with whom, and why, and under what controls? These are basic questions that need clear and intelligible answers. And the answers need to be understandable. And the sharing needs to be necessary and proportionate. With defined governance controls over the changes to the use of that data or the changes to the sharing of that data. If data is being obtained for a statistical analysis purpose, there is no operational data management purpose that would permit the sharing of that data with another entity. If the data is being obtained for both statistical analysis and planning purposes and for day-to-day operational purposes, it means that the question of who actually has access to the data on a day to day basis arises – notwithstanding the assurances that only a small number of people in the Statistics unit of the Department of Education would be able to access the data.

  • What data will be shared? Will it be identifiable data or will it be aggregated statistical data?
  • If identifiable data will be shared, on what basis and in what format? Will it be on a record by record basis for specific intervention in a specific case where there may be a risk to the health or welfare of the data subject? Or will it be possible to request the data for other purposes such as the investigation of alleged criminal offences?
  • If the scope of sharing changes, either in terms of the entity that data will be shared with or the format and scale of sharing, what controls are in place at the time the data is gathered to ensure that those changes are subject to an appropriate Privacy Impact Assessment?

The Ugly

There are three levels of Ugly that emerge.

Ugly #1

The first is the traditional fig-leaf that is dangled on projects like this: “We have consulted with the Office of the Data Protection Commissioner”. This is the Public Sector data project equivalent of waving a hand to dismiss an inquisitive Storm Trooper: “These aren’t the droids you’re looking for; Move along.”

But… that is NOT the role of the Data Protection Commissioner. Their role is not to advise that an organisation is compliant. Their role in the context of a Prior Consultation process is to flag any glaring issues of non-compliance that would need to be addressed. All too often their advisory is ignored by organisations. Their role then becomes one of investigation and prosecution should the mechanisms of processing that are implemented breach the Data Protection Acts.

In a prior consultation process, the DPC’s comments are made based on the information provided to them at that time. Their assessment is based on the quality of the information, the detail of the proposed processing, the assessment of risk, and their ability to follow the proposal that they have been given. And they can get it wrong based on that information. And the Data Controller who goes to them for a prior consultation process might misunderstand what is being asked of them or implement a system that doesn’t match what is actually needed. So, on foot of a complaint, the DPC may find that a particular instance of processing does actually breach the requirements of the Data Protection Acts even if their prior consultation didn’t find a specific thing that would be a breach.

Take the retention until 30 years of age. The DPC may have advised the Department that a retention period for personal data that is necessary and proportionate to the purpose for which the data was obtained is required under the Acts. The Department may not have had any retention period in mind and simply pulled a figure that gave a long range of data for longitudinal analysis and study (I call that “The Anglo-Irish Bank approach to critical data”).

The DPC will not have determined if that is necessary and proportionate. That is the Department of Education’s job to determine and justify the necessity and proportionality.

The new Data Protection Commissioner, Helen Dixon, has made it very clear that it is NOT the role of the Office of the Data Protection Commissioner to do the homework of public sector departments for them. They need to own the decisions they take about the processing of personal data.

Ugly #2

The second strand of ugly that arises:

  • There is no standard communication of purpose, or of the data that is being processed, to the parents of children.

So far this month I’ve seen at least three different versions of letters that have gone home to parents. Clients of mine in other sectors have been ending meetings with questions about this database and showing me the letters. They are all different.

There is a standard letter on the Department website. Some schools are using this. It attempts valiantly to explain the purposes for processing, the length of retention, and who data will be shared with, but fails in that regard.

Other schools have taken just the questions that the Department has identified as requiring explicit consent (the ones about ethnicity and religion) and have included them in a letter that says that the Department wants this information. No further explanation. And no mention of all the other sensitive personal data such as data about physical or mental health that the Department is getting from the school directly without explicit consent. That’s another Data Protection #fail.

Ugly #3

The third strand of ugly that arises is this:

Part of the defence raised by the Department to the processing of data in the Primary Online Database is that it is being done already for pre-school, post-primary, and beyond.

That’s a line of argument that presumes there is no breach of fundamental rights in the design and execution of data processing or data sharing in relation to any other database about participants in the education system that is under the control of the Department of Education. And in any case the data in these databases is different (the post-primary database is more focussed on academic achievement and results on courses, particularly as it encompasses Further Education courses such as those accredited by QQI/FETAC), so the comparison is not even like-for-like.

It’s like arguing that you haven’t broken the law by stealing a car because you were never arrested for stealing a motorbike and a truck in the past.

Or like a child insisting to their parent that their misbehaviour is justified because all the other kids are doing it too.

But then… everyone else is stepping on Data Protection rakes, why not the Department of Education?

Irish Water, Data Protection, and the Cut and Paste Fairy

A few weeks ago I wrote a post here about Irish Water’s Data Protection Policy, which was very poorly written and had all the hallmarks of having been cut and pasted from another document (for example, references to numbered clauses that were not in the Data Protection Notice).

Today they have advertised on RecruitIreland.com for a Data Protection and Information Security Manager. Ignoring for a moment that this conflates two completely different but related skill sets, the advert on RecruitIreland.com has all the hallmarks of being a cut and paste job from elsewhere. The clues are very obvious to anyone who knows about international data privacy law and practice. Like me.

Take this paragraph for example:

  • Develop and implement Irish Water Information Security and Data Protection policies, processes, procedures and standards based on the existing Ervia framework, legislation and best practice (eg ISO 27000, other industry security standards such as PCI-DSS, NERC/CIP, and FERPA; HIPAA and other privacy/security legislation);

Lots of alphabet soup there that looks very impressive. But what does it mean?

  • PCI-DSS is a credit card processing data security standard. Scratch that… it is THE credit card processing data security standard.
  • ISO 27000 is the benchmark standards family for Information Security.
  • NERC/CIP is a critical infrastructure security standard from the US for electricity networks. It’s used as a reference standard as the EU lacks equivalents at the moment (thanks to Brian Honan for pointing that nugget out).
  • FERPA is not a standard. It is the Family Educational Rights and Privacy Act, a US Federal law covering data privacy of student education records. It actually creates rights and duties not unlike the Irish Data Protection Acts, but it applies only to schools that receive funds under an applicable program of the U.S. Department of Education. So, unless Irish Water has a subsidiary teaching creationism in the boonies of Louisiana, it’s not entirely relevant to the point of actually being entirely irrelevant to an EU-based utility company.
  • HIPAA is the Health Insurance Portability and Accountability Act. It is a privacy law that applies to certain categories of patient data for patients of US hospitals and healthcare providers and processors of health data such as insurers. In the United States.

Reading through the rest of the job description, the role is weighted heavily towards Information Security professionals. The certifications and skills cited are all very laudable and valid information security certifications. But they are not Data Protection qualifications. Indeed, the only data protection qualification that is specified is an ability to “work the Data Protection Acts”. Work them? I can play them like a pipe-organ!

Given the range of qualifications that exist now for Data Protection practitioners, such as the IAPP’s CIPP/E or the Law Society’s Certificate in Data Protection Practice (disclaimer: I helped design the syllabus for that course, lecture on it, and set and correct the assignments for it), it’s odd that there is no reference to appropriate Data Protection skills. The question I would pose: if a Data Protection specialist with experience in ISO27000 implementation, a formal data protection qualification, and experience in data governance applied for the job and wound up shortlisted against someone with a CISSP certification and no practical data protection/data privacy experience, who would get the job?

My reading of the job advert on RecruitIreland.com is that it was cut and paste from somewhere else with minimal review of the content or understanding of what the role of a Data Protection Officer is and how that is related to but different from an Information Security Officer role.

Perhaps it was cut and paste from this advert that appeared almost six months ago http://www.dole.ie/cache/job/3853096. It’s for an Information Security and Data Protection Manager in… Irish Water.

Irish Water Boarding

A few weeks ago I did a lot of research to find the specific section of legislation that authorised Irish Water to request PPSN details from people. It is Section 20 of the Social Welfare and Pensions Act 2014.

So, a bit of a law was done to do a thing. But could that thing actually be done? Were other things needed to be done to make the request of and processing of PPS numbers lawful?

Simon McGarr correctly points out that putting a body on the list of registered bodies is only part of the governance. A protocol is required to be in place governing the use of the data which needs to be approved by the Minister. http://www.mcgarrsolicitors.ie/2014/10/22/irish-water-ppsns-and-the-missing-ministers-agreement/

That protocol appears not to have been in place as of the end of September. After the forms were finalised and sent out. Any PPSN data obtained prior to the finalisation of such protocols was obtained unlawfully. This is a failure of Data Governance. A key Regulatory requirement appears to have been missed.

This is a good example of how doing “a bit o’law” to enable sharing of data is insufficient to ensure compliance. In the absence of a strong Data Governance function to ensure that the right things are done in the right way, errors occur, disproportionate processing takes place, and groupthink takes hold. I discuss this at length in a submission my company Castlebridge Associates made in conjunction with Digital Rights Ireland to the Dept of Public Expenditure and Reform on a proposed Data Sharing and Governance Bill.

That document is here: http://castlebridge.ie/products/whitepapers/2014/09/data-governance-and-sharing-bill-consultation-submission

Guest Post: An Overview of the International Data Quality Summit

When I was Director of Publicity for IAIDQ I introduced a policy of writing up the events of conferences the Association ran or was taking part in. This write up was usually published in the IAIDQ journal/newsletter. Joy Medved has asked if I could let her do the same here so she can thank the people who helped make IDQS14 happen. As she no longer has access to the IAIDQ to publish content, and given the erratic nature of IAIDQ communications, I’m delighted to oblige and let Joy say a deserved “Thank You” and an undeserved “Good Bye”.

An Overview of the International Data Quality Summit,
Richmond, VA, USA, October 6-9, 2014 (by Joy Medved)

When I was first presented with the opportunity to become Director of Events for IAIDQ, I found the challenge of chairing a conference quite exciting. I have been a conference speaker since 1993 and really liked the idea of expanding my experience in this area. Thankfully, I was working with two extremely well-organized individuals, Alex Doyle and Melissa Hildebrand. Together, the three of us plotted and planned, and were able to outline an exciting program for what was to become the first joint conference of the International Association for Information and Data Quality (IAIDQ) and the Electronic Commerce Code Management Association (ECCMA). Melissa and I decided to call this joint adventure the International Data Quality Summit (IDQSummit.org).

Alex’s main responsibilities centered on contract negotiations with the hotel (though he proved instrumental in a number of other ways!), and Melissa, being the ECCMA Associate Director, was to be my co-chair. Unfortunately, Melissa was laid off from ECCMA and was unable to continue as co-chair; but before she left, she proved most invaluable. She was a pleasure to work with, and demonstrated superb organizational skills. I also found her work to be extremely high quality (which, as a professional quality consultant, isn’t something I say about just anyone!). Thank you, both, for all your hard work; I couldn’t have done it without you!

So, almost a year ago, and with no budget to speak of, Melissa and I set out to organize the first IDQSummit. The result of our efforts finally came to fruition last week (October 6-9, 2014) in Richmond, VA, at the Wyndham Virginia Crossings Hotel and Conference Center. It was quite exciting to see a year’s worth of work unfold before my eyes. Approximately 100 attendees joined us from 11 countries around the world.

Attendees enjoyed 40 sessions, 12 tutorials, two expert panels, and four keynotes during the four-day event, covering a variety of topics within four key tracks: Data Quality, Data Governance, Data Analytics/Big Data, and Metadata. Speakers included well-known industry authors, such as: Bill Inmon (the Father of Data Warehousing), Dr. Peter Aiken, Laura Sebastian-Coleman, Danette McGilvray, Ed Lindsey, Dr. Alex Borek, Dr. John Talburt, David Marco, and Dr. Rajesh Jugulum. Other expert practitioners included: Alan Duncan, Anne Marie Smith and Sue Geuens (from DAMA International), Kelle O’Neal, Martha Dember, Michael Scofield, Nicola Askham, Ronald Damhof and Shane Downey. For a great overview of the tutorials and sessions from an attendee point of view, please read Alan Duncan’s blog post: “IDQSummit: Context is Crucial, but People are Paramount.”

We also hosted a Hawai’ian Shirt Social Monday evening, and a Vendor Expo Tuesday that included two of our top sponsors (Melissa Data and EWSolutions). The Vendor Expo also included an authors’ booth, a Civil War costume rental, a reception, and a join-in music jam with the “Porch Rockers.” It was awesome to see Pei Wang and Daniel Pullen, our two student speakers from UALR’s Ph.D. program, get up and perform. Daniel played guitar, while Pei sang a beautiful rendition of “Let it Go,” from the movie Frozen. We even heard our Closing Keynote, Dr. Alex Borek, joining us from Munich, Germany, play the guitar while singing “Stairway to Heaven,” by Led Zeppelin. It was great to see so many people joining in singing, playing guitar and jamming away with the various percussion instruments brought by the Porch Rockers. Everyone rocked!

Wednesday’s events included a fiery Data Quality Expert Panel about data quality definitions, moderated by Michael Scofield and sponsored by Data Blueprint, and an insightful conversation about ethics in our Data Governance Expert Panel, moderated by Anne Marie Smith and sponsored by Castlebridge Associates. Representatives from both IAIDQ and ECCMA stressed how important ethics are, both in business and in data.

Wednesday evening saw a number of attendees dress up in US Civil War era (1860s) costumes. Everyone gathered on the terrace to participate in an interactive troupe show depicting life during the US Civil War. One performer, Debbie, dressed as a Southern Belle, humorously told us how embarrassed she was to see women wearing trousers, which every good Southerner of the 1860s knows are only worn by men! She educated us on the language of the fan (information quality is very important here!) and the importance of knitting matching socks for the soldiers. (Yes, even during the throes of a civil war, quality is important!)

We also enjoyed watching as Mario Cantin was “recruited” into the ranks as a soldier of the Confederate South. (The Colonel didn’t mind one bit that Mario was from Canada. He said he’d take anyone who could button at least the top button of his uniform, which Mario did, expertly.) The troupe ended with musical entertainment as a soldier musician played troop songs on his banjo, with everyone joining in, including the “Alabama Yankee,” Anne Marie Smith, who entertained us by dancing a jig. I was amazed to find out how many of my favorite childhood songs were really from the US Civil War. I knew them all!

But, the entertainment didn’t stop there. After the troupe gave us a taste of what it was like to live in the South, we went inside for a real taste of Southern cooking at The Banquet of 1862. We feasted on a scrumptious Southern dinner of Grilled Ham Steaks with Whiskey-Apple Cider Glaze, Brown Sugar Glazed Sweet Potatoes, Fingerling Potatoes, Slaw, Corn Bread with whipped Apple Butter, and Bourbon Pecan Pie.

We then got a taste of the North, in the form of our Celebrity Keynote, President Abraham Lincoln. (Yeah, how many data conferences can say they had a former US president as a keynote!?) President Lincoln, performed by professional celebrity impersonator Tim Beasley, provided us with a first-hand account of how the Union Army was able to beat the Confederate Army, thanks in part to their “weapon of knowledge” – namely, the telegraph. President Lincoln explained how, with the ability to share information more quickly and more accurately (two data quality dimensions) by way of the telegraph, his Union Army was able to stay ahead of the Confederate troops, and ultimately win the US Civil War. It was a thought-provoking insight into how information quality played a pivotal role in shaping what is the United States of America today.

After President Lincoln departed, we raffled off 26 books, donated by our resident authors and their respective publishers. Winners were drawn from session evaluations and our Civil War Trivia Hunt, which was developed by volunteer, Ken Hansen. Ken did a fabulous job coming up with 20 questions that spurred conversation throughout the conference – Thank you, Ken! After the raffle, we finished up the Wednesday evening festivities with our Closing Keynote, Dr. Alex Borek, who presented “Cognitive, Cloud and Big Data: A New Beginning for Data Quality?,” offering insights to the future of information and data quality.

Throughout the conference attendees were provided never-ending Southern hospitality by hotel staff, not to mention the never-ending all-day snack bar. Breakfasts and lunches were also full of tasty Southern delights that changed daily. I don’t think anyone was hungry the entire week!

When all was said and done, we hosted one last event – the Friday Historical US Civil War Tour of Richmond, VA, the capital of the Confederacy. 23 people from various countries stayed an extra day to enjoy a private tour hosted by our Southern Belle from Wednesday night, Debbie. Debbie shared her passion for US Civil War history, highlighting a number of historical sites along the way. Our bus driver, Bud, kindly pulled over several times so we could take pictures. The tour included stops and private tours at the Virginia State Capitol building and St. John’s Church, where Patrick Henry gave his famous “Give me liberty, or give me death” speech. The tour ended with one last delicious Southern lunch at the famous Hanover Tavern, originally owned by Patrick Henry’s father-in-law.

All-in-all, it was a great conference! I would personally like to thank the Sponsors, Speakers, Keynotes, Authors, Volunteers and Staff who all helped make this conference a success. I’d also like to thank the Attendees – all of this was done for you with the hope that you would be engaged and excited about information and data quality (with a little US Civil War history thrown in). I hope you enjoyed the IDQSummit and were able to take away some great insights.

I would also like to say how thankful I am for having had the opportunity to chair the 2014 International Data Quality Summit. It was an exciting and educational challenge for me. And, although I have left IAIDQ and will not be chairing future events with the organization, I look forward to other similar opportunities already on the horizon.

Joy L. Medved, SSBB, IQCP, ADKAR
CEO / Principal Consultant
Paradata Consulting, LLC
Email: joy@paradata.us