Tag: Information/Data Quality Issues

  • Please buy Expedia an Atlas…

    Following on from Michele Neylon’s ongoing battle with Amazon, it looks like the illness has begun to affect Expedia (who may need to buy an atlas from Amazon).

    A colleague of mine just tried to book a hotel room in London for a weekend away. She had her itinerary number, had confirmed availability and price, and was trying to give her credit card details to pay for the booking.

    On Expedia, you have to tell them whether you have a UK or a non-UK address (I suspect that this is to present different address format templates). My colleague selected “Non-UK” and proceeded to fill in her address details.

    Until she got to the part where they wanted to capture Country. Ireland wasn’t listed. Neither was Éire, Republic of Ireland, Irish Republic or Southern Ireland (all common alternatives that are sometimes used).

    Nepal and the Northern Mariana Islands were available options though. Lucky for them.

    Let me put it another way… the drop-down list of countries was significantly incomplete for a company operating within the European Union (25 member states and counting). Ireland hasn’t been part of the UK since 1922.

    My colleague rang Expedia to find out what was going on and to see if the order could be completed over the phone. To her surprise she was told that “expedia can’t take orders from Ireland”. Which is the equivalent of “the computer says no” from Little Britain.

    I wonder if the legal eagles who hang out over at Tuppenceworth would have an opinion on the legality of Expedia’s business model, which to my mind smacks of an unjustified (and unjustifiable) restriction on the free movement of services within the European Union and the European Free Trade Area.

    In the meantime, my colleague will be using a different site to book her accommodation in London. Until, of course, “the computer says no”.

    (editor’s note: I’ll stick the links ‘n’ stuff into this later).

  • Propagation of information errors and the risks of using surrogate sources

    … ye wha’?

    A lot has been written, in relation to the electoral register and other matters, about using information from other sources to improve the quality of the information you already hold, or to create a new set of information.

    This makes sense: other people may already have done much of the work for you and, effectively, all you need to do is copy their work and edit it to meet your needs. In most cases it may be faster and cheaper to use such ‘surrogates’ for reality than to go to the real-world things themselves (people, stock-rooms, wherever) and build, from scratch, exactly the information you need in the format you require and to your own standards.

    There is, however, a price to pay for having such surrogate sources available to you. You need to accept that

    1. The format and structure of the information may need to be changed to fit your systems or processes
    2. The information you are using may itself be inaccurate, incomplete or inconsistent.
    3. If you are combining it with other information, it will require investment in tools and skills to properly match and consolidate your information into a valid version of the truth.

    These risks apply to organisations buying marketing lists to integrate with their CRM systems, but they apply equally to students relying on the Internet to provide the content for their academic projects, or to journalists trawling for content for newspaper articles or reviews.

    Recurrence of common errors, phrases or inaccuracies in term papers is one way that academia has of identifying academic fraud. Similar techniques might be applied in other arenas to identify and track instances of copyright infringement.

    In businesses dealing with thousands of records, the cost/risk analysis is relatively straightforward. My recommendation is that clear processes to manage suppliers, and to measure the quality of the information they provide you against a defined standard for completeness, consistency, duplication, conformity and so on, are essential. Random sampling of surrogate data sources for accuracy (not every 100th record, but a truly random sample) is also strongly recommended; a rough sketch of that kind of check follows below.
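
    By way of illustration, here is a minimal sketch (in Python, with invented field names and records) of the kind of acceptance check I have in mind: a truly random sample rather than every Nth record, plus simple completeness and duplication measures. Treat it as a sketch of the idea, not a finished quality standard.

```python
import random

def random_sample(records, sample_size, seed=None):
    """Draw a truly random sample - not every Nth record, which would
    miss any periodic pattern in how the supplier built the file."""
    rng = random.Random(seed)
    return rng.sample(records, min(sample_size, len(records)))

def completeness(records, required_fields):
    """Fraction of records where every required field is populated."""
    if not records:
        return 1.0
    complete = sum(
        1 for r in records
        if all(r.get(f) not in (None, "") for f in required_fields)
    )
    return complete / len(records)

def duplication(records, key_fields):
    """Fraction of records whose key repeats an earlier record's key."""
    if not records:
        return 0.0
    seen, dupes = set(), 0
    for r in records:
        key = tuple(str(r.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            dupes += 1
        seen.add(key)
    return dupes / len(records)

# Invented supplier file for illustration
supplier_file = [
    {"name": "A. Customer", "address": "1 Main St", "phone": "015550001"},
    {"name": "A. Customer", "address": "1 Main St", "phone": "015550001"},
    {"name": "B. Customer", "address": "", "phone": "015550002"},
]
for_review = random_sample(supplier_file, 2, seed=42)  # verify these against reality
print("completeness:", completeness(supplier_file, ["name", "address"]))
print("duplication:", duplication(supplier_file, ["name", "address"]))
```

    The point of the random sample is that the selected records are then checked against reality (ring the number, check the address); the metrics only tell you about what is in the file, not whether it is true.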

    These are EXACTLY the same techniques that manufacturing industries use to ensure the quality of the raw material inputs to their processes. If it works for industries where low quality can kill (such as pharmaceuticals), why shouldn’t it work for you?

    For students, journalists and those of us hacking away in the blogosphere the recommendation is simple. Only rely on surrogate sources if you absolutely have to. If you use someone else’s work as your source, credit them. If you don’t want to credit them, then make sure you verify the accuracy of their work, either by checking against reality yourself or by checking with at least one other source.

    That way you avoid having the errors of your source become your errors too, and you don’t run the risk of someone crying foul and either suing you for infringing their copyright (and copyright does apply to content posted on the Internet and in blogs) or applying whatever other sanctions exist (such as kicking you off your college course).

    In many cases the cost and effort involved in double-checking (particularly for a once-off piece of writing) are negligibly different from the cost of starting from scratch and building the information up yourself. And, depending on the context, it may even be more enjoyable.

    The New York Times not so long ago had to relearn the lessons of checking stories with at least one other source for accuracy.

    Horatio Caine in CSI:Miami always tells his team to “trust, but verify”.

    When using surrogate sources for real-world information in any arena, you must assess the risk of doing so and put in place the necessary controls so that you can trust, because you have verified.

    (c) Daragh O Brien 2006 (just in case)

  • The real cost to business of poor quality Information

    The Irish Independent, the Irish Times and Silicon Republic have all carried coverage over the last few days of TalkTalk, the Carphone Warehouse fixed-line subsidiary’s operation in Ireland (recently acquired from Tele2).

    According to Silicon Republic:

    “Talk Talk has been ordered by the Commission for Communications Regulation (ComReg) and the Data Protection Commissioner to make a public apology over complaints by consumers who received cold calls despite recording their preference not to receive unsolicited marketing calls.”

    In addition, they have been asked by BOTH regulators (ComReg and the Data Protection Commissioner) to immediately cease all direct marketing until an audit has been carried out.

    The root of the problem is that TalkTalk talked to people who had opted out of direct telemarketing on the National Directory Database. As such TalkTalk should not have been talktalking to these people. And some of them complained, to both the Data Protection Commissioner and the Communications Regulator.

    TalkTalk have pointed the finger of blame at “data integrity issues in their internal processes” and gaps in the data that they acquired from Tele2 when they purchased it.

    In the increasingly competitive telecommunications market, not being able to direct-market to prospective customers effectively puts you out of the game, leaving an increased reliance on indirect marketing such as posters or TV ads, none of which match the conversion rate of outbound telemarketing.

    The Information Quality lessons here are simple:

    1. Ensure that your critical core processes (such as marketing database maintenance) are defined, measured and controlled in an environment that supports Quality information.
    2. Make sure that your Information Architecture is capable of meeting the needs of your knowledge workers. If a key fact needs to be known about a customer or potential customer (such as their telemarketing preferences), it should be clearly defined, maintained and accessible (see the sketch after this list).
    3. When you are buying a new business or merging with another organisation, an important element of due diligence should be to look at the quality of their information assets. If you were buying a grocery store you would look at the quality of their perishable goods (are you buying a shop full of rotten tomatoes?). Buying the information assets of a business should be no different.
    4. “The obligation to the customer never ceases”. At some point somebody must have berated a TalkTalk Customer Service/Sales rep for ringing them during Corrie when they had opted out of direct marketing. Why was this not captured? Toyota’s Quality management method allows any employee to ‘stop the line’ if a quality problem is identified. In the context of a Call Centre, staff should have the ability to at least log where the information they have been provided doesn’t match with reality, and to act on that. If these call outcomes weren’t being logged, there was a vital component absent from the process. If the call outcomes were being logged but were not being acted on by Management, there was an absence of control in the process.
    5. “Cease management by Quota”. My guess is that all the staff in the call centres were being measured on how many calls they made and how many contacts they converted. Where these measures were not met I would suspect that there was a culture that made failure to hit targets unacceptable. Unfortunately taking time out to figure out why a customer’s view of their suppressions is different to what is on the screen impacts call duration and the number of calls you can make in a night. Also, removing records from calling lists as scrap and rework slows down the campaign management lifecycle (if the processes aren’t in place to do this as par for the course).
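
    To make the ‘stop the line’ point concrete, here is a minimal sketch (Python, with entirely hypothetical names and data – the NDD’s real interfaces look nothing like this) of the control that appears to have been absent: suppress opted-out numbers before a campaign ever dials them, and log any mismatch between the calling list and reality so that Management can act on it.

```python
def build_call_list(prospects, opted_out_numbers, mismatch_log):
    """Suppress opted-out numbers before the campaign dials them."""
    callable_prospects = []
    for p in prospects:
        if p["phone"] in opted_out_numbers:
            # 'Stop the line': log the defect instead of making the call
            mismatch_log.append({"phone": p["phone"],
                                 "reason": "opted out of direct marketing"})
        else:
            callable_prospects.append(p)
    return callable_prospects

# Invented data for illustration
prospects = [{"name": "J. Murphy", "phone": "015550001"},
             {"name": "M. Kelly", "phone": "015550002"}]
opted_out = {"015550002"}  # in reality, sourced from the opt-out register
log = []
to_call = build_call_list(prospects, opted_out, log)
print(len(to_call), "callable;", len(log), "suppressed and logged")
```

    If the log stays empty but customers are still complaining, that in itself is a measurement: the opt-out data feeding the process is wrong, and that is where the Root Cause Analysis starts.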

    So now TalkTalk’s call centres are lying idle. TalkTalk has joined Irish Psychics Live among the first businesses to have a substantial penalty, in terms of fines or interruption of business, imposed on them by the Regulatory authorities for Data Protection issues. There are a lot of call quotas not being met at the moment.

    I will be interested to hear what the audit of TalkTalk brings to light.

  • Electoral Information Quality – A Consolidating post

    As the blog is getting legs a bit now, I thought it best to consolidate the posts of the last few weeks on the Electoral Register issues into one point of reference, particularly for readers new to the site.

    I am also taking the opportunity to upload a few additional articles that I have written on the issue to the blog for reference.

    Articles:

    First up is a draft paper I have put together on the proposed solutions and why they are likely to be inadequate. 

    Next up is a link to an article I have had published in an International newsletter for Information Quality Management Professionals.

    Finally, there is an article based on my post from earlier this month on what scrap and rework is. This article was submitted to national newspapers as an opinion piece – and I should acknowledge the assistance of Simon over on Tuppenceworth in whipping it into shape. Click here to download the Scrap and Rework article. The article is also reproduced as an appendix in the previously mentioned report.

    As regards posts – pretty much any of the posts in the Information Quality/Electoral Data Quality category are relevant. I will double check all the post categorisations to make sure that nothing is missing.

    That’s my update for today.

  • Process Design & Quality

    Quality is defined as the ability of a product or piece of information to meet or exceed the expectations of its customers/consumers.

    Quality begins at the design stage, at the whiteboard when you are figuring out how your process should work. I won’t waste my energy today rattling on about our Electoral Register issues; rather, I’ll take a different example…

    http://www.theregister.co.uk/2006/05/10/ms_messenger_paradox/

    This is an example of poor quality information. The instructions presented to the customer are illogical and set up a logical recursion that would stump many a Dalek.

    Reminds me of the joke about the computer programmer who was found dead in the shower. The shampoo bottle instructions read “Wash, Rinse, Repeat”.

  • The Irish Times Editorial yesterday…

    The Irish Times ran a nice editorial piece yesterday (2006/05/11) on the Electoral Register that highlighted the administrative failure that surrounds the Electoral Register.

    One good line: “If the potential for abuse exists within the voting system – and it unequivocally does – the Government’s responsibility is to protect democracy and correct the electoral register.”

    The Old Lady of D’Olier Street correctly identifies that there is a “reluctance at official and political level to step outside traditional mechanisms to address the situation”.

    The reported ruling out by Dick Roche of the use of personal identification at polling stations reflects this apparent lack of willingness to change processes to improve the quality of the register and improve the controls on the Electoral process. As the Irish Times rightly points out, you have to produce evidence of address to get a parking permit from Local Authorities. I can’t join a library in Dublin because I don’t have a utility bill at an address in Dublin.

    The chronic lack of leadership on this issue is appalling. Here’s a link to a piece I’ve written in an International Journal for Information Quality people… it references some research that was done in 2004 into attitudes to Information Quality in UK Public Sector organisations…

    … in the meantime, as the Irish Times haven’t responded to the opinion piece I sent them based on my chocolate cake/scrap & rework post, I suppose I’ll have to tout it around the other papers… I’ll stick up the draft for reference here later today.

  • Why Scrap and Rework isn’t good enough

    Simon has thrown down a bit of a challenge… can I show why Information Scrap and Rework isn’t good enough, given that it seems like a sensible starting point…

    First off… let me provide a reference that should educate and delight (at least some of you) and explains what this Information Quality yoke is all about… THERE we go. The reference is a little old (2002), but for an update come to ICTEXPO on Friday.

    Now… why isn’t Scrap and Rework good enough?

    Who likes chocolate cake? Isn’t it a pain when your face gets covered in chocolate from mashing handfuls of cake into your gob? But you can wipe your face (usually in your sleeve) and carry on. That’s scrap and rework. A better solution is to wipe your face and take a smaller division of cake (a forkful). That is a change in the process based on an analysis of why you keep getting a chocolatey face, coupled with a scrap and rework task to set a baseline of cleanliness for your face that you will seek to maintain.

    Simon is right – scrap and rework looks like a good place to start, and when you say “Data Quality” to most people that is what they think of, under labels like “data scrubbing”, “data cleansing” or similar. However, it doesn’t address the actual source of the poor information quality, much as wiping your face in your sleeve doesn’t stop your face getting covered in chocolate.

    Therefore, once you clean your database, you will very quickly find it filling up with duff data again. Which eventually results in another round of scrap and rework to fix things again. Which then leads people to say that Information Quality management doesn’t work and costs lots of money. But scrap and rework isn’t information quality management. It is a process step in improving the quality of your information, but it is just one step among many that range from culture change (from apathy to active interest) to process change to training and so on.

    Tom Redman is one of the co-founders of the IAIDQ. His metaphor is that databases are like lakes. No matter how many times you clean the lake, if you don’t address the sources of ‘pollution’ (root causes, cake-eating processes) then you will never achieve good quality.

    To put it in professional terms that Simon (law-talking boyo that he is) might understand, scrap and rework is like apologising and offering some compensation every time you punch a complete stranger in the face. A far better solution is to examine why it is you punch strangers in the face and stop doing it. Your apologies and offers of money to the injured fix the historical damage but do not prevent future occurrences. And I doubt Simon would counsel any of his firm’s clients to continue punching strangers in the face.

    Scrap and rework is costly. Scrap and rework on a repetitive, institutionalised basis is futile, creating a sense of doing something about your Information Quality without actually getting anywhere, burning a pile of cash to stand still. A single round of it is an important step in any information quality management programme. However, understanding your data capture processes and the root causes of your poor quality data, and then acting to improve those processes to address those root causes, are the components that contribute to a sustained improvement in quality.

    Scrap and rework solves the problems of today at a short-term economic cost. However, it serves to bury the problems of tomorrow unless it takes place in tandem with process improvement to address root cause and the development of a ‘Quality culture’.

    To tie this back to the Electoral Register, relying on scrap and rework would mean that we would get a clean register at a point in time. However, over time the register would degrade in quality again, in the same way as your face gets dirty again if you don’t change the way you eat your cake.

    Now put that chocolate cake down and get a fork!

  • Response to Damien Blake’s post on irishelection.com

    Damien Blake has written a good piece on IrishElection.com regarding his view on how to address the Electoral Register.

    Much of what he says has merit as a short term solution. He suggests that we scrap the register (no problem there) and rework it from scratch (again, no problem there) rather than limp on with a defective register.

    Damien suggests that my writings will eventually find their way into undergraduate or postgraduate theses. I too look forward to the day when clear thinking about the fundamental best practices of Quality Management applied to Information forms part of University curricula at undergraduate and postgraduate level, much as the practices of Manufacturing Quality Management gained acceptance. In the US this is already beginning to happen, and I have been involved in curriculum development work with an Irish University in a similar vein.

    What I have written is based on a number of years (the best part of a decade) working in complex Information Management environments, and on the shared experiences of other practitioners in the Information Quality Management space with whom I have spoken at conferences (internationally) and with whom I work on a regular basis as a Director of the International Association for Information & Data Quality. The techniques, methodologies and approaches I put forward are based on my real-world, practical experience of applying best practices that have been proven in other industries and disciplines.

    Damien further goes on to suggest using the PPS Number and associated data to register people – preferably automatically. What Damien has suggested here is a process change to address root causes of poor data quality. Excellent. That is what I have been writing about… well, at least as far as the review/change of processes goes. I’ll come to my concerns with his proposal in a moment. Well done for thinking about the root causes of the problem and how the processes can be changed to address it. Top of the class, that man.

    Damien’s suggestion doesn’t address the fact that there is no legal obligation on anyone to register to vote, and it could even be argued that one has a constitutional right not to register to vote. Automatic voter registration based on a “Single View of Voter” may not be a runner. Also, the Data Protection Commissioner has limited the uses that a PPS number can be put to – however I am sure legislation could get around that. The Digital Rights Ireland site has a nice paper on it about the ‘scope creep’ in the use of PPS Numbers that I’ve referenced in an earlier post.

    Ultimately, even if we scrapped the Register in the morning and rebuilt it in a shining pristine form, the simple fact is that name and address data degrades at a significant rate. In the absence of clear controls and processes to manage and maintain that data at an acceptable level of quality we will find ourselves rapidly returning to a situation where the Register is unreliable – Scrap and Rework is not the route to a high quality Information asset. It is a step on the journey, and an important process – but in the absence of process re-engineering to address root causes of defects and deficiencies the inevitable result is more scrap and rework.

    Damien’s proposal to use the name/address/citizenship data associated with the PPS number would serve to reduce redundancy of data (multiple copies of the same data held in multiple data stores), but it may run into Data Protection issues. As a change of process for the management of Electoral Register data to address deficiencies in the existing process, though, it has merit. But should you have to be a taxpayer or a recipient of State benefits in order to vote?

    Damien also suggests a Statutory agency, or a reallocation of resources/roles, to task someone with maintaining the Register. Again, I am in wholehearted agreement. What Damien proposes here is a review of the Governance model for this data to give clear accountability, authority and mandate, and (I would surmise) a standardisation of processes, controls and toolsets for managing and measuring the quality of Electoral Register data. I fully agree with the general thrust of Damien’s proposal, although in my view the specifics of that Governance model should be aligned with the requirements of the process and of the controls necessary to ensure the quality of the electoral register – simply assigning a role with a stroke of a pen does not deliver quality improvements.

    I agree with Damien that the process for voter registration and for maintaining that data should be as simple as possible. Clear definition of processes and business rules to support ‘flow-through’ registration and data maintenance are part of the Information Architecture design that should underpin any long-term solution. Simplicity of process could be part of the ‘customer expectation’ against which the quality of the process (and the information it produces) could be measured. However, a simple ‘customer’ interface that sits on top of chaotic processes, riddled with deficiencies and absences of controls to ensure the quality of the Information, will not achieve the full objective of a simple-to-operate set of processes or functions that deliver reliable, high-quality Electoral Register information.

    Damien is right. We need to start again. We need to start again in terms of the information in the Register. We need to start again in terms of the Governance model that is put in place to manage this Information Asset. We need to start again in terms of the processes that people follow to create, update and maintain that information, to ensure that we achieve our objective of a reliable, accurate (within a margin for error) Electoral Register. We need to start again in terms of how we think about the ‘architecture’ in which this Electoral Information is held. We need to start again in terms of ensuring that we adopt appropriate technologies and strategies to address identified weaknesses in the processes for managing our Electoral Register data.

    However, to focus just on scrap and rework simply solves the problem of today. Addressing the root causes in the processes and governance as well as conducting scrap and rework on the data solves today’s problems and prevents those of tomorrow.

    I’m glad Damien and I are in such agreement on the principles, even though we may differ in our view of the specifics (specific solutions aren’t my goal here – raising awareness of Best Practices was my intention). I hope he can find the time to attend the Information Quality Master Class that is being held in the RDS on Friday, where a person with even more years’ experience in this domain than I will be sharing his knowledge with delegates.

  • My cunning plan…

    I have pondered the electoral roll situation for the last few days. I believe I have come up with a suggestion for the Government that

    a) is feasible (at least based on my wet finger estimates of what is involved)

    b) doesn’t break any laws (as far as I can see)

    c) might actually work (assuming we don’t let politics get in the way).

    My plan is so cunningly simple that I might consider copyrighting it; however, as it is based on core principles of best practice in Information Quality Management, I don’t think I’d get very far. What it doesn’t do is blow smoke up journalists’ wossits (thanks to Simon at Tuppenceworth for pointing that out) while the deckchairs on the Titanic are rearranged in an unamusing game of ‘Find the Lady’.

    My suggestion is as follows:

    1. Review the Electoral Register processes – look for the root causes of our whopping overstatement
      • Is there a source of information for registered deaths to remove the dear departed from the list?
      • Does the process handle the mobility of the population in a suitably robust manner? If not, how can that be addressed?
      • Do people actually know the correct process to follow when they move house? If not, how can that be addressed (no pun intended)?
      • Does the current structure of Local Authorities managing Electoral Register data, without a clear central authority with control/co-ordination functions (such as to build the national ‘master’ file), contribute to the overstatement of the Register?
        • Actually, if I’m not too busy over the coming weeks I might actually do some research and do this bit for the Government. If anybody wants it they can email me at daragh AT obriend DOT com.
    2. Modify the Processes and Information Architecture to address the Root Causes identified in 1) above.
      • What additional ‘data markers’ can/should be used to more uniquely identify people – PPS numbers may be a good candidate key, but other data markers (date of birth, mother’s maiden name etc.) might also be useful to allow for matching on name + 2 other values (see the sketch after this list)
      • Could the Electoral Register process make use of a data source of people who are moving house (such as An Post’s mail redirection service or newaddress.ie)? How can that be utilised in an enhanced process to manage & maintain the electoral register? These are technically surrogate sources of reality rather than ‘reality’ itself, but they might be useful.
      • Document the revised processes
      • Define new systems and database requirements based on those processes
      • Implement revised organisation frameworks (such as centralised ‘master file’ and a centralised Electoral Register governance board)
      • Change work practices in Local Authorities
      • Define stewardship roles and responsibilities for Electoral Register Data Quality
    3. Build the necessary culture, process, systems etc.
      • Update the Electoral Register Information Architecture to support the revised processes – process first then technology is the rule.
      • Invest in appropriate software tools to automate and support matching across datasets to build a “Single View of Voter” master file.
      • Invest in training of people (local authority staff, central government staff etc.) to use the processes correctly.
      • Instill a culture of quality with regard to our Electoral Register data
    4. Write to all individuals currently on the electoral roll (including the probable duplicates)
      • Inform them that the electoral register is being renewed in its entirety and if they do not re-register at their current address they will not be entitled to vote. Put a closing date on this well in advance of the election and the issuing of the draft register of electors for the election…
      • Re-registration could be by a paper form, an OCR scannable form or (if the processes are right) through an on-line registration function feeding the central Single View of Voter.
      • Much like the Sweepstakes – if you are not in you can’t win (hell, offer a holiday as a prize or something if necessary).
    5. Destroy the current electoral register & begin media awareness campaign
      • Once the mail out has been done, the current electoral register should be retired from use
      • TV, Radio, Internet, information leaflets etc. should be produced to train/educate people on the process for registering to vote and for maintaining their voter record if they move address.
    6. Monitor and control
      • The ‘Governance Board’ should institute a monitoring and control check process, using ‘surrogate’ data sources such as newaddress.ie to verify the percentage of known ‘movers’ who are updating their electoral register details.
      • Likewise the Register of Deaths should be used to check the % of ‘ex-citizens’ who have had their details removed from the register
      • Census data might be used when available to do a full audit of the electoral register on a regular basis.
      • Where the percentage error rate for movers and/or deceased goes above a defined threshold per measurement cycle this should trigger a Root Cause Analysis review and may prompt a dump and refresh of the electoral register as proposed here.
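
    As flagged under point 2 above, here is a rough sketch of what matching on name plus two other data markers might look like. This is Python with an invented record layout and a deliberately crude normalisation; real matching would need proper standardisation, fuzzy comparison and manual review of borderline cases.

```python
def normalise(value):
    """Crude normalisation so trivially different values still compare."""
    return "".join(str(value).lower().split())

def same_voter(a, b, markers=("date_of_birth", "pps_number")):
    """Match when normalised names agree AND at least two markers agree."""
    if normalise(a["name"]) != normalise(b["name"]):
        return False
    agreeing = sum(
        1 for m in markers
        if a.get(m) and b.get(m) and normalise(a[m]) == normalise(b[m])
    )
    return agreeing >= 2

# Invented records for illustration
existing = {"name": "Mary Citizen", "date_of_birth": "1970-01-01",
            "pps_number": "1234567A"}
incoming = {"name": "mary citizen", "date_of_birth": "1970-01-01",
            "pps_number": "1234567A"}
print(same_voter(existing, incoming))  # True -> probable duplicate to merge
```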

    The key to this is clear definition of process, and an acceptance that juggling with things in a half-baked way will not deliver the desired sustainable improvement in quality and reliability. Measurement is also critical – not in the context of setting quotas or targets, but from the perspective of measuring whether the process is performing as we expected.
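
    As a small illustration of that kind of measurement (the threshold and figures here are invented), the per-cycle check described under point 6 might be as simple as:

```python
def check_cycle(known_movers, movers_updated, threshold=0.10):
    """Flag a measurement cycle for Root Cause Analysis when the rate of
    known movers who did NOT update their register details breaches the
    defined threshold."""
    error_rate = 1 - (movers_updated / known_movers) if known_movers else 0.0
    return error_rate, error_rate > threshold

rate, trigger_rca = check_cycle(known_movers=2000, movers_updated=1600)
print(f"error rate {rate:.0%} -> trigger Root Cause Analysis: {trigger_rca}")
```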

    It takes a bit more cogitation time than plumping for Enumerators to do the work (although there is a potential role for the CSO and Enumerators in an ‘audit’ capacity for the electoral register in this context). However, it is based on sound principles of quality management and will deliver a sustained increase in the quality of our core Democratic Information Asset.

    To learn more about Total Information Quality Management, come and see Larry English at the Irish Computer Society/IAIDQ Information Quality Master Class in the RDS on the 5th of May.