Category: The Business of IQ

A category to collect and collate posts on the business aspects of information quality. Will be used to create a pre-defined view of pages in menu bar.

  • St. Patrick’s Day Special

    image with bottled water being passed through a kettle and into a sink to give hot water

    I found this on http://www.motivatedphotos.com and it struck me that it is a wonderful metaphor for data integration, information quality, and data governance in many organisations where they are reacting to issues, sustaining silos, or viewing all of this as an IT issue rather than a business challenge, or trying to solve the challenge with series of fragmented department level initiatives.

    Thoughts?

  • Personal Data – an Asset we hold on Trust

    There has been a bit of a scandal in Ireland with the discovery that Temple St Children’s Hospital has been retaining blood samples from children indefinitely without the consent of parents.

    The story broke in the Sunday Times just after Christmas and has been picked up as a discussion point on sites such as Boards.ie.  TJ McIntyre has also written about some of the legal issues raised by this.

    Ultimately, at the heart of the issue is a fundamental issue of Data Protection Compliance and a failure to treat Personal Data (and Sensitive Personal Data at that) as an asset (something of value) that the Hospital held and holds on trust for the data subject. It is not the Hospital’s data. It is not the HSE’s data. It is my child’s data, and (as I’m of a certain age) probably my data and my wife’s data and my brothers’ data and my sisters-in-laws’ data…..

    It’s of particular interest to me as I’m in the process of finishing off a tutorial course on Data Protection and Information Quality for a series of conferences at the end of February (if you are interested in coming, use the discount code “EARLYBIRD” up to the end of January to get a whopper of a discount). So many of the issues that this raises are to the front of my mind.

    Rather than simply write another post about Data Protection issues, I’m going to approach this from the perspective of Information as an Asset which has a readily definable Life Cycle at various points in which key decisions should be taken by responsible and accountable people to ensure that the asset continues to have value.

    Another aspect of how I’m going to discuss this is that, after over a decade working in Information Quality and Governance, I am a firm believer in the mantra: “Just because you can doesn’t mean you should“. I’m going to show how an Asset Life Cycle perspective can help you develop some robust structures to ensure your data is of high quality and you are less likely to fall foul of Data Protection issues.

    And for anyone who thinks that Data Protection and Data Quality are unrelated issues, I direct you to the specific wording in the heading of Chapter 2, Section 1 of the Directive 95/46/EC. (more…)

  • Who then is my customer?

    Two weeks ago I had the privilege of taking part in the IAIDQ’s Ask the Expert Webinar for World Quality Day (or as it will now be know, World Information Quality Day).

    The general format of the event was that a few of the IAIDQ Directors shared stories from their personal experiences or professional insights and extrapolated out what the landscape might be like in 2014 (the 10th anniversary of the IAIDQ).

    A key factor in all of the stories that were shared was the need to focus on the needs of your information customer, and the fact that the information customer may not be the person who you think they are. More often than not, failing to consider the needs of your information customers can result in outcomes that are significantly below expectations.

    One of my favourite legal maxims is Lord Atkin’s definition of who your ‘neighbour’ is who you owe legal duties of care to. He describes your ‘neighbour’ as being anyone who you should reasonably have in your mind when undertaking any action, or deciding not to take any action. While this defines a ‘neighbour’ from the point of view of litigation, I think it is also a very good definition of your “customer” in any process.

    Recently I had the misfortune to witness first hand what happens when one part of an organisation institutes a change in a process without ensuring that the people who they should have reasonably had in their mind when instituting the change were aware that the change was coming.

    My wife had a surgical procedure and a drain was inserted for a few days. After about 2 days, the drain was full and needed to be changed. The nurses on the ward couldn’t figure out how to change my wife’s drain because the drain that had been inserted was a new type which the surgical teams had elected to go with but which the ward nurses had never seen before.

    For a further full day my wife suffered the indignity of various medical staff attempting to figure out how to change the drain.

    1. There was no replacement drain of that type available on the ward. The connections were incompatible with the standard drain that was readily available to staff on the ward and which they were familiar with.
    2. When a replacement drain was sourced and fitted, no-one could figure out how to actually activate the magic vacuum function of it that made it work. The instructions on the device itself were incomplete.

    When the mystery of the drain fitting was eventually solved, the puzzle of how to actually read the amount of fluid being drained presented itself, which was only of importance as the surgeon had left instructions that the drain was to be removed once the output had dropped below a certain amount. The device itself presented misleading information, appearing to be filled to one level but when emptied out in fact containing a lesser amount (an information presentation quality problem one might say).

    The impacts of all this were:

    • A distressed and disturbed patient increasingly worried about the quality of care she was receiving.
    • Wasted time and resources pulling medical staff from other duties to try and solve the mystery of the drain
    • A very peeved and increasingly irate quality management blogger growing more annoyed at the whole situation.
    • Medical staff feeling and looking incompetent in front of a patient (and the patient’s family)

    Eventually the issues were sorted out and the drain was removed, but the outcome was a decidedly sub-optimal one for all involved. And it could have been easily avoided had there been proper communication about the change to the ward nurses and the doctors in the department from the surgical teams when they changed their standard. Had the surgical teams asked the question of who should they have in their minds to communicate with when taking an action, surely the post-op nurses should have featured in there somewhere?

    I would be tempted to say “silly Health Service” if I hadn’t seen exactly this type of scenario play out in day to day operations and flagship IT projects during the course of my career. Whether it is changing the format of a spreadsheet report so it can’t be loaded into a database or filtered, changing a reporting standard, changing meta-data or reference data, or changing process steps, each of these can result in poor quality information outcomes and irate information customers.

    So, while information quality is defined from the perspective of your information customers, you should take the time to step back and ask yourself who those information customers actually are before making changes that impact on the downstream ability of those customers to meet the needs of their customers.

  • Bank of Ireland – again

    The Irish Times today reports that Bank of Ireland are again investigating incidents of double charging of customers who use LASER cards.

    I wrote about this last month (see the archives here), picking up on a post from Tuppenceworth.ie earlier in the summer. I won’t be writing anything more about the issue (at least not for now).

    Looking back through my archives I found the picture below in a post that I’d written back in May when Simon on Tuppenceworth first raised his issue with BOI’s Laser Cards.

  • What’s in a name?

    Mrs DoBlog and I are anxiously awaiting the arrival of a mini-DoBlog any day now. So we have spent some time flicking through baby name books seeking inspiration for a name other than DoBlog 2.0.

    In doing so I have been yet again reminded of the challenges faced by information quality professionals when trying to unpick a concatenated string of text in a field that is labelled “Name”. The challenges are manifold:

    • Name formats differ from  to culture to culture – and it is not a Western/Asian divide as some people might assume at first.
    • Master Data for name spellings is notoriously difficult to obtain. My wife and I compared spellings of some common names in two books of baby names and the variations were staggering, with a number of spellings we are very familiar with (including my own name) not listed in either.
    • Often Family Names (surnames) can be used as Given Names (first names) such as Darcy (D’Arcy) or Jackson (Jackson) or Casey.
    • Often people pick names for their children based on where they were born or where they were conceived (Brooklyn Beckham, the son of footballer David Beckham is a good example).
    • Non-name words can appear in names, such as “Meat Loaf” or “Bear Grylls
    • Douglas Adams famously named a character in the Hitchhiker’s Guide to the Galaxy after one of the “dominant life forms” – a car called a “Ford Prefect
    • Names don’t always fit into an assumed varchar(30) or even varchar(100) field.
    • It is possible to have a one character Given name and a one character Family name.
    • Two character Family names are more common than we think.
    • Unicode characters, hyphens, spaces, apostrophes are all VALID in names – particularly if they are diacritical marks which change the meaning of words in particular languages.
    • And then you have people who change their names to silly things to be “different” or “special”,  but who create interesting statistical challenges for data profilers and parsing tools.

    Among the examples I found flicking through one of our baby name books last evening where “Alpha” and “Beta”. Personally I think it sends the wrong signals to name your children after letters of the Greek alphabet, but I’m sure it is helpful if you have had twins to keep them in order.

    I also found “Bairn” given as a Scots Gaelic name for a baby girl. I had to laugh at this as “Bairn” is actually a Scots dialect word for Child. Even Wikipedia recognises this and has a redirect from “Bairn” to “child“.  But it does remind me of the terribly sexist “joke” where the father asks the doctor after the birth whether it is a boy or a child his wife has just delivered. (more…)

  • A game changer – Ferguson v British Gas

    Back in April I wrote an article for the IAIDQ’s Quarterly Member Newsletter picking up on my niche theme, Common Law liability for poor quality information – in other words, the likelihood that poor quality information and poor quality information management practices will result in your organisation (or you personally) being sued.

    I’ve written and presented on this theme many times over the past few years and it always struck me how people started off being in the “that’s too theoretical” camp but by the time I (and occasionally my speaking/writing partner on this stuff, Mr Fergal Crehan) had finished people were all but phoning their company lawyers to have a chat.

    To an extent, I have to admit that in the early days much of this was theoretical, taking precedents from other areas of law and trying to figure out how they fit together in an Information Quality context. However, in January 2009 a case was heard in the Court of Appeal in England and Wales which has significant implications for the Information Quality profession and which has had almost no coverage (other than coverage via the IAIDQ and myself). My legal colleagues describe it as “ground breaking” for the profession because of the simple legal principle it creates regarding complex and silo’d computing environments and the impact of disparate and plain crummy data. I see it as a clear rallying cry that makes it crystal clear that poor information quality will get you sued.

    Recent reports (here and here) and anecdotal evidence suggest that in the current economic climate, the risk to companies of litigation is increasing. Simply put, the issues that might have been brushed aside or resolved amicably in the past are now life and death issues, at least in the commercial sense. As a result there is now a trend to “lawyer up” at the first sign of trouble. This trend is likely to accelerate in the context of issues involving information, and I suspect, particularly in financial services.

    A recent article in the Commercial Litigation Journal (Frisby & Morrison, 2008) supports this supposition. In that article, the authors conclude:

    “History has shown that during previous downturns in market conditions, litigation has been a source of increased activity in law firms as businesses fight to hold onto what they have or utilise it as a cashflow tool to avoid paying money out.”

    The Case that (should have) shook the Information Quality world

    The case of Ferguson v British Gas was started by Ms. Ferguson, a former customer of British Gas who had transferred to a new supplier but to whom British Gas continued to send invoices and letters with threats to cut off her supply, start legal proceedings, and report her to credit rating agencies.

    Ms Ferguson complained and received assurances that this would stop but the correspondence continued. Ms Ferguson then sued British Gas for harassment.

    Among the defences put forward by British Gas were the arguments that:

    (a) correspondence generated by automated systems did not amount to harassment, and (b) for the conduct to amount to harassment, Ms Ferguson would have to show that the company had “actual knowledge” that its behaviour was harassment.

    The Court of Appeal dismissed both these arguments. Lord Justice Breen, one of the judges on the panel for this appeal, ruled that:

    “It is clear from this case that a corporation, large or small, can be responsible for harassment and can’t rely on the argument that there is no ‘controlling mind’ in the company and that the left hand didn’t know what the right hand was doing,” he said.

    Lord Justice Jacob, in delivering the ruling of the Court, dismissed the automated systems argument by saying:

    “[British Gas] also made the point that the correspondence was computer generated and so, for some reason which I do not really follow, Ms. Ferguson should not have taken it as seriously as if it had come from an individual. But real people are responsible for programming and entering material into the computer. It is British Gas’s system which, at the very least, allowed the impugned conduct to happen.”

    So what does this mean?

    In this ruling, the Court of Appeal for England and Wales has effectively indicated a judicial dismissal of a ‘silo’ view of the organization when a company is being sued. The courts attribute to the company the full knowledge it ought to have had if the left hand knew what the right hand was doing. Any future defence argument grounded on the silo nature of organizations will likely fail. If the company will not break down barriers to ensure that its conduct meets the reasonable expectations of its customers, the courts will do it for them.

    Secondly, the Court clearly had little time or patience for the argument that correspondence generated by a computer was any less weighty or worrisome than a letter written by a human being. Lord Justice Jacob’s statement places the emphasis on the people who program the computer and the people who enter the information. The faulty ‘system’ he refers to includes more than just the computer system; arguably, it also encompasses the human factors in the systemic management of the core processes of British Gas.

    Thirdly, the Court noted that perfectly good and inexpensive avenues to remedy in this type of case exist through the UK’s Trading Standards regulations. Thus from a risk management perspective, the probability of a company being prosecuted for this type of error will increase.

    British Gas settled with Ms Ferguson for an undisclosed amount and was ordered to pay her costs.

    What does it mean from an Information Quality perspective?

    From an Information Quality perspective, this case clearly shows the legal risks that arise from (a) disconnected and siloed systems, and (b) inconsistencies between the facts about real world entities that are contained in these systems.

    It would appear that the debt recovery systems in British Gas were not updated with correct customer account balances (amongst other potential issues).

    Ms. Ferguson was told repeatedly by one part of British Gas that the situation was resolved, while another part of British Gas rolled forward with threats of litigation. The root cause here would appear to be an incomplete or inaccurate record or a failure of British Gas’ systems. The Court’s judgment implies that that poor quality data isn’t a defence against litigation.

    The ruling’s emphasis on the importance of people in the management of information, in terms of programming computers (which can be interpreted to include the IT tasks involved in designing and developing systems) and inputting data (which can be interpreted as defining the data that the business uses, and managing the processes that create, maintain, and apply that data) is likewise significant.

    Clearly, an effective information quality strategy and culture, implemented through people and systems, could have avoided the customer service disaster and litigation that this case represents.  The court held the company accountable for not breaking down barriers between departments and systems so that the left-hand of the organization knows what the right-hand is doing.

    Furthermore, it is now more important than ever that companies ensure the accuracy of information about customers, their accounts, and their relationship with the company, as well as ensuring the consistency of that information between systems. The severity of impact of the risk is relatively high (reputational loss, cost of investigations, cost of refunds) and the likelihood of occurrence is also higher in today’s economic climate.

    Given the importance of information in modern businesses, and the likelihood of increased litigation during a recession, it is inevitable: poor quality information will get you sued.

  • Finding Red Herrings or Missing a Trick?

    This post is written by Colin Boylan, an independent market research professional based in Wicklow, Ireland with extensive experience in Market Research in pharma and other industries in the UK and Ireland. In this post, Colin explains how the quality of the population sample used in a market research study can have significant effects on the quality of the findings. His post was inspired by recent posts here and here about “Golden Databases“. I’m glad to give Colin a chance to try his blogging chops out and I hope visitors here enjoy reading his insights in to information quality and market research.

    Finding Red Herrings or Missing a Trick?

    For most businesses there are major advantages to investing money in doing direct research with your customer base   In theory it’s a ready built list of people who are familiar with your business – so they can speak with authority on their experience as your customer.
    The value of customer research to business should be by now fairly obvious, but there’s an old saying in research (and elsewhere) – “garbage in, garbage out”. The insights built off the data
    generated from your customer list is only as relevant as the list of people you ask to participate in the research.
    However if, for example, they are lapsed customers then researching them is going to give you a picture of what your past customers wanted from you (unless these people are the focus of your research of course).   Is this the same as what your present customers want?  And if you are looking for why past customers stopped dealing with you and use a list full of current  customers you end up with either few people able to answer the questions you set or …worse….data from people who shouldn’t have answered the question – which leads to another scenario.
    Picture an important piece of research done with a list of past and present customers mixed in together with no way to tell who is who.  Do current and ex-customers differ in their wants and needs from your business?     I don’t know – and neither do you.   So how useful are any insights generated from this research?  Not being able to separate these two groups gives rise to two potential scenarios.  Either the excess numbers in there are throwing up ‘clear’ results that are not applicable to your current customers or the combination of both bodies is adding noise which stops you uncovering real insights about the customers you’re interested in – you’re either finding red herrings or you’re missing a trick!
    I’ve used just one scenario here to make a point that can be applied to lots of customer data stored by companies – be it incorrect regional information, incorrect gender, you can add whatever block of data is relevant to your own company here and the story is the same.   If the data is not accurate then any use it is put to suffers.

    For most businesses there are major advantages to investing money in doing direct research with your customer base In theory it’s a ready built list of people who are familiar with your business – so they can speak with authority on their experience as your customer.

    The value of customer research to business should be by now fairly obvious, but there’s an old saying in research (and elsewhere) – “garbage in, garbage out”. The insights built off the data generated from your customer list is only as relevant as the list of people you ask to participate in the research.

    However if, for example, they are lapsed customers then researching them is going to give you a picture of what your past customers wanted from you (unless these people are the focus of your research of course). Is this the same as what your present customers want? And if you are looking for why past customers stopped dealing with you and use a list full of current customers you end up with either few people able to answer the questions you set or …worse….data from people who shouldn’t have answered the question – which leads to another scenario.

    Picture an important piece of research done with a list of past and present customers mixed in together with no way to tell who is who. Do current and ex-customers differ in their wants and needs from your business? I don’t know – and neither do you. So how useful are any insights generated from this research? Not being able to separate these two groups gives rise to two potential scenarios. Either the excess numbers in there are throwing up ‘clear’ results that are not applicable to your current customers or the combination of both bodies is adding noise which stops you uncovering real insights about the customers you’re interested in – you’re either finding red herrings or you’re missing a trick!

    I’ve used just one scenario here to make a point that can be applied to lots of customer data stored by companies – be it incorrect regional information, incorrect gender, you can add whatever block of data is relevant to your own company here and the story is the same. If the data is not accurate then any use it is put to suffers.

  • The Risk of Poor Quality Information (2) #nama

    84% fail. Do you remember that statistic from my previous post?

    In my earlier post on this topic I wrote about  how issues of identity (name and address) can cause problems when attempting to consolidate data from multiple systems into one Single View of Master Data. I also ran through the frightening statistics relating to the failure rate of these types of projects, ranging up to 84%.

    Finally, I plugged two IAIDQ conferences on this topic. One is in Dublin on the 28th of September. The other is in Cardiff (currently being rescheduled for the end of next month).

    A key root cause of these failure rates has been identified. At the heart of many of these failures is a failure to understand and profile data  to better understand the risks and issues in the data that is being consolidated.

    So, if we assume that the risk that is NAMA is a risk that the Government will take, surely then it behoves the Government and NAMA to ensure that they take necessary steps to mitigate the risks posed to their plan by poor quality information and reduce the probability of failure because of data issues from around 8 in 10 to something more palatable (Zero Defects anyone?)

    Examples of problems that might occur (Part 2)

    Last time we talked about name and address issues. This time out we talk a little about more technical things  like metadata and business rules. (You, down the back… stay awake please).

    Divided by a Common Language (or the muddle of Metadata)

    Oscar Wilde is credited with describing America and Britain as two countries divided only by a common language.

    When bringing data together from different systems, there is often an assumption that if the fields are called the same thing or a similar thing, then they are likely to hold the same data. This assumption in particular is the mother of all cock-ups.

    I worked on a project once where there were two systems being merged for reporting purposes. System A had a field called Customer ID. System B had a field called Customer Number. The data was combined and the resulting report was hailed as something likely to promote growth, but only in roses. In short, it was a pile of manure.

    The root cause was that System A’s field was a field that uniquely identified customer records with an auto-incrementing numeric value (it started at 1, and added 1 until it was done). The Customer Number field in System B, well it contained letters and numbers and, most importantly, it didn’t actually identify the customer.

    ‘Metadata’ (and I sense an attack by Metadata puritans any moment now) is basically defined as “data about data” which helps you understand the values in the field and also helps you make correct assumptions about whether Tab A can actually be connected to Slot B in a way that will actually make sense. It ranges from the technical (this field has only numbers in it for all the data) to the conceptual (e.g. “A customer is…”).

    And here is the rub. Within individual organisations, there is often (indeed I would say inevitably) differences of opinion (to put it politically) about the meaning of the meta data within that organisation. Different business units may have different understandings of what a customer is. Software systems that have sprung up in silo responses to immediate tactical (or even strategic need) often have field names that are different for the same thing (synonyms) or are the same for different things (homonyms). Either can cause serious problems in the quality of consolidated data.

    Now NAMA, it would seem, will be consolidating data from multiple areas from within multiple banks. This is a metadata problem squared, which increases the level of risk still further.

    Three Loan Monty (or “One rule to ring them all”)

    One of the things we learned from Michael Lynn (apart from how lawyers and country music should never mix) was that property developers and speculators occasionally pull a fast one and give the same piece of property as security to multiple banks for multiple loans. The assumption that they seemed to have made in the good times was:

    1. No one would notice (sure, who’d be pulling all the details of loans and assets from most Irish Banks into one big database)
    2. They’d be able to pay off the loans before anyone would notice

    Well, number 1 is about to happen and number 2 has stopped happening in many cases.

    To think about this in the context of a data model or a set of business rules for a moment:

    • A person (or persons) can be the borrowers on more than one Loan
    • One loan can be secured against zero (unsecured), one (secured), or more than one (secured) assets.

    What we saw in the Lynn case broke these simple rules.

    An advantage of NAMA is that it gives an opportunity to actually get some metrics on how frequently this was allowed to happen. Once there is a Single View of Borrower it would be straightforward to profile the data and test for the simple business rules outlined above.

    The problem arises if incidents like this are discovered where there are three or four loans secured  against the same asset and one of them has a fixed charge or a crystallised charge over the asset and the others have some sort of impairment on their security (such as paperwork not being filed correctly and the charge not actually existing).

    If the loan with the charge is the smallest of the four, this means that NAMA winds up with three expensive unsecured loans as the principle in Commercial Law is that first in time prevails- in other words the first registered charge is the one that secures the  asset.

    It may very well be that the analysis of the banks loan books has already gone into the detail here and there is a cunning plan to address this type of problem as it arises. I’d be interested to see how such a plan would work.

    Unfortunately, I would fear that the issues uncovered in the Michael Lynn debacle haven’t gone away and remain lurking under the surface.

    Conclusion (for now)

    Information is ephemeral and intangible. Banks (and all businesses) use abstract facts stored in files to describe real-world things.

    Often the labels and attributes associated with those facts are not aligned or are created and defined in “silos” which create barriers to effective communication within an organisation. Such problems are multipled manifold when you begin to bring data from multiple independent entities together into one place and try to make it into a holistic information asset.

    Often things get lost or muddled in translation.

    Furthermore, where business rules that should govern a process have been broken or not enforced historically there are inevitably going to be ‘gaps of fact‘ (or ‘chasms of unknowingness’ if there is a lot of broken rules). Those ‘gaps of fact’ can undermine critical assumptions in processes or data migrations.

    When we talk of assumptions I am reminded of the old joke about how the economist who got stranded on a desert island was eventually rescued. On the sand he wrote “First, assume a rescue party will find me”.

    Where there is a ‘gap of fact’, it is unfortunately the case that there is a very high probability (around 84%) that the economist would be found, but found dead.

    Effective management of the risk of poor quality information requires people to set aside assumptions and act on fact and mind the gap.

  • The Risk of Poor Information Quality #nama

    I thought it timely to add an Information Quality perspective to the debate and discussion on NAMA. So, for tweeters the hashtag is #NAMAInfoQuality.

    The title of this post (less the Hashtag) is, co-incidentally, the title of a set of paired conferences I’m helping to organise in Dublin and Cardiff in a little over a week.

    It is a timely topic given the contribution that poor quality information played in the sub-prime mortgage collapse in the US. While a degree of ‘magical thinking’ is also to blame (“what, I can just say I’m a CEO with €1million and you’ll take my word for it?”) and , ultimately the risks that poor quality information posed to down stream processes and decisions  were not effectively managed even if they were actually recognised.

    Listening to the NAMA (twitter hash-tag #nama) debate on-line yesterday (and following it on the excellent liveblog.ie I couldn’t help but think about the “Happy path” thinking that seems to be prevailing and how similar it is to the Happy Path thinking that pervaded the CRM goldrush of the late 1990s and early 2000’s, and the ERP and MDM bandwagons that have trundled through a little place I call “ProjectsVille” in the intervening years.

    (note to people checking Wikipedia links above… Wikipedia, in its wisdom, seems to class CRM, ERP and MDM as “IT” issues. That’s bullshit frankly and doesn’t reflect the key lessons learned from painful failures over the years in many companies around the world. While there is an IT component to implementing solutions and excuting  projects, these are all fundamentally part of core business strategy and are a business challenge. )

    But I digress….

    Basically, at the heart of every CRM project, ERP project or MDM project is the need to create a “Single View of Something”, be it this bizarre creature called a “Customer” (they are like Yeti.. we all believe they exist but no-one can precisely describe or define them), or “Widget” or other things that the Business needs to know about to, well… run the business and survive.

    This involves taking data from multiple sources and combining them together in a single repository of facts. So if you have  999 seperate Access databases and 45000 spreadsheets with customer  data on them and data about what products your customers have bought, ideally you want to be boiling them down to one database of customers and one database of products with links between them that tell you that Customer 456  has bought 45000 of Widget X in the last 6 months and likes to be phoned after 4:30pm on Thursdays and prefers to be called ‘Dave’ instead of “Mr Rodgers”, oh… and they haven’t got around to paying you for 40,000 of those widgets yet.

    (This is the kind of thing that Damien Mulley referred to recently as a “Golden Database“.)

    NAMA proposes to basically take the facts that are known about a load of loans from multiple lenders, put them all together in a “Single View of Abyss” (they’d probably call it something else) and from that easily and accurately identify under-performing and nonperforming loans and put the State in the position where it can ultimately take the assets on which loans were secured or for which loans were acquired if the loans aren’t being repaid.

    Ignoring the economists’ arguments about the merits and risks of this approach, this sounds very much like a classic CRM/MDM problem where you have lots of source data sets and want to boil them down to three basic sets of facts, in this case:

    • Property or other assets affected by loans (either used as security or purchased using loans)

    • People or companies who borrowed those monies

    • Information about the performance of those loans.

    Ideally then you should be able to ask the magic computermebob to tell you exactly what loans Developer X has, and what assets are those loans secured on. Somewhere in that process there is some magic that happens that turns crud into gold and the Irish taxpayer comes out a winner (at least that’s the impression I’m getting).

    This is Happy Path.

    The Crappy Path

    Some statistics now to give you an insight into just how crappy the crappy path can be.

    • Various published studies have found that over 70% of CRM implementations had failed to deliver on the promised “Single View of Customer”

    • In 2007 Bloor Research found that 84% of all ERP data migrations fail (either run over time, over budget or fail to integrate all the data) because of problems with the quality of the data

    • As recently as last month, Gartner Group reported that 75% of CFOs surveyed felt that poor quality information was a direct impediment to achieving business goals.

    • A study by IBM found that the average “knowledge worker” can spend up to 30% of their time rechecking information and correcting errors.

    Translating this to NAMA’s potential Information Management Challenge:

    1. The probability of the information meeting expectations is about the same as the discount tthat has been applied on the loans. (30%).
    2. The probability of the migration and consolidation of information happening on time, on budget and to the level of quality required is slightly better than the forecast growth rate in property prices once the economy recovers (16% versus 10%)
    3. Around 30% of the time of staff in NAMA will likely be spent checking errors, seeking information, correcting and clarifying facts etc.

    There is a whole lot more to this than just taking loans and pressing a button on a money machine for the banks.

    Ultimately the loans are described in the abstract by Information, the assets which were used as security or which were purchased with those loans are defined by data, and the people and businesses servicing those loans (or not as the case may be) are represented by facts and attributes like “Firstname/LastName”, “Company Registration Number”. Much as we taxpayers might like it, Liam Carroll will not be locked in a dungeon in the basement of Treasury Buildings while NAMA operates. However, the facts and attributes that describe the commercial entity “Liam Carroll” and the businesses he operated will be stored in a database (which could very well be in

    This ultimate reliance on ephemeral information brings with it some significant risks across a number of areas, all of which could signpost the detour from Happy Path to Crappy Path.

    Rather than bore readers with a detailed thesis on the types of problems that might occur (I’ve written it and it runs to many words), I’ve decided to run a little series over the next few days which is drawing on some of the topics I and other speakers will be covering at the IAIDQ/ICS IQ Network Conference on the 28th of September.

    Examples of problems that might occur (Part 1)

    Address Data (also known as “Postcode postcode wherefore art thou postcode?”)

    Ireland is one of the few countries that lacks a postcode system. This means that postal addresses in Ireland are, for want of a better expression, fuzzy.

    Take for example one townland in Wexford called Murrintown. only it’s not. It has been for centuries as far as the locals are concerned but according to the Ordnance Survey and the Place Names commission, the locals don’t know how to spell. All the road signs have “Murntown”.

    Yes,  An Post has the *koff* lovely */koff* Geodirectory system which is the nearest thing to an address standard database we have in Ireland. Of course, it is designed and populated to supprt the delivery of letter post. As a result, many towns and villages have been transposed around the country as their “Town” from a postal perspective is actually their nearest main sorting office.

    Ballyhaunis in County  Mayo is famously logged in Geodirectory as being in Co. Roscommon. This results in property being occasionally misfiled.

    There are also occasionally typographical errors and transcription errors in data in data. For example, some genius put an accented character into the name of the development I live in in Wexford which means that Google Maps, Satnavs and other cleverness can’t find my address unless I actually screw it up on purpose.

    Of course, one might assume that the addresses given in the title deeds to properties would be accurate and correct (and, for the most part, they are I believe). However there is still the issue of transcription errors and mis-reading of handwriting on deeds which can introduce small and insidious errors.

    It is an interesting fact that the Land Registry has moved to computerised registers in recent years but the Property Registration Authority trusts still to the trusty quill and only recently moved to put the forms for registering property deeds on-line. Please let me know what you think of the layout of their web form.

    I am Spartacus (No, I am Spartacus. No I’m Brian Spartacus).

    Identity is a somewhat fluid thing. When seeking to build their consolidated view of borrowings, NAMA will need to create a “single view of borrower”. This will require them to match names of companies to create a single view of businesses who have borrowed (and then that will likely need to have some input from the CRO to flag where such companies have a change in status such as being wound up or bought).

    The process will also likely need to have a Single View of Borrower down to the level of a person a) because some loans may be out to people and b) because the link between some borrowings and various companies who would have borrowed will likely turn out to be an individual.

    Now. Are these people the same:

    • Daragh O Brien

    • Dara O’Brien

    • Daire O Brian

    • Daragh Ó Briain

    • Dara Ó Briain

    • Darach O Brien

    • D Patrick O Brien

    • Pat O’Brien

    The answer is that they are. They are variations in spelling on my name, one possible variation in use of my middle name, and the last one is what a basketball coach I had decades ago called me because he couldn’t pronounce Daragh.

    However, the process of matching personal data is very complex and requires great care be taken, particularly given the implications under the Data Protection Act of making an error.

    The challenge NAMA potentially faces is identifying if Joe Bloggs in Bank A is Joseph Bloggs in Bank B or J.P Bloggs in Bank C or both or none at all. Recommended practice is to have name plus at least two other ‘facts’ to feed your matching processes. And at that the process inevitably requires human review.

    However, the problem is relatively solvable if you invest in the right tools and are willing to invest in the people necessary to use the tools and (unfortunately) take care of the manual review and approval that would be required.

    A related risk is the risk of not having the customer’s name correct. Simply put, where that happens the lender or controller of the loans effectively hands the borrower a “get out of jail” card as they can simply claim that the named person is not them. Courts are pedantically persnickety about accuracy of information in these types of matters.

    A corollary of this is where the lender or controller of the loans (in this case NAMA) starts chasing the wrong person for the payment of a loan due to a mismatch of data about people. Here, the aggrieved party could ultimately sue the controller of the loans for libel if they publish to anyone the allegation that the person owed money and wasn’t paying it back.

    While each of these are risks that the banks individually manage on their own at the moment, the simple fact of pulling all this information together under NAMA increases the risk factor here. Each bank may have individually had problems with errors and mismatching etc. but resolved them quietly and locally. The root causes of those errors may not be addressed and may not be possible to address in a bulk data migration. Therefore, the data may get muddled again, leading to the issues outlined above.

    Conclusion (for Part 1)

    NAMA will not be managing physical loans or physical assets. What NAMA will work with is the information about those things, the facts and figures that describe the attributes of those things in the abstract.

    To assume that individually Irish banks have levels of data quality that are as pure as the driven snow is naive. To assume that taking a load of crummy data from multiple sources and mixing it together to create a new “Single View” of anything without first understanding the quality of that information and ensuring that you have steps in place to manage and mitigate the risks posed by non-quality information means you risk failure rates of between 70% and 84%.

    To put it another way, there is only between 16% and 30% of a chance that NAMA will be able to deliver the kind of robust management of all loans out to individual property developers and property development companies that would be necessary to ensure that the risk to the taxpayer is properly managed.

    The key to managing that risk of failure will be discussed in my next post and at the IAIDQ/ICS IQ Network conference on the 28th of September

  • Bank of Ireland Overcharging – another follow up

    Scanning the electronic pages of the Irish Independent this morning I read that

    1. They claim to have had the scoop on this story (no, it was Tuppenceworth.ie and IQTrainwrecks.com)
    2. They have “experts” (unnamed ones) who tell them that the actual number of impacted customers over the weekend could be up to 200,000.
    3. “Some other banks admitted there have been cases where Laser payments have mistakenly gone through on the double. But they said they have not had any serious problems.” (BOI had that angle on the issue back in June).
    4. The bank cannot guarantee that it won’t happen again.

    I’ll leave points one to three for another time and focus at this point (as my bus to Dublin leaves soon) on the matter of the Bank of Ireland not being able to guarantee that it won’t happen again.

    The Nature of Risk

    Fair play to BOI for admitting that they can’t guarantee that this problem won’t happen again. It has happened before (in May), it has happened now, it is only prudent to say that it may happen again.

    But are they not able to guarantee that it won’t happen again because they understand the causes of this problem, have properly mapped the process and information flows, understand the Happy Path and Crappy Path scenarios and the triggering factors for them  and have established robust detective and preventative controls on the information processes to prevent or flag errors to have a foolproof process but are hedging their bets against the occurence of idiots?  In that case, they have a managed process which will (hopefully) have effective governance structures around it to embed a quality culture that promotes early warning and prompt action to address the incidence of idiots which inevitably plagues fool proof processes.

    Or are they unable to guarantee it won’t happen again because they lack some or all of the above?

    Again I am forced to fall back on tortured analogies to explain this I fear.

    A few years ago I had an accident in my car. I am unable to guarantee that for the rest of my driving life I won’t have another accident. Hence I have taken out insurance. However, I have also taken a bit of time to understand how the first accident occured and modified my driving to improve my ability to control the risk of accident. Hence I am able to get insurance.

    Had I not modified my driving the probability of the same type of accident occuring would have been high, and as a result the cost of my insurance would be higher (no no-claims bonus for example).

    However, because I understand the “Happy Path” I want to travel on when driving and also understand the Crappy Path that I can wind up on if I don’t take the appropriate care and apply the appropriate controls (preventative and detective) on how I drive I haven’t had an accident since I reversed into the neighbour’s car many moons ago.

    I can’t guarantee it won’t happen again, but that is because I understand the nature of the risk and the extent to which I can control it, not because I am blissfully unaware of what is going on when I’m driving.

    Information Quality and Trust

    What does this idea of Information Quality and Trust mean? Well, the Indo put it very well this morning:

    Revelations about the Laser card glitch, disclosed in yesterday’s Irish Independent, have shaken confidence in banks’ payments systems at a time when people are nervous about all financial transactions.

    As I have said elsewhere, information is the key asset that fuels business and trade and is a key source of competitive advantage. In a cashless society it is not money that moves between bank accounts when you buy something, it is bits of information. Even when you take money from an ATM all you are really doing is turning the electronic fact into the physical thing it describes – €50 in your control to spend as you will.

    When the quality of information is called into question there is an understandable destruction of trust. “The facts don’t stack up”…. “the numbers don’t add up”… these are common exasperated comments one can often hear, usually accompanied by a reduction in trust in what you are being told or a reluctance to make a decision on that information.

    Somewhat ironically, it is the destruction of trust in the information around sub-prime mortgage lending and the bundled loan products that banks started trading to help spread their risk of in mortgage lending that has contributed to the current economic situation.

    In the specific case of Bank of Ireland and the Laser card problems, the trust vacuum is compounded by

    • The bank’s failure to acknowledge the extent or timescale of the issue
    • The bank’s apparent lack of understanding of how the process works or where it is broken. [Correction & Update: Yesterday’s Irish Daily Mail says that the Bank does know what caused the problem and is working on a solution. The apparent cause is very similar to the hypotheses I set out in the post  previous to this one.]

    This second one isn’t helped unfortunately by the fact that these issues can sometimes be complex and the word count available to a journalist is not often amenable to a detailed treatise on the finer points of batch processing transactions and error handling in complex financial services products.

    That’s why it is even more important for the bank to be communicating effectively here in a way that is customer focussed not directed towards protecting the bank.

    To restore trust, Bank of Ireland (and the other banks involved in Laser) needs to

    1. Demonstrate that they know how the process works… give a friendly graphic showing the high level process flow to the media (or your friendly neighbourhood voice of reason blogger). Heck, I’d even draw it for them if they would talk to me.
    2. From that simple diagram work out the Happy Path and Crappy Path scenarios. This may require a more detailed drill down than they might want to publish, but it is necessary. (they don’t need to publish the detail though).
    3. Once the Happy and Crappy paths are understood, identify what controls you currently have in place to keep things on the Happy Path. Test these controls. Where controls are lacking or absent, invest ASAP in robust controls.

    The key thing now is that the banking system needs to be able to demonstrate that it has a handle on this to restore Trust. The way to do this is to ensure that the information meets the expectations of the customer.

    I am a BOI customer. I expect to only pay for my lunch once. Make it so