The Risk of Poor Information Quality #nama

I thought it timely to add an Information Quality perspective to the debate and discussion on NAMA. So, for tweeters the hashtag is #NAMAInfoQuality.

The title of this post (less the Hashtag) is, co-incidentally, the title of a set of paired conferences I’m helping to organise in Dublin and Cardiff in a little over a week.

It is a timely topic given the contribution that poor quality information played in the sub-prime mortgage collapse in the US. While a degree of ‘magical thinking’ is also to blame (“what, I can just say I’m a CEO with €1million and you’ll take my word for it?”) and , ultimately the risks that poor quality information posed to down stream processes and decisions  were not effectively managed even if they were actually recognised.

Listening to the NAMA (twitter hash-tag #nama) debate on-line yesterday (and following it on the excellent I couldn’t help but think about the “Happy path” thinking that seems to be prevailing and how similar it is to the Happy Path thinking that pervaded the CRM goldrush of the late 1990s and early 2000’s, and the ERP and MDM bandwagons that have trundled through a little place I call “ProjectsVille” in the intervening years.

(note to people checking Wikipedia links above… Wikipedia, in its wisdom, seems to class CRM, ERP and MDM as “IT” issues. That’s bullshit frankly and doesn’t reflect the key lessons learned from painful failures over the years in many companies around the world. While there is an IT component to implementing solutions and excuting  projects, these are all fundamentally part of core business strategy and are a business challenge. )

But I digress….

Basically, at the heart of every CRM project, ERP project or MDM project is the need to create a “Single View of Something”, be it this bizarre creature called a “Customer” (they are like Yeti.. we all believe they exist but no-one can precisely describe or define them), or “Widget” or other things that the Business needs to know about to, well… run the business and survive.

This involves taking data from multiple sources and combining them together in a single repository of facts. So if you have  999 seperate Access databases and 45000 spreadsheets with customer  data on them and data about what products your customers have bought, ideally you want to be boiling them down to one database of customers and one database of products with links between them that tell you that Customer 456  has bought 45000 of Widget X in the last 6 months and likes to be phoned after 4:30pm on Thursdays and prefers to be called ‘Dave’ instead of “Mr Rodgers”, oh… and they haven’t got around to paying you for 40,000 of those widgets yet.

(This is the kind of thing that Damien Mulley referred to recently as a “Golden Database“.)

NAMA proposes to basically take the facts that are known about a load of loans from multiple lenders, put them all together in a “Single View of Abyss” (they’d probably call it something else) and from that easily and accurately identify under-performing and nonperforming loans and put the State in the position where it can ultimately take the assets on which loans were secured or for which loans were acquired if the loans aren’t being repaid.

Ignoring the economists’ arguments about the merits and risks of this approach, this sounds very much like a classic CRM/MDM problem where you have lots of source data sets and want to boil them down to three basic sets of facts, in this case:

  • Property or other assets affected by loans (either used as security or purchased using loans)

  • People or companies who borrowed those monies

  • Information about the performance of those loans.

Ideally then you should be able to ask the magic computermebob to tell you exactly what loans Developer X has, and what assets are those loans secured on. Somewhere in that process there is some magic that happens that turns crud into gold and the Irish taxpayer comes out a winner (at least that’s the impression I’m getting).

This is Happy Path.

The Crappy Path

Some statistics now to give you an insight into just how crappy the crappy path can be.

  • Various published studies have found that over 70% of CRM implementations had failed to deliver on the promised “Single View of Customer”

  • In 2007 Bloor Research found that 84% of all ERP data migrations fail (either run over time, over budget or fail to integrate all the data) because of problems with the quality of the data

  • As recently as last month, Gartner Group reported that 75% of CFOs surveyed felt that poor quality information was a direct impediment to achieving business goals.

  • A study by IBM found that the average “knowledge worker” can spend up to 30% of their time rechecking information and correcting errors.

Translating this to NAMA’s potential Information Management Challenge:

  1. The probability of the information meeting expectations is about the same as the discount tthat has been applied on the loans. (30%).
  2. The probability of the migration and consolidation of information happening on time, on budget and to the level of quality required is slightly better than the forecast growth rate in property prices once the economy recovers (16% versus 10%)
  3. Around 30% of the time of staff in NAMA will likely be spent checking errors, seeking information, correcting and clarifying facts etc.

There is a whole lot more to this than just taking loans and pressing a button on a money machine for the banks.

Ultimately the loans are described in the abstract by Information, the assets which were used as security or which were purchased with those loans are defined by data, and the people and businesses servicing those loans (or not as the case may be) are represented by facts and attributes like “Firstname/LastName”, “Company Registration Number”. Much as we taxpayers might like it, Liam Carroll will not be locked in a dungeon in the basement of Treasury Buildings while NAMA operates. However, the facts and attributes that describe the commercial entity “Liam Carroll” and the businesses he operated will be stored in a database (which could very well be in

This ultimate reliance on ephemeral information brings with it some significant risks across a number of areas, all of which could signpost the detour from Happy Path to Crappy Path.

Rather than bore readers with a detailed thesis on the types of problems that might occur (I’ve written it and it runs to many words), I’ve decided to run a little series over the next few days which is drawing on some of the topics I and other speakers will be covering at the IAIDQ/ICS IQ Network Conference on the 28th of September.

Examples of problems that might occur (Part 1)

Address Data (also known as “Postcode postcode wherefore art thou postcode?”)

Ireland is one of the few countries that lacks a postcode system. This means that postal addresses in Ireland are, for want of a better expression, fuzzy.

Take for example one townland in Wexford called Murrintown. only it’s not. It has been for centuries as far as the locals are concerned but according to the Ordnance Survey and the Place Names commission, the locals don’t know how to spell. All the road signs have “Murntown”.

Yes,  An Post has the *koff* lovely */koff* Geodirectory system which is the nearest thing to an address standard database we have in Ireland. Of course, it is designed and populated to supprt the delivery of letter post. As a result, many towns and villages have been transposed around the country as their “Town” from a postal perspective is actually their nearest main sorting office.

Ballyhaunis in County  Mayo is famously logged in Geodirectory as being in Co. Roscommon. This results in property being occasionally misfiled.

There are also occasionally typographical errors and transcription errors in data in data. For example, some genius put an accented character into the name of the development I live in in Wexford which means that Google Maps, Satnavs and other cleverness can’t find my address unless I actually screw it up on purpose.

Of course, one might assume that the addresses given in the title deeds to properties would be accurate and correct (and, for the most part, they are I believe). However there is still the issue of transcription errors and mis-reading of handwriting on deeds which can introduce small and insidious errors.

It is an interesting fact that the Land Registry has moved to computerised registers in recent years but the Property Registration Authority trusts still to the trusty quill and only recently moved to put the forms for registering property deeds on-line. Please let me know what you think of the layout of their web form.

I am Spartacus (No, I am Spartacus. No I’m Brian Spartacus).

Identity is a somewhat fluid thing. When seeking to build their consolidated view of borrowings, NAMA will need to create a “single view of borrower”. This will require them to match names of companies to create a single view of businesses who have borrowed (and then that will likely need to have some input from the CRO to flag where such companies have a change in status such as being wound up or bought).

The process will also likely need to have a Single View of Borrower down to the level of a person a) because some loans may be out to people and b) because the link between some borrowings and various companies who would have borrowed will likely turn out to be an individual.

Now. Are these people the same:

  • Daragh O Brien

  • Dara O’Brien

  • Daire O Brian

  • Daragh Ó Briain

  • Dara Ó Briain

  • Darach O Brien

  • D Patrick O Brien

  • Pat O’Brien

The answer is that they are. They are variations in spelling on my name, one possible variation in use of my middle name, and the last one is what a basketball coach I had decades ago called me because he couldn’t pronounce Daragh.

However, the process of matching personal data is very complex and requires great care be taken, particularly given the implications under the Data Protection Act of making an error.

The challenge NAMA potentially faces is identifying if Joe Bloggs in Bank A is Joseph Bloggs in Bank B or J.P Bloggs in Bank C or both or none at all. Recommended practice is to have name plus at least two other ‘facts’ to feed your matching processes. And at that the process inevitably requires human review.

However, the problem is relatively solvable if you invest in the right tools and are willing to invest in the people necessary to use the tools and (unfortunately) take care of the manual review and approval that would be required.

A related risk is the risk of not having the customer’s name correct. Simply put, where that happens the lender or controller of the loans effectively hands the borrower a “get out of jail” card as they can simply claim that the named person is not them. Courts are pedantically persnickety about accuracy of information in these types of matters.

A corollary of this is where the lender or controller of the loans (in this case NAMA) starts chasing the wrong person for the payment of a loan due to a mismatch of data about people. Here, the aggrieved party could ultimately sue the controller of the loans for libel if they publish to anyone the allegation that the person owed money and wasn’t paying it back.

While each of these are risks that the banks individually manage on their own at the moment, the simple fact of pulling all this information together under NAMA increases the risk factor here. Each bank may have individually had problems with errors and mismatching etc. but resolved them quietly and locally. The root causes of those errors may not be addressed and may not be possible to address in a bulk data migration. Therefore, the data may get muddled again, leading to the issues outlined above.

Conclusion (for Part 1)

NAMA will not be managing physical loans or physical assets. What NAMA will work with is the information about those things, the facts and figures that describe the attributes of those things in the abstract.

To assume that individually Irish banks have levels of data quality that are as pure as the driven snow is naive. To assume that taking a load of crummy data from multiple sources and mixing it together to create a new “Single View” of anything without first understanding the quality of that information and ensuring that you have steps in place to manage and mitigate the risks posed by non-quality information means you risk failure rates of between 70% and 84%.

To put it another way, there is only between 16% and 30% of a chance that NAMA will be able to deliver the kind of robust management of all loans out to individual property developers and property development companies that would be necessary to ensure that the risk to the taxpayer is properly managed.

The key to managing that risk of failure will be discussed in my next post and at the IAIDQ/ICS IQ Network conference on the 28th of September