84% fail. Do you remember that statistic from my previous post?
In my earlier post on this topic I wrote about how issues of identity (name and address) can cause problems when attempting to consolidate data from multiple systems into one Single View of Master Data. I also ran through the frightening statistics relating to the failure rate of these types of projects, ranging up to 84%.
Finally, I plugged two IAIDQ conferences on this topic. One is in Dublin on the 28th of September. The other is in Cardiff (currently being rescheduled for the end of next month).
A key root cause of these failure rates has been identified. Many of these projects fail because nobody profiles the data being consolidated to understand the risks and issues it contains.
So, if we assume that the risk that is NAMA is a risk that the Government will take, surely it behoves the Government and NAMA to take the necessary steps to mitigate the risks that poor quality information poses to their plan, and to reduce the probability of failure because of data issues from around 8 in 10 to something more palatable. (Zero Defects, anyone?)
Examples of problems that might occur (Part 2)
Last time we talked about name and address issues. This time out we talk a little about more technical things like metadata and business rules. (You, down the back… stay awake please).
Divided by a Common Language (or the muddle of Metadata)
Oscar Wilde is credited with describing America and Britain as two countries divided only by a common language.
When bringing data together from different systems, there is often an assumption that if the fields are called the same thing or a similar thing, then they are likely to hold the same data. This assumption in particular is the mother of all cock-ups.
I worked on a project once where there were two systems being merged for reporting purposes. System A had a field called Customer ID. System B had a field called Customer Number. The data was combined and the resulting report was hailed as something likely to promote growth, but only in roses. In short, it was a pile of manure.
The root cause was that System A’s field was a field that uniquely identified customer records with an auto-incrementing numeric value (it started at 1, and added 1 until it was done). The Customer Number field in System B, well it contained letters and numbers and, most importantly, it didn’t actually identify the customer.
‘Metadata’ (and I sense an attack by Metadata puritans any moment now) is basically defined as “data about data” which helps you understand the values in the field and also helps you make correct assumptions about whether Tab A can actually be connected to Slot B in a way that will actually make sense. It ranges from the technical (this field has only numbers in it for all the data) to the conceptual (e.g. “A customer is…”).
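To make the technical end of that range concrete, here is a minimal sketch of the kind of profiling that would have caught the Customer ID / Customer Number mismatch before the report was built. The function names and the sample values are made up for illustration; they are not taken from the systems in that project.

```python
def profile_field(values):
    """Summarise a column: how full it is, how distinct, and what it is made of."""
    non_empty = [str(v).strip() for v in values if v is not None and str(v).strip()]
    distinct = len(set(non_empty))
    return {
        "rows": len(values),
        "populated": len(non_empty),
        "distinct": distinct,
        # True only if every populated value consists solely of digits
        "all_numeric": bool(non_empty) and all(v.isdigit() for v in non_empty),
        # True only if every row is populated and no value repeats
        "is_unique": distinct == len(non_empty) == len(values),
    }


def looks_joinable(profile_a, profile_b):
    """Crude sanity check: both fields are unique and share a character type."""
    return (
        profile_a["is_unique"]
        and profile_b["is_unique"]
        and profile_a["all_numeric"] == profile_b["all_numeric"]
    )


# Made-up sample values for illustration only
system_a_customer_id = [1, 2, 3, 4, 5]
system_b_customer_number = ["AC100", "AC100", "BX7", "9Q2", "BX7"]

profile_a = profile_field(system_a_customer_id)
profile_b = profile_field(system_b_customer_number)

print(profile_a)
print(profile_b)
print("Safe to treat as the same key?", looks_joinable(profile_a, profile_b))
```

A check like this only covers the technical metadata. It says nothing about whether the two systems agree on what a customer is, which is the conceptual end of the range.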
And here is the rub. Within individual organisations, there are often (indeed I would say inevitably) differences of opinion (to put it politically) about the meaning of the metadata within that organisation. Different business units may have different understandings of what a customer is. Software systems that have sprung up in silo responses to an immediate tactical (or even strategic) need often have field names that are different for the same thing (synonyms) or are the same for different things (homonyms). Either can cause serious problems in the quality of consolidated data.
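If the field definitions have been written down somewhere, spotting synonyms and homonyms is a simple grouping exercise. The sketch below assumes a hypothetical data dictionary of (system, field name, agreed definition) entries. The systems, the field names beyond Customer ID and Customer Number, and the definitions are all invented for the example.

```python
from collections import defaultdict

# Hypothetical data dictionary entries: (system, field_name, agreed_definition)
catalogue = [
    ("System A", "Customer ID", "unique customer identifier"),
    ("System B", "Customer Number", "account reference"),
    ("System C", "Client Ref", "unique customer identifier"),
    ("System D", "Customer Number", "unique customer identifier"),
]


def find_synonyms_and_homonyms(entries):
    """Group field names by definition (synonyms) and definitions by name (homonyms)."""
    names_by_definition = defaultdict(set)
    definitions_by_name = defaultdict(set)
    for _system, name, definition in entries:
        names_by_definition[definition].add(name)
        definitions_by_name[name].add(definition)
    # Synonyms: one definition known by more than one field name
    synonyms = {d: sorted(n) for d, n in names_by_definition.items() if len(n) > 1}
    # Homonyms: one field name carrying more than one definition
    homonyms = {n: sorted(d) for n, d in definitions_by_name.items() if len(d) > 1}
    return synonyms, homonyms


synonyms, homonyms = find_synonyms_and_homonyms(catalogue)
print("Synonyms:", synonyms)
print("Homonyms:", homonyms)
```

The hard part, of course, is not the grouping but getting the business units to agree on the definitions in the first place.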
Now NAMA, it would seem, will be consolidating data from multiple areas from within multiple banks. This is a metadata problem squared, which increases the level of risk still further.
Three Loan Monty (or “One rule to ring them all”)
One of the things we learned from Michael Lynn (apart from how lawyers and country music should never mix) was that property developers and speculators occasionally pull a fast one and give the same piece of property as security to multiple banks for multiple loans. The assumptions that they seem to have made in the good times were:
1. No one would notice (sure, who’d be pulling all the details of loans and assets from most Irish Banks into one big database?)
2. They’d be able to pay off the loans before anyone would notice
Well, number 1 is about to happen and number 2 has stopped happening in many cases.
To think about this in the context of a data model or a set of business rules for a moment:
- A person (or persons) can be the borrowers on more than one Loan
- One loan can be secured against zero (unsecured), one (secured), or more than one (secured) assets.
What we saw in the Lynn case broke these simple rules: the same asset was offered as security for separate loans from different banks.
An advantage of NAMA is that it gives an opportunity to actually get some metrics on how frequently this was allowed to happen. Once there is a Single View of Borrower it would be straightforward to profile the data and test for the simple business rules outlined above.
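One such test, flagging any asset that appears as security on more than one loan, can be sketched in a few lines. This assumes a hypothetical consolidated table in which each asset already carries a reliable identifier, which is itself the identity problem from my earlier post. The record layout and the sample rows are invented for illustration.

```python
from collections import defaultdict

# Hypothetical consolidated records: (loan_id, bank, borrower_id, asset_id).
# asset_id is None for an unsecured loan.
loan_security = [
    ("L001", "Bank A", "B1", "ASSET-1"),
    ("L002", "Bank B", "B1", "ASSET-1"),
    ("L003", "Bank C", "B1", "ASSET-1"),
    ("L004", "Bank A", "B2", "ASSET-2"),
    ("L005", "Bank B", "B3", None),
]


def assets_securing_multiple_loans(rows):
    """Return each asset pledged against more than one loan, with those loans."""
    loans_by_asset = defaultdict(set)
    for loan_id, bank, _borrower_id, asset_id in rows:
        if asset_id is not None:  # skip unsecured loans
            loans_by_asset[asset_id].add((loan_id, bank))
    return {
        asset: sorted(loans)
        for asset, loans in loans_by_asset.items()
        if len(loans) > 1
    }


flagged = assets_securing_multiple_loans(loan_security)
for asset_id, loans in flagged.items():
    print(asset_id, "secures", len(loans), "loans:", loans)
```

Counting the flagged assets against the total gives the metric: how often one asset was pledged more than once.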
The problem arises if incidents like this are discovered where there are three or four loans secured against the same asset and one of them has a fixed charge or a crystallised charge over the asset and the others have some sort of impairment on their security (such as paperwork not being filed correctly and the charge not actually existing).
If the loan with the charge is the smallest of the four, this means that NAMA winds up with three expensive unsecured loans, as the principle in commercial law is that first in time prevails. In other words, the first registered charge is the one that secures the asset.
It may very well be that the analysis of the banks’ loan books has already gone into the detail here and there is a cunning plan to address this type of problem as it arises. I’d be interested to see how such a plan would work.
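To put some arithmetic on that scenario, here is a rough sketch that applies the first-in-time rule to four loans on one asset. It is a deliberate simplification of the legal position: it treats every loan other than the one with the earliest registered charge as unsecured. The loan amounts and the registration date are made up for the example.

```python
from datetime import date

# Hypothetical loans against one asset: (loan_id, amount, charge_registered_on).
# charge_registered_on is None where the charge was never validly registered.
loans_on_asset = [
    ("L001", 5_000_000, None),
    ("L002", 3_000_000, None),
    ("L003", 500_000, date(2006, 3, 1)),
    ("L004", 4_000_000, None),
]


def split_by_first_registered_charge(loans):
    """Return (secured_loan, unsecured_loans) under a simple first-in-time rule."""
    registered = [loan for loan in loans if loan[2] is not None]
    if not registered:
        return None, list(loans)
    # The earliest registration date wins
    secured = min(registered, key=lambda loan: loan[2])
    unsecured = [loan for loan in loans if loan is not secured]
    return secured, unsecured


secured, unsecured = split_by_first_registered_charge(loans_on_asset)
unsecured_total = sum(amount for _loan_id, amount, _registered in unsecured)

print("Secured loan:", secured[0], "for", secured[1])
print("Unsecured loans:", [loan[0] for loan in unsecured])
print("Unsecured exposure:", unsecured_total)
```

With these invented figures the only loan with working security is the smallest one, and the other three are left exposed.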
Unfortunately, I would fear that the issues uncovered in the Michael Lynn debacle haven’t gone away and remain lurking under the surface.
Conclusion (for now)
Information is ephemeral and intangible. Banks (and all businesses) use abstract facts stored in files to describe real-world things.
Often the labels and attributes associated with those facts are not aligned or are created and defined in “silos” which create barriers to effective communication within an organisation. Such problems are multiplied manifold when you begin to bring data from multiple independent entities together into one place and try to make it into a holistic information asset.
Often things get lost or muddled in translation.
Furthermore, where business rules that should govern a process have been broken or not enforced historically, there are inevitably going to be ‘gaps of fact’ (or ‘chasms of unknowingness’ if there are a lot of broken rules). Those ‘gaps of fact’ can undermine critical assumptions in processes or data migrations.
When we talk of assumptions I am reminded of the old joke about how the economist who got stranded on a desert island was eventually rescued. On the sand he wrote “First, assume a rescue party will find me”.
Where there is a ‘gap of fact’, it is unfortunately the case that there is a very high probability (around 84%) that the economist would be found, but found dead.
Effective management of the risk of poor quality information requires people to set aside assumptions and act on fact and mind the gap.