Information Quality – Do we have an app for that?

A few weeks back I got a new iPhone. I’d resisted for years, enjoying the pleasures of Nokia and Symbian and the challenges of Palm and Windows Mobile 6.1.

The fun part for me of any new mobile phone purchase is playing with the new toy, er, tool and seeing what it can do that my old one couldn’t. For example, back in the 1990s when I did my first upgrade from my first mobile phone (an Ericsson model so old that I actually can’t find it referenced on the internet), I found that the new phone was so much smaller and lighter I was actually able to carry it around.

The irritation I have is when it comes to moving my contacts and synchronising with my various other technologies that hold contact details (laptop, Gmail, company address book). Inevitably I wind up with duplication and triplication of contacts. I thought I had the problem licked on the iPhone though, as there are a number of apps available for managing contact details and reducing duplicates.

However, having spent a few days using them I am unimpressed, as they seem to be making the traditional rookie mistake in de-duping records – assuming that name matching is enough.

My brother and father share a given name and a family name. They have different middle initials, different addresses, different phone numbers, different email addresses (all the stuff that you would have in a contact record on your phone). Each application I tried decided that they were a duplicate entry and merged the records. This was annoying.

In other cases, I have duplicate entries with varying degrees of record completeness. For example, my friend Cathal exists at least 4 times, with one entry having most of his contact details and the others holding spurious email addresses or social networking nicknames. The “data quality tool” very kindly merged all the records into the entry that had the least amount of data and deleted the other records.
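
What I would have expected is roughly the opposite: keep the fullest record as the survivor and fold the extra details from the others into it. Here is a minimal sketch of that idea in Python. The record shape (a dictionary of field name to list of values), the function names and the sample values are all my own assumptions for illustration, not how any of the apps I tried actually work.

```python
from typing import Dict, List

# Hypothetical record shape: field name -> list of values for that field.
Contact = Dict[str, List[str]]


def completeness(contact: Contact) -> int:
    """Count the non-empty values held by a contact record."""
    return sum(1 for values in contact.values() for v in values if v.strip())


def merge_duplicates(records: List[Contact]) -> Contact:
    """Merge confirmed duplicates into a copy of the most complete record."""
    survivor = max(records, key=completeness)
    merged: Contact = {field: list(values) for field, values in survivor.items()}
    for record in records:
        if record is survivor:
            continue
        for field, values in record.items():
            target = merged.setdefault(field, [])
            for value in values:
                # Keep every distinct non-empty value; nothing is thrown away.
                if value.strip() and value not in target:
                    target.append(value)
    return merged


# Made-up sample data, loosely modelled on the "Cathal" case above.
cathal = [
    {"name": ["Cathal"], "email": ["cathal@example.com"], "phone": ["555-0100"]},
    {"name": ["Cathal"], "email": ["cathal.old@example.org"]},
    {"name": ["Cathal"], "nickname": ["cathal_x"]},
]
merged = merge_duplicates(cathal)
```

Note that this keeps the spurious email addresses too. Deciding which values are junk is a separate job, and probably one for a human.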

Right now I’m considering firing up Talend, Datanomic, or Informatica tools to dedupe a dump from my iPhone and reload it to the phone, and then hopefully that will cascade through the rest of my data stores when I synchronise.

But I’ll need to draw a data flow map of all of that to make sure.

Grrrrhhh.

So. If the existing tools for data quality on the iPhone are not up to the job, what is missing? The good news is that the data sets are fairly clearly structured (once they get into the iPhone), so that is less of a concern than the actual processing of matching and consolidation of records.

  1. Probability scoring across multiple fields would be nice. If two people have the same name but significantly different contact details then it is very probable they are not the same person. A corollary – if there are two records with the same name and one has contact information and the other record has only a name, chances are they are duplicates.
  2. Presentation of matches for review. While the machine can make good guesses where the name and contact details are the same, where there is confusion the matches should be flagged for review by the phone user (the “Data Controller”). This way we can avoid having to unpick erroneous matches.
  3. Merging of records should be done on a more structured basis, with mapping of fields being user-customisable based on a standard template. I despair of important contact information being dumped into a notes field (it reminds me too much of when I had to try and migrate data out of a Siebel call centre system a few years ago).
  4. The matching should be able to cater for multi-lingual input (as phones don’t all live and work in English-speaking lands).
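
To make requirements 1, 2 and 4 a bit more concrete, here is a rough Python sketch of scoring a pair of records across several fields and sorting the result into "merge", "separate" or "review". Everything in it is an assumption on my part: the record shape, the function names, the weights and the thresholds are invented for illustration and would need tuning against real data. The accent stripping is also a crude stand-in for proper multi-lingual matching.

```python
import unicodedata
from typing import Any, Dict, Set

# Hypothetical record shape: {"name": str, "phones": set, "emails": set}
Contact = Dict[str, Any]

MERGE_AT = 0.75     # assumed threshold: at or above, treat as a duplicate
SEPARATE_AT = 0.25  # assumed threshold: at or below, treat as different people


def normalise(text: str) -> str:
    """Casefold, strip accents and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def details(contact: Contact) -> Set[str]:
    """Collect comparable details: phone digits and casefolded emails."""
    phones = {"".join(ch for ch in p if ch.isdigit())
              for p in contact.get("phones", ())}
    emails = {e.strip().casefold() for e in contact.get("emails", ())}
    return {d for d in phones | emails if d}


def match_score(a: Contact, b: Contact) -> float:
    """Score a pair of records from 0.0 (different) to 1.0 (same person)."""
    if normalise(a["name"]) != normalise(b["name"]):
        return 0.0
    details_a, details_b = details(a), details(b)
    if not details_a or not details_b:
        # Same name, one record has nothing else: likely a duplicate,
        # but not certain enough to merge without asking.
        return 0.7
    # Share of contact details the two records have in common (Jaccard).
    overlap = len(details_a & details_b) / len(details_a | details_b)
    return 0.2 + 0.8 * overlap


def classify(score: float) -> str:
    """Turn a score into an action; the middle band goes to the user."""
    if score >= MERGE_AT:
        return "merge"
    if score <= SEPARATE_AT:
        return "separate"
    return "review"


# Made-up records: same name, completely different contact details.
father = {"name": "Pat Murphy", "phones": {"555-0101"},
          "emails": {"pat.senior@example.com"}}
brother = {"name": "Pat Murphy", "phones": {"555-0199"},
           "emails": {"pat.junior@example.com"}}
name_only = {"name": "pat murphy"}
```

With these made-up records, the father and brother pair scores low and stays separate, while the name-only record lands in the review band rather than being merged silently.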

There may be other requirements that I am not thinking of here at the moment, but those 4 are a starting point. Perhaps an obliging Data Quality tool vendor will develop an iPhone app as a front end to a web service for matching contact records.

Personally, I think that having such a service available would help raise awareness of the value of quality non-duplicated contact information to individuals and to organisations. However, the app on its own isn’t enough. The average smartphone user may have personal information held in a variety of places and, just like in a large enterprise with lots of data stores, creating a “Single View of Contact” will require you to understand the flow of your contact information around your tools (i.e. does the phone update the laptop, does the laptop sync to Google Apps, and does Google Apps sync to the phone?). Otherwise the cleanup work may be undone the next time you plug your phone into your PC.
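
For the data flow map, even a tiny model helps. The sketch below treats each store as a node and each "syncs to" relationship as an arrow, then looks for a route by which data leaving one store finds its way back. The store names and flows are assumptions taken from the question above, not a description of how any particular sync product behaves, and `find_sync_loop` is a name I made up.

```python
from typing import Dict, List, Optional

# Hypothetical flow map: store -> list of stores it pushes contacts to.
SyncMap = Dict[str, List[str]]


def find_sync_loop(flows: SyncMap, start: str) -> Optional[List[str]]:
    """Return one route that leaves `start` and comes back to it, or None."""
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        for target in flows.get(node, []):
            if target == start:
                return path + [start]
            if target not in visited:
                visited.add(target)
                stack.append((target, path + [target]))
    return None


# Assumed flows, taken from the question in the paragraph above.
flows = {
    "phone": ["laptop"],
    "laptop": ["google apps"],
    "google apps": ["phone"],
}
loop = find_sync_loop(flows, "phone")
```

If a loop comes back, every store on it needs to be clean (or cleaned in the right order) before the next sync, otherwise the duplicates just travel round and land back on the phone.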

Information Quality Management poses challenges for the enterprise, but can also create friction for the individual trying to manage something as simple as a list of contacts across multiple information stores.

Do we have an app for that?

Posted in Business, Information Quality, Web 2.0.

4 Comments

  1. Excellent post Daragh,

    I continue to resist upgrading my mobile phone. I still have a non-3G phone, which is almost like saying I carry around a rotary dial land line phone in a backpack with a really long cord connected back to a telephone jack in my garage, making my “calling circle” literally a 100 meter circle surrounding my house — wow, that was a long way to go for that joke 🙂

    I definitely agree that “we need an app” for information quality. However, and with acknowledgment of the many data privacy and data protection issues that remain significant and unresolved, I believe the future is cloudy.

    Instead of maintaining multiple contact lists on multiple devices, we should use a single online contact list that our “dumb devices” authenticate with and use.

    The idea of a “dumb phone” and not a “smart phone” is not mine. It is another idea I read about in David Siegel’s great book “Pull” that looks at the potential future provided by the emerging semantic web.

    Best Regards,

    Jim

    • @jim Your curmudgeonly ways are offensive to the Great Jobs. Begone.
      @steve If an iPhone app could be bolted in front of a Talend web service, that might be a nifty marketing tool for their Open Source data quality tools. Heck. I might even pay for it 😉
