Category: Information Quality

  • Certified Information Quality Professional

    Recent shenanigans around the world have highlighted the importance of good quality information. Over at idqcert.iaidq.org I’ve written a mid-sized post explaining why I am a passionate supporter of the IAIDQ’s Certified Information Quality Practitioner certification.

    Basically, there is a need for people who are managing information quality challenges to have a clear benchmark that sets them and their profession apart from the ‘IT’ misnomer. A clear code of ethics for the profession (a part of the certification as I understand it) is also important. My reading of the situation, particularly in at least one Irish financial institution, is that people were more concerned with presenting the answer that was wanted to a question rather than the answer that was needed and there appears to have been some ‘massaging’ of figures to present a less than accurate view of things – resulting in investors making decisions based on incomplete or inaccurate information.

    Hopefully the CIQP certification will help raise standards and the awareness of the standards that should be required for people working with information in the information age.

  • Data Protection Awareness

    This post has been triggered by two things.

    Firstly, I had a nice chat with Hugh Jones who is running the ICS’s Data Protection training (see www.ics.ie/dp for details). Hugh is interested in raising awareness of data protection issues both for businesses and for individuals. I wholeheartedly agree with him that this is important, not least because Data Protection has a strong Information Quality component.

    Secondly, just yesterday I saw two very clear examples of poor data protection practices. And that is not counting the dozen or so CCTV cameras I saw in the Dundrum Town Centre without any notification signage alerting me to the cameras or who to contact to get a copy of my personal image. Both of the incidents I saw related to sign up sheets for various things which were left in public places.
    The first Data Protection heeby-jeeby
    The least worrying one was in Wexford, where the sign up sheet for a contact list for a community group was left lying on a table that was unattended (although staff were standing nearby). The information being captured was names, email addresses, mobile phone numbers and postal addresses. Each of those records would be worth approximately €100 to the right people. At 20 lines per sheet, each sheet would be worth €2000.

    That pays my mortgage for 2 months.

    Ideally, the voluntary organisation in question should have put someone sitting on a chair beside the clip pad to keep an eye on one of the most valuable things in the room.

    The second Data Protection Heeby-Jeeby (and this one scared the bejesus out of me)

    A car dealership has a display model parked up in the hallways of a large shopping mall in Arklow. On the table beside the car they have a sign up sheet (ho hum) inviting you to leave your personal details in order to be entered into a raffle.
    The first problem here is that this is very obviously a way for them to collect sales leads, contact details for people who they can phone or write to to offer test drives and suchlike. However the sign up form doesn’t say that. There is no information about what the information is being captured for, what uses it may be put to, or who to contact if you have a query about the information. So, it is not being captured fairly for a specified use – that’s the first Data Protection breach.

    More worrying is that the table (and the sheets and box full of personal data) were left unattended when I walked past yesterday afternoon. Personal data for about a dozen people was clearly visible on the table, unsecured, unprotected. I took a photograph with my phone. I had considered uploading it to this blog post, but there is some personal information clearly visible. So I won’t. But I have 19 rows of personal data, including at least 1 mobile phone number in an image on my (secure to a point of paranoia) archive drive at home.

    Unfortunately, I suspect that someone else took something more as the sheet was gone a few minutes later. 19 rows of data at €100 a pop… not bad for 3 seconds work. The sheet may have fallen on the floor. However, even in that case the data was no longer in the control of the Data Controller.

    So, to the car dealership that put that blue Hyundai I20 in the Bridgewater Shopping Centre in Arklow: you REALLY REALLY should consider sending a few of your staff to the Data Protection Lunch & Learn session or to the 1 day or 3 day Data Protection courses run by the ICS. Currently your entire marketing set up in the Bridgewater Shopping Centre is in breach of the Data Protection Act.

    Conclusion
    I would advise everyone to make themselves aware of the provisions of the Data Protection Act and to evaluate every time someone asks you for personal information. Don’t give your information to anyone who isn’t capturing it fairly, processing it fairly or treating it as a valuable asset. If they are leaving it lying around in a public place unattended and unsecured… think twice.

    If you are a person or organisation capturing personal information about people, then you should put some time and effort into planning how you will capture the information, secure it, prevent it being photographed, swiped or mislaid, and ultimately put it to use. You should avoid the temptation to promote your data capture as something that it is not… yes, offer a raffle prize but let people know if you are planning to use the data to drive a marketing campaign.

  • Information Quality Train Drivers

    The IAIDQ is working to develop an industry standard certification/accreditation programme for Information/Data Quality Professionals (similar to the PMI for Project Managers). This is a valuable and significant initiative that will (hopefully) lead to a reduction in the types of issues we see over at IQTrainwrecks.com.

    The IAIDQ has set up a blog over at idqcert.iaidq.org to share news and feedback from the Certification development project. Currently there are some good posts there about the first international workshop that was held in October in North Carolina to thrash out the ‘knowledge areas’ that needed to be addressed. That workshop was a key input into the next stage of the project – a detailed Job Analysis study.

    Of course, industry defining initiatives like this need to be funded and the IAIDQ is eager that this be a community-led project “by IQ Professionals for IQ Professionals”, rather than being driven by the objectives of vendors (although vendors are good and the IAIDQ is looking for vendor sponsorship to help this initiative as well). To make this a ‘community’ initiative it was felt that individuals might like to ChipIn a few quid. If you are in the US it is tax-deductible due to the legal status of the IAIDQ (a 501(c)(3) not for profit). The rest of us might just need to be less generous.

    I personally think this is a great initiative that will raise standards and objectivity in the field of Information Quality. Please give generously.

  • Obama’s win… a win for information quality

    Barack Obama just might be the first ‘Information Age’ President of the US.

    The Houdini Project that his team ran has highlighted the value of information, and especially good quality and timely information, when making decisions or trying to gain a competitive advantage. From the details that have leaked out (and while Newsweek get the credit for breaking the story, I found it discussed here a few days ago) it is clear that from the top down there was an understanding of the value of timely and accurate data with additional ‘richness’ of information to help focus resources (ie not calling people who’d already voted or who weren’t going to vote Obama), prioritise effort (ie putting the priority on calling in areas where voter turnout was lower than expected), and generally just getting the edge on the opposition.

    On the DailyKos, UMassLefty wrote:

    We were plugged in to the GOTV operation throughout the day, and we knew that it was working, that what we were doing mattered.

    Ironically, only yesterday I was delivering a presentation on how information quality professionals needed to work with their customers (stakeholders) to make that link between the goals and priorities of the Executive Committee and the actions, deeds and drivers of the people in the front line to give a clear and coherent alignment of information quality to strategy (and vice versa).

    The IAIDQ has issued a press release commenting on the value of the information to the success of Obama’s campaign.

    As more information emerges about how the Houdini project worked, I’m sure either the IAIDQ or I will be writing more about it.

  • Imitation the sincerest form of flattery

    I noticed that Informatica have launched a new website called www.doyoutrustyourdata.com, to highlight issues with poor quality information from the media.

    My personal opinion on the site is that it isn’t very nice looking (but then I’m not a big fan of black on green). However, I’m biased as I moderate the IQTrainwrecks.com blog for the IAIDQ which has been doing this for over 2 years now in an occasionally tongue in cheek manner. IQTrainwrecks.com gets reasonably good search returns on Google (and we’re looking at ways to improve that further).

    I’m flattered that Informatica have stumbled upon the same idea that the IAIDQ had back in 2006. I hope that we can figure out a way to have both sites working together for the benefit of information consumers everywhere. For example, the IAIDQ would love to reward members for submitting stories to IQTrainwrecks.com but our resources aren’t extensive enough to fund that (yet).

    [Update] As Vincent McBurney correctly points out, the IAIDQ wasn’t the first to try to create a resource like this. IQTrainwrecks is a spiritual descendant of www.dataquality.com (and also the listing of issues that Tom Redman has been tracking over on www.navesink.com). [/update]

  • How not to handle a customer…

    So, I’ve been having problems with my broadband. Problems significant enough that I would suggest that the Dept of Comms actually think through the potential reliance on Fixed Wireless solutions for Ireland’s broadband deficit. More on that another time.

    What annoys me in the immediate sense is the level of customer service that people seem to think is OK. I had my FWA antenna removed from my house today. I found out about it when I looked out the window and saw the van from my provider in the driveway and the legs of a ladder going up the side of the house. I expected a binglybong on the door bell to let me know what was happening, but nowt. I was working so I couldn’t rush out to talk to the man. By the time I’d finished the work stuff he’d vanned away again.

    I’d complained to my provider in writing back in May about some issues. I got a nice email addressing part of my complaint and bugger all else. After this morning’s visitation I emailed them to find out what was going on.

    Apparently they’ve tried to contact me “numberous” [sic] times over the past month to talk to me about the problems I was having.

    Checked email… nowt.
    Checked spam filter… nowt.
    Checked missed calls on phone… nowt.
    Checked the drawer in the kitchen where all the things that look like bills get hidden… nowt.

    I know I had no voicemails from them on the phone as I would have remembered it (and I would have downloaded the voicemail from the webmail service provided by my mobile service provider -betcha didn’t know you could do that did you, unified messaging almost – and put it in the folder of documents/evidence I am compiling to go with my inevitable ComReg complaint).

    Apparently the only contact information they have for me is my mobile number. Apart from the fact they’ve sent me emails to my email address and a man-in-a-van could find my house, where letters also go. And I included all of that information again on my complaint letter.

    So the lack of a follow up email, or a letter responding to my complaint or a friendly binglybong on the doorbell from the man in the van to fill me in on things were all beyond them, because they didn’t have the information. Which they, errmmmm, had, for the reasons mentioned above.

    So that thing about only having a mobile number to contact me is a… [mistake] [lie] [cop out] [failure of internal processes to properly manage customer information]… (select one or more options as appropriate).

    It would seem it’s all my fault I didn’t know what was going on. I should have felt the disturbance in The Force, as if a small call centre of people suddenly cried out as one and then suddenly fell silent. Curse my failing and fading Jedi skills.

    At least that’s how I’d feel if I wasn’t so peeved at the whole thing. I think that once I’ve updated ComReg with the nonsense I’m dealing with I’ll send my ex-provider a request for all personal information they hold about me (electronic and paper file, and ip and traffic logs etc.) under the terms of the Data Protection Act. ‘Coz I am fond of my regulatory frameworks and codes of practice etc.

    Notice that I’ve not named the service provider or discussed the specific issues here. That would be unfair to my (it would seem former – at their initiative) Broadband Provider. However, they are exactly the type of organisation that DCENR seems to be pinning the Great Broadband Hope on.

    The good news is that the Vodafone broadband dongle I have for using while commuting and which has been my main tool for getting on line at home recently – even though it is just 2G around these parts – picked up a 3 3G network last night. Couldn’t connect to it but knew it was there. So that’s got me thinking….

  • An IQ Trainwreck…

    From Don Carlson, one of my IAIDQ cronies in the US comes this YouTube vid from Informatica (a data quality software tool vendor) that sums up a lot of why Information Quality matters.

    Of course, I could get snooty and ask what gave them the idea to juxtapose Information Quality and Trainwrecks…. gosh, I’d swear I’ve seen that somewhere before

  • The Electoral Register Hokey-Cokey

    When I was a small child, my grandmother used to entertain me and my siblings by getting us to sing and dance the hokey cokey, a playful little song and dance routine if ever there was one.

    This dance was brought to mind yesterday when Fergal of the Tuppenceworth bloggers emailed me to let me know that he appears to have been taken off the Electoral Register in his home county. Again.

    You put your right to self-determination and election of a government by proportional representation as mandated by the constitution of the Irish Republic in.
    You put your right to self-determination and election of a government by proportional representation as mandated by the constitution of the Irish Republic out.
    In. Out. In. Out.
    And you shake it all about.

    It would seem that Fergal had been taken off the Register during the Great Clean up of 2006. He then had his ballot reinstated. The other day, in a fit of electoral existentialism he decided to try and find himself on the Electoral Register website www.checktheregister.ie

    Zen like, he found himself encountering the concept of nothing as a search for his name at his address revealed nothing. Oh Hokey Cokey Cokey indeed.

    So what may have gone wrong here?

    • Is Fergal’s name transposed on the Register (surname first, firstname last)?
    • Is the address registered against Fergal on the Register different to his address?
    • Does the search function on the Electoral Register require an exact character match on names/addresses? Is “Fergal” interpreted as a different name to “Fearghal” (both Fergals in my book)?
    • If Fergal has indeed been deleted from the Register (again), what triggered the Hokey Cokey here? Was an old copy of the Register loaded to the website?
    • Is the version you search on-line up to date with the version you might find in your library or Garda Station? Might Fergal be on the Register, but just not on the Register that is searched? Might it work in the contrary… Might people be listed as ‘on the register’ in an on-line search but be off the Register in the ‘paper’ world (ie the version that counts on polling day)?
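
    To make the first three suspects on that list a bit more concrete, here is a minimal sketch of a more forgiving name comparison. It is only a guess at the kind of thing that might help: I have no idea how the real search on the Register is implemented, the surname "Murphy" is made up for the example, and the function names and the 0.8 threshold are my own arbitrary choices. It uses nothing beyond Python's standard library.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case, strip accents and punctuation, and sort the name parts."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    letters_only = re.sub(r"[^a-z\s]", " ", no_accents.lower())
    # Sorting the parts makes "Murphy Fergal" and "Fergal Murphy" identical,
    # which covers the transposed-name hypothesis.
    return " ".join(sorted(letters_only.split()))


def names_probably_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """True when two names are similar enough to deserve a second look.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    similarity = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
    return similarity >= threshold


# Hypothetical examples (the surname is invented):
print(names_probably_match("Fergal Murphy", "Murphy, Fergal"))
print(names_probably_match("Fergal Murphy", "Fearghal Murphy"))
print(names_probably_match("Fergal Murphy", "Mary Byrne"))
```

    A loose match like this should only ever produce a list of candidates for a human to check. It is not a substitute for fixing whatever process dropped the record in the first place.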

    The list of potential root causes is (especially as I am speculating a bit) quite long. However this is further evidence that the processes for the management of the Electoral Register are a bit knackered. This has been accepted by the Government and the Oireachtas Committee on the Electoral Register recently published a series of recommendations which eerily echoed comments and recommendations made on this blog over 2 years ago.

    However, while there is an urgent need to have as accurate an electoral register as possible (1 Referendum in our immediate future and Local Elections in the not too distant future), care must be taken to ensure we solve the problems of tomorrow as well as the problems of today.

    But in the words of Tom Jones – “I think I’m gonna dance now”…

    “Oh, hokey cokey cokey…. Oh hokey cokey cokey…..”

  • Telephone numbers and Information Quality – the risk of assumption

    There is an old saying that the word “Assume” makes an “Ass” out of “You” and “Me”.

    Yet we see (and make) assumptions every day when it comes to assessing the quality (or otherwise) of information. Anglo-Saxon biased peoples (US, English speaking Europe etc) often assume that names are structured Firstname Surname. “Daragh” = First Name, “O Brien” = Surname. The cultural bias here is well documented by people like Graham Rhind (who advises the use of “Given Name/Family Name” constructs on web forms etc. to improve cross-cultural usability).

    But what if you see “George Michael” written down (without the context of labels for each name part) with a reference to “singer”? Would this relate to the pop singer George Michael, or the bass baritone singer Michael George?

    One of the common ‘rules of thumb’ with telephone numbers is that, when you are trying to create the full ‘internationalised’ version of a telephone number (+[international access code] [local area code] [local number]) you take the number as written ‘locally’ and drop the leading zero. Of course, like most conventional wisdom a little scrutiny causes this rule of thumb to fall apart.

    For example, in the Czech Republic there is no ‘leading zero’ as it is actually part of the international access code (which actually makes more sense to me…). One might assume that Europe, with the standardisation ethos of the European Union would all have plumped for “0” as a leading digit on local area codes. Not so, as Portugal doesn’t use any leading digit on their area codes. Some countries that used to be part of the USSR (like Russia, Belarus and Azerbaijan) use 8 instead of 0.

    You might not be safe in assuming that you just need to consider the first digit of the local area code. Hungary has a 2-digit prefix (06), so you would need to parse in 2 characters in the string to remove the correct digits. Just stripping the leading zero will result in a totally embuggered piece of information.
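
    To show the difference between the crude rule of thumb and a country-aware one, here is a small sketch. The rule table is illustrative only: the prefixes come from the examples in this post, I have added the country calling codes for the demonstration, the phone numbers are dummies, and the names `TRUNK_RULES`, `naive_strip_zero` and `to_international` are invented for the example. A real rule library would need to be researched and kept up to date.

```python
# Illustrative rule table: country -> (country calling code, local prefix to drop).
# The prefixes are the ones quoted in this post, not an authoritative list.
TRUNK_RULES = {
    "IE": ("353", "0"),   # Ireland: drop the leading zero
    "CZ": ("420", ""),    # Czech Republic: nothing to drop
    "PT": ("351", ""),    # Portugal: nothing to drop
    "HU": ("36", "06"),   # Hungary: two-digit prefix
    "RU": ("7", "8"),     # Russia: 8 instead of 0
    "BY": ("375", "8"),   # Belarus: 8 instead of 0
}


def digits_only(number: str) -> str:
    """Keep only the digits, discarding spaces, brackets and hyphens."""
    return "".join(ch for ch in number if ch.isdigit())


def naive_strip_zero(local_number: str) -> str:
    """The crude rule of thumb: drop one leading zero, whatever the country."""
    digits = digits_only(local_number)
    return digits[1:] if digits.startswith("0") else digits


def to_international(local_number: str, country: str) -> str:
    """Apply the prefix rule for the given country, then add its calling code."""
    calling_code, prefix = TRUNK_RULES[country]
    digits = digits_only(local_number)
    if prefix and digits.startswith(prefix):
        digits = digits[len(prefix):]
    return f"+{calling_code}{digits}"


# Dummy numbers, for illustration only.
print(to_international("01 234 5678", "IE"))    # Irish-style number
print(to_international("06 1 234 5678", "HU"))  # Hungarian-style number
print(naive_strip_zero("06 1 234 5678"))        # leaves a stray 6 behind
```

    The point of keying the table on the country is that the function refuses to guess: without knowing where a number comes from, there is no safe rule to apply.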

    Also, everyone assumes that a telephone number will consist only of numbers. However, there are a few instances where the code required to dial out from a country (the International Direct Dial code) is actually alphanumeric in that it contains either the * (star) or # (hash key/pound key). Our buddies in Belarus are an example of this, where to dial out from Belarus you need to dial “8**10” (which even more confusingly is often written “8~10”).
    So what does this mean for people who are assessing or seeking to improve the quality of telephone number data in their systems?

    Well, first off it means you need to have some context to understand the correct business rules to apply. For example, the rules I would apply to assessing the quality (and likely defects) in a telephone number from Ireland would be different to what I’d need to apply to telephone numbers relating to Belarus. In an Irish telephone number it would be correct to strip out instances of “**” and then validate the rest of the string based on its length (if stripping the ** made it too short to be a telephone number then we would need to tag it as duff data and remove it). With data relating to Belarus it might simply be that the person filling in the form (the source of the data) got confused about what codes to use.
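
    The Irish rule described above might be sketched along these lines. The length limits are placeholders I have picked for the example rather than researched values, and the function name `clean_irish_number` is invented; the numbers are dummies.

```python
def clean_irish_number(raw: str, min_digits: int = 9, max_digits: int = 10):
    """Strip '**' from an Irish number and flag it if what is left looks wrong.

    Returns a (cleaned, is_valid) pair. The min/max lengths are illustrative
    placeholders and would need checking against the real numbering plan.
    """
    cleaned = raw.replace("**", "")
    # Remove common visual separators before checking the length.
    for separator in (" ", "-", "(", ")"):
        cleaned = cleaned.replace(separator, "")
    is_valid = cleaned.isdigit() and min_digits <= len(cleaned) <= max_digits
    return cleaned, is_valid


print(clean_irish_number("01 234**5678"))  # stars removed, length still plausible
print(clean_irish_number("8**10"))         # too short once the stars are gone
```

    Anything that comes back flagged as invalid is a candidate for the 'duff data' pile. For a Belarusian record the same stars would call for a different rule, which is exactly why the country context has to travel with the number.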

    Secondly, it means you need to put some thought into the design of information capture processes to reduce the chances of errors occurring. Defining a structure with separate fields, linking the international access code to a country drop down (and a library of business rules for how to interpret and ‘standardize’ subsequent inputs) would not be too difficult – it would just require investment of effort in researching the rules and maintaining them once deployed. Here’s a link to a useful resource I’ve found (note that I can’t vouch for the frequency of updates to this site, but I’ve found it a fun way to figure out what the rules might be for various countries). Also, Wikipedia has a good piece on Telephone number plans. Graham Rhind also has some good links to references for telephone number format rules.
    Looking at the data of a telephone number in isolation will most likely result in you screwing up some of the data (if you have international telephone numbers). Having the country information for that data (is the number in France or Belarus) allows you to construct appropriate rules and make your assumptions in the appropriate context to reduce your risks of error.

    Ultimately, blundering in with a crude rule of thumb and simply stripping any leading zeros you find because that is the assumption you’ve made will result in you making an ass out of you and your data.

    Which raises an interesting question…

    Imagine you have been given a spreadsheet of telephone numbers that you have been told are international numbers in the ‘local’ formats for the respective countries. You open the spreadsheet and there are no leading zeros (because Excel – and most other spreadsheets – assumes that numbers don’t begin with zero and strips it out). What do you do to get the data back to a format that you can actually use?

    Answers on a post card (or in the comments) please.

  • Cripes, the blog has been name-checked by my publisher…

    TwentyMajor isn’t the only blogger in the pay of a publisher (I’m conveniently ignoring Grandad and the others as Irish bloggers are too darned fond of publishing these days. If you want to know who all the Irish bloggers with publishers are then Damien Mulley probably has a list)!

    I recently wrote an industry report for a UK publisher on Information Quality strategy. The publisher then swapped all my references to Information Quality to references to Data Quality as that was their ‘brand’ on the publication. I prefer the term Information Quality for a variety of reasons.

    As this runs to over 100 pages of A4 it has a lot of words in it. My fingers were tired after typing it. Unlike Twenty’s book, I’ve got pictures in mine (not those kind of pictures, unfortunately, but nice diagrams of concepts related to strategy and Information Quality. If you want the other kind of pictures, you’ll need to go here.)

    In the marketing blurb and bumph that I put together for the publisher I mentioned this blog and the IQTrainwrecks.com blog. Imagine my surprise when I opened a sales email from the publisher today (yes, they included me on the sales mailing list… the irony is not lost on me… information quality, author, not likely to buy my own report when I’ve got the four drafts of it on the lappytop here).

    So, for the next few weeks I’ll have to look all serious and proper in a ‘knowing what I’m talking about’ kind of way to encourage people to buy my report. (I had toyed with some variation on booky-wook but it just doesn’t work – reporty-wort… no thanks, I don’t want warts).

    So things I’ll have to refrain from doing include:

    1. Engaging in pointless satirical attacks on the government or businesses just for a laugh, unless I can find an Information Quality angle
    2. Talking too loudly about politics
    3. Giving out about rural/urban digital divides in Ireland
    4. Parsing and reformatting the arguments of leading Irish opinion writers to expose the absence of logic or argument therein.
    5. Engaging in socio-economic analysis of the fate of highstreet purveyors of dirty water parading as coffee.
    6. Swearing

    That last one is a f***ing pain in the a**.

    If any of you are interested in buying my ‘umble little report, it is available for sale from Ark Group via this link. This link will make them think you got the email they sent to me, and you can get a discount, getting the yoke for £202.50 including postage and packing (normally £345+£7.50p&p). (Or click here to avoid the email campaign software…)

    And if any of you would like to see the content that I’d have preferred the link in the salesperson’s email to send you to (coz it highlights the need for good quality management of your information quality) then just click away here to go to IQTrainwrecks.com

    Thanks to Larry, Tom, Danette and the wifey for their support while I was writing the report and Stephanie and Vanessa at Ark Group for their encouragement to get it finished by the deadline.