Propagation of information errors and the risks of using surrogate sources

….ye wha’?

A lot has been written, in relation to the electoral register and other matters, about using information from other sources to improve the quality of the information you have or to create a new set of information.

This makes sense. Other people may already have done much of the work for you and, effectively, all you need to do is copy their work and edit it to meet your needs. In most cases it may be faster and cheaper to use such ‘surrogates’ for reality than to go to the real-world things (people, stock-rooms, wherever) and start from scratch to build exactly the information you need, in the format you require and to your own standards.

There is, however, a price to pay for having such surrogate sources available to you. You need to accept that:

  1. The format and structure of the information may need to be changed to fit your systems or processes.
  2. The information you are using may itself be inaccurate, incomplete or inconsistent.
  3. If you are combining it with other information, it will require investment in tools and skills to properly match and consolidate your information into a valid version of the truth.

These risks apply to organisations buying marketing lists to integrate with their CRM systems, but they apply equally to students relying on the Internet for the content of their academic projects and to journalists trawling for content for newspaper articles or reviews.

Recurrence of common errors, phrases or inaccuracies in term papers is one way that academia has of identifying academic fraud. Similar techniques might be applied in other arenas to identify and track instances of copyright infringement.
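As a rough illustration of how recurring phrases and errors might be spotted, here is a minimal sketch in Python. It is not how any particular college or plagiarism tool works; the function names, the five-word window and the two sample sentences (including the deliberate misspelling) are all made up for the example.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences in text (lower-cased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(doc_a, doc_b, n=5):
    """Return the n-word phrases that appear in both documents."""
    return shingles(doc_a, n) & shingles(doc_b, n)


# Invented sample texts. The misspelling "acording" is deliberate:
# an error copied along with the phrase is a stronger signal than the phrase alone.
source = "The register was last audited in 1987 acording to the department."
essay = "Critics note the register was last audited in 1987 acording to officials."

overlap = shared_phrases(source, essay)
for phrase in sorted(overlap):
    print(phrase)
```

A handful of shared five-word runs proves nothing on its own, but a shared run that carries the same typo or the same factual slip is worth a closer look by a human.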

In businesses dealing with thousands of records, the cost/risk analysis is relatively straightforward. My recommendation is that you need clear processes to manage suppliers and to measure the quality of the information they provide against a defined standard for completeness, consistency, duplication, conformity etc. Random sampling of surrogate data sources for accuracy (not every 100th record but a truly random sample) is also strongly recommended.
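To make that concrete, here is a small sketch in Python of two such measures and of a random sample. It is only an illustration: the field names, the ‘required fields’ standard and the four sample records are invented, and a real supplier standard would define its own rules.

```python
import random

# Hypothetical standard: every record must have these fields filled in.
REQUIRED_FIELDS = ("name", "address", "postcode")


def completeness(records, required=REQUIRED_FIELDS):
    """Share of records in which every required field is non-empty."""
    if not records:
        return 0.0
    complete = sum(
        1 for r in records
        if all(str(r.get(f) or "").strip() for f in required)
    )
    return complete / len(records)


def duplication(records, key_fields=("name", "postcode")):
    """Share of records whose normalised key repeats an earlier record."""
    if not records:
        return 0.0
    seen = set()
    dupes = 0
    for r in records:
        key = tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        if key in seen:
            dupes += 1
        else:
            seen.add(key)
    return dupes / len(records)


def random_sample(records, size, seed=None):
    """Pick records at random without replacement, rather than every Nth one."""
    rng = random.Random(seed)
    return rng.sample(records, min(size, len(records)))


# Invented sample data standing in for a purchased list.
supplier_list = [
    {"name": "Ann Byrne", "address": "1 Main St", "postcode": "D01"},
    {"name": "ann byrne ", "address": "1 Main Street", "postcode": "D01"},
    {"name": "Tom Walsh", "address": "", "postcode": "D02"},
    {"name": "Mary Kelly", "address": "3 High St", "postcode": "D03"},
]

print(completeness(supplier_list))  # 0.75 (one record has no address)
print(duplication(supplier_list))   # 0.25 (one record repeats a name and postcode)
to_check = random_sample(supplier_list, 2, seed=42)
```

The records in `to_check` are the ones you would then verify against reality. Taking every 100th record instead can miss problems that follow the order of the file, which is why the sample is drawn at random.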

These are EXACTLY the same techniques that manufacturing industries use to ensure the quality of the raw material inputs to their processes. If it works for industries where low quality can kill (such as pharmaceuticals), why shouldn’t it work for you?

For students, journalists and those of us hacking away in the blogosphere the recommendation is simple. Only rely on surrogate sources if you absolutely have to. If you use someone else’s work as your source, credit them. If you don’t want to credit them, then make sure you verify the accuracy of their work, either by checking it against reality or by checking it with at least one other source.

That way you avoid having the errors of your source become your errors too, and you don’t run the risk of someone crying foul and either suing you for infringing their copyright (and copyright does apply to content posted on the internet and in blogs) or applying whatever other sanctions are open to them (such as kicking you off your college course).

In many cases the costs and effort involved in double checking (particularly for a once-off piece of writing) are negligibly different to the costs of actually starting from scratch and building your information up yourself. And, depending on the context, it may even be more enjoyable.

The New York Times not so long ago had to relearn the lessons of checking stories with at least one other source for accuracy.

Horatio Caine in CSI:Miami always tells his team to “trust, but verify”.

When using surrogate sources for real-world information in any arena you must assess the risk of doing so and put in place the necessary controls so that you can trust that you have verified.

(c) Daragh O Brien 2006 (just in case)

7 thoughts on “Propagation of information errors and the risks of using surrogate sources”

  1. Indeed you did. I saw your posted review on tuppenceworth. Apparently a Sunday paper shared your overall opinions on the place recently also.

    I really must try their desserts.

  2. Pingback: The DOB Blog » Customer focus in the mee-ja

  3. Excellent analysis, but let’s face it. DQM in relation to the electoral register will remain a desperately unsexy, bit creepy, slightly smelling of milk and vaseline pursuit right up till five minutes after some shite declares martial law for the duration of the ‘constitutional crisis’ and someone asks “And how did this happen then?”. Truly that will be your moment to shine. Waiting with bated breath.

  4. Unfortunately, examining electoral register data quality is an intellectual pursuit. And about 20 seconds after martial law is declared, our erudite scribe will be rounded up (along with the gipsies and anyone who looks at McDougal funny) and ‘relocated’ to a work camp or some such.

  5. It’ll be the most extraordinary rendition since Hendrix did All along the watchtower. And you probably deserve it, pinko. But on a serious note, the discrepancies in the Irish Electoral Register remain the purview of the political OCD crew while the constellation of Irish Political parties lacks something approaching stellar wattage. As long as the country continues to vacillate between the Party of No Ideas (and company) and the Parties of No Ideas, electoral turnout will remain a bad joke. The problem isn’t the mechanism, it’s the message.

  6. Ahhh tFunk. I bless your musical taste. However I must disagree. If the register is as fucked as the Bishop’s boy then it can be abused by the party of but one idea… to stay in power or the party of the idea of stealing power. Fixing the register might actually make them think about the electorate for a change and the issues in Irish Society (like chumps like me that spend 2 to 3 nights away from their spouses because they couldn’t afford a house nearer their work place). But only if they pay attention to the problems that the earlier posts here have highlighted, like the fact that you can’t change your f*cking name on the register unless you move house. Who the hell can afford to do that nowadays?
