Data Quality Through Dynamic Data Entry

Right First Time

Finally, more and more commentators and practitioners in the data quality world are beginning to realise that the key to quality data is to ensure that all data is correct at the time it is created. Getting data right first time is obviously a far better approach than allowing erroneous data to be created and then having to spend time and money finding and correcting it. Creating error-free data is, in general, a lot simpler than many people imagine. However, it requires taking a new, lateral approach to the design of data entry screens.

One of the sets of data that is input into computer systems millions of times each day around the world is a party's (customer, supplier, etc.) name and address. So you would think that by now system designers would have this sorted so that it can be done both quickly and error-free each and every time. Sadly, the opposite is the case.

Locked in the Past

The fact is that the layout of most input screens for this purpose follows a format that is over 400 years old!

The practice of writing the name and address of the intended recipient on a piece of mail in the format shown above dates back as far as the 1600s; that’s 200 years before the first adhesive postage stamp, the Penny Black, was developed.

Outdated name and address format.

This format is entirely the reverse of what is required in order to achieve data quality in today’s online media. It is a bit like the design of the QWERTY keyboard: totally antiquated, yet it still persists. However, unlike the QWERTY keyboard, there are no barriers to changing the design of name and address input screens. So, what ought they to look like? The ideal answer is given by the Dynamic Data Entry screens as shown below.

They Design Themselves

The first thing you do is select the language that you want to use.  This immediately changes the prompts for each of the input fields into that language.

Multi-lingual address input and validation.

After entering the First Name and Surname, you then select the Country for the address. This is entirely opposite to the conventional order, yet it is the key to proper data validation. Once the Country is known, the appropriate address fields for the selected country can be displayed in the language selected, and each field can be validated against that country's rules. For example, the next field to be filled in is the Postcode. This could not be properly validated if the country had not been specified.
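Here is a minimal sketch of how knowing the country first might drive both the field list and the postcode check. The `COUNTRY_RULES` table, the field names and the deliberately simplified patterns are assumptions made for illustration; they are not real postal specifications.

```python
import re

# Hypothetical per-country rules. Real postal formats are more varied than
# these simplified patterns suggest.
COUNTRY_RULES = {
    "GB": {
        "fields": ["postcode", "building", "street", "town", "county"],
        "postcode_pattern": r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}",
    },
    "US": {
        "fields": ["postcode", "building", "street", "city", "state"],
        "postcode_pattern": r"[0-9]{5}(-[0-9]{4})?",
    },
}


def address_fields(country):
    """Return the address fields to display for the selected country."""
    return COUNTRY_RULES[country]["fields"]


def is_valid_postcode(country, postcode):
    """Check a postcode against the selected country's pattern."""
    pattern = COUNTRY_RULES[country]["postcode_pattern"]
    return re.fullmatch(pattern, postcode.strip().upper()) is not None
```

The same string can pass for one country and fail for another, which is why the country has to be captured before the postcode.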

Entering a Postcode

If you enter a postcode, then once you have typed two characters the system will begin to display the valid postcode options for the selected country that match the characters entered so far.

Multi-lingual address input validation using postcode.
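The type-ahead behaviour could be sketched roughly as follows, assuming the valid postcodes for the selected country are held in a sorted list. The function name, the `limit` parameter and the sample postcodes are illustrative assumptions; a production system would query a licensed postal reference file.

```python
from bisect import bisect_left


def suggest_postcodes(sorted_postcodes, typed, min_chars=2, limit=10):
    """Return postcodes starting with the typed characters.

    Nothing is suggested until at least `min_chars` characters are typed.
    `sorted_postcodes` must be sorted and upper-case.
    """
    prefix = typed.strip().upper()
    if len(prefix) < min_chars:
        return []
    matches = []
    # Binary search to the first candidate, then walk forward while it matches.
    i = bisect_left(sorted_postcodes, prefix)
    while (
        i < len(sorted_postcodes)
        and len(matches) < limit
        and sorted_postcodes[i].startswith(prefix)
    ):
        matches.append(sorted_postcodes[i])
        i += 1
    return matches


# Illustrative sample only.
SAMPLE_POSTCODES = sorted(
    ["EC1A 1BB", "SW1A 1AA", "SW1A 2AA", "SW1P 3BU", "W1A 0AX"]
)
```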

If you do not know the postcode then you can leave this field blank.

What happens next depends on whether or not you entered a postcode.

If you have entered a postcode, then the appropriate address fields for the selected country are displayed in the standard format for that country, as shown on the right-hand side.

Using the postcode entered, the system will fill in all the fields that it can. In the example on the right, for a UK Postcode, the system was able to fill in the Street, Town and County. Because the postcode entered does not relate to a single specific address, the building number or building name are not known and you will need to input them manually.
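As a rough sketch of this pre-fill step, assume a lookup keyed on country and postcode. The `POSTCODE_DIRECTORY` table and its single entry are entirely fictional and stand in for a real postal reference source.

```python
# Fictional reference data: (country, postcode) -> fields shared by every
# address at that postcode.
POSTCODE_DIRECTORY = {
    ("GB", "AB1 2CD"): {
        "street": "Example Road",
        "town": "Exampletown",
        "county": "Exampleshire",
    },
}

ADDRESS_FIELDS = ["building", "street", "town", "county"]


def prefill_address(country, postcode):
    """Fill in what the postcode determines and list what is still needed."""
    key = (country, postcode.strip().upper())
    known = POSTCODE_DIRECTORY.get(key, {})
    form = {field: known.get(field, "") for field in ADDRESS_FIELDS}
    still_needed = [field for field in ADDRESS_FIELDS if not form[field]]
    return form, still_needed
```

For the fictional postcode above, only `building` is left for the user to type, which mirrors the UK example described in the text.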

Postcode Not Known

If you do not know what the postcode is, then you can leave that field blank, in which case the fields following the postcode will be displayed in the order shown on the right.

Multi-lingual address validation with no postcode.

Now the fields are displayed in the reverse of the order used when the postcode was entered.

This enables both prompts and validation for each field to be provided as it is entered. In this UK example, once we know the country, all of the counties can be accessed from a dropdown list.

Once the county is selected, a list of all of the possible towns in that county can be displayed.

Once the town is selected, then a list of all possible streets can be displayed.

All that is left to be entered is the building number and building name, if applicable. Once all this information is entered, the system can now establish what the postcode for this address is and display it.
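The county, town, street and building sequence can be sketched as a set of cascading lookups, each narrowed by the previous selection. The `STREETS` records and every name in them are fictional; they stand in for a real postal reference file.

```python
# Fictional reference data: one record per street, with a postcode for each
# building number on it.
STREETS = [
    {"county": "Exampleshire", "town": "Exampletown", "street": "Example Road",
     "postcodes": {"1": "AB1 2CD", "50": "AB1 2CE"}},
    {"county": "Exampleshire", "town": "Exampletown", "street": "Sample Lane",
     "postcodes": {"7": "AB1 3EF"}},
    {"county": "Exampleshire", "town": "Otherton", "street": "High Street",
     "postcodes": {"2": "AB4 5GH"}},
    {"county": "Testshire", "town": "Testville", "street": "High Street",
     "postcodes": {"2": "CD6 7IJ"}},
]


def counties():
    """Options for the county dropdown."""
    return sorted({s["county"] for s in STREETS})


def towns(county):
    """Options for the town dropdown once the county is selected."""
    return sorted({s["town"] for s in STREETS if s["county"] == county})


def streets(county, town):
    """Options for the street dropdown once the town is selected."""
    return sorted(
        s["street"] for s in STREETS
        if s["county"] == county and s["town"] == town
    )


def postcode_for(county, town, street, building):
    """Derive the postcode from the completed address, or None if unknown."""
    for s in STREETS:
        if (s["county"], s["town"], s["street"]) == (county, town, street):
            return s["postcodes"].get(building)
    return None
```

Because each list is filtered by the selections already made, the user can only ever pick a combination that exists in the reference data.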

Head in the Clouds, Feet on the Ground

With all of the resources available in the cloud, it is now extremely easy to validate data entry in this way for any address in almost any country.

However, the key element that underpins the ability to provide this dynamic screen formatting and input validation is the data structure shown in the Logical Data Model below.

Logical data model to support dynamic data entry
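The model diagram itself is not reproduced here. Purely as an illustration of the kind of structure that could support country-specific field layouts and language-specific prompts, here is one possible shape. Every class, attribute and value below is an assumption and should not be read as the actual Logical Data Model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CountryAddressField:
    """Which address fields a country uses, and in what order (hypothetical)."""
    country_code: str
    field_name: str
    position: int  # display order when the postcode is known


@dataclass(frozen=True)
class FieldPrompt:
    """The prompt text for a field in a given language (hypothetical)."""
    field_name: str
    language_code: str
    text: str


COUNTRY_FIELDS = [
    CountryAddressField("GB", "building", 1),
    CountryAddressField("GB", "street", 2),
    CountryAddressField("GB", "town", 3),
    CountryAddressField("GB", "county", 4),
]

PROMPTS = [
    FieldPrompt("building", "en", "Building number or name"),
    FieldPrompt("street", "en", "Street"),
    FieldPrompt("town", "en", "Town"),
    FieldPrompt("county", "en", "County"),
    FieldPrompt("building", "fr", "Numéro ou nom du bâtiment"),
    FieldPrompt("street", "fr", "Rue"),
    FieldPrompt("town", "fr", "Ville"),
    FieldPrompt("county", "fr", "Comté"),
]


def screen_prompts(country_code, language_code, postcode_known=True):
    """Return the prompts to display, reversed when no postcode was entered."""
    fields = sorted(
        (f for f in COUNTRY_FIELDS if f.country_code == country_code),
        key=lambda f: f.position,
        reverse=not postcode_known,
    )
    text = {p.field_name: p.text for p in PROMPTS
            if p.language_code == language_code}
    return [text[f.field_name] for f in fields]
```

Keeping the layout and the prompts as data rather than code is what would let the screen "design itself" for each combination of country and language.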

Summary

Building the dynamic structures described above into all systems that store the names and addresses of customers, suppliers, etc. is one of the most powerful ways of ensuring that all data is entered correctly first time.

However, there are exceptions that can arise even when you have collected the correct data using such screens. For example, it could be that the name and address you are inputting already exists in your systems. If this were the case, then adding this information as a new record would create duplication. For this reason, the next step in assuring quality when collecting name and address information is to check whether the information already exists and, if it does, to have the system take the appropriate action.

I will cover exception handling, such as duplicate entries, in a follow-up post on this subject.

4 Responses to “Data Quality Through Dynamic Data Entry”

  1. Larry Robert July 22, 2013 2:52 pm #

    Hi John,

    There is a lot of merit in this approach. However I think you should at least make mention of the fact that the reference data for the countries of the world is not always available and when it can be found it is of varying detail and quality and invariably in different formats. Using these sources can be costly to license and implement. You mention “resources in the cloud” but the postal files are normally maintained by government or government approved agencies that update the reference data on a regular basis. If the reference data is not updated regularly the system can produce “bad” data. There is also the issue of “good” data becoming “bad” data as soon as a given postal administration makes changes to the postal system. This entails updates to existing records as well as possible changes to the data collection design.

    This may not be a stumbling block to large companies but the implementation and maintenance of the reference data is not trivial.

    I like your approach but it is presented as a bit too simple of a project.

    • John Owens July 22, 2013 8:47 pm #

      Hi Larry

      Thank you for your comments.

      Although all that you say is true, the issues you listed would only apply to some parts of the world and some of the time.

      In the meantime what I propose would speed up data entry and eliminate millions of errors in most countries around the globe each day.

      Regards
      John

  2. Richard Ordowich March 13, 2013 12:14 pm #

    There is frequently a struggle with capturing quality data at point of entry. The desire to capture the order or service request versus the need for quality data. From a business perspective, the order entry trumps data quality. Any data entry constraints that delay or hinder the order, service request, payment or claims processing are viewed as directly affecting revenue. Mandatory field edits, the need to validate data or any data entry constraints are frequently resisted. “We will enter or update it later” is a common refrain.

    Although there is a cost for rework, these costs are factored as the cost of doing business.
    This may not be the best common sense or best practice strategy but it is a reality of business.

    • John Owens March 13, 2013 7:12 pm #

      Thanks, Richard.

      “From a business perspective, the order entry trumps data quality.” This may be true for the sales team. It is definitely not true for the business as a whole.

      The cost of rework on bad data is infinitesimal compared to the cost of bad customer service, failed deliveries, loss of repeat business, bad customer retention, etc., etc.

      One of the major causes of bad data entry is the bad design of input screens. The information is requested in the wrong order, unnecessary information is requested, data is not validated until the end – causing the user to have to start all over again, etc.

      Good design actually speeds up the whole process, while helpfully prompting and validating at field level. It leaves the customer and business far better off and happier.

      Regards
      John
