How to Locate Records With Missing Data in Microsoft® Access Databases
Last updated on 2024-05-18.
Introduction
If the information output from a database is to be relied upon, then it's essential that its data is complete and accurate.
The database developer should have included checks on the accuracy of data at the point of input. However, in my experience, there will be some incomplete or wrong records that have got past the checks.
In this article are suggestions on how to weed out missing, inaccurate or inappropriate records. These, if left undetected and uncorrected, will lead to misreporting of, and incorrect conclusions drawn from, the data.
What Classes as Missing Data
These are circumstances in which we might consider data to be “missing”:
- No input whatsoever has been made. That is, the field contains a Null.
- A cell that appears on screen to be blank, yet harbours content.
- An entry that contains a value, yet that value is patently wrong.
- An entry that can be deemed as wrong according to some rule.
- Stretching the sense of “missing”, yet there's a pattern of inputs having been made into the wrong columns. In many databases I've been charged with upgrading, users have done their best to record data in circumstances in which the database has made it difficult for them to achieve this. Hence mobile numbers appear in email fields, countries in county fields, and so on.
The following sections aim to address each of these circumstances.
Missing Data in Text Fields
These are conditions that count towards missing data in fields to which the Text data type has been applied;
- Nothing input, i.e. a Null value.
- The user has input a zero-length string in order to record that “I know there is no value”.
- The user has typed spaces to make it appear that for a record there's no data. This technique is not infrequently resorted to by users instead of deleting content.
In all three of the instances above cells will appear empty.
To locate Nulls and zero-length strings is straightforward:
The query locates both. Criteria are on different rows establishing an OR condition between them.
Locating records that contain space characters only in a field takes a little more thought. Here's one solution in which we introduce an expression filed into the query and apply a criterion to that.
To create the expression we need to appreciate the action of a couple of Access functions:
- Trim removes leading and trailing spaces from a string of characters.
- Len returns the number of characters in a string of text.
To test this, work in the Immediate Window of your database's Visual Basic Editor. Input ?Trim(" abc "). Press Enter to see the result.
To test that Trim removes leading and trailing spaces only, try
?Trim(" abc def ").
Hence ?Len(" abc def ") returns 11.
Note that Len returns zero when tested on a zero-length string.
?Len("")
By combining there two functions we have the basis of a test for text that consists of space characters only:
?Len(Trim(" "))
For which the result is zero. We use this formula as the expression field ItemLength in our query:
Apply the appropriate criterion, of zero, to complete the query:
Missing Data in Numeric Fields
To test for missing values in a field to which Numeric data type has been applied, you're criterion is Null, as in this example of a query used to list missing values:
Using the logical OR condition, as explained in Logical AND and OR Conditions in Microsoft Access Query Criteria, you can combine searches for values missing from more than one field:
With these results:
Missing Data in Date/Time Fields
As explained in How to Work with Dates and Times in Microsoft Access Databases, Access stores a date as a number. So, for finding missing data in any field to which the Date/Time data type has been applied, you go ahead as you would for missing numbers:
Wrong Values in Text Fields
In the section above we covered missing values in text fields. In this one we consider entries that are wrong, or inappropriate, and therefore should be brought to attention in any data cleansing exercise.
You'd probably agree that an email field into which an address has been input that does not include the @ character needs attention. Let's consider achieving this using the example of these tables from How to Create a Contacts Database in Microsoft Access:
Here's the query that will list missing emails — null and zero-length strings — and those addresses that don't include an @ character:
Note the one-to-many link between the two tables has been adapted for purposes of the query to show all records for people in tblPerson, not just for those that have an entry recorded in tblPersonEmail.
Another example, is when the user hadn't understood the significance of the default value set by the developer, and just let it suffice.
Conclusions
This article addresses some of the issues relating to quality of recorded data. The examples are based on my experience of upgrading databases that have been poorly designed. Had more thought gone into the design, then the possibilities for missing, or wrongly recorded, data could have been minimised.
If you have examples of missing or incorrect values, and you consider including these in this page would make it more useful, then please email me about them.
Your Support for DMW TIPS
Please support this website by making a donation to help keep it free of advertising and to help towards cost of time spent adding new content.
To make a contribution by PayPal in GBP (£ sterling) —
To make a contribution by PayPal in USD ($ US) —