First, let’s talk about dates, a very important topic when it comes to standardisation. After that, we will look at ways to obtain standardised values.
Writing dates can be confusing, but there is actually only one proper way to write a date (ISO 8601, that is YYYY-MM-DD), as illustrated in this comic:
CC BY-NC 2.5, xkcd, https://xkcd.com/1179/
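If the dates already exist in several spellings, they can be converted to ISO 8601 programmatically. The following is a minimal Python sketch, not a definitive solution: the function name `to_iso_8601` and the list `ASSUMED_FORMATS` are hypothetical, and the input formats are assumptions that you would need to adapt to the formats that really occur in your data.

```python
from datetime import datetime

# Hypothetical list of input formats assumed to occur in the data.
# Order matters: the first matching format wins, so an ambiguous input
# such as 02/03/2013 is read as day/month/year here.
ASSUMED_FORMATS = ["%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y"]


def to_iso_8601(text):
    """Return the date in `text` as YYYY-MM-DD, or raise ValueError."""
    cleaned = text.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed format
    raise ValueError(f"unrecognised date format: {text!r}")
```

Raising an error for unknown formats, instead of guessing, keeps ambiguous entries visible so that a person can decide how to read them.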
Using a controlled vocabulary is an easy way to make sure your data is uniform and standardised throughout your document.
Let’s look at the column “Sample handling” of the CABS-data sheet. There are actually only two accepted values for this column,
one of them being “Fresh”. But because of spelling errors, or because different people enter the data and describe the sample handling in different ways, the column also contains other variants of these values.
Such inconsistencies are errors in your data and might lead to problems when you examine it or do calculations with it.
Introducing a controlled vocabulary makes sure that only an accepted value can be selected.
A controlled vocabulary (dropdown list) can easily be set up in Excel or LibreOffice Calc.
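For data that has already been entered, the same idea can be used to find values that fall outside the vocabulary. Here is a minimal Python sketch under stated assumptions: “Fresh” is taken from the text, while “Frozen” is only a hypothetical stand-in for the second accepted value, and the example column and the function name `find_invalid_values` are made up for illustration.

```python
# "Fresh" comes from the text; "Frozen" is a hypothetical placeholder
# for the second accepted value of the column.
SAMPLE_HANDLING_VOCABULARY = {"Fresh", "Frozen"}


def find_invalid_values(values, vocabulary):
    """Return (row_index, value) pairs for entries not in the vocabulary."""
    return [(i, v) for i, v in enumerate(values) if v not in vocabulary]


# Illustrative column content, not taken from the real data sheet.
column = ["Fresh", "fresh", "Frozen", "Fresh ", "frozn"]
invalid = find_invalid_values(column, SAMPLE_HANDLING_VOCABULARY)

for row, value in invalid:
    print(f"row {row}: {value!r} is not in the controlled vocabulary")
```

The comparison is deliberately exact, so differences in case and stray spaces are reported rather than silently accepted.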
Let us take another look at the column “Group”. Since it also stores differently written values (such as “ards”), one might think of using a controlled vocabulary here, too. But the value of this column can actually be described as a Boolean expression:
true or false, because a sample either belongs to “ARDS” or it does not. So every “ards” value will be replaced by true, and every “non ARDS” value will be replaced by false.
By replacing individual values with a common expression, you have standardised your data in yet another way.
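Replacing the entries of the “Group” column by Boolean values can be sketched in Python as follows. This is only an illustration: the function name `group_to_bool` is hypothetical, and the sketch assumes that “ARDS” maps to true, that “non ARDS” maps to false, and that spellings differ only in case, spacing and hyphens.

```python
def group_to_bool(value):
    """Map a "Group" entry to True (ARDS) or False (non ARDS)."""
    # Normalise case, surrounding spaces, hyphens and repeated spaces.
    normalised = " ".join(value.strip().lower().replace("-", " ").split())
    if normalised == "ards":
        return True
    if normalised == "non ards":
        return False
    # Anything else is left for a person to inspect, not guessed.
    raise ValueError(f"unexpected Group value: {value!r}")


# Illustrative entries, not taken from the real data sheet.
groups = ["ARDS", "ards", "non ARDS", "Non-ARDS"]
standardised = [group_to_bool(g) for g in groups]
```

Unexpected entries raise an error here, so that a typo cannot be silently turned into true or false.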