What a massive dense pancake can teach us about data hygiene



Liam Thorp was not supposed to be next in line for a COVID-19 vaccine.  He’s a 32-year-old Liverpool native with no underlying health conditions.  He describes himself as chunky but not obese.

So why did he get the call?  Because the database had, instead of listing him as 6’2″, listed him as 6.2 centimeters tall, approximately the distance between your fingers when you say “missed it by that much.”

Because of this, plus a correct weight in the system, his body mass index was 28,000, making him a massive dense pancake, weighing over 200 pounds but less than three inches high.

This need not have happened.  But it did, because of a lack of data guardrails.  Two lines of code that raised a flag when someone was an adult and, say, under two feet or over nine feet tall would have sufficed.
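Here is roughly what such a guardrail could look like.  This is a minimal sketch, not the health system's actual code: the function name, the age cutoff of 18, and the choice to store height in centimeters are all assumptions made for illustration.  The two-foot and nine-foot limits come from the paragraph above.

```python
# Hypothetical guardrail: flag adult heights outside a plausible range.
MIN_ADULT_CM = 61.0    # about two feet
MAX_ADULT_CM = 274.0   # about nine feet
ADULT_AGE = 18         # assumed cutoff for "adult"


def height_flag(height_cm, age_years):
    """Return a warning string if an adult's height is implausible, else None."""
    if age_years >= ADULT_AGE and not (MIN_ADULT_CM <= height_cm <= MAX_ADULT_CM):
        return f"Implausible adult height: {height_cm} cm"
    return None


if __name__ == "__main__":
    # A 32-year-old recorded as 6.2 cm tall gets flagged for review.
    print(height_flag(6.2, 32))
```

The check does not try to guess the right value.  It only raises a flag so a person can look at the record before anything is calculated from it.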

For us, the cost of poor data quality is high.  A 1% donor loss rate from lack of deliverability (or from calling someone the wrong name, turning them off from donating) compounds, just like a 1% increase in retention does.

We can solve this with some simple data guardrails of our own.  For example, correct email addresses all have both an @ and a period in them.  Neither comes at the end of the email address.  Nor do email addresses end in .con.  And there are certain characters that aren’t allowed.
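Those rules translate almost line for line into code.  The sketch below is one way to write them, under some loud assumptions: the function name is made up, and the set of disallowed characters is a small illustrative sample rather than the full rulebook for email addresses.  It is a screening step to catch obvious typos, not a complete validator.

```python
# Illustrative sample only; an assumption, not the complete list of
# characters that email standards restrict.
DISALLOWED = set(" ,;<>()[]")


def email_problems(address):
    """Return a list of rule violations found in an email address."""
    address = address.strip()
    problems = []
    if "@" not in address:
        problems.append("missing @")
    if "." not in address:
        problems.append("missing period")
    if address.endswith(("@", ".")):
        problems.append("ends with @ or period")
    if address.lower().endswith(".con"):
        problems.append("ends in .con")
    bad = sorted(set(address) & DISALLOWED)
    if bad:
        problems.append("disallowed character(s): " + ", ".join(repr(c) for c in bad))
    return problems


if __name__ == "__main__":
    for candidate in ["donor@example.com", "donor@example.con", "donor.example.com"]:
        print(candidate, email_problems(candidate))
```

Returning a list of problems, rather than a simple yes or no, makes it easier to route each record to the right fix: a .con ending is probably a typo for .com, while a missing @ needs a human to look at it.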

Yes, your email system will catch these errors, at worst marking them as a hard bounce.  But if that email goes out, it can count against deliverability scores — each email sent to an out-of-date address is a black mark in your virtual book, making every other email less likely to get there.  And, according to EveryAction, over 20% of nonprofit email goes to spam filters.  If you have 100,000 email addresses, each percentage point of email going to spam costs you over $1,000.

And email addresses aren’t the only data where it pays to install guardrails.  The average nonprofit finds about five percent of its mail undeliverable, much of which can be minimized through regular address verification and NCOA (National Change of Address) processing.  Beyond that, though, there are prioritization guardrails that can be set up.  For example, does your caging vendor use the NCOA’ed address in your database or the address printed on the check to enter white mail?  Considering my wife and I just used the last check that had an address on it from three states and five moves ago, I’d recommend the former.

In short, human beings are imperfect.  We strive upwards, but this is always true.  If we can set up a few rules that make sure we aren’t accidentally mailing Nr. and Nrs. So-and-So, we will be better off.
