Motivation
We receive thousands of files a year that need to be cleaned and harmonized into a standardized
format. Today, we have to write a custom script for each file. If we keep going this way,
we’ll 🔥 out our team and our budget.
Key Results
🖼 No-code dashboard replaces the need for thousands of lines of custom code
🧹 Automated cleaning and harmonization system ensures highest quality results
📈 System is ready for future growth and scale
Challenge
For a client to onboard new customers, each customer's census data must be ingested into
the client’s database.
Census files are often messy, requiring extensive cleaning before they can be uploaded.
The client’s data science team was on track to spend 1 FTE / year writing hundreds of custom
scripts to clean files.
Preparing for January 2023 with the current process would have taken 100% of the data
foundation team's capacity, blocking any new investments.
Deliverables
No-code configuration and testing dashboard. Empowers the implementation team to define
company-specific business logic and test outputs in a UI.
Smart cleaning package. Extensible package automates file cleaning tasks, removing the need
for company-specific scripts.
Daily cron job to process the latest census files. Deployed, scheduled ETL runs with no need
for manual runs or babysitting.
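As a rough illustration of how these three deliverables could fit together, here is a minimal sketch of config-driven cleaning. It is not the client's actual package: the config keys (column_map, date_columns, required), the function names, and the sample columns are all hypothetical, chosen only to show how per-company rules saved from a dashboard might replace per-company scripts.

```python
from datetime import datetime

# Hypothetical per-company configuration, the kind of thing a no-code
# dashboard might save. Every key and column name here is an assumption.
EXAMPLE_CONFIG = {
    "column_map": {"Emp Name": "name", "DOB": "date_of_birth", "Zip": "zip_code"},
    "date_columns": {"date_of_birth": "%m/%d/%Y"},
    "required": ["name", "date_of_birth"],
}


def clean_row(row, config):
    """Rename columns, trim whitespace, and normalize dates to ISO 8601."""
    cleaned = {}
    for source_name, value in row.items():
        target = config["column_map"].get(source_name.strip())
        if target is None:
            continue  # drop columns the config does not map
        cleaned[target] = value.strip() if isinstance(value, str) else value
    for column, fmt in config["date_columns"].items():
        if cleaned.get(column):
            try:
                parsed = datetime.strptime(cleaned[column], fmt)
                cleaned[column] = parsed.date().isoformat()
            except ValueError:
                # Unparseable dates become None so the quality check flags them.
                cleaned[column] = None
    return cleaned


def quality_errors(row, config):
    """Return a list of problems found in a cleaned row (empty if none)."""
    return [f"missing {c}" for c in config["required"] if not row.get(c)]


def clean_file(rows, config):
    """Clean every row and split the results into passing and failing rows."""
    passed, failed = [], []
    for row in rows:
        cleaned = clean_row(row, config)
        errors = quality_errors(cleaned, config)
        if errors:
            failed.append((cleaned, errors))
        else:
            passed.append(cleaned)
    return passed, failed
```

In this sketch, onboarding a new company means writing a new config rather than a new script, and a scheduled job would only need to call clean_file on each new file with that company's saved config.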
Business Impact
All files for January 2023 launches used the new process, and 100% passed the new quality
control checks.
One implementation team member implemented and monitored cleaning using a no-code
interface.