Business scenario where data cleaning matters for investor decisions
Why Clean Data Matters to Investors
Before you can forecast, you need data you can trust. Investors lose confidence when analysts present messy, unverified data.
Sarah's café client needs a weekend forecast to plan inventory and staffing. The POS system exports data that looks ready but has hidden problems that will distort any analysis built on it.
What's Wrong
- Dates stored as text (can't do time analysis)
- Prices include $ signs (break formulas)
- Duplicate transaction rows
- Product names with inconsistent spacing
- Missing values where the system timed out
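The five problems above can be handled in one pass. The sketch below is illustrative only: the column names (transaction_id, date, product, price), the sample rows, and the YYYY-MM-DD date format are assumptions, not the café's actual export. It treats a repeated transaction_id as a duplicate, and it flags missing prices instead of guessing a value.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical rows shaped like a POS export: every value arrives as text.
raw_rows = [
    {"transaction_id": "T1", "date": "2024-03-02", "product": " Latte ", "price": "$4.50"},
    {"transaction_id": "T1", "date": "2024-03-02", "product": " Latte ", "price": "$4.50"},  # duplicate
    {"transaction_id": "T2", "date": "2024-03-02", "product": "Latte", "price": "$4.50"},
    {"transaction_id": "T3", "date": "2024-03-03", "product": "Muffin", "price": ""},  # missing price
]


def clean_rows(rows, date_format="%Y-%m-%d"):
    """Return (cleaned_rows, log); log counts what each cleaning step changed."""
    log = {"rows_in": len(rows), "duplicates_dropped": 0, "missing_price": 0}
    seen_ids = set()
    cleaned = []
    for row in rows:
        # Duplicates: keep only the first row for each transaction ID.
        txn_id = row["transaction_id"].strip()
        if txn_id in seen_ids:
            log["duplicates_dropped"] += 1
            continue
        seen_ids.add(txn_id)

        # Prices: strip the currency symbol and separators, then parse.
        price_text = row["price"].replace("$", "").replace(",", "").strip()
        price = Decimal(price_text) if price_text else None
        if price is None:
            log["missing_price"] += 1

        cleaned.append({
            "transaction_id": txn_id,
            # Dates: convert text to a real date so time analysis works.
            "date": datetime.strptime(row["date"].strip(), date_format).date(),
            # Product names: trim and collapse inconsistent spacing.
            "product": " ".join(row["product"].split()),
            "price": price,
            # Missing values: keep the row but flag it rather than fill it in.
            "price_missing": price is None,
        })
    log["rows_out"] = len(cleaned)
    return cleaned, log


cleaned, log = clean_rows(raw_rows)
print(log)
```

The log doubles as documentation: it records how many rows were dropped or flagged, which is the kind of evidence an investor can check.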
Why Investors Care
- Dirty data → unreliable forecasts
- Unverified data → audit failures
- Documented cleaning → trust and credibility
- Clean pipeline → reproducible analysis
"How do you know this data is accurate enough to base decisions on?"
Your answer shapes whether investors trust your forecast. Show documented cleaning steps, before/after metrics, and data quality flags. If you can't prove the data is clean, they won't trust your model.
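One way to produce before/after metrics is to run the same quality checks on the raw export and on the cleaned data, then show the two reports side by side. The sketch below is a minimal example under stated assumptions: the function name quality_report, the column names, the sample rows, and the expected date format are all hypothetical.

```python
from datetime import datetime


def quality_report(rows, date_format="%Y-%m-%d"):
    """Count common data quality problems in a list of text-valued rows."""

    def date_ok(text):
        try:
            datetime.strptime(text.strip(), date_format)
            return True
        except ValueError:
            return False

    ids = [row["transaction_id"] for row in rows]
    return {
        "rows": len(rows),
        "duplicate_ids": len(ids) - len(set(ids)),
        "missing_prices": sum(1 for row in rows if not row["price"].strip()),
        "unparseable_dates": sum(1 for row in rows if not date_ok(row["date"])),
        "distinct_products": len({row["product"] for row in rows}),
    }


# Hypothetical raw export (before) and cleaned data (after).
before = [
    {"transaction_id": "T1", "date": "2024-03-02", "product": " Latte ", "price": "$4.50"},
    {"transaction_id": "T1", "date": "2024-03-02", "product": " Latte ", "price": "$4.50"},
    {"transaction_id": "T2", "date": "03/02/2024", "product": "Latte", "price": "$4.50"},
    {"transaction_id": "T3", "date": "2024-03-03", "product": "Muffin", "price": ""},
]
after = [
    {"transaction_id": "T1", "date": "2024-03-02", "product": "Latte", "price": "4.50"},
    {"transaction_id": "T2", "date": "2024-03-02", "product": "Latte", "price": "4.50"},
    {"transaction_id": "T3", "date": "2024-03-03", "product": "Muffin", "price": ""},
]

report_before = quality_report(before)
report_after = quality_report(after)
for metric in report_before:
    print(f"{metric:20} before={report_before[metric]} after={report_after[metric]}")
```

In this sample the missing price count stays at 1 after cleaning. That is deliberate: a gap that is reported as a known limitation is more credible than one that was silently filled in.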
1. Sarah receives the café's raw weekend sales data. The dates are formatted as text, prices include currency symbols, and some rows are duplicated. What do you clean first?
2. An investor asks: 'How do you know this data is reliable?' What data cleaning evidence should you show?
3. The POS system exported product names with extra spaces: ' Latte ' vs 'Latte'. Why does this matter for analysis?
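To explore question 3, it can help to count sales per product before and after trimming the names. This is a small illustration with made-up product names, not real café data.

```python
from collections import Counter

# Hypothetical product names as they might appear in the export.
sales = [" Latte ", "Latte", "Latte ", "Muffin"]

# Counting the raw text treats each spacing variant as a separate product.
raw_counts = Counter(sales)

# Trimming and collapsing whitespace merges the variants into one product.
clean_counts = Counter(" ".join(name.split()) for name in sales)

print(len(raw_counts), "products before cleaning")
print(len(clean_counts), "products after cleaning")
print(clean_counts["Latte"], "latte sales after cleaning")
```

Without the trim step, latte sales are split across several rows in any per-product total, so each variant looks like a slow seller and the forecast understates demand for the real product.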
Discussion Prompt (3 minutes):
What would you do if you had 30 minutes to clean this data before a 2pm investor meeting?
- Which cleaning steps give you the most confidence quickly?
- What would you skip if time runs out?
- What would you tell the investor about your data limitations?