Decision Tree (before pruning) — v2.4
In production
Recommended model per §5.2: best balance of recall (catching spam) and precision (not over-blocking) for a security use case. Promoted Jun 1, 2026.
Test accuracy
96.59%
Test precision (spam)
93.50%
Test recall (spam)
91.50%
Trained on
5,572 msgs
Full model comparison (§5.1)
Four candidates evaluated on a 70/30 train/test split of the 5,572-message corpus (4,825 ham · 747 spam)
| Model | Train acc. | Test acc. | Test recall | Test precision | Verdict |
|---|---|---|---|---|---|
| Decision Tree (before pruning) | 97.22% | 96.59% | 91.50% | 93.50% | Selected — best balance |
| Decision Tree — pruned | 96.75% | 94.79% | 89.90% | 88.18% | Lower precision |
| Random Forest (before pruning) | 95.83% | 95.42% | 82.89% | 97.49% | Misses 17% of spam |
| Random Forest — pruned | 97.49% | 94.70% | 85.31% | 90.78% | Lower recall |
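One way to make "best balance" concrete is the F1 score, the harmonic mean of precision and recall. The sketch below applies it to the test-set figures in the table. F1 is an assumption chosen here for illustration, not necessarily the criterion used in §5.2.

```python
# Test-set (recall, precision) copied from the comparison table, as fractions.
CANDIDATES = {
    "Decision Tree (before pruning)": (0.9150, 0.9350),
    "Decision Tree (pruned)": (0.8990, 0.8818),
    "Random Forest (before pruning)": (0.8289, 0.9749),
    "Random Forest (pruned)": (0.8531, 0.9078),
}


def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rank candidates by F1 (an illustrative criterion, not the documented one).
ranked = sorted(
    CANDIDATES,
    key=lambda name: f1_score(*CANDIDATES[name]),
    reverse=True,
)
best = ranked[0]

for name in ranked:
    print(f"{name}: F1 = {f1_score(*CANDIDATES[name]):.4f}")
```

Under this criterion the unpruned Decision Tree ranks first, which agrees with the verdict column.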
Why this trade-off is acceptable: 93.5% precision means roughly 6.5% of the messages quarantined as spam are actually legitimate (not 6.5% of all legitimate messages). Because spam is held rather than deleted, and false positives are released within 1 business hour, the security upside of catching 91.5% of spam outweighs that inconvenience cost (§5.2).
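Precision is measured over the messages the model flags, so its complement is the share of quarantined messages that are legitimate, not the share of legitimate messages that get quarantined. A minimal sketch of the arithmetic follows. The test-set counts are hypothetical, derived by assuming the 30% test split keeps the corpus's ham/spam proportions, and the function name is illustrative.

```python
def quarantine_estimates(n_spam: int, n_ham: int,
                         precision: float, recall: float) -> dict:
    """Estimate confusion-matrix quantities from precision and recall."""
    true_pos = recall * n_spam        # spam correctly quarantined
    flagged = true_pos / precision    # everything quarantined
    false_pos = flagged - true_pos    # legitimate mail quarantined
    return {
        "missed_spam": n_spam - true_pos,
        "false_positives": false_pos,
        "share_of_quarantine_legit": false_pos / flagged,  # equals 1 - precision
        "share_of_ham_quarantined": false_pos / n_ham,     # false-positive rate
    }


# Hypothetical test-set sizes: about 30% of 747 spam and 4,825 ham
# (assumes a stratified split, which the source does not state).
est = quarantine_estimates(n_spam=224, n_ham=1448,
                           precision=0.935, recall=0.915)
for key, value in est.items():
    print(f"{key}: {value:.4f}")
```

Under these assumptions the share of legitimate mail that is quarantined comes out far smaller than 6.5%, because ham outnumbers spam by more than six to one.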
Last retraining run — Jun 1, 2026
Passed accuracy gate
Pulled training data
1,842 confirmed feedback corrections + last 30 days of classified messages → corpus grew to 7,414 labelled examples
Retrained Decision Tree on expanded corpus
Completed in 2h 41m · within the 4-hour monthly window (§4.4 Performance)
Evaluated against accuracy gate
Precision (need ≥ 93%)
93.8% ✓
Recall (need ≥ 90%)
91.2% ✓
Promoted to production as v2.4
Approved by Devon Reyes (ML Engineer) · v2.3 archived with rollback capability — human approval required per §4.3
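The gate and approval steps above can be sketched as a small check. The thresholds (precision ≥ 93%, recall ≥ 90%) and the human-approval requirement come from this page. The class, function, and parameter names are hypothetical and do not describe the actual pipeline code.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PRECISION = 0.93  # gate threshold stated on this page
MIN_RECALL = 0.90     # gate threshold stated on this page


@dataclass
class Evaluation:
    precision: float
    recall: float


def passes_gate(ev: Evaluation) -> bool:
    """True only if both metrics meet their minimum thresholds."""
    return ev.precision >= MIN_PRECISION and ev.recall >= MIN_RECALL


def can_promote(ev: Evaluation, approved_by: Optional[str]) -> bool:
    """Promotion needs a passing gate and a named human approver."""
    return passes_gate(ev) and bool(approved_by)


# Figures from the Jun 1, 2026 run.
v2_4 = Evaluation(precision=0.938, recall=0.912)
print(can_promote(v2_4, approved_by="Devon Reyes"))  # True
print(can_promote(v2_4, approved_by=None))           # False: no human approval
```

Keeping the metric gate and the approval check as separate functions mirrors the two distinct requirements: a model can clear the thresholds and still not be promotable until someone signs off.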
Pipeline schedule
Version history
Every promoted (and rejected) version is retained with rollback capability