
From Traffic Violations to Safety Culture: My Data Analytics Framework
A detailed breakdown of my data analysis approach, from raw traffic violation data to actionable safety insights that transform fleet operations.
Peter Mangoro
Creating impactful data analytics solutions isn't about crunching numbers in isolation; it's about building bridges between complex datasets and real-world decision-making. After working through multiple analytics projects focused on operational efficiency and safety, I've developed a methodology that transforms messy, real-world data into clear insights that drive action.
In this article, I'll walk you through my end-to-end analytics process, from initial data exploration to stakeholder-ready dashboards, using my recent FleetSafe Traffic Violation Analytics project as a case study.
Phase 1: Problem Discovery & Stakeholder Alignment
Every great analytics project starts with understanding the human problem behind the data. For FleetSafe, the challenge wasn't just reducing traffic violations; it was creating a culture of proactive safety while navigating the delicate balance between monitoring and employee trust.
Understanding the Business Context
The transportation industry faces mounting pressure to improve safety metrics while managing costs. Traffic violations don't just result in fines; they trigger insurance hikes, affect company reputation, and, most importantly, put lives at risk. But here's the paradox: aggressive monitoring can feel punitive to drivers, leading to dissatisfaction and turnover.
Defining Success Metrics
Before diving into the data, I collaborated with safety leads and operations managers to establish clear, measurable outcomes:
- Reduce overall traffic violations by 20% year-over-year
- Identify specific demographic and temporal patterns for targeted intervention
- Create actionable insights that empower rather than punish drivers
- Build a dashboard that operations teams actually want to use daily
Data Assessment & Challenges
Starting with roughly 2.8 million traffic violation records from 2023, alongside 2022 data for year-over-year comparison, I quickly identified several data quality challenges:
- Column naming inconsistencies across files
- Weather data requiring temporal alignment with violation timestamps
- Missing standardization between violation descriptions and codes
- Geographic data limitations requiring creative analysis approaches
Phase 2: Exploratory Data Analysis & Pattern Discovery
With clear objectives defined, I moved into deep exploratory data analysis using Jupyter notebooks, my preferred environment for data investigation and hypothesis generation.
Initial Data Exploration
My first steps involved understanding the basic shape and quality of the dataset:
# Data shape and missing value analysis
print(f"2023 Dataset: {df_2023.shape}")  # 2,795,593 violations
# Share of missing values per column, worst first
print(f"Missing Values:\n{df_2023.isna().mean().sort_values(ascending=False)}")
Key findings emerged immediately:
- Only 1.6% missing age data, exceptionally good for real-world data
- Nearly 75% of violations came from male drivers
- Top 5 violation codes represented over 40% of all incidents
- Clear seasonal patterns with summer months showing elevated violation rates
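Headline figures like these come from a handful of simple aggregations. The following is a minimal sketch rather than the project's actual code: `Age at Violation` is the column name used elsewhere in this article, while `Gender`, `Violation Code`, and the function name `summarize_violations` are assumptions chosen for illustration, and the sample rows are made up.

```python
import pandas as pd

def summarize_violations(df, age_col="Age at Violation",
                         gender_col="Gender", code_col="Violation Code",
                         top_n=5):
    """Return headline data-quality and distribution figures as a dict."""
    # Non-numeric ages are treated as missing
    ages = pd.to_numeric(df[age_col], errors="coerce")
    code_share = df[code_col].value_counts(normalize=True)
    gender_share = df[gender_col].value_counts(normalize=True) * 100
    return {
        "missing_age_pct": round(ages.isna().mean() * 100, 1),
        "gender_share_pct": gender_share.round(1).to_dict(),
        # value_counts sorts descending, so head(top_n) is the top N codes
        "top_codes_share_pct": round(code_share.head(top_n).sum() * 100, 1),
    }

# Tiny illustrative frame (made-up rows, not the FleetSafe data)
sample = pd.DataFrame({
    "Age at Violation": [34, 27, None, 45],
    "Gender": ["M", "M", "F", "M"],
    "Violation Code": ["A1", "A1", "B2", "C3"],
})
summary = summarize_violations(sample, top_n=1)
```

Returning a plain dict keeps the summary easy to log in a notebook or to feed into dashboard metric tiles later.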
Demographic Analysis Revelations
Digging deeper into age demographics revealed fascinating patterns:
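# Age group distribution
import pandas as pd

# Non-numeric ages become NaN and end up without an age group
ages = pd.to_numeric(df_2023['Age at Violation'], errors='coerce')
bins = [0, 17, 25, 35, 45, 55, 65, 200]
labels = ['<=17', '18-25', '26-35', '36-45', '46-55', '56-65', '66+']
# right=True makes each bin include its upper edge, e.g. (25, 35]
df_2023['age_group'] = pd.cut(ages, bins=bins, labels=labels, right=True)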
The 26-35 age group dominated violation statistics, not because they're inherently riskier drivers, but because they represent peak professional driving years with higher exposure to traffic situations.
Temporal Pattern Analysis
By analyzing violation patterns across months and weekdays, clear operational insights emerged:
- Tuesday-Thursday showed consistently higher violation rates than Monday or Friday
- Summer months (June-August) showed 15% higher violation rates than winter months
- Weather correlations revealed that "clear weather" violations often related to speeding behavior
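The weekday and seasonal comparisons above reduce to counting violations per weekday and per month. Here is a minimal sketch under assumptions: the timestamp column is called `datetime` (as in the merge code later in this article), the function names are illustrative, and the sample rows are invented.

```python
import pandas as pd

def temporal_counts(df, ts_col="datetime"):
    """Count violations per weekday and per calendar month."""
    # Unparseable timestamps become NaT and are dropped
    ts = pd.to_datetime(df[ts_col], errors="coerce").dropna()
    weekday_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
                     "Friday", "Saturday", "Sunday"]
    by_weekday = (ts.dt.day_name().value_counts()
                  .reindex(weekday_order, fill_value=0))
    by_month = ts.dt.month.value_counts().reindex(range(1, 13), fill_value=0)
    return by_weekday, by_month

def summer_vs_winter_ratio(by_month):
    """Mean monthly count for Jun-Aug divided by the mean for Dec-Feb."""
    summer = by_month.loc[[6, 7, 8]].mean()
    winter = by_month.loc[[12, 1, 2]].mean()
    return summer / winter

# Made-up rows for illustration only
sample = pd.DataFrame({"datetime": [
    "2023-01-10", "2023-06-13", "2023-06-14", "2023-07-05", "not a date",
]})
by_weekday, by_month = temporal_counts(sample)
ratio = summer_vs_winter_ratio(by_month)
```

Reindexing to a fixed weekday and month order keeps charts stable even when a period has no violations at all.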
Phase 3: Building Interactive Dashboards for Decision Makers
Raw analysis doesn't drive change; accessible visualizations do. I developed Streamlit and Tableau dashboards that transformed complex statistical findings into intuitive, actionable insights.
Design Philosophy: Dark Theme for Operations Environments
Understanding that fleet operations centers operate 24/7 with varying lighting conditions, I implemented a dark theme designed for:
- Reduced eye strain during long monitoring sessions
- Better contrast for critical safety metrics
- Professional appearance suitable for management presentations
Dashboard Architecture
I structured the dashboard around decision-making workflows:
1. Year-over-Year Comparison
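# Comparative trend analysis
import pandas as pd
import plotly.express as px

def create_yoy_comparison(df_2022, df_2023, mapping):
    # Robust handling of different temporal formats; the helper is assumed to
    # return a DataFrame with 'month' and 'count' columns
    frames = [aggregate_monthly_violations(df_2023, mapping).assign(year='2023')]
    if df_2022 is not None:  # `if df_2022` raises ValueError on a DataFrame
        frames.append(aggregate_monthly_violations(df_2022, mapping).assign(year='2022'))
    combined_data = pd.concat(frames, ignore_index=True)
    return px.line(combined_data, x='month', y='count', color='year')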
2. Seasonal Campaign Planning
The seasonal analysis visualization helped safety managers identify optimal timing for targeted campaigns:
- Spring Training: Address end-of-winter driving habits
- Summer Awareness: Combat peak violation season
- Fall Refresher: Prepare for winter driving conditions
3. Demographic Targeting
Through interactive heatmaps showing violation types by age group, we identified:
- Speeding tickets: Concentrated in 18-35 age range
- License/registration violations: Predominantly 25-45 age group
- Equipment violations: Highest among 35-55 age range
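The data behind a heatmap like this is a cross-tabulation of age group against violation type. The sketch below is an assumption about how such a table could be built, not the dashboard's actual code: `age_group` matches the column created earlier, while `violation_type` and the function name are hypothetical, and the rows are invented.

```python
import pandas as pd

def violation_mix_by_age(df, age_group_col="age_group",
                         type_col="violation_type"):
    """Share of each violation type within every age group (rows sum to 1)."""
    # normalize="index" divides each row by its total, so large age groups
    # do not dominate the color scale just because they have more records
    return pd.crosstab(df[age_group_col], df[type_col], normalize="index")

# Made-up rows for illustration only
sample = pd.DataFrame({
    "age_group": ["18-25", "18-25", "26-35", "36-45"],
    "violation_type": ["Speeding", "Speeding", "Registration", "Equipment"],
})
mix = violation_mix_by_age(sample)
```

The resulting table can be handed to a heatmap function such as `plotly.express.imshow`. Row-normalizing is a design choice: it answers "what does each age group get cited for?" rather than "which age group has the most citations?".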
4. Gender-Aware Training Programs
Stacked bar charts revealed gender distribution patterns that informed inclusive training approaches:
- Male drivers: Higher rates of aggressive violation types
- Female drivers: Different violation patterns requiring tailored messaging
Phase 4: Technical Implementation & Data Pipeline Challenges
Building production-ready analytics tools requires solving real-world data integration challenges.
Column Mapping Flexibility
Real datasets never match perfect schemas. I implemented dynamic column mapping:
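import numpy as np

def map_required_columns(df, mapping):
    """Handle varying column names across different data sources"""
    out = df.copy()
    # mapping: {standard_name: source_name}; source_name may be None
    for std_col, src in mapping.items():
        if src and src in out.columns:
            out[std_col] = out[src]
        else:
            # Missing source column: keep the schema stable with NaN
            out[std_col] = np.nan
    return out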
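To show how this mapping might be used, here is a short usage sketch. The function is repeated in compact form so the snippet runs on its own; the source headers (`Viol Date`, `Driver Age`) and the standard names are hypothetical, not the real FleetSafe schema.

```python
import numpy as np
import pandas as pd

def map_required_columns(df, mapping):
    """Same logic as the function above, repeated so this snippet is standalone."""
    out = df.copy()
    for std_col, src in mapping.items():
        out[std_col] = out[src] if src and src in out.columns else np.nan
    return out

# Hypothetical 2022 export whose headers differ from the standard names
raw_2022 = pd.DataFrame({"Viol Date": ["2022-03-01"], "Driver Age": [41]})
mapping_2022 = {"datetime": "Viol Date", "age": "Driver Age", "gender": None}

std_2022 = map_required_columns(raw_2022, mapping_2022)

# Standard columns that could not be filled from this source
missing = [col for col in mapping_2022 if std_2022[col].isna().all()]
```

Listing the all-NaN standard columns afterwards gives a quick way to warn the user that a chart depending on them (here, anything by gender) will be empty for this file.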
Weather Data Integration
The greatest technical challenge was aligning weather data (daily observations) with violation data (individual incidents):
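# Robust weather-merging logic
import pandas as pd

if weather_data_available:
    # Reduce both timestamps to calendar dates so the join keys share a type
    wx['wx_day'] = pd.to_datetime(wx['datetime']).dt.date
    violations['day'] = pd.to_datetime(violations['datetime']).dt.date
    # One weather row per day, without the raw timestamp column
    weather_lookup = wx.drop(columns='datetime').drop_duplicates(subset='wx_day')
    # Day-level merge with type alignment
    merged_data = violations.merge(weather_lookup, left_on='day', right_on='wx_day')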
This approach revealed that clear weather conditions correlated with higher violation rates, likely because drivers engage in riskier behaviors when conditions seem "safe."
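One caveat when reading a weather correlation is that clear days are usually far more common than rainy or snowy ones, so raw counts can overstate the effect. A rough way to check is to compare violations per day under each condition. The sketch below assumes a `conditions` column in the weather data and uses invented rows; it is not the project's actual analysis.

```python
import pandas as pd

def violations_per_day_by_weather(violations, weather, cond_col="conditions"):
    """Average daily violation count for each weather condition."""
    daily = (violations
             .assign(day=pd.to_datetime(violations["datetime"]).dt.date)
             .groupby("day").size().rename("violations"))
    wx = weather.assign(day=pd.to_datetime(weather["datetime"]).dt.date)
    wx = wx.drop_duplicates("day").set_index("day")
    # Days with weather but no violations count as zero, not as missing
    joined = wx[[cond_col]].join(daily).fillna({"violations": 0})
    return joined.groupby(cond_col)["violations"].mean()

# Made-up rows for illustration only
weather = pd.DataFrame({
    "datetime": ["2023-07-01", "2023-07-02", "2023-07-03"],
    "conditions": ["Clear", "Clear", "Rain"],
})
violations = pd.DataFrame({"datetime": [
    "2023-07-01 08:15", "2023-07-01 17:40",
    "2023-07-02 09:05", "2023-07-03 12:00",
]})
rates = violations_per_day_by_weather(violations, weather)
```

In this toy data, clear days account for three of four violations, but the per-day rate (1.5 versus 1.0) is the fairer comparison.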
Real-Time Updates & Scalability
I designed the dashboard architecture to handle growth:
- Cached data loading for performance
- Modular visualization components for easy expansion
- Responsive design for various screen sizes
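As an illustration of the cached-loading idea, here is a minimal sketch using the standard library's `functools.lru_cache`. This is a stand-in chosen so the example runs without a web framework; in a Streamlit app the built-in `st.cache_data` decorator plays the same role. The function names and file contents are assumptions.

```python
import tempfile
from functools import lru_cache
from pathlib import Path

import pandas as pd

@lru_cache(maxsize=4)
def load_violations(path: str) -> pd.DataFrame:
    """Read a violations CSV once per path; later calls reuse the result."""
    return pd.read_csv(path)

def get_violations(path: str) -> pd.DataFrame:
    # Hand out a copy so callers cannot mutate the cached frame
    return load_violations(path).copy()

# Demonstrate with a throwaway file containing made-up rows
with tempfile.TemporaryDirectory() as tmp:
    csv_path = str(Path(tmp) / "violations_sample.csv")
    Path(csv_path).write_text("datetime,age\n2023-07-01,34\n2023-07-02,27\n")
    first = get_violations(csv_path)
    second = get_violations(csv_path)  # served from the cache

cache_stats = load_violations.cache_info()
```

Returning a copy costs some memory but avoids a subtle bug where one chart's filtering step silently changes the data every other chart sees.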
Phase 5: Stakeholder Feedback & Iteration
The most valuable analytics projects evolve through stakeholder collaboration.
Initial Feedback Sessions
Presenting the dashboard to safety managers revealed critical usability issues:
- "The weather correlation is interesting, but how do I use this practically?"
- "Can I filter by driver groups or routes?"
- "Is there a way to track intervention effectiveness?"
Dashboard Refinements
Based on feedback, I enhanced the dashboard with:
- Actionable Insight Cards: Instead of just showing "26-35 age group has more violations," the dashboard now suggests "Target speeding awareness campaigns to 26-35 age group during summer months."
- Drill-Down Capabilities: Interactive filters allowing managers to focus on specific violation types, time periods, or driver demographics.
- Campaign Planning Tools: Visual guidance for optimal timing of safety initiatives.
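An insight card of the kind described above can be generated with a simple rule: find the most common age group and season for a violation type, then fill a sentence template. This is a hypothetical sketch; the column names (`violation_type`, `age_group`, `datetime`), the season mapping, and the rows are assumptions rather than the dashboard's actual logic.

```python
import pandas as pd

# Meteorological seasons for the Northern Hemisphere
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "fall", 10: "fall", 11: "fall"}

def insight_card(df, violation_type):
    """Suggest a campaign from the top age group and season for one type."""
    subset = df[df["violation_type"] == violation_type]
    if subset.empty:
        return f"No {violation_type.lower()} violations in the selected data."
    top_age = subset["age_group"].value_counts().idxmax()
    seasons = pd.to_datetime(subset["datetime"]).dt.month.map(SEASONS)
    top_season = seasons.value_counts().idxmax()
    return (f"Target {violation_type.lower()} awareness campaigns to the "
            f"{top_age} age group during {top_season} months.")

# Made-up rows for illustration only
sample = pd.DataFrame({
    "violation_type": ["Speeding", "Speeding", "Speeding", "Equipment"],
    "age_group": ["26-35", "26-35", "18-25", "46-55"],
    "datetime": ["2023-06-10", "2023-07-04", "2023-01-15", "2023-03-02"],
})
card = insight_card(sample, "Speeding")
```

A template this simple only reports the largest bucket; it says nothing about whether the difference from other groups is meaningful, so it works best alongside the underlying chart.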
Training Impact Tracking
The dashboard evolved to include pre/post-campaign analysis capabilities, enabling measurement of intervention effectiveness.
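A basic pre/post comparison counts violations in equal-length windows before and after a campaign date. The sketch below is one simple way to do that, with a hypothetical function name and invented rows. It does not control for seasonality, so comparing against the same window in the prior year would be a sensible cross-check.

```python
import pandas as pd

def pre_post_change(df, campaign_date, window_days=90, ts_col="datetime"):
    """Percent change in violation count after a campaign vs. before it."""
    ts = pd.to_datetime(df[ts_col])
    start = pd.Timestamp(campaign_date)
    window = pd.Timedelta(days=window_days)
    pre = ((ts >= start - window) & (ts < start)).sum()
    post = ((ts >= start) & (ts < start + window)).sum()
    if pre == 0:
        return None  # no baseline to compare against
    # Both windows have equal length, so the counts compare directly
    return (post - pre) / pre * 100

# Made-up rows for illustration only
sample = pd.DataFrame({"datetime": [
    "2023-05-02", "2023-05-15", "2023-05-20", "2023-05-28",
    "2023-06-03", "2023-06-18", "2023-06-25",
]})
change = pre_post_change(sample, "2023-06-01", window_days=30)
```

Here four violations fall in the 30 days before June 1 and three in the 30 days after, giving a change of -25%.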
Phase 6: Building Analytics Culture
True analytics success creates behavioral change beyond the initial metrics.
Manager Empowerment
The dashboard transformed fleet managers from data consumers into data-driven decision makers.
Driver-Focused Insights
Rather than focusing purely on punitive measures, the analytics revealed opportunities for supportive intervention:
- Targeted Training: Different messaging for different demographic groups
- Seasonal Preparation: Proactive education before high-risk periods
- Weather-Aware Campaigns: Contextual safety messaging based on conditions
Organizational Change
The project catalyzed broader organizational shifts:
- Proactive Safety Culture: Moving from reactive violation management to predictive safety planning
- Data-Driven Decisions: Establishing analytics as core operational practice
- Cross-Functional Collaboration: Breaking down silos between operations, safety, and training teams
Results & Impact Measurement
Six months post-implementation, FleetSafe has achieved measurable outcomes:
Quantitative Success Metrics
- 23% reduction in total traffic violations year-over-year
- 31% decrease in high-severity violation types
- 18% improvement in driver satisfaction scores
- $127k in cost savings from reduced insurance premiums and violation fees
Qualitative Transformations
- Cultural Shift: Safety conversations moved from punitive to supportive
- Evidence-Based Planning: Training resources allocated based on data-driven insights
- Cross-Team Alignment: Operations and safety teams using shared data language
Lessons Learned & Future Opportunities
Key Insights
Data Quality is Everything: Investing time upfront in robust data cleaning and validation pays exponential dividends in analysis accuracy.
User Experience Drives Adoption: Beautiful, intuitive dashboards get used; complex statistical reports gather dust.
Context Creates Action: Numbers alone change nothing; actionable insights within business context drive behavior.
Technical Learnings
- Flexible Architecture: Build systems that accommodate evolving data sources and changing business requirements
- Progressive Disclosure: Start simple, add complexity gradually
- Performance Matters: Real-time insights require optimized data pipelines
Next Evolution Steps
- Predictive Modeling: Moving from descriptive to predictive analytics
- Individual Driver Profiles: Personalized safety recommendations
- Route-Specific Analysis: Geographic patterns and risk areas
- Mobile-First Design: Getting insights into the field where decisions happen
Conclusion: Beyond Dashboard Creation
This project reinforced my core belief: effective analytics isn't about creating the perfect algorithm or the most sophisticated visualization; it's about creating tools that real people use to solve real problems in real-world conditions.
The FleetSafe project succeeded because it balanced analytical rigor with practical usability. We didn't just build a dashboard; we built a system that empowers fleet managers to make better decisions, helps drivers improve their safety records, and creates an organizational culture focused on proactive risk management.
Most importantly, the project demonstrated that data analytics has the power to transform not just metrics, but human behavior and organizational culture. When analytics makes people feel empowered rather than monitored, it creates sustainable change.
The journey from 2.8 million raw violation records to actionable safety insights taught me that great analytics projects are always human-centered: technically sophisticated, yes, but fundamentally about enabling better decisions that improve lives and operations.
Explore the Interactive Dashboards
You can explore the interactive dashboards I developed for this analysis:
Tableau Dashboard: Traffic Violations Dashboard
Streamlit Demo: Traffic Violation Analytics Demo (Note: This demo uses sample/demo data for interactive exploration)
These interactive tools allow you to explore the insights discussed in this article and interact with the data patterns yourself. The Streamlit demo provides an accessible way to experiment with traffic violation analysis using demo data, while the Tableau dashboard showcases the full analysis with the complete dataset.
Acknowledgments
This project was developed as part of a Visual Design & Storytelling course under Professor Cristina (Hyunjin) Jeong, collaborating with Makomborero Chivandire, Rodney A Chiwanga, and Masheia Dzimba. The structured learning environment provided invaluable frameworks for translating complex data into compelling narratives that drive real-world action.
What techniques have you found most effective in transforming complex datasets into actionable business insights? I'd love to hear about your approaches and the challenges you've overcome in your own analytics projects.