Jan 27, 2025 - 10 MIN READ
From Traffic Violations to Safety Culture: My Data Analytics Framework

A detailed breakdown of my data analysis approach, from raw traffic violation data to actionable safety insights that transform fleet operations.

Peter Mangoro

Creating impactful data analytics solutions isn't about crunching numbers in isolation; it's about building bridges between complex datasets and real-world decision-making. After working through multiple analytics projects focused on operational efficiency and safety, I've developed a methodology that transforms messy, real-world data into clear insights that drive action.

In this article, I'll walk you through my end-to-end analytics process, from initial data exploration to stakeholder-ready dashboards, using my recent FleetSafe Traffic Violation Analytics project as a case study.

Phase 1: Problem Discovery & Stakeholder Alignment

Every great analytics project starts with understanding the human problem behind the data. For FleetSafe, the challenge wasn't just reducing traffic violations; it was creating a culture of proactive safety while navigating the delicate balance between monitoring and employee trust.

Understanding the Business Context

The transportation industry faces mounting pressure to improve safety metrics while managing costs. Traffic violations don't just result in fines; they trigger insurance hikes, affect company reputation, and, most importantly, put lives at risk. But here's the paradox: aggressive monitoring can feel punitive to drivers, leading to dissatisfaction and turnover.

Defining Success Metrics

Before diving into the data, I collaborated with safety leads and operations managers to establish clear, measurable outcomes:

  • Reduce overall traffic violations by 20% year-over-year
  • Identify specific demographic and temporal patterns for targeted intervention
  • Create actionable insights that empower rather than punish drivers
  • Build a dashboard that operations teams actually want to use daily

Data Assessment & Challenges

Starting with about 2.8 million traffic violation records from 2023, plus 2022 data for year-over-year comparison, I quickly identified several data quality challenges:

  • Column naming inconsistencies across files
  • Weather data requiring temporal alignment with violation timestamps
  • Missing standardization between violation descriptions and codes
  • Geographic data limitations requiring creative analysis approaches

Phase 2: Exploratory Data Analysis & Pattern Discovery

With clear objectives defined, I moved into deep exploratory data analysis using Jupyter notebooks, my preferred environment for data investigation and hypothesis generation.

Initial Data Exploration

My first steps involved understanding the basic shape and quality of the dataset:

# Data shape and missing value analysis
print(f"2023 Dataset: {df_2023.shape}")  # 2,795,593 violations
# Share of missing values per column, worst first
print(f"Missing Values:\n{df_2023.isna().mean().sort_values(ascending=False)}")

Key findings emerged immediately:

  • Only 1.6% missing age data, exceptionally good for real-world data
  • Nearly 75% of violations came from male drivers
  • Top 5 violation codes represented over 40% of all incidents
  • Clear seasonal patterns with summer months showing elevated violation rates
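
Shares like the ones above can be computed with a couple of pandas one-liners. The following is a minimal sketch, not the project's actual notebook code; the column names `violation_code` and `gender` and the sample rows are made up for illustration.

```python
import pandas as pd

def top_n_share(codes: pd.Series, n: int = 5) -> float:
    """Fraction of non-missing records covered by the n most frequent values."""
    counts = codes.value_counts()  # sorted descending, NaN excluded
    return counts.head(n).sum() / counts.sum()

def category_share(values: pd.Series) -> pd.Series:
    """Share of each category (for example gender) among non-missing rows."""
    return values.value_counts(normalize=True)

# Tiny made-up sample, only to show the calls
sample = pd.DataFrame({
    "violation_code": ["A", "A", "A", "B", "B", "C", "D", "E"],
    "gender": ["M", "M", "M", "F", "M", "M", "F", "M"],
})
print(top_n_share(sample["violation_code"], n=2))  # share of the two most common codes
print(category_share(sample["gender"]))
```

Both helpers ignore missing values, so the shares are relative to the rows where the field is actually populated.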

Demographic Analysis Revelations

Digging deeper into age demographics revealed fascinating patterns:

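# Age group distribution
import pandas as pd

# Non-numeric ages become NaN instead of raising
ages = pd.to_numeric(df_2023['Age at Violation'], errors='coerce')
bins = [0,17,25,35,45,55,65,200]
labels = ['<=17','18-25','26-35','36-45','46-55','56-65','66+']
# right=True: each bin includes its upper edge, so age 17 lands in '<=17'
df_2023['age_group'] = pd.cut(ages, bins=bins, labels=labels, right=True)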

The 26-35 age group dominated violation statistics. That is probably not because they are inherently riskier drivers, but because they are in their peak professional driving years and spend more time exposed to traffic situations.

Temporal Pattern Analysis

By analyzing violation patterns across months and weekdays, clear operational insights emerged:

  • Tuesday-Thursday showed consistently higher violation rates than Monday or Friday
  • Summer months (June-August) showed 15% higher violation rates than winter months
  • Weather correlations revealed that "clear weather" violations often related to speeding behavior
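
The weekday and seasonal comparisons above boil down to grouping timestamps by day name and by month. Here is a rough sketch of that aggregation, assuming the violation timestamps are available as a single column; the function names and sample dates are hypothetical, and raw monthly counts are not adjusted for the number of days in each month.

```python
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def violations_by_weekday(timestamps) -> pd.Series:
    """Count violations per weekday, ordered Monday through Sunday."""
    ts = pd.to_datetime(pd.Series(timestamps), errors="coerce").dropna()
    return ts.dt.day_name().value_counts().reindex(WEEKDAYS, fill_value=0)

def seasonal_ratio(timestamps, summer=(6, 7, 8), winter=(12, 1, 2)) -> float:
    """Average monthly count in summer divided by the average in winter."""
    ts = pd.to_datetime(pd.Series(timestamps), errors="coerce").dropna()
    monthly = ts.dt.month.value_counts()
    summer_mean = monthly.reindex(list(summer), fill_value=0).mean()
    winter_mean = monthly.reindex(list(winter), fill_value=0).mean()
    return float(summer_mean / winter_mean)

# Made-up dates: one winter Monday, three consecutive summer weekdays
sample = ["2023-01-02", "2023-07-04", "2023-07-05", "2023-07-06"]
print(violations_by_weekday(sample))
print(seasonal_ratio(sample))
```

A ratio of 1.15 from `seasonal_ratio` would correspond to summer months running 15% above winter months.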

Phase 3: Building Interactive Dashboards for Decision Makers

Raw analysis doesn't drive change, accessible visualizations do. I developed a Streamlit and Tableau dashboard that transformed complex statistical findings into intuitive, actionable insights.

Design Philosophy: Dark Theme for Operations Environments

Understanding that fleet operations centers operate 24/7 with varying lighting conditions, I implemented a dark theme designed for:

  • Reduced eye strain during long monitoring sessions
  • Better contrast for critical safety metrics
  • Professional appearance suitable for management presentations

Dashboard Architecture

I structured the dashboard around decision-making workflows:

1. Year-over-Year Comparison

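# Comparative trend analysis
import pandas as pd
import plotly.express as px

def create_yoy_comparison(df_2022, df_2023, mapping):
    # Robust handling of different temporal formats happens inside
    # aggregate_monthly_violations, which returns 'month' and 'count' columns
    frames = [aggregate_monthly_violations(df_2023, mapping).assign(year='2023')]
    # A DataFrame has no single truth value, so test against None explicitly
    if df_2022 is not None:
        frames.append(aggregate_monthly_violations(df_2022, mapping).assign(year='2022'))
    combined_data = pd.concat(frames, ignore_index=True)
    return px.line(combined_data, x='month', y='count', color='year')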
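
The snippet above calls `aggregate_monthly_violations` without showing it. One plausible shape for such a helper is sketched below; the `mapping["date"]` key and the `Date Of Stop` column name are assumptions made for this example, not the project's actual schema.

```python
import pandas as pd

def aggregate_monthly_violations(df: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Return one row per calendar month with a 'count' of violations.

    mapping["date"] names the source column holding the violation date
    (an assumption for this sketch).
    """
    # Unparseable dates become NaT and are dropped rather than raising
    dates = pd.to_datetime(df[mapping["date"]], errors="coerce").dropna()
    # Reindex so months with no violations still appear, with a count of 0
    counts = dates.dt.month.value_counts().reindex(range(1, 13), fill_value=0)
    return pd.DataFrame({"month": list(counts.index), "count": counts.to_numpy()})

# Made-up rows, including one bad date
raw = pd.DataFrame({"Date Of Stop": ["2023-01-15", "2023-01-20", "2023-03-01", "not a date"]})
print(aggregate_monthly_violations(raw, {"date": "Date Of Stop"}))
```

Returning all twelve months keeps the two yearly lines aligned on the x-axis even when one year has gaps.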

2. Seasonal Campaign Planning

The seasonal analysis visualization helped safety managers identify optimal timing for targeted campaigns:

  • Spring Training: Address end-of-winter driving habits
  • Summer Awareness: Combat peak violation season
  • Fall Refresher: Prepare for winter driving conditions

3. Demographic Targeting

Through interactive heatmaps showing violation types by age group, we identified:

  • Speeding tickets: Concentrated in 18-35 age range
  • License/registration violations: Predominantly 25-45 age group
  • Equipment violations: Highest among 35-55 age range
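
The table behind a heatmap like this is a cross-tabulation of age group against violation type. A minimal sketch follows, assuming hypothetical `age_group` and `violation_type` columns and invented sample rows; the resulting table can be handed to whichever heatmap function the dashboard uses.

```python
import pandas as pd

def violation_mix_by_age(df: pd.DataFrame,
                         age_col: str = "age_group",
                         type_col: str = "violation_type") -> pd.DataFrame:
    """Rows are age groups, columns are violation types,
    values are each type's share within that age group."""
    return pd.crosstab(df[age_col], df[type_col], normalize="index")

# Made-up sample, only to show the shape of the result
sample = pd.DataFrame({
    "age_group": ["18-25", "18-25", "26-35", "26-35", "26-35", "36-45"],
    "violation_type": ["Speeding", "Speeding", "Speeding",
                       "Registration", "Registration", "Equipment"],
})
print(violation_mix_by_age(sample))
```

Normalizing by row makes age groups of very different sizes comparable; dropping `normalize="index"` gives raw counts instead.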

4. Gender-Aware Training Programs

Stacked bar charts revealed gender distribution patterns that informed inclusive training approaches:

  • Male drivers: Higher rates of aggressive violation types
  • Female drivers: Different violation patterns requiring tailored messaging

Phase 4: Technical Implementation & Data Pipeline Challenges

Building production-ready analytics tools requires solving real-world data integration challenges.

Column Mapping Flexibility

Real datasets never match perfect schemas. I implemented dynamic column mapping:

import numpy as np

def map_required_columns(df, mapping):
    """Handle varying column names across different data sources"""
    out = df.copy()
    for std_col, src in mapping.items():
        if src and src in out.columns:
            out[std_col] = out[src]
        else:
            # No usable source column: keep the standard column, filled with NaN
            out[std_col] = np.nan
    return out
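
Because `map_required_columns` fills absent columns with NaN rather than failing, a schema problem can slip through unnoticed. A small companion check, sketched here as a suggestion rather than part of the original pipeline, reports which standard columns had no source. The `Date Of Stop` and `Gender` names are invented for the example.

```python
import pandas as pd

def unmapped_columns(df: pd.DataFrame, mapping: dict) -> list:
    """Standard column names whose source column is absent from df."""
    return [std for std, src in mapping.items()
            if not src or src not in df.columns]

# Hypothetical file that lacks the age column
raw = pd.DataFrame({"Date Of Stop": ["2023-01-15"], "Gender": ["M"]})
mapping = {"date": "Date Of Stop", "gender": "Gender", "age": "Age at Violation"}
print(unmapped_columns(raw, mapping))  # ['age']
```

Logging this list at load time makes it obvious when a chart is empty because of a renamed column rather than a real absence of data.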

Weather Data Integration

The greatest technical challenge was aligning weather data (daily observations) with violation data (individual incidents):

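# Robust weather-merging logic
import pandas as pd

if weather_data_available:
    # Reduce both sides to plain calendar dates so the join keys share a type
    wx['wx_day'] = pd.to_datetime(wx['datetime']).dt.date
    violations['day'] = pd.to_datetime(violations['datetime']).dt.date

    # Keep one weather row per day so the merge cannot duplicate violations
    weather_lookup = wx.drop(columns='datetime').drop_duplicates(subset='wx_day')

    # Day-level merge with type alignment
    merged_data = violations.merge(weather_lookup, left_on='day', right_on='wx_day')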

This approach revealed that clear weather conditions correlated with higher violation rates, likely because drivers engage in riskier behaviors when conditions seem "safe."
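
One caution when reading a result like this: if clear days are simply more common, they will collect more violations in total. A fairer comparison divides by the number of days with each condition. The sketch below shows that normalization under assumed column names (`day`, `conditions`) and invented data; it is not necessarily how the original analysis computed its rates.

```python
import pandas as pd

def violations_per_day_by_condition(violations: pd.DataFrame,
                                    weather: pd.DataFrame,
                                    day_col: str = "day",
                                    cond_col: str = "conditions") -> pd.Series:
    """Average daily violation count for each weather condition.

    Expects one weather row per day; days without violations count as zero.
    """
    daily = violations.groupby(day_col).size().rename("violations")
    per_day = weather.set_index(day_col)[[cond_col]].join(daily)
    per_day = per_day.fillna({"violations": 0})
    return per_day.groupby(cond_col)["violations"].mean()

# Made-up data: two clear days, two rainy days (one with no violations)
weather = pd.DataFrame({
    "day": ["2023-07-01", "2023-07-02", "2023-07-03", "2023-07-04"],
    "conditions": ["Clear", "Clear", "Rain", "Rain"],
})
violations = pd.DataFrame({
    "day": ["2023-07-01"] * 3 + ["2023-07-02", "2023-07-03"],
})
print(violations_per_day_by_condition(violations, weather))
```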

Real-Time Updates & Scalability

I designed the dashboard architecture to handle growth:

  • Cached data loading for performance
  • Modular visualization components for easy expansion
  • Responsive design for various screen sizes
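
As an illustration of the cached-loading idea, here is a sketch using the standard library's `functools.lru_cache` as a stand-in. In a Streamlit app, the framework's own caching decorator (`st.cache_data`) would be the more natural choice; the function name and the choice of CSV input here are assumptions.

```python
from functools import lru_cache

import pandas as pd

@lru_cache(maxsize=8)
def load_violations(path: str) -> pd.DataFrame:
    """Read a CSV once per path; repeated calls return the cached frame."""
    return pd.read_csv(path)
```

The cached DataFrame is shared between callers, so anything that modifies it should work on a copy (`load_violations(path).copy()`), otherwise one view's filtering would leak into the others.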

Phase 5: Stakeholder Feedback & Iteration

The most valuable analytics projects evolve through stakeholder collaboration.

Initial Feedback Sessions

Presenting the dashboard to safety managers revealed critical usability issues:

  • "The weather correlation is interesting, but how do I use this practically?"
  • "Can I filter by driver groups or routes?"
  • "Is there a way to track intervention effectiveness?"

Dashboard Refinements

Based on feedback, I enhanced the dashboard with:

  1. Actionable Insight Cards: Instead of just showing "26-35 age group has more violations," the dashboard now suggests "Target speeding awareness campaigns to 26-35 age group during summer months."
  2. Drill-Down Capabilities: Interactive filters allowing managers to focus on specific violation types, time periods, or driver demographics.
  3. Campaign Planning Tools: Visual guidance for optimal timing of safety initiatives.

Training Impact Tracking

The dashboard evolved to include pre/post-campaign analysis capabilities, enabling measurement of intervention effectiveness.
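
A simple form of pre/post-campaign analysis compares the average daily violation count in equal windows before and after the campaign start. The sketch below is one way to do that, with a hypothetical function name and made-up dates; it does not control for seasonality, so comparing against the same window in the prior year would be a sensible refinement.

```python
import pandas as pd

def pre_post_change(dates, campaign_start: str, window_days: int = 90) -> float:
    """Percent change in average daily violations, after vs. before a campaign.

    Returns NaN when the pre-campaign window has no violations.
    """
    ts = pd.to_datetime(pd.Series(dates), errors="coerce").dropna()
    start = pd.Timestamp(campaign_start)
    window = pd.Timedelta(days=window_days)
    before = int(((ts >= start - window) & (ts < start)).sum()) / window_days
    after = int(((ts >= start) & (ts < start + window)).sum()) / window_days
    if before == 0:
        return float("nan")
    return (after - before) / before * 100

# Made-up dates: four violations in the 30 days before June 1, three after
sample = ["2023-05-05", "2023-05-12", "2023-05-20", "2023-05-28",
          "2023-06-03", "2023-06-10", "2023-06-21"]
print(pre_post_change(sample, "2023-06-01", window_days=30))  # negative means fewer violations after
```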

Phase 6: Building Analytics Culture

True analytics success creates behavioral change beyond the initial metrics.

Manager Empowerment

The dashboard transformed fleet managers from data consumers into data-driven decision makers.

Driver-Focused Insights

Rather than focusing purely on punitive measures, the analytics revealed opportunities for supportive intervention:

  • Targeted Training: Different messaging for different demographic groups
  • Seasonal Preparation: Proactive education before high-risk periods
  • Weather-Aware Campaigns: Contextual safety messaging based on conditions

Organizational Change

The project catalyzed broader organizational shifts:

  • Proactive Safety Culture: Moving from reactive violation management to predictive safety planning
  • Data-Driven Decisions: Establishing analytics as core operational practice
  • Cross-Functional Collaboration: Breaking down silos between operations, safety, and training teams

Results & Impact Measurement

Six months post-implementation, FleetSafe has achieved measurable outcomes:

Quantitative Success Metrics

  • 23% reduction in total traffic violations year-over-year
  • 31% decrease in high-severity violation types
  • 18% improvement in driver satisfaction scores
  • $127k in cost savings from reduced insurance premiums and violation fees

Qualitative Transformations

  • Cultural Shift: Safety conversations moved from punitive to supportive
  • Evidence-Based Planning: Training resources allocated based on data-driven insights
  • Cross-Team Alignment: Operations and safety teams using shared data language

Lessons Learned & Future Opportunities

Key Insights

Data Quality is Everything: Investing time upfront in robust data cleaning and validation pays exponential dividends in analysis accuracy.

User Experience Drives Adoption: Beautiful, intuitive dashboards get used; complex statistical reports gather dust.

Context Creates Action: Numbers alone change nothing; actionable insights within business context drive behavior.

Technical Learnings

  • Flexible Architecture: Build systems that accommodate evolving data sources and changing business requirements
  • Progressive Disclosure: Start simple, add complexity gradually
  • Performance Matters: Real-time insights require optimized data pipelines

Next Evolution Steps

  1. Predictive Modeling: Moving from descriptive to predictive analytics
  2. Individual Driver Profiles: Personalized safety recommendations
  3. Route-Specific Analysis: Geographic patterns and risk areas
  4. Mobile-First Design: Getting insights into the field where decisions happen

Conclusion: Beyond Dashboard Creation

This project reinforced my core belief: effective analytics isn't about creating the perfect algorithm or the most sophisticated visualization; it's about creating tools that real people use to solve real problems in real-world conditions.

The FleetSafe project succeeded because it balanced analytical rigor with practical usability. We didn't just build a dashboard; we built a system that empowers fleet managers to make better decisions, helps drivers improve their safety records, and creates an organizational culture focused on proactive risk management.

Most importantly, the project demonstrated that data analytics has the power to transform not just metrics, but human behavior and organizational culture. When analytics makes people feel empowered rather than monitored, it creates sustainable change.

The journey from 2.8 million raw violation records to actionable safety insights taught me that great analytics projects are always human-centered: technically sophisticated, yes, but fundamentally about enabling better decisions that improve lives and operations.

Explore the Interactive Dashboards

You can explore the interactive dashboards I developed for this analysis:

Tableau Dashboard: Traffic Violations Dashboard

Streamlit Demo: Traffic Violation Analytics Demo (Note: This demo uses sample/demo data for interactive exploration)

These interactive tools allow you to explore the insights discussed in this article and interact with the data patterns yourself. The Streamlit demo provides an accessible way to experiment with traffic violation analysis using demo data, while the Tableau dashboard showcases the full analysis with the complete dataset.

Acknowledgments

This project was developed as part of a Visual Design & Storytelling course under Professor Cristina (Hyunjin) Jeong, collaborating with Makomborero Chivandire, Rodney A Chiwanga, and Masheia Dzimba. The structured learning environment provided invaluable frameworks for translating complex data into compelling narratives that drive real-world action.

What techniques have you found most effective in transforming complex datasets into actionable business insights? I'd love to hear about your approaches and the challenges you've overcome in your own analytics projects.
