Public Mobility Data Visualization POC — Proposal Introduction
Purpose
This proof-of-concept (POC) will demonstrate an interactive, decision-focused visualization layer for public mobility data. It will unify operational and planning perspectives by surfacing reliable, real-time and historical insights on ridership, service reliability, and network performance, enabling faster diagnostics and evidence-based service adjustments.
Objectives
- Integrate and visualize key mobility datasets to validate technical feasibility and user value.
- Deliver a modular, scalable visualization layer with map-centric exploration and time-based analytics.
- Provide actionable KPIs for operations, planning, and policy evaluation.
- Establish data governance, accessibility, and privacy practices suitable for production scale-up.
Primary Users and Core Questions
- Operations managers: Where and when do headway gaps, bunching, or on-time performance (OTP) failures occur?
- Service planners: How do ridership, load factor, and transfer wait times vary by corridor and time-of-day?
- Policy analysts: What are equity and accessibility impacts across demographic and spatial segments?
- Customer comms: How can disruptions and reliability trends be summarized clearly for stakeholders?
Data Scope (POC)
- GTFS Static (routes, trips, stops, schedules) and GTFS Realtime (TripUpdates, VehiclePositions, Alerts).
- AVL/AVM (vehicle locations, timestamps), APC (automatic passenger counts) or fare/smartcard aggregates.
- Supplemental: weather, major events, roadworks/incidents (as available).
- Optional: shared micro-mobility or bike-share station status for first/last-mile context.
Key Metrics and Definitions
- Ridership: boardings/alightings, aggregated by stop/route/time.
- Load factor: passengers ÷ vehicle capacity.
- Reliability: OTP (% within schedule tolerance), headway adherence, excess waiting time (EWT), travel time reliability (e.g., 90th/50th percentile ratio).
- Disruptions: count, duration, affected trips/stops, impact radius.
- Transfers: average wait, transfer success within threshold.
- Equity: KPI distribution by area classification (e.g., low-income or low-access zones).
- Sustainability (optional): estimated CO2e per passenger-km (methodology disclosed if included).
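The EWT definition above can be made concrete with a small sketch. It assumes passengers arrive at random (so average wait is E[h²]/(2·E[h]) for headways h); the function names and sample headways are illustrative, not part of the proposal.

```python
def average_wait(headways_min):
    """Expected passenger wait under random arrivals, in minutes:
    AWT = E[h^2] / (2 * E[h])."""
    n = len(headways_min)
    mean_h = sum(headways_min) / n
    mean_h2 = sum(h * h for h in headways_min) / n
    return mean_h2 / (2 * mean_h)

def excess_wait(actual_min, scheduled_min):
    """EWT = actual average wait minus scheduled average wait."""
    return average_wait(actual_min) - average_wait(scheduled_min)

# Perfectly even 10-minute service: AWT = 5 min.
scheduled = [10, 10, 10, 10]
# Bunched service with the same mean headway: AWT rises, EWT > 0.
actual = [2, 18, 2, 18]
ewt = excess_wait(actual, scheduled)
```

Note how bunching alone, with no change in mean headway, produces positive EWT; that is exactly the effect the headway adherence views are meant to surface.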
Proposed Visualizations (POC scope)
- Network Performance Map: deck.gl layers over base map; route color encodes reliability or load factor; stop-level circles with tooltips; time slider for playback.
- Headway Adherence Heatmap: route x time-of-day grid; color shows deviation from scheduled headway; drill to trip-level charts.
- Ridership and Load Factor Timeseries: small multiples by route/segment; peak identification and trend comparison (week-over-week).
- OD Matrix or Flow Map (aggregated): zone-to-zone ridership as heatmap; optional H3 hex aggregation.
- Disruption Impact View: timeline of alerts, affected routes/stops highlighted on map; pre/post KPI comparison.
- Transfer Experience: histogram of transfer wait times; spatial view of transfer hotspots.
Interaction Model
- Global controls: date range, time-of-day, day type (weekday/weekend), route/line filters, direction, headsign.
- Brushing and linking: selection on map filters all panels; hovering reveals contextual metrics.
- Drill-through: stop → segment → trip; automatic context retention.
- States: saved views and shareable links for reproducibility.
Architecture and Technology (POC)
- Ingestion: GTFS Static (batch), GTFS-RT (stream or frequent poll), APC/AVL (batch or stream).
- Storage/Modeling: DuckDB or Postgres for POC; dbt for transformations (trips, stop_times, events, KPIs).
- Indexing: H3 or spatial indexes for map tiles; pre-aggregations for time buckets.
- Visualization: deck.gl + Mapbox (map), Vega-Lite or Observable Plot (charts), or Superset for rapid assembly.
- Orchestration: lightweight scheduler; Airflow or Prefect are optional for the POC.
- Deployment: containerized app; CDN for static assets; optional feature flags for experiments.
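A minimal sketch of the batch GTFS Static ingestion step, using only the stdlib csv module: parse stop_times.txt into records suitable for the modeling layer. Per the GTFS spec, times may exceed 24:00:00 for trips running past midnight, so they are stored as seconds since the start of the service day rather than as wall-clock times. Column names follow the GTFS spec; the sample rows are made up.

```python
import csv
import io

def gtfs_time_to_seconds(hms):
    """Convert a GTFS HH:MM:SS time (possibly >= 24:00:00) to seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

def load_stop_times(fileobj):
    """Parse stop_times.txt rows into plain dicts for downstream modeling."""
    rows = []
    for rec in csv.DictReader(fileobj):
        rows.append({
            "trip_id": rec["trip_id"],
            "stop_id": rec["stop_id"],
            "stop_sequence": int(rec["stop_sequence"]),
            "arrival_sec": gtfs_time_to_seconds(rec["arrival_time"]),
        })
    return rows

sample = io.StringIO(
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
    "t1,23:55:00,23:55:00,s1,1\n"
    "t1,24:10:00,24:10:00,s2,2\n"  # past midnight, same service day
)
stop_times = load_stop_times(sample)
```

In the POC these records would land in DuckDB or Postgres and be transformed by dbt, as described above.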
Data Model (POC)
- Fact tables: events_vehicle (positions), events_trip (arrival/departure), fact_ridership (by stop/time), fact_disruptions.
- Dimensions: dim_route, dim_trip, dim_stop, dim_calendar, dim_zone (H3 or admin boundaries).
- KPI tables: precomputed reliability, headway adherence, EWT, load factors by route/stop/time grain (5–15 min).
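The pre-aggregation to a fixed time grain can be sketched as follows: roll stop-level boardings up to a route x 15-minute bucket, the coarsest grain named above. The event tuples and field names here are illustrative stand-ins for the fact tables.

```python
from collections import defaultdict

BUCKET_SEC = 15 * 60  # 15-minute grain

def bucket(ts_sec):
    """Floor a timestamp (seconds since service day) to its 15-min bucket."""
    return (ts_sec // BUCKET_SEC) * BUCKET_SEC

def aggregate_ridership(events):
    """events: iterable of (route_id, ts_sec, boardings).
    Returns {(route_id, bucket_start_sec): total_boardings}."""
    out = defaultdict(int)
    for route_id, ts_sec, boardings in events:
        out[(route_id, bucket(ts_sec))] += boardings
    return dict(out)

events = [("R1", 100, 4), ("R1", 850, 2), ("R1", 901, 5), ("R2", 10, 1)]
agg = aggregate_ridership(events)
```

In practice this aggregation would run in SQL/dbt over the fact tables; the Python version just makes the bucketing rule explicit.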
Accessibility and UX Standards
- Color-vision-safe palettes (Okabe–Ito, or ColorBrewer qualitative/diverging schemes).
- WCAG 2.1 AA: keyboard navigation, focus order, alt text for charts, sufficient contrast.
- Dual encoding (color + shape/size) for critical states; tooltips with plain-language labels.
- Map clutter control: progressive disclosure and density-based aggregation.
Privacy, Security, and Governance
- No raw PII in the POC UI. For fare/smartcard data, use aggregation thresholds (e.g., k ≥ 50 per cell/time) and salted hashing with periodic salt rotation where needed.
- Time and spatial binning to prevent re-identification.
- Role-based access; audit logs for data exports.
- Data retention policy documented for streams.
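The aggregation-threshold rule above can be sketched in a few lines: suppress any cell whose count falls below k before it ever reaches the UI. k = 50 mirrors the example threshold; the cell keys are illustrative.

```python
def suppress_small_cells(cell_counts, k=50):
    """Return only cells meeting the k-anonymity threshold.
    cell_counts: {cell_key: count}; cells with count < k are dropped."""
    return {cell: n for cell, n in cell_counts.items() if n >= k}

counts = {
    ("zoneA", "zoneB", "07:00"): 120,  # publishable
    ("zoneA", "zoneC", "07:00"): 12,   # below threshold: suppressed
}
published = suppress_small_cells(counts)
```

Suppression (rather than rounding or noise) is the simplest defensible choice for a POC; the production privacy assessment can revisit it.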
Performance and Reliability
- Target <2 s median response for typical filter changes; <500 ms for cross-filter updates with cached aggregates.
- Pre-aggregation by time buckets (e.g., 15 min) and tiles; vector tiles or MVT for high-performance maps.
- Graceful degradation: fallback to static aggregates if RT feed is unavailable.
POC Plan and Timeline (6 weeks core, up to 8 with buffer)
- Week 1: Requirements and data audit; KPI definitions; success criteria finalized.
- Week 2: Data ingestion/modeling; schema and dbt pipelines; initial QA.
- Week 3: KPI computation; map and chart prototypes; interaction design.
- Week 4: Integration; cross-filtering; accessibility pass; performance tuning.
- Week 5: User testing with target personas; iterate on usability and latency.
- Week 6: Documentation; handover; demo; backlog for scale-up.
Deliverables
- Interactive POC dashboard with map and KPI panels.
- Data model and transformation scripts; reproducible environment.
- KPI dictionary and visualization style guide.
- Ops handbook (data refresh, monitoring) and privacy assessment.
- Demo dataset and scripted scenarios for stakeholder review.
Success Criteria
- User task completion: identify top 5 reliability bottlenecks and quantify EWT within 10 minutes.
- Performance: median interaction latency <2 s; map renders <1 s on standard hardware.
- Data quality: <2% orphaned stops/trips; clock skew resolved within 5 s tolerance.
- Adoption: positive usability scores from at least three target user roles.
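The orphaned-record criterion above implies a concrete check: stop_times rows whose stop_id resolves to no known stop count as orphaned, and the rate must stay below 2%. Field names follow GTFS; the sample data is made up.

```python
def orphan_rate(stop_time_stop_ids, known_stop_ids):
    """Fraction of stop_times references that resolve to no known stop."""
    if not stop_time_stop_ids:
        return 0.0
    orphans = [s for s in stop_time_stop_ids if s not in known_stop_ids]
    return len(orphans) / len(stop_time_stop_ids)

known = {"s1", "s2", "s3"}
refs = ["s1", "s2", "s2", "sX"]  # one dangling reference
rate = orphan_rate(refs, known)  # fails the < 2% target in this toy case
```

The same pattern applies to trip references against dim_trip.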
Risks and Mitigations
- Data gaps or inconsistent IDs: implement reconciliation rules; fallback joins on shape and time proximity.
- Clock drift across sources: server-side normalization; tolerance windows.
- Sparse APC coverage: blend with modeled estimates; flag confidence levels.
- RT feed instability: cache last-known-good; display data freshness indicators.
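The last-known-good mitigation can be sketched as a small cache: keep the most recent successful payload with its fetch time, and expose a staleness flag the UI can render as a freshness indicator. The class name and 120-second threshold are illustrative assumptions.

```python
class LastKnownGood:
    """Cache the last successful RT payload and flag it when stale."""

    def __init__(self, max_age_sec=120):
        self.max_age_sec = max_age_sec
        self._payload = None
        self._fetched_at = None

    def update(self, payload, now_sec):
        """Record a successful fetch at time now_sec."""
        self._payload = payload
        self._fetched_at = now_sec

    def read(self, now_sec):
        """Return (payload, is_stale); payload is None before any fetch."""
        if self._payload is None:
            return None, True
        return self._payload, (now_sec - self._fetched_at) > self.max_age_sec

cache = LastKnownGood(max_age_sec=120)
cache.update({"vehicles": 42}, now_sec=1000)
fresh = cache.read(now_sec=1060)  # 60 s old: served as fresh
stale = cache.read(now_sec=1300)  # 300 s old: served with stale flag
```

On feed failure the dashboard keeps serving the cached payload and simply flips the freshness indicator, matching the graceful-degradation goal above.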
This POC will validate end-to-end feasibility and user value, establish a scalable visualization blueprint, and provide a clear pathway to production with defined governance, accessibility, and performance standards.