Teaching AI to Read the City: Predicting Scooter Demand with Open Data

The Problem

Scooters Arrived Without a Manual. Here's the Operating Guide.

Shared e-scooters are notoriously hard to run efficiently. Operators deploy a fixed number of vehicles and hope for the best — even though demand swings wildly by season, day of week, weather, and local events. Too many scooters clog sidewalks; too few mean missed revenue and unhappy riders.

The challenge is especially acute for new deployments: a city launching scooters has no historical data of its own. This paper tackles that problem directly by asking: can we train a machine learning model on one city's data and use it to predict demand in another?

The answer, using Austin (source) and Louisville (target), is yes — with the right techniques.

7M+

Austin trips, Apr 2018–Jan 2020

390K

Louisville trips, Aug 2018–Jan 2020

4

ML models tested head-to-head

100%

Open-source data sources used

What They Built

The Framework

From Raw Data to Daily Fleet Predictions

The researchers built a pipeline that takes four types of freely available data — scooter trips, weather, census demographics, and built environment info from OpenStreetMap — and combines them to predict the single most operationally useful number: trips per vehicle per day (fleet utilization).

Predicting utilization rate (rather than raw trip count) is a smart design choice. It controls for fleet size differences between cities, avoids the supply-demand chicken-and-egg problem, and directly informs how many vehicles to deploy on any given day.
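As a minimal sketch of why a rate transfers better than a raw count, the snippet below computes trips per vehicle per day for two fleets of very different sizes. The function name and the trip counts are hypothetical and chosen for illustration; only the two maximum fleet sizes come from the article.

```python
def fleet_utilization(trips: int, vehicles_available: int) -> float:
    """Trips per vehicle per day for one area on one day."""
    if vehicles_available <= 0:
        raise ValueError("vehicles_available must be positive")
    return trips / vehicles_available


# Hypothetical daily trip counts: raw demand differs by more than 10x,
# but the utilization rate is the same in both cities.
large_city = fleet_utilization(trips=12_000, vehicles_available=15_000)
small_city = fleet_utilization(trips=960, vehicles_available=1_200)
print(large_city, small_city)
```

Because both values land on the same scale, a model trained on one city has a target it can plausibly reuse in the other.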

Model Transfer Pipeline — Austin → Louisville

Source City: Austin, TX (long-term data · 21 months · 15,000 max fleet)

Train model + transfer learning

Target City: Louisville, KY (short pilot data · 3 months · 1,200 max fleet)

Step 01 · Feature Engineering: Extract time-series features: recent demand, weekly patterns, trend differences

Step 02 · Sample Normalization: Rescale each time window to mean=0, std=1 to bridge the gap between cities

Step 03 · Label Differencing: Remove trend from demand data so the model predicts changes, not absolute levels

Step 04 · Predict Louisville: Apply the Austin-trained model to Louisville's pilot data and evaluate accuracy
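The feature engineering step can be sketched roughly as follows. This is an illustration, not the paper's feature set: the function name, the feature names, and the 14-day minimum are all assumptions made for the example.

```python
from statistics import mean


def time_series_features(history: list[float]) -> dict[str, float]:
    """Build features for predicting the next day from past daily utilization.

    `history` is ordered oldest to newest and needs at least 14 values.
    """
    if len(history) < 14:
        raise ValueError("need at least 14 days of history")
    last_week = history[-7:]
    prev_week = history[-14:-7]
    return {
        "lag_1": history[-1],                 # recent demand: yesterday
        "lag_7": history[-7],                 # weekly pattern: same weekday last week
        "mean_7": mean(last_week),            # average level over the last week
        "diff_1": history[-1] - history[-2],  # day-over-day change
        "diff_week": mean(last_week) - mean(prev_week),  # week-over-week trend
    }


# Toy history: utilization rising by 1 each day for 14 days.
features = time_series_features([float(d) for d in range(1, 15)])
print(features)
```

Each row of the training table would pair one such feature dictionary with the following day's utilization.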

The key technical challenge is called covariate shift: Austin and Louisville have different demand scales, fleet sizes, and rider populations. A model trained on Austin data and applied naively would underestimate Louisville's higher utilization rates. The two-step fix, sample normalization plus label differencing, aligns the two distributions without needing to retrain.
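One way to picture the two-step fix is the sketch below. It assumes each sample is a short window of past utilization, that the window is rescaled with its own mean and standard deviation, and that the label is the change from the last observed day in the same rescaled units. The paper's exact formulation may differ, and all function names here are hypothetical.

```python
from statistics import mean, pstdev


def normalize_window(window):
    """Rescale one time window to mean 0 and standard deviation 1."""
    mu = mean(window)
    sigma = pstdev(window) or 1.0  # a flat window has std 0; avoid dividing by zero
    return [(x - mu) / sigma for x in window], mu, sigma


def make_training_pair(window, next_value):
    """Return (normalized window, differenced label) for one sample."""
    z, mu, sigma = normalize_window(window)
    z_next = (next_value - mu) / sigma
    return z, z_next - z[-1]  # label is the change from the last observed day


def recover_prediction(window, predicted_label):
    """Map a predicted change back to the city's original scale."""
    _, _, sigma = normalize_window(window)
    return window[-1] + predicted_label * sigma


# Two toy windows with the same shape but a 10x difference in scale.
austin_x, austin_y = make_training_pair([2.0, 4.0, 6.0, 8.0], 10.0)
louisville_x, louisville_y = make_training_pair([0.2, 0.4, 0.6, 0.8], 1.0)
```

After the transformation the two samples are numerically identical, which is the sense in which the distributions are aligned: the model only sees shape and change, not absolute level. `recover_prediction` then converts a predicted change back into the target city's own units.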

The Models

Head-to-Head

Four Algorithms Walk Into Louisville. One Wins.

Four machine learning models were tested, ranging from classical statistics to cutting-edge deep learning. The results challenge a common assumption: more complexity doesn't always win.

Model              Type                                Test RMSE   Test MAE   Verdict
LightGBM           Gradient Boosting · Decision Trees  1845.6      346.8      🏆 Winner
Linear Regression  Classical Statistics · Fast         2054.4      381.2      Runner-up
SVR                Support Vector · Kernel Methods     2208.3      371.3      Inconsistent
LSTM Neural Net    Deep Learning · Sequential          2376.0      436.4      Worst

LSTM — the deep learning champion often used for time-series prediction — came in last. The authors explain: for tabular data with mixed static and dynamic features, decision-tree-based models like LightGBM consistently outperform neural networks. This is a well-known pattern in data science competitions, but worth emphasizing for practitioners excited about deep learning.
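For readers less familiar with the two error metrics in the comparison, here is a small sketch of how RMSE and MAE are computed, using toy numbers rather than the paper's data. It also suggests one possible reading of the SVR result: a model can have a lower MAE than linear regression and still a higher RMSE if it makes a few large misses, because RMSE squares each error before averaging.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error: large misses are penalized more heavily."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: every unit of error counts the same."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [1.0, 2.0, 3.0, 4.0]
one_big_miss = [1.0, 2.0, 3.0, 8.0]     # three perfect days, one miss of 4
many_small_misses = [2.0, 3.0, 4.0, 5.0]  # every day off by 1

# Both forecasts have the same MAE, but the single large miss doubles the RMSE.
print(mae(actual, one_big_miss), rmse(actual, one_big_miss))
print(mae(actual, many_small_misses), rmse(actual, many_small_misses))
```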

Without Transfer Learning: RMSE 2195.7 (×10⁻⁵)

With Transfer Learning: RMSE 1845.6 (×10⁻⁵)

Error reduction with sample normalization + label differencing: −15.9%

Neither transfer strategy alone was enough. Label differencing without normalization didn't help; normalization without differencing didn't help either. Only combining both reduced the cross-city generalization error — by 15.9% on the best model.
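The 15.9% figure follows directly from the two RMSE values reported for the best model. A quick check, where the helper name is ours rather than the paper's:

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop in error when moving from `before` to `after`."""
    return (before - after) / before


rmse_without_transfer = 2195.7
rmse_with_transfer = 1845.6
reduction = relative_reduction(rmse_without_transfer, rmse_with_transfer)
print(f"{reduction:.1%}")  # 15.9%
```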

What Drives the Prediction

Feature Importance

Yesterday's Demand Is the Best Predictor of Tomorrow's.

The LightGBM model ranks its features by how often it splits on them. The results confirm intuition — but also reveal some surprises about what doesn't matter as much as you'd think.

Time Series Features: 67%

Temporal (Day/Season): 9.8%

Sociodemographics: 9.6%

Meteorological: 7.1%

Built Environment: 6.6%

The dominance of time-series features (67%) reflects a fundamental property of urban mobility: tomorrow looks a lot like today, and a lot like last week. The top individual predictor is yesterday's demand (6.6%), followed by elapsed days since service launch (6.3%) — a proxy for the service maturity effect where early users behave differently than regular users.
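Group-level shares like these can be derived by summing per-feature split counts within each category and dividing by the total. The sketch below shows that aggregation with made-up counts and feature names; with LightGBM, the per-feature counts would typically come from `Booster.feature_importance(importance_type="split")`, though how the authors aggregated them is an assumption here.

```python
from collections import defaultdict


def importance_share_by_group(split_counts, feature_groups):
    """Sum per-feature split counts by group and convert to shares of the total."""
    totals = defaultdict(int)
    for feature, count in split_counts.items():
        totals[feature_groups[feature]] += count
    grand_total = sum(totals.values())
    return {group: count / grand_total for group, count in totals.items()}


# Hypothetical split counts and grouping, for illustration only.
split_counts = {"lag_1": 30, "lag_7": 20, "day_of_week": 25, "temperature": 25}
feature_groups = {
    "lag_1": "time_series",
    "lag_7": "time_series",
    "day_of_week": "temporal",
    "temperature": "meteorological",
}
shares = importance_share_by_group(split_counts, feature_groups)
print(shares)
```

Split counts measure how often a feature is used, not how much each split improves the fit, so the ranking is one view of importance rather than the only one.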

Removing time-series features caused a 43% jump in prediction error. Removing built environment or sociodemographic features each caused less than 2% degradation — yet both still matter for spatial prediction accuracy.

The practical takeaway: you don't need a massive feature set to build a working demand predictor. You need the last 30 days of trips, the temperature forecast, and basic census data. All of it is free.

Spatial Patterns

Space & Time

Downtown Belongs to Weekends. Campuses Own Weekdays.

Both Austin and Louisville exhibit the same spatial split that the companion study of five cities found. University areas — UT Austin, University of Louisville — dominate weekday demand at all hours. Downtown entertainment districts flip to dominate on weekends and early mornings.

The models' spatial error analysis reveals something important: prediction errors concentrate around downtown and university zones. These high-demand areas are where the model underestimates peaks, because they are also where unpredictable events (festivals, games, concerts) spike demand beyond what historical patterns suggest.

🗺️

Different Urban Structures, Same Spatial Logic

Austin and Louisville have completely different city layouts and sizes — yet their scooter demand is spatially concentrated in the same types of zones (educational and entertainment hubs). This suggests the framework can generalize broadly.

🌡️

Seasonal Synchrony Across Cities

Both cities show demand increasing through spring and summer, dropping from October, and hitting lows in January — despite different climates. The scaled demand trends are nearly identical once fleet size differences are controlled.

📉

Under One Trip Per Vehicle Per Day — and That's a Problem

The median fleet utilization in both cities is below 1 trip/vehicle/day. Most scooters sit idle most of the time. The authors argue fleet sizes should be dynamically adjusted — ideally daily — to match predicted demand, not held constant.

For Cities & Operators

Policy Implications

A Practical Tool, Not Just a Research Exercise

The paper is explicit about practical applicability. The entire methodology uses publicly available data sources that any city or operator can access: open city trip portals, census.gov, openstreetmap.org, and visualcrossing.com. No proprietary data required.

🚀

Deploy New Cities Without Historical Data

The transfer learning approach means a city launching scooters for the first time can borrow demand patterns from a similar city. Only ~3 months of pilot data is needed in the target city before the model adapts.

📅

Dynamic Fleet Sizing by Season and Forecast

Instead of deploying a fixed 1,200 or 15,000 scooters year-round, operators can use daily utilization predictions to right-size fleets. Fewer idle scooters means less sidewalk clutter, lower redistribution costs, and improved sustainability metrics.
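A very rough sketch of how a daily utilization forecast could translate into a fleet size is shown below. It is not the authors' method: it assumes that expected trips equal predicted utilization times the current fleet, that demand does not change when supply changes (a real simplification), and that the operator picks a target utilization and fleet bounds. All names and numbers other than the 1,200-vehicle cap are hypothetical.

```python
from math import ceil


def recommended_fleet(predicted_utilization, current_fleet,
                      target_utilization, min_fleet, max_fleet):
    """Fleet size that would hit the target utilization, clamped to bounds."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    expected_trips = predicted_utilization * current_fleet
    wanted = ceil(expected_trips / target_utilization)
    return max(min_fleet, min(max_fleet, wanted))


# Quiet day: 0.5 trips/vehicle forecast on a 1,200-vehicle fleet.
quiet_day = recommended_fleet(0.5, 1200, target_utilization=1.0,
                              min_fleet=100, max_fleet=1200)
# Busy day: the forecast exceeds the target, so the fleet is capped at the maximum.
busy_day = recommended_fleet(1.5, 1200, target_utilization=1.0,
                             min_fleet=100, max_fleet=1200)
print(quiet_day, busy_day)
```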

🎉

Special Events Need Special Planning

Austin's SXSW festival drove 5–6× normal demand — an extreme outlier the model struggled with. Event calendars should be integrated as explicit features in future model versions, with dedicated redistribution protocols.
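One simple way to make an event calendar an explicit feature is a binary flag per day, sketched below. The calendar format, the function name, and the example date window are placeholders for illustration, not taken from the paper.

```python
from datetime import date


def event_flag(day: date, event_windows: list[tuple[date, date]]) -> int:
    """Return 1 if `day` falls inside any (start, end) event window, else 0."""
    return int(any(start <= day <= end for start, end in event_windows))


# Placeholder calendar with a single ten-day festival window.
festival_windows = [(date(2019, 3, 8), date(2019, 3, 17))]

during_festival = event_flag(date(2019, 3, 10), festival_windows)
ordinary_day = event_flag(date(2019, 4, 1), festival_windows)
print(during_festival, ordinary_day)
```

A flag like this lets a tree-based model learn a separate demand level for event days instead of treating them as noise.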

📖

Open Data Publication Is a Policy Tool

The study would be impossible without cities publishing trip data. The authors explicitly call for more cities to follow suit — not just for research, but because transparency creates accountability and improves service quality.

The Full Technical Story

The paper contains the complete model specifications, all coefficient tables, spatial error maps, and full feature engineering procedures. Everything is reproducible with open data.


Abouelela, M., Lyu, C., & Antoniou, C. (2023). Exploring the potentials of open-source big data and machine learning in shared mobility fleet utilization prediction. Data Science for Transportation, 5, 5. https://doi.org/10.1007/s42421-023-00068-9
