Rebuilding Broken GTFS Shapes: How We Redraw the World's Transit Maps

If you have ever tried to build a map from a raw GTFS feed, you have probably discovered an uncomfortable truth: a lot of them are broken. Not in the catastrophic, feed-won't-parse sense. In the quieter, more infuriating sense. The trips are there. The stops are there. The schedule is there. But the shape of the route — the actual line you want to draw on a map — is missing, sparse, or plainly wrong.
At trains.fyi we track more than 130 rail networks worldwide, and somewhere around a third of the feeds we ingest have shape data that is not really fit to render. This post is about how we handle that.
The problem, briefly
The GTFS spec includes an optional file called shapes.txt. Each shape in it is an ordered sequence of latitude-longitude points describing the physical path a vehicle takes along a trip. When it is present and accurate, everything is easy: you draw the line, the train icon slides along it, and the map looks like a map.
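For readers who have not opened the file before, here is a minimal sketch of reading it. The column names (shape_id, shape_pt_lat, shape_pt_lon, shape_pt_sequence) come from the GTFS spec; the function name load_shapes and the sample rows are made up for illustration, and a real feed would be read from disk rather than from a string.

```python
import csv
import io
from collections import defaultdict

def load_shapes(shapes_file):
    """Group shapes.txt rows into {shape_id: [(lat, lon), ...]}, ordered by shape_pt_sequence."""
    rows = defaultdict(list)
    for row in csv.DictReader(shapes_file):
        rows[row["shape_id"]].append(
            (int(row["shape_pt_sequence"]),
             float(row["shape_pt_lat"]),
             float(row["shape_pt_lon"]))
        )
    # Rows are not guaranteed to arrive in order, so sort on the sequence number.
    return {sid: [(lat, lon) for _, lat, lon in sorted(pts)] for sid, pts in rows.items()}

# Hypothetical sample data, deliberately out of order.
SAMPLE = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
A,47.00,8.00,2
A,46.90,7.90,1
A,47.10,8.20,3
"""

shapes = load_shapes(io.StringIO(SAMPLE))
```

Sorting on shape_pt_sequence matters because the spec orders points by that column, not by their position in the file.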
When it is missing, your options are grim. The classic fallback is to draw straight lines between consecutive stops. This works for a handful of tiny metros where stations already sit in a straight line. For almost every other rail system on earth, it falls apart. Long-distance intercity routes become chords across mountain ranges. Curved suburban branches become zigzags. Tunnels and bridges disappear entirely.
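To make the cost of that fallback concrete, the sketch below compares the length of a straight chord between two stops with the length of a curved path between the same stops. Everything here is illustrative: the coordinates are invented, and the helper names (haversine_m, straight_line_shape, path_length_m) are ours for this example only.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    r = 6371000.0  # mean Earth radius in meters
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))

def straight_line_shape(stops):
    """The classic fallback: the stop coordinates themselves, joined in order."""
    return list(stops)

def path_length_m(points):
    """Total length of a polyline in meters."""
    return sum(haversine_m(p, q) for p, q in zip(points, points[1:]))

# Hypothetical data: two stops, and a curved alignment that connects them.
stops = [(46.00, 7.00), (46.00, 7.20)]
track = [(46.00, 7.00), (46.05, 7.05), (46.08, 7.10), (46.05, 7.15), (46.00, 7.20)]

chord_m = path_length_m(straight_line_shape(stops))
track_m = path_length_m(track)
detour_ratio = track_m / chord_m  # greater than 1 whenever the track curves
```

A ratio well above 1 is a hint that a straight-line rendering will look visibly wrong for that stop pair.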
And "missing" is only the obvious failure mode. We see feeds where:
- Only a subset of routes has shapes, and the rest inherit the straight-line treatment.
- The shapes exist but follow obsolete alignments the railway stopped using years ago.
- Every trip on a route points at a single shape that passes the stops in the wrong order.
- The shape snake-dances between the stops because somebody interpolated from a low-resolution source.
Each of these makes for a bad map. A bad map makes for a bad product.
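One of these failure modes, a shape that passes the stops in the wrong order, can be caught with a simple check: project each stop onto the shape and confirm that the distances along the shape never decrease. The sketch below is one generic way to do that, not a description of our pipeline. The function names are hypothetical, and it uses a flat-earth (equirectangular) approximation that is only reasonable over the extent of a single route.

```python
import math

def _to_xy(lat, lon, lat0):
    """Equirectangular projection to meters, relative to reference latitude lat0."""
    r = 6371000.0
    return (math.radians(lon) * r * math.cos(math.radians(lat0)), math.radians(lat) * r)

def distance_along(shape, point):
    """Meters from the start of `shape` to the point on it closest to `point`."""
    lat0 = shape[0][0]
    pts = [_to_xy(lat, lon, lat0) for lat, lon in shape]
    px, py = _to_xy(point[0], point[1], lat0)
    best_d2, best_along, walked = float("inf"), 0.0, 0.0
    for (ax, ay), (bx, by) in zip(pts, pts[1:]):
        dx, dy = bx - ax, by - ay
        seg2 = dx * dx + dy * dy
        # Parameter of the closest point on this segment, clamped to [0, 1].
        t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
        cx, cy = ax + t * dx, ay + t * dy
        d2 = (px - cx) ** 2 + (py - cy) ** 2
        seg_len = math.sqrt(seg2)
        if d2 < best_d2:
            best_d2, best_along = d2, walked + t * seg_len
        walked += seg_len
    return best_along

def stops_in_shape_order(shape, stops):
    """True if the stops project onto the shape at non-decreasing distances."""
    along = [distance_along(shape, s) for s in stops]
    return all(a <= b for a, b in zip(along, along[1:]))

# Hypothetical shape and stops along the equator.
shape = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
stops = [(0.0, 0.1), (0.0, 1.0), (0.0, 1.9)]
forward_ok = stops_in_shape_order(shape, stops)
reversed_ok = stops_in_shape_order(shape, stops[::-1])
```

This naive version takes the globally closest point for each stop, so it can misjudge loops and routes that double back on themselves. Those cases need a projection that only searches forward from the previous stop.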
The insight
Here is the thing. Even when a feed's geometry is a disaster, two pieces of information are almost always reliable:
- The stop locations. Agencies care a lot about where their stations are. Stop coordinates are usually accurate to within a few meters.
- The order of stops on a trip. Even the roughest feeds get the sequence in stop_times.txt right. They have to: passengers need it.
That gives us a skeleton: a sequence of points in the right order, on or very near the actual railway. The thing we are missing is the curve of the track between them.
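As a minimal sketch, the skeleton for one trip can be assembled by joining stops.txt and stop_times.txt on stop_id and sorting by stop_sequence. The column names are from the GTFS spec; the function name trip_skeleton and the sample rows are hypothetical.

```python
import csv
import io

def trip_skeleton(stops_file, stop_times_file, trip_id):
    """Return [(stop_id, lat, lon), ...] for one trip, in stop_sequence order."""
    coords = {
        r["stop_id"]: (float(r["stop_lat"]), float(r["stop_lon"]))
        for r in csv.DictReader(stops_file)
    }
    visits = [
        (int(r["stop_sequence"]), r["stop_id"])
        for r in csv.DictReader(stop_times_file)
        if r["trip_id"] == trip_id
    ]
    # stop_sequence must increase along a trip, but rows may appear in any order.
    return [(stop_id, *coords[stop_id]) for _, stop_id in sorted(visits)]

# Hypothetical sample feed fragments.
STOPS = """stop_id,stop_name,stop_lat,stop_lon
S1,Alpha,46.0,7.0
S2,Beta,46.1,7.2
S3,Gamma,46.3,7.3
"""
STOP_TIMES = """trip_id,stop_sequence,stop_id
T1,3,S3
T1,1,S1
T2,1,S3
T1,2,S2
"""

skeleton = trip_skeleton(io.StringIO(STOPS), io.StringIO(STOP_TIMES), "T1")
```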
Fortunately, that curve is not a secret. The world's railways have been mapped, in exhaustive detail, by an enormous community of volunteers and open data projects. Railway alignments are among the most thoroughly catalogued features on Earth.
Our approach
The feed's stops and stop sequence give us a skeleton. Open rail infrastructure data gives us the curves between them. The job of our pipeline is to marry the two so that every route traces real track, in the correct order, from the first station to the last.
The result is geometry that looks and behaves as if the agency had published it itself, even in cases where it published nothing at all. Train icons follow real track. Routes curve through mountains. Underground stretches trace the actual tunnel, not the street above it.
None of this is hand-drawn. The whole thing is automated end to end, runs on every sync, and updates as upstream data improves.
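This post does not go into the internals of the pipeline, so the following is only a generic illustration of the idea, not our implementation. It assumes the track data has already been turned into a graph (nodes with coordinates, edges between connected nodes), snaps each stop to its nearest node, and chains shortest paths between consecutive stops. The names rebuild_shape and shortest_path, the graph layout, and the sample data are all hypothetical.

```python
import heapq
import math

def _dist(a, b):
    """Approximate distance in meters between two (lat, lon) points."""
    r = 6371000.0
    x = math.radians(b[1] - a[1]) * math.cos(math.radians((a[0] + b[0]) / 2))
    y = math.radians(b[0] - a[0])
    return r * math.hypot(x, y)

def shortest_path(nodes, edges, start, goal):
    """Dijkstra over an undirected track graph; returns node ids from start to goal."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        w = _dist(nodes[a], nodes[b])
        adj[a].append((b, w))
        adj[b].append((a, w))
    best, prev, queue = {start: 0.0}, {}, [(0.0, start)]
    while queue:
        d, n = heapq.heappop(queue)
        if n == goal:
            break
        if d > best.get(n, math.inf):
            continue  # stale queue entry
        for m, w in adj[n]:
            nd = d + w
            if nd < best.get(m, math.inf):
                best[m], prev[m] = nd, n
                heapq.heappush(queue, (nd, m))
    if goal not in best:
        raise ValueError(f"no track path from {start} to {goal}")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

def rebuild_shape(nodes, edges, stops):
    """Snap each stop to its nearest track node, then join shortest paths in stop order."""
    snapped = [min(nodes, key=lambda n: _dist(nodes[n], s)) for s in stops]
    shape = [nodes[snapped[0]]]
    for a, b in zip(snapped, snapped[1:]):
        # Skip the first node of each leg; it is already the last point of the shape.
        shape.extend(nodes[n] for n in shortest_path(nodes, edges, a, b)[1:])
    return shape

# Hypothetical track graph: a curved line a-b-c-d plus a dead-end spur a-x.
nodes = {"a": (46.00, 7.00), "b": (46.02, 7.05), "c": (46.05, 7.06),
         "d": (46.06, 7.10), "x": (46.00, 7.10)}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "x")]
stops = [(46.0001, 7.0001), (46.0601, 7.0999)]

shape = rebuild_shape(nodes, edges, stops)
```

A production system would have much more to worry about than this sketch does: stops that sit between nodes, parallel tracks, one-way junctions, and places where the shortest path is not the one the train actually takes.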
Why it matters
A map is the first thing a user sees. If the lines are wrong, they stop trusting everything else on the screen — the schedules, the real-time positions, the ETAs. If the lines are right, the rest of the product has a chance to earn that trust.
There is also a second-order effect. Once you have a faithful shape, a lot of downstream work gets easier: interpolating a vehicle's position between two GPS pings, detecting whether a reported position is plausible, computing corridor-based search radii, snapping noisy realtime data to a track. All of it becomes tractable the moment the underlying geometry is accurate.
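As one small example of that downstream work, here is a sketch of placing a vehicle a given fraction of the way along a shape, which is the core of interpolating between two known positions. The function name point_at_fraction and the sample shape are hypothetical, and the distance helper uses a flat-earth approximation that is only suitable for short segments.

```python
import math

def _seg_len(a, b):
    """Approximate length in meters of the segment between two (lat, lon) points."""
    r = 6371000.0
    x = math.radians(b[1] - a[1]) * math.cos(math.radians((a[0] + b[0]) / 2))
    y = math.radians(b[0] - a[0])
    return r * math.hypot(x, y)

def point_at_fraction(shape, fraction):
    """Return the (lat, lon) point `fraction` (0..1) of the way along `shape` by length."""
    lengths = [_seg_len(a, b) for a, b in zip(shape, shape[1:])]
    target = max(0.0, min(1.0, fraction)) * sum(lengths)
    for a, b, seg in zip(shape, shape[1:], lengths):
        if seg > 0 and target <= seg:
            t = target / seg
            # Linear interpolation within the segment.
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= seg
    return shape[-1]  # fraction == 1.0, or rounding pushed us past the end

# Hypothetical shape along the equator: one short segment, one long one.
shape = [(0.0, 0.0), (0.0, 1.0), (0.0, 3.0)]
halfway = point_at_fraction(shape, 0.5)
```

Because the interpolation follows the shape rather than a straight line between pings, the icon stays on the track through curves. That only helps if the shape is accurate in the first place.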
The takeaway
GTFS is a quietly extraordinary piece of infrastructure. It powers nearly every serious transit map, tracker, and journey planner in the world. But the spec is permissive by design, which means feed quality varies widely from agency to agency. Shipping a product that works across 130-plus networks means accepting that reality and engineering around it.
If you want to see the results, open any network page on trains.fyi and look at the route lines. Some are the agency's own shapes, faithfully rendered. Some are ours, reconstructed from scratch.

