Table of Contents
- Introduction
- Getting Started
- Understanding the Basics
- Practical Examples
- Real-World Application
- Common Mistakes to Avoid
- Next Steps
- Additional Resources
Introduction
What is Linear Referencing?
Linear referencing (also called Linear Referencing System or LRS): A method for describing locations along linear features (like roads, trails, or rivers) using measurements instead of traditional X,Y coordinates.
Think of it like this: instead of saying “the pothole is at coordinates -70.12345, 43.67890,” you say “the pothole is 250 meters along Main Street.” This approach is especially powerful because:
- The reference stays valid even if the line geometry changes - if the road curves slightly due to construction, your reference point automatically adjusts
- You can describe segments easily - “from 100m to 150m needs repair” is clearer than storing multiple coordinate pairs
- Overlapping events are allowed - unlike traditional line geometry, multiple issues can exist at the same location
Why is Linear Referencing Useful?
Let’s consider a real-world scenario that demonstrates why linear referencing matters:
Problem: You manage a trail network and need to track maintenance issues. Crews walk the trails weekly and record problems using GPS devices. However:
- GPS has errors (typically 3-10 meters)
- You don’t want to create new line geometries every time something changes
- Multiple issues might overlap on the same trail section
- The trail network itself occasionally changes (erosion, rerouting)
Solution: Linear referencing allows you to:
- Snap GPS observations to the nearest trail automatically
- Store issues as measurements along trails (e.g., “200m from the start”)
- Describe segments (e.g., “from 200m to 210m needs repair”)
- Keep all references valid even when trail geometry changes
- Easily query “show me all issues on Trail A” or “find all problems in the first 500 meters”
How Linear Referencing Works in PostGIS
PostGIS (the spatial extension for PostgreSQL) implements linear referencing using m-values (measure values). Here’s the basic workflow:
- You have a line layer (e.g., trails, roads, rivers) with standard geometry
- You have point observations (e.g., GPS locations where issues were recorded)
- PostGIS calculates where each point falls along its nearest line as a fractional value (0 to 1)
- 0 = start of the line
- 0.5 = middle of the line
- 1 = end of the line
- You can create new points “snapped” to the exact line location
- You can create line segments that reference portions of the original line
The beauty is that steps 4 and 5 create new spatial features that reference the original line rather than duplicating its geometry.
Getting Started
Prerequisites
Before beginning, ensure you have:
- PostgreSQL with PostGIS extension installed
- PostgreSQL 12 or later recommended
- PostGIS 3.0 or later recommended
- A database with spatial data, including:
- A line layer (in this guide, we use
tracksin themyschemaschema) - A point layer (in this guide, we use
obsin themyschemaschema)
- A line layer (in this guide, we use
- Database connection credentials:
- Host:
127.0.0.1(localhost) - Database:
mydatabase - User:
spatial_user - Password: (your password)
- Host:
- A PostgreSQL client such as:
psqlcommand-line tool- pgAdmin graphical interface
- DBeaver or similar database IDE
Connecting to Your Database
To connect to the database from the command line:
psql -U spatial_user -h 127.0.0.1 mydatabase
You’ll be prompted for your password. Once connected, you should see a prompt like:
mydatabase=>
Tip: To make connections easier, you can create a
.pgpassfile in your home directory to store your password securely. See the PostgreSQL documentation for details.
Understanding the Basics
The Key Concept: M-Values
In traditional GIS, geometries have X and Y coordinates (and sometimes Z for elevation). Linear referencing adds an M value (measure) that represents position along a line.
Example: Imagine a 1000-meter trail:
| Location | Description | M-Value (fraction) | M-Value (meters) |
|---|---|---|---|
| Start | Trailhead | 0.0 | 0m |
| 1/4 way | First bridge | 0.25 | 250m |
| Middle | Viewpoint | 0.5 | 500m |
| 3/4 way | Water source | 0.75 | 750m |
| End | Summit | 1.0 | 1000m |
The M-value can be expressed as:
- A fraction (0 to 1): useful for calculations
- A distance in your units (meters, feet, etc.): useful for reporting
Essential PostGIS Functions
Here are the core PostGIS functions we’ll use for linear referencing:
| Function | What It Does | Example Use |
|---|---|---|
ST_Distance(geom1, geom2) |
Calculates shortest distance between two geometries | Find how far a GPS point is from a trail |
ST_DWithin(geom1, geom2, distance) |
Returns true if geometries are within specified distance | Filter to only points near trails (within 200m) |
ST_LineLocatePoint(line, point) |
Returns fraction (0-1) where point projects onto line | Calculate where along a trail an observation falls |
ST_LineInterpolatePoint(line, fraction) |
Returns a point at the given fraction along a line | Create a new point snapped to the exact trail location |
ST_LineSubstring(line, start_fraction, end_fraction) |
Returns a segment of a line between two fractions | Create a 10m segment centered on an observation |
ST_Length(line) |
Returns the length of a line in your unit system | Get trail length in meters |
ST_GeometryN(geom, n) |
Extracts the nth geometry from a collection | Ensure we’re working with a single line |
Note: Most of these functions work with the geometry’s native units. If you’re using a projected coordinate system like UTM, distances will be in meters. If using geographic coordinates (latitude/longitude), you may need to use the
::geographycast for accurate meter-based distances.
Practical Examples
Now let’s work through a complete example using your actual data: the obs (observations) point layer and tracks line layer in the myschema schema.
Step 1: Exploring Your Data
First, let’s understand what data we have:
-- Connect to the database
\c mydatabase
-- Check what tables exist in myschema
\dt myschema.*
-- Look at the structure of the observations table
\d myschema.obs
-- Look at the structure of the tracks table
\d myschema.tracks
-- Count records in each table
SELECT COUNT(*) as num_observations FROM myschema.obs;
SELECT COUNT(*) as num_tracks FROM myschema.tracks;
-- Check the coordinate system (SRID) being used
SELECT ST_SRID(geom) as srid FROM myschema.obs LIMIT 1;
SELECT ST_SRID(geom) as srid FROM myschema.tracks LIMIT 1;
-- Get basic statistics about track lengths
SELECT
COUNT(*) as num_tracks,
ROUND(AVG(ST_Length(geom))::numeric, 2) as avg_length_meters,
ROUND(MIN(ST_Length(geom))::numeric, 2) as min_length_meters,
ROUND(MAX(ST_Length(geom))::numeric, 2) as max_length_meters,
ROUND(SUM(ST_Length(geom))::numeric, 2) as total_length_meters
FROM myschema.tracks;
Expected Output:
num_tracks | avg_length_meters | min_length_meters | max_length_meters | total_length_meters
------------+-------------------+-------------------+-------------------+---------------------
147 | 89.83 | 1.70 | 782.59 | 13205.13
This tells us:
- We have 147 track segments
- Average track is about 90 meters long
- Tracks range from 1.7m to 783m
- Total network is about 13.2 km
Why This Matters: Understanding your data helps you choose appropriate distance thresholds. For example, if your shortest track is 45m, you wouldn’t want to snap observations from 200m away!
Step 2: Finding the Nearest Line
Before we can calculate measurements, we need to find which track line each observation point is nearest to.
-- Find the nearest track for each observation
-- This uses ST_DWithin to pre-filter to reasonable candidates (within 200m)
-- Then ST_Distance to calculate exact distances
SELECT
obs.id as obs_id,
tracks.id as track_id,
ROUND(ST_Distance(obs.geom, tracks.geom)::numeric, 2) as distance_meters
FROM myschema.obs
CROSS JOIN LATERAL (
SELECT id, geom
FROM myschema.tracks
WHERE ST_DWithin(obs.geom, tracks.geom, 200) -- Only consider tracks within 200m
ORDER BY obs.geom <-> tracks.geom -- Use distance operator for quick sorting
LIMIT 1
) tracks
ORDER BY obs.id;
What This Query Does:
CROSS JOIN LATERAL: For each observation, finds matching tracksST_DWithin(..., 200): Filters to only tracks within 200 meters (reduces computation)obs.geom <-> tracks.geom: Special PostGIS operator that quickly estimates distance for sortingLIMIT 1: Takes only the nearest trackST_Distance(...): Calculates the precise distance
Expected Output:
obs_id | track_id | distance_meters
--------+----------+-----------------
1 | 164 | 52.98
2 | 66 | 40.05
3 | 19 | 30.24
Understanding Distance: Our observations are between 30-53 meters from the nearest track. This is typical GPS error in forested or urban areas. If you see distances over 50 meters, your GPS observations might be quite inaccurate, or the observation might not actually be near a track. This is important for data quality checking!
Step 3: Calculating Measures
Now let’s calculate the m-value (measure) for each observation - where along the track line it falls.
-- Calculate measures for each observation
WITH nearest_tracks AS (
SELECT
obs.id as obs_id,
obs.geom as obs_geom,
tracks.id as track_id,
tracks.geom as track_geom,
ST_Length(tracks.geom) as track_length_meters,
ST_Distance(obs.geom, tracks.geom) as distance_to_track
FROM myschema.obs
CROSS JOIN LATERAL (
SELECT id, geom
FROM myschema.tracks
WHERE ST_DWithin(obs.geom, tracks.geom, 200)
ORDER BY obs.geom <-> tracks.geom
LIMIT 1
) tracks
)
SELECT
obs_id,
track_id,
ROUND(track_length_meters::numeric, 2) as track_length_m,
ROUND(distance_to_track::numeric, 2) as dist_to_track_m,
-- Calculate the measure (fraction from 0 to 1)
ROUND(ST_LineLocatePoint(track_geom, obs_geom)::numeric, 4) as measure_fraction,
-- Calculate the measure in meters
ROUND((ST_LineLocatePoint(track_geom, obs_geom) * track_length_meters)::numeric, 2) as measure_meters
FROM nearest_tracks
ORDER BY track_id, measure_fraction;
What This Query Does:
- CTE (Common Table Expression): The
WITH nearest_tracks AS (...)creates a temporary result set we can reference - ST_LineLocatePoint(line, point): Returns the fraction (0-1) along the line where the point projects
- Multiply by length: Converting fraction to actual meters makes it more meaningful
Expected Output:
obs_id | track_id | track_length_m | dist_to_track_m | measure_fraction | measure_meters
--------+----------+----------------+-----------------+------------------+----------------
3 | 19 | 727.00 | 30.24 | 0.0377 | 27.42
2 | 66 | 306.08 | 40.05 | 0.2049 | 62.71
1 | 164 | 217.92 | 52.98 | 0.1313 | 28.62
Interpreting Results:
- Observation 3 is 27.4m along Track 19 (which is 727m total) - very near the start
- Observation 2 is 62.7m along Track 66 (which is 306m total) - about 20% along
- Observation 1 is 28.6m along Track 164 (which is 218m total) - about 13% along
Key Insight: The measure_fraction will always be between 0 and 1, making it perfect for calculations regardless of line length.
Step 4: Creating Snapped Points
Now let’s create new point features that are snapped exactly to the track lines. These “event points” are located precisely on the line, not offset by GPS error.
-- Create a table of event points snapped to tracks
DROP TABLE IF EXISTS myschema.event_points;
CREATE TABLE myschema.event_points AS
WITH nearest_tracks AS (
SELECT
obs.id as obs_id,
obs.geom as obs_geom,
tracks.id as track_id,
tracks.geom as track_geom,
ST_Length(tracks.geom) as track_length,
ST_Distance(obs.geom, tracks.geom) as distance_to_track
FROM myschema.obs
CROSS JOIN LATERAL (
SELECT id, geom
FROM myschema.tracks
WHERE ST_DWithin(obs.geom, tracks.geom, 200)
ORDER BY obs.geom <-> tracks.geom
LIMIT 1
) tracks
),
calculated_measures AS (
SELECT
obs_id,
track_id,
track_geom,
track_length,
distance_to_track,
ST_LineLocatePoint(track_geom, obs_geom) as measure,
ST_LineLocatePoint(track_geom, obs_geom) * track_length as measure_meters
FROM nearest_tracks
)
SELECT
-- Create the snapped point using ST_LineInterpolatePoint
ST_LineInterpolatePoint(track_geom, measure) as geom,
obs_id,
track_id,
ROUND(measure::numeric, 4) as measure,
ROUND(measure_meters::numeric, 2) as measure_meters,
ROUND(track_length::numeric, 2) as track_length_meters,
ROUND(distance_to_track::numeric, 2) as original_gps_error_meters
FROM calculated_measures;
-- Add a primary key
ALTER TABLE myschema.event_points ADD PRIMARY KEY (obs_id);
-- Create a spatial index for better performance
CREATE INDEX event_points_geom_idx ON myschema.event_points USING GIST (geom);
-- View the results
SELECT * FROM myschema.event_points ORDER BY track_id, measure;
What This Query Does:
- Nested CTEs: Builds on previous logic to calculate measures
- ST_LineInterpolatePoint(line, fraction): Creates a new point at the exact fractional location on the line
- Creates a physical table:
event_pointsstores the results - Adds indexes: Primary key and spatial index for performance
Expected Output:
obs_id | track_id | measure | measure_meters | track_length_meters | original_gps_error_meters
--------+----------+---------+----------------+---------------------+---------------------------
3 | 19 | 0.0377 | 27.42 | 727.00 | 30.24
2 | 66 | 0.2049 | 62.71 | 306.08 | 40.05
1 | 164 | 0.1313 | 28.62 | 217.92 | 52.98
What Changed: The
geomcolumn in this table contains points that are exactly on the track lines, not the original GPS locations. Theoriginal_gps_error_meterscolumn shows how far we moved each point to snap it.
Step 5: Building Linear Segments
Finally, let’s create linear segments that represent sections of track. This is useful when an observation describes a problem area that extends along the track (e.g., “10 meters of eroded trail”).
For this example, we’ll create 10-meter segments centered on each observation point. In a real application, you might have a size field in your observations table indicating how many meters are affected.
-- Create linear segments from event points
-- Using a fixed 10-meter segment for demonstration
DROP TABLE IF EXISTS myschema.track_segments;
CREATE TABLE myschema.track_segments AS
WITH event_measures AS (
SELECT
ep.obs_id,
ep.track_id,
ep.measure,
ep.measure_meters,
ep.track_length_meters,
t.geom as track_geom,
10.0 as affected_size_meters -- Fixed 10m segment for demo
FROM myschema.event_points ep
JOIN myschema.tracks t ON ep.track_id = t.id
),
segment_measures AS (
SELECT
obs_id,
track_id,
track_geom,
measure,
measure_meters,
track_length_meters,
affected_size_meters,
-- Calculate lower bound (start of affected area)
-- Subtract half the affected size from the observation point
GREATEST(0, (measure_meters - (affected_size_meters / 2)) / track_length_meters) as lower_measure,
-- Calculate upper bound (end of affected area)
-- Add half the affected size to the observation point
LEAST(1, (measure_meters + (affected_size_meters / 2)) / track_length_meters) as upper_measure
FROM event_measures
)
SELECT
obs_id,
track_id,
-- Create line segment using ST_LineSubstring
ST_LineSubstring(
track_geom,
lower_measure,
upper_measure
) as geom,
ROUND(measure::numeric, 4) as center_measure,
ROUND(lower_measure::numeric, 4) as start_measure,
ROUND(upper_measure::numeric, 4) as end_measure,
ROUND(measure_meters::numeric, 2) as center_meters,
ROUND((lower_measure * track_length_meters)::numeric, 2) as start_meters,
ROUND((upper_measure * track_length_meters)::numeric, 2) as end_meters,
ROUND(affected_size_meters::numeric, 2) as segment_length_meters
FROM segment_measures;
-- Add primary key and spatial index
ALTER TABLE myschema.track_segments ADD COLUMN segment_id SERIAL PRIMARY KEY;
CREATE INDEX track_segments_geom_idx ON myschema.track_segments USING GIST (geom);
-- View the results
SELECT * FROM myschema.track_segments ORDER BY track_id, start_measure;
What This Query Does:
- Joins data together: Combines event points, tracks, and original observations
- Calculates bounds: Creates lower and upper measures by extending size/2 in each direction
- GREATEST() and LEAST(): Ensures measures stay within valid 0-1 range
- ST_LineSubstring(line, start, end): Extracts the line segment between two fractional positions
Expected Output:
obs_id | track_id | center_measure | start_measure | end_measure | center_meters | start_meters | end_meters | segment_length_meters
--------+----------+----------------+---------------+-------------+---------------+--------------+------------+-----------------------
3 | 19 | 0.0377 | 0.0308 | 0.0446 | 27.42 | 22.42 | 32.42 | 10.00
2 | 66 | 0.2049 | 0.1885 | 0.2212 | 62.71 | 57.71 | 67.71 | 10.00
1 | 164 | 0.1313 | 0.1084 | 0.1543 | 28.62 | 23.62 | 33.62 | 10.00
Interpreting Results:
- Observation 3 created a 10-meter segment centered at 27.4m along Track 19 (from 22.4m to 32.4m)
- Observation 2 created a 10-meter segment centered at 62.7m along Track 66 (from 57.7m to 67.7m)
- Observation 1 created a 10-meter segment centered at 28.6m along Track 164 (from 23.6m to 33.6m)
- These segments are actual line geometries that can be displayed in GIS software
Advanced Tip: Instead of creating physical tables, you could create these as database views. This means any time you add a new observation to the
obstable, the event points and segments would automatically update!
Here’s how to create them as views:
-- Create event_points as a view instead of a table
CREATE OR REPLACE VIEW myschema.v_event_points AS
WITH nearest_tracks AS (
-- ... same logic as before ...
)
SELECT * FROM nearest_tracks;
-- Create track_segments as a view
CREATE OR REPLACE VIEW myschema.v_track_segments AS
WITH event_measures AS (
-- ... same logic as before ...
)
SELECT * FROM event_measures;
Real-World Application
Let’s put this all together with a practical query that managers might use:
-- Summary report: Which tracks have the most issues?
SELECT
t.id as track_id,
t.name as track_name,
t.highway,
COUNT(DISTINCT s.obs_id) as num_issues,
ROUND(SUM(s.segment_length_meters)::numeric, 2) as total_affected_meters,
ROUND((SUM(s.segment_length_meters) / ST_Length(t.geom) * 100)::numeric, 2) as percent_affected,
ROUND(AVG(ep.original_gps_error_meters)::numeric, 2) as avg_gps_error_meters
FROM myschema.tracks t
LEFT JOIN myschema.track_segments s ON t.id = s.track_id
LEFT JOIN myschema.event_points ep ON s.obs_id = ep.obs_id
WHERE s.obs_id IS NOT NULL
GROUP BY t.id, t.name, t.highway, t.geom
ORDER BY total_affected_meters DESC;
Expected Output:
track_id | track_name | highway | num_issues | total_affected_meters | percent_affected | avg_gps_error_meters
----------+---------------+---------+------------+-----------------------+------------------+----------------------
19 | Porter Avenue | service | 1 | 10.00 | 1.38 | 30.24
66 | Riddle Road | service | 1 | 10.00 | 3.27 | 40.05
164 | | track | 1 | 10.00 | 4.59 | 52.98
This tells managers:
- Porter Avenue (a service road) has 1 issue affecting 10 meters (1.38% of the road)
- Riddle Road (also a service road) has 1 issue affecting 10 meters (3.27% of the road)
- An unnamed track has 1 issue affecting 10 meters (4.59% of the track)
- GPS observations had an average error ranging from 30-53 meters, which is typical for consumer GPS devices
Common Mistakes to Avoid
-
Using Geographic Coordinates Without Conversion
Problem: If your data uses latitude/longitude (EPSG:4326),
ST_Distancereturns degrees, not meters.Solution: Either:
- Convert to a projected coordinate system (like UTM)
- Use
::geographycast:ST_Distance(geom1::geography, geom2::geography)
-
Not Checking Distance Thresholds
Problem: Using too large a distance threshold (e.g., 1000m) might snap observations to wrong tracks.
Solution: Start with a reasonable threshold (50-200m) and review results.
-
Forgetting to Handle Edge Cases
Problem: Observations near the start or end of lines can create invalid measures (< 0 or > 1).
Solution: Use
GREATEST(0, ...)andLEAST(1, ...)to clamp values. -
Not Creating Indexes
Problem: Queries are slow on large datasets.
Solution: Always create spatial indexes:
CREATE INDEX idx_name ON table_name USING GIST (geom); -
Assuming Lines Are Single Geometries
Problem: Some “lines” are actually MultiLineStrings (collections of lines).
Solution: Use
ST_GeometryN(geom, 1)to extract the first line, or handle multi-geometries explicitly.
Next Steps
Now that you understand linear referencing basics, you can:
- Add More Attributes: Include observation dates, severity levels, or maintenance status
- Create Advanced Queries: Find overlapping segments, calculate maintenance costs, prioritize work
- Build Automated Workflows: Use database triggers to automatically create segments when new observations are added
- Integrate with Applications: Connect mobile field apps to directly insert observations
- Visualize Results: Load your event points and segments into QGIS, ArcGIS, or web mapping applications
Recommended Learning Path:
- Practice the examples in this post with your own data
- Learn about PostgreSQL views and materialized views for dynamic updates
- Explore PostGIS functions for more advanced spatial analysis
- Study database triggers and functions for automation
- Investigate tools like pg_tileserv or GeoServer for sharing your data
Additional Resources
Official Documentation:
- PostGIS Linear Referencing Functions - Complete function reference
- PostGIS Workshop: Linear Referencing - Interactive tutorial
- PostgreSQL Documentation - Core database concepts
Conceptual Background:
- GIS Geography: Linear Referencing Systems - High-level overview
- FHWA Linear Referencing - Government standard (PDF)
Community Help:
- GIS Stack Exchange - Q&A for GIS professionals
- PostGIS Users Mailing List - Active community support
- PostgreSQL Documentation - Core database features
Related Topics:
- Database views and materialized views for dynamic data
- Spatial indexing strategies for performance
- QGIS integration for visualization
- Building REST APIs with PostGIS data
This tutorial provides a foundation for working with linear referencing in PostGIS. The techniques shown here can be adapted for roads, rivers, pipelines, utility networks, or any linear feature where you need to track events or conditions along the feature.