Linear referencing events in PostGIS

A beginner's guide to snapping points to lines and describing linear conditions

Table of Contents

Introduction

What is Linear Referencing?

Linear referencing (also called Linear Referencing System or LRS): A method for describing locations along linear features (like roads, trails, or rivers) using measurements instead of traditional X,Y coordinates.

Think of it like this: instead of saying “the pothole is at coordinates -70.12345, 43.67890,” you say “the pothole is 250 meters along Main Street.” This approach is especially powerful because:

  1. The reference stays valid even if the line geometry changes - if the road curves slightly due to construction, your reference point automatically adjusts
  2. You can describe segments easily - “from 100m to 150m needs repair” is clearer than storing multiple coordinate pairs
  3. Overlapping events are allowed - unlike traditional line geometry, multiple issues can exist at the same location

Why is Linear Referencing Useful?

Let’s consider a real-world scenario that demonstrates why linear referencing matters:

Problem: You manage a trail network and need to track maintenance issues. Crews walk the trails weekly and record problems using GPS devices. However:

  • GPS has errors (typically 3-10 meters)
  • You don’t want to create new line geometries every time something changes
  • Multiple issues might overlap on the same trail section
  • The trail network itself occasionally changes (erosion, rerouting)

Solution: Linear referencing allows you to:

  1. Snap GPS observations to the nearest trail automatically
  2. Store issues as measurements along trails (e.g., “200m from the start”)
  3. Describe segments (e.g., “from 200m to 210m needs repair”)
  4. Keep all references valid even when trail geometry changes
  5. Easily query “show me all issues on Trail A” or “find all problems in the first 500 meters”

How Linear Referencing Works in PostGIS

PostGIS (the spatial extension for PostgreSQL) implements linear referencing using m-values (measure values). Here’s the basic workflow:

  1. You have a line layer (e.g., trails, roads, rivers) with standard geometry
  2. You have point observations (e.g., GPS locations where issues were recorded)
  3. PostGIS calculates where each point falls along its nearest line as a fractional value (0 to 1)
    • 0 = start of the line
    • 0.5 = middle of the line
    • 1 = end of the line
  4. You can create new points “snapped” to the exact line location
  5. You can create line segments that reference portions of the original line

The beauty is that steps 4 and 5 create new spatial features that reference the original line rather than duplicating its geometry.

Getting Started

Prerequisites

Before beginning, ensure you have:

  1. PostgreSQL with PostGIS extension installed
    • PostgreSQL 12 or later recommended
    • PostGIS 3.0 or later recommended
  2. A database with spatial data, including:
    • A line layer (in this guide, we use tracks in the myschema schema)
    • A point layer (in this guide, we use obs in the myschema schema)
  3. Database connection credentials:
    • Host: 127.0.0.1 (localhost)
    • Database: mydatabase
    • User: spatial_user
    • Password: (your password)
  4. A PostgreSQL client such as:
    • psql command-line tool
    • pgAdmin graphical interface
    • DBeaver or similar database IDE

Connecting to Your Database

To connect to the database from the command line:

psql -U spatial_user -h 127.0.0.1 mydatabase

You’ll be prompted for your password. Once connected, you should see a prompt like:

mydatabase=>

Tip: To make connections easier, you can create a .pgpass file in your home directory to store your password securely. See the PostgreSQL documentation for details.

Understanding the Basics

The Key Concept: M-Values

In traditional GIS, geometries have X and Y coordinates (and sometimes Z for elevation). Linear referencing adds an M value (measure) that represents position along a line.

Example: Imagine a 1000-meter trail:

Location Description M-Value (fraction) M-Value (meters)
Start Trailhead 0.0 0m
1/4 way First bridge 0.25 250m
Middle Viewpoint 0.5 500m
3/4 way Water source 0.75 750m
End Summit 1.0 1000m

The M-value can be expressed as:

  • A fraction (0 to 1): useful for calculations
  • A distance in your units (meters, feet, etc.): useful for reporting

Essential PostGIS Functions

Here are the core PostGIS functions we’ll use for linear referencing:

Function What It Does Example Use
ST_Distance(geom1, geom2) Calculates shortest distance between two geometries Find how far a GPS point is from a trail
ST_DWithin(geom1, geom2, distance) Returns true if geometries are within specified distance Filter to only points near trails (within 200m)
ST_LineLocatePoint(line, point) Returns fraction (0-1) where point projects onto line Calculate where along a trail an observation falls
ST_LineInterpolatePoint(line, fraction) Returns a point at the given fraction along a line Create a new point snapped to the exact trail location
ST_LineSubstring(line, start_fraction, end_fraction) Returns a segment of a line between two fractions Create a 10m segment centered on an observation
ST_Length(line) Returns the length of a line in your unit system Get trail length in meters
ST_GeometryN(geom, n) Extracts the nth geometry from a collection Ensure we’re working with a single line

Note: Most of these functions work with the geometry’s native units. If you’re using a projected coordinate system like UTM, distances will be in meters. If using geographic coordinates (latitude/longitude), you may need to use the ::geography cast for accurate meter-based distances.

Practical Examples

Now let’s work through a complete example using your actual data: the obs (observations) point layer and tracks line layer in the myschema schema.

Step 1: Exploring Your Data

First, let’s understand what data we have:

-- Connect to the database
\c mydatabase

-- Check what tables exist in myschema
\dt myschema.*

-- Look at the structure of the observations table
\d myschema.obs

-- Look at the structure of the tracks table
\d myschema.tracks

-- Count records in each table
SELECT COUNT(*) as num_observations FROM myschema.obs;
SELECT COUNT(*) as num_tracks FROM myschema.tracks;

-- Check the coordinate system (SRID) being used
SELECT ST_SRID(geom) as srid FROM myschema.obs LIMIT 1;
SELECT ST_SRID(geom) as srid FROM myschema.tracks LIMIT 1;

-- Get basic statistics about track lengths
SELECT
    COUNT(*) as num_tracks,
    ROUND(AVG(ST_Length(geom))::numeric, 2) as avg_length_meters,
    ROUND(MIN(ST_Length(geom))::numeric, 2) as min_length_meters,
    ROUND(MAX(ST_Length(geom))::numeric, 2) as max_length_meters,
    ROUND(SUM(ST_Length(geom))::numeric, 2) as total_length_meters
FROM myschema.tracks;

Expected Output:

num_tracks | avg_length_meters | min_length_meters | max_length_meters | total_length_meters
------------+-------------------+-------------------+-------------------+---------------------
        147 |             89.83 |              1.70 |            782.59 |            13205.13

This tells us:

  • We have 147 track segments
  • Average track is about 90 meters long
  • Tracks range from 1.7m to 783m
  • Total network is about 13.2 km

Why This Matters: Understanding your data helps you choose appropriate distance thresholds. For example, if your shortest track is 45m, you wouldn’t want to snap observations from 200m away!

Step 2: Finding the Nearest Line

Before we can calculate measurements, we need to find which track line each observation point is nearest to.

-- Find the nearest track for each observation
-- This uses ST_DWithin to pre-filter to reasonable candidates (within 200m)
-- Then ST_Distance to calculate exact distances
SELECT
    obs.id as obs_id,
    tracks.id as track_id,
    ROUND(ST_Distance(obs.geom, tracks.geom)::numeric, 2) as distance_meters
FROM myschema.obs
CROSS JOIN LATERAL (
    SELECT id, geom
    FROM myschema.tracks
    WHERE ST_DWithin(obs.geom, tracks.geom, 200)  -- Only consider tracks within 200m
    ORDER BY obs.geom <-> tracks.geom  -- Use distance operator for quick sorting
    LIMIT 1
) tracks
ORDER BY obs.id;

What This Query Does:

  1. CROSS JOIN LATERAL: For each observation, finds matching tracks
  2. ST_DWithin(..., 200): Filters to only tracks within 200 meters (reduces computation)
  3. obs.geom <-> tracks.geom: Special PostGIS operator that quickly estimates distance for sorting
  4. LIMIT 1: Takes only the nearest track
  5. ST_Distance(...): Calculates the precise distance

Expected Output:

obs_id | track_id | distance_meters
--------+----------+-----------------
      1 |      164 |           52.98
      2 |       66 |           40.05
      3 |       19 |           30.24

Understanding Distance: Our observations are between 30-53 meters from the nearest track. This is typical GPS error in forested or urban areas. If you see distances over 50 meters, your GPS observations might be quite inaccurate, or the observation might not actually be near a track. This is important for data quality checking!

Step 3: Calculating Measures

Now let’s calculate the m-value (measure) for each observation - where along the track line it falls.

-- Calculate measures for each observation
WITH nearest_tracks AS (
    SELECT
        obs.id as obs_id,
        obs.geom as obs_geom,
        tracks.id as track_id,
        tracks.geom as track_geom,
        ST_Length(tracks.geom) as track_length_meters,
        ST_Distance(obs.geom, tracks.geom) as distance_to_track
    FROM myschema.obs
    CROSS JOIN LATERAL (
        SELECT id, geom
        FROM myschema.tracks
        WHERE ST_DWithin(obs.geom, tracks.geom, 200)
        ORDER BY obs.geom <-> tracks.geom
        LIMIT 1
    ) tracks
)
SELECT
    obs_id,
    track_id,
    ROUND(track_length_meters::numeric, 2) as track_length_m,
    ROUND(distance_to_track::numeric, 2) as dist_to_track_m,
    -- Calculate the measure (fraction from 0 to 1)
    ROUND(ST_LineLocatePoint(track_geom, obs_geom)::numeric, 4) as measure_fraction,
    -- Calculate the measure in meters
    ROUND((ST_LineLocatePoint(track_geom, obs_geom) * track_length_meters)::numeric, 2) as measure_meters
FROM nearest_tracks
ORDER BY track_id, measure_fraction;

What This Query Does:

  1. CTE (Common Table Expression): The WITH nearest_tracks AS (...) creates a temporary result set we can reference
  2. ST_LineLocatePoint(line, point): Returns the fraction (0-1) along the line where the point projects
  3. Multiply by length: Converting fraction to actual meters makes it more meaningful

Expected Output:

obs_id | track_id | track_length_m | dist_to_track_m | measure_fraction | measure_meters
--------+----------+----------------+-----------------+------------------+----------------
      3 |       19 |         727.00 |           30.24 |           0.0377 |          27.42
      2 |       66 |         306.08 |           40.05 |           0.2049 |          62.71
      1 |      164 |         217.92 |           52.98 |           0.1313 |          28.62

Interpreting Results:

  • Observation 3 is 27.4m along Track 19 (which is 727m total) - very near the start
  • Observation 2 is 62.7m along Track 66 (which is 306m total) - about 20% along
  • Observation 1 is 28.6m along Track 164 (which is 218m total) - about 13% along

Key Insight: The measure_fraction will always be between 0 and 1, making it perfect for calculations regardless of line length.

Step 4: Creating Snapped Points

Now let’s create new point features that are snapped exactly to the track lines. These “event points” are located precisely on the line, not offset by GPS error.

-- Create a table of event points snapped to tracks
DROP TABLE IF EXISTS myschema.event_points;

CREATE TABLE myschema.event_points AS
WITH nearest_tracks AS (
    SELECT
        obs.id as obs_id,
        obs.geom as obs_geom,
        tracks.id as track_id,
        tracks.geom as track_geom,
        ST_Length(tracks.geom) as track_length,
        ST_Distance(obs.geom, tracks.geom) as distance_to_track
    FROM myschema.obs
    CROSS JOIN LATERAL (
        SELECT id, geom
        FROM myschema.tracks
        WHERE ST_DWithin(obs.geom, tracks.geom, 200)
        ORDER BY obs.geom <-> tracks.geom
        LIMIT 1
    ) tracks
),
calculated_measures AS (
    SELECT
        obs_id,
        track_id,
        track_geom,
        track_length,
        distance_to_track,
        ST_LineLocatePoint(track_geom, obs_geom) as measure,
        ST_LineLocatePoint(track_geom, obs_geom) * track_length as measure_meters
    FROM nearest_tracks
)
SELECT
    -- Create the snapped point using ST_LineInterpolatePoint
    ST_LineInterpolatePoint(track_geom, measure) as geom,
    obs_id,
    track_id,
    ROUND(measure::numeric, 4) as measure,
    ROUND(measure_meters::numeric, 2) as measure_meters,
    ROUND(track_length::numeric, 2) as track_length_meters,
    ROUND(distance_to_track::numeric, 2) as original_gps_error_meters
FROM calculated_measures;

-- Add a primary key
ALTER TABLE myschema.event_points ADD PRIMARY KEY (obs_id);

-- Create a spatial index for better performance
CREATE INDEX event_points_geom_idx ON myschema.event_points USING GIST (geom);

-- View the results
SELECT * FROM myschema.event_points ORDER BY track_id, measure;

What This Query Does:

  1. Nested CTEs: Builds on previous logic to calculate measures
  2. ST_LineInterpolatePoint(line, fraction): Creates a new point at the exact fractional location on the line
  3. Creates a physical table: event_points stores the results
  4. Adds indexes: Primary key and spatial index for performance

Expected Output:

obs_id | track_id | measure | measure_meters | track_length_meters | original_gps_error_meters
--------+----------+---------+----------------+---------------------+---------------------------
      3 |       19 |  0.0377 |          27.42 |              727.00 |                     30.24
      2 |       66 |  0.2049 |          62.71 |              306.08 |                     40.05
      1 |      164 |  0.1313 |          28.62 |              217.92 |                     52.98

What Changed: The geom column in this table contains points that are exactly on the track lines, not the original GPS locations. The original_gps_error_meters column shows how far we moved each point to snap it.

Step 5: Building Linear Segments

Finally, let’s create linear segments that represent sections of track. This is useful when an observation describes a problem area that extends along the track (e.g., “10 meters of eroded trail”).

For this example, we’ll create 10-meter segments centered on each observation point. In a real application, you might have a size field in your observations table indicating how many meters are affected.

-- Create linear segments from event points
-- Using a fixed 10-meter segment for demonstration
DROP TABLE IF EXISTS myschema.track_segments;

CREATE TABLE myschema.track_segments AS
WITH event_measures AS (
    SELECT
        ep.obs_id,
        ep.track_id,
        ep.measure,
        ep.measure_meters,
        ep.track_length_meters,
        t.geom as track_geom,
        10.0 as affected_size_meters  -- Fixed 10m segment for demo
    FROM myschema.event_points ep
    JOIN myschema.tracks t ON ep.track_id = t.id
),
segment_measures AS (
    SELECT
        obs_id,
        track_id,
        track_geom,
        measure,
        measure_meters,
        track_length_meters,
        affected_size_meters,
        -- Calculate lower bound (start of affected area)
        -- Subtract half the affected size from the observation point
        GREATEST(0, (measure_meters - (affected_size_meters / 2)) / track_length_meters) as lower_measure,
        -- Calculate upper bound (end of affected area)
        -- Add half the affected size to the observation point
        LEAST(1, (measure_meters + (affected_size_meters / 2)) / track_length_meters) as upper_measure
    FROM event_measures
)
SELECT
    obs_id,
    track_id,
    -- Create line segment using ST_LineSubstring
    ST_LineSubstring(
        track_geom,
        lower_measure,
        upper_measure
    ) as geom,
    ROUND(measure::numeric, 4) as center_measure,
    ROUND(lower_measure::numeric, 4) as start_measure,
    ROUND(upper_measure::numeric, 4) as end_measure,
    ROUND(measure_meters::numeric, 2) as center_meters,
    ROUND((lower_measure * track_length_meters)::numeric, 2) as start_meters,
    ROUND((upper_measure * track_length_meters)::numeric, 2) as end_meters,
    ROUND(affected_size_meters::numeric, 2) as segment_length_meters
FROM segment_measures;

-- Add primary key and spatial index
ALTER TABLE myschema.track_segments ADD COLUMN segment_id SERIAL PRIMARY KEY;
CREATE INDEX track_segments_geom_idx ON myschema.track_segments USING GIST (geom);

-- View the results
SELECT * FROM myschema.track_segments ORDER BY track_id, start_measure;

What This Query Does:

  1. Joins data together: Combines event points, tracks, and original observations
  2. Calculates bounds: Creates lower and upper measures by extending size/2 in each direction
  3. GREATEST() and LEAST(): Ensures measures stay within valid 0-1 range
  4. ST_LineSubstring(line, start, end): Extracts the line segment between two fractional positions

Expected Output:

obs_id | track_id | center_measure | start_measure | end_measure | center_meters | start_meters | end_meters | segment_length_meters
--------+----------+----------------+---------------+-------------+---------------+--------------+------------+-----------------------
      3 |       19 |         0.0377 |        0.0308 |      0.0446 |         27.42 |        22.42 |      32.42 |                 10.00
      2 |       66 |         0.2049 |        0.1885 |      0.2212 |         62.71 |        57.71 |      67.71 |                 10.00
      1 |      164 |         0.1313 |        0.1084 |      0.1543 |         28.62 |        23.62 |      33.62 |                 10.00

Interpreting Results:

  • Observation 3 created a 10-meter segment centered at 27.4m along Track 19 (from 22.4m to 32.4m)
  • Observation 2 created a 10-meter segment centered at 62.7m along Track 66 (from 57.7m to 67.7m)
  • Observation 1 created a 10-meter segment centered at 28.6m along Track 164 (from 23.6m to 33.6m)
  • These segments are actual line geometries that can be displayed in GIS software

Advanced Tip: Instead of creating physical tables, you could create these as database views. This means any time you add a new observation to the obs table, the event points and segments would automatically update!

Here’s how to create them as views:

-- Create event_points as a view instead of a table
CREATE OR REPLACE VIEW myschema.v_event_points AS
WITH nearest_tracks AS (
    -- ... same logic as before ...
)
SELECT * FROM nearest_tracks;

-- Create track_segments as a view
CREATE OR REPLACE VIEW myschema.v_track_segments AS
WITH event_measures AS (
    -- ... same logic as before ...
)
SELECT * FROM event_measures;

Real-World Application

Let’s put this all together with a practical query that managers might use:

-- Summary report: Which tracks have the most issues?
SELECT
    t.id as track_id,
    t.name as track_name,
    t.highway,
    COUNT(DISTINCT s.obs_id) as num_issues,
    ROUND(SUM(s.segment_length_meters)::numeric, 2) as total_affected_meters,
    ROUND((SUM(s.segment_length_meters) / ST_Length(t.geom) * 100)::numeric, 2) as percent_affected,
    ROUND(AVG(ep.original_gps_error_meters)::numeric, 2) as avg_gps_error_meters
FROM myschema.tracks t
LEFT JOIN myschema.track_segments s ON t.id = s.track_id
LEFT JOIN myschema.event_points ep ON s.obs_id = ep.obs_id
WHERE s.obs_id IS NOT NULL
GROUP BY t.id, t.name, t.highway, t.geom
ORDER BY total_affected_meters DESC;

Expected Output:

track_id |  track_name   | highway | num_issues | total_affected_meters | percent_affected | avg_gps_error_meters
----------+---------------+---------+------------+-----------------------+------------------+----------------------
       19 | Porter Avenue | service |          1 |                 10.00 |             1.38 |                30.24
       66 | Riddle Road   | service |          1 |                 10.00 |             3.27 |                40.05
      164 |               | track   |          1 |                 10.00 |             4.59 |                52.98

This tells managers:

  • Porter Avenue (a service road) has 1 issue affecting 10 meters (1.38% of the road)
  • Riddle Road (also a service road) has 1 issue affecting 10 meters (3.27% of the road)
  • An unnamed track has 1 issue affecting 10 meters (4.59% of the track)
  • GPS observations had an average error ranging from 30-53 meters, which is typical for consumer GPS devices

Common Mistakes to Avoid

  1. Using Geographic Coordinates Without Conversion

    Problem: If your data uses latitude/longitude (EPSG:4326), ST_Distance returns degrees, not meters.

    Solution: Either:

    • Convert to a projected coordinate system (like UTM)
    • Use ::geography cast: ST_Distance(geom1::geography, geom2::geography)
  2. Not Checking Distance Thresholds

    Problem: Using too large a distance threshold (e.g., 1000m) might snap observations to wrong tracks.

    Solution: Start with a reasonable threshold (50-200m) and review results.

  3. Forgetting to Handle Edge Cases

    Problem: Observations near the start or end of lines can create invalid measures (< 0 or > 1).

    Solution: Use GREATEST(0, ...) and LEAST(1, ...) to clamp values.

  4. Not Creating Indexes

    Problem: Queries are slow on large datasets.

    Solution: Always create spatial indexes:

    CREATE INDEX idx_name ON table_name USING GIST (geom);
    
  5. Assuming Lines Are Single Geometries

    Problem: Some “lines” are actually MultiLineStrings (collections of lines).

    Solution: Use ST_GeometryN(geom, 1) to extract the first line, or handle multi-geometries explicitly.

Next Steps

Now that you understand linear referencing basics, you can:

  1. Add More Attributes: Include observation dates, severity levels, or maintenance status
  2. Create Advanced Queries: Find overlapping segments, calculate maintenance costs, prioritize work
  3. Build Automated Workflows: Use database triggers to automatically create segments when new observations are added
  4. Integrate with Applications: Connect mobile field apps to directly insert observations
  5. Visualize Results: Load your event points and segments into QGIS, ArcGIS, or web mapping applications

Recommended Learning Path:

  1. Practice the examples in this post with your own data
  2. Learn about PostgreSQL views and materialized views for dynamic updates
  3. Explore PostGIS functions for more advanced spatial analysis
  4. Study database triggers and functions for automation
  5. Investigate tools like pg_tileserv or GeoServer for sharing your data

Additional Resources

Official Documentation:

Conceptual Background:

Community Help:

Related Topics:

  • Database views and materialized views for dynamic data
  • Spatial indexing strategies for performance
  • QGIS integration for visualization
  • Building REST APIs with PostGIS data

This tutorial provides a foundation for working with linear referencing in PostGIS. The techniques shown here can be adapted for roads, rivers, pipelines, utility networks, or any linear feature where you need to track events or conditions along the feature.