Organizing spatial data (or just data) for organizations

My approach for keeping data from sprawling like a basket of socks

This is a living post. Check back for updates as I learn more.

Usage

This is a listing of topics that can be used to group or hold data (directories, but also Esri datasets if you lean that way) and it’s accompanied by some words about an approach for managing the data therein. Of course, the approach is laden with problems, but it also solves some issues that have plagued some groups. Take what you need and leave the rest.

Data Directory Structures

I’ve used this approach several times with years in between, and each time I have had to rebuild it from memory. So I’m going to put it here for future use and revision. Hopefully it helps you too.

The design of the structures defined in this document was inspired by concepts found in the Spatial Data Standards for Facilities, Infrastructure, and Environment (SDSFIE) version 2.6, ISO 19115 Topic Categories [1, 2, 3], and the Maine State GIS Catalog public geospatial data access pages (the new pages follow the ISO 19115 Topics).

DELIVERABLES - directories by delivery year then project name containing final products from either this organization or other parties. Files in these project directories should never change and never be added to.

Products may be data as well as documents. Data should be considered read-only - stored exactly as delivered without changes, additions, or restructuring. Data to be used should be copied into working directories outside the DELIVERABLES tree.

 2019
     Project_Name_A
     Project_Name_B
 2020
     Project_Name_C
     Project_Name_D
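
The read-only rule for DELIVERABLES can be backed up with file permissions. What follows is a minimal sketch, not a prescribed method: the project path and the stand-in report.txt file are hypothetical, and it assumes a POSIX shell with the standard chmod command.

```shell
#!/bin/sh
# Sketch: mark a delivered project read-only once it has been filed.
set -eu

# Hypothetical example path following the DELIVERABLES/<year>/<project> layout.
project="DATA/DELIVERABLES/2020/Project_Name_C"

# Stand-in for a real delivery so the sketch runs on its own.
mkdir -p "$project"
if [ ! -e "$project/report.txt" ]; then
    echo "final report" > "$project/report.txt"
fi

# Drop write permission for everyone, on files and directories alike,
# so nothing can be edited or added in place.
chmod -R a-w "$project"

# Show the resulting permissions.
ls -lR "$project"
```

Permissions like these only guard against accidental edits. The owner can restore write access with chmod -R u+w, and an administrator account ignores them entirely.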

MAIN - directories by topic then by year containing finalized, authoritative data that is the most recent, validated, complete, and approved data owned by this organization. Files in these subject directories should only change when 1) a new final version is available, 2) new authoritative files are added to this organization’s holdings, or 3) data should be removed because they are deemed too problematic for use.

Data should be considered read-only until replaced by a newer finalized version. Data can be displayed from the MAIN tree, but should not be edited there. The MAIN tree should not contain working data. Data will be updated following this organization’s data stewardship guidelines from the data management plan, or as improvements become available, until the end of the year. Then the year directories are copied to create the next year’s directories, unless the data did not change in the prior year. This approach allows for historical change tracking at the cost of increased data storage.
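
The end-of-year copy can be scripted. The sketch below is one possible reading of the rule, not a definitive implementation: rollover_main is a hypothetical helper, and it assumes that "did not change" means the closing year's directory is identical to the previous year's directory, as judged by diff -r.

```shell
#!/bin/sh
# Sketch: end-of-year rollover for the MAIN tree.
set -eu

# rollover_main ROOT YEAR   (hypothetical helper, not from the original post)
#   For each topic under ROOT/MAIN, copy YEAR to YEAR+1 unless YEAR is
#   identical to YEAR-1 or the YEAR+1 directory already exists.
rollover_main() {
    root=$1
    year=$2
    prev=$((year - 1))
    next=$((year + 1))

    for topic_dir in "$root"/MAIN/*/; do
        if [ ! -d "${topic_dir}${year}" ]; then
            continue    # no data filed for this year
        fi
        if [ -e "${topic_dir}${next}" ]; then
            continue    # already rolled over
        fi
        # Assumption: "did not change" means identical to last year's copy.
        if [ -d "${topic_dir}${prev}" ] &&
           diff -r "${topic_dir}${prev}" "${topic_dir}${year}" >/dev/null 2>&1; then
            echo "unchanged, skipped: ${topic_dir}${year}"
            continue
        fi
        cp -R "${topic_dir}${year}" "${topic_dir}${next}"
        echo "copied: ${topic_dir}${year} -> ${topic_dir}${next}"
    done
}

# Example call, closing out 2022 for a tree rooted at DATA:
# rollover_main DATA 2022
```

Skipping directories that already exist means the script can be run more than once without overwriting a year that is already in use.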

Working - any directories created outside of DELIVERABLES and MAIN used for intermediate processing. Input data should be copied from the MAIN or DELIVERABLES trees before being used in processing.
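
Copying inputs into a working directory is a good candidate for a small helper. This is a rough sketch under stated assumptions: stage_input and the WORKING/boundary_cleanup path are hypothetical names, and the source path should be given without a trailing slash so that cp -R copies the directory itself.

```shell
#!/bin/sh
# Sketch: stage a copy of authoritative data for processing.
set -eu

# stage_input SRC DEST_DIR   (hypothetical helper)
#   Copies SRC (a file or directory under MAIN or DELIVERABLES) into DEST_DIR
#   and makes the copy writable, leaving the original untouched.
stage_input() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    cp -R "$src" "$dest_dir/"
    chmod -R u+w "$dest_dir"
}

# Hypothetical example: work on the 2022 boundaries in a scratch area.
# stage_input DATA/MAIN/Boundaries/2022 WORKING/boundary_cleanup
```

The chmod step matters when the source tree has been made read-only, because the copy would otherwise inherit those permissions.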

The SDSFIE classes listed here come from an older version of SDSFIE than is current. I can’t access SDSFIE materials any more as it’s primarily a function for federal organizations, like defense installations, and the current useful tools and documentation are largely locked away… because how a base stores information about their roads and swampy hazards is a matter of national security it seems.

| SDSFIE 2.6 Classes | Old Maine State GIS | New Maine State GIS / ISO 19115 Topic Category | Organization Directory Name | Contents |
| --- | --- | --- | --- | --- |
| Boundary | Administrative and political boundaries | boundaries - Legal land descriptions, for example political and administrative boundaries, governmental units, marine boundaries, voting districts, school districts, international boundaries | Boundaries | Administrative and political boundaries. County Boundaries, Trust boundaries |
| Organisms, Ecology | Biologic and ecologic ⁄ Environment and conservation | biota - Flora or fauna in natural environment, for example wildlife, vegetation, biological sciences, ecology, wilderness, sea life, wetlands, habitat, biological resources | Biota_Ecological | Biological, ecological, Environmental, conservation. Not natural resource management |
| Cadastre | Cadastral and land planning | planningCadastre - Information used for appropriate actions for future use of the land, for example land use maps, zoning maps, cadastral surveys, land ownership, parcels, easements, tax maps, federal land ownership status, public land conveyance records | Cadastre | Parcels, Zoning, Street addresses |
| Cultural, Demographic | Cultural, society, and demographic | society - Characteristics of society and culture, for example settlements, housing, anthropology, archaeology, education, traditional beliefs, manners and customs, demographic data, tourism, recreational areas and activities, parks, recreational trails, historical sites, cultural resources, social impact assessments, crime and justice, law enforcement, census information, immigration, ethnicity | Cultural_Demographic | Consider replacing with Locations |
| | Elevation and derived products | elevation - Height above or below sea level, for example altitude, bathymetry, digital elevation models, slope, derived products, DEMs, TINs | Elevation | LIDAR, DEM, Slope products. |
| Real Property | Facilities and structures | structure - Man-made construction, for example buildings, museums, churches, factories, housing, monuments, shops, towers, building footprints, architectural and structural plans | Facilities_Structures | |
| Geology | Geological and geophysical | geoscientificInformation - Information pertaining to earth sciences, for example geophysical features and processes, geology, minerals, sciences dealing with the composition, structure and origin of the earth’s rocks, risks of earthquakes, volcanic activity, landslides, gravity information, soils, permafrost, hydrogeology, groundwater, erosion | GeoPhysical | |
| Visual Sensor, Landform | Imagery, base maps, and land cover | imageryBaseMapsEarthCover - Base maps, for example land/earth cover, topographic maps, imagery, unclassified images, annotations, digital ortho imagery | Imagery | |
| Geodetic | Locations and geodetic networks | location - Positional information and services, for example addresses, geodetic networks, geodetic control points, postal zones and services, place names, geographic names | Locations | Geodetic points, social/societal point of interest, historical features |
| Hydrography | Oceans and estuaries ⁄ Inland water resources | inlandWaters - Inland water features, drainage systems and characteristics, for example rivers and glaciers, salt lakes, water utilization plans, dams, currents, floods and flood hazards, water quality, hydrographic charts, watersheds, wetlands, hydrography | Hydrography | Marine and surface freshwater. Including |
| Improvement | Transportation networks | transportation - Means and aids for conveying persons or goods, for example roads, airports/airstrips, shipping routes, tunnels, nautical charts, vehicle or vessel location, aeronautical charts, railways | Transportation | |
| Utilities, Communications | Utility and communication networks | utilitiesCommunication - Energy, water and waste systems and communications infrastructure and services, for example hydroelectricity, geothermal, solar and nuclear sources of energy, water purification and distribution, sewage collection and disposal, electricity and gas distribution, data communication, telecommunication, radio, communication networks | Utilities | |
| LandStatus | | | Land_Characteristics | Land use, impervious surface. Not ecological, natural resources, or geophysical. |
| Future Projects | | | Planning | Features of future work that may be incorporated into other data when completed |
| Climate | | climatologyMeteorologyAtmosphere - Processes and phenomena of the atmosphere, for example cloud cover, weather, climate, atmospheric conditions, climate change, precipitation | Climate | Meteorological data, but not physical installations. |
| Hazards | | | Hazards | |
| | | | Natural_Resources | Forestry, fisheries. Towards yield / standing crop/harvestable biomass assessment, not biological or ecological |
| | | farming - Rearing of animals or cultivation of plants, for example agriculture, irrigation, aquaculture, plantations, herding, pests and diseases affecting crops and livestock | Farming | |
| | | economy - Economic activities, conditions, and employment, for example production, labor, revenue, business, commerce, industry, tourism and ecotourism, forestry, fisheries, commercial or subsistence hunting, exploration and exploitation of resources such as minerals, oil and gas | Economy | |
| | | environment - Environmental resources, protection and conservation, for example environmental pollution, waste storage and treatment, environmental impact assessment, monitoring environmental risk, nature reserves, landscape, water quality, air quality, environmental modeling | Environment | |
| | | health - Health, health services, human ecology, and safety, for example disease and illness, factors affecting health, hygiene, substance abuse, mental and physical health, health services, health care providers, public health | Health | |
| | | intelligenceMilitary - Military bases, structures, activities, for example barracks, training grounds, military transportation, information collection | Military | |
| | | oceans - Features and characteristics of salt water bodies (excluding inland waters), for example tides, tidal waves, coastal information, reefs, maritime, outer continental shelf submerged lands, shoreline | Oceans | |

Directory Creation Scripts - These speed up making the folders that hold the data. The standard mkdir commands below worked for me on macOS, Linux, and Cygwin on Windows.

mkdir -p DATA/MAIN/Boundaries/2022
mkdir -p DATA/MAIN/Biota_Ecological/2022
mkdir -p DATA/MAIN/Cadastre/2022
mkdir -p DATA/MAIN/Cultural_Demographic/2022
mkdir -p DATA/MAIN/Elevation/2022
mkdir -p DATA/MAIN/Facilities_Structures/2022
mkdir -p DATA/MAIN/GeoPhysical/2022
mkdir -p DATA/MAIN/Imagery/2022
mkdir -p DATA/MAIN/Locations/2022
mkdir -p DATA/MAIN/Hydrography/2022
mkdir -p DATA/MAIN/Transportation/2022
mkdir -p DATA/MAIN/Utilities/2022
mkdir -p DATA/MAIN/Land_Characteristics/2022
mkdir -p DATA/MAIN/Planning/2022
mkdir -p DATA/MAIN/Climate/2022
mkdir -p DATA/MAIN/Hazards/2022
mkdir -p DATA/MAIN/Natural_Resources/2022
mkdir -p DATA/DELIVERABLES/2022/Project_A
mkdir -p DATA/DELIVERABLES/2022/Project_B
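
The same directories can also be created with a loop, which makes it easier to change the year or the root folder in one place. This is a minimal sketch assuming a POSIX shell. The DATA_ROOT and YEAR variables are my own hypothetical additions, and the topic list is copied from the commands above.

```shell
#!/bin/sh
# Sketch: create the same MAIN topic directories for a chosen year.
set -eu

DATA_ROOT="${DATA_ROOT:-DATA}"   # hypothetical override; defaults to DATA
YEAR="${YEAR:-2022}"             # hypothetical override; defaults to 2022

# Topic names copied from the mkdir commands above.
for topic in Boundaries Biota_Ecological Cadastre Cultural_Demographic \
             Elevation Facilities_Structures GeoPhysical Imagery Locations \
             Hydrography Transportation Utilities Land_Characteristics \
             Planning Climate Hazards Natural_Resources; do
    mkdir -p "$DATA_ROOT/MAIN/$topic/$YEAR"
done

# Deliverables are grouped by year first, then by project name.
for project in Project_A Project_B; do
    mkdir -p "$DATA_ROOT/DELIVERABLES/$YEAR/$project"
done
```

Saved as a script, it could be run as, for example, YEAR=2023 sh make_dirs.sh, where make_dirs.sh is a placeholder file name.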