Austin-Beck/NHI_Repo

NHI ETL Pipelines

Automated spatial analysis and data sync scripts that power the two web applications on the Utah Division of Emergency Management's (DEM) Utah Natural Hazard Inventory web page.

  • Community Explorer - Tool that summarizes natural hazard exposure at four geography levels (State / County / Census Tract / CDP). Scripts read source feature services, compute summaries, and write the results back to the Community Explorer feature services.
  • Hazard Explorer - Tool that displays the raw hazard data layers. Scripts pull external sources (UGS WFS, USGS earthquake API, avalanche.org GeoJSON) and republish them as ArcGIS Online (AGOL) hosted feature services so the Experience Builder (ExB) app can consume them (ExB cannot currently read WFS / GeoJSON / public APIs directly).

Hazards Covered

  • Avalanche
  • Dam Incident
  • Drought
  • Earthquake
  • Flooding
  • Geologic Hazards
  • Wildfire

How It Works

Community Explorer scripts — Automation_Code/community_explorer/

Each script follows the same ETL pattern:

CONFIG  ->  EXTRACT  ->  TRANSFORM  ->  LOAD  ->  CLEANUP
  1. Extract — Pull source data and geography layers into an in-memory workspace, project everything to UTM 12N.
  2. Transform — Run spatial analysis (intersections, zonal statistics, presence checks, etc.) to produce a {GEOID: value} dictionary for each field.
  3. Load — Write results back to the feature service using batch updates via the arcgis API.
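
The Transform and Load steps above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names, the field names (`FLOOD_PRESENT`, `FLOOD_AREA`), and the chunk size are hypothetical, and the real scripts use arcpy and the arcgis API where this sketch uses dictionaries and lists.

```python
from typing import Dict, Iterable, Iterator, List


def transform_presence(geoids: Iterable[str], hazard_geoids: Iterable[str]) -> Dict[str, int]:
    """Presence check: 1 if a geography touches any hazard feature, else 0.

    Returns the {GEOID: value} shape described in the Transform step.
    """
    hit = set(hazard_geoids)
    return {geoid: int(geoid in hit) for geoid in geoids}


def build_updates(results: Dict[str, Dict[str, int]]) -> List[dict]:
    """Merge per-field {GEOID: value} dicts into one attribute row per GEOID."""
    rows: Dict[str, dict] = {}
    for field, values in results.items():
        for geoid, value in values.items():
            rows.setdefault(geoid, {"GEOID": geoid})[field] = value
    # A real feature-service update would also need each row's object ID.
    return [{"attributes": attrs} for attrs in rows.values()]


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield fixed-size batches so each update request stays small."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


presence = transform_presence(["49001", "49003"], ["49003"])
updates = build_updates({"FLOOD_PRESENT": presence, "FLOOD_AREA": {"49001": 0, "49003": 5}})
for batch in chunked(updates, 500):
    pass  # the Load step would send each batch to the feature service here
```

Keeping every field as its own `{GEOID: value}` dictionary means each analysis can be written and tested independently, with the merge into update rows happening once at load time.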

See Automation_Code/scripting_framework.md for the full architecture documentation, analysis patterns, and conventions.

Hazard Explorer scripts — Automation_Code/hazard_explorer/

Each script follows the same source-sync pattern:

FETCH  ->  VALIDATE COUNT  ->  VALIDATE SCHEMA  ->  WIPE + RELOAD TARGET
  1. Fetch — Pull the external source (WFS GetFeature, REST API call, etc.) with exponential-backoff retry.
  2. Validate count — If the source returned fewer features than MIN_FEATURE_COUNT, email an alert and abort. Prevents "upstream returned empty → target gets nuked."
  3. Validate schema — Compare source field names to the target layer's fields. Any drift (added / removed / renamed) emails an alert and aborts. Forces a conscious schema update on the target before resuming.
  4. Wipe + reload — Delete existing features and re-add the new ones in 1000-feature chunks.

Shared helpers for the source-sync pattern live in shared/source_sync.py.
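
As a rough illustration of the four steps, the guardrails might look like the sketch below. The names here (`SyncAbort`, `fetch_with_retry`, `validate_count`, `validate_schema`, `chunks`) are hypothetical and are not claimed to match what shared/source_sync.py exports; the retry count and delays are also assumptions.

```python
import time
from typing import Callable, Iterable, Iterator, List


class SyncAbort(Exception):
    """Raised when a guardrail fails, before any destructive operation."""


def fetch_with_retry(fetch: Callable[[], list], attempts: int = 4,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> list:
    """Call fetch(); after each failure wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


def validate_count(features: list, min_feature_count: int) -> None:
    """Abort if the source returned suspiciously few features."""
    if len(features) < min_feature_count:
        raise SyncAbort(f"got {len(features)} features, expected >= {min_feature_count}")


def validate_schema(source_fields: Iterable[str], target_fields: Iterable[str]) -> None:
    """Abort on any field-name drift between source and target.

    Assumes system fields (object ID, shape fields) were filtered out beforehand.
    """
    added = set(source_fields) - set(target_fields)
    removed = set(target_fields) - set(source_fields)
    if added or removed:
        raise SyncAbort(f"schema drift: added={sorted(added)}, removed={sorted(removed)}")


def chunks(features: List[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield the reload payload in batches of `size` features."""
    for start in range(0, len(features), size):
        yield features[start:start + size]
```

Both validators raise before the wipe, so a failed check leaves the target layer exactly as it was; a renamed field shows up as one added and one removed name.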

Prerequisites

  • ArcGIS Pro with an active portal session (signed into the portal that hosts the target feature services)
  • Python environment: arcgispro_clone (ships with ArcGIS Pro)
  • Required packages: arcpy, arcgis, requests (all included with ArcGIS Pro — no extra installs needed)

Setup

  1. Clone the repository:

    git clone https://github.com/Austin-Beck/NHI_Repo.git
    
  2. Copy the environment template and fill in your SMTP credentials:

    cd NHI_Repo/Automation_Code
    copy .env.example .env
    

    Edit .env with your SMTP username, password, sender address, and recipient list.

  3. Copy the local config template and set the path to your ArcGIS Pro python:

    copy config.local.bat.example config.local.bat
    

    Edit config.local.bat and set PYTHON to the python.exe of the ArcGIS Pro env that has arcpy installed. The default for a stock Pro install is:

    set PYTHON="c:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro_clone\python.exe"
    

    config.local.bat is gitignored so each machine keeps its own path. run_all.bat will fail with a clear message if this file is missing.

  4. Ensure ArcGIS Pro is signed into the portal that hosts the Community Explorer + Hazard Explorer feature services.

Usage

Run everything in sequence (Task Scheduler entry point):

Automation_Code\run_all.bat

run_all.bat runs the Hazard Explorer source-sync scripts first (so the Community Explorer scripts see fresh source data), then the Community Explorer hazard analysis scripts, reporting pass/fail status per script. Logs older than 7 days are cleaned up first.

Run a single script:

Use the same python.exe you set in config.local.bat. Default location:

"c:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro_clone\python.exe" Automation_Code\community_explorer\flooding.py
"c:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro_clone\python.exe" Automation_Code\hazard_explorer\wfs_update.py

Logs land in Automation_Code/logs/ with the naming convention {script}_{date}.log. Each log captures start/end time, feature counts, update counts, and full stack traces on failure.
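
The naming convention and the 7-day cleanup can be sketched as follows. This is an assumption-laden illustration: the README does not state the date format (ISO `YYYY-MM-DD` is assumed here), the helper names are hypothetical, and the actual cleanup may be implemented differently, for example inside run_all.bat.

```python
import time
from datetime import date
from pathlib import Path
from typing import List, Optional


def log_path(log_dir: str, script: str, day: date) -> Path:
    """Build {script}_{date}.log, assuming an ISO date (YYYY-MM-DD)."""
    return Path(log_dir) / f"{script}_{day.isoformat()}.log"


def remove_old_logs(log_dir: str, max_age_days: int = 7,
                    now: Optional[float] = None) -> List[str]:
    """Delete *.log files last modified more than max_age_days ago.

    Returns the names of the removed files, sorted, for reporting.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for path in Path(log_dir).glob("*.log"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```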

Adding a New Script

Community Explorer (hazard analysis)

  1. Copy Automation_Code/community_explorer/_template.py to Automation_Code/community_explorer/{your_hazard}.py.
  2. Fill in the SOURCES dictionary with feature service URLs.
  3. Fill in FIELDS_TO_POPULATE with the target field names.
  4. Write your run() function using the shared analysis helpers from shared/transform.py.
  5. Test against one geography level first, then enable the rest.
  6. Add an entry to run_all.bat under the Community Explorer block.
  7. Review the log file for correct counts and timing.

Refer to Automation_Code/scripting_framework.md for detailed guidance.
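
For orientation, a filled-in script might have roughly the shape below. Only the names `SOURCES`, `FIELDS_TO_POPULATE`, and `run()` come from the steps above; the URL, field names, `LEVELS` list, and `check_config` helper are hypothetical, and the real template and the helpers in shared/transform.py may look quite different.

```python
from typing import Dict, List

# Hypothetical values for illustration only; not taken from the repository.
SOURCES: Dict[str, str] = {
    "flood_zones": "https://example.com/arcgis/rest/services/FloodZones/FeatureServer/0",
}
FIELDS_TO_POPULATE: List[str] = ["FLOOD_PRESENT", "FLOOD_AREA_SQMI"]
LEVELS: List[str] = ["State"]  # test one geography level first, then add the rest


def check_config(sources: Dict[str, str], fields: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the config looks filled in."""
    problems = []
    if not sources:
        problems.append("SOURCES is empty")
    for name, url in sources.items():
        if not url.startswith("https://"):
            problems.append(f"SOURCES[{name!r}] is not an https URL")
    if not fields:
        problems.append("FIELDS_TO_POPULATE is empty")
    if len(set(fields)) != len(fields):
        problems.append("FIELDS_TO_POPULATE contains duplicates")
    return problems


def run(levels: List[str] = LEVELS) -> Dict[str, Dict[str, dict]]:
    """Return one empty {GEOID: value} dict per field and level (analysis omitted)."""
    problems = check_config(SOURCES, FIELDS_TO_POPULATE)
    if problems:
        raise ValueError("; ".join(problems))
    return {level: {field: {} for field in FIELDS_TO_POPULATE} for level in levels}
```

Failing fast on an unfilled template is cheaper than discovering an empty `SOURCES` entry halfway through a spatial analysis.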

Hazard Explorer (source sync)

  1. Copy Automation_Code/hazard_explorer/_template.py to Automation_Code/hazard_explorer/{your_source}.py.
  2. Fill in SOURCE_URL, SOURCE_PARAMS, TARGET_ITEM_ID, and MIN_FEATURE_COUNT.
  3. Confirm the target AGOL hosted layer's schema matches the source's field names exactly (the strict schema check will abort otherwise).
  4. Add an entry to run_all.bat under the Hazard Explorer block.
  5. Test by running the script once and confirming the target layer was populated.

Email Notifications

Both pipelines use the same SMTP config in .env. There are two notification types:

  • Failure email — Sent when a script crashes with an unhandled exception.
  • Alert email — Sent by Hazard Explorer source-sync guardrails when a source returned too few features or the source schema has drifted. The script aborts before any destructive operation.

Set NOTIFY_ENABLED = False in Automation_Code/shared/config.py to disable notifications entirely.
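
A notification message gated on that flag could be assembled as in the sketch below, using Python's standard `email.message.EmailMessage`. The function name, subject format, and parameters are hypothetical; the repository's own notification code and its .env variable names are not shown in this README.

```python
from email.message import EmailMessage
from typing import List, Optional


def build_notification(kind: str, script: str, detail: str, sender: str,
                       recipients: List[str],
                       enabled: bool = True) -> Optional[EmailMessage]:
    """Build a failure or alert email, or return None when notifications are off.

    `enabled` stands in for the NOTIFY_ENABLED setting.
    """
    if not enabled:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"[NHI {kind.upper()}] {script}"  # hypothetical subject format
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(detail)
    return msg


alert = build_notification(
    "alert", "wfs_update.py", "Source returned 0 features; target left untouched.",
    "sender@example.com", ["gis-team@example.com"],
)
```

Sending the result would typically go through `smtplib.SMTP(...).send_message(alert)` with the credentials from .env; that part is left out here because it needs a live mail server.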