Production-pattern T-SQL toolkit: idempotent imports, audit triggers, SCD Type 2, analytics KPIs, maintenance helpers, data-quality checks. Battle-tested patterns from enterprise ERP work, rebuilt clean-room on synthetic data.
| Module | What it gives you | Key file |
|---|---|---|
| Schema | Core dims & facts (users, programs, attendance, invoices, payments) on a clean star layout | schema/001-core-schema.sql |
| Import log | Audit table + start/finish procs used by every idempotent import | schema/002-import-log.sql |
| Audit framework | Generic row-level audit table + trigger generator (JSON before/after) | schema/003-audit-framework.sql |
| SCD2 dimensions | Slowly-Changing-Dimension Type 2 schema for student history | schema/004-scd2-dimensions.sql |
| Idempotent imports | sp_import_attendance — re-runnable with same batch_id, zero duplicates |
procedures/imports/sp_import_attendance.sql |
| Daily close | Roll-up of raw punches → one row per user-day with status + duration | procedures/imports/sp_close_attendance_day.sql |
| SCD2 merge | Handles brand-new keys, unchanged rows (no-op), changes, late-arriving rows | procedures/scd2/sp_merge_dim_student.sql |
| Analytics KPIs | Absenteeism, fee collection rate, aging buckets, per-program drill | procedures/analytics/ |
| Maintenance | Index health, smart reindex, missing-index DMV report, unused-index report | procedures/maintenance/ |
| Data quality | Generic orphan finder, generic duplicate finder | procedures/dq/ |
| Functions | fn_business_days_between, fn_age_in_years, fn_split_csv, fn_date_range |
functions/ |
| Views | vw_attendance_daily, vw_finance_summary — Power BI / Excel friendly |
views/ |
| tSQLt tests | Unit tests for imports, SCD2, DQ | tests/ |
| Seed data | ~1000 rows synthetic data — students, punches, invoices | samples/seed-small.sql |
| Docs | T-SQL conventions, idempotency pattern, SCD2 pattern | docs/ |
The single most valuable script in this toolkit. Most enterprise import SPs in the wild have a hidden bug: re-running them inserts duplicates.
-- Re-runnable. Same @batch_id = zero new rows.
EXEC core.sp_import_attendance
@batch_id = N'BATCH-2026-05-23',
@source_file = N'punches.csv';The procedure:
- Writes a
RUNNINGrow toaudit.import_log - Idempotency guard: if
(batch_id, operation, status=COMPLETED)already exists → returnsSKIPPED_IDEMPOTENT - Joins staging → users → computes SHA2_256 row hash
INSERT ... WHERE NOT EXISTSagainst the unique index on(user_id, punch_at, device_id, direction)- Marks audit row
COMPLETEDwith@@ROWCOUNT, orFAILEDwith error +THROW
Full story → docs/idempotency-pattern.md
- SQL Server 2019+ (LocalDB / Express / Developer all work)
sqlcmdon PATH (ships with SQL Server)
# From repo root:
cd tools
sqlcmd -S "(localdb)\MSSQLLocalDB" -i run-everything.sql -v ServerType=LocalDBWhat it does:
- Drops/creates
ERPToolkitDemodatabase - Applies all schema scripts in order
- Creates every proc, function, and view
- Seeds ~1000 rows of synthetic data
- Closes a sample day & prints KPIs
sqlcmd -S "(localdb)\MSSQLLocalDB" -i schema/000-setup-database.sql
sqlcmd -S "(localdb)\MSSQLLocalDB" -i schema/001-core-schema.sql
# ...etc — see tools/run-everything.sql for the canonical orderUSE ERPToolkitDemo;
-- First run inserts rows
EXEC core.sp_import_attendance @batch_id = N'DEMO-1';
SELECT COUNT(*) AS row_count_after_first_run FROM core.fact_punch WHERE batch_id = N'DEMO-1';
-- Second run is a no-op — zero new rows
EXEC core.sp_import_attendance @batch_id = N'DEMO-1';
SELECT COUNT(*) AS row_count_after_second_run FROM core.fact_punch WHERE batch_id = N'DEMO-1';
-- Audit trail
SELECT TOP 5 batch_id, status, rows_affected, started_at, finished_at
FROM audit.import_log
WHERE operation = N'IMPORT_ATTENDANCE'
ORDER BY log_id DESC;USE ERPToolkitDemo;
EXEC mart.sp_close_attendance_day @for_date = '2026-04-15';
EXEC mart.sp_kpi_absenteeism
@from_date = '2026-04-01',
@to_date = '2026-04-30';
EXEC mart.sp_kpi_collection_rate
@from_date = '2026-04-01',
@to_date = '2026-04-30';-- 1. Install tSQLt manually — see tools/install-tsqlt.sql
-- 2. Then:
USE ERPToolkitDemo;
EXEC tSQLt.Run 'imports';
EXEC tSQLt.Run 'scd2_tests';
EXEC tSQLt.Run 'dq_tests';
-- Or run everything:
EXEC tSQLt.RunAll;erp-sql-toolkit/
├── schema/ # versioned DDL, run in order
│ ├── 000-setup-database.sql # drop/create ERPToolkitDemo + schemas
│ ├── 001-core-schema.sql # dims & facts
│ ├── 002-import-log.sql # audit.import_log + start/finish procs
│ ├── 003-audit-framework.sql # audit.row_audit + trigger generator
│ └── 004-scd2-dimensions.sql # SCD Type 2 dimension table
├── procedures/
│ ├── imports/
│ │ ├── sp_import_attendance.sql # the flagship idempotent import
│ │ └── sp_close_attendance_day.sql
│ ├── scd2/
│ │ └── sp_merge_dim_student.sql # SCD2 merge with late-arrival handling
│ ├── analytics/
│ │ ├── sp_kpi_absenteeism.sql
│ │ └── sp_kpi_collection_rate.sql
│ ├── maintenance/
│ │ ├── sp_index_health.sql
│ │ ├── sp_reindex_smart.sql # online rebuild >30%, reorg 5..30%
│ │ ├── sp_missing_indexes.sql # DMV-based suggestions
│ │ └── sp_unused_indexes.sql # candidates to drop
│ └── dq/
│ ├── sp_find_orphans.sql # generic, works on any (child,parent) pair
│ └── sp_find_duplicates.sql # generic, works on any column set
├── functions/
│ ├── fn_business_days_between.sql
│ ├── fn_age_in_years.sql
│ ├── fn_split_csv.sql # inline TVF
│ └── fn_date_range.sql # inline TVF
├── views/
│ ├── vw_attendance_daily.sql # Power BI / Excel friendly
│ └── vw_finance_summary.sql
├── tests/ # tSQLt
│ ├── test_imports.sql
│ ├── test_scd2.sql
│ └── test_dq.sql
├── samples/
│ └── seed-small.sql # ~1000 rows synthetic
├── tools/
│ ├── install-tsqlt.sql # instructions only — fetch tSQLt yourself
│ └── run-everything.sql # single-shot setup
├── docs/
│ ├── conventions.md # naming + safety conventions used here
│ ├── idempotency-pattern.md # why & how
│ └── scd2-pattern.md # SCD2 with late-arrival handling
├── BUILD-PLAN.md # the spec this was built from
├── LICENSE
└── README.md
| Rule | Why |
|---|---|
snake_case for all identifiers |
ANSI/Postgres compatible; no quoting |
core / stg / audit / mart / util / dq schemas |
Purpose-of-data, not technology |
Every SP starts with SET NOCOUNT ON; SET XACT_ABORT ON; |
Fewer roundtrips + safe transactions |
Every write wrapped in TRY ... CATCH ... ROLLBACK ... THROW |
Atomic, observable failures |
DATETIME2(0) everywhere, never DATETIME |
Smaller, more accurate, ANSI |
No SELECT * in production code |
Schema-drift safety |
INSERT ... WHERE NOT EXISTS over naive MERGE |
Avoids the canonical MERGE race |
Full list → docs/conventions.md
- Idempotency must be designed in, not patched on. The audit row + uniqueness key +
WHERE NOT EXISTSpattern is short, but every part of it earns its place. Any one of them missing and re-runs misbehave. - The
MERGErace condition is the most common production bug in T-SQL data engineering. Most teams writeMERGEwithoutHOLDLOCKbecause the syntax looks complete. It isn't. - Row hashes (
HASHBYTES('SHA2_256', concat_ws(...))) are the difference between an SCD2 merge that's fast and one that's pathological. Comparing business attributes via hash means no-op merges are a single comparison, not a full row diff. - Audit logs that are also the idempotency mechanism are double-duty wins. One write does both jobs.
- CDC-aware version of
sp_import_attendance(consume Change Data Capture instead of staging) - Always-on / availability-group-aware variants of
sp_reindex_smart - Column-store fact partitioning examples
- Parameterized data generator (CLI param:
n_users,n_days) - PostgreSQL port for the patterns (FOR JSON → JSON_AGG, etc.)
This is a clean-room reference implementation. No code, schema, screenshot, or data from any employer or client appears in this repository. All data is synthetic, generated by the included seeder.
MIT — see LICENSE.
Author: Yugandhar N — Junior Software Developer, Fresh B.Tech Grad (Apr 2026)