Reference Data & Master Data Management for ETRM
Practical course on building robust reference data and MDM platforms tailored for ETRM systems. Learn domain modeling, taxonomy, entity resolution, stewardship, synchronization patterns and governance required to keep your trading platform accurate, auditable and performant.
Course Snapshot
- Master domains: Instruments, Counterparties, Locations, Books, Portfolios, Rates & Curves
- Entity resolution, golden records, and lineage
- MDM architectures: hub-and-spoke, registry, and hybrid
- Integration patterns for Gravitas ETRM, TRM, Allegro and data warehouses
Overview
Reference data and master data are foundational for correct pricing, risk calculation, settlement and regulatory reporting in ETRM environments. This course teaches how to design and operate MDM solutions that keep trading systems consistent, reduce reconciliations, and support fast, auditable deployments.
Who should attend
Data architects, ETRM implementers, data stewards, platform engineers, and trading ops leads.
Key outcomes
Deliver master data models, stewardship processes, synchronization patterns and operational playbooks to run MDM for ETRM.
Prerequisites
Familiarity with ETRM trade concepts, basic SQL, and system integration concepts.
Curriculum — Modules & Topics
Comprehensive modules from domain modeling to governance and implementation patterns.
Module 1 — Master Data Domains for ETRM
- Instruments & instrument family taxonomy (forwards, swaps, options, physical commodities)
- Counterparties & legal entities, books, portfolios, locations, curves and calendars
- Reference attributes: currencies, units, settlement rules, delivery points
Module 2 — Logical & Physical Data Modeling
- Conceptual → logical → physical model translation for OLTP & OLAP
- Normalized vs denormalized patterns for fast joins and analytics
- Schema versioning and compatibility strategies
Module 3 — Entity Resolution & Golden Record
- Matching strategies (deterministic, probabilistic), fuzzy matching and scoring
- Golden record creation, survivorship rules and conflict resolution
- Tools & libraries: OpenRefine concepts, dedupe, custom scoring pipelines
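As a rough illustration of the matching and survivorship topics in this module, the sketch below combines a deterministic rule (identical LEIs) with a fuzzy name score from Python's standard difflib, then builds a golden record field by field. The record layout, the source names, the source priority order and the rule that two different LEIs never match are all assumptions made for the example, not part of the course material.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


@dataclass
class SourceRecord:
    source: str              # hypothetical source system name
    name: str
    lei: Optional[str]       # Legal Entity Identifier, if the source holds one
    country: Optional[str]
    updated: str             # ISO date string, so it sorts lexicographically


LEGAL_SUFFIXES = {"ltd", "limited", "llc", "inc", "plc", "gmbh"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and common legal suffixes before comparing."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(t for t in cleaned.split() if t not in LEGAL_SUFFIXES)


def match_score(a: SourceRecord, b: SourceRecord) -> float:
    """Deterministic on LEI when both sides have one, fuzzy on name otherwise."""
    if a.lei and b.lei:
        # Assumption: two different LEIs are treated as a definite non-match.
        return 1.0 if a.lei == b.lei else 0.0
    return SequenceMatcher(None, normalize(a.name), normalize(b.name)).ratio()


# Illustrative trust order: lower number wins. Unknown sources rank last.
SOURCE_PRIORITY = {"credit": 0, "etrm": 1, "crm": 2}


def golden_record(cluster: List[SourceRecord]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Pick each field from the most trusted, then most recent, non-empty source."""
    ranked = sorted(cluster, key=lambda r: r.updated, reverse=True)
    ranked.sort(key=lambda r: SOURCE_PRIORITY.get(r.source, 99))  # stable sort
    golden: Dict[str, str] = {}
    provenance: Dict[str, str] = {}
    for field_name in ("name", "lei", "country"):
        for rec in ranked:
            value = getattr(rec, field_name)
            if value:
                golden[field_name] = value
                provenance[field_name] = rec.source  # record where the value came from
                break
    return golden, provenance
```

In practice the score would be compared against two thresholds, one for automatic merge and a lower one that routes the pair to a steward for review; the right values depend on the data and are not implied here.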
Module 4 — MDM Architectures & Patterns
- Hub-and-spoke, registry, and hybrid MDM patterns
- Real-time vs batch sync: CDC, event-driven registry, and API façade
- Metadata store, schema registry, and contract governance
Module 5 — Data Quality & Validation
- DQ rules: completeness, accuracy, consistency, uniqueness
- Automated validation pipelines, shadow writes & quarantine queues
- Data quality KPIs and dashboards for stewards
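The completeness and uniqueness rules listed above reduce to simple ratios over a batch of records. The following is a minimal sketch, assuming records arrive as plain dictionaries; the function names and the choice to quarantine any row with a missing required field are illustrative only.

```python
from collections import Counter
from typing import Dict, List, Tuple

Row = Dict[str, object]


def is_filled(value: object) -> bool:
    """Treat None and the empty string as missing."""
    return value is not None and value != ""


def completeness(rows: List[Row], field_name: str) -> float:
    """Share of rows where the field is present and non-empty."""
    if not rows:
        return 1.0
    return sum(1 for r in rows if is_filled(r.get(field_name))) / len(rows)


def uniqueness(rows: List[Row], key: str) -> float:
    """Share of rows whose key value occurs exactly once in the batch."""
    if not rows:
        return 1.0
    counts = Counter(r.get(key) for r in rows)
    return sum(1 for r in rows if counts[r.get(key)] == 1) / len(rows)


def split_quarantine(rows: List[Row], required: List[str]) -> Tuple[List[Row], List[Row]]:
    """Route rows that miss a required field to a quarantine list for stewards."""
    clean: List[Row] = []
    quarantined: List[Row] = []
    for r in rows:
        target = clean if all(is_filled(r.get(f)) for f in required) else quarantined
        target.append(r)
    return clean, quarantined
```

The two ratios are the kind of figure a steward dashboard would chart over time; the quarantine list keeps bad rows out of the trading platform without discarding them.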
Module 6 — Stewardship & Workflows
- Stewardship UI patterns, task management and approval flows
- Change requests, versioned updates, and audit trails
- SLAs and escalation for data fixes impacting trading
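A change-request flow with versioned updates and an audit trail can be sketched in a few lines. Everything here is hypothetical: the class names, the status values, and in particular the rule that a requester cannot approve their own change, which is one common governance policy rather than something the module prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class ChangeRequest:
    key: str                 # identifier of the master record being changed
    proposed: dict           # full proposed state of the record
    requested_by: str
    status: str = "pending"  # pending -> approved | rejected


class StewardedStore:
    """In-memory stand-in for a governed master data store."""

    def __init__(self) -> None:
        self.versions: Dict[str, List[dict]] = {}  # approved snapshots, oldest first
        self.audit: List[dict] = []                # append-only audit trail

    def _log(self, action: str, cr: ChangeRequest, actor: str) -> None:
        self.audit.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "key": cr.key,
            "actor": actor,
        })

    def submit(self, cr: ChangeRequest) -> None:
        self._log("submitted", cr, cr.requested_by)

    def decide(self, cr: ChangeRequest, approver: str, approve: bool) -> None:
        if cr.status != "pending":
            raise ValueError("change request already decided")
        if approver == cr.requested_by:
            # Assumed policy: a second person must review every change.
            raise PermissionError("requester cannot approve their own change")
        cr.status = "approved" if approve else "rejected"
        if approve:
            # Each approval appends a new immutable snapshot instead of overwriting.
            self.versions.setdefault(cr.key, []).append(dict(cr.proposed))
        self._log(cr.status, cr, approver)

    def current(self, key: str) -> dict:
        return self.versions[key][-1]
```

Because snapshots are appended rather than overwritten, the state of a record at any earlier approval can be recovered for an audit.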
Module 7 — Integration with ETRM & Downstream Systems
- Adapters to Gravitas ETRM, TRM, Allegro — mapping canonical attributes to platform fields
- Synchronization patterns: push (API) vs pull (registry) vs publish/subscribe
- Idempotent updates and change-data-capture (CDC) strategies
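To make the adapter topics concrete, the sketch below maps a canonical record to one target platform's field names and applies it with an idempotent upsert that skips unchanged payloads. The target field names (CPTY_CODE and so on) are invented for illustration and do not correspond to any real Gravitas ETRM, TRM or Allegro schema; the in-memory table stands in for the platform's API or staging table.

```python
import hashlib
import json
from typing import Dict

# Hypothetical mapping from canonical attribute names to one platform's fields.
FIELD_MAP = {
    "counterparty_id": "CPTY_CODE",
    "legal_name": "CPTY_NAME",
    "country": "CNTRY_CD",
}


def to_platform(canonical: dict) -> dict:
    """Translate a canonical record into the target platform's field names."""
    missing = [k for k in FIELD_MAP if k not in canonical]
    if missing:
        raise ValueError(f"canonical record is missing: {missing}")
    return {FIELD_MAP[k]: canonical[k] for k in FIELD_MAP}


def fingerprint(row: dict) -> str:
    """Stable hash of the row content, independent of key order."""
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class TargetTable:
    """In-memory stand-in for a platform table keyed by one field."""

    def __init__(self, key_field: str) -> None:
        self.key_field = key_field
        self.rows: Dict[str, dict] = {}
        self.hashes: Dict[str, str] = {}
        self.writes = 0

    def upsert(self, row: dict) -> bool:
        """Write only if the content changed; replaying a message is a no-op."""
        key = row[self.key_field]
        digest = fingerprint(row)
        if self.hashes.get(key) == digest:
            return False
        self.rows[key] = dict(row)
        self.hashes[key] = digest
        self.writes += 1
        return True
```

Applying the same message twice leaves the table unchanged, which is what makes retries and at-least-once delivery safe for this adapter.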
Module 8 — Governance, Lineage & Compliance
- Data lineage, provenance, and impact analysis for regulatory audits
- Policies: retention, masking, PII handling and role-based access
- Operational runbooks, monitoring and remediation playbooks
Reference Architectures & Implementation Patterns
Practical blueprints for implementing MDM in ETRM landscapes.
Registry + Event-driven Façade
A central registry stores canonical records and emits an event on every change. Downstream ETRM adapters subscribe to those events and reconcile their local copies. The pattern keeps coupling low and makes it easier to onboard legacy systems.
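The shape of this pattern can be shown with an in-memory registry that versions each record and notifies subscribers when a record changes. This is a toy sketch: the class, the event fields and the synchronous callbacks are assumptions, and a real deployment would publish to a message broker instead of calling handlers directly.

```python
from typing import Callable, Dict, List

Event = dict
Handler = Callable[[Event], None]


class Registry:
    """Holds canonical records and emits one event per actual change."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}
        self._versions: Dict[str, int] = {}
        self._subscribers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        self._subscribers.append(handler)

    def put(self, key: str, record: dict) -> int:
        """Store the record and return its version; unchanged writes emit nothing."""
        if self._records.get(key) == record:
            return self._versions[key]
        version = self._versions.get(key, 0) + 1
        self._records[key] = dict(record)
        self._versions[key] = version
        event = {"key": key, "version": version, "record": dict(record)}
        for handler in self._subscribers:
            handler(event)  # each adapter reconciles its own system
        return version

    def get(self, key: str) -> dict:
        return dict(self._records[key])
```

An adapter is then just a handler, for example `registry.subscribe(etrm_adapter.on_change)`, where `etrm_adapter` is a hypothetical object owned by the downstream system.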
Hub-and-Spoke (Master Hub)
The central hub is the golden source: authoritative writes are allowed only through governance workflows, and spokes sync to local systems periodically or on change for performance.
CDC + Event Sourcing for Near-real-time Sync
Use CDC from master data systems (for example, Debezium running on Kafka Connect) to keep caches and ETRM systems in sync, preserving per-record ordering and applying updates idempotently.
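One common way to get ordering and idempotency on the consumer side is to carry a monotonically increasing version on every change event and ignore anything that is not newer than what has already been applied. The sketch below assumes a simplified event shape (key, version, op, data) chosen for the example; it is not the Debezium change event envelope, and the version would in practice come from a source sequence or log position.

```python
from typing import Dict, List


def apply_event(state: Dict[str, dict], event: dict) -> bool:
    """Apply one change event; return False if it was stale or a duplicate."""
    key = event["key"]
    current = state.get(key)
    if current is not None and event["version"] <= current["version"]:
        return False  # already seen this version or a newer one
    if event["op"] == "delete":
        # Keep a tombstone so a late, older upsert cannot resurrect the record.
        state[key] = {"version": event["version"], "deleted": True, "data": None}
    else:
        state[key] = {"version": event["version"], "deleted": False, "data": event["data"]}
    return True


def replay(state: Dict[str, dict], events: List[dict]) -> List[bool]:
    """Apply a batch in arrival order and report which events took effect."""
    return [apply_event(state, e) for e in events]
```

With this guard in place, redelivered or reordered messages converge to the same final state, so the consumer can safely run with at-least-once delivery.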
Schema Registry & Contract Governance
Store canonical JSON/Avro schemas in a registry, manage versions and compatibility, and enforce contracts with tests and CI pipelines.
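A contract test in CI can be as small as a function that compares two schema versions and lists breaking changes. The check below is a deliberately simplified notion of backward compatibility for flat, Avro-style record schemas (a new field needs a default, an existing field must keep its type). It does not implement Avro's full schema resolution rules, such as type promotion or unions, and the example schemas are invented.

```python
from typing import Dict, List


def backward_compatible(old: dict, new: dict) -> List[str]:
    """Return reasons why `new` cannot read data written with `old` (empty if fine)."""
    problems: List[str] = []
    old_fields: Dict[str, dict] = {f["name"]: f for f in old["fields"]}
    for f in new["fields"]:
        prev = old_fields.get(f["name"])
        if prev is None:
            # Old data has no value for this field, so the reader needs a default.
            if "default" not in f:
                problems.append(f"new field '{f['name']}' has no default")
        elif prev["type"] != f["type"]:
            problems.append(
                f"field '{f['name']}' changed type {prev['type']} -> {f['type']}"
            )
    return problems


# Hypothetical instrument schemas used only to exercise the check.
INSTRUMENT_V1 = {"name": "Instrument", "fields": [
    {"name": "instrument_id", "type": "string"},
    {"name": "family", "type": "string"},
]}
INSTRUMENT_V2 = {"name": "Instrument", "fields": [
    {"name": "instrument_id", "type": "string"},
    {"name": "family", "type": "string"},
    {"name": "unit", "type": "string", "default": "MWh"},
]}
```

A pipeline step would fail the build whenever `backward_compatible(published, candidate)` returns a non-empty list.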
Hands-on Labs & Exercises
Practical exercises to build reusable master data components and processes.
Lab 1 — Instrument Taxonomy & Canonical Schema
Design a versioned canonical schema for instruments; implement it in JSON Schema or Avro and build a small registry to hold it.
Lab 2 — Counterparty Matching & Golden Record
Build deterministic + fuzzy matching, scoring, survivorship rules and generate golden records with provenance metadata.
Lab 3 — CDC Sync to ETRM Sandbox
Set up a CDC pipeline (Debezium on Kafka Connect) to stream master data updates and apply them to a mock Gravitas ETRM/TRM table while handling ordering and idempotency.
Lab 4 — Stewardship UI & Approval Workflow
Build a basic stewardship UI (React stub) to approve or reject changes, record an audit trail and produce versioned snapshots.
Lab 5 — Data Quality Framework & Dashboards
Implement a rule engine for DQ checks and create a dashboard (Grafana/Metabase) showing completeness, uniqueness and freshness KPIs.
Capstone — MDM for ETRM Mini-Project
Deliver a mini-MDM system: canonical schemas, golden records, CDC sync to a mock ETRM, a stewardship UI and DQ dashboards. Provide an architecture document and a runbook.
Deliverables & Materials
- Canonical schema catalog (instruments, counterparties, locations, books)
- Entity resolution rules, golden-record engine and sample code
- CDC integration examples, mapping templates to Gravitas ETRM/TRM/Allegro
- Stewardship UI prototype, DQ rules and monitoring dashboards
- Operational runbooks, SLA definitions and governance templates
Pricing & Delivery Options
Self-paced
Recorded modules, schema catalog and lab guides.
Cohort (Instructor-led)
6-week cohort with live labs, code reviews and capstone feedback.
Enterprise
Private workshops, on-site MDM design and production hardening.
Contact & Custom Requests
Want an enterprise quote, private cohort, or a customized syllabus? Tell us about team size, preferred delivery and target outcomes.