UN Digital Trade Hackathon · 2026 · ML · LegalTech/AI

ClauseMark

Verified Regulatory Evidence Engine - UN Digital Trade (RDTII)

Problem

UN digital-trade benchmarking (RDTII) requires mapping real statutes to policy indicators with defensible evidence. Legal RAG systems routinely hallucinate citations; isolated clause retrieval misses exceptions and definitions in other sections; a confident "0" without measured recall is epistemically unsafe for regulators.

Approach

Regulatory Intelligence Engine (RIE): parent-document hybrid retrieval over a legal structure graph; constrained decoding for indicator IDs; four verification gates (deterministic span checks, NLI, a second LLM, and self-consistency). Layer 1 is verifiable extraction; Layer 2 is a recommended score band that always requires human confirmation. Pillars and indicators live in YAML, not Python, so the engine extends to all 12 RDTII pillars by configuration.
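
The deterministic span check, the first of the four gates, can be pictured as a verbatim-containment test: a quoted piece of evidence passes only if it actually occurs in the span it claims to come from. The following is a minimal sketch under assumptions, not the project's code. `Claim`, `normalize`, and `span_check` are hypothetical names, and normalizing whitespace and case before comparing is one plausible choice among several.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical shape of an extracted claim: a span ID plus the quoted evidence."""
    span_id: str
    quote: str


def normalize(text: str) -> str:
    # Fold Unicode variants, collapse all whitespace runs, and ignore case,
    # so PDF line breaks and ligatures do not cause false rejections.
    text = unicodedata.normalize("NFKC", text)
    return " ".join(text.split()).casefold()


def span_check(claim: Claim, spans: dict[str, str]) -> bool:
    """Pass only if the span exists and contains the quote verbatim after normalization."""
    source = spans.get(claim.span_id)
    if source is None:
        return False  # the claim points at a span that was never indexed
    return normalize(claim.quote) in normalize(source)
```

A check like this involves no model call, so it is cheap to run before the NLI, second-LLM, and self-consistency gates.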

At a glance

Event: UN Digital Trade · 2026
Pillars: 12 (6 and 7 in depth)
Packages: 15 (uv monorepo)
Verification: 4 gates + HITL
Retrieval: BGE-M3 + Qdrant + RRF + rerank
Eval: RAGAS + ablations
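
RRF in the retrieval entry stands for reciprocal rank fusion, which merges several ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below illustrates the general technique only. It assumes the inputs are ranked lists of chunk IDs (for example one dense and one lexical ranking) and uses the commonly cited constant k = 60. The function name `rrf` and the alphabetical tie-break are illustrative choices, not details taken from the project.

```python
from collections import defaultdict


def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of IDs with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; items missing from a list contribute nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["a", "b", "c"]   # hypothetical dense-retriever ranking
sparse = ["b", "d", "a"]  # hypothetical lexical-retriever ranking
fused = rrf([dense, sparse])  # ['b', 'a', 'd', 'c']
```

Because only ranks enter the formula, the two retrievers' raw scores never need to be put on a common scale.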

Tech decisions

ID-replacement citations
The model never authors citation strings, only span IDs that deterministic code resolves, which blocks ghost references.
Two-layer output
Extraction is automatable and gate-checked; scoring is legal judgment and stays a reviewer-confirmed recommendation.
Parent-document retrieval
Child-chunk search with parent and neighbourhood context recovers cross-section exceptions that standard chunking severs.
Pillar-as-data
The engine stays generic; legal expertise ships in validated YAML and gold sets, so a pillar can be added without refactoring core code.
LangGraph + Postgres checkpointer
Durable state and interrupt-based human review for flagged claims and absence cases.
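
The ID-replacement idea above can be sketched as a post-processing step: the model's draft contains only opaque span IDs, and plain code swaps each one for a citation string drawn from a trusted registry. This is an illustration under assumptions, not the project's implementation. The `[[S1]]` placeholder syntax, the `resolve_citations` name, and the registry contents (an invented "Example Act") are all hypothetical.

```python
import re

# Hypothetical placeholder format: the model may only emit IDs such as [[S1]].
SPAN_REF = re.compile(r"\[\[(S\d+)\]\]")


def resolve_citations(draft: str, registry: dict[str, str]) -> str:
    """Replace every span-ID placeholder with its registered citation string."""

    def replace(match: re.Match) -> str:
        span_id = match.group(1)
        if span_id not in registry:
            # An ID the retriever never produced is rejected, not guessed at.
            raise KeyError(f"unknown span ID: {span_id}")
        return f"[{registry[span_id]}]"

    return SPAN_REF.sub(replace, draft)


registry = {"S1": "Example Act, s. 12(3)"}  # invented citation for illustration
text = resolve_citations("Cross-border transfers require consent [[S1]].", registry)
```

Since every citation string comes from the registry, a reference that was never retrieved cannot appear in the output; the worst case is a raised error that can be routed to human review.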

Stack

Python · LangGraph · FastAPI · Qdrant · BGE-M3 · Docling · PyMuPDF · RAGAS · Pydantic · Streamlit · Postgres · Docker
