Changelog
All notable changes to this project are documented here. This project adheres to Semantic Versioning.
v1.0.0 - 2026-04-21
Changed
- Removed legacy structured C API: the MS-DRG grouper now exclusively uses the high-performance JSON FFI pipeline. Removed ~30 granular FFI functions (`msdrg_input_create`, `msdrg_version_create`, etc.) and the Python `group_structured()` method. This simplifies the API surface and prevents accidental use of slower, less efficient processing paths.
- Thread-safe MCE component: refactored the Medicare Code Editor to be strictly stateless and thread-safe. A single `MceEditor` instance can now be safely shared across threads, matching the `MsdrgGrouper` behavior.
Fixed
- Fixed per-claim memory leak: resolved a string duplication leak in the marking phase (`marking.zig`) and an ArrayList leak in the processor chain (`chain.zig`). Memory usage of the entire pipeline now stays stable over millions of claims.
Performance
- AST formula caching: implemented a thread-safe global cache for parsed DRG formulas. Formulas are evaluated up to 10× faster by avoiding redundant lexing and parsing per claim.
- Double-checked locking: optimized the AST cache with a blocking `RwLock` and double-checked locking to minimize thread contention during warmup.
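The caching pattern named above can be illustrated with a minimal Python sketch. It is not the project's Zig implementation: Python's standard library has no reader-writer lock, so a plain `threading.Lock` stands in for the `RwLock`, and `FormulaCache` and `parse_formula` are hypothetical names used only for illustration.

```python
import threading


def parse_formula(text):
    """Hypothetical stand-in for lexing and parsing a DRG formula."""
    return tuple(text.split())


class FormulaCache:
    """Parse-once cache guarded by double-checked locking."""

    def __init__(self, parser):
        self._parser = parser
        self._cache = {}
        self._lock = threading.Lock()
        self.parse_count = 0  # how many times the parser actually ran

    def get(self, text):
        # First check: no lock is taken once the entry is cached.
        ast = self._cache.get(text)
        if ast is not None:
            return ast
        with self._lock:
            # Second check: another thread may have filled the entry
            # while this one was waiting for the lock.
            ast = self._cache.get(text)
            if ast is None:
                ast = self._parser(text)
                self.parse_count += 1
                self._cache[text] = ast
            return ast


# Eight threads request the same formula; it should be parsed only once.
cache = FormulaCache(parse_formula)
threads = [threading.Thread(target=cache.get, args=("A & B",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The second check inside the lock is what keeps concurrent first requests for the same formula from each running the parser during warmup.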
v0.1.10 - 2026-04-06
Added
- ICD-10 code conversion: map diagnosis and procedure codes between ICD-10 fiscal year versions using CMS conversion tables. Supports forward mapping (older→newer) and backward mapping (newer→older).

Standalone converter:

    import msdrg

    with msdrg.IcdConverter() as conv:
        # Convert a single DX code from FY2025 to FY2026
        new_code = conv.convert_dx("A000", source_year=2025, target_year=2026)
        # Convert a procedure code backwards
        old_code = conv.convert_pr("02703DZ", source_year=2026, target_year=2025)
        # Batch convert
        results = conv.convert_dx_batch(["I5020", "E1165"], source_year=2025, target_year=2026)

Grouper integration: set `source_icd_version` on a claim to auto-convert codes before grouping:

    with msdrg.MsdrgGrouper() as g:
        result = g.group({
            "version": 431,              # Target: FY2026
            "source_icd_version": 2025,  # Source: FY2025 codes
            "pdx": {"code": "I5020"},
            # ...
        })

- `scripts/compile_icd_conversions.py`: downloads CMS ICD-10-CM and ICD-10-PCS conversion tables for adjacent year pairs and compiles them into a single binary file per code type (`icd10cm_conversions.bin`, `icd10pcs_conversions.bin`).
- `zig_src/src/conversion.zig`: binary data loading with memory-mapped I/O and binary search lookup for code conversion pairs.
- C API functions:
  - `msdrg_convert_dx(ctx, code, source_year, target_year)`: convert a single diagnosis code
  - `msdrg_convert_pr(ctx, code, source_year, target_year)`: convert a single procedure code
  - `msdrg_input_set_source_icd_year(input, year)`: set source ICD year for auto-conversion
- `IcdConverter` Python class: standalone code conversion with `convert_dx()`, `convert_pr()`, `convert_dx_batch()`, `convert_pr_batch()` methods and `version_to_year()` / `year_to_version()` helpers.
- 18 new tests in `test_icd_conversions.py` covering lifecycle, version/year helpers, conversion behavior without data, and grouper integration.
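The binary search lookup mentioned for `conversion.zig` can be sketched in Python as follows. This is only an illustration of the technique: the pairs are placeholder strings rather than real CMS mappings, `convert` is a hypothetical helper, and returning the input unchanged when no pair exists is an assumption of this sketch.

```python
from bisect import bisect_left

# Placeholder pairs sorted by source code. These are NOT real CMS mappings;
# the real tables are compiled into binary files.
PAIRS = [
    ("AAA1", "AAA2"),
    ("BBB1", "BBB9"),
    ("CCC1", "CCC1X"),
]
SOURCES = [src for src, _ in PAIRS]


def convert(code, pairs=PAIRS, sources=SOURCES):
    """Return the mapped code, or the input unchanged when no pair exists.

    Passing unmapped codes through unchanged is an assumption of this sketch.
    """
    i = bisect_left(sources, code)  # O(log n) search over the sorted keys
    if i < len(sources) and sources[i] == code:
        return pairs[i][1]
    return code
```

Keeping the pairs sorted by source code is what lets a lookup run in logarithmic time without building a hash table at load time.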
v0.1.9 - 2026-04-06
- GrouperFlags in JSON output: `MsdrgGrouperFlags` is now computed and included in the JSON response:
  - `admit_dx_grouper_flag`: DX_VALID / DX_INVALID / DX_NOT_GIVEN
  - `initial_drg_secondary_dx_cc_mcc` / `final_drg_secondary_dx_cc_mcc`: NONE / CC / MCC
  - `num_hac_categories_satisfied`: count of unique HAC categories with HAC_CRITERIA_MET
  - `hac_status_value`: NOT_APPLICABLE / FINAL_DRG_NO_CHANGE / FINAL_DRG_CHANGES / FINAL_DRG_UNGROUPABLE
- GrouperFlags C API: 9 new result getters for structured callers:
  - `msdrg_result_get_admit_dx_grouper_flag()` / `_name()`
  - `msdrg_result_get_initial_severity()` / `_name()`
  - `msdrg_result_get_final_severity()` / `_name()`
  - `msdrg_result_get_num_hac_categories_satisfied()`
  - `msdrg_result_get_hac_status_value()` / `_name()`
- HAC test scenarios: 5 new test cases in `test_hac_scenarios.py` validating T80211A (HAC 7) and T8141XA (HAC 11/12/13) triggers with POA=Y/N and EXEMPT hospital status
Refactored
- `marking.zig` deduplication: consolidated ~1,300 lines of duplicated marking logic into ~938 lines (~31% reduction) via shared helper functions: `getWinningFormula`, `updateImpactDirection`, `markDiagnosisCodes`, `markProcedureCodes`, and Final-specific variants
v0.1.8 - 2026-04-02
Added
- Expanded comparison testing: `compare_groupers.py` now validates initial DRG, initial MDC, final DRG, and final MDC against the Java reference (previously only checked final DRG/MDC). The Java `process()` method now returns a structured dict with all four values.
- New output fields: `GroupResult` and the JSON API now expose `initial_base_drg`, `final_base_drg`, `initial_return_code`, `initial_severity`, and `final_severity`, matching fields available on the Java `MsdrgOutput` class.
Input Validation
- `hospital_status`: must be `EXEMPT`, `NOT_EXEMPT`, or `UNKNOWN` (was silently defaulting to `NOT_EXEMPT`)
- `tie_breaker`: must be `CLINICAL_SIGNIFICANCE` or `ALPHABETICAL`
- POA indicators: must be `Y`, `N`, `U`, `W`, or space (was unchecked)
- Procedure `code`: must be a string (was only checking existence)
- MCE `icd_version`: must be 9 or 10
- MCE `discharge_date`: must be YYYYMMDD between 20000101 and 21001231
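The rules above can be sketched as small Python checks. This is an illustration only, not the contents of the package's validation module. `check_choice` and `check_discharge_date` are hypothetical helpers. The sketch also assumes the date arrives as a string and that impossible calendar dates are rejected.

```python
from datetime import datetime

HOSPITAL_STATUSES = {"EXEMPT", "NOT_EXEMPT", "UNKNOWN"}
TIE_BREAKERS = {"CLINICAL_SIGNIFICANCE", "ALPHABETICAL"}
POA_VALUES = {"Y", "N", "U", "W", " "}


def check_choice(field, value, allowed):
    """Raise ValueError unless value is one of the allowed choices."""
    if value not in allowed:
        raise ValueError(f"{field!r} must be one of {sorted(allowed)}, got {value!r}")


def check_discharge_date(value):
    """Accept a YYYYMMDD string between 20000101 and 21001231."""
    if not (isinstance(value, str) and len(value) == 8
            and value.isascii() and value.isdigit()):
        raise ValueError(f"'discharge_date' must be YYYYMMDD, got {value!r}")
    try:
        # Assumption of this sketch: impossible dates such as 20250230 are rejected.
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        raise ValueError(f"'discharge_date' is not a real date: {value!r}") from None
    if not "20000101" <= value <= "21001231":
        raise ValueError(f"'discharge_date' out of range: {value!r}")
```

Raising on an unknown value, rather than falling back to a default, is what closes the silent-default behavior described for `hospital_status`.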
v0.1.7 - 2026-03-31
Added
- Clinical significance tie-breaking: SDX codes are now sorted by severity (MCC > CC > other, then by ICD code string) before the marking phase. This matches the CMS Java grouper's `CLINICAL_SIGNIFICANCE` tie-breaking behavior, where the most clinically significant diagnosis gets first pick of matching attributes during DRG formula evaluation.
- `tie_breaker` input field: new optional per-request field on `ClaimInput`:

      {"tie_breaker": "CLINICAL_SIGNIFICANCE"}  # default
      {"tie_breaker": "ALPHABETICAL"}           # ICD code string only

  The default (`CLINICAL_SIGNIFICANCE`) matches the CMS Java reference and should be used unless there is a specific reason to override it.
- `MarkingLogicTieBreaker` enum: new enum in `models.zig` (`CLINICAL_SIGNIFICANCE`, `ALPHABETICAL`) stored on `RuntimeOptions`.
- `msdrg_input_set_tie_breaker()`: C API function for structured callers.
- `CodeSetup` preprocessing link: new chain link inserted after `SdxAttributeProcessor` that sorts SDX codes (MCC > CC > other, by code string) and procedure codes (by code value) when `CLINICAL_SIGNIFICANCE` mode is active.
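The two sort orders can be sketched in a few lines of Python. `sort_sdx` and `SEVERITY_RANK` are hypothetical names, and the severity labels attached to the sample codes are illustrative rather than looked up from CMS data.

```python
SEVERITY_RANK = {"MCC": 0, "CC": 1}  # any other severity sorts after CC


def sort_sdx(codes, tie_breaker="CLINICAL_SIGNIFICANCE"):
    """Sort (icd_code, severity) tuples and return a new list."""
    if tie_breaker == "ALPHABETICAL":
        # ICD code string only
        return sorted(codes, key=lambda c: c[0])
    # MCC > CC > other, then by ICD code string
    return sorted(codes, key=lambda c: (SEVERITY_RANK.get(c[1], 2), c[0]))


# Severity labels below are illustrative, not CMS data.
sdx = [("E1165", "CC"), ("A000", "NONE"), ("I5020", "MCC")]
by_significance = sort_sdx(sdx)
by_code = sort_sdx(sdx, tie_breaker="ALPHABETICAL")
```

With this input, `by_significance` starts with the MCC code and `by_code` starts with `A000`, so the two modes can hand first pick of matching attributes to different diagnoses.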
Fixed
- Stent marking (wrong attribute name case): `markStents()` in `marking.zig` used `"nordrugstent"` and `"norstent"` (lowercase) instead of the correct `"NORdrugstent"` and `"NORstent"` (mixed case) from the data layer. The attribute cleanup after stent processing was silently failing, leaving stale attributes in the matched set.
- Stent marking (missing secondary phase): implemented the missing secondary marking pass from the Java reference (`ProcedureFunctionMarking.java:61-73`). When the DRG formula matches both `arterial` and `NORdrugstent` (or `NORstent`), procedures with both attributes are now marked even if they lack the `STENT_4` flag.
Performance
~57% throughput increase (7,000 → 11,000+ claims/sec). Two optimizations:
- Mask-once architecture: the attribute mask is now built once after preprocessing and reused across all grouping, marking, and HAC call sites. Previously, `buildMask()` was called ~14-20 times per claim (each doing ~200-400 HashMap insertions with heap-allocated keys). Now it builds twice total (once after preprocessing, once after HAC processing). This eliminated ~10,000+ redundant heap allocations per claim.
- Zero-allocation attribute comparison: replaced all `Attribute.toString()` + `allocator.free()` pairs in marking inner loops with `Attribute.matchesString()`, which compares directly using a stack buffer for prefixed attributes. Eliminated ~200 heap churn operations per claim from O(N×M×A) attribute matching loops.

Dead code cleanup: removed unused error sets, duplicate code blocks, dead imports, dead Python bindings, and the entire unused `msdrg_data.zig` module.
Correctness
- Discharge status enum synced to Java reference: the `DischargeStatus` enum in `models.zig` now matches `MsdrgDischargeStatus.java` exactly. Fixed enum name mismatches (`ANOTHER_TYPE_FACILITY`→`CUST_SUPP_CARE`, `LEFT_AMA`→`LEFT_AGAINST_MEDICAL_ADVICE`, etc.), added missing codes (69, 70), and fixed `formulaString` for NONE (was returning `"invalid_dstat"`, now returns null per Java).
- Ungroupable claims now assign DRG 999: when the grouper sets a non-OK return code (e.g. `HAC_STATUS_INVALID_MULT_HACS_POA_NOT_Y_W`, `INVALID_DISCHARGE_STATUS`), the final DRG is now set to 999 (ungroupable) and MDC to 0, matching CMS standard behavior. Previously, DRG/MDC were left as null.
- Test claim generator fixed: `generate_test_claims.py` now uses only the 36 valid CMS discharge status codes from `MsdrgDischargeStatus.java`. Previously, it included invalid codes (40, 41, 42) that caused spurious `INVALID_DISCHARGE_STATUS` mismatches in comparison testing.
- Comparison test (discharge status passthrough): `compare_groupers.py` previously forced all discharge statuses other than 1 and 20 to HOME (1) when building Java input. It now passes the actual status through to `getEnumFromInt()`, ensuring both Java and Zig receive identical inputs.
- Comparison test (PDX POA passthrough): `compare_groupers.py` previously hardcoded `poa=Y` for all PDX codes. It now uses the claim's actual POA value.
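The ungroupable rule above amounts to a small piece of logic, sketched here in Python. `finalize` is a hypothetical helper, and representing return codes as strings with `"OK"` for success is an assumption of this sketch.

```python
UNGROUPABLE_DRG = 999
UNGROUPABLE_MDC = 0


def finalize(return_code, drg, mdc):
    """Return (final_drg, final_mdc), forcing 999/0 for any non-OK return code."""
    if return_code != "OK":
        return UNGROUPABLE_DRG, UNGROUPABLE_MDC
    return drg, mdc
```

Callers can then rely on a numeric DRG and MDC in every result instead of handling null separately.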
Removed
- `python_client/` directory: old standalone wrapper superseded by the proper `msdrg` package. Removed dead `msdrg.py` and `test_grouper.py` files, cleaned up fallback import in `compare_groupers.py`.
v0.1.6 - 2026-03-30
Added
- Structured C API: the C ABI now exposes a full structured API for building inputs and reading results without JSON serialization. This enables high-performance integration from C, C++, Rust, and other FFI-capable languages.
  - Input functions:
    - `msdrg_input_create()` / `msdrg_input_free()`: opaque input handle
    - `msdrg_input_set_pdx()`, `msdrg_input_set_admit_dx()`, `msdrg_input_add_sdx()`, `msdrg_input_add_procedure()`: set claim codes
    - `msdrg_input_set_demographics()`: set age, sex, discharge status
    - `msdrg_input_set_hospital_status()`: set hospital status (EXEMPT/NOT_EXEMPT/UNKNOWN)
  - Version functions:
    - `msdrg_version_create()` / `msdrg_version_free()`: create reusable version handle
  - Execution:
    - `msdrg_group(version, input)`: execute grouping, returns opaque result handle
  - Result getters (47 total):
    - Scalar: `msdrg_result_get_{initial,final}_{drg,mdc}`, `msdrg_result_get_return_code[_name]`
    - Descriptions: `msdrg_result_get_{initial,final}_{drg,mdc}_description`
    - PDX: `msdrg_result_has_pdx`, `msdrg_result_get_pdx_{code,mdc,severity,drg_impact,poa_error,flags}`
    - SDX: `msdrg_result_get_sdx_{count,code,mdc,severity,drg_impact,poa_error,flags}`
    - Procedures: `msdrg_result_get_proc_{count,code,is_or,drg_impact,is_valid,flags}`
    - `msdrg_result_free()`: release result
- Auto-generated C header: `zig build` now emits `zig-out/include/msdrg.h` with all 47 exported function declarations, `extern "C"` guards, and opaque handle typedefs. No manual synchronization required.
- Python `group_structured()` method: exposes the structured API path from Python for use cases that prefer direct FFI calls over JSON serialization. The default `group()` method continues to use the JSON path (faster for Python due to single FFI crossing).
Changed
- Hospital status is now exposed per-request in the structured API (`msdrg_input_set_hospital_status`) rather than only through JSON parsing
v0.1.5 - 2026-03-29
Added
- Input validation: `group()` and `edit()` now validate inputs before FFI calls, raising `ValueError` with clear, field-level messages (e.g. `'sex' must be an int (0=Male, 1=Female, 2=Unknown), got str: 'M'`)
- POA support in helpers: `create_claim()` and `create_mce_input()` now accept `pdx_poa` and SDX tuples like `("E1165", "Y")` for present-on-admission indicators
- `MsdrgGrouper.available_versions()`: static method to programmatically discover supported DRG versions (400–431)
- `orjson` acceleration: if `orjson` is installed, JSON serialization/deserialization is 3–10× faster with zero code changes
- `ResourceWarning` emitted when `MsdrgGrouper` or `MceEditor` is garbage-collected without explicit `close()` or `with` block
- `__repr__` on `MsdrgGrouper` and `MceEditor` showing `open`/`closed` state
- MCE smoke test added to `build.yml` CI workflow
- MCE benchmark mode (`--benchmark`) in `tests/compare_mce.py`
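The lifecycle behavior listed above (context manager, `open`/`closed` repr, and a `ResourceWarning` for unclosed objects) follows a common Python pattern, sketched below. `Handle` is a hypothetical stand-in, not the actual `MsdrgGrouper` or `MceEditor` code, and the exact repr format is an assumption.

```python
import warnings


class Handle:
    """Hypothetical wrapper around a native resource."""

    def __init__(self):
        self._closed = False

    def close(self):
        # A real wrapper would release the native context here.
        self._closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def __repr__(self):
        state = "closed" if self._closed else "open"
        return f"<{type(self).__name__} {state}>"

    def __del__(self):
        # Warn when the object is collected without close() or a with block.
        if not self._closed:
            warnings.warn(f"unclosed {self!r}", ResourceWarning, source=self)
            self.close()
```

Python ignores `ResourceWarning` by default, so the warning typically shows up only when warnings are enabled, for example with `python -W default` or under a test runner that turns them on.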
Changed
- `discharge_status` type widened from `Literal[1, 20]` to `int`: all CMS discharge status codes (01–99) are now accepted, matching Zig backend behavior
- Error messages improved: null returns from the native layer now include the input `version`, `pdx`, and `discharge_date` in the error, plus guidance on valid values
Fixed
- `build.yml` action versions pinned to stable releases (`checkout@v4`, `setup-python@v5`, `upload/download-artifact@v4`); previously referenced non-existent versions
- `build.yml` branch triggers: added `master` alongside `main` so CI actually runs on push
- Eliminated `ctypes` module pollution: `mce.py` no longer monkeypatches `ctypes._msdrg_lib`; library loading is now centralized in `msdrg/_native.py` with thread-safe caching
- Shared library loaded once: both `MsdrgGrouper` and `MceEditor` share a single `CDLL` handle via path-keyed cache, avoiding redundant loads
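A path-keyed, thread-safe library cache of the kind described above might look like the following sketch. `load_library` and its `loader` parameter are hypothetical and are not the contents of `msdrg/_native.py`; the loader is injectable here only so the caching can be exercised without a real shared library.

```python
import ctypes
import threading

_lock = threading.Lock()
_handles = {}  # library path -> loaded handle


def load_library(path, loader=ctypes.CDLL):
    """Return one shared handle per library path, loading it at most once."""
    with _lock:
        handle = _handles.get(path)
        if handle is None:
            handle = loader(path)
            _handles[path] = handle
        return handle
```

Because the cache lives in the module's own dictionary, nothing has to be attached to the `ctypes` module to share the handle between callers.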
Internal
- New modules: `msdrg/_native.py` (library discovery + cache), `msdrg/_json.py` (orjson fallback), `msdrg/_validation.py` (input checking)
- Removed duplicate `_find_mce_data_dir()` from `mce.py` (identical to `_find_data_dir()`)
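An optional-dependency fallback like the one attributed to `msdrg/_json.py` is usually a few lines. The sketch below shows the general pattern and is not the module's actual contents; `dumps` and `loads` are hypothetical wrapper names.

```python
import json

try:
    import orjson  # optional dependency
except ImportError:
    orjson = None


def dumps(obj):
    """Serialize to UTF-8 bytes, using orjson when it is installed."""
    if orjson is not None:
        return orjson.dumps(obj)
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def loads(data):
    """Parse JSON from bytes or str, using orjson when it is installed."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)
```

Returning bytes from both branches gives callers the same types whether or not `orjson` is present.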
v0.1.4 - 2026-03-29
Added
- Medicare Code Editor (MCE): full MCE validation engine with Python bindings (`MceEditor`, `create_mce_input()`)
Fixed
- Fix non-short-circuit logic for claims assigned to MDC 0 with an invalid PDX
v0.1.3 - 2026-03-25
Added
- Hospital status support: new `hospital_status` input field for HAC-exempt processing (`EXEMPT`, `NOT_EXEMPT`, `UNKNOWN`)
- Diagnosis filtering for SDX codes that meet HAC criteria under `NOT_EXEMPT` hospital status
Changed
- README updated to clarify that Zig is not required for `pip install` (prebuilt wheels)
- Improved HAC documentation in README
v0.1.2 - 2026-03-24
Added
- TypedDict request/response types: `ClaimInput`, `GroupResult`, `DiagnosisInput`, `DiagnosisOutput`, `ProcedureInput`, `ProcedureOutput` for full type-checking support
- Python test suite (`pytest`) for MS-DRG grouper
Fixed
- Zig C API: proper null checks, enum conversion, and arena allocator for JSON string allocation
- Fix segfault caused by use of grouper context after `close()`
- `pyproject.toml` is now the single source of truth for version definition
v0.1.1 - 2026-03-23
Fixed
- Update GitHub Actions workflow versions to latest
- Fix for accurate record file creation
v0.1.0 - 2026-03-23
Initial release
- MS-DRG Grouper engine ported from CMS Java reference implementation
- Python bindings via ctypes with `MsdrgGrouper` class
- Support for DRG versions 400–431 (FY 2023–FY 2026)
- Cross-platform shared library (Linux, macOS, Windows)
- 100% match rate against CMS Java grouper on 50,000+ test claims
- Binary data pipeline for compiling CMS CSV data
- C ABI for integration with any language