Nursing Home Penalties
Dataset ID: nh-penalties
Provenance
- Dataset ID: nh-penalties
- Entity Type: snf
- Role: enrichment
- Source: CMS
- Vintage: Mar 2026
- Entity Count: 14,703
- Last ETL Run: 2026-04-13
Overview
The Nursing Home Penalties dataset is published by the Centers for Medicare & Medicaid Services (CMS) through the Care Compare program (formerly Nursing Home Compare) and is available as a public-use file on data.cms.gov (dataset identifier: g6vv-u9sr). It contains one row per federal enforcement penalty imposed on a Medicare- and Medicaid-certified skilled nursing facility (SNF). Each record carries approximately 10 fields including the penalty date, penalty type, fine amount, and the Federal Provider Number of the penalized facility. The dataset covers penalties imposed within the most recent three-year window and is updated monthly.
Penalty types fall into two categories: civil money penalties (CMPs), which are fines, and denial of payment for new admissions (DPNAs), which prohibit a facility from billing Medicare or Medicaid for newly admitted residents until the cited deficiency is corrected. CMPs can be assessed on a per-day or per-instance basis; per-day penalties accrue for each day a deficiency remains uncorrected, which can produce large cumulative fine amounts for facilities that are slow to remediate. This dataset answers questions such as: Has a facility been fined by CMS? How many penalties has it received in the past three years? What was the total dollar amount of fines imposed? Has a facility faced denial of payment for new admissions? How does its penalty history compare to peer facilities?
Join Strategy
Each penalty record is joined to a CareGraph SNF entity page using the Federal Provider Number field, which is the facility's CMS Certification Number (CCN). The CCN is a 6-character string, zero-padded on the left (e.g., 015001). During ETL, the normalize_ccn function normalizes the join key by stripping leading and trailing whitespace and left-padding with zeros to six characters. Because each facility can have multiple penalty records, the join is one-to-many: all matching penalty records are collected into a penalties array on the SNF entity page manifest. SNF pages without matching penalty records display no penalties section; the absence of records indicates that no federal penalties were imposed, not that data is missing. Source rows whose CCN does not match any existing SNF entity page are excluded from the site build. The CCN format is validated during the ETL build step, and malformed keys are reported in the build log.
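The join described above can be sketched as follows. This is a minimal illustration, not the actual ETL code: only the name normalize_ccn comes from this page, while its body, the group_penalties helper, the "Federal Provider Number" column header, and the rule for what counts as malformed are all assumptions.

```python
from collections import defaultdict


def normalize_ccn(raw):
    """Strip whitespace and left-pad with zeros to six characters.

    Assumed validation rule: keys that are missing, empty, longer than six
    characters, or not purely alphanumeric are treated as malformed (None).
    """
    if raw is None:
        return None
    ccn = str(raw).strip()
    if not ccn or len(ccn) > 6 or not ccn.isalnum():
        return None
    return ccn.zfill(6)


def group_penalties(rows, known_ccns, key_field="Federal Provider Number"):
    """Collect penalty rows per CCN (one-to-many).

    rows:       iterable of dicts, one per source CSV row
    known_ccns: set of CCNs that have an SNF entity page
    Returns (penalties_by_ccn, malformed_keys).
    """
    penalties = defaultdict(list)
    malformed = []
    for row in rows:
        ccn = normalize_ccn(row.get(key_field))
        if ccn is None:
            malformed.append(row.get(key_field))  # would go to the build log
            continue
        if ccn not in known_ccns:
            continue  # no matching SNF entity page: row is excluded
        penalties[ccn].append(row)
    return dict(penalties), malformed
```

A facility with no entry in the returned dictionary corresponds to an SNF page that shows no penalties section.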
Known Limitations
- Federal penalties only. This dataset contains only penalties imposed by CMS under federal authority. State-imposed penalties and sanctions, which some states impose under their own regulatory frameworks, are not included. A facility with no records in this dataset may still have been penalized by its state survey agency.
- CMS enforcement policy shift since 2016. CMS has increasingly used civil money penalties as an enforcement tool since 2016, particularly for deficiencies at the level of actual harm or immediate jeopardy. The volume of penalty records has grown over time due to this policy change. Trend analysis of penalty counts or total fine amounts must account for this shift in enforcement posture, not attribute it solely to changes in facility quality.
- No facility size normalization. Fine amounts are reported as absolute dollar values without adjustment for facility size. A $10,000 penalty represents a materially different financial impact for a 30-bed rural SNF than for a 300-bed urban facility. CareGraph does not normalize penalty amounts by bed count or revenue.
- Per-day penalty accumulation. Per-day CMPs accrue from the date the deficiency is identified until it is corrected. A single per-day penalty record may represent a modest daily rate that accumulated into a large total due to slow remediation. The dataset does not separately report the daily rate and the number of accrual days — only the total fine amount appears.
- Three-year lookback window. The dataset includes only penalties imposed within the most recent three years. Facilities with no penalty records may have had significant enforcement history prior to the lookback window. Older penalties are removed from the dataset as they age out.
- Deficiency-penalty gap. Not all deficiency citations result in penalties. Many deficiencies are resolved through plans of correction without financial enforcement. The absence of penalty records does not indicate deficiency-free inspections — it indicates only that CMS did not impose federal financial penalties.
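Two of the limitations above are arithmetic: per-day accumulation and the lack of size normalization. The sketch below works through both. The function names are hypothetical, the daily rate and day count are invented for illustration, and bed counts would have to come from another dataset because penalty records do not carry them.

```python
def per_day_total(daily_rate, accrual_days):
    """Total fine for a per-day CMP: daily rate times days uncorrected."""
    return daily_rate * accrual_days


def fine_per_bed(fine_amount, bed_count):
    """Fine amount divided by bed count; None if the bed count is unusable."""
    if not bed_count or bed_count <= 0:
        return None
    return fine_amount / bed_count


# Hypothetical per-day penalty: the rate and day count are illustrative only.
print(per_day_total(500.0, 60))             # 30000.0

# The $10,000 example from the list above, for a 30-bed and a 300-bed facility.
print(round(fine_per_bed(10_000, 30), 2))   # 333.33
print(round(fine_per_bed(10_000, 300), 2))  # 33.33
```

Because the dataset reports only the total, the first calculation cannot be reversed: a large total does not reveal whether the daily rate was high or the remediation was slow.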
Data Quality Notes
- Fine amount encoding. The fine_amount field in the source CSV contains dollar values with commas, dollar signs, and occasional non-numeric placeholders ("N/A", "Not Available", ".", "*"). The ETL strips currency formatting and converts valid values to floating-point numbers; non-numeric values are normalized to null. A null fine_amount on a DPNA-type record is expected, as denial of payment penalties do not carry a dollar fine amount.
- Date format inconsistencies. The penalty_date field in the source data uses mixed formats (MM/DD/YYYY and YYYY-MM-DD). The ETL reads this field as a string and stores it as-is in the JSON manifest. Consumers should parse dates defensively.
- Penalty type values. The penalty_type field uses free-text values that vary across data vintages (e.g., "Civil Money Penalty", "Fine", "Denial of Payment for New Admissions"). The ETL does not normalize these values to a controlled vocabulary. Consumers should match on substrings or patterns rather than exact equality.
- Remaining source fields passed through. Beyond the three explicitly mapped fields (penalty_date, penalty_type, fine_amount), the ETL passes through all other non-empty fields from the source CSV row into the penalty record. Field names retain the original CMS column headers, which use mixed case and spaces. These additional fields vary across data vintages and are not guaranteed to be present in every record.
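The notes above suggest how a consumer might handle these three fields defensively. The following is one possible sketch, not the ETL's own code: the function names, the "cmp"/"dpna"/"unknown" labels, and the exact matching rules are assumptions made for illustration.

```python
import re
from datetime import datetime

# Placeholder strings listed in the fine amount note, lowercased.
_PLACEHOLDERS = {"", "n/a", "not available", ".", "*"}


def parse_fine_amount(raw):
    """Strip '$' and ',' and return a float, or None for non-numeric values."""
    if raw is None:
        return None
    text = str(raw).strip()
    if text.lower() in _PLACEHOLDERS:
        return None
    cleaned = text.replace("$", "").replace(",", "")
    if not re.fullmatch(r"\d+(\.\d+)?", cleaned):
        return None
    return float(cleaned)


def parse_penalty_date(raw):
    """Try both documented formats; return a date, or None if neither fits."""
    text = (raw or "").strip()
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def classify_penalty_type(raw):
    """Substring match on free-text penalty types (labels are hypothetical)."""
    text = (raw or "").lower()
    if "denial of payment" in text:
        return "dpna"
    if "civil money penalty" in text or "fine" in text:
        return "cmp"
    return "unknown"
```

Returning None or "unknown" instead of raising keeps one unexpected value in a new data vintage from failing a whole batch; callers can count those results to spot format drift.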
---