SNF Quality Measures (MDS)
Dataset ID: nh-quality-mds
Provenance
- Dataset ID: nh-quality-mds
- Entity Type: snf
- Role: enrichment
- Source: CMS
- Vintage: Mar 2026
- Entity Count: not reported
- Last ETL Run: 2026-04-13
Overview
The SNF Quality Measures (MDS) dataset is published by the Centers for Medicare & Medicaid Services (CMS) through the Nursing Home Care Compare program and is available as a public-use file on data.cms.gov (dataset identifier: nh-quality-mds). It contains facility-level quality measure scores derived from the Minimum Data Set (MDS), the standardized clinical assessment instrument that nursing home staff complete for every resident at admission, quarterly, annually, and upon significant change in condition. Each row represents one facility-measure combination, with approximately 20 fields per row covering the measure score, measure description, resident counts, footnote codes, and reporting period.
The measures are divided into two populations: long-stay (residents with 101 or more cumulative days in the facility) and short-stay (residents with 100 or fewer cumulative days). Long-stay measures include rates of pressure ulcers, falls with major injury, use of physical restraints, urinary tract infections, and antipsychotic medication use. Short-stay measures include rehospitalization rates and functional improvement metrics. This dataset answers questions such as: What percentage of a facility's long-stay residents received antipsychotic medications? How does a facility's short-stay rehospitalization rate compare to the national average? Which quality measures are suppressed for a given facility, and what does that imply about its resident population size?
Join Strategy
Each record is joined to a CareGraph SNF entity page using the Federal Provider Number field, which holds the facility's CMS Certification Number (CCN). The CCN is a 6-character string, zero-padded on the left (e.g., 015001). During ETL, the join key is normalized by stripping leading and trailing whitespace and left-padding with zeros to six characters. The join is a left join from the SNF entity manifest to this dataset: SNF pages without matching quality measure records display missing-data indicators rather than being omitted from the site. Because each facility has multiple rows (one per measure), the ETL pivots them into a single quality-measures object on the entity page manifest, keyed by measure identifier. Source rows whose CCN does not match any entity page are logged as unmatched and excluded from the site build. The CCN format is validated during the ETL build step, and malformed keys are reported in the build log.
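The join described above can be sketched roughly as follows. This is a minimal illustration, not the CareGraph ETL itself: the function names, the row keys (`federal_provider_number`, `measure_code`, `score`), and the validation pattern (six digits or uppercase letters) are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a well-formed CCN is six digits or uppercase letters.
CCN_PATTERN = re.compile(r"[0-9A-Z]{6}")


def normalize_ccn(raw):
    """Strip surrounding whitespace and left-pad with zeros to six characters."""
    return str(raw).strip().zfill(6)


def build_quality_objects(entity_ccns, source_rows):
    """Left-join source rows onto the entity manifest and pivot by measure.

    Returns (manifest, unmatched_rows, malformed_rows).
    """
    known = {normalize_ccn(c) for c in entity_ccns}
    pivoted = defaultdict(dict)
    unmatched, malformed = [], []

    for row in source_rows:
        ccn = normalize_ccn(row["federal_provider_number"])
        if not CCN_PATTERN.fullmatch(ccn):
            malformed.append(row)   # would be reported in the build log
        elif ccn not in known:
            unmatched.append(row)   # logged and excluded from the build
        else:
            # Pivot: one entry per measure identifier.
            pivoted[ccn][row["measure_code"]] = row["score"]

    # Left join: every entity is kept; entities with no rows get an empty object.
    manifest = {ccn: pivoted.get(ccn, {}) for ccn in sorted(known)}
    return manifest, unmatched, malformed
```

An entity with no matching rows ends up with an empty object, which a page template could render as a missing-data indicator.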
Known Limitations
- Self-reporting bias. MDS quality measures are derived from clinical assessments completed by nursing home staff, not from independent audits or claims data. Facilities with poor documentation practices may paradoxically appear to have better outcomes because conditions such as pressure ulcers, falls, or UTIs go unrecorded. CMS conducts targeted audits, but these cover a small fraction of facilities in any given year.
- Long-stay vs. short-stay population separation. Long-stay measures (101+ cumulative days) and short-stay measures (100 or fewer cumulative days, typically post-acute rehabilitation) describe fundamentally different resident populations. Comparing a long-stay pressure ulcer rate to a short-stay rehospitalization rate, or averaging across both populations, produces misleading quality assessments.
- Antipsychotic exclusion diagnoses. The long-stay antipsychotic medication use measure excludes residents diagnosed with schizophrenia, Huntington's disease, or Tourette syndrome from the denominator. The reported rate therefore reflects use among residents without these diagnoses, not the full resident population. Facilities with a higher proportion of excluded residents have a smaller denominator, which makes the rate more volatile.
- Small-facility suppression. Facilities with fewer than 20 eligible residents for a given measure are suppressed; their values appear as blank/null rather than zero. Small rural nursing homes frequently fall below this threshold across multiple measures simultaneously, producing systematically incomplete quality profiles. Users cannot distinguish suppression due to small sample size from true missing data without consulting the footnote codes.
- Measure specification changes over time. The MDS assessment instrument underwent a major revision from MDS 2.0 to MDS 3.0 in October 2010, and individual measure specifications have been updated periodically since then. Year-over-year trend comparisons that span a specification change are not valid without adjusting for the methodology shift. The dataset does not include a version identifier for the measure specification used.
- Reporting lag. Quality measure scores reflect a rolling reporting period (typically 3-4 quarters), not a point-in-time snapshot. CMS updates the dataset quarterly, but the underlying MDS assessments may be several months old by the time scores are published. The reporting period start and end dates vary by measure.
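The effect of a small denominator on rate volatility, mentioned in the antipsychotic and small-facility notes above, can be illustrated with simple arithmetic. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the standard-error formula treats residents as independent binomial trials, which real facility data may not satisfy.

```python
import math


def one_resident_shift(denominator):
    """Percentage-point change in a rate when one more resident triggers the measure."""
    return 100.0 / denominator


def standard_error_pct(rate_pct, denominator):
    """Binomial standard error of a rate, in percentage points (independence assumed)."""
    p = rate_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / denominator)


if __name__ == "__main__":
    # Compare a facility at the suppression threshold with larger facilities.
    for n in (20, 50, 200):
        print(n, one_resident_shift(n), round(standard_error_pct(15.0, n), 2))
```

At a denominator of 20, a single resident moves the reported rate by 5 percentage points; at 200, the same event moves it by 0.5.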
Data Quality Notes
- Measure scores encoded as strings. Several numeric measure score fields in the source CSV contain non-numeric representations including "Not Available", empty strings, and "N/A" for suppressed or incalculable measures. The ETL normalizes all of these to null in the JSON manifest. Legitimate zero scores are preserved as numeric 0 and are distinct from null.
- Footnote codes carry suppression semantics. The footnote fields use numeric codes (e.g., 1 = suppressed due to too few residents) that encode the reason a measure value is missing or qualified. These codes are retained in the JSON manifest as-is. Consumers should consult the CMS footnote code definitions to interpret missing measure values correctly, as blank scores with different footnote codes have different meanings.
- Field names normalized to snake_case. The ETL converts CMS source column names (which use mixed case, spaces, and special characters) to snake_case identifiers. The original field names from the CMS CSV are preserved in the raw data section of each entity page for traceability.
- Date format standardization. Reporting period start and end date fields in the source data use mixed formats (MM/DD/YYYY, YYYY-MM-DD). The ETL standardizes all dates to ISO 8601 (YYYY-MM-DD) in the output manifests.
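The normalization steps in the notes above might look something like the following. This is a hedged sketch rather than the actual ETL code: all function names are hypothetical, the missing-value tokens are only those named above, and footnote code 1 is the only code interpreted because it is the only one defined in this document.

```python
import re
from datetime import datetime

# Tokens the notes above list as non-numeric score representations.
MISSING_TOKENS = {"", "n/a", "not available"}


def normalize_score(raw):
    """Return a float score, or None for suppressed or incalculable values."""
    if raw is None:
        return None
    text = str(raw).strip()
    if text.lower() in MISSING_TOKENS:
        return None
    return float(text)  # a legitimate "0" stays numeric 0.0, distinct from None


def to_snake_case(name):
    """Collapse spaces and special characters into underscores, then lowercase."""
    collapsed = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return collapsed.strip("_").lower()


def to_iso_date(raw):
    """Accept MM/DD/YYYY or YYYY-MM-DD and return ISO 8601 (YYYY-MM-DD)."""
    text = raw.strip()
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def missing_reason(score, footnote):
    """Classify a score using its footnote code (only code 1 is interpreted)."""
    if score is not None:
        return "reported"
    if footnote is not None and str(footnote).strip() == "1":
        return "suppressed_small_sample"
    return "missing_other"  # consult the CMS footnote code definitions
```

Unexpected score tokens raise `ValueError` from `float` instead of being silently mapped to null, so new source encodings would surface during the build. Note that `to_snake_case` does not split camelCase names; it only handles spaces and special characters.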