Provenance

Dataset ID: hosp-hcahps
Entity Type: hospital
Role: enrichment
Source: CMS
Vintage: FY2026
Entity Count: 5,399
Last ETL Run: 2026-04-13

Overview

The Patient Survey (HCAHPS) dataset is published by the Centers for Medicare & Medicaid Services (CMS) as part of the Hospital Compare program, which is now integrated into the Care Compare initiative. HCAHPS (Hospital Consumer Assessment of Healthcare Providers and Systems) is a standardized survey instrument developed by CMS and the Agency for Healthcare Research and Quality (AHRQ) to measure patients' perspectives on their hospital experience. The survey is administered to a random sample of adult inpatients between 48 hours and 6 weeks after discharge from Medicare-certified acute care hospitals across the United States and its territories. The current file covers FY2026. Survey collection periods typically span 12 months, and results are published with a reporting lag of approximately 9–12 months.

The dataset contains approximately 60 fields spanning multiple survey dimensions: communication with nurses, communication with doctors, responsiveness of hospital staff, communication about medicines, cleanliness and quietness of the hospital environment, discharge information, care transition, and overall hospital rating. Each dimension reports top-box, middle-box, and bottom-box percentage scores, along with HCAHPS star ratings and the number of completed surveys. The top-box methodology counts only the most favorable responses: "Always" on 4-point frequency scales and "9" or "10" on the 0–10 overall rating scale. Each row represents one survey measure for one hospital. This dataset answers questions about how patients perceive their hospital stay, in which dimensions of experience a hospital excels or lags, and how a hospital's patient experience compares to the national distribution. It measures patient perception, not clinical quality, and the two can diverge significantly.
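The box scores described above are simple shares of responses, and a minimal sketch of that arithmetic follows. The function name box_percentages is hypothetical and not part of CareGraph or CMS tooling. The top-box definitions come from the text above. The middle and bottom groupings ("Usually" versus "Sometimes" or "Never", and 7–8 versus 0–6) are an assumption based on the usual HCAHPS reporting convention. Published CMS percentages are also adjusted for patient mix and survey mode, which this raw calculation does not attempt.

```python
from collections import Counter


def box_percentages(responses, top, bottom):
    """Return (top, middle, bottom) box percentages for one survey item.

    Hypothetical helper: `top` and `bottom` are sets of response values;
    everything else is counted as the middle box. Returns None when there
    are no responses, since a percentage is undefined in that case.
    """
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        return None
    top_n = sum(counts[r] for r in top)
    bottom_n = sum(counts[r] for r in bottom)
    middle_n = total - top_n - bottom_n
    return tuple(round(100 * n / total, 1) for n in (top_n, middle_n, bottom_n))


# 4-point frequency item: only "Always" counts as top-box.
frequency = ["Always"] * 7 + ["Usually"] * 2 + ["Never"]
print(box_percentages(frequency, top={"Always"}, bottom={"Sometimes", "Never"}))
# (70.0, 20.0, 10.0)

# 0-10 overall rating: only 9 and 10 count as top-box.
ratings = [10, 9, 8, 7, 5]
print(box_percentages(ratings, top={9, 10}, bottom=set(range(0, 7))))
# (40.0, 40.0, 20.0)
```

Because only the most favorable answers count toward the top box, a hospital where most patients answer "Usually" can still post a low top-box score.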

Join Strategy

CareGraph joins this dataset to hospital entity pages using the Facility ID field, which corresponds to the CMS Certification Number (CCN). The CCN is a 6-character zero-padded string (e.g., 010001). During ETL, the _find_column() function matches the CCN column against a candidate list (Facility ID, Hospital CCN, Provider Number, Facility Id, Provider ID, CCN) to handle header variation across CMS file releases. The normalize_ccn() function strips whitespace and zero-pads values shorter than 6 characters.

Because the dataset contains multiple rows per hospital (one per survey measure), the join produces a one-to-many relationship between the hospital entity and its measure-level records. Matched rows are grouped by CCN via _load_measures_by_ccn() and written to the hospital's JSON manifest under data.hcahps. Non-numeric sentinel values such as "Not Available" and "Not Applicable" are discarded during loading; numeric fields are parsed with _try_float(), which converts non-numeric values to null. A provenance record with dataset ID hosp-hcahps is appended to the manifest. Hospitals without matching HCAHPS rows display missing data indicators rather than being excluded from CareGraph.

Known Limitations

Data Quality Notes
