HCUP National Estimates Tutorial - Accessible Version
Welcome to the HCUP National Estimates Tutorial
Thank you for joining us for this Healthcare Cost and Utilization Project, or HCUP, online tutorial on calculating national estimates using the HCUP nationwide databases.
HCUP's nationwide databases provide estimates for hospital stays, emergency department visits, or major ambulatory surgery encounters across the United States. They are built from the HCUP State databases. The databases contain information on all discharges or encounters, regardless of expected payer. They can be used to create national estimates of healthcare utilization, access, charges, quality, and outcomes. The nationwide databases are available for purchase through the HCUP Central Distributor. Statistics from select databases are available on HCUPnet.
This tutorial is organized into five modules, one for each nationwide database:
- National Inpatient Sample, or NIS
- Kids' Inpatient Database, or KID
- Nationwide Ambulatory Surgery Sample, or NASS
- Nationwide Emergency Department Sample, or NEDS
- Nationwide Readmissions Database, or NRD
Each module is divided into four sections that provide background information on the nationwide database and demonstrate how it can be used to produce national and, in some cases, regional estimates for healthcare-related analyses. Also provided is sample SAS® code demonstrating how to use the nationwide database to produce such estimates.
This tutorial is self-paced. Therefore, the time to complete this tutorial will vary based on the individual user's experience. This tutorial includes narration.
Overview of the HCUP National Estimates Tutorial Structure
This tutorial contains five modules, with each corresponding to an HCUP nationwide database. Each module is divided into four sections that cover specific topic areas for the respective HCUP nationwide database. These topic areas are the same across all modules.
Modules
Module 1: National Inpatient Sample (NIS), which includes the following sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Module 2: Kids' Inpatient Database (KID), which includes the following sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Module 3: Nationwide Ambulatory Surgery Sample (NASS), which includes the following sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Module 4: Nationwide Emergency Department Sample (NEDS), which includes the following sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Module 5: Nationwide Readmissions Database (NRD), which includes the following sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Module 1: National Inpatient Sample (NIS)
The National Inpatient Sample, or NIS, is the largest publicly available all-payer inpatient care database in the United States, containing data on more than 7 million hospital stays.
Information on the NIS is organized into the following four sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Additional information about the NIS is available on the NIS Database Documentation page on the HCUP User Support, or HCUP-US, website.
Module 1: National Inpatient Sample (NIS), Overview of the NIS
The NIS is the largest publicly available all-payer inpatient healthcare database in the United States. It is designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from more than 7 million hospital stays each year. Weighted, it estimates more than 35 million hospitalizations nationally.
The NIS is sampled from the HCUP State Inpatient Databases (SID), which include all inpatient data from the HCUP Partners that currently contribute to HCUP. The NIS is available beginning with data year 1988, and its sampling frame has grown from 8 HCUP Partners to 49 HCUP Partners, covering about 97 percent of discharges from U.S. community hospitals.
Additional information on the sample design of the NIS is available in the NIS Introduction or the HCUP Sample Design tutorial.
Module 1: National Inpatient Sample (NIS), Weighting the NIS
NIS Data Element Discharge Weight
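To produce nationally or regionally representative estimates, the NIS data must be weighted. This can be done using the data element discharge weight, or DISCWT, which is assigned to each record in the NIS. In recent data years, the NIS is an approximately 20 percent sample of discharges, so the value of DISCWT is close to 5 on each record. It is not exactly 5 on every record, however, and should always be taken from the data rather than assumed.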
When the discharge weights are applied to the NIS data, the result is an estimate of the number of discharges for the target universe, which includes all inpatient discharges from community hospitals in the United States, excluding rehabilitation hospitals beginning in 1998 and long-term acute care hospitals beginning in 2012.
Per the American Hospital Association, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term acute care, rehabilitation, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.
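As a minimal sketch of what applying the discharge weight means, the step below sums DISCWT over a handful of made-up records. The data set name TOY and the weight values are illustrative assumptions, not NIS data. With real NIS records, the sum of DISCWT over the records of interest is the weighted estimate of discharges.

```sas
/* Toy records: hypothetical values, not NIS data */
data toy;
  input KEY_NIS DISCWT;
  datalines;
1 5.0
2 5.0
3 4.9
4 5.1
;
run;

/* N is the unweighted record count; the SUM of DISCWT is the weighted estimate */
proc means data=toy n sum;
  var DISCWT;
run;
```

This sketch yields a point estimate only. Standard errors require a survey procedure that also uses the cluster and stratum identifiers, as in the SAS code examples later in this module.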
NIS Discharge Weights Over Time
NIS data are available annually beginning with data year 1988. Users should be mindful of changes to the discharge weight variable over time. These changes are listed in the table below.
Years | Variable Name | Use
2001+ | DISCWT | All national estimates
2000 | DISCWT | National estimates except those including total charge (data element TOTCHG)
2000 | DISCWTcharge | National estimates of total charge (data element TOTCHG)
1998-1999 | DISCWT | All national estimates
1988-1997 | DISCWT_U | All national estimates
Accounting for the NIS Redesign in Data Year 2012 for Multi-Year Analysis
In 2012, the NIS was redesigned to improve national estimates:
- Beginning with data year 2012, the NIS is a sample of discharges from all hospitals in HCUP.
- Through data year 2011, the NIS is composed of all discharges from a sample of hospitals in HCUP.
If conducting a trend analysis that uses NIS data before and after data year 2012, it is recommended that users account for the redesign when creating national estimates. The NIS Trend Weights (data element TRENDWT) for data years 1993-2011 have been developed to assist with such an analysis.
For example, a trend analysis of NIS data years 2010-2013 should use TRENDWT for data years 2010-2011 and DISCWT for data years 2012-2013 to generate comparable national estimates.
Additional information on the NIS Trend Weights is available on the HCUP-US website. For information on how to conduct a trend analysis using NIS data before and after data year 2012, refer to the HCUP Multi-Year Analysis tutorial. The SAS code examples included in the next section do not provide an example of a trend analysis.
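The idea of combining the two weights can be sketched as follows, assuming a stacked multi-year file in which TRENDWT has already been attached to the pre-2012 records. The data set name NIS_STACKED, the data element name ANALYSIS_WT, and all values are hypothetical. The HCUP Multi-Year Analysis tutorial remains the authoritative guide for a real trend analysis.

```sas
/* Toy stacked file: hypothetical values, not NIS data.
   TRENDWT is left missing for 2012 and later, where DISCWT is used. */
data nis_stacked;
  input YEAR DISCWT TRENDWT;
  /* ANALYSIS_WT (hypothetical name) holds the weight to use in every year */
  if YEAR <= 2011 then ANALYSIS_WT = TRENDWT;
  else ANALYSIS_WT = DISCWT;
  datalines;
2010 4.8 5.1
2011 4.7 5.0
2012 5.0 .
2013 5.0 .
;
run;

proc print data=nis_stacked noobs;
run;
```

A single weight variable of this kind could then be named in the WEIGHT statement of a survey procedure run across all years.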
NIS Hospital Weight
As described above, the NIS was redesigned in data year 2012. For data years 1988-2011, the NIS is composed of all discharges from a sample of hospitals. To project NIS hospitals to the number of hospitals in the target universe, users must apply a hospital weight (data element HOSPWT) to the NIS data for these years. For data years 1988-1997, this data element was named HOSPWT_U.
Beginning in 2012, HOSPWT is no longer applicable because the NIS is a sample of discharges from all hospitals from participating HCUP Partners.
The SAS code included in the next section does not provide an example of using HOSPWT. However, such an example is available in Module 4, which covers the Nationwide Emergency Department Sample (NEDS), where the use of HOSPWT is still applicable.
Module 1: National Inpatient Sample (NIS), SAS Code Examples
SAS Code for Producing National Estimates by Expected Payer
This example SAS code produces national estimates of discharges by the primary expected payer or data element PAY1 in the 2019 NIS.
Title "Produce National Estimate of Discharges By Primary Expected Payer from 2019 NIS File (Weighted)";
Libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
/* Labels for the expected payer values */
proc format;
  Value FPAY /* PAY1 and PAY2 */
    1 = " 1: Medicare"
    2 = " 2: Medicaid"
    3 = " 3: Private insurance"
    4 = " 4: Self-pay"
    5 = " 5: No charge"
    6 = " 6: Other"
    . = " .: Missing"
    .A = ".A: Invalid"
  ;
run;
/* Weighted discharge counts by PAY1 domain, using the NIS design variables */
proc surveymeans data=nis2019.nis_2019_core missing sumwgt ;
  cluster HOSP_NIS ;
  strata NIS_STRATUM ;
  domain PAY1 ;
  format PAY1 fpay. ;
  weight DISCWT ;
  var KEY_NIS ;
run;
The first section of this example SAS code includes a PROC FORMAT, which is a procedure that assigns data labels to the data values in the output. For this example, we are focused on data element PAY1, which has the following mappings:
- Numeric value 1 for Medicare
- Numeric value 2 for Medicaid
- Numeric value 3 for Private insurance
- Numeric value 4 for Self-pay
- Numeric value 5 for No charge
- Numeric value 6 for Other
- A period (.) means the value is missing
- A period followed by the uppercase letter A (.A) means the value is invalid
This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for patient race or data element RACE, the proc format would include the mapping for that data element.
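For instance, a PROC FORMAT for patient race might look like the sketch below. The format name FRACE is our own choice, and the value labels reflect our understanding of the HCUP coding of RACE. Confirm them against the description of data elements for your data year before use.

```sas
/* Sketch of a format for data element RACE (format name FRACE is an assumption) */
proc format;
  value FRACE
    1  = " 1: White"
    2  = " 2: Black"
    3  = " 3: Hispanic"
    4  = " 4: Asian or Pacific Islander"
    5  = " 5: Native American"
    6  = " 6: Other"
    .  = " .: Missing"
    .A = ".A: Invalid"
  ;
run;
```

In the SURVEYMEANS procedure, the DOMAIN statement would then name RACE, and the FORMAT statement would become format RACE frace. ;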
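The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. The PROC statement specifies the MISSING option, so that missing and invalid values of PAY1 are reported as their own domains, and the SUMWGT keyword, which requests the sum of weights. This procedure includes the following statements: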
- The CLUSTER statement, which includes the NIS hospital identifier or data element HOSP_NIS.
- The STRATA statement, which includes the NIS stratum identifier or data element NIS_STRATUM.
- The DOMAIN and FORMAT statements, which are specific to this analysis, and produce national estimates by data element PAY1.
- The WEIGHT statement, which includes the NIS discharge weight or data element DISCWT.
- The VAR statement, which includes the NIS record identifier or data element KEY_NIS.
Note that for analyses including data years 2011 and earlier: Replace HOSP_NIS with HOSPID in the CLUSTER statement and use the NIS Trend Weight (TRENDWT) in place of the original discharge weight (DISCWT) in the WEIGHT statement. See the Multi-Year Analysis Tutorial for more information.
Produce National Estimate of Discharges By Primary Expected Payer from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Statistics for PAY1 Domains
Sum of
PAY1 Variable Label Weights
--------------------------------------------------------------------------------
.: Missing KEY_NIS NIS record number 42179
.A: Invalid KEY_NIS NIS record number 1945.003493
1: Medicare KEY_NIS NIS record number 14529614
2: Medicaid KEY_NIS NIS record number 7927119
3: Private insurance KEY_NIS NIS record number 10249101
4: Self-pay KEY_NIS NIS record number 1525965
5: No charge KEY_NIS NIS record number 109380
6: Other KEY_NIS NIS record number 1033720
--------------------------------------------------------------------------------
The output for this example SAS code provides the following weighted record counts for PAY1 in the 2019 NIS:
- Missing: 42,179
- Invalid: 1,945
- Medicare: 14,529,614
- Medicaid: 7,927,119
- Private insurance: 10,249,101
- Self-pay: 1,525,965
- No charge: 109,380
- Other: 1,033,720
Example SAS Code for Producing National Estimates for Asthma
This example SAS code identifies the number of weighted records in the 2019 NIS with a principal diagnosis of asthma, which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, RSP009 for Asthma.
Title "Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)";
Libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
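/* Link the Core File and the Diagnosis and Procedure Groups File by record */
data asthma;
  merge nis2019.nis_2019_core (keep=HOSP_NIS KEY_NIS DISCWT NIS_STRATUM)
        nis2019.nis_2019_dx_pr_grps (keep=HOSP_NIS KEY_NIS DXCCSR_Default_DX1)
  ;
  by HOSP_NIS KEY_NIS;
  Attrib Asthma length=3 label='Asthma default DXCCSR=RSP009';
  /* 1 if the default CCSR for the principal diagnosis is RSP009, otherwise 0 */
  Asthma =(DXCCSR_Default_DX1='RSP009');
run;
/* Weighted sum and mean of the Asthma indicator */
proc surveymeans data=asthma sum std mean stderr;
  cluster HOSP_NIS ;
  strata NIS_STRATUM;
  var Asthma;
  weight DISCWT;
run;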
The first section of this example SAS code includes the DATA step, which is looking for records with a default CCSR category of RSP009, Asthma, for the principal diagnosis. This step includes the following statements:
- The MERGE statement, which combines the NIS Core File with the NIS Diagnosis and Procedure Groups File by HOSP_NIS and KEY_NIS. This process results in the acquisition of the default CCSR category for the principal diagnosis or data element DXCCSR_Default_DX1.
- The KEEP= data set options, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and DXCCSR_Default_DX1.
- The ATTRIB statement, which assigns a length and a label to a new data element (ASTHMA) specific to our example analysis. The next statement, Asthma =, sets this new data element to 1 when the default CCSR category for the principal diagnosis is RSP009 (NIS data element DXCCSR_Default_DX1=RSP009) and to 0 otherwise. As a result, the weighted sum of ASTHMA is the estimated number of asthma discharges, and the weighted mean is the estimated proportion of all discharges.
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_NIS.
- The STRATA statement, which includes NIS_STRATUM.
- The WEIGHT statement, which includes the data element DISCWT.
- The VAR statement, which includes the data element ASTHMA that we defined in the DATA step above.
Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 201
Number of Clusters 4568
Number of Observations 7083805
Sum of Weights 35419023
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------
Asthma Asthma default 0.004781 0.000109 169330 4029.071202
DXCCSR=RSP009
The output for this example SAS code provides the total number of weighted records in the 2019 NIS with a default CCSR category for the principal diagnosis of RSP009, Asthma, which is 169,330.
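The standard error reported with the sum can be turned into an approximate confidence interval. The sketch below uses a normal approximation with a multiplier of 1.96, which is our assumption. PROC SURVEYMEANS can also report confidence limits directly (for example, with the CLSUM keyword), and those limits are based on a t distribution, so they may differ slightly.

```sas
/* Approximate 95 percent confidence interval for the weighted sum */
data _null_;
  est = 169330;        /* weighted sum from the output above */
  se  = 4029.071202;   /* standard error of the sum from the output above */
  z   = 1.96;          /* normal approximation (assumption) */
  lower = est - z*se;
  upper = est + z*se;
  put 'Approximate 95 percent CI: ' lower comma12.0 ' to ' upper comma12.0;
run;
```

With these inputs, the interval runs from about 161,433 to 177,227 discharges.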
Example SAS Code for Producing Regional Estimates for Asthma
This example SAS code is focused on producing regional estimates for asthma in the 2019 NIS, which we defined based on the default CCSR category of RSP009 for the principal diagnosis.
Title "Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)";
Libname NIS2019 "V:\NIS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
/* Labels for the hospital census region values */
proc format;
  Value St_Regn
    1 = "1: Northeast"
    2 = "2: Midwest"
    3 = "3: South"
    4 = "4: West"
  ;
run;
data asthma;
  merge nis2019.nis_2019_core (keep=HOSP_NIS KEY_NIS DISCWT NIS_STRATUM)
        nis2019.nis_2019_dx_pr_grps (keep=HOSP_NIS KEY_NIS DXCCSR_Default_DX1)
  ;
  by HOSP_NIS KEY_NIS;
  /* Look up region */
  if _n_=1 then do;
    /* Never executes; defines HOSP_REGION in the DATA step at compile time */
    if 0 then set nis2019.nis_2019_hospital (keep=HOSP_REGION);
    /* Load the Hospital File into a hash object keyed by HOSP_NIS */
    declare hash h (dataset: "nis2019.nis_2019_hospital");
    h.defineKey('HOSP_NIS');
    h.defineData('HOSP_REGION');
    h.defineDone();
  end;
  if h.find() ne 0 then abort; /* all discharges should have a matching hospital record */
  Attrib Asthma length=3 label='Asthma default DXCCSR=RSP009';
  /* 1 if the default CCSR for the principal diagnosis is RSP009, otherwise 0 */
  Asthma = (DXCCSR_Default_DX1='RSP009');
run;
/* Weighted sum and mean of the Asthma indicator by hospital region */
proc surveymeans data=asthma missing sum mean ;
  cluster HOSP_NIS ;
  strata NIS_STRATUM ;
  var Asthma;
  weight DISCWT ;
  domain HOSP_REGION ;
  format HOSP_REGION St_Regn. ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:
- Numeric value 1 for Northeast
- Numeric value 2 for Midwest
- Numeric value 3 for South
- Numeric value 4 for West
The second section includes the DATA step, which includes the following statements:
- The MERGE statement, which links the NIS Core File with the NIS Diagnosis and Procedure Groups File keeping essential data elements from each file.
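- For this specific example, there is an additional step that uses a hash object to acquire the data element, HOSP_REGION, from the NIS Hospital File. On the first iteration of the DATA step, the Hospital File is loaded into the hash object with the NIS hospital identification number, HOSP_NIS, as the key. For each discharge, the FIND method then looks up HOSP_REGION for that record's HOSP_NIS, and the step aborts if no matching hospital record is found.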
- The ATTRIB statement, which assigns a length and a label to a new data element (ASTHMA) specific to our example analysis. The next statement, Asthma =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of RSP009 for the principal diagnosis (NIS data element DXCCSR_Default_DX1=RSP009).
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NIS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_NIS.
- The STRATA statement, which includes NIS_STRATUM.
- The WEIGHT statement, which includes the data element DISCWT.
- The VAR statement, which includes the data element ASTHMA that we defined in the DATA step above.
- The DOMAIN and FORMAT statements, which are specific to HOSP_REGION as we are interested in regional estimates.
Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
--------------------------------------------------------------------------------------------------------
1: Northeast Asthma Asthma default 0.006485 0.000358 41550 2419.380819
DXCCSR=RSP009
2: Midwest Asthma Asthma default 0.004217 0.000221 33065 1775.993606
DXCCSR=RSP009
3: South Asthma Asthma default 0.004415 0.000135 62150 2058.065599
DXCCSR=RSP009
4: West Asthma Asthma default 0.004590 0.000240 32565 1729.226390
DXCCSR=RSP009
--------------------------------------------------------------------------------------------------------
The output for this example SAS code provides the total number of weighted records in the 2019 NIS with a default CCSR category of RSP009, Asthma, by hospital region:
- Northeast: 41,550
- Midwest: 33,065
- South: 62,150
- West: 32,565
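As a quick consistency check, the four regional estimates should add up to the national estimate of 169,330 obtained in the previous example. The sketch below repeats that arithmetic with the values taken from the two outputs.

```sas
/* Check that the regional estimates sum to the national estimate */
data _null_;
  national = 169330;
  regional = sum(41550, 33065, 62150, 32565);
  if regional = national then
    put 'Regional estimates sum to the national estimate: ' regional comma12.0;
  else
    put 'Mismatch: ' regional= national=;
run;
```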
Module 1: National Inpatient Sample (NIS), Validating National and Regional Estimates
There are three resources that can be used to validate national and regional estimates for the NIS.
- The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
- The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
- HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.
HCUP Summary Statistics
Produce National Estimate of Discharges By Primary Expected Payer from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Statistics for PAY1 Domains
Sum of
PAY1 Variable Label Weights
--------------------------------------------------------------------------------
.: Missing KEY_NIS NIS record number 42179
.A: Invalid KEY_NIS NIS record number 1945.003493
1: Medicare KEY_NIS NIS record number 14529614
2: Medicaid KEY_NIS NIS record number 7927119
3: Private insurance KEY_NIS NIS record number 10249101
4: Self-pay KEY_NIS NIS record number 1525965
5: No charge KEY_NIS NIS record number 109380
6: Other KEY_NIS NIS record number 1033720
--------------------------------------------------------------------------------
The output for this example SAS code provides the following weighted record counts for PAY1 in the 2019 NIS:
- Missing: 42,179
- Invalid: 1,945
- Medicare: 14,529,614
- Medicaid: 7,927,119
- Private insurance: 10,249,101
- Self-pay: 1,525,965
- No charge: 109,380
- Other: 1,033,720
For validation, we are going to compare the output with the 2019 NIS Summary Statistics.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NIS Database Documentation.
The NIS Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.
The NIS Summary Statistics page includes all years of the NIS. We will scroll down to the section specific to data year 2019. Our data element of interest, PAY1, is in the NIS Core File, which means we will want to select the Summary Statistics for the NIS Core File, and, specifically, the file that provides weighted estimates (i.e., NIS 2019 Core Weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element PAY1. We can do this easily by searching for this data element name within the downloaded PDF. We are now ready to compare the Summary Statistics with our output from SAS.
HCUP Weighted Summary Statistics Report: NIS 2019 Core File Weighted Frequency Distribution for PAY1
PAY1 | Weighted Frequency | Percent
.: Missing | 42,179 | 0.12%
.A: Invalid | 1,945 | 0.01%
1: Medicare | 14,529,614 | 41.02%
2: Medicaid | 7,927,119 | 22.38%
3: Private insurance | 10,249,101 | 28.94%
4: Self-pay | 1,525,965 | 4.31%
5: No charge | 109,380 | 0.31%
6: Other | 1,033,720 | 2.92%
A comparison of the PAY1 frequency from the 2019 NIS Weighted Core Summary Statistics and the output from the example SAS code in this tutorial demonstrates that our results match.
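The percentages in the Summary Statistics can also be reproduced from the weighted counts. The sketch below enters the counts from the SAS output by hand (rounded to whole numbers) into a hypothetical data set named PAY1_COUNTS and lets PROC FREQ compute each payer's share of the total.

```sas
/* Weighted counts from the SAS output, keyed in by hand (data set name is an assumption) */
data pay1_counts;
  length PAYER $20;
  infile datalines dsd;
  input PAYER $ COUNT;
  datalines;
Missing,42179
Invalid,1945
Medicare,14529614
Medicaid,7927119
Private insurance,10249101
Self-pay,1525965
No charge,109380
Other,1033720
;
run;

/* The WEIGHT statement treats COUNT as the frequency of each payer category */
proc freq data=pay1_counts order=data;
  tables PAYER / nocum;
  weight COUNT;
run;
```

The Percent column of this output can then be compared with the percentages in the Summary Statistics report.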
HCUP Diagnosis and Procedure Frequency Tables
In our second example analysis, which produced national estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis, we obtained a count of 169,330.
For validation, we are going to compare the output with the NIS Diagnosis and Procedure Frequency Tables.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NIS Database Documentation.
The NIS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.
Once the file has been downloaded, we will navigate to the tab T.1_By_DXCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-CM diagnosis category. We will then navigate to the row for CCSR category RSP009, Asthma, and scroll over to the columns that are specific to the 2019 NIS. Note that you can filter to RSP009 by using either Column A or Column B.
Table 1. Weighted and Unweighted Number of Records by Clinical Classifications Software Refined (CCSR) for ICD-10-CM Diagnoses, v2021.2
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), National Inpatient Sample (NIS), 2016-2019
Note: Counts for all-listed diagnoses include all possible CCSR category assignments. Unduplicated means that if two or more diagnosis codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.
CCSR for ICD-10-CM Category, v2021.2 | CCSR Description, v2021.2 | 2019 NIS: Weighted N for DX1 CCSR Default | 2019 NIS: Weighted N for All-Listed CCSR (Unduplicated) | 2019 NIS: Unweighted N for DX1 CCSR Default | 2019 NIS: Unweighted N for All-Listed CCSR (Unduplicated)
RSP009 | RSP009 Asthma | **169,330 | 2,274,896 | 33,866 | 454,979
Produce National Estimate of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 201
Number of Clusters 4568
Number of Observations 7083805
Sum of Weights 35419023
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------
Asthma Asthma default 0.004781 0.000109 **169330 4029.071202
DXCCSR=RSP009
A comparison of the count obtained from the NIS Diagnosis and Procedure Frequency Tables and the output from the example SAS code in this tutorial (denoted by **) demonstrates that our results match.
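The frequency table also reports unweighted counts, which allows one more sanity check: dividing a weighted count by its unweighted count gives the average discharge weight among those records. The sketch below computes that ratio for the two pairs of RSP009 counts in the table. The variable names are our own.

```sas
/* Average discharge weight implied by the weighted and unweighted counts */
data _null_;
  ratio_dx1 = 169330 / 33866;     /* principal diagnosis (DX1) default CCSR */
  ratio_all = 2274896 / 454979;   /* all-listed CCSR, unduplicated */
  put ratio_dx1= 8.5 ratio_all= 8.5;
run;
```

Both ratios should come out very close to 5, in line with the roughly one-in-five sampling of discharges in the 2019 NIS (7,083,805 records and a sum of weights of 35,419,023).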
HCUPnet
Produce Regional Estimates of Discharges with Default DXCCSR=RSP009 (Asthma) from 2019 NIS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
--------------------------------------------------------------------------------------------------------
1: Northeast Asthma Asthma default 0.006485 0.000358 41550 2419.380819
DXCCSR=RSP009
2: Midwest Asthma Asthma default 0.004217 0.000221 33065 1775.993606
DXCCSR=RSP009
3: South Asthma Asthma default 0.004415 0.000135 62150 2058.065599
DXCCSR=RSP009
4: West Asthma Asthma default 0.004590 0.000240 32565 1729.226390
DXCCSR=RSP009
--------------------------------------------------------------------------------------------------------
Here is output from our final example analysis, which produced regional estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis:
- Northeast: 41,550
- Midwest: 33,065
- South: 62,150
- West: 32,565
For validation, we are going to compare the output with HCUPnet.
As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select the "Inpatient Setting" dashboard. Once selected, we will expand the option for "National Inpatient" and select "Diagnoses and Procedures."
The HCUPnet results will default to displaying trends in the total number of discharges with a default CCSR category of BLD001, Nutritional anemia, for the principal diagnosis. We need to modify the selections on the left-hand side of the screen to align with our analysis as follows:
- First, select the option for "Cross-Sectional" analysis.
- Next, retain the default data year of "2019" in the "Years" drop-down, the "Diagnoses—Clinical Classifications Software Refined or CCSR" in the "Classification Types" drop-down, and the "Principal" option in the "Principal or All-Listed" drop-down.
- Next, under the "Diagnoses/Procedures" drop-down, clear the (All) selection so that the query does not run on all CCSR categories. Then scroll through the list to CCSR category RSP009, Asthma, or use the search bar, and ensure its box is checked.
- Next, ensure only "Number of discharges" is selected in the "Outcome" drop-down.
- Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
- Next, retain the default option of "All" for the "Characteristic Levels" drop-down so that all four U.S. census regions are included in the results.
- Last, select the box for "Show 95% CI" to display the standard error of the estimates if you wish to view this information.
A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for records in the 2019 NIS with a default CCSR category of RSP009, Asthma, for the principal diagnosis. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-down menus.
Diagnoses/Procedures | Characteristic Levels | Total number of discharges: Estimate | Total number of discharges: Std. Error
RSP009: Asthma | Northeast | 41,550 | 2,419
RSP009: Asthma | Midwest | 33,065 | 1,776
RSP009: Asthma | South | 62,150 | 2,058
RSP009: Asthma | West | 32,565 | 1,729
A comparison of our output from HCUPnet with the output from the example SAS code in this tutorial demonstrates that our results match.
You have completed Module 1, National Inpatient Sample (NIS).
For any questions about the NIS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:
- Email: hcup@ahrq.gov
- Phone: 866-290-HCUP (4287) (toll free)
- International users, please contact HCUP User Support by email.
The staff reviews messages daily and usually responds to inquiries within 3 business days.
Module 2: Kids' Inpatient Database (KID)
The Kids' Inpatient Database, or KID, is the largest publicly available all-payer pediatric inpatient care database in the United States, containing data from 2 to 3 million hospital stays each year.
Information on the KID is organized into the following four sections:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Additional information about the KID is available on the KID Database Documentation page on the HCUP User Support (HCUP-US) website.
Module 2: Kids' Inpatient Database (KID), Overview of the KID
The KID is the largest publicly available all-payer pediatric inpatient care database in the United States, yielding national estimates of hospital inpatient stays for patients younger than 21 years. The KID can be used to identify, track, and analyze national trends in healthcare utilization, cost, quality, and outcomes for the pediatric population. The unique design of the KID enables national and regional studies of rare conditions (e.g., congenital anomalies) as well as uncommon treatments (e.g., cardiac surgery).
The KID includes a sample of pediatric discharges from the HCUP State Inpatient Databases (SID), which include all inpatient data from the HCUP Partners that currently contribute to HCUP. The KID is generally available every 3 years, and its sampling frame has grown from 22 HCUP Partners to 49 HCUP Partners.
Additional information on the sample design of the KID is available in the KID Introduction and the HCUP Sample Design tutorial.
Return to Contents
Module 2: Kids' Inpatient Database (KID), Weighting the KID
KID Data Element Discharge Weight
To produce nationally or regionally representative estimates, the KID data must be weighted. This can be done using the data element discharge weight, or DISCWT, which is assigned to each record in the KID with the value varying across records.
When the discharge weights are applied to the KID data, the result is an estimate of the number of discharges for the target universe, which includes all pediatric inpatient discharges from community hospitals in the United States, excluding rehabilitation hospitals beginning in 2000. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals are non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.
Weights are developed after discharges sampled from the SID are stratified into counts using six hospital characteristics: (1) indicator of freestanding children's hospital, (2) U.S. census region, (3) urban/rural location, (4) teaching status, (5) bed size, and (6) hospital ownership/control. Total discharge counts for the target universe are calculated using AHA Annual Survey estimates for births and a combination of AHA and SID data for other pediatric discharges.
Pediatric discharges included in the KID are a combination of newborn discharges and non-newborn pediatric discharges. Within each stratum, a separate weight is created for each group:
- For newborn discharges, by dividing the number of universe newborns in the stratum by the number of KID newborns in the stratum
- For non-newborn pediatric discharges, by dividing the number of universe non-newborn pediatric discharges in the stratum by the number of KID non-newborn pediatric discharges in the stratum
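The division described above can be sketched in a few lines of SAS. This is only an illustration: the data set names, the variable names (stratum, newborn, universe_n, sample_n, discwt_sketch), and all counts are hypothetical and are not taken from the KID. The actual DISCWT values are supplied on the KID files and do not need to be recomputed.

```sas
/* Hypothetical stratum-level counts; the numbers are illustrative only */
data stratum_counts;
  input stratum newborn universe_n sample_n;
  datalines;
1 1 1000 800
1 0 5000 4000
2 1 600 300
2 0 2400 600
;
run;

data stratum_weights;
  set stratum_counts;
  /* Weight = universe discharges / sampled discharges, */
  /* computed separately for each stratum and group     */
  discwt_sketch = universe_n / sample_n;
run;

proc print data=stratum_weights noobs;
run;
```

With these made-up counts, the weights are 1.25, 1.25, 2, and 4. Multiplying each weight by its sampled count returns the universe count for that stratum and group, which is why summing the weights over records yields an estimate for the target universe.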
KID Discharge Weights Over Time
KID data are generally available every 3 years beginning with data year 1997. Users should be mindful of changes to the discharge weight variable over time. These changes are listed in the table below.
Data Year(s) | Data Element Name | Use
2003+ | DISCWT | All national estimates
2000 | DISCWT | National estimates except those including total charge (data element TOTCHG)
2000 | DISCWTcharge | National estimates of total charge (data element TOTCHG)
1997 | DISCWT_U | All national estimates
Return to Contents
Module 2: Kids' Inpatient Database (KID), SAS Code Examples
Example SAS Code for Producing National Estimates by Patient Location
This example SAS code produces national estimates of discharges by patient location in the 2019 KID, defined by the National Center for Health Statistics (NCHS) urban-rural code (data element PL_NCHS).
Title "Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)";
Libname KID2019 "V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;
proc format;
Value NCHSF
1 = " 1: Large Central Metro"
2 = " 2: Large Fringe Metro"
3 = " 3: Medium Metro"
4 = " 4: Small Metro"
5 = " 5: Micropolitan"
6 = " 6: Noncore"
. = " .: Missing"
;
run;
proc surveymeans data=kid2019.kid_2019_core missing sumwgt ;
cluster HOSP_KID ;
strata KID_STRATUM ;
domain PL_NCHS ;
format PL_NCHS nchsf. ;
weight DISCWT ;
var RECNUM ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on data element PL_NCHS, which has the following mappings:
- Numeric value 1 for Large central metro
- Numeric value 2 for Large fringe metro
- Numeric value 3 for Medium metro
- Numeric value 4 for Small metro
- Numeric value 5 for Micropolitan
- Numeric value 6 for Noncore
- A decimal point means a numeric value is missing
This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for patient age or data element FEMALE, the proc format would include the mapping for that data element.
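As an illustration of that modification, the sketch below swaps PL_NCHS for the data element FEMALE. It assumes the usual HCUP coding of FEMALE (0 for male, 1 for female); the format name FEMALEF is hypothetical, and any special missing values on the file would need their own labels. Confirm the coding against the KID data element documentation before relying on it.

```sas
Libname KID2019 "V:\KID\2019\KID\SASDATA" access=readonly;

proc format;
  /* Assumed coding: 0 = male, 1 = female; FEMALEF is an illustrative name */
  Value FEMALEF
    0 = " 0: Male"
    1 = " 1: Female"
    . = " .: Missing"
  ;
run;

proc surveymeans data=kid2019.kid_2019_core missing sumwgt ;
  cluster HOSP_KID ;
  strata KID_STRATUM ;
  domain FEMALE ;             /* one weighted count per value of FEMALE */
  format FEMALE femalef. ;
  weight DISCWT ;
  var RECNUM ;
run;
```

Only the PROC FORMAT and the DOMAIN and FORMAT statements change; the CLUSTER, STRATA, WEIGHT, and VAR statements stay the same as in the PL_NCHS example.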
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:
- The CLUSTER statement, which includes the KID hospital identifier or data element HOSP_KID.
- The STRATA statement, which includes the KID stratum identifier or data element KID_STRATUM.
- The DOMAIN and FORMAT statements are specific to this analysis, which is interested in national estimates by data element PL_NCHS.
- The WEIGHT statement, which includes the KID discharge weight or data element DISCWT.
- The VAR statement, which includes the KID record identifier or data element RECNUM.
- Note that the KID record identifier data element name does not include KEY, which differs from the other nationwide databases.
Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Statistics for PL_NCHS domains
Sum of
PL_NCHS Variable Label Weights
--------------------------------------------------------------------------------
.: Missing RECNUM KID record number 19590
1: Large Central Metro RECNUM KID record number 1908413
2: Large Fringe Metro RECNUM KID record number 1415496
3: Medium Metro RECNUM KID record number 1229632
4: Small Metro RECNUM KID record number 509407
5: Micropolitan RECNUM KID record number 492487
6: Noncore RECNUM KID record number 327514
--------------------------------------------------------------------------------
The output for this example SAS code provides the following weighted record counts for PL_NCHS in the 2019 KID:
- Missing: 19,590
- Large central metro: 1,908,413
- Large fringe metro: 1,415,496
- Medium metro: 1,229,632
- Small metro: 509,407
- Micropolitan: 492,487
- Noncore: 327,514
Example SAS Code for Producing National Estimates for Appendectomies
This example SAS code identifies the number of weighted records in the 2019 KID with a principal procedure of appendectomy, which is based on the HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-PCS procedure category, GIS008 (Appendectomy).
Title "Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)";
Libname KID2019 "V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;
data Appendectomy;
merge kid2019.kid_2019_core (keep=HOSP_KID RECNUM DISCWT KID_STRATUM)
kid2019.kid_2019_dx_pr_grps (keep=HOSP_KID RECNUM PRCCSR_GIS008)
;
by HOSP_KID RECNUM;
/* 1 is principal only, 2 is both principal and secondary, 3 is secondary only, 0 is none */
Attrib Appendectomy length=3 label='Appendectomy (PRCCSR_GIS008=1 or 2)';
Appendectomy =(PRCCSR_GIS008 in (1:2));
run;
proc surveymeans data=Appendectomy sum std mean stderr;
cluster HOSP_KID ;
strata KID_STRATUM;
var Appendectomy;
weight DISCWT;
run;
The first section of this example SAS code includes the DATA step, which identifies records with a CCSR category for the principal procedure of GIS008, Appendectomy. This step includes the following statements:
- The MERGE statement, which combines the KID Core File with the KID Diagnosis and Procedure Groups File. The KID Diagnosis and Procedure Groups File includes the data element specific to the CCSR category for appendectomy, which is PRCCSR_GIS008.
- The KEEP statements, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and PRCCSR_GIS008.
- The ATTRIB statement, which assigns a length and a label to a new data element (Appendectomy) specific to our example analysis. The next statement, Appendectomy =, assigns a value to this new data element, which in our example, is defined based on the CCSR category of GIS008 for the principal procedure (KID data element PRCCSR_GIS008 where the value is equal to 1 or 2).
- A value of 1 means that the CCSR category was triggered by only the principal procedure on the record, and a value of 2 indicates it was triggered by both the principal and a secondary procedure on the record.
- It is important to note that omitting the value 2 from this statement is a common mistake among users of HCUP data. It is not uncommon for a procedure code (or diagnosis code) to appear more than once on a record. If you limit your analysis to records where the respective CCSR category equals 1, that is, records where it was triggered by the principal procedure (or diagnosis) only, you will undercount and will not obtain accurate results.
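The effect of leaving out the value 2 can be seen on a small toy data set. The sketch below uses five made-up records; the data set name toy_flags and the two indicator names are hypothetical, and only PRCCSR_GIS008 and RECNUM mirror KID data element names.

```sas
/* Five made-up records covering each possible flag value */
data toy_flags;
  input RECNUM PRCCSR_GIS008;
  datalines;
1 1
2 2
3 3
4 0
5 2
;
run;

data toy_flags;
  set toy_flags;
  /* Misses records where the category is on both principal and secondary */
  principal_only_rule = (PRCCSR_GIS008 = 1);
  /* Rule used in the tutorial: principal only (1) or principal and secondary (2) */
  principal_rule = (PRCCSR_GIS008 in (1:2));
run;

proc means data=toy_flags sum maxdec=0;
  var principal_only_rule principal_rule;
run;
```

On these toy records the first rule counts 1 record and the second counts 3, because records 2 and 5 have the category on the principal procedure and on a secondary procedure.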
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_KID.
- The STRATA statement, which includes KID_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
- The VAR statement, which includes the data element Appendectomy that we defined in the DATA step above.
Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 95
Number of Clusters 3998
Number of Observations 3089283
Sum of Weights 5902538.38
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
--------------------------------------------------------------------------------------------
Appendectomy Appendectomy 0.007910 0.000275 46687 42075.679551
(PRCCSR_GIS008=1 or 2)
The output for this example SAS code provides the total number of weighted records in the 2019 KID with a CCSR category of GIS008, Appendectomy, which is 46,687.
Example SAS Code for Producing Regional Estimates for Appendectomies
The example SAS code below produces regional estimates for records with a CCSR category for the principal procedure (PR1) of GIS008, Appendectomy.
Title "Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)";
Libname KID2019 "V:\KID\2019\KID\SASDATA" access=readonly;
Options PS=51 LS=146 ;
proc format;
Value St_Regn
1 = "1: Northeast"
2 = "2: Midwest"
3 = "3: South"
4 = "4: West"
;
run;
data appendectomy;
merge kid2019.kid_2019_core (keep=HOSP_KID RECNUM DISCWT KID_STRATUM)
kid2019.kid_2019_dx_pr_grps (keep=HOSP_KID RECNUM PRCCSR_GIS008)
;
by HOSP_KID RECNUM;
/* Look up region */
if _n_=1 then do;
if 0 then set kid2019.kid_2019_hospital (keep=HOSP_REGION);
declare hash h (dataset: "kid2019.kid_2019_hospital");
h.defineKey('HOSP_KID');
h.defineData('HOSP_REGION');
h.defineDone();
end;
if h.find() ne 0 then abort; /* all discharges should have a matching hospital record */
/* 1 is principal only, 2 is both principal and secondary, 3 is secondary only, 0 is none */
Attrib Appendectomy length=3 label='Appendectomy (PRCCSR_GIS008=1 or 2)';
Appendectomy = (PRCCSR_GIS008 in (1:2));
run;
proc surveymeans data=Appendectomy missing sum mean ;
cluster HOSP_KID ;
strata KID_STRATUM ;
var Appendectomy;
weight DISCWT ;
domain HOSP_REGION ;
format HOSP_REGION St_Regn. ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:
- Numeric value 1 for Northeast
- Numeric value 2 for Midwest
- Numeric value 3 for South
- Numeric value 4 for West
The second section of this example SAS code includes the DATA step. Like the second example above, this step is looking for records with a CCSR category for the principal procedure of GIS008, Appendectomy. This step includes the following statements:
- The MERGE statement, which combines the KID Core File with the KID Diagnosis and Procedure Groups File keeping essential data elements from both files.
- There is an additional step that is looking for the data element HOSP_REGION, which resides in the KID Hospital File and is needed to produce regional estimates.
- The ATTRIB statement, which assigns a length and a label to a new data element (Appendectomy) specific to our example analysis. The next statement, Appendectomy =, assigns a value to this new data element, which in our example, is defined based on the CCSR category of GIS008 for the principal procedure (KID data element PRCCSR_GIS008 where the value is equal to 1 or 2).
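The lookup of HOSP_REGION relies on a SAS hash object. The sketch below isolates that step on toy data so the mechanics are easier to follow. The data sets toy_hospital, toy_discharges, and toy_with_region and all values in them are hypothetical; only the data element names mirror the KID.

```sas
/* Toy hospital-level file: one row per hospital */
data toy_hospital;
  input HOSP_KID HOSP_REGION;
  datalines;
101 1
102 3
;
run;

/* Toy discharge-level file: several rows per hospital */
data toy_discharges;
  input HOSP_KID RECNUM;
  datalines;
101 1
101 2
102 3
;
run;

data toy_with_region;
  set toy_discharges;
  if _n_ = 1 then do;
    /* Never executes; it only defines HOSP_REGION at compile time */
    if 0 then set toy_hospital (keep=HOSP_REGION);
    /* Load the hospital file into memory once, keyed by HOSP_KID */
    declare hash h (dataset: "toy_hospital");
    h.defineKey('HOSP_KID');
    h.defineData('HOSP_REGION');
    h.defineDone();
  end;
  /* Copy HOSP_REGION for the current HOSP_KID; stop if no hospital matches */
  if h.find() ne 0 then abort;
run;

proc print data=toy_with_region noobs;
run;
```

Each discharge row picks up the region of its hospital without sorting or merging the hospital file. A sorted MERGE by HOSP_KID would give the same result; the hash approach simply avoids an extra sort of a large discharge-level file.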
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the KID. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_KID.
- The STRATA statement, which includes KID_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
- The VAR statement, which includes the data element Appendectomy that we defined in the DATA step above.
- The DOMAIN and FORMAT statements, which are specific to HOSP_REGION as we are interested in regional estimates.
Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------
1: Northeast Appendectomy Appendectomy 0.007744 0.000653 7430.721673 846.785971
(PRCCSR_GIS008=1 or 2)
2: Midwest Appendectomy Appendectomy 0.005521 0.000348 7025.478501 662.349843
(PRCCSR_GIS008=1 or 2)
3: South Appendectomy Appendectomy 0.006957 0.000384 16252 1277.861496
(PRCCSR_GIS008=1 or 2)
4: West Appendectomy Appendectomy 0.011976 0.000875 15980 1659.952615
(PRCCSR_GIS008=1 or 2)
-------------------------------------------------------------------------------------------------------------
The output for this example SAS code provides the total number of weighted records in the 2019 KID with a CCSR category for the principal procedure of GIS008, Appendectomy, by hospital region:
- Northeast: 7,431
- Midwest: 7,025
- South: 16,252
- West: 15,980
Return to Contents
Module 2: Kids' Inpatient Database (KID), Validating National and Regional Estimates
There are three resources that can be used to validate national and regional estimates for the KID.
- The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
- The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
- HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.
HCUP Summary Statistics
Produce National Estimate of Discharges By Patient NCHS Location from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Statistics for PL_NCHS domains
Sum of
PL_NCHS Variable Label Weights
--------------------------------------------------------------------------------
.: Missing RECNUM KID record number 19590
1: Large Central Metro RECNUM KID record number 1908413
2: Large Fringe Metro RECNUM KID record number 1415496
3: Medium Metro RECNUM KID record number 1229632
4: Small Metro RECNUM KID record number 509407
5: Micropolitan RECNUM KID record number 492487
6: Noncore RECNUM KID record number 327514
--------------------------------------------------------------------------------
Here is the output from our first example analysis, which produced national estimates for records in the 2019 KID by patient location using the NCHS urban-rural code, or data element PL_NCHS. We have separate weighted counts for each PL_NCHS value as well as for the missing value.
For validation, we are going to compare the output with the 2019 KID Summary Statistics.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the KID Database Documentation.
The KID Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.
The KID Summary Statistics page includes all years of the KID. We will scroll down to the section specific to data year 2019. Our data element of interest, PL_NCHS, is in the KID Core File, which means we will want to select the Summary Statistics for the KID Core File and, specifically, the file that provides weighted estimates (i.e., KID 2019 Core Weighted). Once the file has been downloaded, we will navigate to the frequency table for the data element PL_NCHS. We can do this easily by searching for this data element name within the downloaded PDF.
HCUP Weighted Summary Statistics Report: KID 2019 Core File Weighted Frequency Distribution for PL_NCHS
PL_NCHS | Frequency | Percent of Total
.: Missing | 19,590 | 0.33%
1: Large Central Metro | 1,908,413 | 32.33%
2: Large Fringe Metro | 1,415,496 | 23.98%
3: Medium Metro | 1,229,632 | 20.83%
4: Small Metro | 509,407 | 8.63%
5: Micropolitan | 492,487 | 8.34%
6: Noncore | 327,514 | 5.55%
A comparison of the PL_NCHS frequency from the 2019 KID Weighted Core Summary Statistics and the output from SAS demonstrates that our results match.
KID Diagnosis and Procedure Frequency Tables
In the output from our second example analysis, which produced national estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure, we obtained a count of 46,687.
For validation, we are going to compare the output with the KID Diagnosis and Procedure Frequency Tables.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the KID Database Documentation.
The KID Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.
Once the file has been downloaded, we will navigate to the tab T.3_By_PRCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-PCS procedure categories. We will then navigate to the row for CCSR category GIS008, Appendectomy, and scroll over to the columns that are specific to the 2019 KID. Note that you can filter to GIS008 by using either Column A or Column B. We are now ready to compare the values with our output from SAS.
Table 3. Weighted and Unweighted Number of Records by Clinical Classifications Software Refined (CCSR) for ICD-10-PCS Procedures, v2021.1
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Kids' Inpatient Database (KID), 2016 and 2019
Note: Unduplicated means that if two or more procedure codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.
CCSR for ICD-10-PCS Category, v2021.1 | CCSR Description, v2021.1 | 2019 KID: Weighted N for PR1 CCSR | 2019 KID: Weighted N for All-Listed CCSR (Unduplicated) | 2019 KID: Unweighted N for PR1 CCSR | 2019 KID: Unweighted N for All-Listed CCSR (Unduplicated)
GIS008 | GIS008 Appendectomy | **46,687 | 51,239 | 34,537 | 37,927
Produce National Estimate of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 95
Number of Clusters 3998
Number of Observations 3089283
Sum of Weights 5902538.38
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
--------------------------------------------------------------------------------------------
Appendectomy Appendectomy 0.007910 0.000275 **46687 42075.679551
(PRCCSR_GIS008=1 or 2)
A comparison of the weighted count for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure with the output from SAS (denoted by **) demonstrates that our results match.
HCUPnet
Produce Regional Estimates of Discharges with PR1=GIS008 (Appendectomy) from 2019 KID File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------
1: Northeast Appendectomy Appendectomy 0.007744 0.000653 7430.721673 846.785971
(PRCCSR_GIS008=1 or 2)
2: Midwest Appendectomy Appendectomy 0.005521 0.000348 7025.478501 662.349843
(PRCCSR_GIS008=1 or 2)
3: South Appendectomy Appendectomy 0.006957 0.000384 16252 1277.861496
(PRCCSR_GIS008=1 or 2)
4: West Appendectomy Appendectomy 0.011976 0.000875 15980 1659.952615
(PRCCSR_GIS008=1 or 2)
-------------------------------------------------------------------------------------------------------------
Here is the output from our final example analysis, which produced regional estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure. These weighted counts can be validated using HCUPnet. It is important to note that within HCUPnet, procedure-related statistics are limited to operating room (OR) procedures only. OR procedures are identified using the HCUP Procedure Classes Refined for ICD-10-PCS. For reference, all ICD-10-PCS procedures included in CCSR category GIS008 are classified as OR procedures by the Procedure Classes tool.
As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select the "Inpatient Setting" dashboard. Once selected, we will expand the option for "Children Only" and select "Diagnoses and Procedures."
The output will default to displaying trends in the total number of discharges for all major diagnostic categories, or MDCs. We need to modify the selections on the left-hand side of the screen to align with our analysis.
- First, select the option for "Cross-Sectional" analysis.
- Next, retain the default data year of "2019" in the "Years" drop-down.
- Next, select "Procedures—Clinical Classifications Software Refined or CCSR, Restricted to Operating Room Only" in the "Classification Types" drop-down, and retain the "Principal" option in the "Principal or All-Listed" drop-down.
- Next, under the "Diagnoses/Procedures" drop-down unclick the (All) selection to change the default from running the query on all CCSR categories. Scroll down through the list to CCSR category GIS008, Appendectomy, or use the search bar and ensure the box is checked.
- Next, ensure only "Number of discharges" is selected in the "Outcome" drop-down.
- Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
- Next, retain the default option of "All" for the "Characteristic Levels" drop-down.
- Last, select the box for "Show 95% CI" to display the standard error of the estimates if you wish to view this information.
A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for records in the 2019 KID with a CCSR category of GIS008, Appendectomy, for the principal procedure. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-downs.
Diagnoses/Procedures | Characteristic Levels | Total number of discharges: Estimate | Total number of discharges: Std. Error
GIS008: Appendectomy | Midwest | 7,025 | 662
GIS008: Appendectomy | Northeast | 7,431 | 847
GIS008: Appendectomy | South | 16,252 | 1,278
GIS008: Appendectomy | West | 15,980 | 1,660
A comparison of our output from HCUPnet with the output from the example SAS code in this tutorial demonstrates that our results match.
You have completed Module 2, Kids' Inpatient Database (KID)!
For any questions about the KID that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:
- Email: hcup@ahrq.gov
- Phone: 866-290-HCUP (4287) (toll free)
- International users, please contact HCUP User Support by email.
The staff reviews messages daily and usually responds to inquiries within 3 business days.
Return to Contents
Module 3: Nationwide Ambulatory Surgery Sample (NASS)
The Nationwide Ambulatory Surgery Sample, or NASS, is the largest all-payer ambulatory surgery database in the United States. It produces national estimates of major ambulatory surgery encounters in hospital-owned facilities.
Information on the NASS is organized by the four sections below. These include:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Additional information about the NASS is available on the NASS Database Documentation page on the HCUP User Support, or HCUP-US, website.
Module 3: Nationwide Ambulatory Surgery Sample (NASS), Overview of the NASS
The NASS is the largest all-payer ambulatory surgery database in the United States, yielding national and regional estimates of major ambulatory surgery encounters performed in hospital-owned facilities.
Major ambulatory surgeries are defined as selected major therapeutic procedures that require the use of an operating room, penetrate or break the skin, and involve regional anesthesia, general anesthesia, or sedation to control pain (that is, surgeries flagged as "narrow" in the HCUP Surgery Flags Software).
The NASS is limited to encounters with at least one in-scope major ambulatory surgery on the record performed at hospital-owned facilities. Procedures intended primarily for diagnostic purposes are not considered in scope. Unweighted, the NASS contains about 9 million ambulatory surgery encounters each year and about 11.8 million ambulatory surgery procedures. Weighted, it estimates about 11.9 million ambulatory surgery encounters and 15.7 million ambulatory surgery procedures.
The NASS is sampled from the HCUP State Ambulatory Surgery and Services Databases (SASD), and is available beginning with data year 2016.
Additional information on the NASS sample design is available in the NASS Introduction.
Return to Contents
Module 3: Nationwide Ambulatory Surgery Sample (NASS), Weighting the NASS
NASS Data Element Encounter Weight
To produce nationally or regionally representative estimates, the NASS data must be weighted. This can be done using the data element encounter weight, or DISCWT, which is assigned to each record in the NASS.
When the encounter weights are applied to the NASS data, the result is an estimate of the number of major ambulatory surgery encounters for the target universe, which includes all major ambulatory surgery encounters performed in facilities owned by community hospitals in the United States, excluding rehabilitation hospitals and long-term acute care hospitals. Prior to data year 2019, specialty hospitals were also excluded. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.
The NASS target universe is major ambulatory surgery encounters in hospital-owned facilities in the U.S. The NASS sample is comprised of 100 percent of the major ambulatory surgery encounters for facilities in the SASD that are also in the NASS target universe. Ambulatory surgery volume for the target universe is derived from encounters in the SASD for facilities in HCUP States and estimated for facilities in non-HCUP States using predictive modeling.
Weights are developed by first summarizing target and sample ambulatory surgery volume by strata defined by four hospital characteristics: (1) ownership/control, (2) bed size, (3) location and teaching status, and (4) the four U.S. census regions.
NASS encounter weights are calculated by dividing the number of universe major ambulatory surgery encounters by the number of sampled SASD major ambulatory surgery encounters within each stratum.
Changes to the NASS Sampling Design Over Time
Changes have occurred to the NASS design since its initial release for data year 2016. These changes include:
- Procedures considered in scope can change year to year.
- Earlier years of the NASS (2016–2018) undercount certain emergent surgeries.
- The hospital-owned facility universe was modified between data years 2018 and 2019 to include specialty hospitals and to limit the universe to hospitals included in the AHA Annual Survey that reported performing outpatient surgery.
Additional information on these changes is available in the NASS Introduction.
These changes may cause discontinuities in trend analyses of major ambulatory surgery encounters over time. Unlike the redesign of the NIS in 2012, the NASS design changes have not resulted in the development of special trend weight files. The NASS encounter weight (data element DISCWT) should be used to obtain national or regional estimates.
Return to Contents
Module 3: Nationwide Ambulatory Surgery Sample (NASS), SAS Code Examples
Example SAS Code for Producing National Estimates by Race and Ethnicity
This example SAS code produces national estimates of major ambulatory surgeries by patient race and ethnicity (data element RACE) in the 2019 NASS.
Title "Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;
proc format;
Value FRACE
1 = " 1: White"
2 = " 2: Black"
3 = " 3: Hispanic"
4 = " 4: Asian/Pacific Islander"
5 = " 5: Native American"
6 = " 6: Other"
. = " .: Missing"
.A = ".A: Invalid"
.B = ".B: Unavailable from source"
;
run;
Title2 "Add NASS_STRATUM from Hospital file";
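Proc Sort Data=NASS2019.NASS_2019_encounter (keep=HOSP_NASS KEY_NASS DISCWT RACE) Out=Encounter ;
By HOSP_NASS KEY_NASS;
Run;
Proc Sort Data=NASS2019.NASS_2019_hospital (keep=HOSP_NASS NASS_STRATUM) Out=Hospital ;
By HOSP_NASS;
Run;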
Data NASS;
Merge Encounter (in=inE)
Hospital (in=inH)
;
By HOSP_NASS;
if inE;
if not inH then abort;
Run;
Title2;
proc surveymeans data=NASS missing sumwgt ;
cluster HOSP_NASS ;
strata NASS_STRATUM ;
domain RACE ;
format RACE FRACE. ;
weight DISCWT ;
var KEY_NASS ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on data element RACE, which has the following mappings:
- Numeric value 1 for White
- Numeric value 2 for Black
- Numeric value 3 for Hispanic
- Numeric value 4 for Asian or Pacific Islander
- Numeric value 5 for Native American
- Numeric value 6 for Other
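- A decimal point means a numeric value is missing,
- A decimal point followed by the uppercase letter A means the value is invalid, and
- A decimal point followed by the uppercase letter B means the value is unavailable from the source.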
This PROC FORMAT is specific to this example and should be modified if your analysis requires a different data element of interest. For example, if you are interested in obtaining national estimates for the primary expected payer or data element PAY1, the proc format would include the mapping for that data element.
The second section of this example SAS code involves two procedures that sort the NASS Encounter File and the NASS Hospital File by HOSP_NASS and KEY_NASS. KEEP statements in these procedures limit the Encounter and Hospital files to only those data elements necessary for linkage, weighting the data, and adding the stratum and RACE fields.
The third section in this code employs a SAS data step that creates a temporary file called NASS by using the MERGE statement to join the NASS Encounter File with the NASS Hospital File to obtain the field NASS_STRATUM.
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:
- The CLUSTER statement, which includes the NASS hospital identifier or data element HOSP_NASS.
- The STRATA statement, which includes the NASS stratum identifier or data element NASS_STRATUM.
- The DOMAIN and FORMAT statements are specific to this analysis, which is interested in national estimates by data element RACE.
- The WEIGHT statement, which includes the NASS encounter weight or data element DISCWT.
- The VAR statement, which includes the NASS record identifier or data element KEY_NASS.
Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 61
Number of Clusters 2958
Number of Observations 8994101
Sum of Weights 11880487.3
Statistics
Variable Label Sum of Weights
-------------------------------------------------------
KEY_NASS NASS record number 11880487
Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Statistics for RACE Domains
Sum of
RACE Variable Label Weights
--------------------------------------------------------------------------------
.: Missing KEY_NASS NASS record number 343473
.A: Invalid KEY_NASS NASS record number 188.485640
1: White KEY_NASS NASS record number 8425840
2: Black KEY_NASS NASS record number 1101118
3: Hispanic KEY_NASS NASS record number 1247939
4: Asian/Pacific Islander KEY_NASS NASS record number 318776
5: Native American KEY_NASS NASS record number 66099
6: Other KEY_NASS NASS record number 377055
--------------------------------------------------------------------------------
The output for this example SAS code provides the weighted record counts for RACE in the 2019 NASS:
- Missing: 343,473
- Invalid: 188
- White: 8,425,840
- Black: 1,101,118
- Hispanic: 1,247,939
- Asian/Pacific Islander: 318,776
- Native American: 66,099
- Other: 377,055
Example SAS Code for Producing National Estimates for Arthroplasty of Knee
This example SAS code identifies the number of weighted records in the 2019 NASS with any-listed procedure of knee arthroplasty, which is based on the HCUP Clinical Classifications Software (CCS) For Services and Procedures category 152.
Title "Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;
Title2 "Add NASS_STRATUM from Hospital file";
Proc Sort Data=NASS2019.NASS_2019_encounter (keep=HOSP_NASS KEY_NASS DISCWT CPTCCS1-CPTCCS30) Out=Encounter ;
By HOSP_NASS KEY_NASS;
Run;
Proc Sort Data=NASS2019.NASS_2019_hospital (keep=HOSP_NASS NASS_STRATUM) Out=Hospital ;
By HOSP_NASS;
Run;
Title2 "Define Knee Arthroplasty";
Data NASS;
Merge Encounter (in=inE)
Hospital (in=inH)
;
by HOSP_NASS;
if inE;
if not inH then abort;
Attrib Knee_Arthroplasty length=3 label='Knee arthroplasty (CPTCCSn=152)';
array CPTCCS{*} CPTCCS1-CPTCCS30;
Knee_Arthroplasty=0;
do i=1 to dim(CPTCCS) until (Knee_Arthroplasty=1);
if CPTCCS(i)=152 then Knee_Arthroplasty=1;
end;
drop i;
Run;
Title2;
proc surveymeans data=NASS missing sum;
cluster HOSP_NASS ;
strata NASS_STRATUM ;
weight DISCWT ;
var Knee_Arthroplasty ;
run;
The first section of this example SAS code involves two procedures which sort the NASS Encounter File and the NASS Hospital File by HOSP_NASS and KEY_NASS. KEEP statements in these procedures limit the Encounter and Hospital files to only those data elements necessary for linkage, weighting the data, and adding the stratum and CPTCCS fields.
The second section in this example SAS code employs a SAS data step that creates a temporary file called NASS. This step includes the following statements:
- The MERGE statement, which combines the NASS Encounter File with the NASS Hospital File to obtain the field NASS_STRATUM. If a hospital is not found in the NASS Hospital File, the DATA step is aborted. This can be seen in the line with the abort code.
- The ATTRIB statement, which assigns a length and a label to a new data element (Knee_Arthroplasty) specific to our example analysis. The next statement, Knee_Arthroplasty =, assigns a value to this new data element, which in our example, is defined based on the CCS for Services and Procedures category of 152 (NASS data element CPTCCSn=152).
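The any-listed logic of the DO loop can be sketched outside SAS as well. The Python snippet below is an illustration only: the function name and the three sample records are hypothetical, and each list stands in for the CPTCCS1-CPTCCS30 array on one encounter record.

```python
KNEE_ARTHROPLASTY_CCS = 152  # CCS for Services and Procedures category from the text


def has_knee_arthroplasty(cptccs_values):
    """Return 1 if any listed CCS category equals 152, else 0."""
    return int(any(value == KNEE_ARTHROPLASTY_CCS for value in cptccs_values))


# Hypothetical records; None marks an unused procedure slot.
records = [
    [152, None, None],  # first-listed match
    [47, 152, 152],     # category listed twice, still flagged once
    [47, 213, None],    # no match
]
flags = [has_knee_arthroplasty(record) for record in records]
print(flags)
```

Because the flag is 0 or 1 per record, the weighted sum of the flag counts each encounter at most once, however many of its procedures map to category 152.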
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_NASS.
- The STRATA statement, which includes NASS_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
- The VAR statement, which includes the value Knee_Arthroplasty, which we defined in the DATA step above.
Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 61
Number of Clusters 2958
Number of Observations 8994101
Sum of Weights 11880487.3
Statistics
Std Error
Variable Label Sum of Sum
----------------------------------------------------------------
Knee_Arthroplasty Knee_Arthroplasty 301910 10285
(CPTCCSn=152)
The output for this example SAS code provides the total number of weighted records in the 2019 NASS with any-listed procedure CCS for Services and Procedures category of 152 for Arthroplasty of knee, which is 301,910.
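The standard error in this output can be turned into a rough interval around the estimate. The Python sketch below is an approximation under a stated assumption: it uses a normal multiplier of 1.96, whereas SAS would typically base confidence limits on a t distribution with design-based degrees of freedom, so SAS-produced limits may differ slightly.

```python
estimate = 301_910  # weighted sum from the output above
std_error = 10_285  # standard error of the sum from the output above
z = 1.96            # assumption: normal approximation for a 95 percent interval

lower = estimate - z * std_error
upper = estimate + z * std_error
relative_se = std_error / estimate

print(round(lower), round(upper))
print(f"relative standard error: {relative_se:.1%}")
```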
Example SAS Code for Producing Regional Estimates for Arthroplasty of Knee
This example SAS code produces regional estimates for knee arthroplasty in the 2019 NASS, which is based on the CCS for Services and Procedures category 152.
Title "Produce Regional Estimate of Encounters With Any-Listed Knee Arthroplasty Procedure from 2019 NASS File (Weighted)";
Libname NASS2019 "O:\NASS\2019\run1\data" access=readonly;
Options PS=51 LS=146 ;
proc format;
Value St_Regn
1 = "1: Northeast"
2 = "2: Midwest"
3 = "3: South"
4 = "4: West"
;
run;
Title2 "Add NASS_STRATUM and Region from Hospital file";
Proc Sort Data=NASS2019.NASS_2019_encounter (keep=HOSP_NASS KEY_NASS DISCWT CPTCCS1-CPTCCS30) Out=Encounter ;
By HOSP_NASS KEY_NASS;
Run;
Proc Sort Data=NASS2019.NASS_2019_hospital (keep=HOSP_NASS NASS_STRATUM HOSP_REGION) Out=Hospital ;
By HOSP_NASS;
Run;
Title2 "Define Knee Arthroplasty";
Data NASS;
Merge Encounter (in=inE)
Hospital (in=inH)
;
by HOSP_NASS;
if inE;
if not inH then abort;
Attrib Knee_Arthroplasty length=3 label='Knee arthroplasty (CPTCCSn=152)';
array CPTCCS{*} CPTCCS1-CPTCCS30;
Knee_Arthroplasty=0;
do i=1 to dim(CPTCCS) until (Knee_Arthroplasty=1);
if CPTCCS(i)=152 then Knee_Arthroplasty=1;
end;
drop i;
Run;
Title2;
proc surveymeans data=NASS missing sum ;
cluster HOSP_NASS ;
strata NASS_STRATUM ;
weight DISCWT ;
domain HOSP_REGION ;
format HOSP_REGION st_regn. ;
var Knee_Arthroplasty ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:
- Numeric value 1 for Northeast
- Numeric value 2 for Midwest
- Numeric value 3 for South
- Numeric value 4 for West
The second section in the example SAS code involves two procedures that sort the NASS Encounter File and the NASS Hospital File by HOSP_NASS and KEY_NASS, keeping only the essential data elements from each file.
The third section in this example SAS code employs a SAS data step that combines the NASS Encounter File with the NASS Hospital File to obtain the fields NASS_STRATUM and HOSP_REGION. If a hospital is not found in the NASS Hospital File, the DATA step is aborted. This can be seen in the line with the abort code. The ATTRIB statement assigns a length and a label to a new data element (Knee_Arthroplasty) specific to our example analysis. The next statement, Knee_Arthroplasty =, assigns a value to this new data element, which in our example is defined based on any-listed CCS for Services and Procedures category 152 (NASS data element CPTCCSn=152).
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NASS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_NASS.
- The STRATA statement, which includes NASS_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
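- The VAR statement, which includes the value Knee_Arthroplasty, which we defined in the DATA step above.
- The DOMAIN and FORMAT statements, which are specific to HOSP_REGION as we are interested in regional estimates.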
Produce Regional Estimate of Encounters With Any-Listed Knee Arthroplasty Procedure from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error
HOSP_REGION Variable Label Sum of Sum
-------------------------------------------------------------------------------------------
1: Northeast Knee_Arthroplasty Knee arthroplasty 35372 3981.738273
(CPTCCSn=152)
2: Midwest Knee_Arthroplasty Knee arthroplasty 77946 4121.714968
(CPTCCSn=152)
3: South Knee_Arthroplasty Knee arthroplasty 117928 6265.562914
(CPTCCSn=152)
4: West Knee_Arthroplasty Knee arthroplasty 70664 5804.339212
(CPTCCSn=152)
-------------------------------------------------------------------------------------------
The output for this example SAS code provides the total number of weighted records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee, by hospital region:
- Northeast: 35,372
- Midwest: 77,946
- South: 117,928
- West: 70,664
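One quick sanity check on domain output is that the regional estimates should add up to the national estimate. The Python sketch below does that arithmetic with the figures reported in this module. The variable names are illustrative.

```python
# Regional weighted sums from the output above.
regional = {"Northeast": 35_372, "Midwest": 77_946, "South": 117_928, "West": 70_664}
national = 301_910  # national weighted sum from the previous example

total = sum(regional.values())
shares = {region: count / total for region, count in regional.items()}

print(total == national)
for region, share in shares.items():
    print(f"{region}: {share:.1%}")
```

If the totals disagree, a likely cause is records dropped from the domain analysis, for example records with a missing domain value.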
Module 3: Nationwide Ambulatory Surgery Sample (NASS), Validating National and Regional Estimates
There are two resources that can be used to validate national and regional estimates for the NASS.
- The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
- The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-10-CM codes (individually and grouped by clinical category) and CPT codes grouped by Clinical Classifications Software (CCS) for the NASS. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
HCUP Summary Statistics
Here is the output from our first example analysis, which produced national estimates by patient race and ethnicity, or data element RACE, from the 2019 NASS. We have separate weighted counts for each RACE value as well as for missing and invalid values.
Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Statistics for RACE Domains
Sum of
RACE Variable Label Weights
--------------------------------------------------------------------------------
.: Missing KEY_NASS NASS record number 343473
.A: Invalid KEY_NASS NASS record number 188.485640
1: White KEY_NASS NASS record number 8425840
2: Black KEY_NASS NASS record number 1101118
3: Hispanic KEY_NASS NASS record number 1247939
4: Asian/Pacific Islander KEY_NASS NASS record number 318776
5: Native American KEY_NASS NASS record number 66099
6: Other KEY_NASS NASS record number 377055
--------------------------------------------------------------------------------
For validation, we are going to compare the output with the 2019 NASS Summary Statistics.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NASS Database Documentation.
The NASS Summary Statistics are available on this page, under the "Data Elements" section on the left-hand side.
The NASS Summary Statistics include all years of the NASS. We will scroll down to the section specific to data year 2019. Our data element of interest, RACE, is in the NASS Encounter File, which means we will want to select the Summary Statistics for the NASS Encounter File and, specifically, the file that provides weighted estimates (i.e., 2019 NASS Encounter File, Weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element RACE. We can do this easily by searching for this data element name within the downloaded PDF.
NASS Summary Statistics 2019 Weighted Frequency Distribution for RACE
RACE | Frequency | Percent of Total
.: Missing | 343,473 | 2.89%
.A: Invalid | 188 | 0.00%
1: White | 8,425,840 | 70.92%
2: Black | 1,101,118 | 9.27%
3: Hispanic | 1,247,939 | 10.50%
4: Asian/Pacific Islander | 318,776 | 2.68%
5: Native American | 66,099 | 0.56%
6: Other | 377,055 | 3.17%
Produce National Estimate of Encounters By Patient Race/Ethnicity from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Statistics for RACE Domains
Sum of
RACE Variable Label Weights
--------------------------------------------------------------------------------
.: Missing KEY_NASS NASS record number 343473
.A: Invalid KEY_NASS NASS record number 188.485640
1: White KEY_NASS NASS record number 8425840
2: Black KEY_NASS NASS record number 1101118
3: Hispanic KEY_NASS NASS record number 1247939
4: Asian/Pacific Islander KEY_NASS NASS record number 318776
5: Native American KEY_NASS NASS record number 66099
6: Other KEY_NASS NASS record number 377055
--------------------------------------------------------------------------------
A comparison of the RACE frequency from the 2019 NASS Weighted Encounter Summary Statistics and the output from SAS demonstrates that our results match.
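The same comparison can be scripted. The Python sketch below recomputes the percent of total from the SAS weighted counts and checks each value against the percentages in the Summary Statistics table. The dictionary and variable names are illustrative, and the .A count is the SAS value 188.485640 rounded to a whole number.

```python
# Weighted counts from the SURVEYMEANS output (rounded to whole records).
sas_counts = {
    ".: Missing": 343_473,
    ".A: Invalid": 188,
    "1: White": 8_425_840,
    "2: Black": 1_101_118,
    "3: Hispanic": 1_247_939,
    "4: Asian/Pacific Islander": 318_776,
    "5: Native American": 66_099,
    "6: Other": 377_055,
}
# "Percent of Total" values from the Summary Statistics table.
published_percent = {
    ".: Missing": 2.89,
    ".A: Invalid": 0.00,
    "1: White": 70.92,
    "2: Black": 9.27,
    "3: Hispanic": 10.50,
    "4: Asian/Pacific Islander": 2.68,
    "5: Native American": 0.56,
    "6: Other": 3.17,
}

total = sum(sas_counts.values())
computed_percent = {k: round(100 * v / total, 2) for k, v in sas_counts.items()}
mismatches = [
    k for k in sas_counts
    if abs(computed_percent[k] - published_percent[k]) > 0.005
]
print(total, mismatches)
```

The summed counts land within one record of the Sum of Weights in the Data Summary (11,880,487.3), with the small difference coming from rounding.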
Diagnosis and Procedure Frequency Tables
Here is the output from our second example analysis, which produced national estimates for records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee.
Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 61
Number of Clusters 2958
Number of Observations 8994101
Sum of Weights 11880487.3
Statistics
Std Error
Variable Label Sum of Sum
----------------------------------------------------------------
Knee_Arthroplasty Knee_Arthroplasty 301910 10285
(CPTCCSn=152)
For validation, we are going to compare the output with the NASS Diagnosis and Procedure Frequency Tables.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NASS Database Documentation.
The NASS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.
Once the file has been downloaded, we will navigate to the tab T.3_By_CPTCCS_Category, which includes the unweighted and weighted number of major ambulatory surgery encounters by individual CCS for Services and Procedures category. We will then navigate to the row for CCS for Services and Procedures category 152, Arthroplasty of knee, and scroll over to the columns that are specific to the 2019 NASS. Note that you can filter to CCS for Services and Procedures category 152 by using either Column A or Column B. We are now ready to compare the values with the output from SAS.
Table 3. Weighted and Unweighted Number of Records for CPT Codes by Clinical Classifications Software (CCS) for Services and Procedures Category
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Nationwide Ambulatory Surgery Sample (NASS), 2016-2019
Note: Unduplicated means that if two or more procedures on the encounter record mapped to the same CCS category, the record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size. Blank cells indicate that the CCS category was not in scope for the data year.
CCS for Services and Procedures Category | CCS for Services and Procedures Description | 2019 NASS: Weighted N for CPT1 CCS | 2019 NASS: Weighted N for All-Listed CCS (Unduplicated) | 2019 NASS: Unweighted N for CPT1 CCS | 2019 NASS: Unweighted N for All-Listed CCS (Unduplicated)
152 | 152: Arthroplasty | 294,917 | **301,910 | 220,773 | 225,866
Produce National Estimate of Encounters With Any-Listed Knee Arthroplasty Procedures from 2019 NASS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 61
Number of Clusters 2958
Number of Observations 8994101
Sum of Weights 11880487.3
Statistics
Std Error
Variable Label Sum of Sum
----------------------------------------------------------------
Knee_Arthroplasty Knee_Arthroplasty **301910 10285
(CPTCCSn=152)
A comparison of the weighted count for records in the 2019 NASS with any-listed CCS for Services and Procedures category 152, Arthroplasty of knee, from the Diagnosis and Procedure Frequency Tables with the output from SAS (both denoted by **) demonstrates that our results match.
Module 3: Nationwide Ambulatory Surgery Sample (NASS)
You have completed Module 3, Nationwide Ambulatory Surgery Sample (NASS)!
For any questions about the NASS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:
- Email: hcup@ahrq.gov
- Phone: 866-290-HCUP (4287) (toll free)
- International users, please contact HCUP User Support by email.
The staff reviews messages daily and usually responds to inquiries within 3 business days.
Module 4: Nationwide Emergency Department Sample (NEDS)
The Nationwide Emergency Department Sample (NEDS) can be used to produce national estimates of emergency department (ED) visits across the country. The NEDS includes both ED visits that result in admission to the hospital and those that do not.
Information on the NEDS is organized by the four sections below. These include:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Additional information about the NEDS is available on the NEDS Database Documentation page on the HCUP User Support (HCUP-US) website.
Module 4: Nationwide Emergency Department Sample (NEDS), Overview of the NEDS
One of the most distinctive features of the NEDS is its large sample size, which allows for analysis across hospital types and the study of relatively uncommon disorders and procedures. Unweighted, the NEDS contains data from 33 million ED visits from nearly 1,000 hospital-owned EDs. Weighted, the NEDS represents 143 million ED visits.
The NEDS is sampled from the HCUP State Emergency Department Databases (SEDD) and State Inpatient Databases (SID). The SEDD capture information on ED visits that do not result in an admission (e.g., treat-and-release visits and transfers to another hospital). The SID contain information on patients initially seen in the ED and then admitted to the same hospital. The NEDS is available annually beginning with data year 2006.
Additional information on the sample design of the NEDS is available in the NEDS Introduction and the HCUP Sample Design tutorial.
Module 4: Nationwide Emergency Department Sample (NEDS), NEDS Data Elements Discharge Weight and Hospital Weights
NEDS Data Element Discharge Weight
To produce nationally or regionally representative estimates, the NEDS data must be weighted.
When the discharge weight (DISCWT) is applied to NEDS discharge-level data, the result is an estimate of the number of ED visits for the target universe. When the hospital weight (HOSPWT) is applied to hospital-level data, the result is an estimate of the number of EDs in the target universe. The target universe covers all ED visits in facilities owned by community hospitals in the United States, excluding rehabilitation hospitals. As defined by the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.
Weights are calculated after hospitals from the SEDD and SID have been stratified and sampled. Hospitals are stratified (grouped) based on five hospital characteristics: (1) ownership/control, (2) teaching status, (3) urban/rural location, (4) trauma center designation, and (5) location in the four U.S. Census regions. Within each stratum, 20% of hospitals in the target universe are sampled from the combined SEDD and SID. The number of hospitals in the target universe stratum is determined from the American Hospital Association (AHA) Annual Survey data for all States and hospitals, including those without data in the SEDD or SID.
Then the NEDS discharge weight is calculated by dividing the number of ED visits in the target universe by the number of ED visits in the sampled hospitals within each stratum.
NEDS Hospital Weight
To produce hospital-level estimates, such as the number of hospital-owned EDs in the United States located in a metropolitan area, you need to apply a hospital weight (HOSPWT) to the data. HOSPWT is assigned to each hospital within the NEDS Hospital File, with the value varying across records.
HOSPWT is also calculated according to the NEDS strata of ownership/control, teaching status, urban/rural location, trauma center designation, and the four U.S. census regions. The number of hospital-owned EDs in the target universe is determined from the AHA data for all States and hospitals, including those without data in the SEDD or SID.
NEDS hospital weights are calculated by dividing the number of hospital-owned EDs in the target universe by the number of sampled hospital-owned EDs within each stratum.
Module 4: Nationwide Emergency Department Sample (NEDS), SAS Code Examples
Example SAS Code for Producing National Estimates by Type of ED Visit
This example SAS code produces national estimates by the type of ED visit or source of the ED record (data element HCUPFILE) in the 2019 NEDS.
Title "Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)";
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
proc surveymeans data=neds2019.neds_2019_core missing sumwgt ;
cluster HOSP_ED ;
strata NEDS_STRATUM ;
domain HCUPFILE ;
weight DISCWT ;
var KEY_ED ;
run;
This example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:
- The CLUSTER statement, which includes the NEDS hospital identifier or data element HOSP_ED.
- The STRATA statement, which includes the NEDS stratum identifier or data element NEDS_STRATUM.
- The DOMAIN statement is specific to this analysis, which produces national estimates by data element HCUPFILE.
- The WEIGHT statement, which includes the NEDS discharge weight or data element DISCWT.
- The VAR statement, which includes the NEDS record identifier or data element KEY_ED.
Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 141
Number of Clusters 989
Number of Observations 33147251
Sum of Weights 143432284
Statistics
Variable Label Sum of Weights
---------------------------------------------------------------
KEY_ED HCUP NEDS record identifier 143432284
Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HCUPFILE domains
Sum of
HCUPFILE Variable Label Weights
-------------------------------------------------------------------------
SEDD KEY_ED HCUP NEDS record identifier 123058750
SID KEY_ED HCUP NEDS record identifier 20373534
-------------------------------------------------------------------------
The output for this example SAS code provides the total number of weighted records in the 2019 NEDS, which is 143,432,284, as well as the total number of weighted records for HCUPFILE:
- SEDD: 123,058,750
- SID: 20,373,534
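A few lines of arithmetic confirm that the two HCUPFILE domains account for the full weighted total, and show what the figures imply about admissions and the average weight. The Python sketch below uses only numbers from the output above. The variable names are illustrative.

```python
total_weighted = 143_432_284  # Sum of Weights from the Data Summary
unweighted_records = 33_147_251  # Number of Observations from the Data Summary
by_source = {"SEDD": 123_058_750, "SID": 20_373_534}

domains_total = sum(by_source.values())
admitted_share = by_source["SID"] / total_weighted  # ED visits ending in admission
average_weight = total_weighted / unweighted_records

print(domains_total == total_weighted)
print(f"admitted share: {admitted_share:.1%}")
print(f"average discharge weight: {average_weight:.3f}")
```

Individual weights vary by stratum, so the average weight is only a summary of the overall sampling rate.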
Example SAS Code for Producing National Estimates for Urinary Tract Infection
This example SAS code identifies the number of weighted records in the 2019 NEDS with a principal or first-listed diagnosis of urinary tract infection (UTI), which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, GEN004 (Urinary tract infection).
Title "Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)";
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
data neds;
merge neds2019.neds_2019_core (keep=HOSP_ED NEDS_STRATUM KEY_ED DISCWT)
neds2019.neds_2019_dx_pr_grps (keep=HOSP_ED KEY_ED DXCCSR_Default_DX1)
;
by HOSP_ED KEY_ED;
Attrib UTI length=3 label='Urinary Tract Infection (Default CCSR=GEN004)';
UTI=(DXCCSR_Default_DX1='GEN004');
run;
proc surveymeans data=neds sum mean nomcar;
cluster HOSP_ED ;
strata NEDS_STRATUM;
weight DISCWT;
var UTI;
run;
The first section of this example SAS code includes the DATA step, which identifies records with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis. This step includes the following statements:
- The MERGE statement, which combines the NEDS Core File with the NEDS Diagnosis and Procedure Groups File. The NEDS Diagnosis and Procedure Groups File includes the default CCSR category for the principal or first-listed diagnosis or data element DXCCSR_Default_DX1.
- The KEEP statements, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and DXCCSR_Default_DX1.
- The ATTRIB statement, which assigns a length and a label to a new data element (UTI) specific to our example analysis. The next statement, UTI =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of GEN004 for the principal or first-listed diagnosis (NEDS data element DXCCSR_Default_DX1=GEN004).
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_ED.
- The STRATA statement, which includes NEDS_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
- The VAR statement, which includes the value, UTI, that we defined in the DATA step above.
Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 141
Number of Clusters 989
Number of Observations 33147251
Sum of Weights 143432284
Variance Estimation
Method Taylor Series
Missing Values NOMCAR
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------
UTI Urinary Tract Infection 0.025498 0.000305 3657277 85457
(Default CCSR=GEN004)
The output for this example SAS code provides the total number of weighted records in the 2019 NEDS with a default CCSR category for the principal or first-listed diagnosis of GEN004, Urinary tract infection, which is 3,657,277.
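Because UTI is a 0/1 indicator, its weighted mean is the share of all weighted ED visits with the condition, so the mean multiplied by the sum of weights should roughly reproduce the weighted sum. The Python sketch below checks this with the figures from the output. The variable names are illustrative, and the small gap reflects the mean being printed to six decimal places.

```python
sum_of_weights = 143_432_284  # from the Data Summary
uti_mean = 0.025498           # weighted mean of the UTI indicator
uti_sum = 3_657_277           # weighted sum of the UTI indicator

implied_sum = uti_mean * sum_of_weights
relative_gap = abs(implied_sum - uti_sum) / uti_sum

print(round(implied_sum))
print(f"relative gap: {relative_gap:.6f}")
```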
Example SAS Code for Producing Regional Estimates for Urinary Tract Infection
The example SAS code below produces regional estimates for records in the 2019 NEDS with a principal or first-listed diagnosis of UTI (default CCSR category GEN004).
Title "Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)";
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
proc format;
Value St_Regn
1 = "1: Northeast"
2 = "2: Midwest"
3 = "3: South"
4 = "4: West"
;
run;
data neds;
merge neds2019.neds_2019_core (keep=HOSP_ED NEDS_STRATUM KEY_ED DISCWT)
neds2019.neds_2019_dx_pr_grps (keep=HOSP_ED KEY_ED DXCCSR_Default_DX1)
;
by HOSP_ED KEY_ED;
Attrib UTI length=3 label='Urinary Tract Infection (Default CCSR=GEN004)';
UTI=(DXCCSR_Default_DX1='GEN004');
/* look up region */
if _n_=1 then do;
if 0 then set neds2019.neds_2019_hospital (keep=HOSP_REGION); %* initiates the variable;
declare hash h (dataset: "neds2019.neds_2019_hospital");
h.defineKey('HOSP_ED');
h.defineData('HOSP_REGION');
h.defineDone();
end;
if h.find() ne 0 then abort; %* all discharges should have a matching hospital record;
format HOSP_REGION st_regn.;
run;
proc surveymeans data=neds sum mean nomcar ;
cluster HOSP_ED ;
strata NEDS_STRATUM ;
domain HOSP_REGION ;
weight DISCWT ;
var UTI ;
run;
The first section of this example SAS code includes a PROC FORMAT, which assigns data labels to the data values in the output. For this example, we are focused on the data element HOSP_REGION, which includes the following mappings:
- Numeric value 1 for Northeast
- Numeric value 2 for Midwest
- Numeric value 3 for South
- Numeric value 4 for West
The second section includes the DATA step, which includes the following statements:
- The MERGE statement, which links the NEDS Core File with the NEDS Diagnosis and Procedure Groups File keeping essential data elements from each file.
- For this specific example, an additional hash-table lookup retrieves the data element HOSP_REGION, which resides in the NEDS Hospital File. If a record has no matching hospital, the DATA step is aborted.
- The ATTRIB statement, which assigns a length and a label to a new data element (UTI) specific to our example analysis. The next statement, UTI =, assigns a value to this new data element, which in our example, is defined based on the default CCSR category of GEN004 for the principal or first-listed diagnosis (NEDS data element DXCCSR_Default_DX1=GEN004).
The final section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_ED.
- The STRATA statement, which includes NEDS_STRATUM.
- The WEIGHT statement, which includes data element DISCWT.
- The VAR statement, which includes the data element UTI, which we defined in the DATA step above.
- The DOMAIN statement, which is specific to HOSP_REGION as we are interested in regional estimates.
Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------------
1: Northeast UTI Urinary Tract Infection 0.021364 0.000509 554359 34012
(Default CCSR=GEN004)
2: Midwest UTI Urinary Tract Infection 0.024167 0.000529 776972 39084
(Default CCSR=GEN004)
3: South UTI Urinary Tract Infection 0.027824 0.000614 1623836 58585
(Default CCSR=GEN004)
4: West UTI Urinary Tract Infection 0.026030 0.000496 702110 34442
(Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
The output for this example SAS code provides the total number of weighted records in the 2019 NEDS with a default CCSR category of GEN004, UTI, by hospital region:
- Northeast: 554,359
- Midwest: 776,972
- South: 1,623,836
- West: 702,110
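As a quick consistency check, the four regional sums should add up to the national estimate for the same diagnosis (3,657,277, shown in the validation section of this module). A minimal sketch, with the sums typed in by hand from the output above; the data set name region_sums and the variable names are illustrative and are not NEDS data elements.

```sas
/* Regional weighted sums copied from the SURVEYMEANS output above */
data region_sums;
  length region $9;
  input region $ uti_sum;
  datalines;
Northeast 554359
Midwest 776972
South 1623836
West 702110
;
run;

/* Add the four regions; the total should equal the national estimate */
proc means data=region_sums sum maxdec=0;
  var uti_sum;
run;
```

The four regional sums total 3,657,277, which agrees with the national estimate.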
Title "Produce National Estimate of Hospitals with HOSP_TRAUMA>0 from 2019 NEDS File (Weighted)";
Libname NEDS2019 "V:\NEDS\2019\SASDATA" access=readonly;
Options PS=51 LS=146 ;
data neds;
set neds2019.neds_2019_hospital;
Attrib Trauma_Hosp length=3 label='Trauma Hospital';
Trauma_Hosp=(HOSP_TRAUMA>0);
run;
proc surveymeans data=neds sum mean nomcar;
cluster HOSP_ED ;
strata NEDS_STRATUM;
weight HOSPWT;
var Trauma_Hosp;
run;
The first section of this example SAS code includes the DATA step. Included in this step is the ATTRIB statement, which assigns a length and a label to a new data element (Trauma_Hosp) specific to our example analysis. The next statement, Trauma_Hosp =, assigns a value to this new data element, which in our example, is defined based on any trauma center designation; therefore, we are interested in a value greater than 0 for the NEDS data element HOSP_TRAUMA.
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NEDS. This procedure includes the following statements:
- The CLUSTER statement, which includes HOSP_ED.
- The STRATA statement, which includes NEDS_STRATUM.
- The WEIGHT statement, which includes the NEDS hospital weight or data element HOSPWT given that we are interested in national estimates of hospital-owned EDs.
- The VAR statement, which includes the data element Trauma_Hosp, which we defined in the DATA step above.
Produce National Estimate of Hospitals with HOSP_TRAUMA>0 from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 141
Number of Clusters 989
Number of Observations 989
Sum of Weights 4549
Variance Estimation
Method Taylor Series
Missing Values NOMCAR
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-----------------------------------------------------------------------------------------------
Trauma_Hosp Trauma Hospital 0.239613 0 1090.000000 3.335209E-15
The output for this example SAS code includes the number of hospital-owned EDs in the United States designated as a trauma center in the 2019 NEDS, which is 1,090.
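Because Trauma_Hosp is a 0/1 indicator, its weighted sum should equal its weighted mean multiplied by the sum of the hospital weights. The sketch below reproduces that arithmetic from the printed output; the variable names are illustrative, and the mean is used as printed (rounded to six decimals), so the product is only approximately equal to the reported sum.

```sas
data _null_;
  mean_trauma = 0.239613;  /* Mean from the Statistics table */
  sum_weights = 4549;      /* Sum of Weights from the Data Summary */
  est_trauma  = mean_trauma * sum_weights;
  /* Writes the product to the SAS log with one decimal place */
  put 'Estimated trauma-designated EDs: ' est_trauma 8.1;
run;
```

The product rounds to 1,090, in line with the Sum column.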
Return to Contents
Module 4: Nationwide Emergency Department Sample (NEDS), Validating National and Regional Estimates
There are three resources that can be used to validate national and regional estimates for the NEDS.
- The HCUP Summary Statistics include means on all numeric variables, frequency distributions, and univariates on continuous variables for each HCUP database. Summary Statistics are provided by year.
- The HCUP Diagnosis and Procedure Frequency Tables provide frequencies of ICD-9-CM and ICD-10-CM/PCS codes (individually and grouped by clinical category) in the HCUP nationwide databases. These are available under the "Data Elements" section of the respective nationwide database documentation page on the HCUP-US website.
- HCUPnet is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. HCUPnet can be used to validate select national estimates obtained from the NIS, KID, NEDS, or NRD and county- or State-level statistics for participating HCUP Partners.
HCUP Summary Statistics
Here is output from our first example analysis, which produced national estimates by source of the ED record, or data element HCUPFILE, from the 2019 NEDS. We have separate weighted counts for each of the two HCUPFILE values.
Produce National Estimate of ED Visits from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HCUPFILE domains
Sum of
HCUPFILE Variable Label Weights
-------------------------------------------------------------------------
SEDD KEY_ED HCUP NEDS record identifier 123058750
SID KEY_ED HCUP NEDS record identifier 20373534
-------------------------------------------------------------------------
For validation, we are going to compare the output with the 2019 NEDS Summary Statistics.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NEDS Database Documentation.
The NEDS Summary Statistics page covers all years of the NEDS. We will scroll down to the section specific to data year 2019. Our data element of interest, HCUPFILE, is in the NEDS Core File, which means we will want to select the Summary Statistics for the NEDS Core File and, specifically, the file that provides weighted estimates (i.e., 2019 NEDS Core File, weighted). Once the file has downloaded, we will need to navigate to the frequency of the data element, HCUPFILE. We can do this easily by searching for this data element name within the downloaded PDF.
HCUP Weighted Summary Statistics Report: NEDS 2019 Core File Weighted Frequency Distribution for HCUPFILE
HCUPFILE | Frequency | Percent
SEDD | 123,058,750 | 85.80%
SID | 20,373,534 | 14.20%
A comparison of the HCUPFILE frequency from the 2019 NEDS Weighted Core Summary Statistics and the output from SAS demonstrates that our results match.
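The percentages in the Summary Statistics report can also be reproduced from the two weighted counts. The following is a minimal sketch in which the counts are entered by hand; the data set name hcupfile_share and the variable wtd_n are illustrative.

```sas
/* Weighted counts by HCUPFILE, copied from the SURVEYMEANS output */
data hcupfile_share;
  length hcupfile $4;
  input hcupfile $ wtd_n;
  datalines;
SEDD 123058750
SID 20373534
;
run;

/* Treat each count as a frequency so PROC FREQ reports the percent share */
proc freq data=hcupfile_share;
  tables hcupfile;
  weight wtd_n;
run;
```

The two counts add to 143,432,284, and the SEDD and SID shares come to about 85.80 percent and 14.20 percent, consistent with the report.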
Diagnosis and Procedure Frequency Tables
Here is output from our second example analysis, which produced national estimates for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis.
Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 141
Number of Clusters 989
Number of Observations 33147251
Sum of Weights 143432284
Variance Estimation
Method Taylor Series
Missing Values NOMCAR
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------
UTI Urinary Tract Infection 0.025498 0.000305 3657277 85457
(Default CCSR=GEN004)
For validation, we are going to compare the output with the NEDS Diagnosis and Procedure Frequency Tables.
From the HCUP User Support (HCUP-US) website homepage we will navigate to the top menu and select Database Information. Once we arrive on this page, we will select the link for the NEDS Database Documentation.
The NEDS Diagnosis and Procedure Frequency Tables are available on this page, under the "Data Elements" section on the left-hand side.
Once the file has been downloaded, we will navigate to the tab T.1_By_DXCCSR_Category, which includes the unweighted and weighted number of records by individual CCSR for ICD-10-CM diagnosis categories for all ED visits. Note that if you wish to obtain counts separately for treat-and-release ED visits or ED visits that result in admission to the same hospital, you will need to use the two subsequent tabs that end in "TandR" or "EDadmit." We will then navigate to the row for CCSR category GEN004, Urinary tract infection, and scroll over to the columns that are specific to the 2019 NEDS. Note that you can filter to GEN004 using either Column A or Column B.
Table 1. Weighted and Unweighted Number of Records (All Emergency Department Visits) by Clinical Classifications Software Refined (CCSR) for ICD-10-CM Diagnoses, v2021.2
Source: Agency for Healthcare Research and Quality (AHRQ), Healthcare Cost and Utilization Project (HCUP), Nationwide Emergency Department Sample (NEDS), 2016-2019
Note: Counts for all-listed diagnoses include all possible CCSR category assignments. Unduplicated means that if two or more diagnosis codes on the same discharge record mapped to the same CCSR category, the discharge record was only counted once. An asterisk (*) indicates the value has been suppressed because of small sample size.
CCSR for ICD-10-CM Category, v2021.2 | CCSR Description, v2021.2 | 2019 NEDS-All: Weighted N for DX1 CCSR Default | 2019 NEDS-All: Weighted N for All-Listed CCSR (Unduplicated) | 2019 NEDS-All: Unweighted N for DX1 CCSR Default | 2019 NEDS-All: Unweighted N for All-Listed CCSR (Unduplicated)
GEN004 | GEN004 Urinary tract infections | **3,657,277 | 7,963,917 | 851,106 | 1,856,213
Produce National Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 141
Number of Clusters 989
Number of Observations 33147251
Sum of Weights 143432284
Variance Estimation
Method Taylor Series
Missing Values NOMCAR
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------
UTI Urinary Tract Infection 0.025498 0.000305 **3657277 85457
(Default CCSR=GEN004)
A comparison of the weighted count for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis with the output from SAS (denoted by **) demonstrates that our results match.
HCUPnet
Here is output from our third example analysis, which produced regional estimates for records in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal or first-listed diagnosis.
Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------------
1: Northeast UTI Urinary Tract Infection 0.021364 0.000509 554359 34012
(Default CCSR=GEN004)
2: Midwest UTI Urinary Tract Infection 0.024167 0.000529 776972 39084
(Default CCSR=GEN004)
3: South UTI Urinary Tract Infection 0.027824 0.000614 1623836 58585
(Default CCSR=GEN004)
4: West UTI Urinary Tract Infection 0.026030 0.000496 702110 34442
(Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
For validation, we are going to compare the output with HCUPnet.
As a first step, we will need to accept the terms of the Data Use Agreement. Now, we will navigate to the top menu and select "Emergency Department Setting." Once selected, we will subsequently expand the option for "National Emergency Department" and select "Diagnoses."
The output will default to displaying trends in the total number of ED visits with a default CCSR category of BLD001, Nutritional anemia, for the principal or first-listed diagnosis. We need to modify the selections on the left-hand side of the screen to align with our analysis.
- First, select the option for "Cross-Sectional" analysis.
- Next, retain the default data year of "2019" in the "Years" drop-down.
- Next, select "Diagnoses—Clinical Classifications Software Refined (CCSR)" in the "Classification Types" drop-down, and retain the "Principal/First-listed" option in the "Principal/First-listed or All-Listed" drop-down.
- Next, under the "Diagnoses/Procedures" drop-down unclick the (All) selection to change the default from running the query on all CCSR categories. Scroll down through the list to CCSR category GEN004, Urinary tract infection, or use the search bar and ensure the box is checked.
- Next, retain the default value "ED Visits Resulting in Hospital Admission" in the "Type of ED Visit" drop-down. Note that there is no option for "All ED Visits." As a result, we will need to query the two ED visit types separately and then combine for all ED visits. Because each discharge only has one principal/first-listed diagnosis, we are able to sum the number of discharges.
- Next, ensure only "Number of ED visits" is selected in the "Outcomes" drop-down.
- Next, select the "Hospital Census Region" option in the "Characteristic" drop-down.
- Next, retain the default option of "All" for the "Characteristic Levels" drop-down.
- Last, select the box for "Show 95% CI" to display the 95 percent confidence intervals for the estimates if you wish to view this information.
A table will appear next to the left-hand side menu where the selections were made. This table presents regional estimates for ED visits that result in hospital admission in the 2019 NEDS with a default CCSR category of GEN004, Urinary tract infection, for the principal diagnosis. If you wish to display a graph for this output, navigate to the upper right and make the necessary selections under the "Diagnoses/Procedures to Graph" and "Outcome to Graph" drop-downs.
Now, we need to go back and select the "Treat-and-release ED visits" option from the "Type of ED Visit" drop-down. The results table will automatically update to reflect this type of ED visit.
Diagnoses | Characteristic Levels | ED Visits Resulting in Admission: Number of ED visits | Treat-and-Release ED Visits: Number of ED visits | Sum of ED Visits Resulting in Admission + Treat-and-Release ED Visits: Number of ED visits
GEN004: Urinary Tract Infections | Midwest | 97,389 | + 679,582 | = 776,971
GEN004: Urinary Tract Infections | Northeast | 92,261 | + 462,098 | = 554,359
GEN004: Urinary Tract Infections | South | 205,833 | + 1,418,003 | = 1,623,836
GEN004: Urinary Tract Infections | West | 67,451 | + 634,659 | = 702,110
Produce Regional Estimate of ED Visits With Principal/First-Listed Urinary Tract Infection from 2019 NEDS File (Weighted)
The SURVEYMEANS Procedure
Statistics for HOSP_REGION Domains
Std Error Std Error
HOSP_REGION Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------------
2: Midwest UTI Urinary Tract Infection 0.024167 0.000529 776972 39084
(Default CCSR=GEN004)
1: Northeast UTI Urinary Tract Infection 0.021364 0.000509 554359 34012
(Default CCSR=GEN004)
3: South UTI Urinary Tract Infection 0.027824 0.000614 1623836 58585
(Default CCSR=GEN004)
4: West UTI Urinary Tract Infection 0.026030 0.000496 702110 34442
(Default CCSR=GEN004)
-------------------------------------------------------------------------------------------------------------------
A comparison of our output from HCUPnet with the output from SAS demonstrates that our results match except for the number of ED visits in the Midwest, which is lower by one in HCUPnet. This is a result of how estimates are rounded in HCUPnet. In some cases, the sum of estimates in HCUPnet for the two ED visit types may differ slightly from estimates obtained directly from the NEDS.
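The comparison can be laid out in a short DATA step. This is a minimal sketch in which the HCUPnet counts and the SAS sums are entered by hand from the tables above; the data set name hcupnet_check and the variable names are illustrative.

```sas
/* admit and tandr come from HCUPnet; sas_sum comes from SURVEYMEANS */
data hcupnet_check;
  length region $9;
  input region $ admit tandr sas_sum;
  hcupnet_sum = admit + tandr;      /* combine the two ED visit types */
  diff = sas_sum - hcupnet_sum;     /* 0 means the two sources agree */
  datalines;
Midwest 97389 679582 776972
Northeast 92261 462098 554359
South 205833 1418003 1623836
West 67451 634659 702110
;
run;

proc print data=hcupnet_check noobs;
run;
```

The diff column is 1 for the Midwest and 0 for the other three regions, which is the rounding difference described above.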
Module 4: Nationwide Emergency Department Sample (NEDS)
You have completed Module 4, Nationwide Emergency Department Sample (NEDS)!
For any questions about the NEDS that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:
- Email: hcup@ahrq.gov
- Phone: 866-290-HCUP (4287) (toll free)
- International users, please contact HCUP User Support by email.
The staff reviews messages daily and usually responds to inquiries within 3 business days.
Return to Contents
Module 5: Nationwide Readmissions Database (NRD)
The Nationwide Readmissions Database, or NRD, is a unique and powerful database designed to support various types of analyses of national readmissions for all patients regardless of the expected payer for the hospital stay.
Information on the NRD is organized by the four sections below. These include:
- Overview
- Weighting the Data
- SAS Code Examples
- Validating Estimates
Additional information about the NRD is available on the NRD Database Documentation page on the HCUP User Support (HCUP-US) website.
Return to Contents
Module 5: Nationwide Readmissions Database (NRD), Overview of the NRD
The Nationwide Readmissions Database (NRD) is a unique and powerful database designed to support various types of analyses of national readmissions for all patients regardless of the expected payer for the hospital stay. The NRD includes discharges for patients with and without repeat hospital visits in a year and those who have died in the hospital. Repeat stays may or may not be related. The criteria to determine the relationship between hospital admissions are left to the analyst using the NRD. Unweighted, the NRD contains data from about 18 million discharges each year. Weighted, it estimates roughly 35 million discharges in the United States.
The NRD is drawn from HCUP State Inpatient Databases (SID) containing verified patient linkage numbers that can be used to track a person across hospitals within a State, while adhering to strict privacy guidelines. The NRD is available annually beginning with data year 2010.
Additional information on the sample design of the NRD is available in the NRD Introduction and the HCUP Sample Design tutorial.
Return to Contents
Module 5: Nationwide Readmissions Database (NRD), Weighting the Nationwide Readmissions Database (NRD)
To produce nationally representative estimates, the NRD data must be weighted. This can be done using the data element discharge weight (DISCWT), which is available on each record in the NRD, with the value varying across records.
When the discharge weights are applied to the NRD data, the result is an estimate of the number of discharges for the target universe, which includes discharges from all community hospitals in the United States, excluding rehabilitation and long-term acute care hospitals. Per the American Hospital Association, or AHA, community hospitals include non-Federal, short-term general, and other specialty hospitals that are open to the public. Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Examples of excluded hospitals include non-Federal long-term care, psychiatric, and Federal hospitals, such as Indian Health Service hospitals.
The NRD is not designed to support regional estimates, because information on U.S. census region is not available.
Weights are developed after discharges sampled from the SID are stratified into counts using five hospital characteristics and two patient characteristics: (1) ownership/control, (2) bed size, (3) teaching status, (4) urban/rural location, (5) the four U.S. census regions, (6) patient age, in groups, and (7) patient sex. Total discharge counts for the target universe are estimated using total discharges from hospitals in the SID and the American Hospital Association (AHA) Survey estimates of discharges (admissions plus births) for hospitals not included in the NRD.
NRD discharge weights are calculated by dividing the number of universe discharges by the number of sampled discharges within each NRD stratum.
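The weight calculation can be illustrated with a short sketch. The two strata and their counts below are hypothetical and are not taken from the NRD; only the overall totals in the second step (18,132,856 sampled discharges and a weighted total of 35,399,480) come from the 2019 NRD output shown in the SAS code example in this module. The data set and variable names are illustrative.

```sas
/* Hypothetical strata: weight = universe discharges / sampled discharges */
data nrd_weight_sketch;
  input stratum universe_n sample_n;
  discwt = universe_n / sample_n;
  datalines;
1 50000 25000
2 36000 20000
;
run;

proc print data=nrd_weight_sketch noobs;
run;

/* Average weight implied by the 2019 NRD totals */
data _null_;
  avg_wt = 35399480 / 18132856;
  put 'Average 2019 NRD discharge weight: ' avg_wt 6.3;
run;
```

In the hypothetical strata the weights are 2.0 and 1.8, so each sampled discharge stands in for about two discharges in the target universe. The average weight implied by the 2019 totals is likewise just under 2.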
Return to Contents
Module 5: Nationwide Readmissions Database (NRD), SAS Code Example
As described earlier in this module, the NRD is designed to support readmission analyses. It is not recommended for use in obtaining total national estimates of discharges in the United States because pairs of transfer records are collapsed into a single record in the NRD. For total national estimates of discharges, the National Inpatient Sample (NIS) should be used instead.
These are two critical components of a readmission analysis:
- Index event: Initial inpatient stay that indicates the starting point for analyzing repeat hospital stays and that is typically defined by specific inclusion and exclusion criteria.
- Readmission: A subsequent inpatient stay within a specified time period; the readmission may be for a specific cause or any cause.
Additional information on defining the index event and readmission is available in the NRD Introduction and the NRD tutorial.
Example SAS Code for Producing National Estimates for Index Events with Principal Diagnosis of Septicemia
This example SAS code determines the weighted number of index events in the 2019 NRD with a principal diagnosis of septicemia, which is based on the default HCUP Clinical Classifications Software Refined (CCSR) for ICD-10-CM diagnosis category, INF002 (Septicemia).
For this example, an index event is defined as follows:
- The patient was discharged between January and November 2019.
- The patient was discharged alive.
- The length of stay was nonmissing.
- The discharge was for a patient aged 1 year or older.
- The patient may be a nonresident of the State.
- And, the patient is allowed to have multiple index events, regardless of how far apart.
This index event definition is consistent with what is used on HCUPnet, which is a free online query tool that provides select precalculated statistics derived from both the State and nationwide HCUP databases. However, it should be noted that users should define the index event (as well as the readmission) based on their own analytic purpose.
Title1 "Produce National Estimate of Index Discharges with Principal or First-Listed Diagnosis of Septicemia";
Title2 "(Default CCSR=INF002) from 2019 NRD File (Weighted)";
Libname NRD2019 "O:\NRD\2019\CD\CDNRD" access=readonly;
Options PS=51 LS=146 ;
data nrd;
merge nrd2019.nrd_2019_core (keep=HOSP_NRD KEY_NRD DISCWT NRD_STRATUM NRD_visitlink NRD_daystoevent AGE LOS DMONTH DIED )
nrd2019.nrd_2019_dx_pr_grps (keep=HOSP_NRD KEY_NRD DXCCSR_Default_DX1)
;
by HOSP_NRD KEY_NRD;
Attrib IndexEvent length=3 label='Index event with DX1 of Septicemia (Default CCSR=INF002)';
if DIED=0 /* not died */
and DMONTH in (1:11) /* Discharged Jan-Nov to allow 30 day follow up */
and not missing(NRD_daystoevent) /* non-missing admission date */
and not missing(LOS) /* non-missing LOS to calculate discharge date */
and age>=1 /* match HCUPnet */
and DXCCSR_Default_DX1='INF002' /* DX1 of interest */ then indexEvent=1;
else IndexEvent=0;
run;
proc surveymeans data=NRD sum mean nomcar;
cluster HOSP_NRD ;
strata NRD_STRATUM;
weight DISCWT;
var IndexEvent;
run;
The first section of this example SAS code includes the DATA step. This step includes the following statements:
- The MERGE statement, which links the NRD Core File with the NRD Diagnosis and Procedure Groups File keeping essential data elements from each file. The NRD Diagnosis and Procedure Groups File includes the default CCSR category for the principal diagnosis or data element DXCCSR_Default_DX1.
- The KEEP= data set options, which are present for each file containing data elements we need for this analysis. This includes data elements necessary for linking the files, weighting the data, and DXCCSR_Default_DX1.
- The ATTRIB statement, which assigns a length and a label to a new data element (IndexEvent). This new data element is defined based on a combination of clinical criteria (default CCSR of INF002, Septicemia, for the principal diagnosis) as well as non-clinical criteria (e.g., patient did not die in the hospital or data element DIED = 0, patient was discharged between January and November 2019 or data element DMONTH has a value within the range of 1 to 11).
The second section of this example SAS code includes the SURVEYMEANS procedure, which accounts for the complex sample design of the NRD. This procedure includes the following statements:
- The CLUSTER statement, which includes the NRD hospital identifier or data element HOSP_NRD.
- The STRATA statement, which includes the NRD stratum identifier or data element NRD_STRATUM.
- The WEIGHT statement, which includes the NRD discharge weight or data element DISCWT.
- The VAR statement, which includes the data element, IndexEvent, that we defined in the DATA step above.
Produce National Estimate of Index Discharges with Principal or First-Listed Diagnosis
of Septicemia (Default CCSR=INF002) from 2019 NRD File (Weighted)
The SURVEYMEANS Procedure
Data Summary
Number of Strata 93
Number of Clusters 2507
Number of Observations 18132856
Sum of Weights 35399480
Variance Estimation
Method Taylor Series
Missing Values NOMCAR
Statistics
Std Error Std Error
Variable Label Mean of Mean Sum of Sum
-------------------------------------------------------------------------------------------------------------
IndexEvent Index event with DX1 of Septicemia 0.052613 0.000736 1862468 32880
(Default CCSR=INF002)
The output for this example SAS code provides the weighted number of index events in the 2019 NRD with a principal diagnosis of septicemia.
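The standard error of the sum can be turned into an approximate 95 percent confidence interval. The sketch below uses the normal-approximation multiplier of 1.96, which is an assumption made here for illustration; the limits reported by SAS itself are based on a t distribution and may differ slightly. The variable names are illustrative.

```sas
data _null_;
  est = 1862468;   /* Sum from the Statistics table */
  se  = 32880;     /* Std Error of Sum from the Statistics table */
  lower = est - 1.96 * se;
  upper = est + 1.96 * se;
  /* Writes the approximate limits to the SAS log */
  put 'Approximate 95 percent CI: ' lower comma12. ' to ' upper comma12.;
run;
```

Under this approximation the interval runs from about 1,798,023 to 1,926,913 index events. Adding the CLSUM keyword to the PROC SURVEYMEANS statement requests confidence limits for the sum directly.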
Return to Contents
Module 5: Nationwide Readmissions Database (NRD), Validating National Estimates
Unlike estimates from the databases covered in the other four modules of this tutorial, estimates from the NRD cannot be validated using the HCUP Summary Statistics, HCUP Diagnosis and Procedure Frequency Tables, or HCUPnet, because the definitions of an index event and a readmission vary depending on the analytic purpose.
Additional information for working with the NRD is available in the NRD Introduction and the NRD Tutorial.
Return to Contents
Module 5: Nationwide Readmissions Database (NRD)
You have completed Module 5, Nationwide Readmissions Database (NRD)!
For any questions about the NRD that cannot be addressed by this tutorial or the database's documentation, consult HCUP User Support:
- Email: hcup@ahrq.gov
- Phone: 866-290-HCUP (4287) (toll free)
- International users, please contact HCUP User Support by email.
The staff reviews messages daily and usually responds to inquiries within 3 business days.
Return to Contents