Disability Analysis File Public Use File

DAF PUF Overview

The Disability Analysis File (DAF) is an analytical file consisting of agency administrative data in an easy-to-use format. We create a new version of the file and documentation each year. The file contains historical, longitudinal, and one-time data on all children and pre-retirement adults with disabilities who participated in the Supplemental Security Income (SSI) or Social Security Disability Insurance (SSDI) programs at any time between 1996 and the year of the file. Each DAF is an updated version of all prior DAF files, so users should use the most recent file available.

The DAF PUF contains a random 10 percent sample of beneficiaries included in the full DAF. The 10 percent file is large enough to avoid disclosure risk and small enough to keep the file size manageable. The PUF contains a more limited set of variables than the full DAF, and we masked some variables in a variety of ways to avoid disclosure. Please consult the DAF-PUF21 Overview and Documentation below for additional details on the DAF variables.

The current version of the DAF PUF is the DAF21 PUF, with data through the end of 2021.

You can find the DAF PUF data documentation and data files below. If you have questions about the DAF PUF, please contact ORDES.DAF@ssa.gov.

Suggested Citations for the DAF PUF

Users of the DAF PUF should cite the data files and/or documentation using the following citations, updated to include the date the files were accessed and the years of data used:

  1. For the DAF PUF Demographic file (<YYYY> is the year accessed):

    U.S. Social Security Administration, Office of Retirement and Disability Policy, Office of Research, Demonstration, and Employment Support. (<YYYY>). Disability Analysis File (DAF) Public Use File (PUF), DAF PUF Demographic File [Data set]. Retrieved from https://www.ssa.gov/disabilityresearch/daf_puf.html#files. Accessed on <DATE>.

  2. For the DAF PUF Annual files (<YYYY> is the year accessed):

    U.S. Social Security Administration, Office of Retirement and Disability Policy, Office of Research, Demonstration, and Employment Support. (<YYYY>). Disability Analysis File (DAF) Public Use File (PUF), DAF PUF Annual File(s) (<DATA YEARS USED>) [Data set]. Retrieved from https://www.ssa.gov/disabilityresearch/daf_puf.html#files. Accessed on <DATE>.

  3. For DAF documentation (<YYYY> is documentation publication date):

    U.S. Social Security Administration, Office of Retirement and Disability Policy, Office of Research, Demonstration, and Employment Support. (<YYYY>). Overview and Documentation of the Social Security Administration's Disability Analysis File Public Use File for 2021 [Data file and code book]. Retrieved from https://www.ssa.gov/disabilityresearch/daf_puf.html#documentation.

DAF PUF Documentation

Overview and Documentation of the Social Security Administration's Disability Analysis File (DAF) Public Use File for 2021 (PUF21)

This volume contains:

  • An introduction to the DAF
  • How the DAF PUF differs from the full DAF
  • Information on the structure of the PUF
  • Information on variables in the PUF, including how variables were masked, lists of variables, and detailed information on each variable
  • Additional information on the PUF

DAF PUF Files

The DAF PUF has two components, the Demographic file and the Annual files. These files contain one record per beneficiary and are linkable using the unique identifier (PUFPIN) in each file. The files are available in SAS, Stata, and CSV formats. Please note that the CSV files will not open completely in Excel because they exceed the Excel row limit.

Over time, we may add additional files and variables to the PUF. The documentation for future versions of the PUF will describe any updates.
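
Because the CSV files exceed what Excel can display, one option is to stream them with a scripting language instead. The following is a minimal Python sketch, not part of the official DAF tooling. It counts records without loading the file into memory and compares the count with the 1,048,576-row worksheet limit of current Excel versions. The sample rows and the MFT_PUF values in them are made up for illustration.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit in current Excel versions


def count_records(handle):
    """Count data rows in an open CSV file without loading it into memory."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)


def fits_in_excel(n_records):
    """True if the header plus n_records rows fit on one Excel worksheet."""
    return n_records + 1 <= EXCEL_MAX_ROWS


# Tiny in-memory stand-in for a PUF CSV file (illustrative values only).
sample = io.StringIO("PUFPIN,MFT_PUF\n1001,A\n1002,B\n1003,A\n")
n = count_records(sample)
print(n, fits_in_excel(n))  # prints: 3 True
```

For a real file, pass a handle from open(path, newline="") in place of the in-memory sample.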

DAF Code Library

To make the DAF more efficient and easier to use, we have developed SAS code for common analytical tasks run on DAF files. Researchers can use and modify this code as needed. We designed the code library for use with the full DAF; however, it may help users get started with code for the PUF. The DAF Users' Code Library currently includes code to complete the following tasks:

  • Determine whether a beneficiary is in current pay for either SSDI or SSI within a user-specified time period;
  • Categorize impairment codes into the groupings used in our published statistics;
  • Determine whether we have suspended or terminated a beneficiary due to work within a user-specified time period; and
  • Reorder variables suffixed 1-n into a chronological order.

We expect the DAF Users' Code Library to grow over time, so please check back periodically.
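
To illustrate the last task in the list above, here is a minimal Python sketch of reordering variables suffixed 1-n into chronological order. It is not the library's SAS code. It assumes each occurrence is a value/date pair stored under a shared root name, that dates have already been parsed into date objects, and that empty occurrences are stored as None. The record contents are made up.

```python
from datetime import date


def reorder_chronologically(record, value_root, date_root, n_max):
    """Return a copy of record with value/date pairs 1..n_max sorted by date.

    Occurrences without a date are treated as empty and moved to the end.
    """
    pairs = []
    for i in range(1, n_max + 1):
        when = record.get(f"{date_root}{i}")
        value = record.get(f"{value_root}{i}")
        if when is not None:
            pairs.append((when, value))
    pairs.sort(key=lambda pair: pair[0])  # earliest date first

    out = dict(record)
    for i in range(1, n_max + 1):
        if i <= len(pairs):
            out[f"{date_root}{i}"], out[f"{value_root}{i}"] = pairs[i - 1]
        else:
            out[f"{date_root}{i}"] = None
            out[f"{value_root}{i}"] = None
    return out


# Illustrative record with two occurrences stored out of order.
record = {
    "PUFPIN": "1001",
    "DR_PUF1": "B", "DD_PUF1": date(2010, 6, 15),
    "DR_PUF2": "A", "DD_PUF2": date(2005, 3, 15),
    "DR_PUF3": None, "DD_PUF3": None,
}
fixed = reorder_chronologically(record, "DR_PUF", "DD_PUF", 3)
print(fixed["DR_PUF1"], fixed["DD_PUF1"])  # prints: A 2005-03-15
```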

DAF Research Solutions

These fact sheets illustrate how the DAF has been used to support research and answer questions about our disability beneficiary population. Although this research used the full DAF, it may be helpful in generating ideas for how to use the PUF.

DAF Research Solutions 1

This fact sheet describes how the DAF was useful in an analysis by Ben-Shalom and Stapleton (2012), who sought to better understand the long-term program participation and employment patterns of adult SSI recipients following benefit award.

Frequently Asked Questions

What is the most recent year of data available in the DAF PUF?

The most recent data available is the DAF21 PUF, which includes data through December 2021.

How often does SSA update the DAF PUF?

Each year, we build a new version of the DAF to add records for beneficiaries who began participating in SSDI or SSI during the most recent year and to update records for beneficiaries who enrolled in an earlier year and appear in previous versions of the DAF. The current version of the DAF PUF is the DAF21 PUF. Users should use the most recent version available.

Who is included in the DAF PUF?

The DAF21 includes all children and pre-retirement adults who have received one or more SSDI or SSI disability payments at any time between March 1996 and December 2021.

The DAF PUF contains a random 10 percent sample of beneficiaries included in the full DAF.

How does the DAF PUF differ from the full DAF?

The DAF PUF contains a random 10 percent sample of beneficiaries included in the DAF. Instead of SSNs, the PUF records have a unique random identifier (PUFPIN) for each beneficiary, allowing researchers to link across files in a given version of the PUF.

Relative to the DAF, the PUF contains a more limited set of variables. We selected the variables in the DAF PUF to be those of broad interest to researchers examining SSI and SSDI beneficiaries with disabilities. To minimize disclosure risk, the DAF PUF includes limited information on participation in employment services, including Ticket to Work (TTW), and payments to employment service providers on behalf of beneficiaries. Over time, we may add additional variables to the DAF PUF.

We also masked data in a variety of ways to avoid disclosure. These methods include collapsing categories of certain categorical variables, recoding all days to the 15th of the month for date variables, and rounding and top-coding dollar values. Please see the DAF-PUF21 Overview and Documentation for more details on variable recoding.
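
The date and dollar masking methods described above can be sketched in a few lines of Python. This is only an illustration of the general techniques. The rounding unit (100) and the top-code cap (5000) are hypothetical parameters chosen for the example, not the values used to build the PUF; the documentation gives the actual recoding rules for each variable.

```python
from datetime import date


def mask_day(d):
    """Recode the day of a date to the 15th of the same month."""
    return d.replace(day=15)


def round_and_top_code(amount, unit, cap):
    """Round a dollar amount to the nearest multiple of unit, then cap it.

    Ties follow Python's built-in round(), which rounds halves to even.
    """
    rounded = unit * round(amount / unit)
    return min(rounded, cap)


print(mask_day(date(1987, 3, 2)))            # prints: 1987-03-15
print(round_and_top_code(1234, 100, 5000))   # prints: 1200
print(round_and_top_code(9876, 100, 5000))   # prints: 5000
```

One consequence for analysis is that masked dates only support month-level comparisons, and masked dollar values at the cap should be read as "at least this amount."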

What is the structure of the DAF PUF?

The DAF PUF has two components, the Demographic file and the Annual files. Each file contains one record per beneficiary and is linkable to the other files using the unique identifier variable, PUFPIN.
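
As a rough sketch of how the PUFPIN link works in practice, the Python example below joins a Demographic extract to one Annual extract using only the standard library. The column EXAMPLE_AMT, the date format, and all values are hypothetical stand-ins. PUFPIN and DOEI_PUF are the only names taken from the PUF.

```python
import csv
import io


def index_by_pufpin(handle):
    """Read a CSV file into a dict keyed by PUFPIN (one record per beneficiary)."""
    return {row["PUFPIN"]: row for row in csv.DictReader(handle)}


def link(demographic, annual):
    """Inner-join two PUFPIN-keyed dicts, merging columns of matching records."""
    return {
        pin: {**demographic[pin], **annual_row}
        for pin, annual_row in annual.items()
        if pin in demographic
    }


# In-memory stand-ins for the two files (illustrative values only).
demo_csv = io.StringIO("PUFPIN,DOEI_PUF\n1001,1990-01-15\n1002,2004-07-15\n")
annual_csv = io.StringIO("PUFPIN,EXAMPLE_AMT\n1002,700\n")

linked = link(index_by_pufpin(demo_csv), index_by_pufpin(annual_csv))
print(sorted(linked))  # prints: ['1002']
```

Only beneficiary 1002 appears in the result because an Annual file holds records only for beneficiaries with data in that year. With the full-size files, reading only the needed columns before indexing keeps memory use down.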

The Demographic component is a beneficiary-level file that contains information on benefit status and amount, as well as demographic and other one-time information such as date of birth, date of death, and information collected at the time of disability application.

The Demographic component has two main types of variables: one-time variables and "n" variables. One-time variables include data such as date of birth, SSI file type (MFT_PUF), or Date of Initial Entitlement (DOEI_PUF). These variables reflect the latest information shown in the SSA administrative file used. Many of these variables, such as DOEI_PUF, will show dates going back several decades. Because the DAF includes all beneficiaries who received benefits in any month since 1996, many beneficiaries included in the DAF started receiving benefits well before 1996.

Like one-time variables, "n" variables reflect the latest information shown in the SSA administrative file used and can show dates going back several decades. Unlike one-time variables, "n" variables can record multiple occurrences. Assuming Var is the root variable name, Var1 will be the first occurrence, Var2 the second, and so on. Most "n" variables have a value variable (e.g., status for occurrence n) paired with a corresponding date variable for that occurrence. For example, DR_PUFn (n=1-5) (Reason code for medical re-examination) can have up to 5 occurrences, one for each continuing disability review. DR_PUF1 aligns with DD_PUF1, DR_PUF2 with DD_PUF2, and so on, with each DD_PUFn showing the Medical Continuing Disability Review Date associated with the corresponding "n" decision code.

The DAF PUF Demographic file contains more than 3 million records.
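
The pairing of value and date variables for "n" variables can be made concrete with a short Python sketch that unpacks one beneficiary's DR_PUFn and DD_PUFn columns into a list of occurrences. It assumes that unused occurrences appear as empty strings (as they might in a CSV file) or as None. The reason codes and dates shown are made up.

```python
def occurrences(record, value_root, date_root, n_max):
    """List (n, value, date) for each populated occurrence 1..n_max."""
    empty = (None, "")
    result = []
    for n in range(1, n_max + 1):
        value = record.get(f"{value_root}{n}")
        when = record.get(f"{date_root}{n}")
        if value not in empty or when not in empty:
            result.append((n, value, when))
    return result


# Illustrative record with two of the five possible reviews populated.
record = {
    "PUFPIN": "1001",
    "DR_PUF1": "X", "DD_PUF1": "2003-04-15",
    "DR_PUF2": "Y", "DD_PUF2": "2009-11-15",
    "DR_PUF3": "", "DD_PUF3": "",
}
reviews = occurrences(record, "DR_PUF", "DD_PUF", 5)
print(len(reviews))  # prints: 2
```

Working from a list like this makes it simple to count reviews per beneficiary or to pick the earliest or latest one.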

The Annuals component is a set of beneficiary-level files, one per calendar year from 1994 to 2021, containing monthly and yearly longitudinal data related to program participation and benefits. Such data include benefit amounts due, SSA region where benefits were received, participation in the Ticket to Work program, and information related to beneficiary income. Although we only include beneficiaries who received benefits from 1996 onward in the DAF, the Annuals contain historical data on those beneficiaries going back to 1994. Since program participation changes over time, the size of each Annual file differs, but all are smaller than the Demographic file because not all beneficiaries received benefits in every year.

Which file should I start with?

While some research projects may require combining data across multiple files, you may be able to complete your research using one file or a small subset of files. Identifying the best file to start with may help streamline your research plan.

In general, start with the Demographic component for research questions that require identifying a group of beneficiaries based on their characteristics. Start with the Annuals component for questions that require identifying beneficiaries at a point in time. For more guidance on how to use the files, see Table I.1 in Volume 2: Working with the DAF20.

When merging multiple years of the Annual files together, you may want to consider limiting the number of variables to keep the file size manageable.
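
One way to limit variables while combining years is to drop unneeded columns as each file is read. The Python sketch below stacks several Annual extracts into one long list and tags each row with its year. The column names EXAMPLE_AMT and EXAMPLE_REGION and the added YEAR field are hypothetical, and the sketch assumes the kept columns carry the same name in every year, which may not hold for the actual files.

```python
import csv
import io


def read_subset(handle, keep):
    """Yield rows from a CSV file restricted to the columns in keep."""
    for row in csv.DictReader(handle):
        yield {name: row[name] for name in keep}


def stack_years(files_by_year, keep):
    """Combine Annual files into one long list, tagging each row with its year."""
    combined = []
    for year in sorted(files_by_year):
        for row in read_subset(files_by_year[year], keep):
            row["YEAR"] = year
            combined.append(row)
    return combined


# In-memory stand-ins for two Annual files (illustrative values only).
files = {
    2021: io.StringIO("PUFPIN,EXAMPLE_AMT,EXAMPLE_REGION\n1001,710,3\n"),
    2020: io.StringIO("PUFPIN,EXAMPLE_AMT,EXAMPLE_REGION\n1001,700,3\n1002,500,5\n"),
}
rows = stack_years(files, keep=["PUFPIN", "EXAMPLE_AMT"])
print(len(rows), rows[0])
```

Each row keeps only PUFPIN, the selected variable, and the year, so the combined data grows with the number of beneficiary-years rather than with the full width of every Annual file.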

Where can I find additional information?

The DAF-PUF21 Overview and Documentation provides additional information about the DAF PUF as well as detailed information on each PUF variable. In addition, users should consult Volumes 1 through 3 of the full DAF documentation for more information on working with the DAF and tips for conducting analyses using the DAF.

If you have questions after consulting the documentation, please contact ORDES.DAF@ssa.gov.