Standardizing AHS variables

Has anyone attempted to standardize the AHS national survey variables across 1973-2003 (or some subset of years)?  I’ve recoded about 150 of the variables by first extracting them from the ASCII files with SAS, converting the SAS files to Stata, and then standardizing the variables through recoding.  Some examples of such standardization are:

 

1) The hown variable tracks respondents’ satisfaction with their neighborhood.  Prior to 1984, the variable ranges from 1 to 4, where 1 is best and 4 is worst.  After that, the range becomes 1-10, with 10 expressing the most satisfaction.  So I have converted the pre-1984 values to their post-1984 analogs (e.g. 1 became 10, 2 became 7,…).

 

2) The ejunk variable encodes whether litter/junk was observed near the responding unit.  In 1982 and 1983, the value 1 means no trash was observed and 4 means a “heavy accumulation” was observed.  In 1984-2003, the value 1 means a “major accumulation” was observed and 3 means no trash was observed.  So for records in 1982-1983, 1 was translated to 3 and 4 to 1.
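
As an illustration only (this is not the Stata code offered in this post), the two recodes above could be sketched in Python as year-dependent lookup tables. The function and table names are hypothetical, and only the value pairs stated above are filled in; the remaining pairs are deliberately left out, so unmapped values come back as None for review. The cutoff "before 1984" for hown is an assumption based on the wording above.

```python
# Hypothetical recode tables; only the pairs stated in the post are included.
HOWN_PRE_1984 = {1: 10, 2: 7}    # mappings for 3 and 4 are not given in the post
EJUNK_1982_1983 = {1: 3, 4: 1}   # mappings for 2 and 3 are not given in the post


def recode(value, year, table, applies):
    """Return the standardized value, or None when no mapping is known."""
    if not applies(year):
        return value          # year already uses the later coding scheme
    return table.get(value)   # None flags a value that still needs a mapping


def standardize_hown(value, year):
    # Assumption: the 1-4 scale applies to survey years before 1984.
    return recode(value, year, HOWN_PRE_1984, lambda y: y < 1984)


def standardize_ejunk(value, year):
    # The reversed scale is described only for 1982 and 1983.
    return recode(value, year, EJUNK_1982_1983, lambda y: y in (1982, 1983))
```

Returning None instead of guessing keeps undocumented mappings visible, which makes it easier to compare one set of recodings against another.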

 

Some more elaborate conversions have also been performed.  I would be interested in comparing my recodings with anyone else’s for quality assurance purposes.  Please let me know if you have done similar work before and would be willing to share your code.  Also, if anyone would like a copy of my code, let me know and I’ll send it.

 

Thank you, and happy New Year,

Ilya Beylin

 

PS:  Often, the conversion couldn’t be perfect.  For example, the phrasings of the questions and of the answer options vary from year to year.  Sometimes a numeric variable is reported in bins in some years and as exact values in others (when this happens I use the bin mean).  And there are numerous other problems with standardizing such a dataset.  I am aware that such problems exist and that they allow for a (limited) variety of reasonable standardizations.
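
One possible reading of the bin-mean step, sketched in Python for illustration: take the years where exact values are available, average the exact values that fall in each bin, and assign that average to binned records. The bin edges, bin codes, and function names here are hypothetical, and the post does not say whether "bin mean" means this data-based average or simply the bin midpoint.

```python
from statistics import mean

# Hypothetical bins: code -> (lower bound inclusive, upper bound exclusive)
BINS = {1: (0, 100), 2: (100, 200), 3: (200, 300)}


def bin_means(exact_values, bins=BINS):
    """Mean of the exact values falling in each bin; empty bins are omitted."""
    means = {}
    for code, (lo, hi) in bins.items():
        inside = [v for v in exact_values if lo <= v < hi]
        if inside:
            means[code] = mean(inside)
    return means


def impute(binned_codes, means):
    """Replace each bin code with its bin mean; None if the bin had no data."""
    return [means.get(code) for code in binned_codes]
```

For example, `impute([1, 2, 3], bin_means([50, 70, 150]))` gives a value for bins 1 and 2 and None for bin 3, since no exact value falls in that range.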