Data analysis 5 (Recoding Variables)

5. Recoding variables

command [action]
5.1. recode varlist (rule) [ changes the values of numeric variables according to the rules specified; alters data ]
5.2. rename old_varname new_varname [ changes the name of an existing variable ]
5.3. generate newvar = exp [ creates a new variable; its values are specified by = exp ]
5.4. replace varname = exp [ changes the contents of an existing variable; alters data ]
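
As a quick illustration of the four commands, here is a minimal sketch using Stata's built-in example data set "auto" (loaded with sysuse). The new variable names price_k and price_thousands are made up for this example and are not part of the survey data used below.

    * Load the example data shipped with Stata
    sysuse auto, clear

    * generate: create a new variable from an expression
    generate price_k = price/1000

    * replace: overwrite the contents of an existing variable
    replace price_k = round(price_k)

    * rename: give the variable a more descriptive name
    rename price_k price_thousands

    * recode: collapse the 5-point repair record into 3 groups (alters rep78)
    recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3)

Because recode and replace overwrite values in memory, it is usually safer to work on a copy of a variable, which is the approach taken with gender2 below.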

Now that you know how to load data, we can begin manipulating variables. First, let's load some data. This is data from my survey about immigration attitudes. It includes a Disgust Sensitivity Scale. To load the data from the web, type:

    use "https://eddie-hearn.github.io/teaching/ZEM/data/dss-scale.dta"

Now that the data is loaded, you can view the details of the data set. Type:

    describe

We can see that the data set consists of an id variable, some demographic variables (age, gender, education), 5 immigration attitude questions (immig_attitude*), and 6 Disgust Sensitivity Scale items (dss*). First, let's consider the "gender" variable. Let's generate a new variable called gender2 that is equal to gender.

    generate gender2 = gender

Since male is coded 1 and female is coded 2, let's recode the new variable so that female is 0 and male stays 1.

    recode gender2 (2=0)

Finally, we can give our variable a better name. Since 1 is male, we can name our variable "male".

    rename gender2 male
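
It is worth checking that the recode did what we intended. A minimal sketch: cross-tabulate the original variable against the new one. Every observation with gender equal to 1 should have male equal to 1, and every observation with gender equal to 2 should have male equal to 0.

    * Compare the original and recoded variables, including missing values
    tabulate gender male, missing

The same result can also be reached in one step with the generate() option of recode, which leaves the original variable untouched. The name male_alt is only an illustration.

    * One-step alternative: recode into a new variable
    recode gender (2=0), generate(male_alt)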

Next, we have 5 questions about support for immigration. We can add these questions together to make an index of immigration attitudes, but we have to be careful. Two of the variables (immig_attitude2 and immig_attitude4) code negative attitudes as high scores, while the other variables code positive attitudes as high scores. First, we need to invert the 2 negative questions.

    sum immig_*

    replace immig_attitude2 = immig_attitude2*-1

    sum immig_attitude2

    replace immig_attitude2 = immig_attitude2+6

    sum immig_attitude2

    replace immig_attitude4 = (immig_attitude4*-1)+6

    sum immig_*
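
Why multiply by -1 and add 6? The +6 suggests the items run from 1 to 5, so the two steps together compute 6 minus the old value: 1 becomes 5, 2 becomes 4, 3 stays 3, and so on. The sketch below demonstrates this on a small made-up variable x, assuming a 1 to 5 scale. It uses preserve and restore so that the survey data in memory is put back afterwards.

    * Set the survey data aside temporarily
    preserve
    clear

    * Toy variable with the values 1 to 5 (made up for illustration)
    input x
    1
    2
    3
    4
    5
    end

    * Same arithmetic as above: reversed value = (max + min) - old value
    generate x_rev = (x*-1) + 6
    list

    * Bring the survey data back
    restore

Note that the replace commands above should be run only once. Running them a second time would flip the items back to their original direction.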

Now we can generate a simple additive index of immigration attitudes named "support".

    generate support = immig_attitude1 + immig_attitude2 + immig_attitude3 + immig_attitude4 + immig_attitude5
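
One thing to keep in mind: when variables are added with +, the result is missing for any respondent who skipped even one of the items. The following sketch shows one way to check for this, and an alternative that uses egen. The names n_missing and support_alt are made up for this example, and the wildcard immig_attitude* assumes that only the 5 attitude items match that pattern.

    * Count how many of the 5 items each respondent left unanswered
    egen n_missing = rowmiss(immig_attitude*)
    tabulate n_missing

    * Alternative index: rowtotal() treats missing items as 0;
    * the missing option returns missing only if all items are missing
    egen support_alt = rowtotal(immig_attitude*), missing

    * Compare the two versions of the index
    summarize support support_alt

Which version is preferable depends on the analysis. Treating a skipped item as 0 lowers the total for that respondent, so the stricter + version is often the safer default.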

See if you can make an additive index of the dss items. Call your new variable "DSS". What do you think predicts individual immigration attitudes? We can run a simple model looking at the effect of age, gender, education, and DSS:

    reg support male educ age DSS

According to our model, what is the best predictor of immigration attitude?
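
A hint for answering this: the raw coefficients are hard to compare because the predictors are measured on different scales (years of age versus a 0/1 variable, for example). One common approach, shown here as a sketch and assuming DSS has been created as described above, is to ask for standardized coefficients with the beta option of regress.

    * Same model, reporting standardized (beta) coefficients
    regress support male educ age DSS, beta

The Beta column expresses each effect in standard deviation units, so a larger absolute value points to a stronger association. It is also worth looking at the p-values to see which effects are distinguishable from zero.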