I’m trying to find a way to create a new variable called “Race” in a dataset with multiple demographic variables (White, Asian, SouthAfrican, Hispanic, WestAsian, and PreferNotToAnswer). The values in the separate columns should be merged into the new “Race” variable, and if a respondent has selected multiple races, the Race variable should be coded as “Mixed”.
Here is the code for replicating the dataset:
ID <- 1:8
White <- c("White", NA, NA, NA, NA, NA, "White", "White")
Asian <- c(NA, NA, NA, NA, NA, "Asian", NA, "Asian")
SouthAfrican <- c(NA, "SouthAfrican", NA, NA, NA, NA, NA, "SouthAfrican")
Hispanic <- c(NA, NA, NA, NA, "Hispanic", "Hispanic", NA, NA)
WestAsian <- c(NA, NA, NA, NA, NA, NA, "WestAsian", NA)
PreferNotToAnswer <- c(NA, NA, "PreferNotToAnswer", "PreferNotToAnswer", NA, NA, NA, NA)
df <- data.frame(ID, White, Asian, SouthAfrican, Hispanic, WestAsian, PreferNotToAnswer)
The following code should create the desired “Race” variable:
# Create Race variable
df$Race <- NA
# Set Race variable to "White" when "White" is present
df$Race[df$White == "White"] <- "White"
# Set Race variable to "Asian" when "Asian" is present
df$Race[df$Asian == "Asian"] <- "Asian"
# Set Race variable to "SouthAfrican" when "SouthAfrican" is present
df$Race[df$SouthAfrican == "SouthAfrican"] <- "SouthAfrican"
# Set Race variable to "Hispanic" when "Hispanic" is present
df$Race[df$Hispanic == "Hispanic"] <- "Hispanic"
# Set Race variable to "WestAsian" when "WestAsian" is present
df$Race[df$WestAsian == "WestAsian"] <- "WestAsian"
# Set Race variable to "PreferNotToAnswer" when "PreferNotToAnswer" is present
df$Race[df$PreferNotToAnswer == "PreferNotToAnswer"] <- "PreferNotToAnswer"
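# Set Race variable to "Mixed" when more than one race column is filled in.
# Count non-missing cells with !is.na(): comparing strings to 0 yields NA for
# missing cells, so rowSums() would return NA and no row would be recoded.
df$Race[rowSums(!is.na(df[, 2:6])) > 1] <- "Mixed"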
The result is a new “Race” variable with values from the separate columns merged into one. If a respondent has selected multiple races, the Race variable is coded as “Mixed”.
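As a more compact alternative, the same recoding can be sketched in base R by counting the non-missing race columns per row and taking the first non-missing value. This is only a sketch under assumptions: the names race_cols, all_cols, n_races, and first_choice are made up for illustration, and PreferNotToAnswer is assumed not to count toward “Mixed” (matching the use of columns 2:6 above).

```r
# Rebuild the example data so the sketch runs on its own
df <- data.frame(
  ID = 1:8,
  White = c("White", NA, NA, NA, NA, NA, "White", "White"),
  Asian = c(NA, NA, NA, NA, NA, "Asian", NA, "Asian"),
  SouthAfrican = c(NA, "SouthAfrican", NA, NA, NA, NA, NA, "SouthAfrican"),
  Hispanic = c(NA, NA, NA, NA, "Hispanic", "Hispanic", NA, NA),
  WestAsian = c(NA, NA, NA, NA, NA, NA, "WestAsian", NA),
  PreferNotToAnswer = c(NA, NA, "PreferNotToAnswer", "PreferNotToAnswer", NA, NA, NA, NA)
)

# Hypothetical helper names (not from the original post)
race_cols <- c("White", "Asian", "SouthAfrican", "Hispanic", "WestAsian")
all_cols <- c(race_cols, "PreferNotToAnswer")

# Number of race options each respondent selected (NA means "not selected")
n_races <- rowSums(!is.na(df[, race_cols]))

# First non-missing value per row, or NA if nothing was selected
first_choice <- unname(apply(df[, all_cols], 1, function(x) x[!is.na(x)][1]))

# More than one race selected -> "Mixed", otherwise the single selected value
df$Race <- ifelse(n_races > 1, "Mixed", first_choice)
```

One difference from the step-by-step version: a respondent who selected one race plus PreferNotToAnswer gets the race here rather than PreferNotToAnswer. Moving PreferNotToAnswer to the front of all_cols would reverse that.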