I have a data frame with survey questions in the format “Do you think that [xxxxxxx]?”, where the possible answers are one of the following:
- “I am certain that [xxxxxxx]”
- “I think it is possible that [xxxxxx]”
- “I don’t know if [xxxxxx]”
- “I think it is not possible that [xxxxxx]”
- “I am certain that it is not possible that [xxxxxx]”
- “It is impossible for me to know if [xxxxxx]”
I would like to recode these factors so that “I am certain” = 1, “I think it is possible” = 2 and so on.
I have tried using dplyr::recode, but I am unable to use regular expressions.
Example data:
set.seed(12345)
possible_answers <- c(
  "I am certain that", "I think it is possible that",
  "I don't know if is possible that", "I think it is not possible that",
  "I am certain that it is not possible that", "It is impossible for me to know if"
)
num_answers <- 10
survey <- data.frame(
  Q1 = paste(
    sample(possible_answers, num_answers, replace = TRUE),
    "topic 1"
  ),
  Q2 = paste(
    sample(possible_answers, num_answers, replace = TRUE),
    "topic 2"
  ),
  Q3 = paste(
    sample(possible_answers, num_answers, replace = TRUE),
    "topic 3"
  ),
  Q4 = paste(
    sample(possible_answers, num_answers, replace = TRUE),
    "topic 4"
  ),
  Q5 = paste(
    sample(possible_answers, num_answers, replace = TRUE),
    "topic 5"
  )
)
I would like to recode the survey questions using dplyr::recode and regular expressions.
library(dplyr)  # provides %>%, mutate_at, vars, starts_with and recode

# Attempt: use the answer prefixes as regular-expression names in recode
survey %>%
  mutate_at(vars(starts_with("Q")), recode,
    "I am certain that (.*)" = 1,
    "I think it is possible that (.*)" = 2,
    "I don't know if is possible that (.*)" = 3,
    "I think it is not possible that (.*)" = 4,
    "I am certain that it is not possible that (.*)" = 5,
    "It is impossible for me to know if (.*)" = 6)
However, this changes everything to NA, because recode compares the names to the values as literal strings and does not treat them as regular expressions.
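For reference, here is a minimal sketch of one direction that might work, assuming it is acceptable to replace recode with dplyr::case_when and base grepl. The helper name recode_answer and the small demo data frame are made up for illustration and are not part of dplyr or of my real data. The longer "I am certain that it is not possible that" prefix is tested first, because it also starts with "I am certain that".

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): map an answer to its code by
# testing anchored regular expressions in order; the first match wins.
recode_answer <- function(x) {
  case_when(
    # Test the longer prefix first: it also begins with "I am certain that".
    grepl("^I am certain that it is not possible that", x) ~ 5,
    grepl("^I am certain that", x) ~ 1,
    grepl("^I think it is possible that", x) ~ 2,
    grepl("^I don't know if", x) ~ 3,
    grepl("^I think it is not possible that", x) ~ 4,
    grepl("^It is impossible for me to know if", x) ~ 6,
    TRUE ~ NA_real_  # anything unmatched stays NA
  )
}

# Small stand-in for the survey data frame (assumed values, for illustration).
demo <- data.frame(
  Q1 = c("I am certain that topic 1",
         "I am certain that it is not possible that topic 1"),
  Q2 = c("I don't know if is possible that topic 2",
         "It is impossible for me to know if topic 2"),
  stringsAsFactors = FALSE
)

# Apply the helper to every column whose name starts with "Q".
recoded <- demo %>%
  mutate_at(vars(starts_with("Q")), recode_answer)
```

The same call should apply to the example data as survey %>% mutate_at(vars(starts_with("Q")), recode_answer). I would still prefer a solution that keeps the pattern-to-code mapping in one place, as recode does.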