Recodes religious identification from the General Social Survey based on three variables: relig, denom, and other. It can recode the respondent's identification, or any other religious identification that is determined by the corresponding three variables.

recode_religion(relig, denom, other, n_groups = 12,
  add_missing_levels = FALSE, frequencies = TRUE, print_key = FALSE,
  return_num = FALSE)

Arguments

relig, denom, other

Numeric, character, or factor vectors, all of the same length, containing the corresponding punches or labels from the codebook.

n_groups

Number of new religious identifications; currently 12.

add_missing_levels

Logical; whether to include, as empty levels, religious identifications that are part of the recoding schema but may not be present in the specific sample.

frequencies

Logical; whether to print a frequency and percent table of the recoded religious identification (default is TRUE).

print_key

Logical; whether to print all unique tetrads of the recoded variables, i.e. the recoding key.

return_num

Logical; whether to return a numerical factor and print its codebook.

Value

A vector with religion recoded from relig, denom, and other. The function does not return NA; missing values appear as the factor levels "Not answered" and "Don't know", or as a combined "Not answered/Don't know" level when missing values are not declared as punches or labels in the initial variables but are passed as NA (the function gives a message stating where the NAs are located). By default the result is a factor with 12 descriptive levels that contains only the levels present in the data; passing TRUE to add_missing_levels adds the remaining levels of the schema as empty levels. The function also prints a frequency table of the newly recoded religious identification, which can be suppressed with frequencies. If required, it can instead return a numerical vector and print the coding for it (not recommended).
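The difference between returning only the present levels and adding empty ones can be sketched in base R. This is an illustration of the general factor mechanism, not the package's internal code; the names schema_levels and observed, and the four-level subset, are assumptions made for the example.

```r
# Hypothetical subset of the schema's levels, for illustration only
schema_levels <- c("Catholic or Orthodox", "None", "Baptist", "Mormon")

# Hypothetical recoded values in which "Mormon" does not occur
observed <- c("None", "Baptist", "None", "Catholic or Orthodox")

# Only levels present in the data (analogous to add_missing_levels = FALSE)
present_only <- factor(observed)

# All schema levels, including the empty one (analogous to add_missing_levels = TRUE)
with_empty <- factor(observed, levels = schema_levels)

# The empty level shows up in the table with a count of zero
table(with_empty)
```

Keeping empty levels is mainly useful when tables from different samples need to line up row by row.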

Details

recode_religion uses the schema developed by Darren E. Sherkat and Derek Lehman in "After The Resurrection: The Field of the Sociology of Religion in the United States", and is effectively a translation of their SPSS syntax (the bare-bones recoding function is fct_rec_relig), with additional functionality.

Namely, it can handle both punches and labels at the same time (but in different variables), which is important because punches are not consecutive and so cannot serve as indexes. In addition, the function checks that the variables are adequate (i.e. that all values are in the codebook) and of the same length. It also handles missing values: (1) if they are supplied as values, it provides detailed recoding; (2) if they are NA, it lumps them together in the final variable but still uses them correctly in the recoding. Through the passed arguments, one can:

  1. Add identifications from the schema that are not present in the sample as empty levels.

  2. Suppress printing of the frequencies of the newly recoded variable.

  3. Print the unique key of the values that were recoded.

  4. Return values as a numerical factor, in which case the codebook for the new variable will be printed.
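The point about punches not being consecutive can be illustrated with a small base R sketch. The codebook below is hypothetical and is not taken from the GSS codebook or from the package; it only shows why a punch has to be matched by value rather than used as a position.

```r
# Hypothetical codebook: punches (as names) mapped to labels
codebook <- c("1" = "Protestant", "2" = "Catholic", "98" = "Don't know")

# Hypothetical numeric punches as they might appear in a data set
punches <- c(2, 98, 1, 2)

# Positional indexing treats 98 as "the 98th element", which does not exist
by_position <- unname(codebook[punches])

# Matching on the punch value itself recovers the intended label
by_value <- unname(codebook[as.character(punches)])

by_position
by_value
```

Here the positional lookup happens to work for punches 1 and 2 only because they coincide with their positions; the punch 98 is silently turned into NA.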

If frequencies is FALSE and a numerical vector is not requested as the return value, all other information, such as the treatment of missing values, is provided as messages that can be suppressed.
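Messages of this kind can generally be silenced with base R's suppressMessages(). The sketch below uses a hypothetical stand-in function, recode_demo, because the exact messages emitted by recode_religion depend on the data; it is assumed here only that they are raised with message().

```r
# Hypothetical stand-in for a function that reports details via message()
recode_demo <- function(x) {
  if (anyNA(x)) {
    message("Some values are NA and will be merged.")
  }
  ifelse(is.na(x), "Missing", x)
}

# Prints the message to the console
noisy <- recode_demo(c("None", NA))

# Same result, with the message suppressed
quiet <- suppressMessages(recode_demo(c("None", NA)))
```

Under that assumption, wrapping the call to recode_religion in suppressMessages() in the same way would hide the notes about NAs and recoded punches.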

Future versions will provide recoding to 7 levels. More details can be found on github.mdjeric.

Examples

library(resurrectionr)

# When all variables are factor
gss14_f$religion <- recode_religion(gss14_f$relig, gss14_f$denom, gss14_f$other,
                                    frequencies = FALSE)

# When all variables are numeric
gss14_n$religion <- recode_religion(gss14_n$relig, gss14_n$denom, gss14_n$other,
                                    add_missing_levels = TRUE)
#> Some of the variables contain NA: `Don't know` and `NA`will be merged. Please see documentation for more details.
#> * `relig` recoded from punches to labels; and 'NA' introduced.
#> * `denom` recoded from punches to labels; and 'NA' introduced.
#> * `other` recoded from punches to labels; and 'NA' introduced.
#> Distribution of religious identification, in your data of 2538 is:
#>                           Freq Relative  Cumul
#> Catholic or Orthodox       615    24.23  24.23
#> None                       522    20.57  44.80
#> Baptist                    324    12.77  57.57
#> Christian, no group given  302    11.90  69.46
#> Moderate Protestant        206     8.12  77.58
#> Sectarian Protestant       178     7.01  84.59
#> Lutheran                    92     3.62  88.22
#> Other religion              88     3.47  91.69
#> Liberal Protestant          77     3.03  94.72
#> Jewish                      40     1.58  96.30
#> Episcopalian                39     1.54  97.83
#> Mormon                      32     1.26  99.09
#> Don't know                   0     0.00  99.09
#> No answer                    0     0.00  99.09
#> <NA>                        23     0.91 100.00

# But also, combining them works
religion <- recode_religion(gss14_f$relig, gss14_n$denom,
                            as.character(gss14_f$other))
#> * `denom` recoded from punches to labels; and 'NA' introduced.
#> Distribution of religious identification, in your data of 2538 is:
#>                           Freq Relative  Cumul
#> Catholic or Orthodox       615    24.23  24.23
#> None                       522    20.57  44.80
#> Baptist                    324    12.77  57.57
#> Christian, no group given  319    12.57  70.13
#> Moderate Protestant        206     8.12  78.25
#> Sectarian Protestant       178     7.01  85.26
#> Lutheran                    92     3.62  88.89
#> Other religion              88     3.47  92.36
#> Liberal Protestant          77     3.03  95.39
#> Jewish                      40     1.58  96.97
#> Episcopalian                39     1.54  98.50
#> Mormon                      32     1.26  99.76
#> Don't know                   3     0.12  99.88
#> <NA>                         3     0.12 100.00