Recode Religion from GSS.

Recodes religious identification from the General Social Survey (GSS) based on three variables: relig, denom, and other. It can recode the respondent's religious identification, or any other religious identification that is determined by a corresponding set of three variables.

recode_religion(relig, denom, other, n_groups = 12,
  add_missing_levels = FALSE, frequencies = TRUE, print_key = FALSE,
  return_num = FALSE)

Arguments

relig, denom, other	Numeric, character, or factor vectors, all of the same length, with punches or labels corresponding to those in the codebook.
n_groups	Number of new religious identifications; currently 12.
add_missing_levels	Logical; whether to include, as empty levels, religious identifications that are part of the recoding schema but may not be present in the specific sample.
frequencies	Logical; whether to print a frequency and percent table of the recoded religious identification (default is `TRUE`).
print_key	Logical; whether to print all unique tetrads of the recoded variables, i.e. a recoding key.
return_num	Logical; whether to return a numerical factor and print the codebook.

Value

A vector with religion recoded from relig, denom, and other. The function does not return NA; instead, missing values appear as the factor levels "Not answered" and "Don't know", or as a combined "Not answered/Don't know" level when missing values are not declared as punches or labels in the initial variables but are passed on as NA (the function gives a message saying where the NAs are located). By default, the result is a factor with 12 descriptive levels that contains only the levels present in the data; passing TRUE to add_missing_levels adds the remaining levels of the schema as empty levels. The function also prints a frequency table of the newly recoded religious identification, which can be suppressed with frequencies. If required, it can instead return a numerical vector and print the coding for it (not recommended).
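The difference between returning only the levels present in the data and adding empty schema levels, as well as turning NA into an explicit level, can be illustrated with base R factors. This is only a sketch of the general idea and not the package's implementation: the values in `recoded` and the four-level `schema_levels` vector are hypothetical stand-ins for the real 12-level schema.

```r
# Hypothetical recoded values; only three of the schema's groups occur here.
recoded <- c("Baptist", "None", NA, "Baptist", "Jewish")

# Illustrative subset of a schema (assumption: not the package's full list).
schema_levels <- c("Baptist", "Jewish", "Mormon", "None")

# Keep only levels that occur in the data
# (analogous in spirit to add_missing_levels = FALSE).
present_only <- factor(recoded, levels = intersect(schema_levels, recoded))

# Keep every schema level, including empty ones
# (analogous in spirit to add_missing_levels = TRUE).
with_empty <- factor(recoded, levels = schema_levels)

# Turn NA into an explicit, named level instead of leaving it missing.
explicit_na <- addNA(with_empty)
levels(explicit_na)[is.na(levels(explicit_na))] <- "Not answered/Don't know"

levels(present_only)
table(with_empty)
table(explicit_na)
```

Empty levels matter mostly when tables from several samples need to line up row by row, which is presumably why the option exists.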

Details

recode_religion uses the schema developed by Darren E. Sherkat and Derek Lehman in "After The Resurrection: The Field of the Sociology of Religion in the United States", and is effectively a translation of that SPSS syntax (the bare-bones recoding function is fct_rec_relig), with additional functionality.

Namely, it can handle punches and labels at the same time (though in different variables), which is important because punches are not consecutive and so cannot be used as indexes. In addition, the function checks that the variables are adequate (i.e. that all values are in the codebook) and of the same length. It also handles missing values: (1) if they are supplied through values, it provides detailed recoding; (2) if they are NA, it lumps them together in the final variable but still uses them correctly in the recoding. Through the arguments passed, one can:

Add identifications from the schema that are not present in the sample as empty levels.
Suppress printing of the frequencies of the newly recoded variable.
Print a unique key of the values that were recoded.
Return values as a numerical factor, in which case the codebook for the new variable will be printed.
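To see why non-consecutive punches cannot be used as indexes, and what a check that all values are in the codebook might look like, consider the base R sketch below. The `codebook` vector is an illustrative, hypothetical subset, and `check_values` is a made-up helper; neither is taken from the package's internals.

```r
# Illustrative codebook (hypothetical subset, not the package's own table).
# Note the gaps: there is no punch 3 here, and 98/99 are far from 1, 2, 4.
codebook <- c("1" = "Protestant", "2" = "Catholic", "4" = "None",
              "98" = "Don't know", "99" = "No answer")

punches <- c(1, 4, 98, 2, NA)

# Positional indexing goes wrong: punch 4 picks the 4th label,
# and punch 98 falls outside the vector and yields NA.
by_position <- unname(codebook)[punches]

# Looking up by name maps each punch to its own label; NA stays NA.
by_name <- unname(codebook[as.character(punches)])

# Hypothetical helper: stop if any non-missing value is not in the codebook.
check_values <- function(x, codes) {
  vals <- as.character(x)
  unknown <- setdiff(vals[!is.na(vals)], codes)
  if (length(unknown) > 0) {
    stop("Values not in codebook: ", paste(unknown, collapse = ", "))
  }
  invisible(TRUE)
}

check_values(punches, names(codebook))  # passes silently
by_position
by_name
```

A named lookup vector is one simple way to do this; a join against a codebook data frame would work equally well for larger codebooks.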

If frequencies is FALSE and a numerical vector is not requested as the return value, all other information, such as the treatment of missing values, is provided as messages, which can be suppressed.
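In R, output produced with message() can be silenced by wrapping the call in suppressMessages(). The sketch below uses a hypothetical stand-in function, `recode_stub`, rather than recode_religion itself, so it runs without the package or its data; the assumption is that the package's notes behave like ordinary R messages, as described above.

```r
# Hypothetical stand-in for a function that reports via message().
recode_stub <- function(x) {
  if (anyNA(x)) message("Some of the values are NA and will be merged.")
  ifelse(is.na(x), "Not answered/Don't know", x)
}

values <- c("Baptist", NA, "None")

# Prints the note about NA values.
noisy <- recode_stub(values)

# Same result, with the note silenced.
quiet <- suppressMessages(recode_stub(values))

identical(noisy, quiet)
```

Under that assumption, a call such as suppressMessages(recode_religion(relig, denom, other, frequencies = FALSE)) would be expected to run silently.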

A future version will provide recoding to 7 levels. More details can be found on github.mdjeric.

Examples

library(resurrectionr)

# When all variables are factors
gss14_f$religion <- recode_religion(gss14_f$relig, gss14_f$denom,
                                    gss14_f$other, frequencies = FALSE)

# When all variables are numeric
gss14_n$religion <- recode_religion(gss14_n$relig, gss14_n$denom,
                                    gss14_n$other,
                                    add_missing_levels = TRUE)
#> Some of the variables contain NA: `Don't know` and `NA`will be merged. Please see documentation for more details.
#> * `relig` recoded from punches to labels; and 'NA' introduced.
#> * `denom` recoded from punches to labels; and 'NA' introduced.
#> * `other` recoded from punches to labels; and 'NA' introduced.
#> Distribution of religious identification, in your data of 2538 is:
#>                           Freq Relative  Cumul
#> Catholic or Orthodox       615    24.23  24.23
#> None                       522    20.57  44.80
#> Baptist                    324    12.77  57.57
#> Christian, no group given  302    11.90  69.46
#> Moderate Protestant        206     8.12  77.58
#> Sectarian Protestant       178     7.01  84.59
#> Lutheran                    92     3.62  88.22
#> Other religion              88     3.47  91.69
#> Liberal Protestant          77     3.03  94.72
#> Jewish                      40     1.58  96.30
#> Episcopalian                39     1.54  97.83
#> Mormon                      32     1.26  99.09
#> Don't know                   0     0.00  99.09
#> No answer                    0     0.00  99.09
#> <NA>                        23     0.91 100.00

# Combining factor, numeric, and character variables also works
religion <- recode_religion(gss14_f$relig, gss14_n$denom,
                            as.character(gss14_f$other))
#> * `denom` recoded from punches to labels; and 'NA' introduced.
#> Distribution of religious identification, in your data of 2538 is:
#>                           Freq Relative  Cumul
#> Catholic or Orthodox       615    24.23  24.23
#> None                       522    20.57  44.80
#> Baptist                    324    12.77  57.57
#> Christian, no group given  319    12.57  70.13
#> Moderate Protestant        206     8.12  78.25
#> Sectarian Protestant       178     7.01  85.26
#> Lutheran                    92     3.62  88.89
#> Other religion              88     3.47  92.36
#> Liberal Protestant          77     3.03  95.39
#> Jewish                      40     1.58  96.97
#> Episcopalian                39     1.54  98.50
#> Mormon                      32     1.26  99.76
#> Don't know                   3     0.12  99.88
#> <NA>                         3     0.12 100.00
