data("credit_data")
set.seed(55)
train_test_split <- initial_split(credit_data)
credit_train <- training(train_test_split)
credit_test <- testing(train_test_split)
We specify a recipe by providing formula and data arguments. Check out Tidy Modeling with R to learn more about specifying formulas in R.
The recipe() function returns a recipe object. The formula argument determines the role of each variable. Status is assigned the role of outcome, and the 13 other variables are assigned the role of predictor.
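The call that creates rec_obj is not shown in the text. A minimal sketch of what it might look like follows, assuming the formula Status ~ . (Status as the outcome, every other column as a predictor) and credit_train as the data argument. Both choices are inferred from the description above and the printed output, not taken from the text itself.

```r
library(recipes)    # recipe()
library(rsample)    # initial_split(), training()
library(modeldata)  # credit_data

data("credit_data")

# Recreate the training set the same way as above.
set.seed(55)
train_test_split <- initial_split(credit_data)
credit_train <- training(train_test_split)

# Assumed call: the left-hand side of the formula becomes the outcome,
# and `.` stands for all remaining columns, which become predictors.
rec_obj <- recipe(Status ~ ., data = credit_train)
```

Printing rec_obj afterwards gives the summary of roles.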
rec_obj
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 13
Diving a bit deeper, the recipe object is a list with 8 elements. Within these elements, we can see that more details are saved about our variables. These include the type and source of each variable, stored in rec_obj$var_info.
cat(names(rec_obj), sep = "\n")
#> var_info
#> term_info
#> steps
#> template
#> levels
#> retained
#> requirements
#> ptype
rec_obj$var_info
#> # A tibble: 14 × 4
#>    variable  type      role      source
#>    <chr>     <list>    <chr>     <chr>
#>  1 Seniority <chr [2]> predictor original
#>  2 Home      <chr [3]> predictor original
#>  3 Time      <chr [2]> predictor original
#>  4 Age       <chr [2]> predictor original
#>  5 Marital   <chr [3]> predictor original
#>  6 Records   <chr [3]> predictor original
#>  7 Job       <chr [3]> predictor original
#>  8 Expenses  <chr [2]> predictor original
#>  9 Income    <chr [2]> predictor original
#> 10 Assets    <chr [2]> predictor original
#> 11 Debt      <chr [2]> predictor original
#> 12 Amount    <chr [2]> predictor original
#> 13 Price     <chr [2]> predictor original
#> 14 Status    <chr [3]> outcome   original
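The type column is a list-column, so its contents are hidden in the printed tibble. The sketch below shows one way to look inside it and to count variables by role. It builds the recipe on the full credit_data set purely to stay self-contained, which is an assumption of this example. The exact type labels depend on the installed version of recipes, so no output is shown.

```r
library(recipes)
library(modeldata)

data("credit_data")

# Assumed for illustration: a recipe on the full data set.
rec_obj <- recipe(Status ~ ., data = credit_data)

var_info <- rec_obj$var_info

# Count the variables assigned to each role.
table(var_info$role)

# Name the entries of the type column by variable so they are easy to look up.
types <- setNames(var_info$type, var_info$variable)
types[["Status"]]  # type labels for the outcome
types[["Age"]]     # type labels for a numeric predictor

# summary() on a recipe returns a similar per-variable overview.
summary(rec_obj)
```

Accessing var_info directly is convenient for exploration, while summary() is the documented way to get the same overview.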