What is a recipe?

library(tidymodels)
library(measure)
data("credit_data")

set.seed(55)
train_test_split <- initial_split(credit_data)

credit_train <- training(train_test_split)
credit_test <- testing(train_test_split)

Creating a Recipe

We specify a recipe by providing formula and data arguments. See Tidy Modeling with R to learn more about specifying formulas in R.

rec_obj <- recipe(Status ~ ., data = credit_train)

The recipe function returns a recipe object. The formula argument determines the role of each variable: Status is assigned the role of outcome, and the 13 other variables are assigned the role of predictor.

rec_obj
#> 
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#> 
#> ── Inputs
#> Number of variables by role
#> outcome:    1
#> predictor: 13
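
The role assignment can also be checked programmatically. The following is a minimal sketch, assuming the full credit_data set stands in for credit_train; the names rec_sketch, role_tbl, and role_counts are illustrative and not part of the original example.

```r
library(recipes)
data("credit_data", package = "modeldata")

# Assumption: the full data set stands in for credit_train to keep this short.
rec_sketch <- recipe(Status ~ ., data = credit_data)

# summary() returns one row per variable with its type, role, and source.
role_tbl <- summary(rec_sketch)

# Count how many variables hold each role.
role_counts <- table(role_tbl$role)
role_counts
```

The counts should agree with the "Number of variables by role" section of the printed recipe.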

Diving a bit deeper, the recipe object is a list with 8 elements. These elements store more details about our variables, including the type and source of each variable in rec_obj$var_info.

cat(names(rec_obj), sep = "\n")
#> var_info
#> term_info
#> steps
#> template
#> levels
#> retained
#> requirements
#> ptype
rec_obj$var_info
#> # A tibble: 14 × 4
#>    variable  type      role      source  
#>    <chr>     <list>    <chr>     <chr>   
#>  1 Seniority <chr [2]> predictor original
#>  2 Home      <chr [3]> predictor original
#>  3 Time      <chr [2]> predictor original
#>  4 Age       <chr [2]> predictor original
#>  5 Marital   <chr [3]> predictor original
#>  6 Records   <chr [3]> predictor original
#>  7 Job       <chr [3]> predictor original
#>  8 Expenses  <chr [2]> predictor original
#>  9 Income    <chr [2]> predictor original
#> 10 Assets    <chr [2]> predictor original
#> 11 Debt      <chr [2]> predictor original
#> 12 Amount    <chr [2]> predictor original
#> 13 Price     <chr [2]> predictor original
#> 14 Status    <chr [3]> outcome   original
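
Because the type column is a list-column, the printed tibble only shows the length of each entry. One way to read the actual type labels is sketched below, assuming a recipes version that stores type as a list and again using the full credit_data set; rec_sketch, var_tbl, and type_labels are illustrative names.

```r
library(recipes)
data("credit_data", package = "modeldata")

# Assumption: the full data set stands in for credit_train.
rec_sketch <- recipe(Status ~ ., data = credit_data)
var_tbl <- rec_sketch$var_info

# Collapse each entry of the type list-column into one readable string.
type_labels <- vapply(var_tbl$type, paste, character(1), collapse = " / ")
names(type_labels) <- var_tbl$variable

# Inspect a numeric predictor, a nominal predictor, and the outcome.
type_labels[c("Seniority", "Home", "Status")]
```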

Adding a Step

The recipe does not yet contain any steps, so rec_obj$steps is NULL.

rec_obj$steps
#> NULL

Piping the recipe into a step function, here step_impute_knn(), appends that step to the recipe's list of steps.

rec_obj_add_step <- rec_obj %>%
  step_impute_knn(all_predictors())

rec_obj_add_step$steps
#> [[1]]
#> • K-nearest neighbor imputation for: all_predictors()
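
Adding a step only records it in the recipe; nothing is estimated or applied to the data yet. As a hedged sketch of what typically comes next, the block below trains the step with prep() and applies it with bake(). It assumes the full credit_data set in place of the train/test split, and the names rec_knn, rec_prepped, and baked are illustrative.

```r
library(recipes)
data("credit_data", package = "modeldata")

# Assumption: the full data set is used for both training and baking.
rec_knn <- recipe(Status ~ ., data = credit_data) %>%
  step_impute_knn(all_predictors())

# The step is recorded but not yet trained.
length(rec_knn$steps)

# prep() estimates what the step needs from the training data.
rec_prepped <- prep(rec_knn, training = credit_data)

# bake() applies the trained step to a data set.
baked <- bake(rec_prepped, new_data = credit_data)

# Compare missing values among the predictors before and after imputation.
predictors <- setdiff(names(credit_data), "Status")
n_missing_before <- sum(is.na(credit_data[predictors]))
n_missing_after <- sum(is.na(baked[predictors]))
c(before = n_missing_before, after = n_missing_after)
```

In the workflow above, the same pattern would use credit_train in prep() and credit_test in bake(), so that the imputation is learned only from the training data.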