class: center, middle, inverse, title-slide

.title[
# Workflows and Metrics
]
.subtitle[
## 🚣 💯 📏
]
.author[
### Machine Learning in R
SMaRT Workshops
]
.date[
### Day 2A
Jeffrey Girard
]

---
class: inverse, center, middle

# Workflows

---
# Setup

``` r
# Set up packages
library(tidyverse)
library(tidymodels)
tidymodels_prefer()
```

--

``` r
# Read in and tidy data
titanic <- read_csv("https://tinyurl.com/mlr-titanic") %>%
  mutate(
    survived = factor(survived, levels = c(1, 0)),
    pclass = factor(pclass),
    sex = factor(sex)
  )
```

---
## Data Splitting and Model Setup

``` r
# Create train/test split, stratified by fare
set.seed(2022)
fare_split <- initial_split(data = titanic, prop = 0.8, strata = fare)
```

--

``` r
# Extract training and testing sets
fare_train <- training(fare_split)
fare_test <- testing(fare_split)
```

--

``` r
# Set up model (linear regression using lm)
lm_model <- 
  linear_reg() %>% 
  set_mode("regression") %>% 
  set_engine("lm")
```

---
## Create a simple workflow

.onecol[
- A .imp[workflow] collects various specifications for your ML experiment
- The **model** specifies the algorithm, mode, engine, and tuning steps
- The **preprocessor** specifies the formula and feature engineering steps
]

``` r
fare_wflow <- 
  workflow() %>% 
  add_model(lm_model)
```

--

``` r
fare_wflow
## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: None
## Model: linear_reg()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Linear Regression Model Specification (regression)
## 
## Computational engine: lm
```

---
## Add a formula as a simple preprocessor

.onecol[
- We can use a .imp[formula] as a simple preprocessor (without feature engineering)
]

``` r
fare_wflow <- 
  fare_wflow %>% 
  add_formula(fare ~ pclass + sex + age + sibsp + parch)
```

--

``` r
fare_wflow
## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Formula
## Model: linear_reg()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## fare ~ pclass + sex + age + sibsp + parch
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Linear Regression Model Specification (regression)
## 
## Computational engine: lm
```

.footnote[*Note.* We will soon learn to use "recipes" as more powerful preprocessors]

---
## Fit a model using a workflow

.onecol[
- We can explicitly fit the model to the training data using `fit()`
]

``` r
fare_fit <- fit(fare_wflow, fare_train)
```

--

.scroll.h-3l[
``` r
fare_fit
## ══ Workflow [trained] ══════════════════════════════════════════════════════════
## Preprocessor: Formula
## Model: linear_reg()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## fare ~ pclass + sex + age + sibsp + parch
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## 
## Call:
## stats::lm(formula = ..y ~ ., data = data)
## 
## Coefficients:
## (Intercept)      pclass2      pclass3      sexmale          age        sibsp  
##    92.39601    -71.45150    -81.07618     -6.76926     -0.05083      4.13072  
##       parch  
##    11.74983  
```
]

---
## Make predictions using the fit model

.onecol[
- We can explicitly make predictions in the testing set using `predict()`
]

``` r
fare_pred <- predict(fare_fit, fare_test)
```

--

.scroll.h-3l[
``` r
fare_pred
## # A tibble: 263 × 1
##    .pred
##    <dbl>
##  1 113. 
##  2  82.0
##  3  91.2
##  4  91.1
##  5  83.8
##  6  99.6
##  7  90.9
##  8  90.9
##  9  90.2
## 10  90.3
## # ℹ 253 more rows
```
]
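
--

.onecol[
- As an optional convenience (a sketch, not part of the original materials): `augment()` should bind these predictions onto the testing set for side-by-side inspection
]

``` r
# Sketch: append a .pred column to the testing set,
# keeping predictions next to the observed fare values
fare_aug <- augment(fare_fit, fare_test)
```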
---
## A helpful shortcut

.onecol[
- Or we can do both automatically using `last_fit()` and the "split" object
- This will fit to the (entire) training set and predict the testing set
]

``` r
fare_fit <- last_fit(fare_wflow, split = fare_split)
```

--

``` r
fare_fit
## # Resampling results
## # Manual resampling 
## # A tibble: 1 × 6
##   splits             id               .metrics .notes   .predictions .workflow 
##   <list>             <chr>            <list>   <list>   <list>       <list>    
## 1 <split [1046/263]> train/test split <tibble> <tibble> <tibble>     <workflow>
```

---
## Collecting the predictions

.onecol[
- We can gather the testing set predictions with `collect_predictions()`
]

``` r
fare_pred <- collect_predictions(fare_fit)
```

--

.scroll.h-3l[
``` r
fare_pred
## # A tibble: 263 × 5
##    .pred id                .row  fare .config             
##    <dbl> <chr>            <int> <dbl> <chr>               
##  1 113.  train/test split     2 152.  Preprocessor1_Model1
##  2  82.0 train/test split    10  49.5 Preprocessor1_Model1
##  3  91.2 train/test split    13  69.3 Preprocessor1_Model1
##  4  91.1 train/test split    14  78.8 Preprocessor1_Model1
##  5  83.8 train/test split    20  75.2 Preprocessor1_Model1
##  6  99.6 train/test split    21  52.6 Preprocessor1_Model1
##  7  90.9 train/test split    25 222.  Preprocessor1_Model1
##  8  90.9 train/test split    33 165.  Preprocessor1_Model1
##  9  90.2 train/test split    42  27.7 Preprocessor1_Model1
## 10  90.3 train/test split    45 134.  Preprocessor1_Model1
## # ℹ 253 more rows
```
]

---
## Plotting the predictions (basic)

.pull-left[
``` r
ggplot(
  fare_pred, 
  aes(x = fare, y = .pred)
) + 
  geom_point()
```
]
.pull-right[
![](../figs/fare_pred1.png)
]

---
## Plotting the predictions (advanced)

.pull-left[
``` r
ggplot(
  fare_pred, 
  aes(x = fare, y = .pred)
) + 
  geom_point(alpha = .2) + 
  geom_abline(color = "darkred") + 
  coord_obs_pred() + 
  labs(
    x = "Observed Fare", 
    y = "Predicted Fare"
  )
```
]
.pull-right[
![](../figs/fare_pred2.png)
]

---
## Collecting the performance metrics

.onecol[
- We can gather the testing set performance metrics with `collect_metrics()`
]

``` r
fare_perf <- collect_metrics(fare_fit)
```

--

``` r
fare_perf
## # A tibble: 2 × 4
##   .metric .estimator .estimate .config             
##   <chr>   <chr>          <dbl> <chr>               
## 1 rmse    standard      30.2   Preprocessor1_Model1
## 2 rsq     standard       0.567 Preprocessor1_Model1
```

.footnote[*Note.* We will soon learn how to interpret these metrics.]
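
--

.onecol[
- A sketch of requesting metrics beyond these defaults (the name `fare_metrics` is illustrative): `metric_set()` bundles yardstick metrics into a single function
]

``` r
# Bundle three regression metrics and apply them to the
# collected predictions from the previous slide
fare_metrics <- metric_set(rmse, rsq, mae)
fare_metrics(fare_pred, truth = fare, estimate = .pred)
```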
---
## Live Coding: Putting it all together

.scroll.h-0l[
``` r
# Load packages
library(tidyverse)
library(tidymodels)
tidymodels_prefer()

# Load and tidy data
titanic <- read_csv("https://tinyurl.com/mlr-titanic") %>%
  mutate(
    survived = factor(survived, levels = c(1, 0)),
    pclass = factor(pclass),
    sex = factor(sex)
  )

# Create data splits, stratified by fare
set.seed(2022)
fare_split <- initial_split(data = titanic, prop = 0.8, strata = fare)
fare_train <- training(fare_split)
fare_test <- testing(fare_split)

# Set up model (linear regression using lm)
lm_model <- 
  linear_reg() %>% 
  set_mode("regression") %>% 
  set_engine("lm")

# Set up workflow with simple formula preprocessor
fare_wflow <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_formula(fare ~ pclass + sex + age + sibsp + parch)

# Fit workflow and make predictions using data splits
fare_fit <- last_fit(fare_wflow, split = fare_split)

# Collect predictions and performance metrics
fare_pred <- collect_predictions(fare_fit)
fare_perf <- collect_metrics(fare_fit)
```
]

---
## Live Coding: Modifying it for classification

.scroll.h-0l[
``` r
# Load packages and data
library(tidyverse)
library(tidymodels)
tidymodels_prefer()

titanic <- read_csv("https://tinyurl.com/mlr-titanic") %>%
  mutate(
    survived = factor(survived, levels = c(1, 0)),
    pclass = factor(pclass),
    sex = factor(sex)
  )

# Create data splits, stratified by survived
set.seed(2022)
*surv_split <- initial_split(titanic, prop = 0.8, strata = survived)
surv_train <- training(surv_split)
surv_test <- testing(surv_split)

# Set up model (logistic regression using glm)
glm_model <- 
* logistic_reg() %>% 
* set_mode("classification") %>% 
* set_engine("glm")

# Set up workflow with simple formula preprocessor
surv_wflow <- 
  workflow() %>% 
* add_model(glm_model) %>% 
* add_formula(survived ~ pclass + sex + age + sibsp + parch + fare)

# Fit workflow and make predictions using data splits
surv_fit <- last_fit(surv_wflow, split = surv_split)

# Collect predictions and performance metrics
surv_pred <- collect_predictions(surv_fit)
surv_perf <- collect_metrics(surv_fit)
```
]

---
class: inverse, center, middle

# Performance Metrics

---
class: onecol
## Performance Metrics

.left-column.pt3[
<img src="../figs/target.jpg" width="100%" />
]
.right-column[
**Metrics for Regression**

- .imp[Distance] between predicted and trusted values
- .imp[Correlation] between predicted and trusted values

.pt1[
**Metrics for Classification**
]

- .imp[Confusion matrix] between predicted and trusted classes
- Compare predicted .imp[class probabilities] to trusted classes
]

---
class: onecol
## Default distance metric for regression

.pull-left[
**Root Mean Squared Error (RMSE)**

- Based on squared loss
- Penalizes severe errors more harshly
- Ranges from `\(0\)` to `\(+\infty\)`, lower is better

.pt1[
`$$RMSE=\sqrt{\frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2}$$`
]
]
.pull-right.pt4[
<img src="slides_2a_files/figure-html/rmse-1.png" width="100%" />
]
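
--

A minimal sketch of this formula in base R (assuming `fare_pred` from earlier, with observed `fare` and predicted `.pred`); it should match `yardstick::rmse()`:

``` r
# RMSE computed directly from its definition
sqrt(mean((fare_pred$fare - fare_pred$.pred)^2))
```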
---
class: onecol
## Default correlation metric for regression

**R-Squared (`\(R^2\)` or RSQ)**

- Calculated in ML as the **squared correlation** between the predictions and labels
- Ranges from `\(0\)` to `\(1\)`, higher is better

`$$R^2 = \left(\frac{\text{cov}(y, \hat{y})}{\sigma_y\sigma_{\hat{y}}}\right)^2 = \left(\frac{\sum(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum (y_i-\bar{y})^2}\sqrt{\sum(\hat{y}_i-\bar{\hat{y}})^2}}\right)^2$$`

--

.bg-light-yellow.b--light-red.ba.bw1.br3.pl4[
+ RSQ is a measure of *consistency* (i.e., linear association) and not distance
+ RSQ can become unstable or undefined when data variability is low
+ RSQ can become unstable when applied in small samples (e.g., test sets)
]

---
class: twocol
## Default confusion matrix metric for classification

|                                                  | Truth = `\(+\)`<br /> `\((y=1)\)` | Truth = `\(-\)`<br /> `\((y=0)\)` |
|:------------------------------------------------:|:-------------------------:|:-------------------------:|
| **Predicted = `\(+\)`** <br /> `\((\hat{y}=1)\)` | True Positive (TP)        | False Positive (FP)       |
| **Predicted = `\(-\)`** <br /> `\((\hat{y}=0)\)` | False Negative (FN)       | True Negative (TN)        |

.footnote[*Note.* There are many, many confusion matrix metrics to choose from.]

--

.mt3.pull-left[
`$$\text{Accuracy} = \frac{TN + TP}{TN + FN + FP + TP}$$`

.tc.mt4[
Ranges from `\(0\)` to `\(1\)`, higher is better
]
]

--

.mt3.pull-right[
`$$\text{Sensitivity}=\frac{TP}{TP+FN}=\frac{TP}{P}$$`

`$$\text{Specificity}=\frac{TN}{TN+FP}=\frac{TN}{N}$$`
]

---
## Live Coding: Classification metrics

.scroll.h-0l[
``` r
# Define a set of classification metrics to estimate
surv_metrics <- metric_set(accuracy, sens, spec, roc_auc)

# Add that metric set to the fitting function
surv_fit <- last_fit(surv_wflow, split = surv_split, metrics = surv_metrics)

# Collect and print testing set metrics
surv_perf <- collect_metrics(surv_fit)
surv_perf

# Calculate raw confusion matrix
surv_cm <- 
  collect_predictions(surv_fit) %>% 
  conf_mat(truth = survived, estimate = .pred_class)
surv_cm

# Plot confusion matrix
autoplot(surv_cm, type = "mosaic")
autoplot(surv_cm, type = "heatmap")

# Calculate all confusion matrix metrics
summary(surv_cm)
```
]

---
class: onecol
## Default probability metric for classification

- When a classifier outputs class probabilities, we can choose any **decision threshold**
- We might naturally consider any probability over 50% positive and all others negative
- But we could choose a more conservative threshold (e.g., 75%) or a more liberal one (e.g., 25%)

--

.pt1[
- **Performance curves** plot the characteristics of different decision thresholds
- This gives us an overview of how the classification system performs *overall*
]

--

.pt1[
- A popular option is the **Receiver Operating Characteristic** (ROC) Curve
- The **area under the curve (AUC)** condenses the ROC Curve into a single metric
]

---
class: onecol
## Interpreting a ROC curve and its AUC

.pull-left[
Each point on a ROC curve corresponds to a possible decision threshold

- Sensitivity is `\(TP / P\)`
- Specificity is `\(TN / N\)`

Better curves are closer to the top-left corner

The area under the ROC curve (AUC) ranges from `\(0.5\)` to `\(1.0\)`, higher is better

*ROC AUC is the probability that a random positive example has a higher estimate than a random negative example.*
]
.pull-right[
<img src="slides_2a_files/figure-html/rocex-1.png" width="95%" />
]

---
## Live Coding: ROC Curves

.scroll.h-0l[
``` r
# Examine predictions
surv_pred

# Examine outcome level ordering
levels(surv_pred$survived)

# Calculate per-threshold performance
surv_roc <- roc_curve(
  data = surv_pred,
  truth = survived,
  .pred_1
)
surv_roc

# Plot the ROC curve
autoplot(surv_roc)
```
]
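
--

.onecol[
- A sketch of condensing the curve into its AUC with `roc_auc()` (assuming `surv_pred` as collected above, with `survived` leveled so that `1` is the first/event level)
]

``` r
# Compute the area under the ROC curve from the same predictions
roc_auc(surv_pred, truth = survived, .pred_1)
```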
---
class: twocol
## Comprehension Check \#1

<span style="font-size:30px;">Bindi trains Model [A] to predict how many kilometers each bird will migrate this year and Model [B] to predict whether or not it will reproduce this year.</span>

.pull-left[
**1. Which combination of performance metrics would be appropriate to use?**

a) Accuracy for [A] and [B]

b) RMSE for [A] and [B]

c) Accuracy for [A] and RMSE for [B]

d) RMSE for [A] and Accuracy for [B]
]

.pull-right[
**2. Which combination of performance scores should Bindi hope to see?**

a) RSQ = 0.10 and ROC AUC = 0.04

b) RSQ = 0.45 and ROC AUC = 0.92

c) RSQ = 0.10 and ROC AUC = 0.92

d) RSQ = 0.45 and ROC AUC = 0.04
]

---
class: inverse, center, middle

# Time for a Break!