class: center, middle, inverse, title-slide

.title[
# Data Splitting and Validation
]
.subtitle[
## 👥 📊️ 🪓
]
.author[
### Machine Learning in R
SMaRT Workshops
]
.date[
### Day 1C Shirley Wang
]

---
class: inverse, center, middle
# Overview

---
class: onecol
## Motivation

There are **several steps** in the machine learning workflow.

These include data exploration, engineering, model development, and evaluation.

We have a .imp[finite amount of data] to work with for each of these tasks.

--

<p style="padding-top:30px;">Reusing the same data for multiple tasks increases the risk of .imp[overfitting].

---
class: onecol
count: false
## (a brief reminder about overfitting)

<img src="data:image/png;base64,#../figs/overfitting_lay.png" width="100%" />

---
class: onecol
count: false
## Motivation

There are **several steps** in the machine learning workflow.

These include data exploration, engineering, model development, and evaluation.

We have a .imp[finite amount of data] to work with for each of these tasks.

<p style="padding-top:30px;">Reusing the same data for multiple tasks increases the risk of .imp[overfitting].

--

It also reduces **generalizability** to future datasets or prediction problems.

We may not get a **true understanding** of our model's accuracy or effectiveness.

---
class: onecol
## A Data Budget

.left-column.pv3[
<img src="data:image/png;base64,#../figs/spending.jpg" width="100%" />
]

.right-column[
Rather than using all available data at once, in ML we need to think about **budgeting** our data for different tasks.

This is especially important when moving from **model development** into **model evaluation**.

Importantly, in smaller datasets, we might have to spend the same data on multiple tasks. We need strong methods for doing so while minimizing the risk of overfitting our models!
]

---
class: onecol
## Avoiding Data Leakage

.left-column.pv3[
<img src="data:image/png;base64,#../figs/leak.jpg" width="100%" />
]

.right-column[
All datasets contain a mix of the true *data-generating process* and random noise.<sup>1</sup>

The modeling process includes many decision points, and there is a risk of .imp[leaking information] from model training to evaluation.

This can artificially **inflate optimism about model accuracy**.

Carefully spending our data for model training vs. model evaluation can help prevent .imp[data leakage].
]

.footnote[
[1] *Data-generating process* refers to the signal we want to capture: the underlying processes that give rise to the data that we observe. *Random noise* refers to error (e.g., sampling error) we do not want to model.
]

---
class: inverse, center, middle
# Countering Overfitting

---
class: onecol
## Cross-Validation

There are some clever algorithmic tricks to prevent overfitting

- For example, we can penalize the model for adding complexity

--

The main approach, however, is to use .imp[cross-validation]:

--

- Multiple **fully independent** sets of data are created (by subsetting or resampling)

- Some sets are used for training (and tuning) and other sets are used for testing

- **Model evaluation is always done on data that were not used to train the model**

- This way, if performance looks good, we can worry less about variance/overfitting

--

.bg-light-yellow.b--light-red.ba.bw1.br3.ph4[
**Caution:** We still need to consider whether the original data were representative!
]

---
## Holdout Set Validation

<img src="data:image/png;base64,#../figs/holdout.png" width="100%" />

---
## Holdout Set Validation

<img src="data:image/png;base64,#../figs/holdout2.png" width="100%" />

---
## Comprehension Check \#1

<span style="font-size:30px;">Bogdan collects data from 1000 patients.
He assigns patients 1 to 800<br />to be in his training set and patients 700 to 1000 to be in his testing set.</span>

.pull-left[
### Question 1
**What major mistake did Bogdan make?**

a) He used a testing set instead of a holdout set

b) Some patients are in both training and testing

c) The two subsets of data have different sizes

d) He does not have enough data
]

.pull-right[
### Question 2
**Which step should not be done in the training set?**

a) Exploratory Analysis

b) Feature Engineering

c) Model Development

d) Model Evaluation
]

---
class: inverse, center, middle
## Holdout Set Validation in R

---
## Example Dataset: `titanic`

.pull-left[
**Rows:** 1309 passengers<br />
**Columns:** 7 variables

Variable | Description
:------- | :----------
survived | Did passenger survive? {FALSE, TRUE}
pclass | Passenger class {1st, 2nd, 3rd}
sex | Passenger sex {female, male}
age | Passenger age (years)
sibsp | Siblings and spouses aboard (\#)
parch | Parents and children aboard (\#)
fare | Cost of passenger fare ($)
]

.pull-right.pv4[
``` r
library(readr)
library(tidymodels)

# resolve package conflicts
tidymodels_prefer()

# load titanic data
titanic <- read_csv(
  "https://tinyurl.com/mlr-titanic"
)
```
]

---
class: onecol
## Simple Random Split

The simplest way to perform **holdout set validation** is by randomly splitting the data.

One portion of the data is allocated for model training, and another for testing.

Deciding on the proportion of data to split is very context-dependent.

--

<p style="padding-top:30px;">Too little data in the training set makes it difficult to **accurately estimate parameters**.

Too little data in the test set reduces the **quality of model performance estimates**.

--

.bg-light-green.b--dark-green.ba.bw1.br3.ph4[
**Advice:** It is common to use 80% for training and 20% for testing.
]

---
class: onecol
## Simple Random Split in R

.left-column[
<br />
<img src="data:image/png;base64,#../figs/rsample.png" width="100%" />
]

.right-column[
We will use the {rsample} package from {tidymodels}.

The `initial_split()` function was built for this purpose.

Its simplest use case requires 2 arguments:

- `data`: the dataframe to be split.

- `prop`: the proportion of data to be used for training.
]

---
class: onecol
## Simple Random Split in R

``` r
# set random seed so results are reproducible
set.seed(2022)

# create and save an 80/20 data split
titanic_split_simple <- initial_split(data = titanic, prop = 0.80)
titanic_split_simple
#> <Training/Testing/Total>
#> <1047/262/1309>
```

--

<p style="padding-top:30px;">This output shows the amount of training, testing, and total data.

The resulting `initial_split` object **only contains partitioning information**.

---
class: onecol
## Simple Random Split in R

To get the actual training and testing data subsets, we need two more functions.

We will use the `training()` and `testing()` functions, also from {rsample}.

--

``` r
# create training and testing sets
titanic_train_simple <- training(titanic_split_simple)
titanic_test_simple <- testing(titanic_split_simple)

dim(titanic_train_simple)
#> [1] 1047    7
```

``` r
dim(titanic_test_simple)
#> [1] 262   7
```

---
class: onecol
## Stratified Sampling

While simple random sampling is sometimes sufficient, there are often exceptions.

**Class imbalance**<sup>1</sup> (classification) and highly **skewed data** (regression) may cause issues.

Most machine learning algorithms work best when classes are of equal size.

.footnote[
[1] Class imbalance refers to a categorical variable in which one class occurs much more frequently than others (e.g., a dataset with 90% healthy cases and 10% disease cases).
]

--

<p style="padding-top:30px;">
Random sampling may result in vastly **different distributions** between data subsets.
This can result in poor model performance (particularly for the **minority class**).

It is important for data subsets to be representative of the whole dataset.

---
class: onecol
## Stratified Sampling

As a solution, .imp[stratified sampling] can be used instead of simple random sampling.

For classification problems, we conduct **data splits separately within each class**.

For regression problems, we **bin the outcome into quartiles** and sample within each.

--

<p style="padding-top:30px;">These **stratified samples are combined** together for overall training and testing sets.

This ensures that the **distribution of the outcome variable** is preserved.

--

.bg-light-yellow.b--light-red.ba.bw1.br3.ph4[
**Caution:** We still need to ensure the sample distribution matches the population!
]

---
class: onecol
## Stratified Sampling in R

To create a stratified random sample, we can use the same `initial_split()` function.

We will use the `strata` argument to name the variable to stratify on.

--

``` r
set.seed(2022)

# create initial stratified split
titanic_split_strat <- initial_split(data = titanic, prop = 0.8, strata = 'survived')

# create training and testing sets
titanic_train_strat <- training(titanic_split_strat)
titanic_test_strat <- testing(titanic_split_strat)
```

---
class: inverse, center, middle
# Advanced Considerations

---
class: onecol
## What about Multilevel Data?

.left-column[
<br />
<img src="data:image/png;base64,#../figs/multilevel.png" width="100%" />
]

--

.right-column[
Observations in multilevel data are not .imp[independent].

**Random resampling of rows** may cause biased train/test sets<sup>1</sup>.

Therefore, when working with multilevel data, resampling should occur at the **level of the experimental unit**. In this example, resampling should be at the level of **hospitals**.

We can use the `group_initial_split()` function in {rsample} to create an initial split of multilevel data.
]

.footnote[
[1] For example, some patients from hospital 1 may end up in the training set and others in the testing set.
]

---
class: onecol
## Splitting Multilevel Data in R

The `group_initial_split()` function has three main arguments:

- `data`: dataset

- `group`: group to split on (e.g., participants in an EMA study)

- `prop`: proportion of data to use for training

--

``` r
# load EMA data
ema <- readr::read_csv("https://tinyurl.com/mlr-ema")

# initial split, keeping all rows from each participant together
ema_split <- group_initial_split(ema, group = "id", prop = 0.8)

# create training and testing datasets
ema_train <- training(ema_split)
ema_test <- testing(ema_split)
```

---
class: onecol
## Splitting Multilevel Data in R

.pull-left[
``` r
# training data
ema_train %>% select(id) %>% unique()
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:350px; "><table class="table" style="font-size: 25px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> id </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> user_1 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> </tr> <tr> <td style="text-align:left;"> user_3 </td> </tr> <tr> <td style="text-align:left;"> user_7 </td> </tr> <tr> <td style="text-align:left;"> user_10 </td> </tr> <tr> <td style="text-align:left;"> user_12 </td> </tr> <tr> <td style="text-align:left;"> user_13 </td> </tr> <tr> <td style="text-align:left;"> user_14 </td> </tr> <tr> <td style="text-align:left;"> user_16 </td> </tr> <tr> <td style="text-align:left;"> user_18 </td> </tr> <tr> <td style="text-align:left;"> user_19 </td> </tr> <tr> <td style="text-align:left;"> user_20 </td> </tr> <tr> <td style="text-align:left;"> user_21 </td> </tr> <tr> <td style="text-align:left;"> user_22 </td> </tr> <tr> <td style="text-align:left;"> user_25 </td> </tr> <tr> <td style="text-align:left;"> user_26 </td> </tr> <tr> <td style="text-align:left;"> user_27 </td>
</tr> <tr> <td style="text-align:left;"> user_28 </td> </tr> <tr> <td style="text-align:left;"> user_29 </td> </tr> <tr> <td style="text-align:left;"> user_30 </td> </tr> <tr> <td style="text-align:left;"> user_31 </td> </tr> <tr> <td style="text-align:left;"> user_32 </td> </tr> <tr> <td style="text-align:left;"> user_33 </td> </tr> <tr> <td style="text-align:left;"> user_34 </td> </tr> <tr> <td style="text-align:left;"> user_36 </td> </tr> <tr> <td style="text-align:left;"> user_37 </td> </tr> <tr> <td style="text-align:left;"> user_38 </td> </tr> <tr> <td style="text-align:left;"> user_39 </td> </tr> <tr> <td style="text-align:left;"> user_40 </td> </tr> <tr> <td style="text-align:left;"> user_41 </td> </tr> <tr> <td style="text-align:left;"> user_43 </td> </tr> <tr> <td style="text-align:left;"> user_44 </td> </tr> <tr> <td style="text-align:left;"> user_45 </td> </tr> <tr> <td style="text-align:left;"> user_46 </td> </tr> <tr> <td style="text-align:left;"> user_47 </td> </tr> <tr> <td style="text-align:left;"> user_48 </td> </tr> <tr> <td style="text-align:left;"> user_50 </td> </tr> <tr> <td style="text-align:left;"> user_52 </td> </tr> <tr> <td style="text-align:left;"> user_53 </td> </tr> <tr> <td style="text-align:left;"> user_54 </td> </tr> <tr> <td style="text-align:left;"> user_55 </td> </tr> <tr> <td style="text-align:left;"> user_56 </td> </tr> <tr> <td style="text-align:left;"> user_58 </td> </tr> <tr> <td style="text-align:left;"> user_59 </td> </tr> <tr> <td style="text-align:left;"> user_60 </td> </tr> <tr> <td style="text-align:left;"> user_61 </td> </tr> <tr> <td style="text-align:left;"> user_62 </td> </tr> <tr> <td style="text-align:left;"> user_63 </td> </tr> <tr> <td style="text-align:left;"> user_64 </td> </tr> <tr> <td style="text-align:left;"> user_65 </td> </tr> <tr> <td style="text-align:left;"> user_66 </td> </tr> <tr> <td style="text-align:left;"> user_67 </td> </tr> <tr> <td style="text-align:left;"> user_69 </td> </tr> <tr> <td 
style="text-align:left;"> user_70 </td> </tr> <tr> <td style="text-align:left;"> user_71 </td> </tr> <tr> <td style="text-align:left;"> user_72 </td> </tr> <tr> <td style="text-align:left;"> user_73 </td> </tr> <tr> <td style="text-align:left;"> user_74 </td> </tr> <tr> <td style="text-align:left;"> user_75 </td> </tr> <tr> <td style="text-align:left;"> user_76 </td> </tr> <tr> <td style="text-align:left;"> user_77 </td> </tr> <tr> <td style="text-align:left;"> user_78 </td> </tr> <tr> <td style="text-align:left;"> user_79 </td> </tr> </tbody> </table></div>
]

.pull-right[
``` r
# testing data
ema_test %>% select(id) %>% unique()
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:350px; "><table class="table" style="font-size: 25px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> id </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> user_4 </td> </tr> <tr> <td style="text-align:left;"> user_5 </td> </tr> <tr> <td style="text-align:left;"> user_6 </td> </tr> <tr> <td style="text-align:left;"> user_8 </td> </tr> <tr> <td style="text-align:left;"> user_9 </td> </tr> <tr> <td style="text-align:left;"> user_11 </td> </tr> <tr> <td style="text-align:left;"> user_15 </td> </tr> <tr> <td style="text-align:left;"> user_17 </td> </tr> <tr> <td style="text-align:left;"> user_23 </td> </tr> <tr> <td style="text-align:left;"> user_24 </td> </tr> <tr> <td style="text-align:left;"> user_35 </td> </tr> <tr> <td style="text-align:left;"> user_42 </td> </tr> <tr> <td style="text-align:left;"> user_49 </td> </tr> <tr> <td style="text-align:left;"> user_51 </td> </tr> <tr> <td style="text-align:left;"> user_57 </td> </tr> <tr> <td style="text-align:left;"> user_68 </td> </tr> </tbody> </table></div>
]

---
class: onecol
## What about Time Series Data?
.left-column[
<br />
<img src="data:image/png;base64,#../figs/time.png" width="100%" />
]

.right-column[
Random resampling is also not appropriate for **time series data**.

We can't use .imp[data from the future] to predict data from the past!

Instead, we should use the most **recent data** for a test set.

When working with time series data, it's common to use the first portion of data for training, and the last portion for testing.

We can use the `initial_time_split()` function in {rsample} to create an initial split of time series data.
]

---
class: onecol
## Splitting Time Series Data in R

The `initial_time_split()` function has three main arguments:

- `data`: dataset

- `prop`: proportion of first part of data for training

- `lag`: lag value if using lagged predictors

.footnote[
Note: the `initial_time_split()` function assumes that your data are sorted in an appropriate order.
]

--

``` r
# example: subset data to a single participant
idiographic <- ema %>% filter(id == "user_2")

# initial split
idiographic_split <- initial_time_split(idiographic, prop = 0.8, lag = 1)

# create training and testing datasets
idiographic_train <- training(idiographic_split)
idiographic_test <- testing(idiographic_split)
```

---
class: onecol
## Splitting Time Series Data in R

.pull-left[
``` r
# training data
idiographic_train
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:350px; "><table class="table" style="font-size: 17px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> id </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Relax </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Irritable </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Worry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Nervous </th> <th
style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Future </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Anhedonia </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Tired </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Hungry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Alone </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Angry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> beep </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> resp </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td 
style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 8 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 
</td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 11 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 12 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 
</td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 13 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td 
style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 17 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 18 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 19 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td 
style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 20 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 21 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 22 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 23 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td 
style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 24 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 25 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 26 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 7 </td> <td 
style="text-align:right;"> 2 </td> <td style="text-align:right;"> 27 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 28 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 29 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 30 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 31 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 32 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 33 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 1 </td> <td 
style="text-align:right;"> 34 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 35 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 9 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 36 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 37 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 38 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 39 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 40 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 
41 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 42 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 43 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 44 </td> </tr> </tbody> </table></div>
]

.pull-right[

``` r
# testing data
idiographic_test
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:350px; "><table class="table" style="font-size: 17px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th 
style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> id </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Relax </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Irritable </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Worry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Nervous </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Future </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Anhedonia </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Tired </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Hungry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Alone </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> Angry </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> day </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> beep </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> resp </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 44 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 45 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 46 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 47 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> 
<td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 12 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 48 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 49 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 50 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 51 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td 
style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 52 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 53 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 54 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> 
<td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 55 </td> </tr> <tr> <td style="text-align:left;"> user_2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 14 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 56 </td> </tr> </tbody> </table></div>
]

---
class: onecol
## Coming Soon (Preview)

Today we reviewed the rationale for .imp[separating model training from model testing].

We learned various methods for **holdout set validation**:

- Simple random split
- Stratified sampling
- Multilevel data
- Time series data

However, in practice, we might not want to limit ourselves to a **single test set**.

What if we could create **multiple test sets** to evaluate our model many times?

Tomorrow will cover advanced methods for **cross-validation**.

---
class: inverse, center, middle
# Time for a Break!
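---
class: onecol
## Sketch: A Time-Ordered Holdout Split

The time series split reviewed today can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not necessarily how `idiographic_train` and `idiographic_test` were created: the data frame `toy`, the id `user_x`, the simulated `Tired` ratings, and the 80% training proportion are all hypothetical. The column names `day`, `beep`, and `resp` mirror the tables shown earlier.

```r
# Hypothetical toy data: one participant, 14 days x 4 beeps = 56 responses.
# The Tired ratings are simulated and are NOT the workshop data.
set.seed(2023)
toy <- data.frame(
  id    = "user_x",
  day   = rep(1:14, each = 4),
  beep  = rep(0:3, times = 14),
  resp  = 1:56,
  Tired = sample(1:5, 56, replace = TRUE)
)

# Sort by time so that "earlier" and "later" are well defined
toy <- toy[order(toy$resp), ]

# Assumed proportion: the earliest 80% of responses go to training
n_train <- floor(0.8 * nrow(toy))

toy_train <- toy[seq_len(n_train), ]          # earliest responses
toy_test  <- toy[(n_train + 1):nrow(toy), ]   # most recent responses

# Check: every training response precedes every testing response
max(toy_train$resp) < min(toy_test$resp)
```

Splitting by position rather than at random keeps the evaluation honest for time series: the model is only ever tested on responses that come after the ones it was trained on.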