Sweatmas

Introduction

Welcome to Sweatmas 2022. Introduced by Gina Alamillo, Sweatmas is a challenge that starts on December 1st and goes until Christmas. The challenge is to get up and do something that makes you sweat every day for the 25 days. This can include anything from a tough workout to a walk with the family.

This year, I decided to record some data about my workouts and just visualize and summarize what we actually accomplished throughout the 25 days. Code will be included in this post to show how each plot was made.

Loading Packages

There are quite a few packages being used for this project. Most of them are used for one specific function that gets used to add specific item to the plots. For example, magick and cowplot get used to add a background image to ggplot and ggtext allows us to render images as axis ticks. This will become more apparent later.

library(googlesheets4)
library(ggplot2)
library(DT)
library(png)
library(patchwork)
library(cowplot)
library(magick)
library(ggtext)
library(jpeg)
library(ggpubr)
library(tidyr)
library(plotly)
library(pillar)

Read in the Data

Here we read in our data from Google Sheets, I’ve removed a few lines of code such as the line that authorizes access to the sheet and I’ve changed the actual links themselves. If you want a different look at this data you can check out the Shiny App I made for it here.

Once the data is read in, we have to change the column that contains the time into a better format.

Example

1899-12-30 10:45:00 changes to 10:45 AM

1899-12-30 19:30:00 changes to 07:30 PM

df_drew <- as.data.frame(read_sheet(".../google_sheets_link", sheet = "Drew"))
df_drew$Time_of_day <- format(strptime(df_drew$Time_of_day, 
                                       format = "%Y-%m-%d %H:%M:%S"), "%I:%M %p")

Quick Look

Before getting into the plots, lets just take a quick look at what the data actually looks like. There are quite a few things here, maybe more than what was needed, but I wanted to be thorough and not leave anything out. We are going to look at the first 9 columns, these are the most intuitive and make up the bulk of the important information. We will talk about the final 11 later on.

str(df_drew[,1:9])

## 'data.frame':    35 obs. of  9 variables:
##  $ Day          : num  1 2 3 4 5 5 6 7 8 9 ...
##  $ Workout_Title: chr  "2000s Ride" "Full Body Strength" "House Ride" "Holiday Ride" ...
##  $ Workout_Type : chr  "Cycling" "Strength" "Cycling" "Cycling" ...
##  $ Duration     : num  45 30 30 30 20 10 45 30 45 60 ...
##  $ Instructor   : chr  "Emma Lovewell" "Robin Arzon" "Leanne Hainsby" "Emma Lovewell" ...
##  $ Calories     : num  381 221 250 254 147 74 376 216 382 438 ...
##  $ App_used     : chr  "Peloton" "Peloton" "Peloton" "Peloton" ...
##  $ Time_of_day  : chr  "10:45 AM" "09:45 AM" "10:15 AM" "07:30 PM" ...
##  $ Music_Type   : chr  "Pop" "New" "House" "Holiday" ...

Visualizations

First up is the amount of calories (in kcal) that were burned on each day. There were a few days where I completed multiple workouts in that day, and those calorie values are reported seperatly rather than being aggregated into one. It’s interesting to note that I seemed to have started off pretty hard, especially in the first ten days or so. After that I tapered off a little bit, with more of my workouts burning less calories, then towards the end I started to pick it back up a little bit.

ggplot(data = df_drew) +
  
  geom_line(aes(x = Day, y = Calories), color = "#3D3C30") +
      
  geom_point(aes(x = Day, y = Calories, color = Workout_Type), cex = 7) +
  
  scale_color_manual(values = c("#ffee94", "#ffa466", "#ff7057")) + 
  
  scale_x_continuous(breaks = seq(1,31,1)) + 
  
  labs(y = "Calories (kcal)",
       x = "",
       title = "Sweatmas 2022 Calories",
       color = "Workout Type",
       caption = "*NOTE: calorie amounts are recorded based on what the app reports. 
       These may or may not reflect actual calories burned during the workout") +
  
  ylim(0, 500) +
  
  theme(plot.caption = element_text(size = 10, hjust = 0.5, vjust = 0.5),
        panel.grid.major = element_line(color = "gray40"),
        panel.grid.minor = element_blank(),
        axis.ticks = element_blank(),
        legend.key = element_rect(fill = "white", color = "white"))

All of those workouts, all of that sweat, but what did I actually workout? There’s bootcamps, cycling, and strength but did I focus on one part of the body over another, is there a workout type that I tend to favor over the others? To find this out, I also tracked the usage rates of muscle groups for every workout I did. This was left out of the quick look up above, but below we can see an example of what this looks like.

str(df_drew[,10:20])

## 'data.frame':    35 obs. of  11 variables:
##  $ Hamstrings: num  17 10 17 17 NA NA 17 14 17 14 ...
##  $ Glutes    : num  16 14 17 15 10 NA 15 14 17 15 ...
##  $ Quads     : num  17 12 17 17 10 NA 17 14 17 15 ...
##  $ Calves    : num  12 8 11 13 NA NA 12 9 11 10 ...
##  $ Core      : num  11 NA 11 11 10 10 11 11 11 10 ...
##  $ Shoulders : num  NA 12 NA NA 13 18 NA 11 NA NA ...
##  $ Biceps    : num  NA NA NA NA NA 10 NA 11 NA NA ...
##  $ Triceps   : num  NA NA NA NA 11 12 NA 11 NA NA ...
##  $ Back      : num  NA NA NA NA 11 NA NA NA NA NA ...
##  $ Chest     : num  NA NA NA NA NA 10 NA NA NA NA ...
##  $ Other     : num  27 44 27 27 45 40 28 5 27 36 ...

These are the main muscle groups that are used during my workouts. This data looks messy, but these values measure the percentage of use for that muscle group for each workout. For example, the first column there was a cycling workout, and our hamstrings contributed 17% of the activation necessary to complete that workout, the glutes contributed 16% and so on. The “Other” column is what is leftover, and those values do not have any specific muscles associated with them. Just because a column has an NA, does not mean it was not used at all. I am limited to what the Peloton App on my phone gives me, so that is what will be used rather than trying to infer and guess.

Before we can plot our pseudo-anatogram, we first need to create a new data frame that assigns an x and y value to each muscle group, so we can put them in the proper spot on the body.

# isolate the muscle data (excluding "Other")
muscles <- df_drew[,10:19]

# create a list of the muscles, then get a sum of all the percentages
muscle_list <- c("Hamstrings", "Glutes", "Quads", "Calves", "Core", "Shoulders", "Biceps", "Triceps", "Back", "Chest")
muscle_usage <- colSums(muscles, na.rm = TRUE)


# our data is wide, so we cannot aggregate it, we must first reshape it to be long
muscles <- muscles %>% pivot_longer(cols = all_of(muscle_list),
                         names_to = "Muscle",
                         values_to = "Usage")
# aggregate to get the totals by muscle
muscles <- aggregate(muscles, 
                     Usage ~ Muscle, 
                     FUN = sum)

# create new df
# since each muscle is on the left and the right, we have to split that into two
# then each of our data points is repeated twice since the values are the same
# x and y were both found through trial and error, the plot has a limit from 1 to 100 in the x and y direction
coords <- data.frame(Muscle = c("Back_L", "Back_R", "Biceps_L","Biceps_R","Calves_L", "Calves_R", "Chest_L", "Chest_R", "Core_L", "Core_R","Glutes_L", "Glutes_R","Hamstrings_L", "Hamstrings_R","Quads_L", "Quads_R","Shoulders_L", "Shoulders_R", "Triceps_L", "Triceps_R"),
                     
                     Usage = unlist(lapply(muscles$Usage, FUN = function(x){rep(x, 2)})),
                     
                     x = c(73, 81, 8, 35, 72, 82, 17, 27, 17, 27, 
                           71, 83, 71, 83, 16, 28, 9, 35, 63, 90),
                     
                     y = c(75, 75, 68, 68, 14, 14, 72, 72, 60, 60, 
                           46, 46, 30, 30, 37, 37, 78, 78, 67, 67))

Below is a graph showing the totals of each muscle group, a darker point means it was used more, and lighter point means it wasn’t used as much. You’ll see that I mostly was using my hamstrings, quads, glutes, and core for sweatmas because I spent a lot of time on the bike.

img <- readJPEG("C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\body_chart.jpeg")

ggplot(data = coords, 
            aes(x = x, 
                y = y, 
                color = Usage)) +
  
      background_image(img) +
  
      geom_point(cex = 8, 
                 show.legend = FALSE) +
  
      scale_color_gradient(low = "#D1FBD8", 
                           high = "#188300") +
  
      labs(title = "Where it Hurts the Most",
           subtitle = "Darker colors mean that muscle group was used more throughout Sweatmas",
           caption = "*NOTE: Data shown is what is recorded through the Peloton App\nwhich does not report every muscle group for every workout") +
  
      ylim(0,100) +
      xlim(0,100) +
  
      theme(axis.title = element_blank(),
            axis.text = element_blank(),
            axis.ticks = element_blank(),
            plot.background = element_blank(),
            panel.background = element_blank(),
            panel.grid = element_blank(),
            plot.title = element_text(hjust = 0.5),
            plot.subtitle = element_text(hjust = 0.5),
            plot.caption = element_text(size = 10, hjust = 0.5, vjust = 0.5))

# add a datatable to make this data more readable and help to make sense of what the points mean
# we are going to use the average usages for each muscle group
    muscles <- df_drew[,10:19]
    ave_use <- as.data.frame(colMeans(muscles, na.rm = TRUE))
    colnames(ave_use) <- "Average Use (%)"
    
    muscle_table <- datatable(ave_use,
                              rownames = TRUE,
                              options = list(dom = "t")) %>%
                      formatStyle("Average Use (%)", 
                                  background =  styleColorBar(ave_use$`Average Use`, color = "#D1FBD8" )) %>%
                      formatRound("Average Use (%)", digits = 2)

All of these workouts take time, its all about making the time and not trying to find the time (shout-out Jess Sims). Well I wanted to see when I was able to make the time, so I tracked what time I started my workouts at (rounded to the nearest 15 minute time). NOTE: I work from home

In order to plot these times, I needed to first aggregate all of the times I started a workout so I can get a count. But the problem is that there isn’t a count of each time of the day, so my plot would look really funny and unintuitive. In order to fix this, I needed to create a new data frame with all of the times from “00:00” to “23:59” by every 15 minutes, then reformat those times to match my data frame, then merge this new data frame with my aggregated data frame to get the counts.

temp_times <- aggregate(df_drew, Music_Type ~ Time_of_day, FUN = length)
colnames(temp_times) <- c("Time_of_Day", "Count")

# create all of the times, along with a count column full of zeroes
all_times <- data.frame(Time_of_Day = format(seq(as.POSIXct("00:00", 
                                                            format = "%H:%M", 
                                                            tz = "UTC"),
                                                 
                                                 as.POSIXct("23:59", 
                                                            format = "%H:%M", 
                                                            tz = "UTC"),
                                                 
                                                 by = "15 mins"), 
                                             format = "%Y-%m-%d %H:%M:%S"),
                        Count = rep(0, 96))

# reformat the dates into the format as my dataframe
all_times$Time_of_Day <- format(strptime(all_times$Time_of_Day, 
                                         format = "%Y-%m-%d %H:%M:%S"), "%I:%M %p")

# merge the dataframes, replace the NA values with 0
times <- merge(x = all_times, 
               y = temp_times, 
               by = "Time_of_Day", 
               all.x = TRUE)
times <- times[, -2]
times[is.na(times)] <- 0
colnames(times) <- c("Time_of_Day", "Count")

# add a new column determining if its a morning or afternoon workout
times$Day_Night <- unlist(lapply(times$Time_of_Day, FUN = function(x){ifelse(grepl("AM", x), "Morning", "Afternoon")}))

# the times were ordered 
times <- times[order(times$Day_Night, decreasing = TRUE),]
rownames(times) <- 1:nrow(times)
times <- times[c(45:48,1:44,93:96,49:92),]
rownames(times) <- 1:nrow(times)

morning <- times[which(times$Day_Night == "Morning"),]
afternoon <- times[which(times$Day_Night == "Afternoon"),]

I was mostly working out late morning, most commonly between 9:45 and 10:30 AM. Again, I work from home and make this work, which is really nice. But there were a few times on the weekends where I was only able to workout later at night.

ggplot() +
      geom_bar(data = morning, 
               aes(x = seq(1:48), y = Count, fill = Day_Night), stat = "identity") +
      
      geom_bar(data = afternoon, 
               aes(x = seq(1:48), y = Count, fill = Day_Night), stat = "identity") +
      
      scale_x_continuous(breaks = seq(1,48,4),
                         labels = c("12:00","1:00","2:00","3:00","4:00","5:00","6:00","7:00","8:00","9:00","10:00","11:00"),
                         minor_breaks = c(1:48)[-seq(1,48,4)]) +
      
      scale_fill_manual(values = c("#ff7057", "#ffee94")) +
      
      labs(fill = "",
           title = "Early(ish) Bird Gets the Worm") +
      
      theme(axis.text.x = element_text(size = 9, vjust = 0),
            panel.grid.minor = element_blank(),
            plot.background = element_blank(),
            panel.background = element_blank(),
            axis.title = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks.y = element_blank(),
            plot.title = element_text(hjust = 0.5),
            panel.grid.major.x = element_line(color = "dimgray", 
                                              linetype = "dotted", 
                                              linewidth = 0.25),
            
            panel.grid.major.y = element_line(color = "dimgray", 
                                              linetype = "dotted", 
                                              linewidth = 0.25),
            panel.ontop = FALSE,
            axis.ticks.x = element_blank()) +
      
      coord_polar()

Finally, I wanted to see who I tended to workout with more. I use the Peloton app and there are tons of amazing instructors, but I know I have my favorite few and it depends on what workout I am doing. I wanted to make this plot a bit more interesting and use pictures of the instructors as the axis ticks, this is where cowplot and magick came in handy. But first, I needed to read in all of the pictures into R and associate them with one of the instructors.

# read in all of the pictures from our local files 
alex <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\alextoussaint.jpg"
ally <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\allylove.jpg"
callie <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\calliegullickson.jpg"
emma <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\emmalovewell.jpg"
jess <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\jesssims.jpg"
leanne <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\leannehainsby.jpg"
robin <- "C:\\Users\\drewc\\OneDrive\\Documents\\Peloton Instructors\\robinarzon.jpg"

# create a new df with the picture links and the names of the instructors that match our original df
instructor_df <- data.frame(Instructor = c("Alex Toussaint", "Ally Love", "Callie Gullickson","Emma Lovewell",
                                           "Jess Sims", "Leanne Hainsby", "Robin Arzon"),
                            
                            Picture = c(alex, ally, callie, emma,  
                                        jess, leanne, robin),
                            
                            stringsAsFactors = FALSE)

# get the counts of how many workouts were done with each instructor
# add a plot number for easier plotting later on, then merge by instructor names that match
instructors <- aggregate(df_drew, Workout_Title ~ Instructor, FUN = length)
colnames(instructors) <- c("Instructor", "Workouts")
instructors <- instructors[order(instructors$Instructor),]
instructors$plot_num <- seq(1, nrow(instructors), 1)

instructors <- merge(instructor_df, instructors)


# next, we need to turn these picture links into html code so it can be rendered in ggplot
labels <- c()


for (i in 1:nrow(instructors)){
  
  labels <- c(labels, paste("<img src='", instructors$Picture[i], "' width = '40' height = '40' />", sep = ""))
}

ggplot(data = instructors, aes(x = Instructor, y = Workouts)) +
  
  geom_bar(stat = "identity", fill = "#ffa466") +
  
  geom_text(aes(x = Instructor, y = 0.85, label = Instructor), 
            color = "white", size = 5, family = "mono", fontface = "bold") +
  
  scale_x_discrete(name = NULL, labels = labels) +
  
  labs(title = "The Fearless Leaders") +
  
  coord_flip() +
  
  # note the element_markdown() code
  # this is what allows us to render html directly into the axis
  theme(axis.title.y = element_blank(),
        axis.text.y = element_markdown(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        panel.grid = element_blank())

Robin, Jess, and Callie are typically going to be my strength or bootcamp workouts. Emma, Alex, and Leanne are mostly my cycling instructors. Whether its Club Bangers with Alex Toussaint or the 60 minute bootcamp with Jess Sims, this Sweatmas would not have been possible without these instructors.

Conclusion

Overall, this was my most successful Sweatmas. This helped to reset me mentally and physically and it got me ready for the craziness of the holidays. I will hopefully have an update for the next Sweatmas and maybe even do a comparison!

Here are the grand totals for Sweatmas 2022:

Total Workouts: 35

Total Calories Burned: 7855 kcal

Average Workout Duration: 33 min

Most Listened to Genre: Hip Hop

Sweatmas 2022 (R)