Exercise

The pipe: summarising by group

The pipe operator, %>%, takes the result of the left-hand side and uses it as the first argument of the function on the right-hand side. For example:

1:10 %>% mean() # 5.5

The parentheses of the 'target' function (here mean) can be dropped unless you want to pass additional arguments to it.

1:10 %>% mean # 5.5
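
As a small sketch of the case where the parentheses are needed, the example below passes an extra argument to the target function. The vector x and its values are made up for illustration, and the pipe is assumed to be loaded via dplyr, which re-exports it from magrittr.

```r
library(dplyr)  # assumption: dplyr is installed; it makes %>% available

# Hypothetical example vector with one missing value
x <- c(2, 4, NA, 6)

# x is used as the first argument of mean(); na.rm is passed as usual
piped <- x %>% mean(na.rm = TRUE)

# The same call written without the pipe
nested <- mean(x, na.rm = TRUE)

# Pipes can be chained: the result of mean() becomes the input of round()
chained <- x %>% mean(na.rm = TRUE) %>% round(1)

print(piped)    # 4
print(chained)  # 4
```

Both forms give the same result; the piped form reads left to right in the order the operations are applied.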

Chaining operations with the pipe is great fun, so let's try it!

Using the pipe, you'll apply the functions group_by() and summarise() to your data. The first splits the data into groups according to a grouping variable (a factor, for example). The second can be combined with any summary function, such as mean(), min(), or max(), to compute one summary value per group.
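
To make the split-then-summarise idea concrete, here is a minimal sketch on a toy data frame. The data frame scores and its columns group and score are hypothetical and are not part of the course data.

```r
library(dplyr)

# Hypothetical toy data: two groups with a numeric score
scores <- data.frame(
  group = factor(c("a", "a", "b", "b", "b")),
  score = c(10, 14, 8, 9, 13)
)

# group_by() marks the groups; summarise() then returns one row per group
by_group <- scores %>%
  group_by(group) %>%
  summarise(
    count      = n(),          # number of rows in the group
    mean_score = mean(score),  # average score in the group
    max_score  = max(score)    # largest score in the group
  )

print(by_group)
```

Each name on the left-hand side inside summarise() becomes a column of the result, so several summaries can be computed in one call.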

Instructions

  • Access the tidyverse libraries dplyr and ggplot2
  • Execute the sample code to see the counts of males and females in the data
  • Adjust the code to calculate the mean grade of the students: inside summarise(), after the definition of count, define mean_grade by using mean() on the variable G3.
  • Adjust the code again: after sex, add high_use as a second grouping variable, then execute the code.
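
The steps above might look roughly like the sketch below. The exercise's own data set and sample code are not shown here, so the data frame students and its values are made-up stand-ins; only the column names sex, high_use, and G3 come from the instructions. The .groups argument assumes dplyr 1.0 or later.

```r
# Access the tidyverse packages (ggplot2 is loaded but not used in this sketch)
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the course data
students <- data.frame(
  sex      = factor(c("F", "F", "F", "M", "M", "M")),
  high_use = c(FALSE, TRUE, FALSE, TRUE, TRUE, FALSE),
  G3       = c(12, 10, 14, 9, 11, 13)
)

# Counts and mean grade by sex
by_sex <- students %>%
  group_by(sex) %>%
  summarise(count = n(), mean_grade = mean(G3))

# The same summaries with high_use added as a second grouping variable
by_sex_use <- students %>%
  group_by(sex, high_use) %>%
  summarise(count = n(), mean_grade = mean(G3), .groups = "drop")

print(by_sex)
print(by_sex_use)
```

With two grouping variables, the result has one row for each combination of sex and high_use that occurs in the data.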