

Exercise

Scale the whole dataset

R's built-in datasets usually need little data wrangling because they are already in good shape, but we will still need to make a few small adjustments.

For later use, we need to scale the data. To scale, we subtract each column's mean from the values in that column and divide the difference by the column's standard deviation.

$$\mathrm{scaled}(x) = \frac{x - \mathrm{mean}(x)}{\mathrm{sd}(x)}$$
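
As a quick check of the formula, the sketch below standardizes a small made-up numeric vector by hand and compares the result with scale(). The vector x and the names manual and scaled are hypothetical and are not part of the Boston data.

```r
# Hypothetical toy vector, used only to illustrate the formula
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Manual standardization: subtract the mean, divide by the standard deviation
manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix; as.numeric() drops the matrix attributes
scaled <- as.numeric(scale(x))

# The two results should agree up to floating-point error
all.equal(manual, scaled)

# After scaling, the mean is numerically 0 and the standard deviation is 1
mean(scaled)
sd(scaled)
```

Note that sd() uses the sample standard deviation (dividing by n - 1), which is also what scale() uses when centering is enabled.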

The Boston data contains only numerical values, so we can use the function scale() to standardize the whole dataset.

Instructions

  • Use the scale() function on the Boston dataset. Save the scaled data to the boston_scaled object.
  • Use summary() to look at the scaled variables. Note the means of the variables.
  • Find out the class of the scaled object by executing the class() function.
  • Later we will want the data to be a data frame. Use as.data.frame() to convert boston_scaled to a data frame. Keep the object name as boston_scaled.
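
The steps above could be sketched roughly as follows. This is one possible approach, not an official solution, and it assumes that Boston is the data frame shipped with the MASS package, which the exercise text does not state.

```r
# Assumption: Boston is the data frame from the MASS package
library(MASS)
data("Boston")

# Center and scale every column; the result is a numeric matrix
boston_scaled <- scale(Boston)

# Column summaries: every mean should be 0 after centering
summary(boston_scaled)

# scale() returns a matrix, not a data frame
class(boston_scaled)

# Convert back to a data frame, keeping the same object name
boston_scaled <- as.data.frame(boston_scaled)
class(boston_scaled)
```

Reusing the name boston_scaled for the converted object means later code can rely on it being a data frame.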