Exercise

Creating a factor variable

We can create a categorical variable from a continuous one, and there are many ways to do that. Let's choose the variable crim (per capita crime rate by town) to be our factor variable. We want to cut the variable at its quantiles so that low, medium-low, medium-high and high crime rates each get their own category.
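
A small sketch of the idea on a made-up vector (the name `x` and its values are illustrative and not part of the exercise). It also shows one detail worth knowing: by default `cut()` builds intervals that are open on the left, so the minimum value falls outside the first bin unless `include.lowest = TRUE` is set.

```r
# Toy data, purely illustrative
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Five quantile points (0%, 25%, 50%, 75%, 100%) define four bins
bins <- quantile(x)

# Default intervals are (a, b], so the minimum value becomes NA
without_lowest <- cut(x, breaks = bins)
sum(is.na(without_lowest))

# include.lowest = TRUE closes the first interval on the left
groups <- cut(x, breaks = bins, include.lowest = TRUE)
table(groups)
```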

See how it's done below!

Instructions

  • Look at the summary of the scaled variable crim.
  • Use the function quantile() on the scaled crime rate variable and save the results to bins. Print the results.
  • Create a categorical vector crime with the cut() function. Set the breaks argument to the quantile vector you just created.
  • Use the function table() on the crime object.
  • Adjust the cut() call by adding the labels argument. Create a string vector with the values "low", "med_low", "med_high", "high" (in that order) and use it to set the labels.
  • Print the table of the crime object again.
  • Execute the last lines of code to remove the original crime rate variable from the scaled Boston dataset and add the new categorical one.
  • NOTE! If you receive an error message regarding factors while submitting and you feel your solution is correct, try pressing the submit-button again without altering the code. This usually works. We are currently working on the problem.
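
The steps above could be sketched roughly as follows. This is not the exercise's official solution. It assumes the Boston data comes from the MASS package and that the scaled data frame is called `boston_scaled` and was produced with `scale()`; in the exercise that object is already provided, and its own final lines may differ from the base R used here. The `include.lowest = TRUE` argument is not mentioned in the instructions but is added so that the town with the lowest crime rate is not turned into NA.

```r
# Assumption: Boston is taken from MASS and standardized with scale();
# the exercise environment supplies the scaled data already.
library(MASS)
boston_scaled <- as.data.frame(scale(Boston))

# Summary of the scaled crime rate
summary(boston_scaled$crim)

# Quantile vector of crim, used as the break points
bins <- quantile(boston_scaled$crim)
bins

# Categorical crime variable with default interval labels
crime <- cut(boston_scaled$crim, breaks = bins, include.lowest = TRUE)
table(crime)

# Same cut, now with readable labels
crime <- cut(boston_scaled$crim, breaks = bins, include.lowest = TRUE,
             labels = c("low", "med_low", "med_high", "high"))
table(crime)

# Drop the original numeric crim and add the new factor
boston_scaled$crim <- NULL
boston_scaled$crime <- crime
```

Because the breaks are the quartiles, the four categories should end up with roughly equal numbers of towns.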