Helsinki Open Data Science

Exercise

Correlations plot

It is often useful to look at the correlations between the variables in a dataset. The function cor() computes the correlation matrix. A more visual way to inspect the correlations is the corrplot() function (from the corrplot package).
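As a minimal sketch of what cor() returns, the snippet below computes the correlation matrix for the Boston data and checks its shape. It assumes the Boston dataset is loaded from the MASS package, which the exercise text does not state explicitly.

```r
# Assumption: Boston comes from the MASS package
library(MASS)
data("Boston")

# Pairwise Pearson correlations (the default method) between all columns
cor_matrix <- cor(Boston)

# One row and one column per variable
dim(cor_matrix)

# A correlation matrix is symmetric, with 1s on the diagonal
isSymmetric(cor_matrix)
diag(cor_matrix)
```

Because the matrix is symmetric, the upper and lower triangles carry the same information, which is why plotting only one triangle loses nothing.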

Use corrplot() to visualize the correlations between the variables of the Boston dataset.

Instructions

  • Calculate the correlation matrix and save it as cor_matrix. Print the matrix to see what it looks like.
  • Adjust the code: use the pipe (%>%) to round the matrix. Rounding can be done with the round() function. Round to two digits. Print the matrix again.
  • Plot the rounded correlation matrix.
  • Adjust the code: add the argument type = "upper" to the plot. Print the plot again.
  • Adjust the code a little more: add the arguments cl.pos = "b", tl.pos = "d" and tl.cex = 0.6 to the plot. Print the plot again.
  • See the corrplot package documentation for more plotting options.
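The steps above can be sketched roughly as follows. This is one possible solution, not necessarily the reference answer: it assumes Boston comes from the MASS package, that the pipe is loaded via magrittr (dplyr would also provide it), and that method = "circle" is an acceptable plotting style, none of which the instructions specify.

```r
library(MASS)      # assumption: source of the Boston data
library(magrittr)  # assumption: provides the %>% pipe
library(corrplot)  # provides corrplot(); must be installed separately

data("Boston")

# Compute the correlation matrix and round it to two digits via the pipe
cor_matrix <- cor(Boston) %>% round(digits = 2)
cor_matrix

# Plot the full matrix (method = "circle" is an illustrative choice)
corrplot(cor_matrix, method = "circle")

# Show only the upper triangle
corrplot(cor_matrix, method = "circle", type = "upper")

# Color legend at the bottom, variable labels on the diagonal, smaller label text
corrplot(cor_matrix, method = "circle", type = "upper",
         cl.pos = "b", tl.pos = "d", tl.cex = 0.6)
```

Here cl.pos controls where the color legend is drawn, tl.pos where the text labels go, and tl.cex the label size.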