Statistical Thinking in Python (Part 2)


Exercise

Linear regression

We will assume that fertility is a linear function of the female illiteracy rate. That is, \(f = a i + b\), where \(f\) is the fertility, \(i\) is the illiteracy rate, \(a\) is the slope, and \(b\) is the intercept. We can think of the intercept as the minimal fertility rate (the fitted fertility at zero illiteracy), probably somewhere between one and two. The slope tells us how the fertility rate varies with illiteracy. We can find the best-fit line using np.polyfit() with a polynomial degree of 1.
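As a quick check of how np.polyfit() reports its result, here is a minimal sketch on made-up, perfectly linear data. The arrays `i` and `f` and the values 0.05 and 1.9 are assumptions chosen for illustration, not the course data set.

```python
import numpy as np

# Made-up, perfectly linear data (not the course data): f = 0.05 * i + 1.9
i = np.array([0.0, 20.0, 40.0, 60.0, 80.0])
f = 0.05 * i + 1.9

# A degree-1 fit returns the coefficients highest power first:
# the slope a, then the intercept b
a, b = np.polyfit(i, f, 1)

print('slope =', a)
print('intercept =', b)
```

Because the data lie exactly on a line, the fit should recover the slope and intercept used to build them, up to floating-point error.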

Plot the data and the best-fit line, then print the slope and intercept. (Think: what are their units?)

Instructions

  • Compute the slope and intercept of the regression line using np.polyfit().
  • Print out the slope and intercept from the linear regression.
  • To plot the best-fit line, create an array of two x values, 0 and 100, using np.array(). Then compute the theoretical values of y from your regression parameters, i.e., y = a * x + b.
  • Plot the data and the regression line on the same plot. Be sure to label your axes.
  • Show your plot.
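
The steps above could be sketched as follows. This is one possible approach, not the official solution. The arrays `illiteracy` and `fertility` are filled with synthetic stand-in data here, since the real data set is not part of this page, and the axis labels assume illiteracy is given in percent.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data (an assumption, not the course data set):
# illiteracy in percent, fertility scattered around a straight line
rng = np.random.default_rng(42)
illiteracy = rng.uniform(0, 100, size=50)
fertility = 0.05 * illiteracy + 1.9 + rng.normal(0, 0.5, size=50)

# Compute the slope and intercept of the regression line
a, b = np.polyfit(illiteracy, fertility, 1)

# Print the regression parameters
print('slope =', a)
print('intercept =', b)

# Two x values are enough to draw a straight line
x = np.array([0, 100])
y = a * x + b

# Plot the data as points and the regression line on the same axes
_ = plt.plot(illiteracy, fertility, marker='.', linestyle='none')
_ = plt.plot(x, y)

# Label the axes (labels assume percent illiteracy)
_ = plt.xlabel('percent illiterate')
_ = plt.ylabel('fertility')

# Show the plot
plt.show()
```

Only the two endpoints 0 and 100 are needed because a straight line is fully determined by two points. With the real data, the printed values would differ from those produced by this synthetic sample.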