String manipulation

Sometimes a variable is coded in a way that R cannot interpret as intended. For example, large integers are sometimes written with a comma as the thousands separator, as in 1,234. R then reads the variable as a factor or a character vector instead of a number.

In some cases the dec argument of read.table(), which sets the character used as the decimal point, offers a workaround, but it is not an option if the data also contain decimals separated by a dot. To get rid of the unwanted commas, we need string manipulation.
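The problem can be reproduced with a few lines of base R. This is a minimal sketch: the inline data, the column names and the values are invented for illustration and are not taken from the course data.

```r
# Hypothetical CSV content: the quoted field uses a comma as the
# thousands separator (all values are made up for this example).
raw <- 'country,gni
A,"64,992"
B,"908"'

df <- read.csv(text = raw, stringsAsFactors = FALSE)

# The column is read as character because "64,992" is not a valid number.
str(df$gni)

# Direct conversion fails for the value that contains a comma:
# it becomes NA and R emits a coercion warning.
converted <- suppressWarnings(as.numeric(df$gni))

# dec tells the reader which character marks the decimal point.
# In this semicolon-separated sample, "1,5" is read as 1.5.
dec_raw <- "x;y\n1,5;2,25"
dec_df <- read.table(text = dec_raw, sep = ";", dec = ",", header = TRUE)
```

Because dec reinterprets the comma as a decimal mark, it cannot help when the comma separates thousands and the dot is already used for decimals in the same file.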

In R, strings have the basic type character and can be created with quotation marks or with specific functions. Base R has quite a few functions for manipulating character vectors, but the tidyverse package stringr offers a somewhat more consistent and simpler alternative.
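As a rough side-by-side sketch, the snippet below creates a few strings and applies some base R functions next to their stringr counterparts. The vector x and its values are invented for illustration, and the stringr package is assumed to be installed.

```r
library(stringr)

# Strings created with quotation marks and with functions.
x <- c("1,234", "56,789")        # invented example values
y <- as.character(1234)          # number converted to a string
z <- paste("GNI", "per capita")  # strings joined with a space

# Base R: argument order and function names vary between functions.
base_clean  <- gsub(",", "", x, fixed = TRUE)  # remove every comma
base_length <- nchar(x)                        # characters per element
base_upper  <- toupper(z)

# stringr: functions share the str_ prefix and take the string first.
str_clean  <- str_replace_all(x, ",", "")
str_len    <- str_length(x)
str_upper  <- str_to_upper(z)
```

Both sets of calls give the same results here; the stringr versions mainly differ in their uniform naming and argument order.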

Instructions

  • Load the stringr package.
  • Look at the structure of the Gross National Income (GNI) variable in human.
  • Execute the sample code, which removes the comma from each value of GNI.
  • Adjust the code: use the pipe operator (%>%) to convert the resulting vector to numeric with as.numeric().
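The steps above can be sketched as follows. This is not the course's sample code: the human data frame is not available here, so a hypothetical vector gni with invented values stands in for human$GNI. The pipe is loaded from magrittr explicitly so that %>% is available.

```r
library(stringr)   # string helpers such as str_replace()
library(magrittr)  # provides the %>% pipe

# Hypothetical stand-in for human$GNI: numbers stored as text with a
# comma as the thousands separator (values invented for this sketch).
gni <- c("64,992", "42,261", "908")

# Look at the structure: a character vector, not numbers.
str(gni)

# Remove the comma from each value, then pipe the resulting character
# vector into as.numeric().
gni_numeric <- str_replace(gni, pattern = ",", replacement = "") %>%
  as.numeric()

gni_numeric
```

Note that str_replace() only replaces the first match in each string. If a value could contain more than one comma, such as 1,234,567, str_replace_all() would be the safer choice.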