Preparing data for analysis using R

Microsoft has recently released an interesting paper about using R to prepare data.

The paper demonstrates some of the things that can go wrong with data, and explores ways to address those issues using the R statistical language (https://cran.r-project.org/) before going on to analysis.

To benefit from faster numerical libraries, the whole paper is based on the Microsoft R Open distribution (https://mran.microsoft.com/open/).

The idealized goal in mind is to use machine learning to build a predictive model.

In the paper, which can be found at this link, you can find information about:

  • Loading Data
  • Shaping Data
  • Variable types
  • Checking for bad or missing values
  • Dealing with missing values (NA)
  • Categorical variables with too many levels or with rare levels
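
To give a flavour of these topics, here is a minimal sketch in base R. It is not taken from the paper: the toy data, the column names, the use of -1 as a "bad value" marker, median imputation, and the rare-level threshold of 2 are all assumptions made for illustration. Shaping (for example, reshaping between wide and long layouts) is left out to keep the example short.

```r
# Hypothetical toy data standing in for a real file; not from the paper.
csv_text <- "id,age,city,income
1,34,Rome,52000
2,NA,Milan,48000
3,29,Rome,-1
4,41,Turin,61000
5,38,Rome,NA
6,25,Naples,39000"

# Loading: read.csv also accepts a file path instead of `text`.
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Variable types: store the categorical column as a factor.
df$city <- factor(df$city)

# Bad values: treat the sentinel -1 as missing (an assumption about this toy data).
df$income[!is.na(df$income) & df$income == -1] <- NA

# Check for missing values: count NAs per column.
na_counts <- colSums(is.na(df))

# Dealing with NAs: impute with the column median (one simple option among many).
df$age[is.na(df$age)] <- median(df$age, na.rm = TRUE)
df$income[is.na(df$income)] <- median(df$income, na.rm = TRUE)

# Rare levels: merge cities seen fewer than 2 times into a single "Other" level.
city_counts <- table(df$city)
rare <- names(city_counts)[city_counts < 2]
levels(df$city)[levels(df$city) %in% rare] <- "Other"
```

Median imputation and lumping rare levels are only two of several possible choices; which strategy is appropriate depends on the data and on the model you plan to build.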

Read it, it is very interesting!
