Descriptive statistics in RStudio
The dataset has been loaded into RStudio, the required libraries installed, and basic descriptive statistics computed, so that the dataset can be interpreted and the first simple generalizations made. Descriptive statistics show, describe, and summarize the first findings in a dataset.
# Load and view the Dataset:
> Sys.setlocale("LC_ALL", "C")
> Sys.getlocale()
> coffee <- read.csv("~/Desktop/coffee.csv", header=TRUE)
> View(coffee)
# See the currently installed packages in Library:
> library()
# See the currently loaded packages in Library:
> search()
# Install and load useful packages, then list the available datasets:
> install.packages("ggplot2")
> library("ggplot2")
> require("ggplot2")
> data()
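Calling install.packages() on every run downloads the package again even when it is already present. A minimal sketch of one way around this, assuming a hypothetical helper named ensure_package (it is not part of the original session): it installs a package only when it is missing and then attaches it.

```r
# Hypothetical helper: install a package only if it is not yet available,
# then attach it to the search path.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call never triggers an installation.
ensure_package("stats")
```

For the session above, the call would be ensure_package("ggplot2").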
# See the structure of the data: dimensions and types of the variables. Missing data are converted into NA, which R can read:
> str(coffee)
'data.frame':   99 obs. of  21 variables:
 $ UID          : Factor w/ 56 levels "","CHE-100.139.986",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Category     : Factor w/ 18 levels "","Bar/Pub","Delica",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ KW           : Factor w/ 99 levels "Adrianos","Badilatti",..: 66 94 9 75 54 44 17 6 98 76 ...
 $ Web          : Factor w/ 92 levels "","http://caferos.com",..: 6 68 14 52 44 37 17 12 19 53 ...
 $ Emp_count    : int  NA NA NA NA NA NA NA NA NA NA ...
 $ Sales        : int  NA NA NA NA NA NA NA NA NA NA ...
 $ Note         : Factor w/ 7 levels "","2016 August",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ KW_search    : int  2150020040240...
 $ FB_likes     : int  NA NA 146 NA NA NA NA NA NA NA ...
 $ FB           : Factor w/ 3 levels "","N","Y": 1 1 3 1 1 1 1 1 1 1 ...
 $ Position_CH  : int  24350 27551 302886 97324 160549 42412 139874 418444 423670 442142 ...
 $ Impressions  : int  15120 11246 1058 358 322 1191 300 300 300 300 ...
 $ Visits       : int  1686 1105 529 358 322 314 300 300 300 300 ...
 $ Value_visitor: num  0.29 1.2 0.6 0.41 0.3 0.26 1.41 NA NA NA ...
 $ Value        : int  2048 1686 1231 883 986 971 578 660 608 743 ...
 $ Links        : int  676321012740261515...
 $ Site_count   : int  66000 12240 NA 1230 855 25350 465 210 1260 255 ...
 $ Capital      : int  NA NA NA NA NA NA NA NA NA NA ...
 $ Headquarter  : Factor w/ 2 levels "Italy","Switzerland": 2 2 2 1 2 2 2 2 2 2 ...
 $ City         : Factor w/ 61 levels "","Allschwil",..: 1 44 13 33 1 1 38 11 61 1 ...
 $ Registered   : int  NA 1761 NA NA NA NA 1986 NA NA 1997 ...
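Several factors in the str() output list "" as a level, which suggests that empty text cells were read as empty strings and not as NA. The sketch below shows one possible way to handle this, using a tiny made-up CSV in place of coffee.csv: the na.strings argument of read.csv() marks empty cells as missing, and colSums(is.na(...)) then counts the missing values per column. The data and the names csv_text, demo and missing_per_column are assumptions for illustration only.

```r
# Made-up three-row CSV, used only for illustration (not the coffee data).
csv_text <- "KW,FB,FB_likes
Adrianos,,NA
Badilatti,Y,146
Caferos,N,"

# Treat both empty cells and the literal string NA as missing values.
demo <- read.csv(text = csv_text, header = TRUE, na.strings = c("", "NA"))

# Number of missing values in each column.
missing_per_column <- colSums(is.na(demo))
print(missing_per_column)
```

Applied to the real file, the same na.strings argument could be added to the read.csv() call that loads coffee.csv.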
# Plot multiple histograms:
> par(mfrow=c(3,5))
> colnames <- dimnames(coffee)
> hist(coffee$Emp_count)
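The par(mfrow=...) call prepares a grid of panels, but a single hist() call fills only one of them. A minimal sketch of filling the grid, assuming a small made-up data frame in place of coffee: select the numeric columns and draw one histogram per column in a loop. The data frame demo and its values are placeholders, not taken from the dataset.

```r
# Made-up data standing in for coffee; the values are placeholders.
demo <- data.frame(
  Visits = c(1686, 1105, 529, 358, 322),
  Value  = c(2048, 1686, 1231, 883, 986),
  City   = c("Bern", "Basel", "Chur", "Zug", "Thun")
)

# Keep only the numeric columns; hist() cannot plot text or factors.
numeric_cols <- names(demo)[sapply(demo, is.numeric)]

# One panel per numeric column, arranged in a single row.
old_par <- par(mfrow = c(1, length(numeric_cols)))
for (col in numeric_cols) {
  hist(demo[[col]], main = col, xlab = col)
}
par(old_par)  # restore the previous plot layout
```

With the real data, a column that contains only NA would make hist() fail, so such columns would need to be skipped first.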
# Print detailed descriptive statistics of all variables:
> summary(coffee)
# Print more detailed descriptive statistics of all variables and check n (the number of non-missing values for each variable) along with the basic statistical parameters:
> install.packages("psych")
> library(psych)
> describe(coffee)
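To see what such a summary table reports for a single variable, the same quantities can be computed by hand in base R. The sketch below uses the ten Value_visitor values visible in the str() output above; the variable names are chosen here for illustration. It counts valid and missing observations separately, since n in describe() refers to the valid cases.

```r
# First ten values of Value_visitor, as shown in the str() output.
value_visitor <- c(0.29, 1.2, 0.6, 0.41, 0.3, 0.26, 1.41, NA, NA, NA)

n_valid   <- sum(!is.na(value_visitor))  # non-missing observations
n_missing <- sum(is.na(value_visitor))   # missing observations

# na.rm = TRUE drops the NA values before computing each statistic.
avg     <- mean(value_visitor, na.rm = TRUE)
med     <- median(value_visitor, na.rm = TRUE)
std_dev <- sd(value_visitor, na.rm = TRUE)

print(c(n = n_valid, missing = n_missing, mean = avg, median = med, sd = std_dev))
```

Without na.rm = TRUE, mean(), median() and sd() all return NA as soon as one value is missing.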