Tuesday, 8 January 2013

The exordium

Journey Begins.....

R, or rather the R statistical package, is, very simply put, the open-source equivalent of SAS. R can do pretty much everything SAS can do in terms of statistical analysis, and there are some pretty cool things R can do which SAS can't. Say someone wants to build a predictive model using logistic regression: R can do it. ARIMA models, yes; decision trees, yes; association rule mining, yes; and so on. Many of R's standard functions are written in R itself, which makes it easy for users to follow the algorithmic choices made. It is applied in insurance, finance, marketing, etc.

 In a nutshell, R is here to stay and to grow.

The R project for Statistical Computing
Assignment 1: Draw a histogram after concatenating 3 data points.
Soln:
The commands used are as follows (type = "h" draws histogram-like vertical lines, one per data point):
> x<-c(1,2,3)
> plot(x, type = "h")
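
For comparison, here is a minimal sketch of the difference between type = "h" and R's dedicated hist() function. With type = "h", plot() draws one vertical line per value; hist() instead groups the values into bins and plots how many fall in each bin. The variable name h is introduced only for this illustration.

```r
# Same three data points as in the assignment
x <- c(1, 2, 3)

# One vertical line per value; the line height is the value itself
plot(x, type = "h")

# A true histogram: values are grouped into bins and counted
h <- hist(x)

# Each of the three points lands in exactly one bin
sum(h$counts)
```

With only three points the two pictures look quite different, which is a handy way to see what each function actually does.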

Assignment 2: Draw a line graph with points, and name the graph and the axes.

Soln: We gathered the data from the National Stock Exchange (NSE) website. Let z be the variable that holds the data from the selected .csv file. The file is read as follows:

> z<-read.csv(file.choose(), header=TRUE)

This command prompts the user to select the data file from the saved location. 
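
Since file.choose() opens a file picker, it suits interactive work; in a script, a fixed path is easier to repeat. Below is a rough sketch of that variant. The file written here is a made-up stand-in, and the column names (Date, Open, High, Low) and values are assumptions chosen only so that HIGH sits in column 3 and LOW in column 4; they are not the actual NSE data.

```r
# Write a tiny made-up CSV to a temporary file (illustrative values only)
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Open,High,Low",
  "01-Jan-2013,5900,5960,5890",
  "02-Jan-2013,5950,5990,5930"
), path)

# Read it back with an explicit path instead of file.choose()
z <- read.csv(path, header = TRUE)

# Quick check of what was read: column names and types
str(z)
```

Running str(z) straight after reading is a cheap way to confirm that the header row was picked up and that the numeric columns really came in as numbers.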

Let zcol1 be the variable that holds the contents of column 3 of the CSV data.

The following commands were used:
> zcol1<-z[,3]
> plot(zcol1 , type="b" , main="NSE Graph" , xlab="Time" , ylab="indices")
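
Picking a column by position works, but it breaks silently if the file layout changes. A small sketch of selecting the same column by name instead is given here; the data frame is a made-up stand-in, and the names Date, Open, High and Low as well as by_position and by_name are assumptions for illustration, not the real NSE headers.

```r
# Made-up stand-in for the NSE data; names and values are illustrative only
z <- data.frame(
  Date = c("01-Jan-2013", "02-Jan-2013", "03-Jan-2013"),
  Open = c(5900, 5950, 5980),
  High = c(5960, 5990, 6010),
  Low  = c(5890, 5930, 5960)
)

# Select by position (as in the post) and by name; both give the same vector
by_position <- z[, 3]
by_name     <- z[["High"]]
identical(by_position, by_name)

# Labelled line-and-point plot of the selected column
plot(by_name, type = "b", main = "NSE Graph", xlab = "Time", ylab = "High")
```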

Assignment 3: Merge two columns from the table obtained. Create a scatter plot using the share HIGH and LOW values from the NSE historical data obtained from the .csv file.
Soln: The HIGH values are obtained as in the previous question:
> zcol1<-z[,3]
The LOW values are in column 4 of the csv file:
> zcol2<-z[,4]
To draw the scatter plot:
> plot(zcol1,zcol2)


Assignment 4:
Find the volatility of the merged values obtained from the NSE historical data, and obtain the range for the same.
Soln:
For this, we need the maximum value among the HIGH values and the minimum value among the LOW values.
Merge both columns into one vector variable 'y' to get the HIGH and LOW values together.
> y<-c(zcol1,zcol2)
> summary(y)
gives the min and the max value as follows:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4888    5660    5723    5758    5884    6021

> range(y)
gives the desired range, used here as a simple measure of volatility:
[1] 4888.20 6020.75
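
The two numbers returned by range() can be turned into a single spread figure by subtracting them: 6020.75 - 4888.20 = 1132.55 index points. A small sketch of that step follows, using the range reported in this post plus a pair of made-up vectors to show where the two ends come from. The names r, high and low are hypothetical, and the numbers in high and low are illustrative, not NSE data.

```r
# Range reported in the post: lowest LOW and highest HIGH
r <- c(4888.20, 6020.75)
diff(r)   # width of the range: 6020.75 - 4888.20

# Made-up HIGH and LOW vectors (each LOW is at most its HIGH)
high <- c(5960, 5990, 6010)
low  <- c(5890, 5930, 5960)
y <- c(high, low)

# The range of the merged vector is the smallest LOW and the largest HIGH
range(y)
c(min(low), max(high))
```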


