Wednesday, April 17, 2013

Linear Algebra in R (Part 1)


1 Generating Vectors and Matrices in R:
In this section we discuss the most basic operations: creating a one-dimensional array (vector) and a two-dimensional array (matrix). We then look at how to manipulate the elements of a matrix and perform matrix operations.
How to define a vector in R?
To generate a row vector
x = ( 1 2 3 4 5 6 7 8 9)
in R use the following command.

x = c(1,2,3,4,5,6,7,8,9) 

In the above command, c() combines its arguments into a vector, and the elements are separated by commas. Strictly speaking, the result is a plain numeric vector with no row or column orientation. The t() function treats such a vector as a column, so to obtain an explicit one-row matrix, transpose the vector with the following command.
y = t(x)

Here t() returns a matrix with one row and nine columns; applying t() once more, as in t(t(x)), gives a matrix with nine rows and one column. Users can display all the elements of a vector by simply typing its name in the R console.
 x
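
A quick way to see how R treats these objects is to inspect their dimensions. The following is a minimal sketch using only base R; the names row_m and col_m are illustrative and not part of the original example.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# A plain vector carries no dimension attribute
dim(x)                         # NULL

# t() turns the vector into a matrix with a single row
row_m <- t(x)
dim(row_m)                     # 1 9

# An explicit one-column matrix can be built with matrix()
col_m <- matrix(x, ncol = 1)
dim(col_m)                     # 9 1
```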

Word of caution: careful with the comma!
Users who are not very familiar with R might run into an error when a comma is placed after the last element of the vector. This results in the following error message:
> x = c(1,2,3,4,5,6,7,8,9,)  
Error in c(1, 2, 3, 4, 5, 6, 7, 8, 9, ) : argument 10 is empty
 
To define a vector containing special elements such as a square root or the constant pi, use the following command:
> f = c(1,2,4*pi,sqrt(2), pi)
> f
[1] 1.000000  2.000000 12.566371  1.414214  3.141593

The constant pi is built into R and is referred to by the name pi, and sqrt() takes the square root of its argument.


There is more than one way to define a matrix in R. A matrix is an array with two dimensions. One way to generate a 3x3 matrix is to first create three vectors using the commands described above and then combine them as rows with the rbind command.
 l = c(1,4,7)
 m= c(2,5,8)
 n = c(3,6,9)
 rbind(l,m,n)
 
The above mentioned R commands will generate the following matrix.
  [,1] [,2] [,3]
l    1    4    7
m    2    5    8
n    3    6    9
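
For comparison, the same three vectors can also be bound as columns. The sketch below, which assumes only base R and uses the illustrative names by_rows and by_cols, checks that binding by columns gives the transpose of binding by rows.

```r
l <- c(1, 4, 7)
m <- c(2, 5, 8)
n <- c(3, 6, 9)

by_rows <- rbind(l, m, n)    # each vector becomes a row
by_cols <- cbind(l, m, n)    # each vector becomes a column

dim(by_rows)                 # 3 3

# Binding as columns is the transpose of binding as rows
all(by_cols == t(by_rows))   # TRUE
```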

Alternatively, users can use the cbind command to build a matrix using the vectors as columns. Typing out every row or column by hand is not very practical for larger matrices. A simpler way is to use the matrix command in R.
d = matrix((1:10),2,5)

This command will generate a matrix of 2 rows and 5 columns (2x5). The following matrix will appear in R.
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

Note that the elements are filled column-wise. To change this, add one more argument to the matrix command.
 d = matrix((1:10),2,5, byrow = TRUE)

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
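
The difference between the two fill orders can be checked directly. This is a small sketch in base R; by_col and by_row are illustrative names.

```r
by_col <- matrix(1:10, nrow = 2, ncol = 5)                 # default: column-wise fill
by_row <- matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)   # row-wise fill

by_col[2, 1]   # 2
by_row[2, 1]   # 6

# Filling a 5x2 matrix column-wise and transposing it
# gives the same values as the row-wise 2x5 matrix
all(by_row == t(matrix(1:10, nrow = 5, ncol = 2)))   # TRUE
```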



When byrow = TRUE is specified, R fills in the elements row by row. R also provides common mathematical functions, including trigonometric ones, such as the following:
ab = abs(3)   ## absolute value
s = sin(3)    ## sine
p = cos(6)    ## cosine
d = tan(1)    ## tangent
r = exp(1)    ## exponential
kk = log(2)   ## natural logarithm
round(3.564)  ## rounds to the nearest integer, in this case 4
round(2.2)    ## results in 2
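
These functions are not limited to single numbers: they are applied element by element to vectors and matrices. A brief sketch, with v and m as made-up inputs:

```r
v <- c(-2.2, 0, 1.4, 3.564)

abs(v)                     # 2.200 0.000 1.400 3.564
round(v)                   # -2  0  1  4
round(3.564, digits = 2)   # 3.56

# Applied to a matrix, the result keeps the matrix shape
m <- matrix(c(1, 4, 9, 16), 2, 2)
sqrt(m)                    # 2x2 matrix with entries 1, 2, 3, 4
```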


To generate a matrix with all zeros:
> s = matrix(rep(0,6),2,3)
> s
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0

The rep() command is very useful in R. In the above command, rep(0,6) repeats the value 0 six times.
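
As a side note, matrix() recycles a single value to fill every cell, so the call to rep() is optional here. The sketch below also shows the times and each arguments of rep(); s1 and s2 are illustrative names.

```r
s1 <- matrix(rep(0, 6), 2, 3)
s2 <- matrix(0, nrow = 2, ncol = 3)   # the single 0 is recycled to all six cells
all(s1 == s2)                         # TRUE

rep(c(1, 2), times = 3)   # 1 2 1 2 1 2
rep(c(1, 2), each = 3)    # 1 1 1 2 2 2
```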
The length() command can be used to calculate the length of a vector in R.
f = 1:3 ## generate a vector
length(f) ## calculates the length of the vector f
[1] 3

The dim() command can be used to find the dimensions of a matrix.
> gt = matrix((1:8),4,2); ## generate a matrix
> dim(gt) ## calculate the dimension of the above matrix
[1] 4 2
> length(gt) ## Length of the matrix
[1] 8
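
When only one of the two dimensions is needed, base R also offers nrow() and ncol(). A minimal sketch:

```r
gt <- matrix(1:8, 4, 2)

nrow(gt)   # 4
ncol(gt)   # 2

# The length of a matrix is the product of its dimensions
length(gt) == nrow(gt) * ncol(gt)   # TRUE
prod(dim(gt))                       # 8
```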

Both commands can be used on a matrix. The output of the dim() command is 4 2 because the matrix has 4 rows and 2 columns, and length() returns the total number of elements, 4*2 = 8. Further, these commands can be used to manipulate matrices or build additional matrices as shown below:
> t=matrix(rep(0,length(gt)),dim(gt))
> t
     [,1] [,2]
[1,]    0    0
[2,]    0    0
[3,]    0    0
[4,]    0    0

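R lets users combine commands in a single statement. Here we generate a matrix of zeros with the same dimensions as gt. Note that rep() needs a single count rather than a pair of dimensions, so length(gt) supplies the number of zeros and dim(gt) supplies the shape of the matrix. Also note that naming the result t reuses the name of the transpose function t(); a different name is less confusing in longer scripts.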
cbind(gt,t)
     [,1] [,2] [,3] [,4]
[1,]    1    5    0    0
[2,]    2    6    0    0
[3,]    3    7    0    0
[4,]    4    8    0    0

Here, we are combining the two matrices to create a new matrix with 4 rows and 4 columns.
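
A slightly more explicit way to build a zero matrix of the same shape is to pass the row and column counts separately. This is a sketch, not the only approach; the names z, wide and tall are illustrative, and z is used in place of t to avoid reusing the name of the transpose function.

```r
gt <- matrix(1:8, 4, 2)

# Zero matrix with the same number of rows and columns as gt
z <- matrix(0, nrow = nrow(gt), ncol = ncol(gt))

wide <- cbind(gt, z)   # side by side: 4 rows, 4 columns
tall <- rbind(gt, z)   # stacked: 8 rows, 2 columns

dim(wide)   # 4 4
dim(tall)   # 8 2
```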
1.2 Creating Matrices using random numbers from Distributions:
To generate a matrix of normally distributed random numbers, the following command can be used in R.
g = matrix(rnorm(2, 1,2), 2,2)
g
         [,1]     [,2]
[1,] 1.645968 1.645968
[2,] 3.992582 3.992582

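The rnorm(number of observations, mean, standard deviation) function is used along with the matrix function. Note that rnorm(2, 1,2) draws only two numbers, which R recycles to fill the four cells; this is why both columns above are identical. Use rnorm(4, 1,2) to fill a 2x2 matrix with four independent draws. Similarly, the following commands can be used to draw from other well-known distributions.
runif() for the uniform distribution
rpois() for the Poisson distribution
rlnorm() for the log-normal distribution
rbinom() for the binomial distribution
Users can use these random matrices to test their models or simply to experiment with some data. A seq(begin, end, length of the vector) call can also be passed as the first argument of rnorm(). Note, however, that when this argument is a vector, rnorm() uses only its length as the number of draws, so the command below simply generates four random numbers; the values of the sequence themselves are not used.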
g = matrix(rnorm(seq(-4,4, length = 4), 1,2), 2,2)
g
         [,1]     [,2]
[1,] 3.750171 1.216113
[2,] 1.300675 2.796268
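
When the first argument of rnorm() is a vector, only its length is used as the number of draws. The sketch below checks this by comparing against a plain count of 4 under the same seed; the seed value 42 and the names a and b are arbitrary choices for illustration.

```r
set.seed(42)
a <- rnorm(seq(-4, 4, length.out = 4), mean = 1, sd = 2)

set.seed(42)
b <- rnorm(4, mean = 1, sd = 2)

# Only the length of the first argument matters, so both draws agree
identical(a, b)   # TRUE
length(a)         # 4
```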

When generating random numbers, the set.seed(id) command should be used so that the random matrix can be reproduced; otherwise, a different matrix is generated every time the command is run.
set.seed(123)
g = matrix(rnorm(seq(-4,4, length = 4), 1,2), 2,2)
g
           [,1]     [,2]
[1,] -0.1209513 4.117417
[2,]  0.5396450 1.141017
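
Reproducibility can be verified by resetting the seed and drawing again. The sketch below also fills matrices from two of the other distributions listed earlier; the parameter values (min, max, lambda) are arbitrary illustrative choices.

```r
set.seed(123)
g1 <- matrix(rnorm(4, mean = 1, sd = 2), 2, 2)

set.seed(123)
g2 <- matrix(rnorm(4, mean = 1, sd = 2), 2, 2)

identical(g1, g2)   # TRUE: same seed, same matrix

# Matrices filled from other distributions
u <- matrix(runif(6, min = 0, max = 1), 2, 3)   # uniform on [0, 1]
p <- matrix(rpois(6, lambda = 3), 2, 3)         # Poisson counts with mean 3
```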

1.3 Diagonals, identities and matrix manipulations:
In linear algebra the identity matrix plays the same role as 1 in ordinary arithmetic. Any matrix multiplied by the identity matrix gives back the original matrix. To generate an identity matrix:
 diag(5)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    1    0    0    0
[3,]    0    0    1    0    0
[4,]    0    0    0    1    0
[5,]    0    0    0    0    1
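
The defining property of the identity matrix can be checked with the matrix multiplication operator %*%. This is a small sketch; the matrix A and the name I3 are chosen only for illustration.

```r
A <- matrix(c(2, 4, 5, 6, 8, 9, 7, 3, 2), 3, 3)
I3 <- diag(3)          # 3x3 identity matrix

# Multiplying by the identity on either side returns A unchanged
all(A %*% I3 == A)     # TRUE
all(I3 %*% A == A)     # TRUE
```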

When diag() is applied to a matrix instead of a single number, it extracts the diagonal elements of that matrix. This is very useful in statistics for extracting the diagonal of a variance-covariance matrix, whose diagonal elements are variances and whose off-diagonal elements are covariances.

A = matrix(c(2,4,5,6,8,9,7,3,2), 3,3)
A
     [,1] [,2] [,3]
[1,]    2    6    7
[2,]    4    8    3
[3,]    5    9    2
dg = diag(A)
dg
[1] 2 8 2
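
To illustrate the variance-covariance use case, the sketch below builds a covariance matrix from simulated data and compares its diagonal with the column variances. The data are made up, and the names dat, V and variances are illustrative.

```r
set.seed(1)
dat <- matrix(rnorm(30), nrow = 10, ncol = 3)   # 10 observations of 3 variables

V <- cov(dat)            # 3x3 variance-covariance matrix
variances <- diag(V)     # the diagonal holds the variance of each column

# The diagonal agrees with the variance computed column by column
all.equal(variances, apply(dat, 2, var))   # TRUE
```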

The upper.tri() and lower.tri() commands identify the upper and lower triangles of a matrix respectively. However, the result is a logical matrix of TRUE and FALSE values, as shown below, rather than a triangular matrix. One more command is needed to set the elements to zero wherever TRUE appears; zeroing the upper triangle of A leaves a lower triangular matrix.
up = upper.tri(A)
up
      [,1]  [,2]  [,3]
[1,] FALSE  TRUE  TRUE
[2,] FALSE FALSE  TRUE
[3,] FALSE FALSE FALSE

A[upper.tri(A)]= 0
A
     [,1] [,2] [,3]
[1,]    2    0    0
[2,]    4    8    0
[3,]    5    9    2
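
Since the assignment overwrites A, it can help to work on copies. The sketch below builds both triangular versions from the same matrix and also shows the diag = TRUE argument of upper.tri(); the names U, L and D are illustrative.

```r
A <- matrix(c(2, 4, 5, 6, 8, 9, 7, 3, 2), 3, 3)

U <- A
U[lower.tri(U)] <- 0                # zero below the diagonal: upper triangular

L <- A
L[upper.tri(L)] <- 0                # zero above the diagonal: lower triangular

D <- A
D[upper.tri(D, diag = TRUE)] <- 0   # diag = TRUE also zeroes the diagonal

# The two triangles overlap only on the diagonal
all(U + L - diag(diag(A)) == A)     # TRUE
```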

To make data extraction from a matrix x easier to follow, we first define a matrix with 5 rows and 6 columns.
x = matrix(rep(51:80), 5,6)
x



     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   51   56   61   66   71   76
[2,]   52   57   62   67   72   77
[3,]   53   58   63   68   73   78
[4,]   54   59   64   69   74   79
[5,]   55   60   65   70   75   80

Note that a similar matrix can be generated by nesting seq() inside matrix(). However, seq(50,80, length = 30) spaces 30 values evenly between 50 and 80, so the step is 30/29 rather than 1 and the values have decimals.
x = matrix(seq(50,80, length = 30), 5,6)
x
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 50.00000 55.17241 60.34483 65.51724 70.68966 75.86207
[2,] 51.03448 56.20690 61.37931 66.55172 71.72414 76.89655
[3,] 52.06897 57.24138 62.41379 67.58621 72.75862 77.93103
[4,] 53.10345 58.27586 63.44828 68.62069 73.79310 78.96552
[5,] 54.13793 59.31034 64.48276 69.65517 74.82759 80.00000
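
If whole numbers are wanted from seq(), the step can be given directly with the by argument. A brief sketch, with x_int, x_seq and step as illustrative names:

```r
x_int <- matrix(51:80, 5, 6)
x_seq <- matrix(seq(51, 80, by = 1), 5, 6)   # step of exactly 1

all(x_int == x_seq)   # TRUE

# With 30 points between 50 and 80, the step is 30/29
step <- (80 - 50) / (30 - 1)
step                  # about 1.034483
```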

For learning purposes we will only use the x matrix generated with the rep() command, so re-run x = matrix(rep(51:80), 5,6) before continuing. To extract just the first column of the matrix x:
u = x[,1]
u
[1] 51 52 53 54 55

In the above command the index is specified as [row number, column number]. Leaving the position before the comma blank is interpreted by R as all the rows, so only the first column is returned. Similarly, to extract the first row from the matrix x:
v = x[1,]
v
[1] 51 56 61 66 71 76
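
Note that extracting a single row or column returns a plain vector. If the result should remain a matrix, the drop = FALSE argument can be added, as in this sketch (u_mat and v_mat are illustrative names):

```r
x <- matrix(51:80, 5, 6)

u <- x[, 1]                     # plain vector of length 5
dim(u)                          # NULL

u_mat <- x[, 1, drop = FALSE]   # stays a 5x1 matrix
dim(u_mat)                      # 5 1

v_mat <- x[1, , drop = FALSE]   # stays a 1x6 matrix
dim(v_mat)                      # 1 6
```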

The key to understanding matrix manipulation is knowing when to use square brackets and when to use parentheses: square brackets index into an object, while parentheses call a function. Most matrix manipulation is performed using square brackets.
To extract selected elements from a matrix, the following R commands can be used.
g = x[1:2,5:6]
g
     [,1] [,2]
[1,]   71   76
[2,]   72   77

Now suppose the user wants to multiply all the elements of a sub-matrix by a scalar. The following commands can be used.
g = 2*x[1:2,5:6]
g
     [,1] [,2]
[1,]  142  152
[2,]  144  154
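
The same indexing can appear on the left of an assignment to update the sub-matrix in place. The sketch below works on a copy, here called y for illustration, so that x stays intact.

```r
x <- matrix(51:80, 5, 6)
y <- x                              # work on a copy

# Double only rows 1-2 of columns 5-6, leaving the rest untouched
y[1:2, 5:6] <- 2 * y[1:2, 5:6]

y[1, 5]   # 142
y[3, 5]   # 73 (outside the selected block, unchanged)
x[1, 5]   # 71 (the original matrix is not modified)
```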
The following command makes the first two rows of a matrix the same as the last two rows. Note that the right side of the equal sign supplies the new values and the left side selects the elements to overwrite. Since we want the first two rows to equal the last two, we refer to the last two rows on the right side.
x[1:2,] = x[4:5,]
x
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   54   59   64   69   74   79
[2,]   55   60   65   70   75   80
[3,]   53   58   63   68   73   78
[4,]   54   59   64   69   74   79
[5,]   55   60   65   70   75   80

To exchange the first row with the second row, we use the same kind of command but reverse the order of the rows on the right-hand side of the equal sign. The output below starts again from the original matrix x, not from the modified one above.
x[1:2,]=x[2:1,]
> x
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   52   57   62   67   72   77
[2,]   51   56   61   66   71   76
[3,]   53   58   63   68   73   78
[4,]   54   59   64   69   74   79
[5,]   55   60   65   70   75   80
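
The same idea extends to columns and to any reordering given by an index vector. A sketch starting from a fresh copy of the matrix, with rev_rows as an illustrative name:

```r
x <- matrix(51:80, 5, 6)

# Swap rows 1 and 2
x[c(1, 2), ] <- x[c(2, 1), ]
x[1, ]                # 52 57 62 67 72 77

# Swap the first and last columns
x[, c(1, 6)] <- x[, c(6, 1)]
x[1, 1]               # 77

# Reverse the order of all rows without modifying x
rev_rows <- x[5:1, ]
rev_rows[1, 2]        # 60
```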

To extract all the diagonal elements of a matrix, use diag(). The output below is for the original matrix x, before any rows were changed:
> dg = diag(x)
> dg
[1] 51 57 63 69 75

To remove a column or row from a matrix in R, users need to add a negative sign to the index as shown below:
x[,-1]
     [,1] [,2] [,3] [,4] [,5]
[1,]   56   61   66   71   76
[2,]   57   62   67   72   77
[3,]   58   63   68   73   78
[4,]   59   64   69   74   79
[5,]   60   65   70   75   80

R will preserve all the rows of the matrix but omit the first column; x itself is not changed unless the result is assigned back to it. The same can be applied to rows.
x = matrix(51:80,5,6)
x[-5,]
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   51   56   61   66   71   76
[2,]   52   57   62   67   72   77
[3,]   53   58   63   68   73   78
[4,]   54   59   64   69   74   79
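
Several rows or columns can be dropped at once by negating a vector of indices, and row and column removal can be combined in one call. A minimal sketch:

```r
x <- matrix(51:80, 5, 6)

dim(x[, -c(1, 2)])   # 5 4: first two columns removed
dim(x[-5, -1])       # 4 5: last row and first column removed

# x is unchanged because the results were not assigned back
dim(x)               # 5 6
```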
