1 Generating Vectors and Matrices in R:
In this section we discuss the most basic operations: creating a one-dimensional array (vector) and creating a two-dimensional array (matrix). We then look at how to manipulate the elements of a matrix and carry out matrix operations.
How to define a Vector in R?
To generate a row vector
x = (1 2 3 4 5 6 7 8 9)
in R, use the following command.
x = c(1,2,3,4,5,6,7,8,9)
In the above command c() combines its arguments into a vector, and the elements are separated by commas. Strictly speaking, a vector created with c() has no dimensions, so it is neither a row nor a column until it is converted to a matrix. Applying t() to it treats it as a column and returns its transpose, a matrix with 1 row and 9 columns:
y = t(x)
To get an explicit column vector (a 9x1 matrix), use t(t(x)) or matrix(x, ncol = 1). Whenever users run the c() command above, R creates a numeric vector, and all of its elements can be displayed by simply typing its name in the R console.
x
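As a quick check of how R treats these objects, the sketch below inspects the dimensions of the vector and of its transpose. The names row_x and col_x are illustrative only and are not used elsewhere in this section.

```r
x = c(1, 2, 3, 4, 5, 6, 7, 8, 9)

dim(x)                        # NULL: a plain vector has no dimensions
row_x = t(x)                  # 1 x 9 matrix (a single row)
col_x = matrix(x, ncol = 1)   # 9 x 1 matrix (a single column)

dim(row_x)
dim(col_x)
```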
Word of Caution: Careful with the Comma!
Users who are not very familiar with R might accidentally place a comma after the last element of the vector. This results in the following error message:
> x = c(1,2,3,4,5,6,7,8,9,)
Error in c(1, 2, 3, 4, 5, 6, 7, 8, 9, ) : argument 10 is empty
To define a vector with special elements such as a square root or pi, use the following command:
> f = c(1,2,4*pi,sqrt(2), pi)
> f
[1] 1.000000 2.000000 12.566371 1.414214 3.141593
To use pi or take the square root of an element, users can simply use the built-in constant pi and the function sqrt().
There is more than one way to define a matrix in R. A matrix is an array with two dimensions (rows and columns). One way to generate a matrix is to first create three vectors using the commands described above and then combine them into a 3x3 matrix with the rbind command, which binds the vectors together as rows.
l = c(1,4,7)
m= c(2,5,8)
n = c(3,6,9)
rbind(l,m,n)
The above R commands will generate the following matrix.
[,1] [,2] [,3]
l 1 4 7
m 2 5 8
n 3 6 9
Alternatively, users can use the cbind command to create a matrix in which each vector becomes a column. Building a matrix vector by vector is not very practical for larger matrices. A simpler way is to use the matrix command in R.
d = matrix((1:10),2,5)
This command will generate a matrix with 2 rows and 5 columns (2x5). The following matrix will appear in R when d is printed.
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 5 7 9
[2,] 2 4 6 8 10
Note that the elements are filled column by column. To change this, users can add one more argument to the matrix command.
d = matrix((1:10),2,5, byrow = TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 6 7 8 9 10
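To see the two fill orders side by side, here is a small sketch. The names by_col and by_row are illustrative.

```r
by_col = matrix(1:10, nrow = 2, ncol = 5)                # default: fill column by column
by_row = matrix(1:10, nrow = 2, ncol = 5, byrow = TRUE)  # fill row by row

by_col[1, ]   # first row when filled by column: 1 3 5 7 9
by_row[1, ]   # first row when filled by row:    1 2 3 4 5

# Filling by row gives the transpose of a 5 x 2 matrix filled by column
identical(by_row, t(matrix(1:10, nrow = 5, ncol = 2)))
```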
When users specify byrow = TRUE, R fills in the elements row by row.
R users can also use built-in mathematical functions, including trigonometric ones, such as the following:
ab = abs(3) ## absolute value
s = sin(3) ## sine
p = cos(6) ## cosine
d = tan(1) ## tangent
r = exp(1) ## exponential
kk = log(2) ## natural logarithm
round(3.564) ## rounds to the nearest integer, in this case 4
round(2.2) ## results in 2
To generate a matrix with all zeros:
> s = matrix(rep(0,6),2,3)
> s
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 0 0 0
The rep() command is very useful in R. In the above command, rep(0,6) repeats the value 0 six times.
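A short sketch of a few other ways rep() can be called, using its times and each arguments. It also shows that matrix() recycles a single value, so rep() is optional when every element is the same. The name zeros is illustrative.

```r
rep(0, 6)                 # 0 repeated six times
rep(c(1, 2), times = 3)   # whole vector repeated: 1 2 1 2 1 2
rep(c(1, 2), each = 3)    # each element repeated: 1 1 1 2 2 2

# matrix() recycles a single value to fill the whole matrix
zeros = matrix(0, nrow = 2, ncol = 3)
identical(zeros, matrix(rep(0, 6), 2, 3))
```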
The length() command can be used to calculate the length of a vector in R.
f = (1:3);## generate a vector
length(f)## calculates length of a vector f
[1] 3
The dim() command can be used to find the dimensions of a matrix.
> gt = matrix((1:8),4,2); ## generate a matrix
> dim(gt) ## calculate the dimension of the above matrix
[1] 4 2
> length(gt) ## Length of the matrix
[1] 8
Both commands can be used on a matrix. The output of the dim() command is 4 2 because the matrix has 4 rows and 2 columns, while length() returns the total number of elements, 4*2 = 8. Further, these commands can be used to manipulate matrices or build additional matrices, as shown below:
> t=matrix(rep(0,length(gt)),dim(gt))
> t
[,1] [,2]
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 0 0
R lets users combine commands in a single statement. Here we generate a matrix of zeros with the same dimensions as gt. Note that rep() expects a single repeat count rather than a pair of dimensions, so length(gt) is passed to rep() and dim(gt) is passed to matrix().
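As an alternative sketch, the same all-zero matrix can be built by passing the row and column counts explicitly with nrow() and ncol(), or by using array(), which accepts the whole dim vector. The names zeros_a and zeros_b are illustrative.

```r
gt = matrix(1:8, 4, 2)

zeros_a = matrix(0, nrow = nrow(gt), ncol = ncol(gt))  # explicit row and column counts
zeros_b = array(0, dim = dim(gt))                      # array() takes the full dim vector

dim(zeros_a)
dim(zeros_b)
all(zeros_a == zeros_b)
```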
cbind(gt,t)
[,1] [,2] [,3] [,4]
[1,] 1 5 0 0
[2,] 2 6 0 0
[3,] 3 7 0 0
[4,] 4 8 0 0
Here, we are combining the two matrices to create a new matrix with 4 rows and 4 columns.
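The sketch below contrasts cbind() with rbind() on two matrices of the same shape. The names zeros, side_by_side and stacked are illustrative.

```r
gt = matrix(1:8, 4, 2)
zeros = matrix(0, 4, 2)

side_by_side = cbind(gt, zeros)  # matrices need the same number of rows: result is 4 x 4
stacked = rbind(gt, zeros)       # matrices need the same number of columns: result is 8 x 2

dim(side_by_side)
dim(stacked)
```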
1.2 Creating Matrices using random numbers from Distributions:
To generate a matrix of normally distributed random numbers, the following command can be used in R.
g = matrix(rnorm(2, 1,2), 2,2)
g
[,1] [,2]
[1,] 1.645968 1.645968
[2,] 3.992582 3.992582
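The rnorm(number of observations, mean, standard deviation) function is used along with the matrix function. Note that rnorm(2, 1,2) draws only two numbers, which matrix() recycles to fill all four cells; this is why the two columns above are identical. To fill every cell with a separate draw, request four numbers with rnorm(4, 1,2). Similarly, the following commands can be used to generate random numbers from other well-known distributions.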
runif() for uniform distribution
rpois() for Poisson distribution
rlnorm() for log-normal distribution
rbinom() for binomial distribution
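A minimal sketch of these generators used together with matrix(). The seed, the parameter values and the names u_mat, p_mat, ln_mat and b_mat are arbitrary choices made for illustration.

```r
set.seed(1)  # arbitrary seed so the draws can be reproduced

u_mat  = matrix(runif(6, min = 0, max = 1), 2, 3)         # uniform on [0, 1]
p_mat  = matrix(rpois(6, lambda = 3), 2, 3)               # Poisson with mean 3
ln_mat = matrix(rlnorm(6, meanlog = 0, sdlog = 1), 2, 3)  # log-normal
b_mat  = matrix(rbinom(6, size = 10, prob = 0.5), 2, 3)   # binomial: 10 trials, p = 0.5

u_mat
p_mat
ln_mat
b_mat
```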
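Usually users can use these random matrices to test their models or just play around with some data. The seq(begin, end, length of the vector) command generates a sequence of evenly spaced values between the two given endpoints. When such a vector is passed as the first argument of rnorm(), only its length is used as the number of draws, so the call below is equivalent to rnorm(4, 1,2); the values of the sequence themselves do not affect the random numbers.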
g = matrix(rnorm(seq(-4,4, length = 4), 1,2), 2,2)
g
[,1] [,2]
[1,] 3.750171 1.216113
[2,] 1.300675 2.796268
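The role of the first argument of rnorm() can be checked with a small sketch: when it is a vector of length greater than one, R takes its length as the number of draws. The seed value 42 is arbitrary and is used only to make the two calls comparable.

```r
set.seed(42)
a = rnorm(seq(-4, 4, length.out = 4), 1, 2)  # first argument is a vector of length 4

set.seed(42)
b = rnorm(4, 1, 2)                           # first argument is the count 4

identical(a, b)  # TRUE: only the length of the first argument matters
```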
When generating random numbers, the set.seed(id) command should be used so that the random matrix can be reproduced; otherwise, every time the user runs the command, he or she will get a different matrix.
set.seed(123)
g = matrix(rnorm(seq(-4,4, length = 4), 1,2), 2,2)
g
[,1] [,2]
[1,] -0.1209513 4.117417
[2,] 0.5396450 1.141017
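A short sketch of what the seed does: resetting it replays the same draws, while continuing without a reset gives new ones. The names g1, g2 and g3 are illustrative.

```r
set.seed(123)
g1 = matrix(rnorm(4, 1, 2), 2, 2)

set.seed(123)                      # resetting the seed replays the same draws
g2 = matrix(rnorm(4, 1, 2), 2, 2)

g3 = matrix(rnorm(4, 1, 2), 2, 2)  # no reset: a fresh set of draws

identical(g1, g2)   # TRUE
identical(g1, g3)   # FALSE
```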
1.3 Diagonals, identities and matrix manipulations:
In linear algebra the identity matrix plays the same role as 1 in ordinary arithmetic. Any matrix multiplied by the identity matrix gives back the original matrix. To generate a 5x5 identity matrix:
diag(5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 1 0 0 0
[3,] 0 0 1 0 0
[4,] 0 0 0 1 0
[5,] 0 0 0 0 1
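The defining property of the identity matrix, that multiplying by it leaves a matrix unchanged, can be checked directly. The sketch uses R's matrix multiplication operator %*%; the matrix A and the name I3 are chosen here for illustration.

```r
A = matrix(c(2, 4, 5, 6, 8, 9, 7, 3, 2), 3, 3)
I3 = diag(3)          # 3 x 3 identity matrix

all(A %*% I3 == A)    # right-multiplying by the identity leaves A unchanged
all(I3 %*% A == A)    # so does left-multiplying

A * I3                # note: * is element-wise, so this keeps only the diagonal of A
```

The last line is a reminder that * and %*% are different operations in R.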
The diagonal elements of an existing matrix can be extracted using the same diag() command. This is very useful in statistics for extracting the diagonal of a variance-covariance matrix, where the diagonal elements are variances and the off-diagonal elements are covariances.
A = matrix(c(2,4,5,6,8,9,7,3,2), 3,3)
A
[,1] [,2] [,3]
[1,] 2 6 7
[2,] 4 8 3
[3,] 5 9 2
dg = diag(A)
dg
[1] 2 8 2
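To illustrate the variance-covariance use case, the sketch below builds a covariance matrix with cov() and compares its diagonal with the column variances. The data values and the names dat, V and variances are hypothetical.

```r
# Hypothetical data: 5 observations (rows) of 3 variables (columns)
dat = matrix(c(2, 4, 5, 6, 8,
               1, 3, 2, 5, 4,
               9, 7, 8, 6, 5), nrow = 5, ncol = 3)

V = cov(dat)           # 3 x 3 variance-covariance matrix
variances = diag(V)    # diagonal elements are the variances

variances
apply(dat, 2, var)     # the same values, computed column by column
```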
The upper.tri() and lower.tri() commands can be used to create upper and lower triangular matrices. However, what they return is a logical matrix of TRUE and FALSE values marking the elements above (or below) the diagonal, as shown below. We need one more command to place zeros wherever TRUE appears; zeroing the upper triangle of A leaves a lower triangular matrix.
up = upper.tri(A)
up
[,1] [,2] [,3]
[1,] FALSE TRUE TRUE
[2,] FALSE FALSE TRUE
[3,] FALSE FALSE FALSE
A[upper.tri(A)]= 0
A
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 4 8 0
[3,] 5 9 2
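For completeness, a sketch showing both directions on copies of the same matrix, so that the original A is left untouched. The names U and L are illustrative.

```r
A = matrix(c(2, 4, 5, 6, 8, 9, 7, 3, 2), 3, 3)

U = A
U[lower.tri(U)] = 0   # zero everything below the diagonal: upper triangular

L = A
L[upper.tri(L)] = 0   # zero everything above the diagonal: lower triangular

U
L
```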
To make data extraction from a matrix easier to follow, we first define a matrix x with 5 rows and 6 columns.
x = matrix(rep(51:80), 5,6)
x
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 51 56 61 66 71 76
[2,] 52 57 62 67 72 77
[3,] 53 58 63 68 73 78
[4,] 54 59 64 69 74 79
[5,] 55 60 65 70 75 80
Note that a similar matrix can be generated by nesting seq() inside matrix(). However, seq(50,80, length = 30) produces 30 evenly spaced values between 50 and 80, so most of them have decimals.
x = matrix(seq(50,80, length = 30), 5,6)
x
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 50.00000 55.17241 60.34483 65.51724 70.68966 75.86207
[2,] 51.03448 56.20690 61.37931 66.55172 71.72414 76.89655
[3,] 52.06897 57.24138 62.41379 67.58621 72.75862 77.93103
[4,] 53.10345 58.27586 63.44828 68.62069 73.79310 78.96552
[5,] 54.13793 59.31034 64.48276 69.65517 74.82759 80.00000
For learning purposes we will only use the x matrix generated with the rep() command. To extract just the first column of the matrix x:
u = x[,1]
u
[1] 51 52 53 54 55
In the above command the index is specified as [row number, column number]. Leaving the position before the comma blank is interpreted by R as all rows, so x[,1] selects every row but only the first column. Similarly, to extract the first row from the matrix x:
v = x[1,]
v
[1] 51 56 61 66 71 76
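Note that extracting a single row or column returns a plain vector rather than a matrix. If the matrix shape needs to be kept, the drop = FALSE argument of the square brackets can be used, as in this sketch (the name u_col is illustrative).

```r
x = matrix(51:80, 5, 6)

u = x[, 1]                     # plain vector of length 5
u_col = x[, 1, drop = FALSE]   # 5 x 1 matrix

dim(u)       # NULL
dim(u_col)
```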
The key to understanding matrix manipulation is knowing when to use square brackets and when to use parentheses. Parentheses call functions such as matrix() or diag(), while square brackets select elements; most matrix manipulation is performed using square brackets.
Suppose the user wants selected elements from a matrix. The extraction can be performed with the following R commands, which select rows 1 to 2 and columns 5 to 6.
g = x[1:2,5:6]
g
[,1] [,2]
[1,] 71 76
[2,] 72 77
Now suppose the user wants to multiply all the elements of a submatrix by a scalar. The following commands can be used.
g = 2*x[1:2,5:6]
g
[,1] [,2]
[1,] 142 152
[2,] 144 154
The following command can be used to make the first two rows of a matrix the same as the last two rows. Note that whatever is on the right side of the equal sign is assigned to the positions selected on the left side. Since we want to make the first two rows the same as the last two, we refer to the last two rows on the right side.
x[1:2,] = x[4:5,]
x
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 54 59 64 69 74 79
[2,] 55 60 65 70 75 80
[3,] 53 58 63 68 73 78
[4,] 54 59 64 69 74 79
[5,] 55 60 65 70 75 80
If the user wants to exchange the first row with the second row, this can be achieved with the same kind of command, but now the order of the rows is reversed on the right-hand side of the equal sign. The output below assumes x has first been reset to its original values with x = matrix(51:80,5,6).
x[1:2,]=x[2:1,]
x
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 52 57 62 67 72 77
[2,] 51 56 61 66 71 76
[3,] 53 58 63 68 73 78
[4,] 54 59 64 69 74 79
[5,] 55 60 65 70 75 80
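The swap works because R evaluates the right-hand side completely before assigning it. The sketch below repeats the swap on a copy of the original matrix and compares it with reordering all rows at once through an index vector. The names swapped and reordered are illustrative.

```r
x = matrix(51:80, 5, 6)

swapped = x
swapped[c(1, 2), ] = x[c(2, 1), ]   # right-hand side is evaluated first, then assigned

swapped[1, ]   # former second row
swapped[2, ]   # former first row

# Equivalent: reorder all rows at once with an index vector
reordered = x[c(2, 1, 3, 4, 5), ]
identical(swapped, reordered)
```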
To extract all the diagonal elements of the original matrix x:
> dg = diag(x)
> dg
[1] 51 57 63 69 75
To remove a column or row from a matrix in R, users need to put a negative sign in front of its index, as shown below.
x[,-1]
[,1] [,2] [,3] [,4] [,5]
[1,] 56 61 66 71 76
[2,] 57 62 67 72 77
[3,] 58 63 68 73 78
[4,] 59 64 69 74 79
[5,] 60 65 70 75 80
R keeps all the rows of the matrix but omits the first column. Note that x itself is not changed unless the result is assigned back. The same can be applied to rows.
x = matrix(51:80,5,6)
x[-5,]
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 51 56 61 66 71 76
[2,] 52 57 62 67 72 77
[3,] 53 58 63 68 73 78
[4,] 54 59 64 69 74 79
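Negative indices can also be combined, or applied to several rows or columns at once by wrapping them in c(). A brief sketch:

```r
x = matrix(51:80, 5, 6)

x[-c(1, 2), ]    # drop the first two rows: 3 x 6 result
x[, -c(1, 6)]    # drop the first and last columns: 5 x 4 result
x[-5, -1]        # drop row 5 and column 1 together: 4 x 5 result

dim(x)           # x itself is unchanged unless the result is assigned back
```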