Data structures make it easier to access and operate on data, and they are often chosen or designed with particular algorithms in mind. In some cases, an algorithm's basic operations are closely tied to the design of the data structure it works on. The basic data types in R that appear in this tutorial are character, integer, complex, logical, and numeric.

What are Data Structures?

Data structures are used to store data in an organized fashion in order to make data manipulation and other data operations more efficient.

Vector

A vector is one of the basic data structures in R programming. It is homogeneous, which means that it only contains elements of the same data type. Data types can be numeric, integer, character, complex, or logical.
A vector is created using the c() function. If the elements passed are of different data types, they are coerced to the most general type present, in the order logical, integer, double, character.
The typeof() function is used to check the data type of a vector, and the class() function is used to check its class.
For example:

Vec1 <- c(44, 25, 64, 96, 30)
Vec2 <- c(1, FALSE, 9.8, "hello world")
typeof(Vec1)
typeof(Vec2)

Output:

[1] "double"
[1] "character"
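The coercion order described above can be checked one step at a time. The following is a minimal sketch; the variable names are made up for this illustration.

```r
# Mixing types in c() promotes every element to the most general type present.
lgl_int <- c(TRUE, 2L)     # logical + integer
int_dbl <- c(2L, 3.5)      # integer + double
dbl_chr <- c(3.5, "a")     # double + character

typeof(lgl_int)   # "integer"
typeof(int_dbl)   # "double"
typeof(dbl_chr)   # "character"

# class() reports "numeric" for a double vector, while typeof() reports "double".
class(int_dbl)    # "numeric"
```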

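To discard the contents of a vector, assign NULL to it. The variable still exists afterwards and holds NULL; the rm() function removes the variable entirely.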

Vec1 <- NULL
Vec2 <- NULL

Methods to access vector elements

Vectors can be accessed in the following ways:

1. Elements of a vector can be accessed by their indexes, which start at 1 in R. Square brackets [ ] are used to specify the indexes of the elements to be accessed.
For example:

x <- c("Jan","Feb","March","Apr","May","June","July")
y <- x[c(3,2,7)]
print(y)
Output:
[1] "March" "Feb"   "July"

2. We can also use logical indexing, negative indexing, and indexes made up of 0s and 1s to access the elements of a vector:
For example:

x <- c("Jan","Feb","March","Apr","May","June","July")
y <- x[c(TRUE,FALSE,TRUE,FALSE,FALSE,TRUE,TRUE)]
z <- x[c(-3,-7)]
w <- x[c(0,0,0,1,0,0,1)]
print(y)
print(z)
print(w)

Output:

[1] "Jan"   "March" "June"  "July"    (Elements at the TRUE positions are printed)
[1] "Jan"  "Feb"  "Apr"  "May"  "June"    (Elements at the negative indexes are dropped)
[1] "Jan" "Jan"    (Each 0 selects nothing and each 1 selects the first element)
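Logical indexes are usually produced by a condition rather than typed by hand, and vector elements can also carry names. The sketch below illustrates both; the sales vector and its values are hypothetical.

```r
# Hypothetical named vector of monthly sales figures.
sales <- c(Jan = 120, Feb = 95, March = 143)

above_100 <- sales[sales > 100]   # logical index built from a condition: Jan and March
feb <- sales["Feb"]               # access by element name
positions <- which(sales > 100)   # positions of the TRUE values: 1 and 3

print(above_100)
print(feb)
print(positions)
```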

Vector Arithmetic

We can perform addition, subtraction, multiplication, and division element by element on vectors that have the same number of elements:

v1 <- c(4,6,7,31,45,9)
v2 <- c(54,1,10,86,14,57)
add.v <- v1+v2
print(add.v)
sub.v <- v1-v2
print(sub.v)
multi.v <- v1*v2
print(multi.v)
divi.v <- v1/v2
print(divi.v)

Output:

[1]  58   7  17 117  59  66
[1] -50   5  -3 -55  31 -48
[1]  216    6   70 2666  630  513
[1] 0.07407407 6.00000000 0.70000000 0.36046512 3.21428571 0.15789474

Recycling Vector Elements

If arithmetic operations are performed on vectors of unequal lengths, the elements of the shorter vector are recycled until it matches the length of the longer one. For example:

v1 <- c(8,7,6,5,0,1)
v2 <- c(7,15)
# v2 is recycled to c(7,15,7,15,7,15)
add.v <- v1+v2
print(add.v)
sub.v <- v1-v2
print(sub.v)

Output:

[1] 15 22 13 20  7 16
[1]   1  -8  -1 -10  -7 -14
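The recycling step can be written out explicitly with rep_len(), which makes it easier to see what R does behind the scenes. This is a small sketch; v3 is a hypothetical vector added to show the case where the longer length is not a multiple of the shorter one.

```r
v1 <- c(8, 7, 6, 5, 0, 1)
v2 <- c(7, 15)

# rep_len() spells out the recycling that R performs implicitly.
v2_recycled <- rep_len(v2, length(v1))    # 7 15 7 15 7 15
same <- identical(v1 + v2, v1 + v2_recycled)
print(same)                               # TRUE

# When the longer length is not a multiple of the shorter one, R still
# recycles but issues a warning (suppressed here to keep the output clean).
v3 <- c(1, 2, 3, 4)
partial <- suppressWarnings(v1 + v3)
print(partial)                            # 9 9 9 9 1 3
```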

Sorting a Vector

We can sort the elements of a vector by using the sort() function in the following way.

v <- c(4,78,-45,6,89,678)
sort.v <- sort(v)
print(sort.v)
#Sort the elements in the reverse order
revsort.v <- sort(v, decreasing = TRUE)
print(revsort.v) 
#Sorting character vectors
v <- c("Jan","Feb","March","April")
sort.v <- sort(v)
print(sort.v) 
#Sorting character vectors in reverse order
revsort.v <- sort(v, decreasing = TRUE)
print(revsort.v)

Output:

[1] -45   4   6  78  89 678
[1] 678  89  78   6   4 -45
[1] "April" "Feb"   "Jan"   "March"
[1] "March" "Jan"   "Feb"   "April"
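A related base R function, order(), returns the positions that would sort a vector instead of the sorted values. That is handy when one vector has to be arranged according to another. In the sketch below, the sales figures are hypothetical.

```r
v <- c(4, 78, -45, 6, 89, 678)
idx <- order(v)      # positions of the elements in ascending order: 3 1 4 2 5 6
print(idx)
print(v[idx])        # same result as sort(v)

# Arrange month names by a second vector (hypothetical sales, largest first).
months <- c("Jan", "Feb", "March", "April")
sales <- c(120, 95, 143, 110)
by_sales <- months[order(sales, decreasing = TRUE)]
print(by_sales)      # "March" "Jan" "April" "Feb"
```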

List in R Programming

A list in R programming is a heterogeneous data structure, which means that it can contain elements of different data types. It can hold numbers, characters, other lists, and even matrices and functions. It is created using the list() function.
For example:

list1<- list("Sam", "Green", c(8,2,67), TRUE, 51.99, 11.78,FALSE)
print(list1)

Output:

[[1]]
[1] "Sam"
[[2]]
[1] "Green"
[[3]]
[1]  8  2 67
[[4]]
[1] TRUE
[[5]]
[1] 51.99
[[6]]
[1] 11.78
[[7]]
[1] FALSE

Accessing Elements of a List

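Elements of a list can be accessed by using their indices. Single brackets [ ] return a list that contains the selected element, which is why each result in the output starts with [[1]].
For example: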

list2 <- list(matrix(c(3,9,5,1,-2,8), nrow = 2), c("Jan","Feb","Mar"), list(3,4,5))
print(list2[1])
print(list2[2])
print(list2[3])

Output:

[[1]]                           (First element of the list)
     [,1] [,2] [,3]
[1,]    3    5   -2
[2,]    9    1    8
[[1]]                           (Second element of the list)
[1] "Jan" "Feb" "Mar"
[[1]]                           (Third element of the list)
[[1]][[1]]
[1] 3
[[1]][[2]]
[1] 4
[[1]][[3]]
[1] 5
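To get the element itself rather than a one-element list, R provides double brackets [[ ]]. The sketch below contrasts the two forms on the same list2; the names sub and elem are only illustrative.

```r
list2 <- list(matrix(c(3,9,5,1,-2,8), nrow = 2), c("Jan","Feb","Mar"), list(3,4,5))

sub  <- list2[2]      # single brackets: a list of length 1
elem <- list2[[2]]    # double brackets: the character vector itself

print(class(sub))     # "list"
print(class(elem))    # "character"

# Double brackets can be chained with further indexing.
print(list2[[2]][3])      # "Mar"
print(list2[[3]][[1]])    # 3
print(list2[[1]][2, 3])   # 8, the element in row 2, column 3 of the matrix
```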

Adding, Deleting elements of a List

We can add an element to the end of a list by assigning to the next free index, and delete an element by assigning NULL to it.
For example:

list2 <- list(matrix(c(3,9,5,1,-2,8), nrow = 2), c("Jan","Feb","Mar"), list(3,4,5))
list2[4] <- "HELLO"
print(list2[4])

Output:

[[1]]
[1] "HELLO"

Similarly,

list2[4] <- NULL
print(list2[4])

Output:

[[1]]
NULL
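In R, elements can also be inserted or removed at positions other than the end. The sketch below uses the base function append() with its after argument, and NULL assignment on a middle index; list3 is a hypothetical list made up for this example.

```r
list3 <- list("a", "b", "c")

# Insert a new element after position 1.
list3 <- append(list3, list("new"), after = 1)
print(length(list3))    # 4
print(list3[[2]])       # "new"

# Assigning NULL removes the element; later elements shift down by one.
list3[2] <- NULL
print(length(list3))    # 3
print(list3[[2]])       # "b"
```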

Updating Elements of a List

To update a value in a list, use the following syntax:

list2[3] <- "Element Updated"
print(list2[3])

Output:

[[1]]
[1] "Element Updated"

Matrix in R Programming

A matrix in R programming is a 2-dimensional data structure that is homogeneous, meaning that it only accepts elements of the same data type. Coercion takes place if elements of different data types are passed. It is created using the matrix() function.
The basic syntax to create a matrix is given below:
matrix(data, nrow, ncol, byrow, dimnames)
where,
data = the input elements of the matrix, given as a vector.
nrow = the number of rows to be created.
ncol = the number of columns to be created.
byrow = if TRUE, the elements are filled in row by row; by default (FALSE) they are filled in column by column.
dimnames = a list holding the row names and the column names.
For example:

M1 <- matrix(c(1:9), nrow = 3, ncol =3, byrow= TRUE)
print(M1)

Output:

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

M2 <- matrix(c(1:9), nrow = 3, ncol = 3, byrow = FALSE)
print(M2)

Output:

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

By using row and column names, a matrix can be created as follows:

row_names <- c("row1", "row2", "row3")
col_names <- c("col1", "col2", "col3")
M3 <- matrix(c(1:9), nrow = 3, byrow = TRUE, dimnames = list(row_names, col_names))
print(M3)

Output:

     col1 col2 col3
row1    1    2    3
row2    4    5    6
row3    7    8    9

Accessing Elements of a Matrix

Elements of a matrix are accessed using row and column indices in the form M[row, column].
For the matrix M3 created above, use the following syntax:

print(M3[1,1])
print(M3[3,3])
print(M3[2,3])

Output:

[1] 1 (Element at first row and first column)
[1] 9 (Element at third row and third column)
[1] 6 (Element at second row and third column)
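Leaving the row or the column position empty selects a whole row or column, and the dimension names can be used in place of numbers. The following sketch rebuilds M3 so that it runs on its own; the result names are only illustrative.

```r
M3 <- matrix(1:9, nrow = 3, byrow = TRUE,
             dimnames = list(c("row1", "row2", "row3"), c("col1", "col2", "col3")))

second_row <- M3[2, ]              # whole second row: 4 5 6
third_col  <- M3[, "col3"]         # third column selected by name: 3 6 9
one_cell   <- M3["row1", "col2"]   # single element by row and column name: 2
block      <- M3[1:2, 2:3]         # 2 x 2 sub-matrix

print(second_row)
print(third_col)
print(one_cell)
print(block)
```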

Data Frame

A data frame in R programming is a 2-dimensional array-like structure that also resembles a table, in which each column contains values of one variable and each row contains one set of values from each column.
A data frame has the following characteristics:

  • The column names of a data frame should not be empty.
  • Row names should be unique.
  • Data stored in a data frame can be numeric, factor or character type.
  • Each column should contain the same number of data items.
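These characteristics can be inspected on a small data frame. The sketch below uses a hypothetical data frame named scores; stringsAsFactors = FALSE is passed explicitly so that the text column is a character column in every version of R.

```r
# Hypothetical data frame used only to illustrate the characteristics above.
scores <- data.frame(
  id = 1:3,
  name = c("a", "b", "c"),
  score = c(9.5, 7, 8.25),
  stringsAsFactors = FALSE
)

print(nrow(scores))     # 3 rows
print(ncol(scores))     # 3 columns
print(names(scores))    # "id" "name" "score"

col_types <- sapply(scores, class)   # one class per column
print(col_types)

# Columns whose lengths cannot be matched up are rejected with an error.
bad <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
print(inherits(bad, "try-error"))    # TRUE
```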

Creating a Data Frame

Use the following syntax for creating a data frame in R programming:

empid <- c(1:4)
empname <- c("Sam","Rob","Max","John")
empdept <- c("Sales","Marketing","HR","R & D")
emp.data <- data.frame(empid,empname,empdept)
print(emp.data)

Output:

  empid empname   empdept
1     1     Sam     Sales
2     2     Rob Marketing
3     3     Max        HR
4     4    John     R & D

Extracting Columns/Rows from a Data Frame

To extract specific columns from a data frame, use the following syntax:

result <- data.frame(emp.data$empname,emp.data$empdept)
print(result)

Output:

  emp.data.empname emp.data.empdept
1              Sam            Sales
2              Rob        Marketing
3              Max               HR
4             John            R & D

To extract specific rows from a data frame, use the following syntax:

result <- emp.data[1:2,]
print(result)

Output:

  empid empname   empdept
1     1     Sam     Sales
2     2     Rob Marketing

The following code extracts the first and third rows, keeping only the second and third columns.

result <- emp.data[c(1,3),c(2,3)]
print(result)

Output:

  empname empdept
1     Sam   Sales
3     Max      HR
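Columns can also be selected by name, and rows by a condition on a column. The sketch below rebuilds emp.data so that it runs on its own; stringsAsFactors = FALSE is an assumption made here to keep the text columns as plain character vectors.

```r
emp.data <- data.frame(
  empid = 1:4,
  empname = c("Sam", "Rob", "Max", "John"),
  empdept = c("Sales", "Marketing", "HR", "R & D"),
  stringsAsFactors = FALSE
)

by_name <- emp.data[, c("empname", "empdept")]   # columns selected by name
names_only <- emp.data[["empname"]]              # a single column as a vector
hr <- emp.data[emp.data$empdept == "HR", ]       # rows that satisfy a condition

print(by_name)
print(names_only)
print(hr)
```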

Adding a Column to a Data Frame

To add a salary column to the above Data Frame, use the following syntax:

emp.data$salary <- c(20000,30000,40000,27000)
n <- emp.data
print(n)

Output:

  empid empname   empdept salary
1     1     Sam     Sales  20000
2     2     Rob Marketing  30000
3     3     Max        HR  40000
4     4    John     R & D  27000

Adding a Row to a Data Frame

To add new rows to an existing data frame, we create a new data frame that contains the new rows and has the same columns, and then combine it with the existing data frame using the rbind() function. The result is the final data frame.

Creating a new Data Frame

emp.newdata <- data.frame(
  empid = c(5:7),
  empname = c("Frank","Tony","Eric"),
  empdept = c("IT","Operations","Finance"),
  salary = c(32000,51000,45000)
)

Merging the Created Data Frame with the Existing One:

emp.finaldata <- rbind(emp.data,emp.newdata)
print(emp.finaldata)

Output:

  empid empname    empdept salary
1     1     Sam      Sales  20000
2     2     Rob  Marketing  30000
3     3     Max         HR  40000
4     4    John      R & D  27000
5     5   Frank         IT  32000
6     6    Tony Operations  51000
7     7    Eric    Finance  45000
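rbind() on data frames matches columns by name, so both data frames need the same column names. The sketch below shows a successful call and a failing one; the small data frames old, new, and bad are hypothetical.

```r
old <- data.frame(empid = 1:2, empname = c("Sam", "Rob"), stringsAsFactors = FALSE)
new <- data.frame(empid = 3L, empname = "Max", stringsAsFactors = FALSE)

combined <- rbind(old, new)
print(nrow(combined))    # 3

# A data frame with different column names cannot be appended this way.
bad <- data.frame(id = 4L, name = "John", stringsAsFactors = FALSE)
result <- try(rbind(old, bad), silent = TRUE)
print(inherits(result, "try-error"))    # TRUE
```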

Factor

Factors in R programming are used in data analysis for statistical modeling. They are used to categorize the unique values in a column, like "Male", "Female", "TRUE", "FALSE", etc., and store them as levels. They can store both strings and integers. They are useful in columns that have a limited number of unique values.
Factors can be created using the factor() function and they take vectors as inputs.
For example:

data <- c("Male","Female","Male","Child","Child","Male","Female","Female")
print(data)
factor.data <- factor(data)
print(factor.data)

Output:

[1] "Male"   "Female" "Male"   "Child"  "Child"  "Male"   "Female" "Female"
[1] Male   Female Male   Child  Child  Male   Female Female
Levels: Child Female Male
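Internally, a factor stores one integer code per element plus the vector of levels. The sketch below looks at both, counts the values with table(), and sets a custom level order; the name ordered.data is only illustrative.

```r
data <- c("Male","Female","Male","Child","Child","Male","Female","Female")
factor.data <- factor(data)

print(levels(factor.data))      # "Child" "Female" "Male"
print(as.integer(factor.data))  # 3 2 3 1 1 3 2 2, positions in the levels vector
print(table(factor.data))       # Child 2, Female 3, Male 3

# The order of the levels can be set explicitly with the levels argument.
ordered.data <- factor(data, levels = c("Child", "Male", "Female"))
print(levels(ordered.data))     # "Child" "Male" "Female"
```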

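In versions of R before 4.0.0, data.frame() converts text columns to factors by default. From R 4.0.0 onward, text columns stay as character vectors unless stringsAsFactors = TRUE is passed to data.frame().
For example, when the data frames above are created with text columns as factors, R treats empdept in the emp.finaldata data frame as a factor.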

print(is.factor(emp.finaldata$empdept))
print(emp.finaldata$empdept)

Output:

[1] TRUE
[1] Sales      Marketing  HR         R & D      IT         Operations Finance
Levels: HR Marketing R & D Sales Finance IT Operations
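Whether a text column becomes a factor can be controlled with the stringsAsFactors argument of data.frame(). The sketch below sets it both ways on a hypothetical dept vector, so the result does not depend on the R version in use.

```r
dept <- c("Sales", "Marketing", "HR")

as_chr <- data.frame(dept, stringsAsFactors = FALSE)   # text stays character
as_fct <- data.frame(dept, stringsAsFactors = TRUE)    # text becomes a factor

print(is.factor(as_chr$dept))   # FALSE
print(is.factor(as_fct$dept))   # TRUE
print(levels(as_fct$dept))      # "HR" "Marketing" "Sales"
```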

Arrays

Arrays are data structures that store multiple items of the same type together. The items are stored at contiguous memory locations, and the array name refers to that block of memory. The position of an element can be calculated by adding an offset to the base address.

Example:

[Figure: Array]

Array Structure

An array consists of the following:

Array Index: The array index identifies the location of the element. In many programming languages the array index starts at 0; in R, it starts at 1.

Array Element: The items that are stored in the array are referred to as Array elements.

Array Length: The array length is determined by the number of elements that it can store. In the above example, the array length is 12.

There are two types of arrays:

One-dimensional Arrays

Multi-dimensional Arrays

One-dimensional Arrays

One-dimensional (single-dimensional) arrays store their elements in a sequence, and the elements can be accessed in the same order. The figure above is an example of a one-dimensional array.

Multi-dimensional Arrays

Arrays that store elements in more than one dimension are referred to as multi-dimensional arrays. They can be two-dimensional or three-dimensional; a two-dimensional array is addressed by a row index and a column index.

Example:

[Figure: Multi-dimensional Arrays]

Accessing Elements of an Array

Elements in an array can be accessed using the following syntax:

Syntax:

arrayName[index]
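In R, multi-dimensional arrays are created with the array() function, and an element is addressed with one index per dimension, starting at 1. The following is a minimal sketch; the array arr and its dimensions are made up for this example.

```r
# A 2 x 3 x 4 array filled with the numbers 1 to 24, first dimension varying fastest.
arr <- array(1:24, dim = c(2, 3, 4))

print(dim(arr))        # 2 3 4
print(arr[1, 1, 1])    # 1, the first element
print(arr[2, 3, 4])    # 24, the last element
print(arr[1, 2, 2])    # 9

# Leaving an index empty keeps that whole dimension.
slice <- arr[, , 2]    # the second 2 x 3 slice, holding 7 to 12
print(slice)
```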

In this tutorial, we learned what data structures are in R programming, their different types, and how to perform simple data manipulation using them. In the next session, we are going to talk about Control Flow statements in R. Let us meet there!

Key Points:

  • The data structures in R covered here are vectors, lists, matrices, data frames, factors, and arrays.
  • These various types of data structures are also used with different kinds of algorithms.
  • Some of the basic data types in R can be character, integer, complex, logical, and numeric.
