**Data Structures**

Data structures are used to store data in an organized fashion in order to make data manipulation and other data operations more efficient.

There are **five types of Data Structures** in R Programming which are mentioned below:

- **Vector**
- **List**
- **Matrix**
- **Data Frame**
- **Factor**

**Vector**

A vector is one of the basic data structures in R programming. It is homogeneous, which means that it only contains elements of the same data type. Data types can be numeric, integer, character, complex, or logical.

A vector in R programming is created using the c() function. If the elements passed are of different data types, they are all coerced to the most general type present, in the order logical to integer to double to character.

The typeof() function is used to check the data type of a vector, and the class() function is used to check its class.

For example:

    Vec1 <- c(44, 25, 64, 96, 30)
    Vec2 <- c(1, FALSE, 9.8, "hello world")
    typeof(Vec1)
    typeof(Vec2)

Output:

    [1] "double"
    [1] "character"
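
The coercion order described above can be observed by mixing two types at a time. The following is a small sketch; the variable names (mix.int, mix.dbl, mix.chr) are made up for illustration and do not appear elsewhere in this tutorial.

```r
# Each vector mixes two types; R promotes every element to the more general type.
mix.int <- c(TRUE, 5L)    # logical + integer
mix.dbl <- c(5L, 2.5)     # integer + double
mix.chr <- c(2.5, "a")    # double + character

print(typeof(mix.int))    # "integer"
print(typeof(mix.dbl))    # "double"
print(typeof(mix.chr))    # "character"

# class() reports "numeric" for a double vector, while typeof() reports "double".
print(class(mix.dbl))     # "numeric"
```

Note that TRUE becomes 1L in mix.int, which is why logical values can take part in arithmetic.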

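Assigning NULL to a vector discards its contents, although the variable name itself remains defined. To remove the variable entirely, use rm(), for example rm(Vec1, Vec2).

    Vec1 <- NULL
    Vec2 <- NULL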
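
As a rough illustration of the difference between assigning NULL and removing a variable with rm(), consider the sketch below. The name scratch.vec and the two helper variables are hypothetical and used only for this example.

```r
scratch.vec <- c(1, 2, 3)

# Assigning NULL empties the vector, but the name is still bound.
scratch.vec <- NULL
after.null <- exists("scratch.vec", inherits = FALSE)
print(is.null(scratch.vec))   # TRUE: the variable now holds NULL

# rm() removes the binding itself.
rm(scratch.vec)
after.rm <- exists("scratch.vec", inherits = FALSE)

print(after.null)             # TRUE: the name existed after the NULL assignment
print(after.rm)               # FALSE: the name is gone after rm()
```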

#### Accessing Vector Elements

Elements of a vector can be accessed by using their respective indexes, which start at 1. Square brackets [ ] are used to specify the indexes of the elements to be accessed.

For example:

    x <- c("Jan","Feb","March","Apr","May","June","July")
    y <- x[c(3,2,7)]
    print(y)

Output:

    [1] "March" "Feb"   "July"

We can also use logical indexing, negative indexing, and index vectors that contain 0 to access the elements of a vector:

For example:

    x <- c("Jan","Feb","March","Apr","May","June","July")
    y <- x[c(TRUE,FALSE,TRUE,FALSE,FALSE,TRUE,TRUE)]
    z <- x[c(-3,-7)]
    w <- x[c(0,0,0,1,0,0,1)]
    print(y)
    print(z)
    print(w)

Output:

    [1] "Jan"   "March" "June"  "July"          (Elements at the TRUE positions are kept)
    [1] "Jan"  "Feb"  "Apr"  "May"  "June"      (Elements at the negative indexes are dropped)
    [1] "Jan" "Jan"                             (Each 0 is ignored; each 1 selects the first element)
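
A few edge cases of indexing are worth seeing side by side. This is a minimal sketch; the vector name mon is an assumption made for the example.

```r
mon <- c("Jan","Feb","March","Apr","May","June","July")

print(mon[0])               # character(0): index 0 selects nothing
print(mon[c(0, 2)])         # "Feb": the 0 is silently dropped
print(mon[10])              # NA: the index is beyond the length of the vector
print(mon[mon != "Feb"])    # logical index built from a comparison
```

The last line shows the usual way logical indexes arise in practice: they are computed from a condition rather than typed by hand.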

#### Vector Arithmetic

We can perform addition, subtraction, multiplication, and division on vectors having the same number of elements in the following ways:

    v1 <- c(4,6,7,31,45,9)
    v2 <- c(54,1,10,86,14,57)
    add.v <- v1+v2
    print(add.v)
    sub.v <- v1-v2
    print(sub.v)
    multi.v <- v1*v2
    print(multi.v)
    divi.v <- v1/v2
    print(divi.v)

Output:

    [1]  58   7  17 117  59  66
    [1] -50   5  -3 -55  31 -48
    [1]  216    6   70 2666  630  513
    [1] 0.07407407 6.00000000 0.70000000 0.36046512 3.21428571 0.15789474

#### Recycling Vector Elements

If arithmetic operations are performed on vectors of unequal lengths, the elements of the shorter vector are recycled (repeated from the start) until it matches the length of the longer one. For example:

    v1 <- c(8,7,6,5,0,1)
    v2 <- c(7,15)
    add.v <- v1+v2    # v2 is treated as c(7,15,7,15,7,15)
    print(add.v)
    sub.v <- v1-v2
    print(sub.v)

Output:

    [1] 15 22 13 20  7 16
    [1]   1  -8  -1 -10  -7 -14
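
Recycling still happens when the longer length is not a multiple of the shorter one, but R then raises a warning. The sketch below captures that warning so it can be inspected; the names long.v, short.v, and warn.text are illustrative assumptions.

```r
long.v  <- c(1, 2, 3, 4, 5)
short.v <- c(10, 20)

warn.text <- NULL
result <- withCallingHandlers(
  long.v + short.v,                 # short.v is recycled as 10, 20, 10, 20, 10
  warning = function(w) {
    warn.text <<- conditionMessage(w)   # keep the warning text for inspection
    invokeRestart("muffleWarning")      # continue without printing the warning
  }
)

print(result)      # 11 22 13 24 15
print(warn.text)   # the text of the recycling warning
```

Because such a warning often signals a mistake in the data, it is usually safer to make the lengths match explicitly than to rely on partial recycling.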

#### Sorting a Vector

We can sort the elements of a vector by using the **sort()** function in the following way.

    v <- c(4,78,-45,6,89,678)
    sort.v <- sort(v)
    print(sort.v)

    # Sort the elements in the reverse order
    revsort.v <- sort(v, decreasing = TRUE)
    print(revsort.v)

    # Sorting character vectors
    v <- c("Jan","Feb","March","April")
    sort.v <- sort(v)
    print(sort.v)

    # Sorting character vectors in reverse order
    revsort.v <- sort(v, decreasing = TRUE)
    print(revsort.v)

Output:

    [1] -45   4   6  78  89 678
    [1] 678  89  78   6   4 -45
    [1] "April" "Feb"   "Jan"   "March"
    [1] "March" "Jan"   "Feb"   "April"

**List in R Programming**

A list in R programming is a heterogeneous data structure, which means that it can contain elements of different data types. It accepts numbers, characters, vectors, other lists, and even matrices and functions. It is created using the **list()** function.

For example:

    list1 <- list("Sam", "Green", c(8,2,67), TRUE, 51.99, 11.78, FALSE)
    print(list1)

Output:

    [[1]]
    [1] "Sam"

    [[2]]
    [1] "Green"

    [[3]]
    [1]  8  2 67

    [[4]]
    [1] TRUE

    [[5]]
    [1] 51.99

    [[6]]
    [1] 11.78

    [[7]]
    [1] FALSE

#### Accessing Elements of a List

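Elements of a list can be accessed by using the indices of those elements. Single brackets [ ] return a sub-list that contains the selected element, which is why each output below starts with [[1]].

For example: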

    list2 <- list(matrix(c(3,9,5,1,-2,8), nrow = 2), c("Jan","Feb","Mar"), list(3,4,5))
    print(list2[1])
    print(list2[2])
    print(list2[3])

Output:

    (First element of the list)
    [[1]]
         [,1] [,2] [,3]
    [1,]    3    5   -2
    [2,]    9    1    8

    (Second element of the list)
    [[1]]
    [1] "Jan" "Feb" "Mar"

    (Third element of the list)
    [[1]]
    [[1]][[1]]
    [1] 3

    [[1]][[2]]
    [1] 4

    [[1]][[3]]
    [1] 5
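
Single brackets return a sub-list, while double brackets return the element itself. The following sketch contrasts the two; the list items and the variable names are assumptions chosen for the example.

```r
items <- list(c(8, 2, 67), "Green", TRUE)

sub.list <- items[1]      # single brackets: a list of length 1
element  <- items[[1]]    # double brackets: the numeric vector itself

print(class(sub.list))    # "list"
print(class(element))     # "numeric"

# Only the extracted element can be indexed like a vector.
print(element[3])         # 67
print(items[[1]][3])      # 67, the same lookup in one step
```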

#### Adding, Deleting elements of a List

We can add an element to the end of a list by assigning a value to the next free index, and delete an element by assigning NULL to its index.

For example:

    list2 <- list(matrix(c(3,9,5,1,-2,8), nrow = 2), c("Jan","Feb","Mar"), list(3,4,5))
    list2[4] <- "HELLO"
    print(list2[4])

Output:

    [[1]]
    [1] "HELLO"

Similarly, assigning NULL removes the element, so the fourth position is empty afterwards:

    list2[4] <- NULL
    print(list2[4])

Output:

    [[1]]
    NULL
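
Elements do not have to be added or removed at the end. As a hedged sketch, base R's append() function with its after argument inserts at a chosen position, and assigning NULL to any index removes that element. The list seasons and the helper variables are hypothetical.

```r
seasons <- list("Spring", "Autumn", "Winter")

# Insert "Summer" after the first element.
seasons <- append(seasons, list("Summer"), after = 1)
after.insert <- unlist(seasons)
print(after.insert)    # "Spring" "Summer" "Autumn" "Winter"

# Delete the first element; the remaining elements shift left.
seasons[1] <- NULL
after.delete <- unlist(seasons)
print(after.delete)    # "Summer" "Autumn" "Winter"
```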

#### Updating Elements of a List

To update a value in a list, use the following syntax:

    list2[3] <- "Element Updated"
    print(list2[3])

Output:

    [[1]]
    [1] "Element Updated"

**Matrix in R Programming**

A matrix in R programming is a 2-dimensional data structure that is homogeneous, which means that it only accepts elements of the same data type. If elements of different data types are passed, they are coerced to a common type. It is created using the matrix() function.

The basic syntax to create a matrix is given below:

    matrix(data, nrow, ncol, byrow, dimnames)

where:

- data = the input elements of the matrix, given as a vector.
- nrow = the number of rows to be created.
- ncol = the number of columns to be created.
- byrow = a logical value; if TRUE, the matrix is filled row by row instead of column by column (the default).
- dimnames = a list holding the row names and the column names.

For example:

    M1 <- matrix(c(1:9), nrow = 3, ncol = 3, byrow = TRUE)
    print(M1)

Output:

         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    4    5    6
    [3,]    7    8    9

With byrow = FALSE, the same values are filled column by column:

    M2 <- matrix(c(1:9), nrow = 3, ncol = 3, byrow = FALSE)
    print(M2)

Output:

         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9

By using row and column names, a matrix can be created as follows:

    rownames <- c("row1", "row2", "row3")
    colnames <- c("col1", "col2", "col3")
    M3 <- matrix(c(1:9), nrow = 3, byrow = TRUE, dimnames = list(rownames, colnames))
    print(M3)

Output:

         col1 col2 col3
    row1    1    2    3
    row2    4    5    6
    row3    7    8    9

#### Accessing Elements of a Matrix

To access the elements of a matrix, specify the row index followed by the column index inside square brackets. For the matrix M3 created above:

    print(M3[1,1])
    print(M3[3,3])
    print(M3[2,3])

Output:

    [1] 1    (Element at first row and first column)
    [1] 9    (Element at third row and third column)
    [1] 6    (Element at second row and third column)
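
Leaving one index empty selects a whole row or column, and dimension names can be used in place of numbers. The sketch below rebuilds a matrix equivalent to M3 under the hypothetical name demo.m so that it runs on its own.

```r
demo.m <- matrix(1:9, nrow = 3, byrow = TRUE,
                 dimnames = list(c("row1", "row2", "row3"),
                                 c("col1", "col2", "col3")))

print(demo.m[2, ])              # entire second row: 4 5 6
print(demo.m[, "col3"])         # entire third column, selected by name: 3 6 9
print(demo.m["row1", "col2"])   # single element selected by names: 2
print(demo.m[1:2, 2:3])         # sub-matrix: rows 1 to 2, columns 2 to 3
print(dim(demo.m))              # 3 3
```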

**Data Frame**

A data frame in R programming is a 2-dimensional array-like structure that also resembles a table, in which each column contains values of one variable and each row contains one set of values from each column.

A data frame has the following characteristics:

- The column names of a data frame should not be empty.
- Row names should be unique.
- Data stored in a data frame can be numeric, factor or character type.
- Each column should contain the same number of data items.
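
These characteristics can be checked on a small example. The following is only a sketch: the data frame staff and its columns are invented for illustration, and stringsAsFactors = FALSE is passed explicitly so that the result does not depend on the R version.

```r
staff <- data.frame(id = 1:3,
                    name = c("Ann", "Bo", "Cy"),
                    active = c(TRUE, FALSE, TRUE),
                    stringsAsFactors = FALSE)

print(names(staff))            # column names: "id" "name" "active"
print(sapply(staff, class))    # each column has its own single type
print(dim(staff))              # 3 rows, 3 columns

# Columns whose lengths cannot be reconciled are rejected with an error.
err.text <- tryCatch(
  data.frame(id = 1:3, name = c("Ann", "Bo")),
  error = function(e) conditionMessage(e)
)
print(err.text)
```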

#### Creating a Data Frame

Use the following syntax for creating a data frame in R programming:

    empid <- c(1:4)
    empname <- c("Sam","Rob","Max","John")
    empdept <- c("Sales","Marketing","HR","R & D")
    emp.data <- data.frame(empid,empname,empdept)
    print(emp.data)

Output:

| Sl.No. | empid | empname | empdept |
|---|---|---|---|
| 1 | 1 | Sam | Sales |
| 2 | 2 | Rob | Marketing |
| 3 | 3 | Max | HR |
| 4 | 4 | John | R & D |

#### Extracting Columns/Rows from a Data Frame

To extract specific columns from a data frame, use the following syntax:

    result <- data.frame(emp.data$empname,emp.data$empdept)
    print(result)

Output:

| Sl.No. | emp.data.empname | emp.data.empdept |
|---|---|---|
| 1 | Sam | Sales |
| 2 | Rob | Marketing |
| 3 | Max | HR |
| 4 | John | R & D |

To extract specific rows from a data frame, use the following syntax:

    result <- emp.data[1:2,]
    print(result)

Output:

| Sl.No. | empid | empname | empdept |
|---|---|---|---|
| 1 | 1 | Sam | Sales |
| 2 | 2 | Rob | Marketing |

The following code extracts the first and third rows, restricted to the second and third columns.

    result <- emp.data[c(1,3),c(2,3)]
    print(result)

Output:

| Sl.No. | empname | empdept |
|---|---|---|
| 1 | Sam | Sales |
| 3 | Max | HR |
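
Rows and columns can also be selected by column name or by a condition instead of by position. The sketch below recreates the employee data under the hypothetical name staff.df so that it is self-contained; the other variable names are assumptions as well.

```r
staff.df <- data.frame(empid = 1:4,
                       empname = c("Sam", "Rob", "Max", "John"),
                       empdept = c("Sales", "Marketing", "HR", "R & D"),
                       stringsAsFactors = FALSE)

by.name   <- staff.df[, c("empname", "empdept")]   # columns selected by name
hr.rows   <- staff.df[staff.df$empdept == "HR", ]  # rows selected by a condition
names.vec <- staff.df[["empname"]]                 # one column as a plain vector

print(by.name)
print(hr.rows)
print(names.vec)
```

Selecting by name tends to be more robust than selecting by position, because it keeps working if columns are later reordered.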

#### Adding a Column to a Data Frame

To add a salary column to the above Data Frame, use the following syntax:

    emp.data$salary <- c(20000,30000,40000,27000)
    n <- emp.data
    print(n)

Output:

| Sl.No. | empid | empname | empdept | salary |
|---|---|---|---|---|
| 1 | 1 | Sam | Sales | 20000 |
| 2 | 2 | Rob | Marketing | 30000 |
| 3 | 3 | Max | HR | 40000 |
| 4 | 4 | John | R & D | 27000 |

#### Adding a Row to a Data Frame

To add new rows to an existing data frame, create a second data frame that contains the new rows and has the same columns, and then combine the two row-wise using the rbind() function. The result is the final data frame.

**Creating a new Data Frame**

    emp.newdata <- data.frame(
      empid = c(5:7),
      empname = c("Frank","Tony","Eric"),
      empdept = c("IT","Operations","Finance"),
      salary = c(32000,51000,45000)
    )

**Merging the Created Data Frame with the Existing One**:

    emp.finaldata <- rbind(emp.data,emp.newdata)
    print(emp.finaldata)

Output:

| Sl.No. | empid | empname | empdept | salary |
|---|---|---|---|---|
| 1 | 1 | Sam | Sales | 20000 |
| 2 | 2 | Rob | Marketing | 30000 |
| 3 | 3 | Max | HR | 40000 |
| 4 | 4 | John | R & D | 27000 |
| 5 | 5 | Frank | IT | 32000 |
| 6 | 6 | Tony | Operations | 51000 |
| 7 | 7 | Eric | Finance | 45000 |
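
rbind() expects both data frames to have the same column names. The following sketch, which uses small hypothetical data frames (base.df, extra.df, bad.df), shows a successful combination and captures the error raised when the names differ.

```r
base.df  <- data.frame(id = 1:2, dept = c("HR", "IT"), stringsAsFactors = FALSE)
extra.df <- data.frame(id = 3L, dept = "Ops", stringsAsFactors = FALSE)

combined <- rbind(base.df, extra.df)   # same column names: rows are appended
print(combined)

# The second column is named "team" instead of "dept", so rbind() fails.
bad.df <- data.frame(id = 4L, team = "Ops", stringsAsFactors = FALSE)
err.text <- tryCatch(
  rbind(base.df, bad.df),
  error = function(e) conditionMessage(e)
)
print(err.text)
```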

**Factor**

Factors in R programming are used in data analysis for statistical modeling. They categorize the unique values in a column, such as “Male”, “Female”, “TRUE”, and “FALSE”, and store them as levels. They can store both strings and integers. They are useful in columns that have a limited number of unique values.

Factors can be created using the **factor()** function and they take vectors as inputs.

For example:

    data <- c("Male","Female","Male","Child","Child","Male","Female","Female")
    print(data)
    factor.data <- factor(data)
    print(factor.data)

Output:

    [1] "Male"   "Female" "Male"   "Child"  "Child"  "Male"   "Female" "Female"
    [1] Male   Female Male   Child  Child  Male   Female Female
    Levels: Child Female Male
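
Internally, a factor stores one integer code per element together with the vector of levels. The sketch below inspects both; the variable groups is an assumption made for the example.

```r
groups <- factor(c("Male", "Female", "Male", "Child"))

print(levels(groups))       # "Child" "Female" "Male": levels are sorted
print(nlevels(groups))      # 3
print(as.integer(groups))   # 3 2 3 1: the position of each value among the levels
print(table(groups))        # number of occurrences of each level
```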

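In R versions before 4.0.0, data.frame() treats text columns as categorical data by default and converts them to factors. From R 4.0.0 onward, text columns stay as character vectors unless stringsAsFactors = TRUE is passed to data.frame().

For example, when the emp.finaldata data frame is built with its text columns stored as factors, empdept is a factor: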

    print(is.factor(emp.finaldata$empdept))
    print(emp.finaldata$empdept)

Output:

    [1] TRUE
    [1] Sales      Marketing  HR         R & D      IT         Operations Finance
    Levels: HR Marketing R & D Sales Finance IT Operations
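
Whether a text column becomes a factor depends on the stringsAsFactors argument of data.frame(), whose default differs between R versions. Setting it explicitly, as in this sketch, avoids relying on the default; the data frames and the dept column are hypothetical.

```r
depts <- c("HR", "IT", "HR")

as.factor.df <- data.frame(dept = depts, stringsAsFactors = TRUE)
as.text.df   <- data.frame(dept = depts, stringsAsFactors = FALSE)

print(is.factor(as.factor.df$dept))   # TRUE
print(levels(as.factor.df$dept))      # "HR" "IT"
print(is.factor(as.text.df$dept))     # FALSE

# A character column can be converted to a factor afterwards.
as.text.df$dept <- factor(as.text.df$dept)
print(is.factor(as.text.df$dept))     # TRUE
```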

In this tutorial, we learned what data structures in R programming are, their different types, and how to perform simple data manipulation using data structures. In the next session, we are going to talk about Control Flow statements in R. Let’s meet there!