R Data Structures User Handbook
Data structures organize and store data in a computer's memory so that it can be accessed and modified efficiently. R is a programming language that supports a specific set of data structures. Data structures are used throughout industry and are implemented in programming languages such as R.
This Data Structures in R cheat sheet covers the basic concepts and commands you need to get started. It is useful for beginners and experienced users alike, as it provides a quick overview of the most important concepts.
Data Structure: A way of organizing data that defines both the items stored and their relationships with each other
R Programming: A programming language used mainly by data scientists and favored by people with a background in statistics and mathematics. In R, functions and datasets are bundled into packages, and installed packages are kept in a library
Types of R objects:
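- Vector: The basic data structure in R is the vector. It comes in two forms:
- Atomic vector, whose elements all have the same type, and
- List, whose elements can have different types
- The basic way to create a vector is the c() function, e.g. c(1, 2, 3). R is case-sensitive, so the function name must be written in lowercase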
- Matrix: A matrix is a collection of values of the same type (usually numbers) arranged into a fixed number of rows and columns. It is created with the matrix() function
- Array: An array is R's multi-dimensional data structure. It can store data in more than two dimensions and can be thought of as a stack of matrices
- List: A list is an object that can contain elements of different types, such as strings, numbers, vectors, and other lists. It is created using the list() function
- Data Frames: A data frame holds tabular data in which each row is a case and each column is an observation or measurement recorded for the cases. It is used for storing data tables and is implemented as a list of vectors of equal length
Data tables:
A data table extends and enhances the functionality of data frames. It is provided by the data.table package
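The object types described above can be told apart with a few base R checks. The following is a minimal sketch; the variable names (`atomic_vec`, `mixed_list`, `m`, `arr`, `people`) and all values are illustrative assumptions, not part of the original cheat sheet.

```r
# Atomic vector: every element has the same type
atomic_vec <- c(1, 2, 3)
is.atomic(atomic_vec)      # TRUE
typeof(atomic_vec)         # "double"

# List: elements may have different types, including other lists
mixed_list <- list("a", 1:3, list(TRUE))
is.list(mixed_list)        # TRUE

# Matrix: two dimensions, filled column by column by default
m <- matrix(1:6, nrow = 2)
dim(m)                     # 2 3

# Array: can have more than two dimensions
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)                   # 2 3 4

# Data frame: a list of equal-length vectors, one per column
people <- data.frame(name = c("Ann", "Bob"), age = c(31, 27))
is.list(people)            # TRUE
nrow(people)               # 2
```

A data table is left out here because it needs the data.table package, whereas everything in this sketch runs in base R.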
Syntax for the use of R data structures:
Vector:
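- Create a vector
v1 <- c(1, 2, 3)
- Get the length
length(v1)
- Check if all or any is true
all(v1 > 0); any(v1 > 2) # all() and any() expect logical input
- Integer indexing
v1[1:3]; v1[c(1, 3)]
- Logical indexing: replace missing values with 0
v1[is.na(v1)] <- 0
- Naming
c(first = 'a', ..) or names(v1) <- c('first', ..)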
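The vector commands above can be tried end to end on a small example. This is only a sketch: the vector `scores`, its values, and the element names are assumptions chosen for illustration.

```r
# Example vector with one missing value (values are illustrative)
scores <- c(10, NA, 30)

length(scores)                 # 3

# Integer indexing: positions 1 and 3
scores[c(1, 3)]                # 10 30

# Logical indexing: replace missing values with 0
scores[is.na(scores)] <- 0
scores                         # 10 0 30

# all() and any() work on logical vectors
all(scores >= 0)               # TRUE
any(scores > 20)               # TRUE

# Name the elements, then look one up by name
names(scores) <- c("first", "second", "third")
scores[["third"]]              # 30
```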
List:
- Create a list
list1 <- list(first = 'a', …)
- Create an empty list of length 3
vector(mode = 'list', length = 3)
- Get an element by position or by name
list1[[1]] or list1[['first']]
- Append using numeric index
list1[[6]] <- 2
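The list operations above have a few behaviors that are easy to miss, such as case-sensitive names and padding when an element is assigned past the end. The sketch below illustrates them; the list `settings` and its contents are hypothetical.

```r
# A named list mixing types (names and values are illustrative)
settings <- list(first = "a", second = 1:3)

# [[ ]] extracts the element itself; [ ] returns a sub-list
settings[["first"]]            # "a"
class(settings["first"])       # "list"

# Names are case-sensitive: a wrong case returns NULL
is.null(settings[["First"]])   # TRUE

# Pre-allocate an empty list of length 3
empty <- vector(mode = "list", length = 3)
length(empty)                  # 3

# Assigning past the end pads the gap with NULL elements
settings[[6]] <- 2
length(settings)               # 6
is.null(settings[[4]])         # TRUE
```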
Data frame:
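- Create a data frame from vectors of equal length
df1 <- data.frame(col1 = v1, col2 = v2, col3 = v3)
- Dimensions
nrow(df1); ncol(df1); dim(df1)
- Get or set row names
rownames(df1)
rownames(df1) <- c(…)
- Preview the first or last rows
head(df1, n = 10); tail(…)
- Check the class
class(df1) # is data.frame
- Get columns by name or by position
df1['col1'] or df1[1]
df1[c('col1', 'col3')] or df1[c(1, 3)]
- Get rows and columns
df1[c(1, 3), 2:3]
# returns data from rows 1,3 and columns 2,3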
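To make the data frame commands above concrete, here is a small runnable sketch. The contents of `v1`, `v2`, and `v3` and the row names are assumptions made for illustration only.

```r
# Three equal-length vectors (values are illustrative)
v1 <- c(1, 2, 3)
v2 <- c("a", "b", "c")
v3 <- c(TRUE, FALSE, TRUE)

df1 <- data.frame(col1 = v1, col2 = v2, col3 = v3)

dim(df1)                       # 3 3
rownames(df1) <- c("r1", "r2", "r3")

# Single brackets with one index keep the data.frame class
class(df1["col1"])             # "data.frame"

# Double brackets (or $) return the underlying vector
df1[["col1"]]                  # 1 2 3

# Rows 1 and 3, columns 2 and 3
subset_df <- df1[c(1, 3), 2:3]
dim(subset_df)                 # 2 2
rownames(subset_df)            # "r1" "r3"
```

Note the difference between `df1["col1"]`, which returns a one-column data frame, and `df1[["col1"]]`, which returns the column as a plain vector.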
Data table:
library(data.table) # data.table is an add-on package, not part of base R
- To create data table from data.frame:
dt1 <- data.table(df1)
- Get a column
dt1[, 'col1', with = FALSE] or dt1[, list(col1)]
- Show info for each data.table in memory:
tables()
- Show the key
key(dt1)
- Create index for col1 and reorder data according to col1:
setkey(dt1, col1)
- Use the key to select rows
dt1[c('col1value1', 'col1value2'), ]
- Select on multiple keys
dt1[J('1', c('2', '3')), ]
- Aggregate by group
dt1[, list(col1 = mean(col1)), by = col2]
dt1[, list(col1 = mean(col1), col2sum = sum(col2)), by = list(col3, col4)]
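The key and aggregation commands above can be followed on a tiny table. This sketch assumes the data.table package is installed; the column contents and the result name `means` are illustrative assumptions.

```r
# Assumes the data.table package is installed: install.packages("data.table")
library(data.table)

df1 <- data.frame(
  col1 = c("b", "a", "c", "a"),
  col2 = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
dt1 <- data.table(df1)

# Setting a key sorts the table by col1 and enables fast lookups
setkey(dt1, col1)
key(dt1)                       # "col1"
dt1$col1                       # "a" "a" "b" "c"

# Select the rows whose key equals "a"
rows_a <- dt1[J("a"), ]
nrow(rows_a)                   # 2

# Grouped aggregation: mean of col2 for each value of col1
means <- dt1[, list(mean_col2 = mean(col2)), by = col1]
means$mean_col2                # 30 10 30
```

Because `setkey()` reorders the table in place, `dt1` itself is sorted afterwards; no reassignment is needed.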
Matrix:
- Create a matrix
matrix1 <- matrix(1:10, nrow = 5)
# fills rows 1 to 5, column 1 with 1:5, and column 2 with 6:10
- Matrix multiplication
matrix1 %*% t(matrix2)
# where t() is transpose; matrix2 must have the same number of columns as matrix1
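The fill order and the dimension rule for matrix multiplication are easiest to see with actual numbers. In this sketch, `matrix2` is a hypothetical matrix of ones and `by_row` and `product` are names introduced only for illustration.

```r
# 5 x 2 matrix, filled column by column
matrix1 <- matrix(1:10, nrow = 5)
matrix1[1, ]                   # 1 6

# byrow = TRUE fills row by row instead
by_row <- matrix(1:10, nrow = 5, byrow = TRUE)
by_row[1, ]                    # 1 2

# A second 5 x 2 matrix (hypothetical), so t(matrix2) is 2 x 5
matrix2 <- matrix(1, nrow = 5, ncol = 2)

# (5 x 2) %*% (2 x 5) gives a 5 x 5 result
product <- matrix1 %*% t(matrix2)
dim(product)                   # 5 5
product[1, 1]                  # 7
```

Note that `%*%` is true matrix multiplication, whereas `*` multiplies element by element and requires matrices of the same shape.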
With this, we come to the end of the Data Structures in R cheat sheet. To gain in-depth knowledge, check out our tutorials on data manipulation with R, data visualization in R, and advanced analytics topics such as regression and data mining using RStudio. You will work on real-life projects and assignments to master data analytics.