Python NumPy Tutorial for Beginners

In this Python NumPy tutorial, we will cover one of the most robust and commonly used Python libraries: NumPy. A Python library is a collection of script modules that are accessible to a Python program. It simplifies programming and removes the need to rewrite commonly used commands again and again. So, what is NumPy in Python? NumPy stands for ‘Numerical Python’. It provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and much more.


Some of the key features of NumPy are as follows:

  • It provides a powerful N-dimensional array object.
  • It offers sophisticated broadcasting functions.
  • It includes tools for integrating C, C++, and Fortran code.
  • It offers useful linear algebra, Fourier transform, and random number capabilities.

In this Python NumPy tutorial, we will see how to use NumPy Python to analyze data on the Starbucks menu. This data set consists of information related to various beverages available at Starbucks which include attributes like Calories, Total Fat (g), Sodium (mg), Total Carbohydrates (g), Cholesterol (mg), Sugars (g), Protein (g), and Caffeine (mg). Here, we will learn how we can work with NumPy, and we will try to figure out the nutrition facts for the Starbucks menu.

Calories, Total Fat (g), Sodium (mg), Total Carbohydrates (g),Cholesterol (mg), Sugars (g), Protein (g),Caffeine (mg),Nutrition_Value
3,0.1,0,5,0,0,0.3,175,5
70,0.1,5,75,10,9,6,75,5
110,1.5,5,60,21,17,7,85,6
100,0.1,5,70,19,18,6,75,4
5,0,0,5,1,0,0.4,75,5

Here, we have the first few rows of the starbucks.csv file, which we’ll be using throughout this Python NumPy tutorial. The data is in the csv (comma-separated values) format: the fields in each record are separated by commas (,), and rows are separated by new lines. There are approximately 1,800 rows, including the header row, and 9 columns in the file. By now, the basic question of what NumPy in Python is should be answered.

Before we start, here is a quick note on the version: we’ll be using Python 3.5. Our code examples are run in Jupyter Notebook.


Lists of Lists for CSV Data

Before proceeding to NumPy, let’s first read the data set with plain Python. We can do this with the csv.reader object, passing the keyword argument delimiter as “,” so that each line of the file is split into its individual fields. If the data set is in another delimited format, such as an ssv file, pass its separator as the delimiter instead.

import csv

# Read every row of the file into a list of lists of strings.
with open('Starbucks.csv', 'r') as f:
    starbucks = list(csv.reader(f, delimiter=','))
print(starbucks)

It’s always good to lay the data out in a table to make it easier to view:

Calories | Total Fat (g) | Sodium (mg) | Total Carbohydrates (g) | Cholesterol (mg) | Sugars (g) | Protein (g) | Caffeine (mg) | Nutrition_Value
3 | 0.1 | 0 | 5 | 0 | 0 | 0.3 | 175 | 5
70 | 0.1 | 5 | 75 | 10 | 9 | 6 | 75 | 5

The table above shows the first three rows of the file. The first row is the header row, and the following rows hold the values of the different attributes for the beverages at Starbucks. The first element of each row is the Calories, the second is the Total Fat, and so on. We can find the average nutrition value as follows:

  • First of all, we have to extract the last element from each row, except from the header row.
  • Then, we have to convert the extracted elements to the float data type.
  • Then, we have to assign all the extracted elements to the list Nutrition_Value.
  • And then, we have to divide the sum of all elements in Nutrition_Value by the total number of elements in Nutrition_Value to get the mean.

Nutrition_Value = [float(item[-1]) for item in starbucks[1:]]
sum(Nutrition_Value) / len(Nutrition_Value)
5.6360225140712945
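
As a rough sketch of the four steps above, the snippet below computes the mean on the five sample rows shown at the start of this tutorial rather than on the full file. Inlining the rows with io.StringIO is an assumption made so that the sketch runs without starbucks.csv, and the names sample_csv, rows, and mean_nutrition are only illustrative.

```python
import csv
import io

# Assumption: a five-row excerpt of the data, inlined so the sketch runs without the file.
sample_csv = """Calories,Total Fat (g),Sodium (mg),Total Carbohydrates (g),Cholesterol (mg),Sugars (g),Protein (g),Caffeine (mg),Nutrition_Value
3,0.1,0,5,0,0,0.3,175,5
70,0.1,5,75,10,9,6,75,5
110,1.5,5,60,21,17,7,85,6
100,0.1,5,70,19,18,6,75,4
5,0,0,5,1,0,0.4,75,5
"""

# csv.reader accepts any iterable of lines, including an in-memory text buffer.
rows = list(csv.reader(io.StringIO(sample_csv), delimiter=","))

# Skip the header row, take the last field of each row, and convert it to float.
nutrition_values = [float(row[-1]) for row in rows[1:]]

# Mean = sum of the values divided by how many there are.
mean_nutrition = sum(nutrition_values) / len(nutrition_values)
print(mean_nutrition)
```

On the full file, the same steps give the value shown above; the excerpt only keeps the example short.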

Here, we are able to do the calculation the way we wanted, but the code is a little complex, and it won’t be fun to repeat something similar every time we need the average Nutrition_Value. Fortunately, the Python NumPy library makes it much easier to work with our data.

Let’s explore it!

Python NumPy 2-Dimensional Arrays

In NumPy, it is very easy to work with multidimensional arrays. Here in this Python NumPy tutorial, we will dive into various types of multidimensional arrays. Currently, we are focusing on 2-dimensional arrays.

A 2-dimensional array is also called a matrix. It is a collection of rows and columns, and by specifying a row number and a column number, we can easily extract an element from it.

In the 2-dimensional array below, the first row is the header row, and the first column is the Calories column:

Calories | Total Fat (g) | Sodium (mg) | Total Carbohydrates (g) | Cholesterol (mg) | Sugars (g) | Protein (g) | Caffeine (mg) | Nutrition_Value
3 | 0.1 | 0 | 5 | 0 | 0 | 0.3 | 175 | 5
70 | 0.1 | 5 | 75 | 10 | 9 | 6 | 75 | 5

If we pick the element in the first row and the second column, we get the header Total Fat (g). If we pick the element in the third row and the second column, we get 0.1.

In a NumPy array, the rank is the number of dimensions, and each dimension is called an axis. So, in a 2-dimensional array, the first axis is the rows, and the second axis is the columns.

These are the basics of matrices. Now, we will see how we can convert our Python list of lists to a NumPy array in Python.

Creating a Python NumPy Array

The numpy.array function is used to create a NumPy array in Python. Here, we just have to pass in a list of lists, and it will automatically generate a NumPy array with the same number of rows and columns. For easy computation, we want all elements in the array to be floats, so we’ll leave off the header row, which contains strings.

This is one of the limitations of NumPy: all elements in an array have to be of the same data type. If we included the header row, all elements in the array would be read in as strings. So, to do computations such as finding the average Nutrition_Value, we need the elements to be floats.

In the below code:

  • First, we will import NumPy.
  • Then, we will pass the list of lists starbucks into the array function, which converts it into a NumPy array.
  • Here, we will exclude the header row with list slicing.
  • Then, we will specify the keyword argument dtype to make sure that each element is converted to a float. We will explore more about what the dtype is later.

import numpy as np

# np.float has been removed from recent NumPy releases; the built-in float gives the same 64-bit type.
starbucks = np.array(starbucks[1:], dtype=float)
starbucks

Now, to check the number of rows and columns in our data, we will use the shape property of NumPy arrays:

starbucks.shape
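
To make the conversion concrete, here is a minimal sketch on a hand-typed stand-in for the list of lists that csv.reader returns. The three-column rows list is an assumption for illustration, not the real file. It also shows what happens to the data type when the header row is left in.

```python
import numpy as np

# Assumption: a tiny stand-in for the list of lists read from the CSV file.
rows = [
    ["Calories", "Total Fat (g)", "Nutrition_Value"],
    ["3", "0.1", "5"],
    ["70", "0.1", "5"],
    ["110", "1.5", "6"],
]

# Drop the header row and convert every string field to a float.
data = np.array(rows[1:], dtype=float)
print(data.shape)  # (number of rows, number of columns)
print(data.ndim)   # number of axes
print(data.dtype)

# Keeping the header row forces every element to be stored as a string.
with_header = np.array(rows)
print(with_header.dtype.kind)  # "U" means a Unicode string type
```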

Alternative Python NumPy Array Creation Methods
Are there other methods to create a NumPy array? Yes, we can use a variety of methods. First, we will look at creating an array where every element is zero. The code below creates an array with four rows and three columns, where every element is 0. Here, we will be using numpy.zeros:
import numpy as np
empty_array = np.zeros((4,3))
empty_array

An array of all zeros is useful when we need an array of a fixed size but don’t have any values for it yet.

Similarly, we can create an array of all ones.

import numpy as np
All_One_array = np.ones((4,3))
All_One_array

We can also create an array with random numbers using numpy.random.rand. Here’s an example:

import numpy as np
Random_array = np.random.rand(4,3)
Random_array

Creating an array that is completely filled with random numbers can be useful when we want to quickly test our code with sample arrays.

Reading Text Files

In NumPy, we can read csv or other files directly into an array. This can be done using the numpy.genfromtxt function. We will use this function on our initial Starbucks data.

Here is what the code does:

  • To read in the starbucks.csv file, we use the genfromtxt function.
  • We specify the keyword argument delimiter as ‘,’ so that the fields are parsed properly.
  • We specify the keyword argument skip_header=1, which skips the header row.

starbucks = np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)

The resulting starbucks array looks the same as the one we got by reading the file into a list and then converting it to an array of floats. NumPy automatically picks a data type for the elements in the array based on their format.
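
The sketch below tries genfromtxt on the five sample rows from the start of this tutorial and compares the result with the list-based route. Feeding the text through io.StringIO instead of a file name is an assumption made so that the example is self-contained; genfromtxt also accepts file-like objects.

```python
import io

import numpy as np

# Assumption: a five-row excerpt of the data, inlined so the sketch runs without the file.
sample_csv = """Calories,Total Fat (g),Sodium (mg),Total Carbohydrates (g),Cholesterol (mg),Sugars (g),Protein (g),Caffeine (mg),Nutrition_Value
3,0.1,0,5,0,0,0.3,175,5
70,0.1,5,75,10,9,6,75,5
110,1.5,5,60,21,17,7,85,6
100,0.1,5,70,19,18,6,75,4
5,0,0,5,1,0,0.4,75,5
"""

# Route 1: let NumPy parse the text directly, skipping the header line.
from_text = np.genfromtxt(io.StringIO(sample_csv), delimiter=",", skip_header=1)

# Route 2: split the text into a list of lists first, then convert to floats.
lines = sample_csv.strip().split("\n")[1:]
from_list = np.array([line.split(",") for line in lines], dtype=float)

print(from_text.shape)
print(from_text.dtype)
print(np.array_equal(from_text, from_list))  # both routes give the same array
```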

Python NumPy Arrays: Indexing and Slicing

So, how can we index and slice NumPy arrays to retrieve values from them? In NumPy, the indexes for rows and columns start at 0. If we want to select the fifth column, its index is 4; if we want to select the third row, its index is 2; and so on.

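Row \ Column | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
0 | 3 | 0.1 | 0 | 5 | 0 | 0 | 0.3 | 175 | 5
1 | 70 | 0.1 | 5 | 75 | 10 | 9 | 6 | 75 | 5
2 | 110 | 1.5 | 5 | 60 | 21 | 17 | 7 | 85 | 6
3 | 100 | 0.1 | 5 | 70 | 19 | 18 | 6 | 75 | 4
4 | 5 | 0 | 0 | 5 | 1 | 0 | 0.4 | 75 | 5
5 | 50 | 0.1 | 5 | 60 | 8 | 7 | 5 | 75 | 5.5
6 | 5 | 0 | 0 | 0 | 1 | 0 | 0.4 | 75 | 5
7 | 10 | 0 | 0 | 1 | 2 | 0 | 1 | 150 | 5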

Let’s say we want to select the element at row 7 and column 5. Here, we will pass index 6 as the row index and index 4 as the column index:

starbucks = np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[6,4]

Thus, a pair of indexes retrieves a single element.

Suppose we want to select the first five elements from the second column. We can do this with a colon (:). In slicing, a colon indicates that we want to select all elements from the starting index up to, but not including, the ending index.

starbucks= np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[0:5,1]

And suppose we want to select the entire column. By using just the colon (:), with no starting or ending indices, we get the desired result.

starbucks = np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[:,1]

And suppose we want to select the entire array. We can use two colons to select all the rows and columns, although this is rarely needed in practice because the array name alone refers to the same data.

starbucks= np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[:,:]
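
The indexing and slicing forms described in this section can be tried on the eight rows from the indexed table above. Typing the rows in by hand as an array named sample is an assumption made so that the sketch runs without the csv file.

```python
import numpy as np

# Assumption: the eight rows from the indexed table, typed in by hand.
sample = np.array([
    [3, 0.1, 0, 5, 0, 0, 0.3, 175, 5],
    [70, 0.1, 5, 75, 10, 9, 6, 75, 5],
    [110, 1.5, 5, 60, 21, 17, 7, 85, 6],
    [100, 0.1, 5, 70, 19, 18, 6, 75, 4],
    [5, 0, 0, 5, 1, 0, 0.4, 75, 5],
    [50, 0.1, 5, 60, 8, 7, 5, 75, 5.5],
    [5, 0, 0, 0, 1, 0, 0.4, 75, 5],
    [10, 0, 0, 1, 2, 0, 1, 150, 5],
])

single = sample[6, 4]        # row index 6 (seventh row), column index 4 (fifth column)
first_five = sample[0:5, 1]  # rows 0 to 4 of column index 1; index 5 is excluded
whole_column = sample[:, 1]  # every row of column index 1
everything = sample[:, :]    # every row and every column

print(single)
print(first_five)
print(whole_column.shape)
print(everything.shape)
```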

Now, how can we assign values to certain elements in arrays? We can do that by directly assigning the value to a particular element. Here is an example:

starbucks= np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[1,5] =10
starbucks[1,5]

We can even overwrite an entire column in the same way. The code below overwrites the entire sixth column with 10:

starbucks = np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
starbucks[:,5] = 10
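
Assignment is easy to get subtly wrong, so here is a small sketch on an all-zero array (the names grid and rows_demo are illustrative). It contrasts assigning to one element, to a whole column, and to a row slice written without the comma.

```python
import numpy as np

grid = np.zeros((6, 9))

# Assign to a single element: row index 1, column index 5.
grid[1, 5] = 10
print(grid[1, 5])

# Assign to every row of column index 5 (the sixth column).
grid[:, 5] = 10
print(grid[:, 5])

# Without the comma, the slice applies to rows: this overwrites
# the first five rows in all of their columns.
rows_demo = np.zeros((6, 9))
rows_demo[:5] = 10
print(rows_demo[:, 0])
```

The position of the comma decides which axis the colon applies to, so `[:, 5]` and `[:5]` touch very different parts of the array.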

Multidimensional Python NumPy Arrays

So far, we have worked with the starbucks array, which is a 2-dimensional array. However, NumPy lets us work with arrays of any number of dimensions. One of the most common is the 1-dimensional array. Previously, when we sliced the Starbucks data, we created a 1-dimensional array. A 1-dimensional array needs only a single index to retrieve an element. Interestingly, each row and each column of a 2-dimensional array is itself a 1-dimensional array. Just as a list of lists is analogous to a 2-dimensional array, a single list is analogous to a 1-dimensional array. Suppose we slice the Starbucks data and retrieve only the fifth row; as output, we receive a 1-dimensional array.

starbucks = np.genfromtxt("Starbucks.csv", delimiter=",", skip_header=1)
Fifth_starbucks = starbucks[4,:]  # index 4 is the fifth row
Fifth_starbucks

If we want to retrieve an individual element from Fifth_starbucks, we can do that by using a single index.

Fifth_starbucks[0]

Most NumPy functions that we used with multidimensional arrays, such as numpy.random.rand, can be used with 1-dimensional arrays as well. To generate a random vector, we just have to pass a single parameter.

np.random.rand(5)
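
A short sketch of how 1-dimensional arrays fall out of a 2-dimensional one, using the five sample rows from the start of this tutorial typed in by hand (an assumption so that the example runs without the csv file):

```python
import numpy as np

# Assumption: the five sample rows, typed in by hand.
sample = np.array([
    [3, 0.1, 0, 5, 0, 0, 0.3, 175, 5],
    [70, 0.1, 5, 75, 10, 9, 6, 75, 5],
    [110, 1.5, 5, 60, 21, 17, 7, 85, 6],
    [100, 0.1, 5, 70, 19, 18, 6, 75, 4],
    [5, 0, 0, 5, 1, 0, 0.4, 75, 5],
])

fifth_row = sample[4, :]  # one row of a 2-dimensional array
calories = sample[:, 0]   # one column of a 2-dimensional array

print(fifth_row.ndim, fifth_row.shape)  # a single axis of length 9
print(fifth_row[0])                     # a single index retrieves one element
print(calories)

# A single argument to rand gives a 1-dimensional random vector.
random_vector = np.random.rand(5)
print(random_vector.shape)
```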

For most applications, we deal with 1-, 2-, and 3-dimensional arrays, though we may well come across an array with more than three dimensions. Think of a 3-dimensional array as a list of lists of lists.

For a better understanding of this, let’s take the example of the monthly earnings of a supermarket. The month-wise data will be in the form of a list, and for a quick overview, we can arrange the data in quarter-wise and year-wise formats.

The monthly earnings of a supermarket will look something like this:

[400, 250, 300, 470, 560, 630, 820, 740, 605, 340,420,340]

Here, the supermarket has earned $400 in January, $250 in February, and so on. Now, if we split this earning quarter-wise, then it will be a list of lists:

One_Year = [
    [400, 250, 300],
    [470, 560, 630],
    [820, 740, 605],
    [340, 420, 340]
]

Here, we can retrieve the earnings for the month of January by calling One_Year[0][0], and if we want the result for a complete quarter, then we can call One_Year[0] or One_Year[1]. So, this is analogous to a two-dimensional array.

But, if we add the earnings of another year, then the year becomes the third dimension:

Yearly_Earning = [
    [
        [400, 250, 300],
        [470, 560, 630],
        [820, 740, 605],
        [340, 420, 340]
    ],
    [
        [500, 350, 430],
        [430, 760, 640],
        [720, 530, 800],
        [345, 900, 700]
    ]
]

Here, we can retrieve the earnings for the month of January in the first year by calling Yearly_Earning[0][0][0].

So, we need three indexes to retrieve a single element. We have the same case in a 3-dimensional array in NumPy; in fact, we can convert this Yearly_Earning to an array, and then we can get the earnings for the month of January of the first year. It will be as follows:

Yearly_Earning = np.array(Yearly_Earning)
Yearly_Earning[0,0,0]

We will get the result as follows:

400

Now, to know the shape of the array, we will use:

Yearly_Earning.shape

The result will be as follows:

(2, 4, 3)

In 3-dimensional arrays, indexing and slicing work exactly the same way as in 2-dimensional arrays, but here we have to pass in one extra axis.
Suppose we need the earnings for January of all years. Then we can use:

Yearly_Earning[:,0,0]

The result will be:

array([400, 500])

Suppose we need the first-quarter earnings from both years. Then:

Yearly_Earning[:,0,:]

The result will be:

array([[400, 250, 300],
       [500, 350, 430]])

Adding more dimensions can make it much easier to query our data, because the data is organized in a structured way.

Arrays with four or more dimensions follow the same rules: they are indexed and sliced in the same ways.
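
Putting the supermarket example together, the following self-contained sketch builds the 3-dimensional array from the two years of quarterly data given above and runs the same queries. The variable name yearly is illustrative.

```python
import numpy as np

# Axis 0: year, axis 1: quarter, axis 2: month within the quarter.
yearly = np.array([
    [
        [400, 250, 300],
        [470, 560, 630],
        [820, 740, 605],
        [340, 420, 340],
    ],
    [
        [500, 350, 430],
        [430, 760, 640],
        [720, 530, 800],
        [345, 900, 700],
    ],
])

print(yearly.shape)        # (years, quarters, months per quarter)
print(yearly[0, 0, 0])     # January of the first year
print(yearly[:, 0, 0])     # January of every year
print(yearly[:, 0, :])     # first quarter of every year
```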

NumPy in Python Data Types

As discussed earlier in this Python NumPy tutorial, all elements of a NumPy array are stored with a single data type. In our Starbucks example, all elements are float values. NumPy stores values using its own data types, which are different from Python data types like float and str. The reason is that the core of NumPy is written in the C programming language, which stores data differently from Python. NumPy itself maps data types between Python and C and allows us to use NumPy arrays without any conversion hitches.

We can find the data type of a NumPy array by accessing its dtype property:

starbucks.dtype

NumPy provides various data types that correspond to Python data types, like float and str. Some of the important NumPy data types are:

  • float: numeric floating-point data
  • int: integer data
  • string: character data
  • object: Python objects

There are also data types with a suffix that indicates how many bits of memory the data type takes up. For example, int32 is a 32-bit integer data type, and float64 is a 64-bit float data type.

Converting Python NumPy Data Types
To convert an array to a different type, we can use the numpy.ndarray.astype method. This method will make a copy of the actual array and will return a new array with the specified data type. For example, if we want to convert Starbucks data to the int data type, we need to perform this:
starbucks.astype(int)

In the output, we can observe that all elements in the resulting array are integers. To check the name property of the dtype of the resulting array, we will use the following code:

integer_starbucks = starbucks.astype(int)
integer_starbucks.dtype.name

Here, the array has been converted to a 32-bit integer data type, which means that it stores the values as 32-bit integers. The width chosen for int depends on the platform and the NumPy version, so on many systems this reports a 64-bit integer type instead.

If we want more control over the way the array is stored in memory, for example to allow for very large integer values, we can use NumPy dtype objects like numpy.int64 directly:

np.int64

Now, we can directly use these to convert between data types:

integer_starbucks.astype(np.int64)

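
The conversions in this section can be sketched on a tiny array of floats (the values are made up for illustration). Note that converting floats to integers drops the fractional part, and that astype returns a copy and leaves the original untouched.

```python
import numpy as np

# Assumption: three made-up float values for illustration.
values = np.array([5.0, 5.5, 6.9])
print(values.dtype.name)

# astype returns a new array; the fractional part is dropped, not rounded.
as_int = values.astype(int)
print(as_int)
print(as_int.dtype.name)  # platform-dependent default integer width

# Asking for an explicit width removes the platform dependence.
as_int64 = values.astype(np.int64)
print(as_int64.dtype.name)

# Conversions work in the other direction too.
as_float32 = as_int64.astype(np.float32)
print(as_float32.dtype.name)

# The original array is unchanged.
print(values)
```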

NumPy Python Array Operations

It is easy to perform mathematical operations on arrays using NumPy, as the following topics of this Python NumPy tutorial show.

Single Array Math

It is easy to perform basic arithmetic operations on NumPy arrays. We can use the +, -, *, and / operators or the numpy.add(), numpy.subtract(), numpy.multiply(), and numpy.divide() functions to perform addition, subtraction, multiplication, and division, respectively. With the numpy.sqrt() function, we can find the square root of each element in a NumPy array.

Let’s say after the quality check, we want to add 10 to the Nutrition_Value of Starbucks beverages. Here, we will use the following code:

starbucks[:,8] + 10

It is interesting to note that the above operation does not change the starbucks array. Instead, a new 1-dimensional array is returned where 10 has been added to each element in the Nutrition_Value column of the Starbucks data.
If we use ‘+=’ instead of ‘+’, the array is modified in place:

starbucks[:,8] += 10
starbucks[:,8]

In the same way, we can perform other operations. Suppose we want to multiply each Nutrition_Value by 2. It can be done in this way:

starbucks[:,8] * 2

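
The difference between ‘+’ and ‘+=’ is easy to check on a small array. The sketch below uses the Nutrition_Value entries of the five sample rows, typed in by hand as an assumption.

```python
import numpy as np

# Assumption: Nutrition_Value entries of the five sample rows.
nutrition = np.array([5.0, 5.0, 6.0, 4.0, 5.0])

# The + operator returns a new array and leaves the operand alone.
shifted = nutrition + 10
original_after_plus = nutrition.copy()
print(shifted)
print(original_after_plus)

# numpy.add is the function form of the same operation.
print(np.add(original_after_plus, 10))

# numpy.sqrt works element by element.
roots = np.sqrt(np.array([4.0, 9.0]))
print(roots)

# The += operator modifies the array in place.
nutrition += 10
print(nutrition)
```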

Multiple Array Math
We can perform mathematical operations between multiple arrays. These operations will be applied to pairs of elements. Suppose we add the Nutrition_Value column to itself; here’s what we will get:
starbucks[:,8] + starbucks[:,8]

The output is equivalent to starbucks[:,8] * 2. This is because NumPy adds each pair of elements: the first element of the first array is added to the first element of the second array, the second element to the second element, and so on.

We can also use this to multiply arrays. Let’s say we want to pick a beverage that is high in fat, protein, and nutrition value. We can multiply Total Fat, Protein, and Nutrition_Value, and then select the beverage with the highest score.

starbucks[:,1] * starbucks[:,6] * starbucks[:,8]

All the common operators, such as /, *, -, +, and ** (exponentiation), work between arrays. Note that ^ is the bitwise XOR operator in Python, not a power operator.
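
Here is a sketch of element-wise math between arrays, using the Total Fat, Protein, and Nutrition_Value columns of the five sample rows typed in by hand. Using argmax to pick the highest score is one possible way to do the selection step, not something the original walkthrough prescribes.

```python
import numpy as np

# Assumption: three columns taken from the five sample rows.
fat = np.array([0.1, 0.1, 1.5, 0.1, 0.0])
protein = np.array([0.3, 6.0, 7.0, 6.0, 0.4])
nutrition = np.array([5.0, 5.0, 6.0, 4.0, 5.0])

# Adding an array to itself pairs up elements, which equals doubling.
doubled = nutrition + nutrition
print(np.array_equal(doubled, nutrition * 2))

# Multiply the three columns element by element to get one score per row.
score = fat * protein * nutrition
print(score)

# argmax returns the index of the largest score.
best = int(score.argmax())
print(best)

# Exponentiation uses **, applied element by element.
squared = nutrition ** 2
print(squared)
```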

Broadcasting

Until now, we have performed operations on arrays of exactly the same size, and the operations were applied to corresponding elements. But what if the shapes of the two arrays are not the same? NumPy can still perform operations on two arrays of different shapes by using broadcasting. In broadcasting, NumPy tries to match up the elements using certain rules. The essential steps involved in broadcasting are:

  • Compare the last dimension of each array
  • If lengths of the dimension are equal, or one of the dimension lengths is 1, then we keep working
  • If lengths of dimension aren’t equal, and none of the dimension lengths is 1, then there is an error
  • Repeat checking dimensions until the shortest array is out of dimensions.

For example, we can compare the following two array shapes:

X: (60,5)
Y: (5,)

The comparison succeeds here because array X has a trailing dimension of length 5, and array Y also has a trailing dimension of length 5. After that, array Y is out of dimensions. So, for broadcasting, array Y is stretched to become an array with the same shape as array X, and thus the arrays are compatible for mathematical operations.

Let’s take another example that is also compatible.

X: (1,4)
Y: (10,4)

Here, the last dimension of both arrays matches, and array X’s first dimension is of length 1.

Now, for better understanding, we will look at another two arrays that don’t match:

X: (52,52)
Y: (55,55)

Here, the lengths of the dimensions are not equal, and neither array has a dimension of length 1.
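
The broadcasting rules can be written down as a few lines of plain Python. The helper can_broadcast below is hypothetical, written only to mirror the steps listed in this section; it is not a NumPy function. The last lines confirm one compatible case with a real NumPy operation.

```python
import numpy as np


def can_broadcast(shape_x, shape_y):
    """Hypothetical helper: True if two shapes are compatible for broadcasting."""
    # Compare dimensions starting from the last one; zip stops as soon as
    # the shorter shape runs out of dimensions.
    for dim_x, dim_y in zip(reversed(shape_x), reversed(shape_y)):
        # A pair is fine if the lengths match or if either length is 1.
        if dim_x != dim_y and dim_x != 1 and dim_y != 1:
            return False
    return True


print(can_broadcast((60, 5), (5,)))       # trailing dimensions match
print(can_broadcast((1, 4), (10, 4)))     # a length of 1 can be stretched
print(can_broadcast((3, 4), (10, 4)))     # 3 and 10 differ and neither is 1
print(can_broadcast((52, 52), (55, 55)))  # no dimension matches

# A compatible pair in practice: the shorter array is stretched over each row.
stretched = np.zeros((60, 5)) + np.arange(5)
print(stretched.shape)
```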

Let’s illustrate the principle of broadcasting with the help of our Starbucks dataset:

starbucks * np.array([1,2])

The error ‘ValueError: operands could not be broadcast together with shapes (1888,9) (2,)’ appears because the two arrays don’t have a matching trailing dimension.

Here’s an example where the last dimension is matching:

X_array = np.array(
    [
        [3,4],
        [5,7]
    ]
)
Y_array = np.array([5,2])
X_array + Y_array

The elements of Y_array are broadcast over each row of X_array, so the first column of X_array has the first value of Y_array added to it, and so on. The result is array([[8, 6], [10, 9]]).

Python NumPy Array Methods

NumPy provides many methods beyond arithmetic operations for more complex calculations on arrays. One of the most commonly used NumPy array methods is the numpy.ndarray.sum method. By default, this method finds the sum of all elements in an array:

starbucks[:,8].sum()

The total of all values in the Nutrition_Value column is 271.5.
Here, as a keyword argument for the sum method, we can also pass the axis to find sums over an axis.

Suppose we call sum on the starbucks matrix and pass in axis as 0. Then we find the sums over the first axis of the array, which gives us the sum of all values in every column.

It may seem backward that summing over the first axis gives the sum of each column. Another way to think about this is that the specified axis is the one that is ‘going away’.
If we assign ‘axis=0’, this means that we would like the rows to go away, so we find the sum of each column across all of its rows.

starbucks.sum(axis=0)

To verify that our sum is correct, we can check the shape. In our dataset, the shape is (9,), corresponding to the number of columns.

starbucks.sum(axis=0).shape

Here, if we provide ‘axis=1’, then it will find the sums over the second axis of the array, which gives the sum of each row.

starbucks.sum(axis=1)

Other than sum, NumPy has several other methods that work like the sum method, including:

  • ndarray.min: finds the minimum value in an array
  • ndarray.max: finds the maximum value in an array
  • ndarray.std: finds the standard deviation of an array
  • ndarray.mean: finds the mean of an array
  • And many more
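
The effect of the axis argument is easiest to see on a very small array. The 2-by-3 table below is made up for illustration.

```python
import numpy as np

# Assumption: a made-up table with two rows and three columns.
table = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
])

print(table.sum())              # sum of every element
print(table.sum(axis=0))        # rows "go away": one sum per column
print(table.sum(axis=0).shape)  # one entry per column
print(table.sum(axis=1))        # columns "go away": one sum per row

# The other methods accept the same axis argument.
print(table.min(), table.max(), table.mean())
print(table.mean(axis=0))
print(table.std())
```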

Python NumPy Array Comparisons

In NumPy, we can test whether values match certain conditions by using comparison operators like <, >, >=, <=, and ==.
Suppose that in our Starbucks data, we want to check which beverages have a Nutrition_Value greater than 5. We can do this:

starbucks[:,8] > 5

As a result, we receive a Boolean array that tells us which of the beverages have a Nutrition_Value greater than 5. We can do similar things with the other operators. For instance, we can see whether any beverage has a Nutrition_Value equal to 10:

starbucks[:,8] == 10

Subsetting

With a Boolean array and a NumPy array, one of the powerful things we can do is select only certain rows or columns as per our requirement. For example, we will select only those rows from the Starbucks data where the Nutrition_Value of the beverage is greater than 5.

Highly_Nutrition = starbucks[:,8] > 5
starbucks[Highly_Nutrition,:][:3,:]

Here, we have selected, with all their columns, only the first three rows where Highly_Nutrition contains the value True.

So, subsetting makes it easy to filter arrays by certain criteria.

Another example: we want beverages with both a high Protein value and a high Nutrition_Value. To specify multiple conditions, we place each condition in parentheses and separate the conditions with an ampersand (&):

Highly_Nutrition_and_Max_Protein = (starbucks[:,8] > 5) & (starbucks[:,6] > 6)
starbucks[Highly_Nutrition_and_Max_Protein,2:]

We can even combine subsetting and assignment to overwrite certain values in an array:

Highly_Nutrition_and_Max_Protein = (starbucks[:,8] > 5) & (starbucks[:,6] > 6)
starbucks[Highly_Nutrition_and_Max_Protein,2:] = 5
starbucks[Highly_Nutrition_and_Max_Protein,2:]

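
Comparison, subsetting, and assignment can be followed step by step on a small array. The sketch below uses the eight rows from the indexing section, typed in by hand as an assumption, and works on a copy when overwriting so that the original values stay available for inspection.

```python
import numpy as np

# Assumption: the eight rows from the indexing section, typed in by hand.
sample = np.array([
    [3, 0.1, 0, 5, 0, 0, 0.3, 175, 5],
    [70, 0.1, 5, 75, 10, 9, 6, 75, 5],
    [110, 1.5, 5, 60, 21, 17, 7, 85, 6],
    [100, 0.1, 5, 70, 19, 18, 6, 75, 4],
    [5, 0, 0, 5, 1, 0, 0.4, 75, 5],
    [50, 0.1, 5, 60, 8, 7, 5, 75, 5.5],
    [5, 0, 0, 0, 1, 0, 0.4, 75, 5],
    [10, 0, 0, 1, 2, 0, 1, 150, 5],
])

# A comparison returns one Boolean per row.
high = sample[:, 8] > 5
print(high)

# Using the Boolean array as a row index keeps only the True rows.
print(sample[high, :])

# Each condition goes in parentheses; & combines them element by element.
both = (sample[:, 8] > 5) & (sample[:, 6] > 6)
selected = sample[both, 2:]
print(selected)

# Subsetting on the left-hand side overwrites only the selected values.
modified = sample.copy()
modified[both, 2:] = 5
print(modified[2])
```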

Reshaping Python NumPy Arrays

In NumPy, it is very easy to change the shape of arrays while preserving all of their elements. NumPy offers several functions that reshape an array to make its elements easier to access.

One of the simplest ways of reshaping an array is to flip its axes, where columns become rows and vice versa. We can perform this operation with the numpy.transpose function:

np.transpose(starbucks).shape

Another important function is the numpy.ravel function. This function will turn an array into a 1-dimensional representation with a long sequence of values:

starbucks.ravel()

Here, we have another example which will help us better understand and see the ordering of numpy.ravel:

Example_Array_One = np.array(
    [
        [1, 2, 3, 4],
        [5, 6, 7, 8]
    ]
)
Example_Array_One.ravel()

And finally, we are going to use the numpy.reshape function. This function will help us reshape an array to a certain shape as per our requirement. In the below example, we will turn the third row of Starbucks data into a 2-dimensional array with three rows and three columns:

starbucks[2,:].reshape((3,3))

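
The three reshaping functions can be compared side by side in a short sketch. The array third_row holds the values of the third data row from the sample shown earlier, typed in by hand as an assumption.

```python
import numpy as np

example = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
])

# transpose flips the axes: 2 rows by 4 columns becomes 4 rows by 2 columns.
flipped = np.transpose(example)
print(flipped.shape)

# ravel flattens row by row into a 1-dimensional array.
flat = example.ravel()
print(flat)

# Assumption: the third data row of the sample, typed in by hand.
third_row = np.array([110, 1.5, 5, 60, 21, 17, 7, 85, 6])

# reshape keeps the order of the elements and fills the new shape row by row.
grid = third_row.reshape((3, 3))
print(grid)
```

The total number of elements has to stay the same, so nine values fit a 3-by-3 shape but would not fit, say, 2-by-4.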

Combining Python NumPy Arrays

With NumPy, we can easily combine multiple arrays into a single unified array. To perform this task, we can use numpy.vstack, which vertically stacks multiple arrays: the second array’s items are added as new rows to the first array.
Let’s take an example where we want to combine the old nutritional dataset of Starbucks beverages with our existing dataset, which contains information on the current nutritional values of Starbucks beverages.

In the below code, we:

  • Read in starbucks_old data
  • Display the shape of starbucks_old data

import csv
import numpy as np

with open('starbucks_old_data.csv', 'r') as f:
    starbucks_old = list(csv.reader(f, delimiter=','))
# Skip the header row and convert every field to a float.
starbucks_old = np.array(starbucks_old[1:], dtype=float)
starbucks_old.shape

Here we can see that the starbucks_old data has attributes for 196 beverages. We can now combine it with the current Starbucks data.

Now, we will use the vstack function to combine the Starbucks data and the starbucks_old data, and then we will display the shape of the result:

All_beverages = np.vstack((starbucks, starbucks_old))
All_beverages.shape

Here we can observe that the result has 2,084 rows, which is the sum of the number of rows in the Starbucks data and in the starbucks_old data.

Similarly, we can combine arrays horizontally, which means that our number of rows will stay constant, but the columns will be joined. For this purpose, we can use the numpy.hstack function.

Another useful function is numpy.concatenate. It is a general-purpose version of hstack and vstack. To concatenate two arrays, we pass them to concatenate and specify, with the axis keyword argument, the axis that we want to concatenate along. Concatenating along the first axis is similar to vstack, and concatenating along the second axis is similar to hstack:

np.concatenate((starbucks, starbucks_old), axis=0)

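
A minimal sketch of the three combining functions on tiny made-up arrays follows. The names current, old, and extra_column are illustrative stand-ins for the two data sets, not the real files.

```python
import numpy as np

# Assumption: tiny made-up stand-ins for the current and old data sets.
current = np.array([[3.0, 0.1, 5.0], [70.0, 0.1, 5.0]])
old = np.array([[110.0, 1.5, 6.0]])

# vstack adds the rows of the second array below the first.
stacked = np.vstack((current, old))
print(stacked.shape)

# concatenate along axis 0 does the same thing.
same_as_vstack = np.concatenate((current, old), axis=0)
print(np.array_equal(stacked, same_as_vstack))

# hstack joins columns, so the number of rows must match.
extra_column = np.array([[1.0], [2.0]])
wide = np.hstack((current, extra_column))
print(wide.shape)

# concatenate along axis 1 matches hstack for 2-dimensional arrays.
same_as_hstack = np.concatenate((current, extra_column), axis=1)
print(np.array_equal(wide, same_as_hstack))
```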

Conclusion

This brings us to the end of the Python NumPy tutorial. In this tutorial, we learned in detail about the Python NumPy library with the help of a real-world dataset. We also explored how to perform various operations with NumPy, which is one of the most commonly used libraries in Data Science applications.
While we have covered quite a bit of NumPy’s core functionality, there is still a lot more to know about it.

Practice the examples that have been explained in this Python NumPy tutorial. To become a Data Scientist and a successful and productive team member in the workplace, the Python NumPy library is definitely one of the most important tools to learn about and practice. I hope this Python NumPy tutorial helped you.
