0 votes
2 views
in Data Science by (17.6k points)

As of August 2017, pandas DataFrame.apply() is unfortunately still limited to a single core, meaning that a multi-core machine will waste most of its compute time when you run

df.apply(myfunc, axis=1).

How can you use all your cores to run apply on a dataframe in parallel? 

2 Answers

0 votes
by (41.4k points)

The code below applies function f in parallel to column col of dataframe df:

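import multiprocessing as mp

# f must be a module-level function so it can be pickled and sent to the workers.
# On Windows and macOS, run this under an if __name__ == "__main__": guard.
with mp.Pool(mp.cpu_count()) as pool:
    # pool.map blocks until every result is ready and preserves input order.
    df['newcol'] = pool.map(f, df['col'])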
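The question asks about df.apply(myfunc, axis=1), which works on whole rows rather than on a single column. One way to sketch that with the same multiprocessing module is to split the frame into row chunks, run an ordinary apply on each chunk in a worker process, and concatenate the results. The names parallel_apply, _apply_chunk and row_sum below are hypothetical helpers for illustration, not part of pandas, and the row function is assumed to be defined at module level so it can be pickled.

```python
import multiprocessing as mp
from functools import partial

import pandas as pd


def _apply_chunk(chunk, func):
    # Runs inside a worker process: ordinary single-core apply on one chunk.
    return chunk.apply(func, axis=1)


def parallel_apply(df, func, n_workers=None):
    """Row-wise apply spread over several processes (hypothetical helper)."""
    if len(df) == 0:
        # Nothing to split; fall back to the normal apply.
        return df.apply(func, axis=1)
    n_workers = n_workers or mp.cpu_count()
    # Split into contiguous row chunks of roughly equal size, one per worker.
    size = max(1, -(-len(df) // n_workers))  # ceiling division
    chunks = [df.iloc[i:i + size] for i in range(0, len(df), size)]
    with mp.Pool(n_workers) as pool:
        parts = pool.map(partial(_apply_chunk, func=func), chunks)
    # pool.map preserves chunk order, so the original row order is kept.
    return pd.concat(parts)


def row_sum(row):
    # Example row function (an assumption for illustration).
    return row["a"] + row["b"]


if __name__ == "__main__":
    df = pd.DataFrame({"a": range(8), "b": range(8, 16)})
    df["total"] = parallel_apply(df, row_sum, n_workers=2)
    print(df)
```

Sending whole chunks rather than single rows keeps the number of inter-process transfers small, which matters because every chunk has to be pickled on the way to the worker and back.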


0 votes
ago by (3.5k points)

Running apply in parallel is possible and lets DataFrame operations use every available CPU core. By default, apply runs in a single process on one core, but other libraries can spread the workload across the remaining cores.

Parallelization using joblib

Joblib is designed for running CPU-intensive work in parallel. Instead of calling apply, you wrap the function with delayed and run the calls through Parallel.

To install: pip install joblib

Code Implementation

import pandas as pd
from joblib import Parallel, delayed

# Sample DataFrame
df = pd.DataFrame({'A': range(1, 1000001)})

def func(x):
    return x * 2

# Using joblib to parallelize the apply
df['B'] = Parallel(n_jobs=-1)(delayed(func)(x) for x in df['A'])

In this example, n_jobs=-1 tells joblib to use all available cores.
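Dispatching one task per element means a very large number of tiny tasks, and the scheduling overhead can dominate when the function is as cheap as this one. A common variation is to hand each worker a whole slice of the column. The sketch below uses a smaller frame than the example above to keep it quick; process_chunk and n_chunks are hypothetical names introduced for illustration, and the chunk count of 8 is an arbitrary assumption.

```python
import pandas as pd
from joblib import Parallel, delayed


def func(x):
    return x * 2


def process_chunk(chunk):
    # Hypothetical helper: apply func to every value in one slice of the column.
    return chunk.apply(func)


df = pd.DataFrame({'A': range(1, 100001)})

# Assumed chunk count for illustration; tune it to the number of cores.
n_chunks = 8
size = -(-len(df) // n_chunks)  # ceiling division
chunks = [df['A'].iloc[i:i + size] for i in range(0, len(df), size)]

# Each task now processes a whole slice instead of a single value.
parts = Parallel(n_jobs=-1)(delayed(process_chunk)(c) for c in chunks)

# Results come back in submission order; concat rebuilds the full column.
df['B'] = pd.concat(parts)
```

Whether chunking helps depends on how expensive func is: for a function that does real work per value, the per-element version may already be adequate.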

In short, joblib is a flexible way to replace apply on smaller datasets without much hassle.
