0 votes
in AWS by (12.9k points)

I am running some machine learning algorithms on an EMR Spark cluster. Which kind of instance should I use to get the best cost/performance trade-off?

At roughly the same price, I can choose among:

Instance   vCPU  ECU  Memory (GiB)
m3.xlarge  4     13   15
c4.xlarge  4     16   7.5
r3.xlarge  4     13   30.5

Which kind of instance should be used in EMR Spark cluster?

1 Answer

0 votes
by (18.2k points)

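It depends on your requirements and use cases. Of the three instances you list, c4.xlarge offers the most compute (16 ECU) but the least memory (7.5 GiB), r3.xlarge offers the most memory (30.5 GiB), and m3.xlarge sits in between. Spark jobs that cache large datasets tend to benefit from more memory, while CPU-bound training tends to benefit from more compute. Based on the information you have shared, I can suggest a minimum configuration: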

You need 1 master and 2 worker nodes to set up a small distributed cluster. The master mainly coordinates the cluster rather than doing the heavy computation, so it does not need many resources.

You can start with the following and add instances as your needs grow:

  • 1 x master : m5.xlarge - vCPU : 4, RAM : 16 GiB, with EBS storage.
  • 2 x slaves : c5.xlarge - vCPU : 4, RAM : 8 GiB each, with EBS storage (for 16 vCPUs and 32 GiB per node, use c5.4xlarge instead).
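To make that layout concrete, here is a minimal sketch of how it could be written as the `Instances` argument to boto3's EMR `run_job_flow` call. The helper name `build_instance_groups`, the group names, the on-demand market choice, and the release label in the commented-out call are my own assumptions for illustration, so swap in whatever instance types and settings you settle on.

```python
def build_instance_groups(master_type, core_type, core_count):
    """Build an InstanceGroups list in the shape boto3's EMR run_job_flow expects.

    The helper itself is hypothetical; only the dictionary keys come from the EMR API.
    """
    return [
        {
            "Name": "Master",            # free-form label (assumed)
            "InstanceRole": "MASTER",    # coordinates the cluster
            "InstanceType": master_type,
            "InstanceCount": 1,
            "Market": "ON_DEMAND",       # assumption: no spot instances
        },
        {
            "Name": "Core",              # free-form label (assumed)
            "InstanceRole": "CORE",      # worker nodes that run the Spark executors
            "InstanceType": core_type,
            "InstanceCount": core_count,
            "Market": "ON_DEMAND",
        },
    ]


# 1 master and 2 worker nodes, as suggested above.
instances = {
    "InstanceGroups": build_instance_groups("m5.xlarge", "c5.xlarge", 2),
    "KeepJobFlowAliveWhenNoSteps": True,
}

# Launching the cluster needs AWS credentials and the default EMR roles,
# so the call is left commented out here:
#
# import boto3
# emr = boto3.client("emr")
# emr.run_job_flow(
#     Name="spark-ml-cluster",            # hypothetical cluster name
#     ReleaseLabel="emr-6.15.0",          # assumption: pick a current release
#     Applications=[{"Name": "Spark"}],
#     Instances=instances,
#     JobFlowRole="EMR_EC2_DefaultRole",
#     ServiceRole="EMR_DefaultRole",
# )
```

Keeping the instance types as parameters makes it easy to rerun the same job on a memory-optimized or compute-optimized worker type and compare run time against cost before committing to one.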
