Thursday, May 23, 2013

Amazon EMR: How to Add More Than 256 Steps to a Cluster?

If you have been using Amazon EMR for long and complex tasks, you might know that EMR currently limits the number of steps that can be added to a cluster to 256.

But at times it is difficult to stay within 256 steps. The problem at hand may be complex enough that it has to be broken into several steps and run over varied sets of data. Or you might have long-running jobs taking care of multiple tasks for a Hive-based data warehouse. Whatever the reason, you shouldn't get depressed about it, because there is a simple workaround:

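Manually connect to the master node and submit your jobs there, just like you would run them on your local machine! Jobs launched this way are not registered as EMR steps, so they do not count toward the limit.
Yeah, that's it. Simple.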

EMR's CLI already has options that make this easy.
Assuming that you already have a cluster running and have its JobFlowID, follow the steps below to submit jobs directly to the master node:

  1. Copy your executables to the master node.
    To run your job, the jar and any other files it requires must be on the master node. You can copy them there with the EMR CLI's --scp option:
    ruby elastic-mapreduce --jobflow JobFlowID --scp myJob.jar
  2. Execute the hadoop command, just like you would on your local machine.
    This can be done with the EMR CLI's --ssh option:
    ruby elastic-mapreduce --jobflow JobFlowID --ssh ' hadoop jar myJob.jar inputPath outputPath otherArguments'
There are other ways also. Refer here for more.
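
The two steps above can be wrapped in a pair of small shell functions, which helps when the same jar has to be run over many data sets. This is only a sketch under a few assumptions: the elastic-mapreduce Ruby CLI is in the current directory and already configured with your credentials, the copied jar ends up in the directory where the --ssh command runs (as the commands above imply), and the function names, the EMR_CLI variable, the job flow ID and the paths are all made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run Hadoop jobs directly on the EMR master node instead of adding steps.
# EMR_CLI is a hypothetical override hook; by default it is the command used above.
EMR_CLI="${EMR_CLI:-ruby elastic-mapreduce}"

# Copy a local file (for example the job jar) to the master node.
# Usage: copy_to_master JobFlowID file
copy_to_master() {
  local jobflow="$1" file="$2"
  $EMR_CLI --jobflow "$jobflow" --scp "$file"
}

# Run "hadoop jar <jar> <arguments...>" on the master node over SSH.
# Usage: run_on_master JobFlowID jar [arguments...]
run_on_master() {
  local jobflow="$1" jar="$2"
  shift 2
  $EMR_CLI --jobflow "$jobflow" --ssh "hadoop jar $jar $*"
}

# Example (placeholder job flow ID and paths):
#   copy_to_master j-EXAMPLE myJob.jar
#   for day in 2013-05-01 2013-05-02; do
#     run_on_master j-EXAMPLE myJob.jar "input/$day" "output/$day"
#   done
```

Each call to run_on_master waits for the remote command to finish before the loop moves on, so the jobs run one after another, much like steps would.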

