Sunday, March 31, 2013

How to ensure that one file is completely processed by the same mapper?

There are times when we want a particular file to be read and processed entirely by a single mapper. This requirement arises when each file contains sequential data and you want to process all the records of a file in exactly the order they appear in the input file.

So, basically, what we are asking here is: please don't split our input files and distribute the pieces among different mappers. Simple.
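To see what splitting does to the mapper count, here is a toy model in plain Java. It is only a sketch: SplitModel and countSplits are names I made up for illustration, and it assumes the split size equals the block size, whereas real Hadoop also looks at the min/max split-size settings and a small slop factor.

```java
// Toy model of split counting. SplitModel and countSplits are hypothetical
// names for illustration; they are not part of Hadoop.
class SplitModel {

    // Returns how many input splits (and hence mappers) one file yields,
    // under the simplifying assumption that split size == block size.
    static long countSplits(long fileSize, long blockSize, boolean splittable) {
        if (fileSize <= 0 || !splittable) {
            // In this model a non-splittable (or empty) file is one split.
            return 1;
        }
        // Ceiling division: every started block becomes its own split.
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("splittable:     " + countSplits(300 * mb, 128 * mb, true));
        System.out.println("non-splittable: " + countSplits(300 * mb, 128 * mb, false));
    }
}
```

In this model a 300 MB file with 128 MB blocks is cut into 3 splits when it is splittable, so three different mappers would each see only a part of it. When it is not splittable, the whole file goes to one mapper as a single split.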

It is even simpler to achieve this:

You have to create your own subclass of FileInputFormat and override isSplitable() (note that Hadoop spells it with a single "t"), like this:

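import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// TextInputFormat is a FileInputFormat that already supplies a RecordReader.
public class NonSplittableFileInputFormat extends TextInputFormat {

    @Override
    protected boolean isSplitable(JobContext context, Path filename) {
        // Never split: each file becomes one split, hence one mapper.
        return false;
    }
}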

Then pass this class to setInputFormatClass() when you configure the job.
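Here is a rough sketch of what the driver might look like. It assumes the newer org.apache.hadoop.mapreduce API and Hadoop 2 or later (for Job.getInstance). The class name NonSplittableDriver, the job name, and the input/output paths read from args are placeholders of mine. The input format is repeated as a nested class only so that the file compiles on its own, and the job is map-only with the default identity mapper.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class; requires the Hadoop client libraries on the classpath.
public class NonSplittableDriver {

    // Input format that hands every file to a mapper as one unsplit piece.
    public static class NonSplittableFileInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path filename) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] = input directory, args[1] = output directory (placeholders).
        Job job = Job.getInstance(new Configuration(), "non-splittable example");
        job.setJarByClass(NonSplittableDriver.class);

        // This is the call the post refers to.
        job.setInputFormatClass(NonSplittableFileInputFormat.class);

        // Map-only job using the default identity mapper, so each output
        // file holds the records of one input file in their original order.
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Keep in mind the trade-off: one mapper per file means a very large file gets no parallelism within it, so this approach suits cases where record order matters more than per-file throughput.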
