Have you ever thought of this?
We do provide the jar to be executed by Hadoop when we run the hadoop command, don't we?
$ hadoop jar some-jar.jar
So why do we need the following line of code in our driver while setting the Job object's properties:
job.setJarByClass(SomeClass.class);
The answer is very simple. This is how you help Hadoop figure out which jar it should ship to the nodes to perform the Map and Reduce tasks. Your some-jar.jar might have various other jars on its classpath, and your driver code might live in a separate jar from your Mapper and Reducer classes.
So, with the setJarByClass method we tell Hadoop to locate the relevant jar by finding the jar that contains the class passed as its parameter. You should therefore pass either your Mapper implementation, your Reducer implementation, or any other class that sits in the same jar as the Mapper and Reducer. Also make sure that both the Mapper and the Reducer are part of the same jar.
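To make this concrete, here is a minimal driver sketch. The WordCountMapper and WordCountReducer class names are just placeholders for illustration; the point is that setJarByClass names a class that lives in the same jar as the Mapper and Reducer, so Hadoop knows which jar to ship to the nodes.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");

        // Tell Hadoop which jar to ship to the nodes by naming a class
        // that is packaged inside it; the Mapper class works well here.
        job.setJarByClass(WordCountMapper.class);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

You would then submit it with "hadoop jar some-jar.jar WordCountDriver input output", and Hadoop distributes the jar containing WordCountMapper to the task nodes.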