Saturday, April 6, 2013

How to write a file in HDFS using Hadoop

At times you may need to write to HDFS yourself rather than rely on the Hadoop framework's default way of writing output. For example, you may want to create a file in a custom format and write it directly from the Java code of your mapper or reducer, just as you would write an ordinary file to your local drive.
You know how to write a file to the local file system, but what about writing to HDFS?
That is simple too!

Just follow the sample code below; it is pretty much self-explanatory.

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteToHDFS {

    public static void main(String[] args) throws Exception {

        if (args.length < 2) {
            System.out.println("Usage: WriteToHDFS <hdfs-file-path-to-write-into> <text-to-write-in-file>");
            System.out.println("Example: WriteToHDFS 'hdfs://localhost:9000/myFirstSelfWriteFile' 'Hello HDFS world'");
            System.exit(-1);
        }
        try {
            Path path = new Path(args[0]);
            // Resolve the file system from the path's own URI, so an hdfs:// path
            // still works when the configured default file system is the local one.
            FileSystem fileSystem = path.getFileSystem(new Configuration());
            // create(path, true) overwrites the file if it already exists.
            // try-with-resources closes the writer and the underlying stream, even on failure.
            try (BufferedWriter bufferedWriter = new BufferedWriter(
                    new OutputStreamWriter(fileSystem.create(path, true), StandardCharsets.UTF_8))) {
                bufferedWriter.write(args[1]);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
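
Since fileSystem.create(path, true) returns a stream that is an ordinary java.io.OutputStream (Hadoop's FSDataOutputStream extends it), the writing logic can live in a small helper that knows nothing about Hadoop. The following is only a sketch: StreamTextWriter and writeText are hypothetical names, not part of Hadoop, and the in-memory stream in main is a stand-in for the HDFS stream so the snippet runs without a cluster.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Hadoop class): writes text to any OutputStream.
final class StreamTextWriter {

    // Writes the text as UTF-8 and closes the stream when done.
    static void writeText(OutputStream out, String text) throws IOException {
        try (BufferedWriter writer =
                new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the stream returned by fileSystem.create(path, true).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeText(sink, "Hello HDFS world");
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Inside a mapper or reducer, the same helper could be called as writeText(fileSystem.create(path, true), text). Keeping the Hadoop call at the edge like this also makes the writing code easy to unit test.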

If you want to append to an already existing file in HDFS, then instead of calling:
fileSystem.create(path, true)

call the following (append takes no overwrite flag):
fileSystem.append(path)

Note that append only works when the underlying file system supports it; otherwise the call fails with an exception.

Comments:

  1. Thanks a lot for the code. However, I tried to use append(path, true), but it doesn't exist.
    I also tried append(path), but then I got an exception that says:
    java.io.IOException: Not Supported

    Does anyone know how I can append to a file in HDFS?
    THANKS a lot

  2. OK, try using append(Path f, int bufferSize, Progressable progress); see the Javadoc here: http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#append%28org.apache.hadoop.fs.Path%29
