Mastering Hadoop 3

HDFSFileSystemWrite.java

The following program copies a local file to a target location in HDFS:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

public class HDFSFileSystemWrite {
    public static void main(String[] args) throws IOException {
        String sourceURI = args[0]; // path of the local source file
        String targetURI = args[1]; // URI of the destination file

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(targetURI), conf);

        // try-with-resources closes both streams, even if create() or the copy fails
        try (InputStream in = new BufferedInputStream(new FileInputStream(sourceURI));
             FSDataOutputStream out = fs.create(new Path(targetURI))) {
            // Copy with a 4 KB buffer; false leaves closing the streams to the try block
            IOUtils.copyBytes(in, out, 4096, false);
        }
    }
}

FSDataOutputStream provides the getPos() method, which returns the current position in the file. Unlike a read, however, a write in HDFS cannot start from an arbitrary position: data can only be written at the end of the file.

FileSystem also provides methods for creating directories. mkdirs() has a number of overloaded versions in the FileSystem class, the simplest of which is:

public boolean mkdirs(Path f) throws IOException

This method creates the directory along with any parent directories that do not exist. Note that you don't need to call mkdirs() before creating a file with create(), because create() automatically creates any missing directories in the path.
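As a minimal sketch of the behavior described here, the following example calls mkdirs(), then lets create() build a further missing parent directory, and finally reads the write position with getPos(). The class name HDFSMkdirsSketch, the helper writeUnderNewDirectory(), and the file name nested/sample.txt are hypothetical names chosen for illustration. The sketch also assumes the local file system (FileSystem.getLocal()) so that it can be tried without a running cluster; against HDFS you would obtain the FileSystem from an hdfs:// URI instead.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class HDFSMkdirsSketch {

    // Hypothetical helper: creates dir (and any missing parents), writes text to a
    // file below it, and returns the stream position after the write.
    static long writeUnderNewDirectory(FileSystem fs, Path dir, String text) throws IOException {
        // mkdirs() creates dir together with all missing parent directories
        if (!fs.mkdirs(dir)) {
            throw new IOException("Could not create directory " + dir);
        }

        // "nested" is never created explicitly; create() takes care of it
        Path file = new Path(dir, "nested/sample.txt");
        try (FSDataOutputStream out = fs.create(file)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
            // getPos() reports the current position, here the number of bytes written so far
            return out.getPos();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the local file system stands in for HDFS in this sketch
        FileSystem fs = FileSystem.getLocal(new Configuration());
        Path base = new Path(Files.createTempDirectory("hdfs-sketch").toUri());

        long position = writeUnderNewDirectory(fs, new Path(base, "a/b"), "hello");
        System.out.println("Position after write: " + position);
        System.out.println("File exists: " + fs.exists(new Path(base, "a/b/nested/sample.txt")));
    }
}
```

Because the stream can only move forward, the value returned by getPos() grows with each write and cannot be moved backward.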