Operating the HDFS File System from Java
To work with the HDFS file system you need an entry point; for Hadoop, that programming entry point is the FileSystem class.
For example, to create a directory through the API, you first obtain a FileSystem client:

```java
FileSystem.get(new URI("hdfs://vmware-ubuntu-1:9000"), configuration, "myusername");
```

The third argument (the user name) must be supplied; otherwise the call is rejected with a permission error.
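Putting the three pieces together (create a Configuration, obtain a FileSystem, call the API), a minimal standalone sketch of the directory-creation example might look like the following. The class name MkdirDemo and the path /hdfsapi/demo are illustrative only; the URI and user name are the ones from the snippet above and should be adjusted to your cluster.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MkdirDemo {
    public static void main(String[] args) throws Exception {
        // 1) client-side configuration
        Configuration configuration = new Configuration();

        // 2) FileSystem client for the target NameNode, acting as the given user
        FileSystem fileSystem = FileSystem.get(
                new URI("hdfs://vmware-ubuntu-1:9000"), configuration, "myusername");

        // 3) the actual HDFS operation: create a directory (example path)
        boolean created = fileSystem.mkdirs(new Path("/hdfsapi/demo"));
        System.out.println("created: " + created);

        fileSystem.close();
    }
}
```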
Common API operations
```java
package hadoop.hdfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.*;
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Operating the HDFS file system with the Java API.
 * Key steps:
 * 1) create a Configuration
 * 2) obtain a FileSystem
 * 3) call the HDFS API
 */
public class HDFSAPP {

    public static final String HDFS_PATH = "hdfs://swarm-worker1:9000";

    FileSystem fileSystem = null;
    Configuration configuration = null;

    /**
     * Build a client object for the target HDFS cluster.
     * First argument:  the HDFS URI
     * Second argument: client-side configuration parameters
     * Third argument:  the client identity, i.e. the user name
     *
     * @throws URISyntaxException
     */
    @Before
    public void setUp() throws URISyntaxException, IOException, InterruptedException {
        System.out.println("--------setUp---------");
        configuration = new Configuration();
        configuration.set("dfs.replication", "1");
        fileSystem = FileSystem.get(new URI(HDFS_PATH), configuration, "iie4bu");
    }

    /**
     * Create an HDFS directory.
     *
     * @throws IOException
     */
    @Test
    public void mkdir() throws IOException {
        Path path = new Path("/hdfsapi/test/myDir");
        boolean mkdirs = fileSystem.mkdirs(path);
        System.out.println(mkdirs);
    }

    /**
     * Print the content of an HDFS file.
     */
    @Test
    public void text() throws IOException {
        FSDataInputStream in = fileSystem.open(new Path("/hdfsapi/test/a.txt"));
        IOUtils.copyBytes(in, System.out, 1024);
        in.close();
    }

    /**
     * Create a file and write to it.
     *
     * @throws Exception
     */
    @Test
    public void create() throws Exception {
        FSDataOutputStream out = fileSystem.create(new Path("/hdfsapi/test/b.txt"));
        out.writeUTF("hello world b");
        out.flush();
        out.close();
    }

    @Test
    public void testReplication() {
        System.out.println(configuration.get("dfs.replication"));
    }

    /**
     * Rename a file.
     *
     * @throws Exception
     */
    @Test
    public void rename() throws Exception {
        Path src = new Path("/hdfsapi/test/b.txt");
        Path dst = new Path("/hdfsapi/test/c.txt");
        boolean rename = fileSystem.rename(src, dst);
        System.out.println(rename);
    }

    /**
     * Small file: copy a local file to HDFS.
     * Copies the local file E:/test/uid_person.txt to /hdfsapi/test/myDir/ on HDFS.
     *
     * @throws Exception
     */
    @Test
    public void copyFromLocalFile() throws Exception {
        Path src = new Path("E:/test/uid_person.txt");
        Path dst = new Path("/hdfsapi/test/myDir/");
        fileSystem.copyFromLocalFile(src, dst);
    }

    /**
     * Big file: copy a local file to HDFS with a progress callback.
     * Copies the local JDK archive to /hdfsapi/test/jdk.tar.gz on HDFS.
     *
     * @throws Exception
     */
    @Test
    public void copyFromLocalBigFile() throws Exception {
        InputStream in = new BufferedInputStream(
                new FileInputStream(new File("E:/tools/linux/jdk-8u101-linux-x64.tar.gz")));
        FSDataOutputStream out = fileSystem.create(new Path("/hdfsapi/test/jdk.tar.gz"), new Progressable() {
            public void progress() {
                System.out.print(".");
            }
        });
        // the last argument closes both streams once the copy is finished
        IOUtils.copyBytes(in, out, 4096, true);
    }

    /**
     * Copy an HDFS file to the local file system: download.
     *
     * @throws Exception
     */
    @Test
    public void copyToLocalFile() throws Exception {
        Path src = new Path("/hdfsapi/test/c.txt");
        Path dst = new Path("E:/test/a.txt");
        fileSystem.copyToLocalFile(src, dst);
    }

    /**
     * List all files under the target directory.
     *
     * @throws Exception
     */
    @Test
    public void listFile() throws Exception {
        Path path = new Path("/hdfsapi/test/");
        FileStatus[] fileStatuses = fileSystem.listStatus(path);
        for (FileStatus file : fileStatuses) {
            String isDir = file.isDirectory() ? "directory" : "file";
            String permission = file.getPermission().toString();
            short replication = file.getReplication();
            long len = file.getLen();
            String stringPath = file.getPath().toString();
            System.out.println("isDir: " + isDir + ", permission: " + permission
                    + ", replication: " + replication + ", len: " + len + ", path: " + stringPath);
        }
    }

    /**
     * Recursively list all files under the target directory.
     *
     * @throws Exception
     */
    @Test
    public void listFileRecursive() throws Exception {
        Path path = new Path("/hdfsapi/test/");
        RemoteIterator<LocatedFileStatus> files = fileSystem.listFiles(path, true);
        while (files.hasNext()) {
            LocatedFileStatus file = files.next();
            String isDir = file.isDirectory() ? "directory" : "file";
            String permission = file.getPermission().toString();
            short replication = file.getReplication();
            long len = file.getLen();
            String stringPath = file.getPath().toString();
            System.out.println("isDir: " + isDir + ", permission: " + permission
                    + ", replication: " + replication + ", len: " + len + ", path: " + stringPath);
        }
    }

    /**
     * Show the block locations of a file.
     *
     * @throws Exception
     */
    @Test
    public void getFileBlockLocations() throws Exception {
        Path path = new Path("/hdfsapi/test/jdk.tar.gz");
        FileStatus fileStatus = fileSystem.getFileStatus(path);
        BlockLocation[] blocks = fileSystem.getFileBlockLocations(fileStatus, 0, fileStatus.getLen());
        for (BlockLocation block : blocks) {
            for (String name : block.getNames()) {
                System.out.println(name + ": " + block.getOffset() + ": " + block.getLength());
            }
        }
    }

    /**
     * Delete a file (the second argument enables recursive deletion).
     *
     * @throws Exception
     */
    @Test
    public void delete() throws Exception {
        boolean delete = fileSystem.delete(new Path("/hdfsapi/test/a.txt"), true);
        System.out.println(delete);
    }

    @After
    public void tearDown() {
        configuration = null;
        fileSystem = null;
        System.out.println("--------tearDown---------");
    }
}
```