Connecting to HDFS from Java: Usage Examples
1. Introduction
The Hadoop Distributed File System (HDFS) is the storage foundation of the big-data ecosystem. For Java developers, being able to operate on HDFS from code is a key skill for big-data work. This article walks through a few short examples that show how to connect to HDFS from Java and perform basic file operations.
2. Connecting to HDFS
Step 1: Add the dependency
In a Maven project, add the Hadoop client dependency:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.8.2</version>
</dependency>
Step 2: Create the connection
The basic code for obtaining an HDFS connection (Configuration comes from org.apache.hadoop.conf, FileSystem and Path from org.apache.hadoop.fs):
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
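The URI string passed to FileSystem.get determines which NameNode the client talks to. How that string decomposes can be checked with the JDK's own URI class, with no cluster required; the class name below is illustrative, and "localhost:9000" is simply the NameNode address used throughout this article.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class HdfsUriDemo {
    public static void main(String[] args) throws URISyntaxException {
        // The same URI string that FileSystem.get() receives above.
        URI uri = new URI("hdfs://localhost:9000");
        System.out.println(uri.getScheme()); // hdfs
        System.out.println(uri.getHost());   // localhost
        System.out.println(uri.getPort());   // 9000
    }
}
```

If the port is omitted, getPort() returns -1 and the client falls back to its configured defaults, which is one reason connection failures often trace back to a malformed URI.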
3. Operating on HDFS
1. Check whether a file exists
public static boolean test(Configuration conf, String path) {
    // try-with-resources closes the FileSystem automatically.
    try (FileSystem fs = FileSystem.get(conf)) {
        return fs.exists(new Path(path));
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    }
}
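The same exists-then-act pattern can be tried locally with the JDK's NIO API, no Hadoop classes needed. This is only an analogy for the fs.exists call above, not the Hadoop API itself:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExistsDemo {
    public static void main(String[] args) throws IOException {
        // Create a temp file, then check for it the way test() checks an HDFS path.
        Path tmp = Files.createTempFile("exists-demo", ".txt");
        System.out.println(Files.exists(tmp)); // true
        Files.delete(tmp);
        System.out.println(Files.exists(tmp)); // false
    }
}
```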
2. Insert content at the beginning or end of a file
HDFS does not support modifying a file in place, so the method below reads the existing content into memory and rewrites the whole file with the insertion applied.
public static void insertContent(String filePath, String content, boolean insertAtBeginning) {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)) {
        Path path = new Path(filePath);
        if (fs.exists(path)) {
            // Read the existing file into memory.
            StringBuilder originalContent = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(path)))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    originalContent.append(line).append("\n");
                }
            }
            // Overwrite the file (second argument true) with the combined content.
            try (FSDataOutputStream outputStream = fs.create(path, true)) {
                if (insertAtBeginning) {
                    outputStream.write(content.getBytes());
                    outputStream.write(originalContent.toString().getBytes());
                } else {
                    outputStream.write(originalContent.toString().getBytes());
                    outputStream.write(content.getBytes());
                }
            }
        } else {
            // File does not exist yet: just create it with the new content.
            try (FSDataOutputStream outputStream = fs.create(path)) {
                outputStream.write(content.getBytes());
            }
        }
    } catch (IOException | URISyntaxException e) {
        e.printStackTrace();
    }
}
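Stripped of the HDFS I/O, the branch in insertContent reduces to simple concatenation. A pure-Java sketch of just that ordering logic (the class and method names here are illustrative, not part of the Hadoop API):

```java
public class InsertOrderDemo {
    // Mirrors insertContent's branch: the new content either precedes or
    // follows the original bytes when the file is rewritten.
    static String combine(String original, String content, boolean insertAtBeginning) {
        return insertAtBeginning ? content + original : original + content;
    }

    public static void main(String[] args) {
        String original = "line1\n";
        System.out.println(combine(original, "header\n", true));  // header comes first
        System.out.println(combine(original, "footer\n", false)); // footer comes last
    }
}
```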
3. Delete a file
public static void main(String[] args) throws IOException, URISyntaxException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
    String name = "hdfs://localhost:9000/a/a.txt";
    // The second argument is the recursive flag; false is enough for a single file.
    boolean res = fs.delete(new Path(name), false);
    if (res) {
        System.out.println("File deleted successfully.");
    } else {
        System.out.println("File deletion failed.");
    }
    fs.close();
}
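The recursive flag matters: with false, fs.delete removes a file or an empty directory, but deleting a non-empty directory fails. The JDK's local filesystem API behaves analogously, which makes the semantics easy to check without a cluster (this is an analogy, not the Hadoop call itself):

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-demo");
        Path file = Files.createFile(dir.resolve("a.txt"));
        try {
            Files.delete(dir); // non-empty directory: fails, like fs.delete(path, false)
        } catch (DirectoryNotEmptyException e) {
            System.out.println("directory not empty");
        }
        Files.delete(file);
        Files.delete(dir); // empty now: succeeds
        System.out.println(Files.exists(dir)); // false
    }
}
```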
4. Move a file
public static void moveFile(String sourcePath, String destinationPath) {
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)) {
        Path src = new Path(sourcePath);
        Path dst = new Path(destinationPath);
        if (!fs.exists(src)) {
            System.out.println("Source file does not exist.");
            return;
        }
        // Make sure the destination directory exists before renaming.
        if (!fs.exists(dst.getParent())) {
            fs.mkdirs(dst.getParent());
        }
        boolean success = fs.rename(src, dst);
        if (success) {
            System.out.println("File moved successfully.");
        } else {
            System.out.println("File move failed.");
        }
    } catch (IOException | URISyntaxException e) {
        e.printStackTrace();
    }
}
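moveFile's create-parent-then-rename sequence maps onto the JDK as well. A local sketch of the same three steps, using java.nio.file as a stand-in for the HDFS calls (analogy only):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MoveDemo {
    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("move-demo");
        Path src = Files.createFile(base.resolve("a.txt"));
        Path dst = base.resolve("sub/b.txt");
        // Like fs.mkdirs(dst.getParent()): ensure the target directory exists.
        Files.createDirectories(dst.getParent());
        // Like fs.rename(src, dst): move the file into place.
        Files.move(src, dst);
        System.out.println(Files.exists(dst)); // true
        System.out.println(Files.exists(src)); // false
    }
}
```

Creating the parent directory first is the important step: both HDFS rename and a local move fail if the destination directory does not exist.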
4. Summary
As the examples above show, connecting to HDFS from Java and performing basic file operations is fairly straightforward. The operations covered here are checking whether a file exists, inserting content at the beginning or end of a file, deleting a file, and moving a file. Mastering these basics is essential for handling big-data tasks.
Copyright notice: this blog post is original content; when reposting, please retain the original link and author information.
From: https://blog.csdn.net/NiNg_1_234/article/details/142603582