Some uses of Hadoop HDFS

Date: 2023-09-21 10:03:32
Tags: hdfs, his, args, Hadoop, hadoop, usage, new, public, out


Example 3-1. Displaying files from a Hadoop filesystem on standard output using a
URLStreamHandler


Java code

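// Reading data from a Hadoop URL

import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class URLCat {
	static {
		// Teach java.net.URL the hdfs:// scheme; this can be set only once per JVM
		URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
	}

	public static void main(String[] args) throws Exception {
		InputStream in = null;
		try {
			in = new URL(args[0]).openStream();
			// 4096-byte buffer; false = do not close the streams after copying
			IOUtils.copyBytes(in, System.out, 4096, false);
		} finally {
			IOUtils.closeStream(in);
		}
	}
}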
-----------------------------------------
Here’s a sample run:
% hadoop URLCat hdfs://localhost/user/tom/quangle.txt
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
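
The static block in URLCat matters because java.net.URL accepts a stream handler factory only once per JVM. The following is a minimal JDK-only sketch of that mechanism, not Hadoop code: the "mem" scheme, the MemHandlerFactory class, and the MemUrlCat class are all hypothetical names made up for illustration, and the handler simply returns the URL's own path as its content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

class MemUrlCat {
    /** Hypothetical factory: handles only the made-up "mem" scheme. */
    static class MemHandlerFactory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            if (!"mem".equals(protocol)) {
                return null; // fall back to the JDK's built-in handlers
            }
            return new URLStreamHandler() {
                @Override
                protected URLConnection openConnection(URL u) {
                    return new URLConnection(u) {
                        @Override
                        public void connect() {
                            // nothing to connect to: the content is the URL path itself
                        }

                        @Override
                        public InputStream getInputStream() {
                            byte[] data = getURL().getPath().getBytes(StandardCharsets.UTF_8);
                            return new ByteArrayInputStream(data);
                        }
                    };
                }
            };
        }
    }

    static {
        // Same pattern as URLCat: install the factory once, when the class loads
        URL.setURLStreamHandlerFactory(new MemHandlerFactory());
    }

    /** Opens the URL through the installed factory and returns its content as text. */
    static String read(String spec) throws IOException {
        try (InputStream in = URI.create(spec).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Tries to install a factory again; the JDK rejects this with an Error. */
    static boolean secondInstallFails() {
        try {
            URL.setURLStreamHandlerFactory(new MemHandlerFactory());
            return false;
        } catch (Error alreadyDefined) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read("mem:hello"));
        System.out.println(secondInstallFails());
    }
}
```

The once-per-JVM limit means this approach is unavailable if another component of the application has already set a factory, which is one reason to use the FileSystem API directly.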


Example 3-2. Displaying files from a Hadoop filesystem on standard output by using the FileSystem
directly


Java code

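import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {
	public static void main(String[] args) throws Exception {
		String uri = args[0];
		Configuration conf = new Configuration();
		// The URI scheme (for example hdfs://) selects the FileSystem implementation
		FileSystem fs = FileSystem.get(URI.create(uri), conf);
		InputStream in = null;
		try {
			in = fs.open(new Path(uri));
			IOUtils.copyBytes(in, System.out, 4096, false);
		} finally {
			IOUtils.closeStream(in);
		}
	}
}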

------------------------------------------
The program runs as follows:
% hadoop FileSystemCat hdfs://localhost/user/tom/quangle.txt
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.


Example 3-3 is a simple extension of Example 3-2 that writes a file to standard output
twice: after writing it once, it seeks to the start of the file and streams through it once
again.


Java code



// Example 3-3. Displaying files from a Hadoop filesystem on standard output twice, by using seek

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemDoubleCat {
	public static void main(String[] args) throws Exception {
		String uri = args[0];
		Configuration conf = new Configuration();
		FileSystem fs = FileSystem.get(URI.create(uri), conf); // obtain a FileSystem instance via get()
		FSDataInputStream in = null;
		try {
			in = fs.open(new Path(uri)); // open() returns an FSDataInputStream, which supports seek()
			IOUtils.copyBytes(in, System.out, 4096, false);
			in.seek(0); // go back to the start of the file
			IOUtils.copyBytes(in, System.out, 4096, false);
		} finally {
			IOUtils.closeStream(in);
		}
	}
}
----------------------------------------------------
Here’s the result of running it on a small file:
% hadoop FileSystemDoubleCat hdfs://localhost/user/tom/quangle.txt
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
On the top of the Crumpetty Tree
The Quangle Wangle sat,
But his face you could not see,
On account of his Beaver Hat.
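
The read, seek(0), read again pattern can be tried without a cluster. The sketch below swaps in the JDK's java.io.RandomAccessFile as a stand-in for FSDataInputStream, so it only illustrates the seek idea on a local file and is not the Hadoop API. The class name LocalDoubleCat and the method readTwice are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalDoubleCat {
    /** Reads the whole file twice through one handle, rewinding with seek(0). */
    static String readTwice(Path file) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[4096];
            for (int pass = 0; pass < 2; pass++) {
                in.seek(0); // rewind to the start; has no effect on the first pass
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("quangle", ".txt");
        try {
            Files.write(tmp, "On the top of the Crumpetty Tree\n".getBytes(StandardCharsets.UTF_8));
            System.out.print(readTwice(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A plain InputStream could not do this, since it offers no way to move backward; that is why Example 3-3 declares the stream as FSDataInputStream rather than InputStream.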


Example 3-4 shows how to copy a local file to a Hadoop filesystem. We illustrate progress
by printing a period every time Hadoop calls the progress() method, which
is after each 64 KB packet of data is written to the datanode pipeline. (Note that this
particular behavior is not specified by the API, so it is subject to change in later versions
of Hadoop. The API merely allows you to infer that “something is happening.”)


Java code

// Example 3-4. Copying a local file to a Hadoop filesystem, showing progress

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class FileCopyWithProgress {
	public static void main(String[] args) throws Exception {
		String localSrc = args[0];
		String dst = args[1];
		InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
		Configuration conf = new Configuration();
		FileSystem fs = FileSystem.get(URI.create(dst), conf);
		OutputStream out = fs.create(new Path(dst), new Progressable() {
			public void progress() {
				System.out.print(".");
			}
		});
		// true = close both streams when the copy finishes
		IOUtils.copyBytes(in, out, 4096, true);
	}
}

Typical usage:
% hadoop FileCopyWithProgress input/docs/1400-8.txt hdfs://localhost/user/tom/1400-8.txt
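
The callback idea behind Progressable can be sketched with the JDK alone. In the sketch below, ProgressListener is a hypothetical stand-in for Hadoop's Progressable, and the copy loop invokes it once per buffer read; that timing is an assumption of this sketch, whereas Hadoop decides for itself when to call progress(), as noted above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ProgressCopyDemo {
    /** Hypothetical stand-in for org.apache.hadoop.util.Progressable. */
    interface ProgressListener {
        void progress();
    }

    /** Copies in to out, notifying the listener after each chunk is written. */
    static long copy(InputStream in, OutputStream out, int bufferSize,
                     ProgressListener listener) throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
            listener.progress(); // the caller learns only that "something is happening"
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Print a period per chunk, as Example 3-4 does per progress() call
        long copied = copy(new ByteArrayInputStream(new byte[10]), sink, 4,
                () -> System.out.print("."));
        System.out.println();
        System.out.println(copied + " bytes copied");
    }
}
```

The listener receives no byte counts or percentages, which mirrors the point made about Progressable: the callback signals activity, not how much work remains.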

From: https://blog.51cto.com/u_16255870/7548704
