As you can see, hadoop fsck and hadoop fs -dus report the effective HDFS storage space used, i.e. they show the "normal" file size (as you would see on a local filesystem) and do not account for replication in HDFS. In this case, the directory path/to/directory stores 16565944775310 bytes (15.1 TB) of data. fsck also tells us that the average replication factor for all files in path/to/directory is exactly 3.0. This means that the total raw HDFS storage space used by these files, i.e. factoring in replication, is actually:
3.0 × 16565944775310 bytes (15.1 TB) = 49697834325930 bytes (45.2 TB)
This is how much HDFS storage is actually consumed by the files in path/to/directory.
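If you want to do this calculation without reading the numbers off by hand, both values appear in the fsck summary and can be multiplied in one pass. A rough sketch, assuming the field labels ("Total size", "Average block replication") printed by typical HDFS releases:

hadoop fsck /path/to/directory | awk -F':' '
  /Total size/                { size = $2 + 0 }   # logical bytes, one copy of the data
  /Average block replication/ { repl = $2 + 0 }   # average replication factor
  END { printf "raw HDFS bytes: %.0f\n", size * repl }'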
The hdfs du command only counts a single copy of the data.
If you never change the default HDFS replication factor of 3 for the files you store in your Hadoop cluster, this means, in a nutshell, that you should always multiply the numbers reported by hadoop fsck or hadoop fs -dus by 3 when you want to reason about HDFS space quotas.
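If a space quota is actually set on the directory, the raw (replicated) figure is the one that counts against it. A sketch using the standard quota commands; the 50t value is only an illustrative quota:

hdfs dfsadmin -setSpaceQuota 50t /path/to/directory
hadoop fs -count -q /path/to/directory
# columns: QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
# SPACE_QUOTA and REM_SPACE_QUOTA are raw bytes, so here 45.2 TB is charged against the quota, not 15.1 TB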
References:
http://www.michael-noll.com/blog/2011/10/20/understanding-hdfs-quotas-and-hadoop-fs-and-fsck-tools/
There is also an answer on Stack Overflow:
https://stackoverflow.com/questions/11574410/how-to-find-the-size-of-a-hdfs-file
hadoop fs -dus /user/frylock/input
This returns the total size (in bytes) of all of the files in the /user/frylock/input directory.
Also, keep in mind that HDFS stores data redundantly so the actual physical storage used up by a file might be 3x or more than what is reported by hadoop fs -ls and hadoop fs -dus.
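To check the replication factor of a single file, rather than the directory-wide average that fsck reports, something like hadoop fs -stat can be used. A sketch; the part-00000 file name is only a placeholder:

hadoop fs -stat "%r" /user/frylock/input/part-00000   # prints the file's replication factor, e.g. 3
hadoop fs -ls /user/frylock/input                     # the second column of each file entry is its replication factor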
du reports the size of a single copy of the data. To get the actual storage consumed, find the average replication factor and multiply it by the size reported by du; the result is the raw space the data occupies.
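Note that -dus is deprecated on newer Hadoop releases in favour of hadoop fs -du -s, and recent versions print both numbers in one go: the logical size and the raw space consumed by all replicas, which saves you the multiplication. A sketch of that output, reusing the sizes from the example above:

hadoop fs -du -s /user/frylock/input
# 16565944775310  49697834325930  /user/frylock/input
#   ^ one copy      ^ all replicas (3x here)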