The command line for data analysis
The command line is not just for installing software, configuring systems, and searching files; it can also be used for data analysis.
Data input -> data transformation -> data output: command-line tools take data as input, do something to it, and print the result.
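The input -> transformation -> output pattern can be sketched with a single pipe. This is only a minimal illustration: the three input words are made-up sample data, and `tr` stands in for whatever transformation is needed.

```shell
# Input: three lines of illustrative text produced by printf.
# Transformation: tr converts lower-case letters to upper case.
# Output: the result goes to standard output (captured in a variable here).
result=$(printf 'alpha\nbeta\ngamma\n' | tr '[:lower:]' '[:upper:]')
echo "$result"
```

Because each tool reads standard input and writes standard output, more steps can be chained onto the pipe without changing the earlier ones.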
The workflow has five steps: (1) obtaining data, (2) scrubbing data, (3) exploring data, (4) modeling data, and (5) interpreting data.
The book listed under Reference discusses more than 90 command-line tools.
Command line:
tools to explore and model data, for example
awk, sed, grep
sort, cut
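A rough sketch of what each of these five tools typically does, applied to one small table. The `name,city,amount` records are hypothetical sample data, not taken from the book.

```shell
# Hypothetical sample data: name,city,amount
data='name,city,amount
alice,Paris,30
bob,Berlin,20
carol,Paris,50
dave,Rome,10'

# grep: keep only the rows that mention Paris.
paris_rows=$(printf '%s\n' "$data" | grep 'Paris')

# cut + sort: take the city column (field 2) without the header, de-duplicated.
cities=$(printf '%s\n' "$data" | tail -n +2 | cut -d, -f2 | sort -u)

# sed: replace every comma separator with a semicolon.
semicolon=$(printf '%s\n' "$data" | sed 's/,/;/g')

# awk: sum the amount column (field 3), skipping the header line.
total=$(printf '%s\n' "$data" | awk -F, 'NR > 1 { sum += $3 } END { print sum }')

echo "$paris_rows"
echo "$cities"
echo "$total"
```

Each command does one small job; combining them with pipes is what makes them useful for exploring data.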
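Workflow:
01. Obtaining data: download from elsewhere to the local machine, query a database or an API, extract data from the file system, or generate custom data
formats: plain text, CSV, JSON, HTML, or XML.
02. Scrubbing data: data cleaning and format tidying, such as filtering, extracting, replacing, handling missing and duplicate values, and format conversion
03. Exploring data: descriptive statistics, looking at examples
04. Modeling data:
05. Presenting data: drawing conclusions and communicating the results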
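Steps 01 to 03 of the workflow could look roughly like this on a tiny data set. This is a sketch under stated assumptions: the `id,name,score` records and the file names `raw.csv` and `clean.csv` are invented for illustration, the data is generated locally instead of downloaded so the script runs offline, and the modeling and presentation steps are left out.

```shell
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)

# 01. Obtain: generate custom data (hypothetical id,name,score records).
cat > "$workdir/raw.csv" <<'EOF'
id,name,score
1,alice,90
2,bob,
2,bob,
3,carol,75
EOF

# 02. Scrub: mark the missing score as NA (sed), then drop duplicate rows (awk).
sed 's/,$/,NA/' "$workdir/raw.csv" | awk '!seen[$0]++' > "$workdir/clean.csv"

# 03. Explore: descriptive statistics over the rows that have a numeric score.
stats=$(LC_ALL=C awk -F, '
  NR > 1 && $3 != "NA" {
    n++; sum += $3
    if (n == 1 || $3 + 0 < min) min = $3 + 0
    if (n == 1 || $3 + 0 > max) max = $3 + 0
  }
  END { printf "count=%d min=%d max=%d mean=%.1f", n, min, max, sum / n }
' "$workdir/clean.csv")
echo "$stats"
```

Writing the scrubbed data to its own file keeps the raw input unchanged, so any step can be rerun or adjusted independently.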
Reference
https://jeroenjanssens.com/dsatcl/chapter-1-introduction.html#obtaining-data
From: https://www.cnblogs.com/ytwang/p/17554546.html