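I'm sure grep needs no introduction; most of us have used it at one time or another.
But most people have probably only used it for basic string matching and have never tried anything slightly more advanced.
The goal here is not to master every obscure option, but to learn what else grep can do and which features I have not been using day to day, so that I can work more efficiently.
Take myself as an example: for a long time, the only form I knew was: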
grep -Enr 'xxx' file
If your work often involves searching text, it is well worth learning more of what grep can do.
Command format
Command syntax
grep [OPTION]... PATTERNS [FILE]...
Command options
Pattern selection and interpretation:
  -E, --extended-regexp     PATTERNS are extended regular expressions
  -F, --fixed-strings       PATTERNS are strings
  -G, --basic-regexp        PATTERNS are basic regular expressions
  -P, --perl-regexp         PATTERNS are Perl regular expressions
  -e, --regexp=PATTERNS     use PATTERNS for matching
  -f, --file=FILE           take PATTERNS from FILE
  -i, --ignore-case         ignore case distinctions in patterns and data
      --no-ignore-case      do not ignore case distinctions (default)
  -w, --word-regexp         match only whole words
  -x, --line-regexp         match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline
Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             display version information and exit
      --help                display this help text and exit
Output control:
  -m, --max-count=NUM       stop after NUM selected lines
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print file name with output lines
  -h, --no-filename         suppress the file name prefix on output
      --label=LABEL         use LABEL as the standard input file name prefix
  -o, --only-matching       show only nonempty parts of lines that match
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is 'binary', 'text', or 'without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is 'read', 'recurse', or 'skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is 'read' or 'skip'
  -r, --recursive           like --directories=recurse
  -R, --dereference-recursive  likewise, but follow all symlinks
      --include=GLOB        search only files that match GLOB (a file pattern)
      --exclude=GLOB        skip files that match GLOB
      --exclude-from=FILE   skip files that match any file pattern from FILE
      --exclude-dir=GLOB    skip directories that match GLOB
  -L, --files-without-match  print only names of FILEs with no selected lines
  -l, --files-with-matches  print only names of FILEs with selected lines
  -c, --count               print only a count of selected lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name
Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is 'always', 'never', or 'auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS/Windows)
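For the options that select the regular-expression flavor (-E, -F, -G, -P), see this Zhihu article:
https://zhuanlan.zhihu.com/p/435815082
The flavors differ in some respects; I personally prefer -E (extended regular expressions).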
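To make the difference between the flavors concrete, here is a minimal sketch, assuming GNU grep. The file modes.txt is a hypothetical sample created only for this demonstration. The same pattern a|boy is interpreted differently under -G, -E and -F:

```shell
# Hypothetical sample file, created only for this demonstration
printf '%s\n' 'a boy' 'a|boy' 'girl' > modes.txt

# -G (basic, the default): | is an ordinary character,
# so only the line containing the literal text a|boy is selected
grep -G 'a|boy' modes.txt

# -E (extended): | means "or", so every line containing a or boy is selected
grep -E 'a|boy' modes.txt

# -F (fixed strings): the pattern is not a regular expression at all;
# characters such as | . * are all matched literally
grep -F 'a|boy' modes.txt
```

Here -G and -F both select only the a|boy line, while -E also selects the a boy line.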
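Common options
-E, --extended-regexp interpret the pattern as an extended regular expression
-i, --ignore-case ignore case
-w, --word-regexp match whole words only
-v, --invert-match invert the match (print the lines that do not match the pattern)
-n, --line-number print the line number of each matching line
-r, --recursive search recursively (when the search target is a directory)
-c, --count print only the number of matching lines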
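The options above can be tried out quickly with a small sketch. The directory grep_demo and its files are hypothetical names created only for this demonstration; GNU or BSD grep is assumed:

```shell
# Hypothetical demo directory, created only for this demonstration
mkdir -p grep_demo/sub
printf '%s\n' 'Hello World' 'hello grep' 'bye' > grep_demo/a.txt
printf '%s\n' 'HELLO again' > grep_demo/sub/b.txt

# -i: ignore case, so Hello and hello both match
grep -i 'hello' grep_demo/a.txt

# -v (combined with -i): print the lines that do NOT contain hello
grep -iv 'hello' grep_demo/a.txt

# -r: search every file under the directory; -n adds line numbers
grep -rin 'hello' grep_demo

# -c: print the number of matching lines instead of the lines themselves
grep -ic 'hello' grep_demo/a.txt
```

With -r, each output line is prefixed with the file it came from; the order in which files are visited is not guaranteed.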
Examples
Create a test file named grep.txt with the following content:
$ cat grep.txt
Today is a good day, a sunny day, a wonderful day, a important day.
I am a boy, a good boy, a lovely boy.
I like reading, sports, and coding.
Enjoy coding.
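Basic search
Search for every line that contains the letter 'a'.
-E: extended regex mode
-n: print line numbers
Note that the a inside words such as reading and and is matched as well.
$ grep -En "a" grep.txt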
1:Today is a good day, a sunny day, a wonderful day, a important day.
2:I am a boy, a good boy, a lovely boy.
3:I like reading, sports, and coding.
Now restrict the search to whole-word matches
by adding the -w option.
$ grep -Ewn "a" grep.txt
1:Today is a good day, a sunny day, a wonderful day, a important day.
2:I am a boy, a good boy, a lovely boy.
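Print only the number of matching lines.
(Note that this is the number of lines containing the target, not the number of occurrences of a. The -n option has no effect when -c is given.)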
$ grep -Ewnc "a" grep.txt
2
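Since -c counts lines, counting individual occurrences needs a different approach. A common idiom, sketched here under the assumption that your grep supports -o (GNU and BSD grep do), is to print each match on its own line and count those lines with wc -l:

```shell
# Recreate the test file so this snippet runs on its own
cat > grep.txt <<'EOF'
Today is a good day, a sunny day, a wonderful day, a important day.
I am a boy, a good boy, a lovely boy.
I like reading, sports, and coding.
Enjoy coding.
EOF

# -c counts the lines that contain the whole word a
grep -Ewc "a" grep.txt

# -o prints every match on its own line; wc -l then counts the matches
grep -Ewo "a" grep.txt | wc -l
```

On the sample file the first command reports 2 and the second reports 7 (four occurrences in line 1 and three in line 2).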
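Advanced search
Matching multiple patterns
That is, matching several strings or characters in one search.
Match the lines that contain either a or boy: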
$ grep -Ewn "a|boy" grep.txt
1:Today is a good day, a sunny day, a wonderful day, a important day.
2:I am a boy, a good boy, a lovely boy.
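Alternation with | is not the only way to search for several patterns. As a minimal sketch, the -e and -f options listed in the help text can be used instead; patterns.txt is a hypothetical file name chosen for this example:

```shell
# Recreate the test file so this snippet runs on its own
cat > grep.txt <<'EOF'
Today is a good day, a sunny day, a wonderful day, a important day.
I am a boy, a good boy, a lovely boy.
I like reading, sports, and coding.
Enjoy coding.
EOF

# Two -e options: a line is selected if it matches either pattern
grep -n -e "sunny" -e "Enjoy" grep.txt

# The same patterns read from a file, one pattern per line
printf '%s\n' 'sunny' 'Enjoy' > patterns.txt
grep -n -f patterns.txt grep.txt
```

Both commands select lines 1 and 4 of the sample file. Keeping patterns in a file is convenient when the list is long or is shared between scripts.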
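What if I want the lines that contain both a and boy?
. matches any single character;
* repeats the preceding item zero or more times;
so a.*boy means an a followed, after any number of characters, by boy. Note that this only finds lines in which a comes before boy.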
$ grep -Ewn "a.*boy" grep.txt
2:I am a boy, a good boy, a lovely boy.
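A pattern of the form X.*Y only matches when X appears before Y. If the order should not matter, two common approaches are to spell out both orders in one extended regex, or to chain two grep commands. A minimal sketch, using like and coding from the sample file to show the difference:

```shell
# Recreate the test file so this snippet runs on its own
cat > grep.txt <<'EOF'
Today is a good day, a sunny day, a wonderful day, a important day.
I am a boy, a good boy, a lovely boy.
I like reading, sports, and coding.
Enjoy coding.
EOF

# Order matters: coding never appears before like, so this prints nothing
grep -En "coding.*like" grep.txt

# Option 1: spell out both orders with alternation
grep -En "coding.*like|like.*coding" grep.txt

# Option 2: filter twice; the second grep only sees lines that passed the first
grep -n "coding" grep.txt | grep "like"
```

Both options select only line 3; line 4 contains coding but not like, so it is filtered out.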
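More options
Sometimes I care about the few lines just before or after a match. In that case, use:
-A n: print the n lines after each match;
-B n: print the n lines before each match;
-C n: print the n lines before and after each match;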
$ grep -A 1 -Ewn "a.*boy" grep.txt
2:I am a boy, a good boy, a lovely boy.
3-I like reading, sports, and coding.
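The -B and -C options work the same way as -A. A short sketch on the same sample file:

```shell
# Recreate the test file so this snippet runs on its own
cat > grep.txt <<'EOF'
Today is a good day, a sunny day, a wonderful day, a important day.
I am a boy, a good boy, a lovely boy.
I like reading, sports, and coding.
Enjoy coding.
EOF

# -B 1: also print one line before the match (lines 3 and 4)
grep -B 1 -n "Enjoy" grep.txt

# -C 1: also print one line before and one line after the match (lines 1 to 3)
grep -C 1 -n "lovely" grep.txt
```

With -n, matching lines are marked with a colon after the line number and context lines with a hyphen.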
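Making good use of regular expressions
If you are comfortable with regular expressions, you can search for exactly the text you want with much more flexibility.
For a regular-expression reference, see: https://www.cnblogs.com/fozero/p/7868687.html
For example, searching for digits or letters;
or matching mobile phone numbers and ID card numbers.
Almost any pattern you need can be expressed with a regular expression.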
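As a rough illustration, the sketch below searches for digits, letters, and mobile numbers. The file contacts.txt and its contents are hypothetical, and the phone pattern is a simplified assumption (11 digits, starting with 1, second digit 3 to 9), not an authoritative validation rule:

```shell
# Hypothetical sample data, created only for this demonstration
printf '%s\n' 'alice 13812345678' 'bob 1381234567' 'carol 138123456789' \
  'order 20230817' 'no digits here' > contacts.txt

# Lines that contain at least one digit
grep -En "[0-9]+" contacts.txt

# Lines that consist only of letters and spaces (-x matches whole lines)
grep -Ex "[A-Za-z ]+" contacts.txt

# Simplified mobile-number pattern; -w rejects longer digit runs,
# and -o prints only the matched number
grep -Ewo "1[3-9][0-9]{9}" contacts.txt
```

Only 13812345678 is extracted by the last command: bob's number is one digit short, and carol's is one digit too long to count as a whole word.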
From: https://www.cnblogs.com/bailiji/p/17644284.html