
regex.h on Linux: Key Points and Usage Example

Date: 2022-11-25 11:33:18


View the manual page: man regex.h

Locate the header: find / -name regex.h 2>/dev/null


<regex.h>(P)               POSIX Programmer’s Manual              <regex.h>(P)



PROLOG
This manual page is part of the POSIX Programmer’s Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
regex.h - regular expression matching types

SYNOPSIS
#include <regex.h>

DESCRIPTION
The <regex.h> header shall define the structures and symbolic constants
used by the regcomp(), regexec(), regerror(), and regfree() functions.

The structure type 【regex_t】 shall contain at least the following member:


size_t re_nsub Number of parenthesized subexpressions.

The type size_t shall be defined as described in <sys/types.h> .

The type regoff_t shall be defined as a signed integer type that can
hold the largest value that can be stored in either a type off_t or
type ssize_t. The structure type regmatch_t shall contain at least the
following members:


regoff_t rm_so Byte offset from start of string
to start of substring.
regoff_t rm_eo Byte offset from start of string of the
first character after the end of substring.

Values for the 【cflags】 parameter to the regcomp() function are as follows:

REG_EXTENDED
Use Extended Regular Expressions.

REG_ICASE
Ignore case in match.

REG_NOSUB
Report only success or fail in regexec(); match results are not stored.

REG_NEWLINE
Change the handling of <newline>: the text is matched line by line.
Without this flag the whole text is matched as a single string.


Values for the 【eflags】 parameter to the regexec() function are as follows:

REG_NOTBOL
The circumflex character ( ’^’ ), when taken as a special character,
does not match the beginning of string.

REG_NOTEOL
The dollar sign ( ’$’ ), when taken as a special character, does
not match the end of string.


The following constants shall be defined as 【error return values】:

REG_NOMATCH
regexec() failed to match.

REG_BADPAT
Invalid regular expression.

REG_ECOLLATE
Invalid collating element referenced.

REG_ECTYPE
Invalid character class type referenced.

REG_EESCAPE
Trailing ’\’ in pattern.

REG_ESUBREG
Number in \digit invalid or in error.

REG_EBRACK
"[]" imbalance.

REG_EPAREN
"\(\)" or "()" imbalance.

REG_EBRACE
"\{\}" imbalance.

REG_BADBR
Content of "\{\}" invalid: not a number, number too large, more
than two numbers, first larger than second.

REG_ERANGE
Invalid endpoint in range expression.

REG_ESPACE
Out of memory.

REG_BADRPT
’?’ , ’*’ , or ’+’ not preceded by valid regular expression.

REG_ENOSYS
Reserved.


The following shall be declared as functions and may also be defined as
macros. Function prototypes shall be provided.


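int regcomp(regex_t *restrict, const char *restrict, int);
// Compiles a regular expression string into the internal regex_t form.
// Arguments: (output regex_t structure, pattern string, 【cflags】)

size_t regerror(int, const regex_t *restrict, char *restrict, size_t);
// Retrieves the message for an error code.

int regexec(const regex_t *restrict, const char *restrict, size_t,
regmatch_t[restrict], int);
// Matches a string against a compiled regex_t.
// Arguments: (compiled regex_t, string to match, number of result slots,
// regmatch_t result buffer, 【eflags】)

void regfree(regex_t *);
// Frees the memory held by a compiled regex_t.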

The implementation may define additional macros or constants using
names beginning with REG_.

The following sections are informative.

APPLICATION USAGE
None.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
<sys/types.h> , the System Interfaces volume of IEEE Std 1003.1-2001,
regcomp(), the Shell and Utilities volume of IEEE Std 1003.1-2001

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group 2003 <regex.h>(P)



The original code was C++ (link: http://blog.chinaunix.net/uid-28323465-id-4083290.html).

With a few small changes it becomes C.

You can save the regular expression to a file with vi and then inspect its bytes with od. Command: od -tx1 -c file.txt

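// Build: gcc regex_xjy.c
// Run:   ./a.out
#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    const char *haa = "a very simple simple simple string";
    const char *regex = "([a-z]+)[ \t]([a-z]+)";
    regex_t comment;
    regmatch_t regmatch[100];
    const size_t nmatch = sizeof(regmatch) / sizeof(regmatch[0]);
    char str[256];
    size_t i;
    int cnt;

    /* Compile once; regcomp() returns 0 on success. */
    if (regcomp(&comment, regex, REG_EXTENDED | REG_NEWLINE) != 0) {
        fprintf(stderr, "regcomp failed\n");
        return 1;
    }

    /* Match repeatedly, advancing past each whole match. */
    while (regexec(&comment, haa, nmatch, regmatch, 0) == 0) {
        /* Slot 0 is the whole match; unused slots have rm_so == -1. */
        for (i = 0; i < nmatch && regmatch[i].rm_so != -1; i++) {
            memset(str, 0, sizeof(str));  /* was memset(str, sizeof(str), 0) */
            cnt = (int)(regmatch[i].rm_eo - regmatch[i].rm_so);
            if (cnt >= (int)sizeof(str))
                cnt = (int)sizeof(str) - 1;  /* truncate to fit str */
            printf("cnt=%d \t", cnt);
            memcpy(str, &haa[regmatch[i].rm_so], cnt);
            str[cnt] = '\0';
            printf("%s\n", str);
        }
        printf("cyc:**************%d \n", (int)i);

        if (regmatch[0].rm_eo == 0)
            break;  /* guard: an empty match would loop forever */
        haa += regmatch[0].rm_eo;
    }
    regfree(&comment);
    return 0;
}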



From: https://blog.51cto.com/datrilla/5886072
