View: man regex.h
Locate: find / -name regex.h 2>/dev/null
<regex.h>(P) POSIX Programmer’s Manual <regex.h>(P)
PROLOG
This manual page is part of the POSIX Programmer’s Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.
NAME
regex.h - regular expression matching types
SYNOPSIS
#include <regex.h>
DESCRIPTION
The <regex.h> header shall define the structures and symbolic constants
used by the regcomp(), regexec(), regerror(), and regfree() functions.
The structure type 【regex_t】 shall contain at least the following member:
size_t re_nsub Number of parenthesized subexpressions.
The type size_t shall be defined as described in <sys/types.h> .
The type regoff_t shall be defined as a signed integer type that can
hold the largest value that can be stored in either a type off_t or
type ssize_t. The structure type regmatch_t shall contain at least the
following members:
regoff_t rm_so Byte offset from start of string
to start of substring.
regoff_t rm_eo Byte offset from start of string of the
first character after the end of substring.
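The rm_so/rm_eo offsets above describe a half-open byte range, so a substring's length is rm_eo - rm_so. A minimal sketch of pulling one capture group out of a string follows; the helper name extract_group and the fixed array of 10 slots are assumptions made for illustration, not part of <regex.h>.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copy capture group `group` of the first match of
 * `pattern` in `subject` into `out`. Returns the group's length in bytes,
 * or -1 if there is no match, the group did not participate, or `out`
 * is too small. */
static int extract_group(const char *pattern, const char *subject,
                         size_t group, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[10];   /* slot 0 = whole match, slots 1..9 = groups */
    int len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* re_nsub is the number of parenthesized groups in the pattern. */
    if (group <= re.re_nsub && group < 10 &&
        regexec(&re, subject, 10, m, 0) == 0 && m[group].rm_so != -1) {
        size_t n = (size_t)(m[group].rm_eo - m[group].rm_so);
        if (n < outsz) {
            memcpy(out, subject + m[group].rm_so, n);
            out[n] = '\0';
            len = (int)n;
        }
    }
    regfree(&re);
    return len;
}
```

Called as extract_group("([a-z]+)=([0-9]+)", "port=8080", 2, buf, sizeof buf), this would copy the second group into buf; group 0 always refers to the whole match.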
Values for the 【cflags】 parameter to the regcomp() function are as follows:
REG_EXTENDED
Use Extended Regular Expressions.
REG_ICASE
Ignore case in match.
REG_NOSUB
Report only success or fail in regexec(). (Match results are not stored.)
REG_NEWLINE
Change the handling of <newline>. (Newlines are recognized and matching is done line by line; without this flag the whole text is matched as a single string.)
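To see what the cflags values change in practice, here is a small sketch. The helper name matches is an assumption for illustration; it always adds REG_NOSUB because only a yes/no answer is wanted.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if `subject` matches `pattern`, 0 if it does not,
 * -1 if the pattern fails to compile. REG_NOSUB is always added because
 * only success or failure is reported. */
static int matches(const char *pattern, const char *subject, int cflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    /* With REG_NOSUB the nmatch/pmatch arguments are ignored. */
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For example, matches("hello", "HELLO", 0) fails while the same call with REG_ICASE succeeds, and "^two$" only finds the middle line of "one\ntwo\nthree" when REG_NEWLINE is set.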
Values for the 【eflags】 parameter to the regexec() function are as follows:
REG_NOTBOL
The circumflex character ( ’^’ ), when taken as a special character, does not match the beginning of string.
REG_NOTEOL
The dollar sign ( ’$’ ), when taken as a special character, does
not match the end of string.
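REG_NOTBOL matters most when a search is resumed in the middle of a string: the pointer passed to regexec() is then not a real beginning of line, and '^' should not match there. A rough sketch, where the helper name count_matches and its use_notbol switch are assumptions for illustration:

```c
#include <regex.h>

/* Hypothetical helper: count successive matches of `pattern` in `subject`,
 * resuming after each match. If `use_notbol` is nonzero, every resumed
 * search passes REG_NOTBOL so that '^' no longer matches at the resume
 * point. Returns -1 if the pattern fails to compile. */
static int count_matches(const char *pattern, const char *subject,
                         int use_notbol)
{
    regex_t re;
    regmatch_t m;
    int count = 0;
    int eflags = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    while (regexec(&re, subject, 1, &m, eflags) == 0) {
        count++;
        if (m.rm_eo == 0)       /* empty match at the start: stop, no progress */
            break;
        subject += m.rm_eo;     /* resume just past the previous match */
        if (use_notbol)
            eflags = REG_NOTBOL;
    }
    regfree(&re);
    return count;
}
```

With the pattern "^a" and the string "aaa", the resumed searches keep matching unless REG_NOTBOL is passed, so the two modes give different counts.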
The following constants shall be defined as 【error return values】:
REG_NOMATCH
regexec() failed to match.
REG_BADPAT
Invalid regular expression.
REG_ECOLLATE
Invalid collating element referenced.
REG_ECTYPE
Invalid character class type referenced.
REG_EESCAPE
Trailing ’\’ in pattern.
REG_ESUBREG
Number in \digit invalid or in error.
REG_EBRACK
"[]" imbalance.
REG_EPAREN
"\(\)" or "()" imbalance.
REG_EBRACE
"\{\}" imbalance.
REG_BADBR
Content of "\{\}" invalid: not a number, number too large, more
than two numbers, first larger than second.
REG_ERANGE
Invalid endpoint in range expression.
REG_ESPACE
Out of memory.
REG_BADRPT
’?’ , ’*’ , or ’+’ not preceded by valid regular expression.
REG_ENOSYS
Reserved.
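These codes are what regcomp() and regexec() return; regerror() turns one into a readable message. A minimal sketch, assuming a helper named compile_or_report (a made-up name, not a library function):

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` into `re`. On failure, write the
 * regerror() message into `errbuf`. Returns the regcomp() code, which is
 * 0 on success. On failure `re` must not be passed to regfree(). */
static int compile_or_report(regex_t *re, const char *pattern, int cflags,
                             char *errbuf, size_t errsz)
{
    int rc = regcomp(re, pattern, cflags);

    if (errsz > 0)
        errbuf[0] = '\0';                 /* empty message on success */
    if (rc != 0)
        regerror(rc, re, errbuf, errsz);  /* truncates to fit errbuf */
    return rc;
}
```

The exact message text is implementation-defined, so code should compare the numeric return value (for example against REG_NOMATCH) rather than the string.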
The following shall be declared as functions and may also be defined as
macros. Function prototypes shall be provided.
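int regcomp(regex_t *restrict, const char *restrict, int); // compile a pattern string into the library's internal compiled form
// arguments: (compiled regex to fill in, pattern string, 【cflags】)
size_t regerror(int, const regex_t *restrict, char *restrict, size_t); // turn an error code into a message
int regexec(const regex_t *restrict, const char *restrict, size_t,
regmatch_t[restrict], int); // match a string against a compiled regex
// arguments: (compiled regex, string to match, number of entries in the result array, result array, 【eflags】)
void regfree(regex_t *); // free the memory held by a compiled regex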
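The usual lifecycle of these four functions is: compile once with regcomp(), call regexec() as many times as needed, then release with regfree(). A small sketch of that pattern, where count_matching_lines is a hypothetical helper written for this note:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count how many of the `n` strings in `lines` match
 * `pattern`. The pattern is compiled once and reused for every line.
 * Returns -1 if the pattern fails to compile. */
static int count_matching_lines(const char *pattern,
                                const char *const *lines, size_t n)
{
    regex_t re;
    size_t i;
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            count++;
    }
    regfree(&re);   /* release only after the last regexec() call */
    return count;
}
```

Compiling is the expensive step, so keeping the regex_t around for all the regexec() calls avoids repeating that work for each line.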
The implementation may define additional macros or constants using
names beginning with REG_.
The following sections are informative.
APPLICATION USAGE
None.
RATIONALE
None.
FUTURE DIRECTIONS
None.
SEE ALSO
<sys/types.h> , the System Interfaces volume of IEEE Std 1003.1-2001,
regcomp(), the Shell and Utilities volume of IEEE Std 1003.1-2001
COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .
IEEE/The Open Group 2003 <regex.h>(P)
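The original code was C++ (link: http://blog.chinaunix.net/uid-28323465-id-4083290.html).
With a few small changes it becomes C.
You can save a regular expression to a file with vi and then inspect its bytes with od: od -tx1 -c file.txt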
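// Compile: gcc regex_xjy.c
// Run: ./a.out
#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdio.h>
int main(void)
{
    const char *haa = "a very simple simple simple string";
    const char *regex = "([a-z]+)[ \t]([a-z]+)";
    regex_t comment;
    int i;
    int cnt;
    int rc;
    char str[256];
    regmatch_t regmatch[100];
    /* Compile the pattern; on failure print the regerror() message and exit. */
    rc = regcomp(&comment, regex, REG_EXTENDED|REG_NEWLINE);
    if(rc != 0)
    {
        regerror(rc, &comment, str, sizeof(str));
        fprintf(stderr, "regcomp failed: %s\n", str);
        return 1;
    }
    /* Each pass finds the next match, starting just after the previous one. */
    while(1)
    {
        int j = regexec(&comment, haa, sizeof(regmatch)/sizeof(regmatch[0]), regmatch, 0);
        if(j != 0)
            break;
        /* regmatch[0] is the whole match, regmatch[1..] are the groups;
           unused entries have rm_so == -1. */
        for(i = 0; i < 100 && regmatch[i].rm_so != -1; i++)
        {
            cnt = (int)(regmatch[i].rm_eo - regmatch[i].rm_so);
            if(cnt >= (int)sizeof(str))   /* truncate so the copy fits in str */
                cnt = (int)sizeof(str) - 1;
            printf("cnt=%d \t", cnt);
            memcpy(str, &haa[regmatch[i].rm_so], cnt);
            str[cnt] = '\0';
            printf("%s\n", str);
        }
        printf("cyc:**************%d \n", i);
        if(regmatch[0].rm_eo == 0)   /* empty match: stop rather than loop forever */
            break;
        haa += regmatch[0].rm_eo;
    }
    regfree(&comment);
    return 0;
}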