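re (regular expressions)
1. What regular expressions do
A regular expression matches strings that follow a specified rule (pattern).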
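2. Common re functions
- findall(pattern, string, flags=0): finds every non-overlapping substring that matches the regular expression and returns them all as a list (an empty list if nothing matches)
pattern: the regular expression
string: the string to search
flags=0: optional modifiers, for example re.IGNORECASE for case-insensitive matching
import re  # needed by every example in this article
string = "to1212ken132435testr"
re_demo = r"\D"  # raw string, so the backslash reaches the regex engine unchanged
res4 = re.findall(re_demo, string)
print(res4)  # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']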
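The flags parameter is described above but not demonstrated. A minimal sketch, assuming a made-up sample string (text is not from the original examples), showing how re.IGNORECASE changes what findall returns:

```python
import re

# Hypothetical sample text, chosen only to show the effect of the flag.
text = "Token TEST token"

# Without flags the match is case-sensitive.
case_sensitive = re.findall(r"token", text)

# re.IGNORECASE (short form: re.I) makes the same pattern ignore case.
case_insensitive = re.findall(r"token", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['token']
print(case_insensitive)  # ['Token', 'token']
```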
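- match(pattern, string, flags=0): matches only at the beginning of the string and returns a match object; call group() on it to get the matched text. If the beginning of the string does not match, it returns None, and calling group() on None raises AttributeError
string = "to1212ken132435testr"
re_demo = r"\D"
res4 = re.match(re_demo, string).group()
print(res4)  # Output: t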
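Because match returns None when the start of the string does not match, chaining .group() directly can fail. One possible way to guard against that is sketched below; the helper name first_non_digit_at_start is invented for this illustration.

```python
import re

def first_non_digit_at_start(text):
    """Return the non-digit character at the start of text, or None."""
    m = re.match(r"\D", text)
    # Check for None before calling group(); match() only looks at position 0.
    return m.group() if m is not None else None

print(first_non_digit_at_start("to1212ken"))  # t
print(first_non_digit_at_start("1212ken"))    # None
```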
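- search(pattern, string, flags=0): scans the whole string and returns a match object for the first place where the pattern matches; call group() on it to get the matched text. Returns None if nothing matches
string = "111212ken132435testr"
re_demo = r"\D"
res5 = re.search(re_demo, string).group()
print(res5)  # Output: k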
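The difference between match and search is easiest to see on the same input. A short sketch reusing the string from the search example; the variable names at_start and anywhere are illustrative only:

```python
import re

text = "111212ken132435testr"

# match() only tries position 0, and this string starts with a digit.
at_start = re.match(r"\D", text)

# search() scans forward until the pattern matches somewhere.
anywhere = re.search(r"\D", text)

print(at_start)          # None
print(anywhere.group())  # k
print(anywhere.start())  # 6
```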
- finditer(pattern, string, flags=0): finds every matching substring and returns an iterator of match objects (loop over it with for, and call group() on each item to get the matched text)
string = "to1212ken132435testr"
re_demo = r"\D"
res5 = re.finditer(re_demo, string)
print(res5)  # Output: <callable_iterator object at 0x0000016CAD67DD60>
for i in res5:
    print(i.group())  # Output: t o k e n t e s t r (one character per line)
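Since finditer yields match objects rather than plain strings, each result also carries its position. A small sketch under the assumption that positions are what you need; the sample string and the pattern \d+ are chosen for illustration:

```python
import re

# Hypothetical sample text.
text = "a1bc22d"

# Collect (matched text, start index, end index) for every run of digits.
positions = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", text)]

print(positions)  # [('1', 1, 2), ('22', 4, 6)]
```

If only the matched text is needed, findall is usually simpler.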
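3. Matching a single character
- . : matches any single character except \n; each . consumes exactly one character. (By default . simply does not match \n, it does not raise an error; pass the re.DOTALL flag to make it match \n as well)
string = "token test"
re_demo = r"t."
res1 = re.findall(re_demo, string)
print(res1)  # Output: ['to', 'te']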
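To make the newline behaviour of . concrete, here is a brief sketch with an invented two-line string, comparing the default with re.DOTALL:

```python
import re

# Hypothetical sample: a "t" directly followed by a newline, then "to".
text = "t\nto"

# By default "." skips "\n", so the first "t" cannot complete a match.
default = re.findall(r"t.", text)

# With re.DOTALL, "." also matches the newline character.
dotall = re.findall(r"t.", text, flags=re.DOTALL)

print(default)  # ['to']
print(dotall)   # ['t\n', 'to']
```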
- []: matches any one of the characters listed inside the brackets
string = "token testr"
re_demo = r"[tos]"
res2 = re.findall(re_demo, string)
print(res2)  # Output: ['t', 'o', 't', 's', 't']
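Character classes also accept ranges and negation, which the example above does not show. A short sketch on the same string; the two patterns are illustrative choices:

```python
import re

text = "token testr"

# A range: any one lowercase letter from a through m.
in_range = re.findall(r"[a-m]", text)

# A leading ^ negates the class: any character that is NOT t, o, s, or a space.
negated = re.findall(r"[^tos ]", text)

print(in_range)  # ['k', 'e', 'e']
print(negated)   # ['k', 'e', 'n', 'e', 'r']
```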
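- \d: matches a digit, i.e. 0-9 (for str patterns, other Unicode decimal digits also match unless re.ASCII is set)
string = "to1212ken132435testr"
re_demo = r"\d"
res3 = re.findall(re_demo, string)
print(res3)  # Output: ['1', '2', '1', '2', '1', '3', '2', '4', '3', '5']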
- \D: matches any character that is not a digit
string = "to1212ken132435testr"
re_demo = r"\D"
res4 = re.findall(re_demo, string)
print(res4)  # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']
- \s: matches a whitespace character, such as a space, tab, or newline
string = " tok en testr "
re_demo = r"\s"
res2 = re.findall(re_demo, string)
print(res2)  # Output: [' ', ' ', ' ', ' ']
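The example above only contains spaces. As a quick check that \s also covers tabs and newlines, here is a sketch with an invented string that mixes all three:

```python
import re

# Hypothetical sample containing a space, a tab, and a newline.
text = "a b\tc\nd"

whitespace = re.findall(r"\s", text)

# \S+ (one or more non-whitespace characters) picks out the pieces in between.
tokens = re.findall(r"\S+", text)

print(whitespace)  # [' ', '\t', '\n']
print(tokens)      # ['a', 'b', 'c', 'd']
```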
- \S: matches any character that is not whitespace
string = " tok en testr "
re_demo = r"\S"
res2 = re.findall(re_demo, string)
print(res2)  # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']
- \w: matches a word character, i.e. a-z, A-Z, 0-9, _, and (for str patterns) other Unicode letters such as Chinese characters
string = "'hello好好学Python3_-&^%$#@"
re_demo = r"\w"
res2 = re.findall(re_demo, string)
print(res2)  # Output: ['h', 'e', 'l', 'l', 'o', '好', '好', '学', 'P', 'y', 't', 'h', 'o', 'n', '3', '_']
- \W: matches a non-word character, i.e. anything that is not a letter, digit, Chinese character, or underscore
string = "'hello好好学Python3_-&^%$#@"
re_demo = r"\W"
res2 = re.findall(re_demo, string)
print(res2)  # Output: ["'", '-', '&', '^', '%', '$', '#', '@']
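Whether \w matches Chinese characters depends on the flags in use. A minimal sketch, assuming a str (not bytes) pattern, that compares the default Unicode behaviour with re.ASCII on a shortened version of the string above:

```python
import re

text = "hello好好学Python3_"

# Default for str patterns: \w is Unicode-aware and includes Chinese characters.
unicode_word_chars = re.findall(r"\w", text)

# With re.ASCII, \w is limited to [a-zA-Z0-9_].
ascii_word_chars = re.findall(r"\w", text, flags=re.ASCII)

print("".join(unicode_word_chars))  # hello好好学Python3_
print("".join(ascii_word_chars))    # helloPython3_
```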
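4. Matching multiple characters
- *: matches the preceding character zero or more times, i.e. the character is optional. Because zero occurrences also count as a match, findall returns an empty string at every position where the character does not occur, including the end of the string, so the 14-character string below yields 15 results
string = "token test ktv"
re_demo = r"k*"
res6 = re.findall(re_demo, string)
print(res6)  # Output: ['', '', 'k', '', '', '', '', '', '', '', '', 'k', '', '', '']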
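The empty strings produced by * are often unwanted. The sketch below contrasts k* with k+ on the same input; it assumes Python 3.7 or later, where empty matches may directly follow a non-empty match.

```python
import re

text = "token test ktv"

# k* can match zero characters, so every position yields a result.
star = re.findall(r"k*", text)

# k+ needs at least one "k", so only real occurrences are returned.
plus = re.findall(r"k+", text)

print(len(text), len(star))  # 14 15
print(plus)                  # ['k', 'k']
```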
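- +: matches the preceding character one or more times, i.e. at least once. It is greedy: .+ below consumes as many characters as possible, so the match runs to the last n
string = "token test ktn"
re_demo = r"k.+n"
res6 = re.findall(re_demo, string)
print(res6)  # Output: ['ken test ktn']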
- ?: matches the preceding character zero times or once, i.e. it is either present once or absent
string = "token test ktn"
re_demo = r"k.?n"
res6 = re.findall(re_demo, string)
print(res6)  # Output: ['ken', 'ktn']
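- +?: the non-greedy (lazy) form of +; matches the preceding character one or more times, but as few times as possible
string = "token test ktn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
print(res7)  # Output: ['ken', 'ktn']
string = "tokn test ktn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
print(res7)  # Output: ['kn test ktn'] (.+? must consume at least one character, here the first n, so the match runs on to the next n)
string = "token test kn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
print(res7)  # Output: ['ken'] (the final "kn" has no character between k and n, so it does not match)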
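Putting the greedy and lazy forms side by side on one input may make the difference clearer. This sketch reuses the patterns and string from the examples above:

```python
import re

text = "token test ktn"

# Greedy: .+ grabs as much as it can, then backs off to the last "n".
greedy = re.findall(r"k.+n", text)

# Lazy: .+? stops at the first "n" it can reach.
lazy = re.findall(r"k.+?n", text)

print(greedy)  # ['ken test ktn']
print(lazy)    # ['ken', 'ktn']
```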
- {n}: matches the preceding character exactly n times in a row
string = "tokken test kkktn"
re_demo = r"k{2}"
res8 = re.findall(re_demo, string)
print(res8)  # Output: ['kk', 'kk']
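- {m,n}: matches the preceding character from m to n times in a row (at least m, at most n); it is greedy, so it takes as many as it can up to n
string = "tokken test kkktn"
re_demo = r"k{1,3}"
res9 = re.findall(re_demo, string)
print(res9)  # Output: ['kk', 'kkk']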
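When a run is longer than the upper bound, the quantifier splits it into several matches. A small sketch with an invented run of four k characters:

```python
import re

# Hypothetical sample: four consecutive "k" characters.
text = "kkkk"

# {1,3} takes up to three first, then matches the one that is left.
one_to_three = re.findall(r"k{1,3}", text)

# {2} takes exactly two at a time.
exactly_two = re.findall(r"k{2}", text)

print(one_to_three)  # ['kkk', 'k']
print(exactly_two)   # ['kk', 'kk']
```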
5. Logical operators
- |: combines two patterns with a logical "or"; the string matches if either alternative matches
string = "tokken test kkktn"
re_demo = r"to|te|kk"
res10 = re.findall(re_demo, string)
print(res10)  # Output: ['to', 'kk', 'te', 'kk']
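One detail worth knowing about | is that alternatives are tried from left to right, and the first one that matches wins, even if a later one would match more. A brief sketch with an invented input:

```python
import re

text = "test"

# "te" is tried first and succeeds, so "test" is never attempted.
short_first = re.findall(r"te|test", text)

# Putting the longer alternative first lets it match the whole word.
long_first = re.findall(r"test|te", text)

print(short_first)  # ['te']
print(long_first)   # ['test']
```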
6. Boundaries
- ^: matches at the start of the input string
string = "tokken test kkktn"
re_demo = r"^to"
res11 = re.findall(re_demo, string)
print(res11)  # Output: ['to']
- $: matches at the end of the input string
string = "tokken test kkktn"
re_demo = r"tn$"
res12 = re.findall(re_demo, string)
print(res12)  # Output: ['tn']
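By default ^ and $ refer to the whole string, not to individual lines. The following sketch, using an invented two-line string, shows how the re.MULTILINE flag changes that:

```python
import re

# Hypothetical two-line sample.
text = "to be\nto do"

# Default: ^ matches only at the very start of the string.
start_default = re.findall(r"^to", text)

# re.MULTILINE: ^ also matches right after each newline.
start_multiline = re.findall(r"^to", text, flags=re.MULTILINE)

# Likewise, $ matches before each newline only in MULTILINE mode.
end_default = re.findall(r"be$", text)
end_multiline = re.findall(r"be$", text, flags=re.MULTILINE)

print(start_default, start_multiline)  # ['to'] ['to', 'to']
print(end_default, end_multiline)      # [] ['be']
```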
7. Group matching
- (): captures a group; when the pattern contains one group, findall returns only the text matched inside the parentheses
data = '{"member_id":"#member_id#","#id#","amount":100}'
re_str = r"#(\w.+?)#"  # capture the text between two # signs: one word character, then one or more characters, as few as possible
res = re.findall(re_str, data)
print(res)  # Output: ['member_id', 'id']
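Groups behave a little differently depending on how the result is read. The sketch below reuses the data and pattern from the example above to compare group(0) with group(1), then uses a separate invented string (pairs) to show that findall returns tuples when a pattern has more than one group:

```python
import re

data = '{"member_id":"#member_id#","#id#","amount":100}'

# With search(), group(0) is the whole match and group(1) is the first group.
m = re.search(r"#(\w.+?)#", data)
whole_match = m.group(0)
captured = m.group(1)

# Hypothetical key=value sample: two groups give a list of tuples.
pairs = re.findall(r"(\w+)=(\w+)", "k1=v1;k2=v2")

print(whole_match)  # #member_id#
print(captured)     # member_id
print(pairs)        # [('k1', 'v1'), ('k2', 'v2')]
```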