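Given an input path, this script reads every file under that path and supports three operations: replacing matched strings, adding content, and deleting content.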
import os
from fileinput import FileInput
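# Delete content: drop every line that starts with a given prefix
def match_then_delete(inputpath):
    for root, dirs, files in os.walk(inputpath):
        for file in files:
            path = os.path.join(root, file)
            output_file_path = "" + file  # prepend the output directory here
            print(output_file_path)
            with open(path, 'r', encoding='gbk') as infile:
                input_stream = infile.read()
            output_stream = ""
            # Split the content on newlines
            input_stream_lines = input_stream.split("\n")
            for line in input_stream_lines:
                # Put the prefix to delete here; an empty string matches every line
                if line.startswith(""):
                    pass
                else:
                    output_stream = output_stream + line + '\n'
            # Write the content that remains after deletion to the output file
            with open(output_file_path, 'w', encoding='gbk') as g:
                g.write(output_stream)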
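# Add content: insert content on the line above each line containing match
def match_then_insert(filename, match, content):
    for line in FileInput(filename, inplace=True):
        if match in line:
            line = content + '\n' + line
        # With inplace=True, print() writes to the file, not the console
        print(line, end='')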
# Replace matched strings
def match_then_replace(filename, oldtext, newtext):
    for line in FileInput(filename, inplace=True):
        if oldtext in line:
            line = line.replace(oldtext, newtext)
        # With inplace=True, print() writes to the file, not the console
        print(line, end='')
if __name__ == '__main__':
    inputpath = ""  # directory to process
    for root, dirs, files in os.walk(inputpath):
        for file in files:
            # Use the joined path so files in subdirectories resolve correctly
            path = os.path.join(root, file)
            match_then_replace(path, "oldtext", "newtext")
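Both match_then_insert and match_then_replace rely on the in-place mode of FileInput: while the loop runs, standard output is redirected into the file being read, so every print() call writes a line back to that file. The following is a minimal sketch of that behavior on a temporary file. The sample file, its contents, and the names sample_path and result are made up for illustration only.

```python
import os
import tempfile
from fileinput import FileInput

# Hypothetical sample file, created only for this demonstration
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    sample_path = tmp.name

# With inplace=True, print() writes into the file instead of the console
with FileInput(sample_path, inplace=True) as f:
    for line in f:
        if "beta" in line:
            # Replace the match and insert a new line above it
            line = "inserted\n" + line.replace("beta", "BETA")
        print(line, end='')

# Read the file back to see what was written
with open(sample_path) as check:
    result = check.read()
os.remove(sample_path)
```

Because each line already ends with a newline, print() is called with end='' so that no extra blank lines are written.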
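Points to note:
When the files to process are UTF-8 encoded but Python 3 decodes files as gbk by default (the default follows the system locale, as on Chinese-language Windows), using FileInput directly raises an error:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x89 in position 116: illegal multibyte sequence
If we instead use the following form
for line in fileinput.input(filename, openhook=fileinput.hook_encoded('utf-8')):
the encoding is set to UTF-8 through openhook, but inplace=True can then no longer be set, so the result cannot be written back to the file.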
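The restriction can be checked with a short sketch. The file name example.txt is a placeholder (the file is never opened, because the error is raised when the FileInput object is constructed), and combined_ok is a name introduced only for this example.

```python
import fileinput

try:
    # Combining inplace=True with an opening hook is rejected by fileinput
    fileinput.FileInput("example.txt", inplace=True,
                        openhook=fileinput.hook_encoded('utf-8'))
    combined_ok = True
except ValueError as err:
    combined_ok = False
    print(err)
```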
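The workaround used here is to modify the fileinput source code: around lines 340 and 360 (the exact line numbers depend on the Python version), add encoding="utf-8" to the code.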
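Patching the standard library is fragile, since the change is lost whenever Python is upgraded or the script runs on another machine. On Python 3.10 and later, FileInput accepts an encoding keyword argument that also works with inplace=True, for example FileInput(filename, inplace=True, encoding='utf-8'). For older versions, one possible alternative is to skip fileinput and rewrite the file with the built-in open(). The sketch below assumes the file is small enough to hold in memory; replace_in_utf8_file and the demo file are hypothetical names, not part of the original script.

```python
import os
import tempfile

def replace_in_utf8_file(filename, oldtext, newtext):
    """Replace oldtext with newtext in a UTF-8 file, rewriting it in place."""
    # Read all lines first so the file can be reopened for writing
    with open(filename, 'r', encoding='utf-8') as src:
        lines = src.readlines()
    with open(filename, 'w', encoding='utf-8') as dst:
        for line in lines:
            dst.write(line.replace(oldtext, newtext))

# Demonstration on a temporary UTF-8 file (made-up content)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("旧文本 first line\nsecond line\n")
    demo_path = tmp.name

replace_in_utf8_file(demo_path, "旧文本", "新文本")

with open(demo_path, encoding='utf-8') as check:
    demo_result = check.read()
os.remove(demo_path)
```

Unlike the in-place mode of FileInput, this version does not keep a backup copy while it runs, so an interruption during the write can leave the file truncated.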