setwd("C:\\Users\\Administrator\\Desktop")
# Read the txt file (one taxonomy string per line)
microbial_names <- readLines("your_input_file.txt")
# Use a regular expression to extract the genus-level name (the text after "g__")
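genus_names <- vapply(microbial_names, function(name) {
  # Lookbehind keeps everything that follows "g__" up to the end of the line
  matches <- regmatches(name, regexpr("(?<=g__).*$", name, perl = TRUE))
  # Lines without a "g__" field produce no match; return NA for them
  if (length(matches) > 0) matches[1] else NA_character_
}, character(1), USE.NAMES = FALSE)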
# Write the results to a new txt file
writeLines(genus_names, "output_genus_names.txt")
For example, "k__Bacteria.p__Proteobacteria.c__Betaproteobacteria.o__Burkholderiales.f__Comamonadaceae.g__Hydrogenophaga" is reduced to just "Hydrogenophaga".
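The example above can be checked without touching any files. The following is a minimal sketch, not the original author's code: `extract_genus` is a hypothetical helper name, and it assumes that ranks are separated by `.`, `;` or `|` and that the genus name itself contains none of those characters. Under that assumption it also stops at the genus when a species field (`s__`) follows, and returns `NA` when no `g__` field is present.

```r
# Hypothetical helper (not part of the original post).
# Assumption: ranks are separated by '.', ';' or '|', and the genus name
# itself contains none of these characters.
extract_genus <- function(taxa) {
  # regexec() returns the full match plus the captured group for each string
  m <- regmatches(taxa, regexec("g__([^.;|]*)", taxa))
  vapply(m, function(x) {
    # x[1] is the full match, x[2] the captured genus; no match gives character(0)
    if (length(x) >= 2 && nzchar(x[2])) x[2] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

taxa <- c(
  "k__Bacteria.p__Proteobacteria.c__Betaproteobacteria.o__Burkholderiales.f__Comamonadaceae.g__Hydrogenophaga",
  "k__Bacteria.p__Firmicutes",           # no genus field
  "k__Bacteria.g__Bacillus.s__subtilis"  # species field after the genus
)
print(extract_genus(taxa))
```

If genus labels in your data can contain a separator character (for instance, names altered when a table was imported), the pattern used in the script, which keeps everything after `g__`, may be the safer choice.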
From: https://www.cnblogs.com/wzbzk/p/17782569.html