String Replacement in R

Date: 2023-11-06 16:03:05

R gsub Function

 

The gsub() function replaces all matches of a pattern in a string. If x is a character vector, it returns a character vector of the same length and with the same attributes (after possible coercion to character). Elements in which nothing is substituted are returned unchanged (including any declared encoding).

gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
    fixed = FALSE, useBytes = FALSE)

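• pattern: the pattern to match (a regular expression, or a literal string when fixed = TRUE)
• replacement: the string that replaces each match
• x: a string or character vector to search in
• ignore.case: if TRUE, matching ignores case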
...
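
The effect of "all matches" and of the fixed argument from the signature above can be seen in a short sketch. The input string here is made up for illustration, and sub() is base R's first-match-only counterpart of gsub(), which the original text does not cover.

```r
# Hypothetical input, chosen because "." is a regex metacharacter.
x <- "a.b.c"

# sub() replaces only the first match; gsub() replaces every match.
first_only <- sub("\\.", "-", x)    # "a-b.c"
all_dots   <- gsub("\\.", "-", x)   # "a-b-c"

# An unescaped "." is a regex wildcard, so every character is replaced.
wildcard <- gsub(".", "-", x)       # "-----"

# fixed = TRUE treats the pattern as a literal string, so no escaping is needed.
literal <- gsub(".", "-", x, fixed = TRUE)   # "a-b-c"

print(c(first_only, all_dots, wildcard, literal))
```

When the pattern is a plain string with no regex meaning intended, fixed = TRUE avoids having to escape special characters.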

> x <- "R Tutorial"
> gsub("ut","ot",x)
[1] "R Totorial"

Case-insensitive replacement:

> gsub("tut","ot",x,ignore.case=TRUE)
[1] "R otorial"

If ignore.case is not set to TRUE, no replacement takes place:

> gsub("tut","ot",x)
[1] "R Tutorial"

 

Replace each run of digits:

> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- gsub("\\d+","---",x)
> y
[1] "line ---: He is now --- years old, and weights ---lbs"

 

Replace each lowercase letter:

> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- gsub("[[:lower:]]","-",x)
> y
[1] "---- 4322: H- -- --- 25 ----- ---, --- ------- 130---"

Vector replacement:

> x <- c("R Tutorial","PHP Tutorial", "HTML Tutorial")
> gsub("Tutorial","Examples",x)
[1] "R Examples"    "PHP Examples"  "HTML Examples"
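
As a small sketch of the statement that unmatched elements come back unchanged and attributes are kept, the vector below mixes matching and non-matching elements and carries names. The element names and the "PHP Manual" value are made up for illustration.

```r
# Hypothetical named vector; the second element does not contain "Tutorial".
x <- c(r = "R Tutorial", php = "PHP Manual", html = "HTML Tutorial")

y <- gsub("Tutorial", "Examples", x)

# Same length and same names as x; "PHP Manual" is returned as it was.
print(y)
print(length(y) == length(x))
```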


Regular Expression Syntax:

Syntax       Description
\\d          Digit: 0, 1, 2 ... 9
\\D          Not a digit
\\s          Whitespace character
\\S          Not a whitespace character
\\w          Word character (letter, digit or underscore)
\\W          Not a word character
\\t          Tab
\\n          New line
^            Beginning of the string
$            End of the string
\            Escapes a special character. Inside an R string the backslash is doubled: "\\\\" matches "\", "\\+" matches "+"
|            Alternation, e.g. "(e|d)n" matches "en" and "dn"
.            Any single character; with perl = TRUE it does not match a newline (\n)
[ab]         a or b
[^ab]        Any character except a and b
[0-9]        Any digit
[A-Z]        Any uppercase letter A to Z
[a-z]        Any lowercase letter a to z
[A-Za-z]     Any uppercase or lowercase letter; avoid [A-z], which in ASCII order also matches [ \ ] ^ _ `
i+           i at least one time
i*           i zero or more times
i?           i zero or one time
i{n}         i occurs n times in sequence
i{n1,n2}     i occurs n1 to n2 times in sequence
i{n1,n2}?    Non-greedy match: as few repetitions as possible
i{n,}        i occurs >= n times
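
Some of the anchors and quantifiers listed above can be tried out as follows. This is a minimal sketch; all input strings are made up for illustration and do not come from the original examples.

```r
# ^ and $ anchor a match to the start and end of the string.
trimmed <- gsub("^\\s+|\\s+$", "", "  padded  ")            # "padded"

# ? makes the preceding character optional.
spelling <- gsub("colou?r", "color", c("colour", "color"))  # "color" "color"

# {n} requires exactly n repetitions in sequence.
pairs <- gsub("a{2}", "X", "aaaa")                          # "XX"

# A greedy + takes as much as it can; the non-greedy +? takes as little as it can.
greedy <- gsub("<.+>", "", "<b>bold</b>")                   # ""
lazy   <- gsub("<.+?>", "", "<b>bold</b>")                  # "bold"

print(list(trimmed, spelling, pairs, greedy, lazy))
```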

POSIX character classes (used inside a bracket expression, e.g. [[:digit:]]+):

[:alnum:]    Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]    Alphabetic characters: [:lower:] and [:upper:]
[:blank:]    Blank characters, e.g. space, tab
[:cntrl:]    Control characters
[:digit:]    Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]    Graphical characters: [:alnum:] and [:punct:]
[:lower:]    Lower-case letters in the current locale
[:print:]    Printable characters: [:alnum:], [:punct:] and space
[:punct:]    Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]    Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]    Upper-case letters in the current locale
[:xdigit:]   Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
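
The character classes are written inside a bracket expression, as in the earlier [[:lower:]] example. The sketch below shows a few of them with gsub(); the input strings are made up for illustration.

```r
# Each class sits inside an outer pair of brackets: [[:digit:]], not [:digit:].
digits   <- gsub("[[:digit:]]+", "#", "room 101, floor 3")   # "room #, floor #"
squeezed <- gsub("[[:space:]]+", " ", "a \t\n b")            # "a b"
no_punct <- gsub("[[:punct:]]", "", "Hello, world!")         # "Hello world"

# A class can be negated like any other bracket expression.
only_alnum <- gsub("[^[:alnum:]]", "", "R-4.3 (x86_64)")     # "R43x8664"

print(c(digits, squeezed, no_punct, only_alnum))
```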

REF:

http://www.endmemo.com/program/R/gsub.php

http://cran.r-project.org/web/packages/stringr/stringr.pdf

http://stackoverflow.com/questions/11936339/in-r-how-do-i-replace-text-within-a-string



From: https://blog.51cto.com/emanlee/8213336
