grep {base}    R Documentation
Pattern Matching and Replacement
Description
grep searches for matches to pattern (its first argument) within the vector x of character strings (its second argument). regexpr does too, but returns more detail in a different format.
sub and gsub perform replacement of matches determined by regular expression matching.
Usage
grep(pattern, x, ignore.case=FALSE, extended=TRUE, value=FALSE)
sub(pattern, replacement, x,
    ignore.case=FALSE, extended=TRUE)
gsub(pattern, replacement, x,
     ignore.case=FALSE, extended=TRUE)
regexpr(pattern, text, extended=TRUE)
Arguments
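pattern |
character string containing a regular expression to be matched in the given vector of character strings. |
x, text |
a vector of character strings where matches are sought. |
ignore.case |
if TRUE, case is ignored during matching; if FALSE, matching is case sensitive. |
extended |
if TRUE, extended regular expression matching is used; if FALSE, basic regular expressions are used. |
value |
if FALSE, a vector of the integer indices of the matching elements is returned; if TRUE, the matching elements themselves are returned. |
replacement |
a replacement for the matched pattern in sub and gsub. |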
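As a minimal sketch of how ignore.case and value change what grep returns (the vector x and the variable names below are made up for illustration):

```r
x <- c("Apple", "banana", "cherry", "apricot")

# Default: integer indices of the elements that match
idx <- grep("^a", x)

# ignore.case = TRUE also matches the capitalised "Apple"
idx_ci <- grep("^a", x, ignore.case = TRUE)

# value = TRUE returns the matching elements instead of their indices
hits <- grep("^a", x, ignore.case = TRUE, value = TRUE)

print(idx)     # 4
print(idx_ci)  # 1 4
print(hits)    # "Apple" "apricot"
```

Indices are convenient for subsetting other, parallel vectors; value = TRUE is a shortcut for x[grep(pattern, x)].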
Details
The two *sub functions differ only in that sub replaces only the first occurrence of a pattern, whereas gsub replaces all occurrences.
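A short illustration of that difference, using a made-up input string s:

```r
s <- "one fish two fish"

# sub stops after the first match in each element
first_only <- sub("fish", "cat", s)

# gsub keeps going and replaces every match
every_one <- gsub("fish", "cat", s)

print(first_only)  # "one cat two fish"
print(every_one)   # "one cat two cat"
```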
The regular expressions used are those specified by POSIX 1003.2, either extended or basic, depending on the value of the extended argument.
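The sketch below shows two constructs that belong to the extended syntax: the optional-item operator ? and alternation with |. The extended argument is deliberately left at its default so that the example does not depend on that argument being accepted by your version of R; the vector words is invented for illustration.

```r
words <- c("color", "colour", "colr", "cooler")

# "u?" makes the "u" optional, so both spellings match
either_spelling <- grep("^colou?r$", words, value = TRUE)

# "(a|b)" matches either alternative
alternatives <- grep("^(colr|cooler)$", words, value = TRUE)

print(either_spelling)  # "color" "colour"
print(alternatives)     # "colr" "cooler"
```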
Value
For grep a vector giving either the indices of the elements of x that yielded a match or, if value is TRUE, the matched elements.
For sub and gsub a character vector of the same length as the original.
For regexpr an integer vector of the same length as text giving the starting position of the first match, or -1 if there is none, with attribute "match.length" giving the length of the matched text (or -1 for no match).
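One way to use the regexpr result is to combine the start positions with the "match.length" attribute and cut the matched text out with substring. This is only a sketch; strings, start, len and matched are illustrative names, not part of the interface.

```r
strings <- c("green", "hen", "dog")

m <- regexpr("en", strings)

start <- as.integer(m)            # start of the first match, -1 if none
len   <- attr(m, "match.length")  # length of that match, -1 if none

# Extract the matched text; elements without a match become NA
matched <- ifelse(start > 0,
                  substring(strings, start, start + len - 1),
                  NA)

print(start)    # 4 2 -1
print(len)      # 2 2 -1
print(matched)  # "en" "en" NA
```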
See Also
charmatch, pmatch, match.
apropos uses regexps and has nice examples.
Examples
grep("[a-z]", letters)
txt <- c("arm","foot","lefroo", "bafoobar")
## grep returns integer indices, so test their count rather than coercing them with any()
if(length(i <- grep("foo",txt)))
    cat("`foo' appears at least once in\n\t",txt,"\n")
i # 2 and 4
txt[i]
## Double all 'a' or 'b's; "\" must be escaped, i.e. `doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")
txt <- c("The", "licenses", "for", "most", "software", "are",
         "designed", "to", "take", "away", "your", "freedom",
         "to", "share", "and", "change", "it.",
         "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
         "is", "intended", "to", "guarantee", "your", "freedom", "to",
         "share", "and", "change", "free", "software", "--",
         "to", "make", "sure", "the", "software", "is",
         "free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
stopifnot( txt[i] == grep("[gu]", txt, value = TRUE) )
(ot <- sub("[b-e]",".", txt))
txt[ot != gsub("[b-e]",".", txt)]#- gsub does "global" substitution
txt[gsub("g","#", txt) !=
gsub("g","#", txt, ignore.case = TRUE)] # the "G" words
regexpr("en", txt)