grep {base} | R Documentation
Pattern Matching and Replacement
Description
grep searches for matches to pattern (its first argument) within the vector x of character strings (its second argument). regexpr does the same, but returns more detail in a different format.
sub and gsub perform replacement of matches determined by regular expression matching.
Usage
grep(pattern, x, ignore.case=FALSE, extended=TRUE, value=FALSE)
sub(pattern, replacement, x,
ignore.case=FALSE, extended=TRUE)
gsub(pattern, replacement, x,
ignore.case=FALSE, extended=TRUE)
regexpr(pattern, text, extended=TRUE)
Arguments
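pattern | character string containing a regular expression to be matched in the given vector of character strings.
x, text | a vector of character strings where matches are sought.
ignore.case | if TRUE, case is ignored during matching; if FALSE (the default), matching is case sensitive.
extended | if TRUE (the default), extended regular expressions are used; if FALSE, basic ones are used.
value | if TRUE, grep returns the matching elements themselves; if FALSE (the default), it returns their indices.
replacement | a replacement for the matched pattern in sub and gsub.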
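As a rough illustration of how these arguments interact, the sketch below uses a made-up vector `words` (not part of the original page) to show the effect of ignore.case on grep and of replacement on sub. It avoids the extended argument, which may not be available in every version of R.

```r
# Hypothetical input, chosen only for illustration
words <- c("Grep", "grep", "sed")

# Case-sensitive by default: only the lowercase "g" in "grep" matches
case_sensitive <- grep("g", words)

# ignore.case = TRUE also matches the capital "G" in "Grep"
case_insensitive <- grep("g", words, ignore.case = TRUE)

# replacement is substituted for the first piece of matched text
replaced <- sub("g", "#", words, ignore.case = TRUE)

print(case_sensitive)    # 2
print(case_insensitive)  # 1 2
print(replaced)          # "#rep" "#rep" "sed"
```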
Details
The two *sub functions differ only in that sub replaces only the first occurrence of a pattern whereas gsub replaces all occurrences.
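A minimal sketch of that difference, using a made-up string `fruit` that is not part of the original page:

```r
# Hypothetical input, chosen only for illustration
fruit <- "banana"

first_only <- sub("a", "A", fruit)    # replaces the first "a" only
every_one  <- gsub("a", "A", fruit)   # replaces every "a"

print(first_only)  # "bAnana"
print(every_one)   # "bAnAnA"
```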
The regular expressions used are those specified by POSIX 1003.2, either extended or basic, depending on the value of the extended argument.
Value
For grep, a vector giving either the indices of the elements of x that yielded a match or, if value is TRUE, the matched elements.
For sub and gsub, a character vector of the same length as the original.
For regexpr, an integer vector of the same length as text giving the starting position of the first match, or -1 if there is none, with attribute "match.length" giving the length of the matched text (or -1 for no match).
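The return shapes described here can be compared side by side. The following is only a sketch, and the vector `fruits` is a hypothetical input introduced for illustration:

```r
# Hypothetical input, chosen only for illustration
fruits <- c("apple", "banana", "cherry")

idx  <- grep("an", fruits)                # indices of matching elements
vals <- grep("an", fruits, value = TRUE)  # the matching elements themselves
subs <- sub("an", "AN", fruits)           # same length as the input

pos <- regexpr("an", fruits)              # start of first match, -1 if none
len <- attr(pos, "match.length")          # length of matched text, -1 if none

print(idx)              # 2
print(vals)             # "banana"
print(length(subs))     # 3
print(as.integer(pos))  # -1 2 -1
print(len)              # -1 2 -1
```

Wrapping the regexpr result in as.integer drops its attributes, which keeps the printed positions easy to read.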
Note
It is possible to compile R without support for regular expressions, in which case these functions are not operational.
On the Macintosh port, these functions are based on the regex regular expression library written by Henry Spencer of the University of Toronto.
See Also
charmatch, pmatch, match.
apropos uses regexps and has nice examples.
Examples
grep("[a-z]", letters)
txt <- c("arm","foot","lefroo", "bafoobar")
if(length(i <- grep("foo", txt)))   # a non-empty index vector means at least one match
    cat("`foo' appears at least once in\n\t", txt, "\n")
i # 2 and 4
txt[i]
## Double all 'a' or 'b's; "\" must be escaped, i.e. `doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")
txt <- c("The", "licenses", "for", "most", "software", "are",
"designed", "to", "take", "away", "your", "freedom",
"to", "share", "and", "change", "it.",
"", "By", "contrast,", "the", "GNU", "General", "Public", "License",
"is", "intended", "to", "guarantee", "your", "freedom", "to",
"share", "and", "change", "free", "software", "--",
"to", "make", "sure", "the", "software", "is",
"free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
all( txt[i] == grep("[gu]", txt, value = TRUE) )
(ot <- sub("[b-e]",".", txt))
txt[ot != gsub("[b-e]",".", txt)]#- gsub does "global" substitution
txt[gsub("g","#", txt) !=
gsub("g","#", txt, ignore.case = TRUE)] # the "G" words
regexpr("en", txt)