| grep {base} | R Documentation |
Pattern Matching and Replacement
Description
grep searches for matches to pattern (its first
argument) within the character vector x (its second
argument). regexpr does the same, but instead returns the position
and length of the first match in each element.
sub and gsub perform replacement of matches
determined by regular expression matching.
Usage
grep(pattern, x, ignore.case=FALSE, extended=TRUE, value=FALSE)
sub(pattern, replacement, x,
    ignore.case=FALSE, extended=TRUE)
gsub(pattern, replacement, x,
     ignore.case=FALSE, extended=TRUE)
regexpr(pattern, text, extended=TRUE)
Arguments
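pattern |
character string containing a regular expression
to be matched in the given character vector. |
x, text |
a character vector where matches are sought. |
ignore.case |
if FALSE, the pattern matching is case sensitive; if TRUE,
case is ignored during matching. |
extended |
if TRUE, extended regular expressions are used; if FALSE,
basic regular expressions are used. |
value |
if FALSE, grep returns the indices of the matching elements;
if TRUE, it returns the matching elements themselves. |
replacement |
a replacement for the matched pattern in sub and gsub. |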
Details
The two *sub functions differ only in that sub replaces only
the first occurrence of a pattern whereas gsub replaces
all occurrences.
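As a minimal sketch of that difference (the input string `s` is a made-up value, not part of the original page):

```r
# Hypothetical input string, chosen only for illustration
s <- "banana"

first_only  <- sub("a", "A", s)   # replaces only the first "a"
all_matches <- gsub("a", "A", s)  # replaces every "a"

print(first_only)   # "bAnana"
print(all_matches)  # "bAnAnA"
```

Both calls return a new string; `s` itself is left unchanged.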
The regular expressions used are those specified by POSIX 1003.2,
either extended or basic, depending on the value of the
extended argument.
Value
For grep a vector giving either the indices of the elements
of x that yielded a match or, if value is TRUE,
the matched elements.
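The two return forms of grep can be compared side by side. This is only an illustrative sketch; the vector `fruit` is an assumed example, not taken from the page.

```r
# Hypothetical character vector, for illustration only
fruit <- c("apple", "banana", "cherry", "date")

idx  <- grep("e", fruit)                # indices of the matching elements
hits <- grep("e", fruit, value = TRUE)  # the matching elements themselves

print(idx)   # 1 3 4
print(hits)  # "apple" "cherry" "date"

# Indexing x by the returned indices gives the same result as value = TRUE
stopifnot(identical(fruit[idx], hits))
```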
For sub and gsub a character vector of the same
length as x, with the replacements applied.
For regexpr an integer vector of the same length as
text giving the starting position of the first match, or -1
if there is none, with attribute "match.length" giving the
length of the matched text (or -1 for no match).
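A short sketch of reading both parts of the regexpr result may help; the vector `words` is an assumed example and does not appear in the original page.

```r
# Hypothetical input, for illustration only
words <- c("green", "entry", "sky")

m <- regexpr("en", words)

starts  <- as.integer(m)            # start of the first match, -1 if none
lengths <- attr(m, "match.length")  # length of the match, -1 if none

print(starts)   # 4 1 -1
print(lengths)  # 2 2 -1
```

Elements without a match can then be selected with a test such as `starts == -1`.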
See Also
tolower, toupper and chartr
for character translations.
charmatch, pmatch, match.
apropos uses regexps and has nice examples.
Examples
grep("[a-z]", letters)
txt <- c("arm","foot","lefroo", "bafoobar")
if(any(i <- grep("foo", txt)))
    cat("`foo' appears at least once in\n\t", txt, "\n")
i # 2 and 4
txt[i]
## Double all 'a' or 'b's; "\" must be escaped, i.e. `doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")
txt <- c("The", "licenses", "for", "most", "software", "are",
         "designed", "to", "take", "away", "your", "freedom",
         "to", "share", "and", "change", "it.",
         "", "By", "contrast,", "the", "GNU", "General", "Public", "License",
         "is", "intended", "to", "guarantee", "your", "freedom", "to",
         "share", "and", "change", "free", "software", "--",
         "to", "make", "sure", "the", "software", "is",
         "free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
stopifnot( txt[i] == grep("[gu]", txt, value = TRUE) )
(ot <- sub("[b-e]",".", txt))
txt[ot != gsub("[b-e]", ".", txt)] # gsub does "global" substitution
txt[gsub("g", "#", txt) !=
    gsub("g", "#", txt, ignore.case = TRUE)] # the "G" words
regexpr("en", txt)