Regular Expressions: Methods - Doc JavaScript | WebReference

JavaScript Regular Expressions

Regular Expression (and String) Methods

In this section we'll discuss methods that are related to regular expressions. Some are invoked as a method of a regular expression, whereas others are called as a string's method.

compile

compile() is invoked as a method of a regular expression. Its syntax is:

regexp.compile("PATTERN", ["g"|"i"|"gi"])

regexp is the name of a regular expression.
PATTERN is the text of the regular expression.

Use the compile() method with a regular expression that was created with the constructor function (not the literal notation). Use the compile method when you know the regular expression will remain constant (after getting its pattern) and will be used repeatedly throughout the script. This method actually converts the specified pattern into its internal format, for faster execution.

The compile() method can also be used to change a regular expression during execution:

var reg = new RegExp("Bart", "i");
// reg matches "Bart" here
reg.compile("Lisa", "i");
// reg doesn't match "Bart" here

You can also use this method to modify a regular expression's modifier:

var reg = new RegExp("bart", "i");
// reg matches "Bart" here
reg.compile("bart");
// reg doesn't match "Bart" here

exec

exec() is invoked as a method of a regular expression. Its syntax is:

regexp.exec(str)

regexp is the name of a regular expression.
str is the string against which to match the regular expression.

If no match is found, the method returns null, which converts to false when used as a Boolean expression.

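If a match is found, the method returns an array with the following properties:

index is the position of the matched substring within the string.
input is the original string.
[0] is the matched substring.
[1], ..., [n] are the substrings matched by each pair of parentheses in the pattern, if any.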

The following example demonstrates this method:

var reg = /(.).(.er)/i;
var ar = reg.exec("Internet");
document.write(ar.index, "<BR>",
               ar.input, "<BR>",
               ar[0],    "<BR>",
               ar[1],    "<BR>",
               ar[2]);

Here's the script's output:

0
Internet
Inter ("inter" in Navigator 4.0x, due to a bug)
I
ter

ar.index is 0 because the matched portion of the string is Inter, which is at the beginning of the string. Note that if we used the string "Internet Explorer", all properties would remain the same, except for ar.input which would be "Internet Explorer". The same applies if we specified the /g modifier for global matching.
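As a minimal sketch of the points above, the snippet below wraps exec() in a helper that handles both the null result and the array result. The helper name describeMatch and the variables found and missing are made up for illustration; they are not part of any API.

```javascript
// Hypothetical helper: summarizes what exec() returns for a pattern and a string.
function describeMatch(reg, str) {
  var ar = reg.exec(str);
  if (!ar) {
    return null;           // no match: exec() returned null
  }
  return {
    index: ar.index,       // position of the matched substring in str
    input: ar.input,       // the string that was searched
    whole: ar[0],          // the matched substring
    groups: ar.slice(1)    // substrings matched by each pair of parentheses
  };
}

var found = describeMatch(/(.).(.er)/i, "Internet Explorer");
// found.index is 0, found.whole is "Inter", found.groups is ["I", "ter"]

var missing = describeMatch(/(.).(.er)/i, "Web");
// missing is null, because "Web" contains no match
```

Checking for null before touching the array avoids a runtime error when the pattern does not match.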

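If the exec() method finds a match, it assigns several properties to the regular expression (reg in our preceding example):

lastIndex is the index at which the next match starts.
ignoreCase is true if the i modifier was specified.
global is true if the g modifier was specified.
source is the text of the pattern.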

We'll use our last example to demonstrate this as well:

var reg = /(.).(.er)/i;
var ar = reg.exec("Internet");
document.write(reg.lastIndex,  "<BR>",
               reg.ignoreCase, "<BR>",
               reg.global,     "<BR>",
               reg.source);

Have a look at this script's output, which lists the properties of the regular expression, after invoking the method:

0
true
false
(.).(.er)

With Internet Explorer your results will differ.

The only property supported by Internet Explorer 4.0 is source, so you should avoid the others. Microsoft's official documentation claims that the other properties are also supported, but they are not. Some of these properties are supported by Internet Explorer 4.0 as properties of the global RegExp object. See the following section for more information.

If the exec() method succeeds, it also assigns several properties to the global RegExp object. We'll discuss these properties in the following section.

test

test() is invoked as a method of a regular expression. Its syntax is:

regexp.test(str)

regexp is the name of a regular expression.
str is the string against which the regular expression is matched.

The test() method checks whether a pattern exists within a string. It returns true if so, and false otherwise. This method doesn't affect the global RegExp object.

The following script segment demonstrates the test() method:

var str = "tomer@netscent.com";
var reg = new RegExp("@");
if (reg.test(str))
  alert(str + " is a valid e-mail address!")
else
  alert(str + " is an invalid e-mail address!");

We'll discuss advanced e-mail verification later in the column.

match

match() is invoked as a method of a string. Its syntax is:

str.match(regexp)

regexp is the name of a regular expression. You can supply it as a literal or as a variable.
str is any string.

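This method works like exec() with the roles reversed: it is called on the string, and the regular expression is passed as its argument. If the regular expression has the /g modifier, match() returns an array of all the matched substrings instead.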
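The relationship between the two methods can be sketched as follows, assuming a current JavaScript engine. The variable names are illustrative only.

```javascript
var pattern = /(.).(.er)/i;

// The same pattern and string, called both ways.
var viaExec = pattern.exec("Internet");
var viaMatch = "Internet".match(pattern);

// Both arrays hold the matched substring followed by the parenthesized parts.
var sameElements = viaExec.join("|") === viaMatch.join("|"); // true
var sameIndex = viaExec.index === viaMatch.index;            // true

// With the g modifier, match() collects every matched substring.
var allCars = "Car Car Car".match(/Car/g);                   // ["Car", "Car", "Car"]
```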

replace

replace() is invoked as a method of a string. Its syntax is:

str.replace(regexp, replaceStr)

regexp is the name of a regular expression. You can supply it as a literal or as a variable.
str is any string.

This method is equivalent to Perl's s/// operator. The following script swaps the first two words in a string:

var str = "One Two Three".replace(/^([^ ]+) +([^ ]+)/, "$2 $1");
document.write(str); // prints "Two One Three"

Notice the variable interpolation that takes place. As opposed to Perl, it doesn't matter if you use double-quotes or single-quotes for the second argument. The replacement string undergoes variable interpolation each time the pattern matches. Note that only Perl-like variables ($...) are interpolated in the replacement string. You can also embed other variables in the string, but they are not interpolated each time the pattern matches. Here's an example:

var company = "Digital";
var str = "Intel is a chip manufacturer!";
var newstr = str.replace(/Intel is/, company + " is");
document.write(newstr); // prints "Digital is a chip manufacturer!"
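To make the distinction concrete, here is a small sketch that mixes both kinds of variables in one replacement string. The names prefix and tagged are hypothetical. The ordinary variable is concatenated once, before replace() runs, while $1 is filled in again for every match.

```javascript
var prefix = "#";

// prefix + "$1" is evaluated once and yields the replacement string "#$1".
// $1 is then substituted with the parenthesized match on each replacement.
var tagged = "red green blue".replace(/(\w+)/g, prefix + "$1");
// tagged is "#red #green #blue"
```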

Note that only Navigator 4.0x lets you enclose the regular expression in ordinary quotes:

var str = "One Two Three".replace("^([^ ]+) +([^ ]+)", "$2 $1");
document.write(str);
// prints "Two One Three" under Navigator 4.0x
// prints "One Two Three" under Internet Explorer 4.0

Like exec() and match(), this method updates the RegExp object's properties.

If you want to enable multiple replacements, the regular expression should utilize a /g modifier:

var str = "Car Car Car";
var newstr = str.replace(/Car/g, "Bus");
document.write(newstr); // prints "Bus Bus Bus"

If you do not include this modifier, only the first match is replaced with the alternative string, so the preceding script would print "Bus Car Car".

search

search() is invoked as a method of a string. Its syntax is:

str.search(regexp)

regexp is the name of a regular expression. You can supply it as a literal or as a variable.
str is any string.

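This method is similar to test(), but it is called on the string and takes the regular expression as its argument. Instead of true or false, it returns the index of the first match, or -1 if no match is found.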
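A short sketch of search() on the address used earlier, assuming a current JavaScript engine, where the call returns the index of the first match or -1 when there is none. The variable names are illustrative.

```javascript
var address = "tomer@netscent.com";

var atPosition = address.search(/@/);   // 5, the index of the "@" character
var hashPosition = address.search(/#/); // -1, because there is no "#" in the string

// A true/false answer, comparable to what test() gives:
var hasAt = atPosition !== -1;          // true
```

Since a match at the start of the string gives 0, compare the result with -1 rather than using it directly as a Boolean expression.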

split

split() is invoked as a method of a string. Its syntax is:

str.split(regexp)

regexp is the name of a regular expression. You can supply it as a literal or as a variable. It can also be an ordinary string.
str is any string.

This method updates the RegExp object if a match is found.

The split() method scans a string (which is actually its object) for delimiters, and splits the string into a list of substrings, returning the resulting list in the form of an array. The delimiters are determined by repeated pattern matching, using the given regular expression. Thus, the delimiters may be of any size and do not need to be the same string on every match. If the pattern does not match at all, the method returns the original string as a single substring. If it matches once, you get two substrings, in the form of a two-element array.
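The three cases described above can be sketched like this. The sample strings and variable names are made up for illustration.

```javascript
// Delimiters of different sizes and different text: ", " then ";" then two spaces.
var fields = "one, two;three  four".split(/[,; ]+/);
// fields is ["one", "two", "three", "four"]

// The pattern never matches: the original string comes back as the only element.
var single = "abc".split(/,/);
// single is ["abc"]

// The pattern matches once: two substrings.
var pair = "a,b".split(/,/);
// pair is ["a", "b"]
```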

In Netscape Navigator 4.0x you can also hand the method an integer as a second argument, so that the method splits the string into no more than that many fields. If a regular expression is not provided, the method returns an array whose only element is the original string. So the following statement leaves the text of the string intact, although str then refers to a one-element array:

str = str.split();

A pattern never matches in one spot more than once, even if it matched with a zero width. Here's an example:

document.write("a string".split(/ */).join(", "));

This statement outputs the following string:

a, s, t, r, i, n, g

The space between the two words (a, string) disappeared because it matched as part of the delimiter. As a reminder, the join() method joins an array's elements into one string and puts the given delimiter between each of the substrings. The statement:

anyString.split(//);

should return an array of the characters in anyString, including spaces. For example:

document.write("a string".split(//).join(", "));

should produce the following output, but Navigator 4.0x and Internet Explorer 4.0 generate an error when a null pattern is used (in JavaScript source, // also begins a comment):

a,  , s, t, r, i, n, g

So instead of a null pattern, you should use an ordinary null string to split a string into characters:

document.write("a string".split("").join(", "));

The split() method doesn't usually return delimiters, but if the pattern contains parentheses, then the substring matched by each pair of parentheses is included in the resulting array, interspersed with the fields that are ordinarily returned. Here's a simple example (from the book "Programming Perl"):

"1-10,20".split(/([-,])/)

which returns an array with the following values (in order):

"1", "-", "10", ",", "20"
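A quick way to check this behavior, sketched here with hypothetical variable names and assuming a current JavaScript engine: because the parenthesized delimiters are kept in the array, joining the pieces with an empty string rebuilds the original string, which is not true when the parentheses are left out.

```javascript
var original = "1-10,20";

// With parentheses, the delimiters appear between the fields.
var withDelimiters = original.split(/([-,])/);
// withDelimiters is ["1", "-", "10", ",", "20"]

// Without parentheses, only the fields are returned.
var fieldsOnly = original.split(/[-,]/);
// fieldsOnly is ["1", "10", "20"]

var rebuilt = withDelimiters.join(""); // "1-10,20", same as original
```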

If the pattern consists of several parentheses, and some of the pairs don't match, Navigator 4.0x and Internet Explorer 4.0 do not act the same. For example:

document.write("1-10,20".split(/(-)|(,)/).join(":"));

produces the following output for Navigator 4.0x:

1:-:10::,:20

and the following output for Internet Explorer 4.0:

1:10:20

Try to avoid such patterns with the split() method.

Created: October 23, 1997
Revised: December 4, 1997
URL: http://www.webreference.com/js/column5/methods.html