Regular Expressions: Backreferences - Doc JavaScript | 2 | WebReference

Unix Regular Expressions

Backreferences

We mentioned in the previous section that you can use parentheses to group things for quantifiers, but you can also use them to remember pieces of what you have already matched. That is, a pair of parentheses around part of a regular expression causes the portion of the string matched by that part to be remembered for later use. Take a look at the following regular expressions:

/\d+/
/(\d+)/

Both regular expressions match as many digits as possible, but in the latter case they will be remembered in a special variable so they can be backreferenced later.
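The difference between the two patterns can be seen in the array that JavaScript's match() method returns. This is a minimal sketch; the sample string and the variable names are made up for illustration.

```javascript
// Hypothetical sample input, chosen only for illustration.
const sampleText = "Order 1234 shipped";

// Without parentheses: the digits are matched, but no group is remembered.
const plainMatch = sampleText.match(/\d+/);

// With parentheses: the same digits are also stored as group 1.
const groupedMatch = sampleText.match(/(\d+)/);

console.log(plainMatch[0]);       // "1234" (the whole match)
console.log(plainMatch.length);   // 1 (no remembered groups)
console.log(groupedMatch[0]);     // "1234" (the whole match)
console.log(groupedMatch[1]);     // "1234" (the remembered group)
```

Index 0 of the result always holds the whole match; remembered groups start at index 1.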

Within the same regular expression, use a backslash followed by an integer as a backreference. The integer corresponding to a given pair of parentheses is determined by counting left parentheses from the beginning of the pattern. The following regular expression matches something similar to an HTML tag (like <STRONG>text here</STRONG>):

/<(.*)>.*<\/\1>/
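As a rough illustration, the pattern can be tried against a matching and a mismatched pair of tags. The test strings and the name tagPattern are assumptions made for this sketch.

```javascript
// \1 must repeat exactly what the first pair of parentheses matched.
const tagPattern = /<(.*)>.*<\/\1>/;

// Opening and closing names agree, so the pattern matches.
console.log(tagPattern.test("<STRONG>text here</STRONG>"));   // true

// The closing name differs from the opening one, so it does not match.
console.log(tagPattern.test("<STRONG>text here</EM>"));       // false

// The remembered tag name is available as group 1 of the result.
const tagResult = tagPattern.exec("<STRONG>text here</STRONG>");
console.log(tagResult[1]);                                    // "STRONG"
```

Note that the comparison is literal: the backreference matches the same characters again, not merely something of the same shape.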

The following regular expression matches a string (at least four characters long) whose first two characters are also its last two characters, but in reverse order (such as "abcdefba"):

/^(.)(.).*\2\1$/
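A short sketch of how this pattern behaves on a few inputs; the test strings other than "abcdefba" are invented for illustration.

```javascript
// Group 1 is the first character, group 2 the second;
// the string must end with group 2 followed by group 1.
const mirrorPattern = /^(.)(.).*\2\1$/;

console.log(mirrorPattern.test("abcdefba"));   // true  (starts "ab", ends "ba")
console.log(mirrorPattern.test("abba"));       // true  (shortest form: four characters)
console.log(mirrorPattern.test("abcdefab"));   // false (ends "ab", not reversed)
console.log(mirrorPattern.test("aba"));        // false (too short for both pairs)
```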

Outside regular expressions, such as in the replacement part of a substitution, the backreference is written as if it were a scalar variable named by an integer (e.g., $1, $2, $3, $99). This naming convention comes from Perl, and JavaScript adopted it for regular expressions. So, if you want to swap the first two words of a string, for example, you could use the following in Perl:

s/(\S+)\s+(\S+)/$2 $1/

In JavaScript this would be:

str = str.replace(/(\S+)\s+(\S+)/, "$2 $1");
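Here is the same call run on a concrete value, as a small sketch. The input strings are assumptions picked for the example.

```javascript
// $1 and $2 in the replacement string stand for the first and second groups.
let phrase = "hello world again";
phrase = phrase.replace(/(\S+)\s+(\S+)/, "$2 $1");
console.log(phrase);   // "world hello again"

// Any run of whitespace between the two words collapses to one space,
// because the replacement string contains a single literal space.
const spaced = "first    second".replace(/(\S+)\s+(\S+)/, "$2 $1");
console.log(spaced);   // "second first"
```

Only the first two words are swapped, since the pattern has no global flag and is applied once.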

If you have no regular expression background, the substitution syntax above probably looks unfamiliar. We'll introduce the substitution operator in the next section.

In JavaScript, the special variable is called RegExp.$n, where n is a digit from 1 to 9 (unless you are referencing it in the regular expression itself or handing it as the second argument to the replace() method). Since RegExp is a global object, you can access RegExp.$1, RegExp.$2, and so forth, from anywhere in the script. In Perl, by contrast, $1 and $2 are local to the enclosing block.
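The following sketch shows the RegExp properties being read after a successful match. It assumes an engine that still provides these properties (current engines treat them as a legacy feature); the input string and variable names are made up for illustration.

```javascript
// A successful match updates the properties on the global RegExp object.
const rangeFound = /(\d+)-(\d+)/.test("pages 10-20");

// Copy the values right away: the next successful match anywhere
// in the script overwrites them.
const firstGroup = RegExp.$1;
const secondGroup = RegExp.$2;

console.log(rangeFound);    // true
console.log(firstGroup);    // "10"
console.log(secondGroup);   // "20"
```

Because the properties are shared by the whole script, reading them long after the match that set them is fragile; the array returned by match() or exec() carries the same values without that risk.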

Perl and JavaScript also support several other backreference variables:

Created: October 23, 1997
Revised: December 4, 1997
URL: http://www.webreference.com/js/column5/values.html