Regular Expression Cheatsheet
December 15, 2004.    
For novice programmers trying to grok some of the code involved with Perl, here is a handy cheatsheet of regular expressions used in Perl, PHP, and *nix in general.

Almost a year ago Stu MacKenzie posted the following explanation, which covers similar ground. I'd meant to post it some time ago and never got around to it. Here it is, finally!

# A "phrase" is defined by (); thus (abc) means "locate 'abc' in the string". Also used for making backreferences to matched bits.

# A "set" of characters is defined by []; thus [abc] means "a" or "b" or "c". Putting ^ as the first character inside the brackets means anything-but-these-characters; thus [^abc] means anything but "a" or "b" or "c".

# In all other uses, ^ means the start of the string.

# As the last character in the pattern, $ means the end of the string.

# + means the preceding character, set, or phrase must be present one or more times.

# ? means the preceding character, set, or phrase may be present zero or one times.

# \ is an escape char; when preceded by \, the following characters gain special meaning: d, D, w, W, s, S, and b. \d is any digit; \D is any non-digit; \w is any word character (letter, digit, or underscore); \W is any non-word character; \s is any whitespace char; \S is any non-whitespace char; \b is the boundary between a \w and a \W (or between a \w and the start or end of the string).

# A full stop normally means "any character" (except a newline, in most cases). To specify a full stop, escape it with \, as \.
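
The rules above are easy to try out. Here is a minimal sketch using Python's re module as a stand-in; that choice is my assumption, since the list itself is about Perl, but these particular constructs are written the same way in both. The sample strings and variable names are made up for illustration.

```python
import re

# () groups a "phrase" and captures what it matched for later use.
m = re.search(r"(abc)", "xxabcxx")
captured = m.group(1)

# \1 is a backreference: "whatever group 1 matched, again".
doubled = re.search(r"(ab)\1", "xxababxx") is not None

# [abc] is a set; [^abc] is anything but those characters.
in_set = re.findall(r"[abc]", "a1b2c3")
not_in_set = re.findall(r"[^abc]", "a1b2c3")

# ^ anchors to the start of the string, $ to the end.
starts_with_foo = re.search(r"^foo", "foobar") is not None
ends_with_foo = re.search(r"foo$", "foobar") is not None

# + means one or more; ? means zero or one.
plus_matches = [bool(re.search(r"^ab+c$", s)) for s in ("ac", "abc", "abbbc")]
opt_matches = [bool(re.search(r"^ab?c$", s)) for s in ("ac", "abc", "abbbc")]

# \d matches digits; \b is the boundary between a word char and a non-word char.
digits = re.findall(r"\d+", "room 101, floor 3")
whole_word = re.search(r"\bcat\b", "a cat sat") is not None
inside_word = re.search(r"\bcat\b", "concatenate") is not None

# An unescaped full stop matches any character; \. matches only a literal ".".
any_char = re.search(r"^a.c$", "abc") is not None
literal_dot = re.search(r"^a\.c$", "abc") is not None
```

Note the difference between the last two lines: the pattern with the bare full stop accepts "abc", while the escaped version would only accept the literal string "a.c".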

So, Rael's sample [this is in answer to a List Member's question regarding interpolate_fancy] breaks down something like this:

(^[^\.]+/?$) -- the string begins with one or more of anything that's not "."; the string ends with zero or one /. Because the pattern is anchored at both ends, the whole string must be free of "."

| -- or --

(^$) -- the string begins and then immediately ends; i.e. it's an empty string

| -- or --

(index\.\w+$) -- "index" followed by "." followed by one or more word characters, all at the end of the string.
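
To see the three alternatives working together, here is a small sketch, again in Python as a stand-in for Perl. I'm assuming the pieces are joined with | exactly as shown in the breakdown above; the helper name and the test paths are hypothetical and only there for illustration.

```python
import re

# The three alternatives from the breakdown, joined with | (assumed form).
pattern = re.compile(r"(^[^\.]+/?$)|(^$)|(index\.\w+$)")

def matches(path):
    """Return True if any of the three alternatives matches the path."""
    return pattern.search(path) is not None

# Hypothetical sample paths, grouped by which alternative they exercise.
results = {
    "foo/bar": matches("foo/bar"),                # no "." anywhere: first alternative
    "foo/bar/": matches("foo/bar/"),              # trailing slash is allowed
    "": matches(""),                              # empty string: second alternative
    "index.html": matches("index.html"),          # third alternative
    "foo/index.html": matches("foo/index.html"),  # third alternative, not anchored at the start
    "story.html": matches("story.html"),          # has a "." and is not an index file
}
```

Only the last path, "story.html", fails all three alternatives. The third alternative has no ^ in front of it, which is why "foo/index.html" passes even though the "index." part is not at the start of the string.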
