Pattern Matching – Regular Expressions Part 1: META Characters

What are Regular Expressions?

Regular Expressions are a method of describing both simple and complex patterns for searching and manipulating text. We use META Characters to define the search criteria, and Oracle’s implementation is an extension of the POSIX (Portable Operating System Interface) standard.

This post gives the reader an overview of the META Characters used in the Oracle database for Regular Expression pattern matching.

META Characters

Symbol    Description
^         Marks the start of a line
$         Marks the end of a line
[ ]       Matching list
|         Operator for specifying alternative matches (logical OR)
?         Matches zero or one occurrence
.         Matches any character except NULL
{m}       Matches exactly m times
{m,n}     Matches at least m times but no more than n times
[: :]     Specifies a character class and matches any character in the class
\         Escape character
+         Matches one or more occurrences
*         Matches zero or more occurrences
()        Grouping for expression
\n        Back-reference expression

In this first segment, only the following META Characters will be addressed: ^, $, |, ., {m}, and *.

Caret

The caret (^) marks the beginning of a line.
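
If you do not have a database handy, the caret is easy to try in a few lines of Python. This is only a minimal sketch: it uses Python's re module as a stand-in for Oracle, and the sample names and the pattern ^A are made up for illustration.

```python
import re

# Hypothetical sample data, not taken from the original example.
names = ["Adams", "Baker", "Clark", "Anderson"]

# '^A' only matches when 'A' is the very first character.
starts_with_a = [n for n in names if re.search(r"^A", n)]

print(starts_with_a)
```

Only the names that begin with a capital A survive the filter; an A anywhere else in the value would not count.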

Dollar Sign

The dollar sign ($) marks the end of a line.
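
The dollar sign can be sketched the same way. Again, Python's re module is standing in for Oracle here, and the sample names and the pattern n$ are assumptions chosen for illustration.

```python
import re

# Hypothetical sample data, not taken from the original example.
names = ["Adams", "Baker", "Clark", "Anderson"]

# 'n$' only matches when 'n' is the very last character.
ends_with_n = [n for n in names if re.search(r"n$", n)]

print(ends_with_n)
```

Here only the name whose final character is a lowercase n passes.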

Logical OR (|) – if either alternative matches, the pattern is valid.
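
A quick way to see the OR operator at work is the sketch below. It uses Python's re module rather than Oracle, and the word list and the pattern cat|dog are invented for illustration.

```python
import re

# Hypothetical sample data, not taken from the original example.
words = ["catalog", "hotdog", "bird"]

# 'cat|dog' is valid if either 'cat' or 'dog' appears anywhere in the value.
has_cat_or_dog = [w for w in words if re.search(r"cat|dog", w)]

print(has_cat_or_dog)
```

The first word is valid because of the left-hand alternative, the second because of the right-hand one, and the third contains neither.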

These first three examples are relatively straightforward.  The following set, not so much.  After each example, I will provide a breakdown of the process to determine if the pattern is valid.

Dot

The dot (.) matches any single character, much like the underscore (_) wildcard in simple pattern matching with LIKE.
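
To make the single-character rule concrete, here is a minimal sketch in Python's re module, used as a stand-in for Oracle. The word list and the pattern ^b.t$ are assumptions for illustration; the anchors are there so the whole value has to fit the pattern.

```python
import re

# Hypothetical sample data, not taken from the original example.
words = ["bat", "bit", "boot", "bt"]

# '^b.t$' means: 'b', then exactly one character of any kind, then 't'.
one_char_between = [w for w in words if re.search(r"^b.t$", w)]

print(one_char_between)
```

Breaking it down: 'bat' and 'bit' each have exactly one character between the b and the t, 'boot' has two, and 'bt' has none, so only the first two are valid.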

{m} matches exactly m consecutive occurrences of the preceding character.

Valid patterns occur where the letter ‘s’ appears consecutively. Notice that while ‘sister’ contains the requisite number of ‘s’ characters, they are not consecutive.
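
The consecutive requirement can be sketched as follows, with Python's re module standing in for Oracle. I am assuming m = 2 (the pattern s{2}) because ‘sister’ contains two of the letter; apart from ‘sister’, the word list is made up.

```python
import re

# 'sister' comes from the text above; the other words are hypothetical.
words = ["class", "sister", "sun", "missing"]

# 's{2}' needs two 's' characters side by side (m = 2 is an assumption).
double_s = [w for w in words if re.search(r"s{2}", w)]

print(double_s)
```

Breaking it down: 'class' and 'missing' each contain 'ss', 'sun' has only one s, and 'sister' has two that are separated by the 'i', so it is rejected.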

Star (*) matches zero or more occurrences of the preceding character.

Notice that in this instance ‘acc’ is not a valid pattern.  It gets past the point of zero occurrences of the letter ‘b’, but it is the second ‘c’ that invalidates this option.  By the same reasoning, ‘ac’ is valid.
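
The reasoning above can be reproduced with a small sketch in Python's re module, used here in place of Oracle. The pattern ^ab*c$ is my assumption about what such a test would look like; ‘abc’ and ‘abbc’ are added as extra hypothetical values.

```python
import re

# 'ac' and 'acc' come from the text above; 'abc' and 'abbc' are hypothetical.
values = ["ac", "abc", "abbc", "acc"]

# '^ab*c$' (an assumed pattern): 'a', zero or more 'b', one 'c', then the end.
valid = [v for v in values if re.search(r"^ab*c$", v)]

print(valid)
```

With zero, one, or two occurrences of 'b' the value is valid, while 'acc' fails because a second 'c' sits where the end of the value is expected.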

There you have it.  Next time we will put these into action as I discuss the five Regular Expression Functions.

Enjoy!

dbaOnTap
