Pacific Beach Drive

Mike's Drive.

Follow me on GitHub

Pattern Matching - Grouping and Delimiting

Anchors

  • By default, matching occurs anywhere in the string
  • Provide matching the position of a pattern
  • use anchors to match the position or location of a pattern
  • Caret (^) indicates the start of a string, matching the beginning of a string
    • distinct from the caret in character class
    • redundant when using the match() method
  • Dollar sign ($) indicates the end of a string

Start Anchor Examples

Pattern String  
^[Bb]erkeley UC Berkeley no match
^[Bb]erkeley berkeley match
^[Bb]erkeley Berkeley, Ca match
^[Bb]erkeley UC Berkley no match (misspelling)

Anchors Examples

  • $ will only match strings that contain Berkeley
Pattern String  
^[Bb]erkeley$ UC Berkeley no match
^[Bb]erkeley$ berkeley match
^[Bb]erkeley$ Berkeley, Ca no match
^[Bb]erkeley$ UC Berkley no match (misspelling)

Grouping — Alternative Patterns

  • Pipe or vertical bar character ( ) separates multiple patterns
    • Acts like an or operator
      • pattern1 pattern2 pattern3
    • Match is made if any of the pattern have a match
    • pattern matching goes in sequence with groups

Or Example

Pattern String  
software|hardware software no match
software|hardware hardware no match
software|hardware dinnerware hardware match
software|hardware firmware no match

Grouping Patterns

  • parenthesis allow for grouping of patterns
  • treats a series of patterns as one unit (repetition qualifiers)

Grouping

(...)
# starts and ends a group of regex patterns
(?P<name>...)
# starts and ends a named group name of regex patterns

Grouping Example

Pattern String  
(soft|hard)ware software match
(soft|hard)ware hardware match
(soft|hard)ware dinnerware hardware match
(soft|hard)ware firmware no match