Match Beginning & End of Words

One of the things I had to study up on while creating jPaq was regular expressions. I wanted people who know how to use wildcard expressions, but not regular expressions, to be able to use the equivalent regular expressions. One of the more interesting tasks I had was approximating the word-boundary meta-characters:

  • <
  • >

The less-than sign represents the beginning of a word, while the greater-than sign represents the end of a word. The question is, how do you represent these two as regular expressions? To do so, I am using a positive look-ahead group, which matches a position without consuming any characters. Here is the regular expression for matching the beginning of a word:

/(?=\b\w)/
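
To see what this pattern actually matches, here is a small sketch in plain JavaScript. The helper names markWordStarts and startsWord are made up for illustration and are not part of jPaq, and startsWord assumes the prefix is plain letters with no regex meta-characters.

```javascript
// Zero-width pattern from the text: the position where a word begins.
const WORD_START = "(?=\\b\\w)";

// Hypothetical helper: put a "<" marker at the start of every word.
// The match is zero-width, so no characters of the text are replaced.
function markWordStarts(text) {
  return text.replace(new RegExp(WORD_START, "g"), "<");
}

// Hypothetical helper: does any word in `text` begin with `prefix`?
// Assumes `prefix` contains only plain letters (no escaping is done).
function startsWord(prefix, text) {
  return new RegExp(WORD_START + prefix).test(text);
}

console.log(markWordStarts("the quick brown fox")); // "<the <quick <brown <fox"
console.log(startsWord("qu", "the quick fox"));     // true
console.log(startsWord("qu", "an antique fox"));    // false ("qu" is mid-word)
```

Because the look-ahead consumes nothing, whatever follows it in a larger expression still gets to match the first letter of the word.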

Here is the regular expression for matching the end of a word:

/(?=\b\W|\b$)/
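
The end-of-word pattern can be checked the same way. This is only a sketch: markWordEnds and endsWord are hypothetical names rather than jPaq functions, and endsWord assumes the suffix is plain letters with no regex meta-characters.

```javascript
// Zero-width pattern from the text: the position where a word ends,
// either before a non-word character or at the end of the string.
const WORD_END = "(?=\\b\\W|\\b$)";

// Hypothetical helper: put a ">" marker at the end of every word.
function markWordEnds(text) {
  return text.replace(new RegExp(WORD_END, "g"), ">");
}

// Hypothetical helper: does any word in `text` end with `suffix`?
// Assumes `suffix` contains only plain letters (no escaping is done).
function endsWord(suffix, text) {
  return new RegExp(suffix + WORD_END).test(text);
}

console.log(markWordEnds("the quick brown fox")); // "the> quick> brown> fox>"
console.log(markWordEnds("Hello, world!"));       // "Hello>, world>!"
console.log(endsWord("ing", "singing loudly"));   // true
console.log(endsWord("ing", "ringer"));           // false ("ing" is mid-word)
```

The second alternative, \b$, is what catches the last word when the string ends without any trailing punctuation or whitespace.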

As of right now, jPaq creates regular expressions which do the same thing, but the two expressions shown above are better optimized. Therefore, in the next version, I will use them to shorten and optimize the generated expressions. Tomorrow I will talk about what else needs to change in jPaq to better approximate the wildcard expressions that are available in Microsoft Word.
