One of the things I had to study up on while creating jPaq was regular expressions. I wanted those who know how to write wildcard expressions, but not regular expressions, to be able to get equivalent regular expressions. One of the more interesting tasks was approximating the two word-boundary metacharacters:
- <
- >
The less-than sign represents the beginning of a word, while the greater-than sign represents the end of a word. The question is, how do you represent these two in the form of a regular expression? To do so, I am using a positive lookahead, which matches a position without consuming any characters. Here is the regular expression for matching the beginning of a word:
/(?=\b\w)/
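Because the lookahead is zero-width, the match is a position rather than a character. One way to see this is to insert a marker at every match. This is only a minimal sketch: `WORD_START` and `markWordStarts` are names made up for this illustration and are not part of jPaq.

```javascript
// Hypothetical names for illustration; not part of jPaq's API.
// The "g" flag makes replace() visit every word start, not just the first.
const WORD_START = /(?=\b\w)/g;

// Insert a caret at each zero-width match so the positions become visible.
function markWordStarts(text) {
  return text.replace(WORD_START, "^");
}

console.log(markWordStarts("regular expressions, not wildcards"));
// → "^regular ^expressions, ^not ^wildcards"
```

Note that `\w` only covers letters, digits, and the underscore, so an apostrophe splits a word: `markWordStarts("don't")` gives `"^don'^t"`.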
Here is the regular expression for matching the end of a word. The second alternative covers a word that ends at the very end of the string:
/(?=\b\W|\b$)/
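To show how the two expressions might be spliced into a generated expression, here is a rough sketch. It assumes a drastically simplified wildcard syntax in which only `<` and `>` are special and every other character is literal. The helper names (`markWordEnds`, `wordWildcardToRegExp`) are hypothetical, and this is not how jPaq itself builds its expressions.

```javascript
// The two expressions from this post, kept as source strings so they can be
// spliced into a larger pattern.
const WORD_START_SRC = "(?=\\b\\w)";
const WORD_END_SRC = "(?=\\b\\W|\\b$)";

// Insert a bar at each word end so the zero-width matches become visible.
function markWordEnds(text) {
  return text.replace(new RegExp(WORD_END_SRC, "g"), "|");
}

// Hypothetical helper (an assumption, not jPaq's API): translates only
// "<" and ">"; every other character is matched literally.
function wordWildcardToRegExp(pattern) {
  const source = pattern
    .split("")
    .map(function (ch) {
      if (ch === "<") return WORD_START_SRC;
      if (ch === ">") return WORD_END_SRC;
      // Escape regex metacharacters so they match themselves.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp(source, "g");
}

console.log(markWordEnds("one two."));
// → "one| two|."

const sample = "the other then bathe";
console.log(sample.match(wordWildcardToRegExp("<the")));  // "the" at the start of "the" and "then"
console.log(sample.match(wordWildcardToRegExp("the>")));  // "the" at the end of "the" and "bathe"
console.log(sample.match(wordWildcardToRegExp("<the>"))); // only the whole word "the"
```

Keeping the expressions as strings rather than regex literals is just a convenience here, since it lets them be concatenated with the escaped literal text before the final `RegExp` is built.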
As of right now, jPaq creates regular expressions which do the same thing, but the two shown above are shorter and better optimized. Therefore, in the next version, I will definitely use them to shorten and optimize the generated expressions. In fact, tomorrow I will talk about what else needs to change in jPaq to better approximate the wildcard expressions that are available in Microsoft Word.