jPaq – Changes In Wildcard Expression Parsing

After reviewing this page, I realized that I need to fix some of the ways that jPaq converts wildcard expressions into regular expressions. First, the @ character needs to act as a meta-character equivalent to the + meta-character in a regular expression. Second, I need to change the effect of prefixing a character with the ~ character. Right now, the ~ character does the work of both the \ character and the ^ character. This means that the \ character will need to start acting as the escape character (which will not be hard, since that already happens in regular expressions). It also means that the following translations will have to occur:

Wildcard    RegExp
^t          \t
^^          \^
^s          \u00A0
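
To make the table above concrete, here is a minimal sketch of how those three translations could be applied to a wildcard expression. The names CARET_CODES and translateCaretCodes are hypothetical and are not part of jPaq; the sketch only covers the three codes listed and ignores every other wildcard rule.

```javascript
// Hypothetical helper (not part of jPaq): maps the caret codes from the
// table above onto their regular-expression source equivalents.
const CARET_CODES = {
  "t": "\\t",     // ^t -> tab
  "^": "\\^",     // ^^ -> literal caret
  "s": "\\u00A0"  // ^s -> non-breaking space
};

function translateCaretCodes(wildExp) {
  // Replace each caret followed by a known code; leave everything else alone.
  return wildExp.replace(/\^([t^s])/g, (match, code) => CARET_CODES[code]);
}

// The translated source is the string: a\tb\^c\u00A0d
const source = translateCaretCodes("a^tb^^c^sd");
const re = new RegExp(source);
console.log(re.test("a\tb^c\u00A0d")); // true
```

A function is used as the replacement so that the returned text is inserted literally, without any special handling of $ sequences.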

Match Beginning & End of Words

One of the things I had to study up on while creating jPaq was regular expressions. I wanted those who know how to use wildcard expressions, but not regular expressions, to be able to use equivalent regular expressions. One of the more interesting tasks that I had was approximating the word boundary meta-characters:

  • <
  • >

The less-than sign represents the beginning of a word, while the greater-than sign represents the end of a word. The question is, how do you represent these two in the form of a regular expression? To do so, I am using a positive look-ahead group. Here is the regular expression for matching the beginning of a word:

/(?=\b\w)/

Here is the regular expression for matching the end of a word:

/(?=\b\W|\b$)/
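
Because both expressions are zero-width, it can be hard to see where they match. The sketch below uses the two expressions exactly as written above (with the g flag added) and inserts a marker at every match position. The variable names wordStart and wordEnd are just for illustration.

```javascript
// The two look-ahead expressions from the post, with the global flag
// so that every position in the string is tested.
const wordStart = /(?=\b\w)/g;
const wordEnd = /(?=\b\W|\b$)/g;

// Each match is empty, so replace() simply inserts the marker at that position.
console.log("jPaq is neat".replace(wordStart, "<")); // <jPaq <is <neat
console.log("jPaq is neat".replace(wordEnd, ">"));   // jPaq> is> neat>
```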

As of right now, jPaq creates regular expressions which do the same thing, but the two shown above are shorter and better optimized. Therefore, in the next version, I will definitely use them to shorten and optimize the generated expressions. In fact, tomorrow I will talk about what else needs to change in jPaq to better approximate the wildcard expressions that are available in Microsoft Word.

From Wildcards To Regular Expressions

There are a lot of people out there who know how to do file searches and Microsoft Word searches by using wildcard characters, but not as many people know how to work with regular expressions. That is the main reason why I added the RegExp.fromWildExp() function to jPaq. In addition, I have created an example page which dynamically generates the regular expression from the wildcard expression that you specify. Check this jPaq example out by clicking here. The initial example shows you how to write a regular expression that matches all text that starts at the beginning of a word with a capital “J” and ends at the end of a word with a lowercase “t”. NOTE: Unlike regular expressions, which are greedy by default, wildcard expressions look to find the smallest match possible.
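
The note about the smallest match can be illustrated with plain JavaScript regular expressions. The sketch below does not call RegExp.fromWildExp(). Both patterns are hand-written approximations of the "capital J ... lowercase t" description, and they are an assumption about the general shape of such an expression rather than what jPaq actually generates. The lazy quantifier *? gives the smallest-match behavior of a wildcard expression, while the plain * shows the default greedy behavior of regular expressions.

```javascript
// Hand-written approximations (an assumption, not jPaq's real output):
// start of a word, capital J, any characters, lowercase t, end of a word.
const lazy = /(?=\b\w)J.*?t(?=\b\W|\b$)/;   // smallest possible match
const greedy = /(?=\b\w)J.*t(?=\b\W|\b$)/;  // largest possible match

const text = "Just a test of JavaScript";
console.log(text.match(lazy)[0]);   // Just
console.log(text.match(greedy)[0]); // Just a test of JavaScript
```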