PostgreSQL – Convert A String Into A Table

Recently at work we needed to convert a string parameter (passed into an iReport) into a table. For many people, the first thing that comes to mind is the STRING_TO_ARRAY(...) function combined with the UNNEST(...) function. The biggest issue with using just those two functions is that the characters chosen as cell and row delimiters can then never appear in the data itself. For this reason, I created the following PL/pgSQL function, which uses commas as cell delimiters and semicolons as row delimiters, and which decodes the escaped forms %2C (comma), %3B (semicolon), and %25 (percent sign) inside each cell:

CREATE OR REPLACE FUNCTION public.uri_decode_2d_array(IN input TEXT)
  RETURNS SETOF TEXT[]
  LANGUAGE plpgsql
  STABLE
AS $function$
/*******************************************************************************
 * Function Name: uri_decode_2d_array
 * In-coming Params:
 *   input [TEXT] - The string to be decoded and converted into a set of text arrays.
 * Returns:
 *      SETOF TEXT[]
 * Description:
 *   Takes in a string and converts it to a set of arrays.
 * Created On: 2012-06-14
 * Updated On: 2012-06-14
 * Author: Chris West
 ******************************************************************************/
BEGIN
  RETURN QUERY
    SELECT array_agg(t)
    FROM (
      SELECT
        REPLACE(REPLACE(REPLACE(UNNEST(a), '%2C', ','), '%3B', ';'), '%25', '%') AS t,
        ROW_NUMBER() OVER (ORDER BY 1) AS r
      FROM (
        SELECT STRING_TO_ARRAY(UNNEST(STRING_TO_ARRAY(input, ';')), ',') a
      ) t
    ) t
    GROUP BY r
    ORDER BY r;
END;
$function$;

Now the question is, how do you convert your 2D array into a string that will be interpreted by the above SQL function? The following PHP function can do just that:

function uriEncode2DArray($inputArray) {
  $ret = "";
  foreach($inputArray as $kOuter => $vOuter) {
    if($kOuter) {
      $ret .= ";";
    }
    foreach($vOuter as $kInner => $vInner) {
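      // Escape "%" first, then the row and cell delimiters.
      $vInner = str_replace(array("%", ";", ","), array("%25", "%3B", "%2C"), $vInner);
      $ret .= ($kInner ? "," : "") . $vInner;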
    }
  }
  return $ret;
}

The following is an example of converting a 2D array into a string using the above function:

$arr = array(
  array("Jen Harring", "001238192", "January 1, 1987"),
  array("Tim Alekper", "902340340", "June 30, 1987", "Mister 99%; AKA Vampire")
);
echo uriEncode2DArray($arr);
# The above outputs the following:
#   Jen Harring,001238192,January 1%2C 1987;Tim Alekper,902340340,June 30%2C 1987,Mister 99%25%3B AKA Vampire
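
To check that the encoding and decoding rules really are inverses of each other, here is a minimal Python sketch of both directions. It is an illustration only, not the code used in the report: the function names simply mirror the PHP and SQL ones, and it assumes every cell is already a string.

```python
def uri_encode_2d_array(rows):
    """Join cells with ',' and rows with ';', escaping those characters in the data."""
    def encode_cell(cell):
        # '%' must be escaped first so the '%' in '%3B' and '%2C' is not re-escaped.
        return cell.replace("%", "%25").replace(";", "%3B").replace(",", "%2C")
    return ";".join(",".join(encode_cell(c) for c in row) for row in rows)


def uri_decode_2d_array(text):
    """Split into rows and cells first, then unescape each cell."""
    def decode_cell(cell):
        # '%25' is unescaped last, mirroring the nested REPLACE calls in the SQL.
        return cell.replace("%2C", ",").replace("%3B", ";").replace("%25", "%")
    return [[decode_cell(c) for c in row.split(",")] for row in text.split(";")]


rows = [
    ["Jen Harring", "001238192", "January 1, 1987"],
    ["Tim Alekper", "902340340", "June 30, 1987", "Mister 99%; AKA Vampire"],
]
encoded = uri_encode_2d_array(rows)
print(encoded)
print(uri_decode_2d_array(encoded) == rows)
```

The order of the replacements matters in both directions: escaping the percent sign first and unescaping it last is what lets a cell that literally contains "%2C" survive the round trip.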

The following is an example of how you would use the generated string to generate a table:

SELECT a[1]::VARCHAR AS name,
  REGEXP_REPLACE(a[2]::VARCHAR, '(...)(..)(....)', E'\\1-\\2-\\3') AS ssn,
  a[3]::DATE AS date_of_birth,
  age(a[3]::TIMESTAMP) AS age,
  a[4]::TEXT AS comments
FROM uri_decode_2d_array('Jen Harring,001238192,January 1%2C 1987;Tim Alekper,902340340,June 30%2C 1987,Mister 99%25%3B AKA Vampire') AS a;

The following is the generated table:

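name        | ssn         | date_of_birth | age                      | comments
------------+-------------+---------------+--------------------------+------------------------
Jen Harring | 001-23-8192 | 1987-01-01    | 25 years 5 mons 13 days  | (null)
Tim Alekper | 902-34-0340 | 1987-06-30    | 24 years 11 mons 14 days | Mister 99%; AKA Vampire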

Regular Expression Examples

One of the things that I love about string manipulation is the existence of regular expressions. For this reason, I have decided to share a few examples that may help those who are learning about regular expressions understand them a bit better.


JavaScript – General Variable Name

In many languages, a variable must start off with a letter and may be followed by letters, numbers, and/or the underscore character.  Knowing this, we could use the following regular expression in JavaScript to match a variable name:

var expVarIFlag = /^[A-Z]\w*$/i;

Basically, the above regular expression will only match a string if it matches the pattern for a variable name. The reason that I only put [A-Z] and not [A-Za-z] is that in JavaScript you can specify an “i” flag after the regular expression, which makes the expression case-insensitive. Another thing to note is that I used the \w class, which represents a single word character. A word character in regular expressions typically means any letter (A to Z regardless of case), any digit (0 to 9), or the underscore character. The reason I used the asterisk instead of the plus sign is that a variable name may be just one letter.

NOTE: Although this regular expression may work for other languages, in JavaScript, a variable name can also start off with an underscore or dollar sign.
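
As a rough illustration of the note above, the same pattern can be sketched in Python next to a variant that also accepts the underscore and dollar sign. The name js_style_name and its pattern are my own assumption for illustration; it covers ASCII names only and does not exclude reserved words.

```python
import re

# Letter first, then any number of word characters (letters, digits, underscore).
# re.ASCII keeps \w limited to A-Z, a-z, 0-9 and '_', as in JavaScript.
general_name = re.compile(r"^[A-Z]\w*$", re.IGNORECASE | re.ASCII)

# Hypothetical JavaScript-flavoured variant: '_' and '$' are allowed anywhere.
js_style_name = re.compile(r"^[A-Z_$][\w$]*$", re.IGNORECASE | re.ASCII)

for name in ["x", "total_2", "_private", "$el", "2fast"]:
    print(name, bool(general_name.match(name)), bool(js_style_name.match(name)))
```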


PostgreSQL – Date (MM/DD/YYYY)

A regular expression on its own shouldn’t be used to completely validate a date, but it can check that a value has the right general shape. You can do so with the following in PostgreSQL:

SELECT id, text
FROM answers
WHERE text ~ E'^(0\\d|1[012])/([012]\\d|3[01])/\\d{4}$';

The above query will pull all of the answers whose text matches the pattern, that is, whose text looks like a valid date.

  1. First it specifies that the first two characters are a 0 followed by another digit or a 1 followed by either a 0, 1, or 2.
  2. Next should come a forward slash.
  3. Next should be either…
    1.  0, 1, or 2 followed by any digit
    2. or 3 followed by a 0 or 1
  4. Finally should be another forward slash followed by four digits.

One thing to notice is that the pattern is written as an escape string (the E prefix in front of the opening quote). Inside such a string you have to escape the backslash so that it will be interpreted as one backslash in front of the next character, thus rendering “\\d” as “\d”. In a plain string without the E prefix, PostgreSQL 9.1 and later keep backslashes as they are by default (standard_conforming_strings is on), so there you would write a single backslash instead.
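
To show why the pattern only partially validates a date, here is a small Python sketch that applies the same regular expression and then asks the standard datetime module whether the value is a real calendar date. The function names are made up for this example.

```python
import re
from datetime import datetime

# Same pattern as the PostgreSQL query, written as a Python raw string.
DATE_SHAPE = re.compile(r"^(0\d|1[012])/([012]\d|3[01])/\d{4}$")


def looks_like_date(text):
    """True when the text has the MM/DD/YYYY shape accepted by the pattern."""
    return DATE_SHAPE.match(text) is not None


def is_real_date(text):
    """True only when the text has the right shape and is a calendar date."""
    if not looks_like_date(text):
        return False
    try:
        datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        return False
    return True


for sample in ["06/14/2012", "02/30/2012", "00/00/2012", "6/14/2012"]:
    print(sample, looks_like_date(sample), is_real_date(sample))
```

Values such as 02/30/2012 and 00/00/2012 pass the regular expression but are rejected by the calendar check, which is the gap the pattern leaves open.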


PHP – Hexadecimal Color Code

In CSS, a color code can take many different forms. One accepted form is hexadecimal, written as a number sign followed by three or six hexadecimal digits. When the value comes from user input, though, it is convenient to treat the number sign as optional. Knowing all of this, we could use the following in PHP to validate the hex color:

$pattern = '/^#?([0-9A-F]{3}){1,2}$/i';
$validHex = preg_match($pattern, $_GET['hex']);

The preg_match() function is used to validate the GET parameter called “hex” against our regular expression:

  1. First it specifies that the first character may be a number sign (#).
  2. Next I have defined a parenthesized group which matches any three hexadecimal digits.
  3. After that, I am specifying that my parenthesized group pattern may appear once or two times in a row and that no other characters should follow.
  4. Finally, you will notice that I am again using the “i” flag to indicate that this is a case-insensitive pattern.
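
The steps above can also be sketched in Python. The normalize_hex helper is a hypothetical addition of mine, not part of the PHP example: it uses the same pattern and then expands the three-digit shorthand into the six-digit form.

```python
import re

# Same pattern as the PHP example; fullmatch() must consume the whole string.
HEX_COLOR = re.compile(r"^#?([0-9A-F]{3}){1,2}$", re.IGNORECASE)


def normalize_hex(value):
    """Return the color as '#rrggbb', or None if it fails the pattern."""
    if not HEX_COLOR.fullmatch(value):
        return None
    digits = value.lstrip("#").lower()
    if len(digits) == 3:
        # Three-digit shorthand: each digit is doubled, so 'f80' means 'ff8800'.
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits


for sample in ["#F80", "a1b2c3", "#ffff", "#ggg"]:
    print(sample, normalize_hex(sample))
```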

Python – Simple Image File Names

Let’s use Python now to check to see if a file name looks like a valid image name:

# Import the regular expression library
import re

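# Defining the compiled regular expression. A raw string keeps the backslashes
# intact, so "\\" inside the character class really excludes a backslash.
pat = r'^[^/\\?%*:|"<>]+\.(jpg|png|gif|bmp)$'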
reImg = re.compile(pat, re.I)

# Getting the file name from the user (on Python 2, use raw_input() instead)
fileName = input("File name:  ")

# Determine if the file name is an image name
isImage = reImg.match(fileName) is not None

The regular expression created does the following:

  1. First makes sure that the string starts off with one or more characters which are none of the following:  /  \  ?  %  *  :  |  "  <  >
  2. In the end it checks that a dot is found followed by one of the following extensions which must appear at the end of the string:
    1. jpg
    2. png
    3. gif
    4. bmp
  3. It is also important to note that by using “re.I“, I specified that casing would be ignored.

The code should basically prompt the user for a file name and then validate the string entered to determine if it matches the regular expression for an image.  The boolean value indicating whether or not it is an image is stored in the isImage variable.
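
For a quick check without typing names at a prompt, the same idea can be wrapped in a small function and run against a few sample names. This is only a sketch: the function name and the sample file names are invented for illustration.

```python
import re

# A raw string, so "\\" in the character class stands for one literal backslash.
IMAGE_NAME = re.compile(r'^[^/\\?%*:|"<>]+\.(jpg|png|gif|bmp)$', re.IGNORECASE)


def is_image_name(file_name):
    """True when the name has no forbidden characters and an image extension."""
    return IMAGE_NAME.match(file_name) is not None


samples = ["photo.JPG", "my.holiday.png", "dir\\pic.gif", "a/b.png", "notes.txt", ".gif"]
for name in samples:
    print(repr(name), is_image_name(name))
```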


VBScript – Format Large Integer With Commas

The following is how you could use a regular expression to insert commas into a number (integer):

' Setup the RegExp for testing if input is an integer.
Dim re : Set re = new RegExp
re.Pattern = "^(0|-?[1-9]\d*)$"

' Get the input integer from the user.
input = InputBox("Enter an integer", "Your Integer", 123456789)

' If the input is an integer...
If re.Test(input) Then
  ' Modify the pattern to insert the commas correctly.
  re.Pattern = "(\d)(?=(\d{3})+$)"
  re.Global = True

  ' Reformat the integer, if given.
  newInput = re.Replace(input, "$1,")

  ' Display the input formatted with commas.
  MsgBox input & " became " & newInput

' If the input is not an integer, tell the user so.
Else
  MsgBox "The input given wasn't recognized as an integer."
End If

The first regular expression tests that the input is either simply a zero or an optional minus sign followed by one or more digits with the first one being non-zero. In other words, the first pattern makes sure that the input is an integer that doesn’t start with a zero (unless it is zero). The second regular expression is what is used to insert the comma(s) in the right place(s). It finds every digit that is followed by one or more complete groups of three digits running to the end of the string. Because the group starts off with “?=”, it is a lookahead: the digits inside it are checked but not consumed, so they are still available to be matched on the next pass through.
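
The same two patterns behave the same way in Python, which makes the lookahead easy to experiment with. The following is a minimal sketch, and the add_commas name is my own.

```python
import re

# Zero, or an optional minus sign followed by digits that do not start with 0.
INTEGER = re.compile(r"^(0|-?[1-9]\d*)$", re.ASCII)

# A digit followed (lookahead only) by complete groups of three digits to the end.
GROUPING = re.compile(r"(\d)(?=(\d{3})+$)", re.ASCII)


def add_commas(text):
    """Insert thousands separators into a string holding an integer."""
    if not INTEGER.fullmatch(text):
        raise ValueError("The input given wasn't recognized as an integer.")
    # The lookahead consumes nothing, so each group of three can be reused
    # as context when the next digit to its right is examined.
    return GROUPING.sub(r"\1,", text)


for sample in ["123456789", "-1234", "999", "0"]:
    print(sample, "became", add_commas(sample))
```

When the value is already a number rather than a string, Python's built-in format(123456789, ",") produces the same grouping without a regular expression.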