
Basic REGEX for cleaning and prepping data

Over the last year or so, I have been using the PHP framework CodeIgniter. Among its many good features is its built-in Form Validation library.

Every now and then I either build a small site with CodeIgniter or help others write or develop sections of theirs. I regularly get MSN/IM messages asking for help, which I am happy to give.

More recently I have started compiling the little helper functions I have written into my own ‘Form Validation’ class, which I hope to post here soon.

Today, though, I have been doing a lot of CodeIgniter validation, with some extra ‘callbacks’ using preg_match() and preg_replace() combos.

Note: This is not a regex tutorial, just some basic, common examples.

Basic Examples

Strip non-numeric characters from a string.

<?php

$string = 'sheldon is not james bond 007';
$numbers = preg_replace('/[^0-9]+/', '', $string);
// returns '007';
?>
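A common use for this pattern is tidying up phone numbers or IDs before storing them. Here is a minimal sketch that wraps it in a function; the name digits_only() is made up for this example, and the type hints and ?? operator assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of PHP or CodeIgniter): keep only the digits.
function digits_only(string $input): string
{
    // [^0-9]+ matches every run of characters that is not a digit.
    // preg_replace() returns null on a regex error, so fall back to ''.
    return preg_replace('/[^0-9]+/', '', $input) ?? '';
}

echo digits_only('(09) 555-0123 ext. 4'); // 0955501234
```

Note that this also throws away a leading + or any other formatting, so it only suits fields where the digits are all you care about.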

Strip all non-alphanumeric characters from a string (spaces are kept).

<?php
$string = 'sheldon, Is james bond (007)';
$clean = preg_replace('/[^a-z0-9 ]+/i', '', $string); // the i means case insensitive.
// returns 'sheldon Is james bond 007';
?>

Create a safe URL string.

Want to make URL-safe permalinks or strings?
Let's strip out anything that'll break our address.

<?php
// assume $_POST['name'] = "sheldon's wonderful *product* costs $19.95";
$name = str_replace(' ', '-', trim($_POST['name']));
// remove anything that is not a letter, digit, underscore or hyphen
$url = preg_replace('/[^a-z0-9_-]+/i', '', $name);

header("Location: http://sheldon.lendrum.co.nz/". $url);
//  http://sheldon.lendrum.co.nz/sheldons-wonderful-product-costs-1995
?>
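If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: make_slug() is a made-up name, and lower-casing the text and collapsing repeated hyphens are my own assumptions about what a tidy permalink should look like.

```php
<?php
// Hypothetical helper, not part of PHP or CodeIgniter.
function make_slug(string $text): string
{
    // Lower-case and trim, then turn every run of whitespace into one hyphen.
    $slug = preg_replace('/\s+/', '-', strtolower(trim($text)));
    // Drop everything that is not a-z, 0-9, underscore or hyphen.
    $slug = preg_replace('/[^a-z0-9_-]+/', '', $slug);
    // Collapse repeated hyphens and trim them from both ends.
    return trim(preg_replace('/-+/', '-', $slug), '-');
}

echo make_slug('sheldon\'s wonderful *product* costs $19.95');
// sheldons-wonderful-product-costs-1995
```

The full stop in the price is removed along with the other punctuation; add it to the character class if you would rather keep it.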

preg_replace() & preg_match()

So far I have used preg_replace() to strip out unwanted characters, but if you just want to validate the input without altering the user's data, you can use preg_match().
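To make the difference concrete, here is a short sketch that runs both functions over the same input. The variable names are just for illustration.

```php
<?php
$input = 'james bond (007)';

// preg_replace() hands back a modified copy; $input itself is unchanged.
$cleaned = preg_replace('/[^a-z0-9 ]+/i', '', $input);

// preg_match() returns 1 for a match, 0 for no match and false on error,
// so compare against 1 to get a clean boolean.
$isValid = preg_match('/^[a-z0-9 ]+$/i', $input) === 1;

var_dump($cleaned); // string(14) "james bond 007"
var_dump($isValid); // bool(false)
```

The first call quietly repairs the data, the second only tells you whether it was acceptable, which is usually what you want when the user should be asked to correct their own input.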

Here is an example that makes sure a user has submitted only the characters a through z and 0 through 9, plus spaces.

<?php
// assume $_POST['name'] = "sheldon's wonderful *product* costs $19.95";
if(preg_match('/^([a-z0-9\ ])+$/i',  $_POST['name'])) {
	// valid data ( spaces are allowed. )
	// case insensitive
	echo('Validation Passed');
} else {
	// this was invalid data
	echo('Please enter a valid string.');
}
?>

In this example, our string would fail, because it contains an apostrophe, asterisks, a dollar sign and a full stop.

We allowed spaces in our test. To run the same match without allowing spaces, we can remove the ‘\ ’ from the regex.

Here is an example that makes sure a user has submitted only the characters a through z and 0 through 9.

<?php
// assume $_POST['name'] = "sheldon";
if(preg_match('/^([a-z0-9])+$/i',  $_POST['name'])) {
	// valid data ( spaces are  NOT allowed. )
	// case insensitive
	echo('Validation Passed');
} else {
	// this was invalid data
	echo('Please enter a valid string.');
}
?>

In this example, our post data would pass the validation.
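Since the two checks differ only by the space in the character class, they can be folded into one function with a switch. A rough sketch follows; is_alnum() and its $allowSpaces parameter are names invented for this example, and the D modifier is an extra I have added so that a trailing newline does not slip through.

```php
<?php
// Hypothetical helper: true when $value holds only letters and digits,
// optionally allowing spaces as well.
function is_alnum(string $value, bool $allowSpaces = false): bool
{
    $class = $allowSpaces ? 'a-z0-9 ' : 'a-z0-9';
    // i = case insensitive
    // D = "$" matches only at the very end of the string
    //     (without D it also matches just before a trailing newline)
    return preg_match('/^[' . $class . ']+$/iD', $value) === 1;
}

var_dump(is_alnum('sheldon'));                   // bool(true)
var_dump(is_alnum('james bond 007'));            // bool(false)
var_dump(is_alnum('james bond 007', true));      // bool(true)
var_dump(is_alnum("sheldon's *product*", true)); // bool(false)
```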