Using sscanf

at 2008-01-26 in Examples by friebe (0 comments)

To parse strings, you've probably used several of the many string functions bundled with PHP: strtok, explode, strstr, regular expressions, and maybe even the more "exotic" C-like strcspn or strcasecmp. One candidate that is often overlooked is sscanf.

Here are some examples.

Parsing an HTTP request
Given the input GET / HTTP/1.1, we would like to parse the verb, the path and the minor version. Using a regular expression, it would probably look like this:

<?php 
$r= preg_match('#([A-Z]+) ([^ ]+) HTTP/1\.([0-9])#', 'GET / HTTP/1.1', $matches);
$success= (1 == $r);
// verb = $matches[1]
// path = $matches[2]
// minor = $matches[3]
?>


We can rewrite the above as follows:
<?php 
$r= sscanf('GET / HTTP/1.1', '%[A-Z] %s HTTP/1.%d', $verb, $path, $minor);
$success= (3 == $r);
?>

The benefit here is that $minor is set to an int, whereas everything preg_match puts into its matches array is a string. Also, the sscanf version is about twice as fast :-)
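
As a minimal sketch, the sscanf call above can be wrapped in a small helper that only reports success when all three values were assigned. The function name parse_request_line is hypothetical; it is not part of PHP or the XP framework:

```php
<?php
// Hypothetical helper (not part of PHP): parses an HTTP/1.x request line.
// Returns array(verb, path, minor) on success, NULL otherwise.
function parse_request_line($line) {
  $verb= $path= $minor= NULL;
  $r= sscanf($line, '%[A-Z] %s HTTP/1.%d', $verb, $path, $minor);

  // sscanf() returns the number of assigned values; all three are required
  if (3 !== $r) return NULL;
  return array($verb, $path, $minor);
}

var_dump(parse_request_line('GET / HTTP/1.1'));   // the minor version is an int
var_dump(parse_request_line('not a request'));    // NULL
```

Comparing the return value against the expected count is what distinguishes a complete match from a partial one.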

Parsing hexadecimal colors
To split an HTML color notation (#RRGGBB, where RR, GG and BB are hexadecimal numbers) into its components, we could use the following:
<?php 
$h= hexdec('FA0D3E');
$r= $h >> 0x10 & 0xFF;
$g= $h >> 0x08 & 0xFF;
$b= $h & 0xFF;
?>

...or rewrite this using sscanf like this:
<?php 
sscanf('FA0D3E', '%2x%2x%2x', $r, $g, $b); // $r= 250, $g= 13, $b= 62
?>
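
Since the notation usually includes the leading "#", it can be matched literally in the format string. The following is a sketch only; hex_to_rgb is a hypothetical name, not a PHP function:

```php
<?php
// Hypothetical helper (not part of PHP): converts "#RRGGBB" to array(r, g, b).
// Returns NULL if the input does not yield three hexadecimal components.
function hex_to_rgb($color) {
  $r= $g= $b= NULL;

  // The leading "#" in the format is matched literally
  $n= sscanf($color, '#%2x%2x%2x', $r, $g, $b);
  return 3 === $n ? array($r, $g, $b) : NULL;
}

print_r(hex_to_rgb('#FA0D3E'));   // 250, 13, 62
```

Note that this does not check the overall length of the input, so a stricter validation would need an additional strlen() check.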


Optional tail values
Unmatched tokens in sscanf will be assigned a NULL value. Given an input of classname::method, this can be used as follows:
<?php 
sscanf('news::list', '%[^:]::%s', $facade, $method); // facade= 'news', method= 'list'
sscanf('news', '%[^:]::%s', $facade, $method); // facade= 'news', method= NULL
?>

To accomplish the same using regular expressions, the following code is needed:
<?php 
preg_match('/([^:]+)(::(.+))?/', $input, $matches);
$facade= $matches[1];
$method= isset($matches[3]) ? $matches[3] : NULL;
?>
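
Coming back to the sscanf variant: a sketch of how it might be packaged, assuming a hypothetical helper named split_target. The variables are initialized to NULL first, so the result does not depend on values they may have held before the call:

```php
<?php
// Hypothetical helper: splits "facade::method" where "::method" is optional.
function split_target($input) {
  // Start from NULL so the result does not depend on earlier values
  $facade= $method= NULL;
  $assigned= sscanf($input, '%[^:]::%s', $facade, $method);

  // Fewer than one assigned value means not even the facade was found
  if ($assigned < 1) return NULL;
  return array($facade, $method);
}

var_dump(split_target('news::list'));   // facade and method
var_dump(split_target('news'));         // facade only, method is NULL
```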


Not assigning values
With regular expressions, a value is captured by enclosing part of the pattern in parentheses. Pattern parts outside parentheses are matched, but no value is assigned for them:
<?php 
preg_match('/.+([0-9]+)/', 'node1', $matches); // Here, .+ matches "node" but is not captured
$number= $matches[1];
?>

The same can be written as follows using sscanf:
<?php 
sscanf('node1', '%*[^0-9]%d', $number); // The asterisk suppresses assignment, $number= 1
?>
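
The asterisk works with other conversions as well. As a small illustration (the date string is just sample input), the month of an ISO date can be skipped like this:

```php
<?php
// Read year and day from an ISO date; %*d consumes the month without storing it
$year= $day= NULL;
sscanf('2008-01-26', '%d-%*d-%d', $year, $day);

var_dump($year, $day);   // two ints; the month was skipped
```

Only two variables are passed, because the suppressed conversion does not need one.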



What sscanf can't do
Of course, sscanf is not a full-featured replacement for regular expressions, and is not designed to be. For simple cases it proves to be a fast alternative; for look-ahead, optional head values, grouping, repetition, Unicode, callbacks and the other features they provide, regular expressions are the way to go.
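
To illustrate one of these cases, here is a sketch of an optional head value (method with an optional leading facade::), which a format like %[^:]::%s cannot express because it always expects the facade first. The helper name split_optional_head is hypothetical:

```php
<?php
// Hypothetical helper: splits "[facade::]method" where the facade is optional.
function split_optional_head($input) {
  if (1 !== preg_match('/^(?:([^:]+)::)?(.+)$/', $input, $matches)) return NULL;

  // An unmatched optional group before a matched one is reported as ''
  $facade= '' === $matches[1] ? NULL : $matches[1];
  return array($facade, $matches[2]);
}

var_dump(split_optional_head('news::list'));   // facade and method
var_dump(split_optional_head('list'));         // facade is NULL, method only
```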


