Performant URL routing — with PHP and regular expressions (regex)

A simple tutorial on performant URL routing with PHP and regex. Basic knowledge of PHP, .htaccess and regex is required.

Stephan Romhart
Level Up Coding

First, a brief explanation: what does a URL router do? A URL router evaluates the requested URL and interprets it according to defined rules. I use this router to load a specific controller based on the URL. An example:

http://www.website.com/user/edit/4

This URL should call the controller “user” with the action “edit” and the parameter “4”.

include 'ctrl/user-edit.ctrl.php';

Another example:

http://www.website.com/company/search/autocomplete/german

This url should call the controller “company” with the action “search-autocomplete” and the filter “german”.

include 'ctrl/company-search-autocomplete.ctrl.php';

In both examples there are additional parameters, such as the user ID or the filter. These must be passed on to the PHP script, but they are not part of the controller name that is derived from the URL.

Step 1: The htaccess file

For PHP to read the part after “http://www.website.com/”, you need an .htaccess file that gives the web server a rewrite instruction. In the example this is done with mod_rewrite, an Apache module that is available on most Apache servers. nginx does not read .htaccess files, so there the equivalent rule has to go into the server configuration.

# Enable mod_rewrite
RewriteEngine On

# Let requests for existing files and folders through unchanged
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Rewrite everything else to index.php and pass the requested path along
RewriteRule (.+) index.php?path=$0 [L,QSA]

The last line defines how the part after “http://www.website.com/” is delivered to the PHP script: as the GET parameter “path”.
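To make the effect of the rewrite rule concrete, here is a small sketch of what index.php ends up seeing. It assumes a request to /user/edit/4?tab=profile (the “tab” parameter is made up for illustration) and uses parse_str() to mimic how PHP fills $_GET from the rewritten query string. On a real server, Apache and PHP do this for you.

```php
<?php
// Query string as it would look after the RewriteRule has run for the
// hypothetical request /user/edit/4?tab=profile. The QSA flag is what
// appends the original "tab=profile" to the new "path=..." parameter.
$rewritten_query = 'path=user/edit/4&tab=profile';

// parse_str() fills an array the same way PHP fills $_GET.
parse_str($rewritten_query, $get);

echo $get['path'], "\n"; // user/edit/4
echo $get['tab'], "\n";  // profile
```

Without the QSA flag, the original query string would be dropped and only “path” would arrive in the script.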

Step 2: The Routes

For the routing to work, the script needs to know which routes exist. In the example I define three routes and put everything in a file “index.php”. For larger projects you can also put the routes in a config file.

The routes are stored in a nested array. The “path_pattern” value of each route is a regular expression that describes which URLs the route applies to. The “controller” value in this example is the path to a controller file.
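A minimal sketch of what such a routes array could look like, assuming one entry per route with the two keys named above. The keys “path_pattern” and “controller” and the two controller paths from the introduction come from this article; the exact patterns, the index route, its file name and the group name “filter” are assumptions made for illustration.

```php
<?php
// Sketch of a route table. Each route pairs a regex with a controller file.
// Named groups such as (?P<user_id>...) later become URL parameters.
$routes = array(
    array(
        'path_pattern' => '/^index$/',
        'controller'   => 'ctrl/index.ctrl.php', // assumed file name
    ),
    array(
        'path_pattern' => '/^user\/edit\/(?P<user_id>\d+)$/',
        'controller'   => 'ctrl/user-edit.ctrl.php',
    ),
    array(
        // "filter" is an assumed group name for the last URL segment
        'path_pattern' => '/^company\/search\/autocomplete\/(?P<filter>[a-z]+)$/',
        'controller'   => 'ctrl/company-search-autocomplete.ctrl.php',
    ),
);
```

The anchors “^” and “$” make sure a pattern has to match the whole path, so “user/edit/4/extra” would not be accepted by the second route.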

Step 3: The Router Function

I’ll now build up the router function step by step and explain what each block of code does.

The function

function router($routes)
{
}

I have named the function “router”. Its only parameter is the previously created array “routes”.

The basic variables

function router($routes)
{
    $route_match = false;
    $url_path = 'index';
    $url_params = array();
}

“route_match” is a boolean variable that starts out as “false”, because no route has matched yet. The variable “url_path” is initially set to the value “index”. To store any URL parameters, I initialize the array “url_params”.

The GET parameter path

function router($routes)
{
    $route_match = false;
    $url_path = 'index';
    $url_params = array();

    if(isset($_GET['path']))
    {
        $url_path = $_GET['path'];
        if(substr($url_path,-1) == '/')
        {
            $url_path = substr($url_path,0,-1);
        }
    }
}

Here I check whether the GET parameter “path” exists at all. If so, I overwrite “url_path” with its value.

The condition “if(substr($url_path,-1) == '/')” checks whether the last character of “url_path” is a “/” and, if so, removes it.
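The same logic can be tried out in isolation. The sketch below wraps it in a helper called normalize_path(), which is a hypothetical name and not part of the router. It takes the GET array as an argument instead of reading $_GET directly, so it runs without a web server.

```php
<?php
// Hypothetical helper that mirrors the path handling of router().
function normalize_path(array $get)
{
    // Default when no path was passed, e.g. for the start page.
    $url_path = 'index';

    if (isset($get['path']))
    {
        $url_path = $get['path'];
        // Remove exactly one trailing slash, as in the router.
        if (substr($url_path, -1) == '/')
        {
            $url_path = substr($url_path, 0, -1);
        }
    }

    return $url_path;
}

echo normalize_path(array()), "\n";                         // index
echo normalize_path(array('path' => 'user/edit/4/')), "\n"; // user/edit/4
echo normalize_path(array('path' => 'user/edit/4')), "\n";  // user/edit/4
```

Only a single trailing slash is removed this way. If paths with several trailing slashes should be treated the same, rtrim($url_path, '/') would be one alternative.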

Checking routes with regex

function router($routes)
{
    $route_match = false;
    $url_path = 'index';
    $url_params = array();

    if(isset($_GET['path']))
    {
        $url_path = $_GET['path'];
        if(substr($url_path,-1) == '/')
        {
            $url_path = substr($url_path,0,-1);
        }
    }

    foreach($routes as $route)
    {
        if(preg_match($route['path_pattern'],$url_path,$matches))
        {
            $url_params = array_merge($url_params,$matches);
            $route_match = true;
            break;
        }
    }
}

With the foreach loop I iterate over the array “routes”. The condition “if(preg_match(…))” checks whether the pattern of the current route matches “url_path” and, if it does, stores the matched groups in the variable “matches”. These are merged into “url_params”, which then contains the parameters that were part of the URL.

The variable “route_match” is then set to “true” and “break” ends the foreach loop, so the first matching route wins.
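To see what preg_match() actually puts into “matches”, here is a short sketch using the user-edit URL from the introduction. The pattern with the named group “user_id” is assumed to be the one used for that route. The final array_filter() step is an optional addition of mine, not part of the router.

```php
<?php
$pattern  = '/^user\/edit\/(?P<user_id>\d+)$/';
$url_path = 'user/edit/4';

$matches = array();
if (preg_match($pattern, $url_path, $matches))
{
    // $matches holds the full match under key 0 and the captured group
    // twice: under its name "user_id" and under its numeric index 1.
    echo $matches[0], "\n";         // user/edit/4
    echo $matches['user_id'], "\n"; // 4
    echo $matches[1], "\n";         // 4
}

// Optional: keep only the named entries and drop the numeric ones.
$named = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
echo count($named), "\n"; // 1
```

Note that captured values are always strings, so “user_id” holds the string "4". Cast it with (int) before using it as a number.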

Evaluation of the variables

function router($routes)
{
    $route_match = false;
    $url_path = 'index';
    $url_params = array();

    if(isset($_GET['path']))
    {
        $url_path = $_GET['path'];
        if(substr($url_path,-1) == '/')
        {
            $url_path = substr($url_path,0,-1);
        }
    }

    foreach($routes as $route)
    {
        if(preg_match($route['path_pattern'],$url_path,$matches))
        {
            $url_params = array_merge($url_params,$matches);
            $route_match = true;
            break;
        }
    }

    if(!$route_match)
    {
        // Escape the path before echoing it: it comes straight from the request.
        exit('URL path "'.htmlspecialchars($url_path).'" is not defined.');
    }

    if(file_exists($route['controller']))
    {
        include($route['controller']);
    }
    else
    {
        exit('Controller "'.$route['controller'].'" does not exist.');
    }
}

Then I check whether “route_match” is still “false” and, if so, exit with an error message saying that the route was not found. If a route was found, I check whether the controller file from the route definition exists and include it; otherwise the script exits with an error.
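Because router() includes files and calls exit, it is awkward to try out on its own. As a variation, the lookup can be separated from the include. The sketch below does that with a hypothetical helper match_route(), which returns the result instead of acting on it. The two patterns are assumptions based on the example URLs from the introduction.

```php
<?php
// Assumed route table with the two example routes from the introduction.
$routes = array(
    array(
        'path_pattern' => '/^user\/edit\/(?P<user_id>\d+)$/',
        'controller'   => 'ctrl/user-edit.ctrl.php',
    ),
    array(
        'path_pattern' => '/^company\/search\/autocomplete\/(?P<filter>[a-z]+)$/',
        'controller'   => 'ctrl/company-search-autocomplete.ctrl.php',
    ),
);

// Hypothetical helper: the same loop as in router(), but it returns the
// matched controller and parameters, or null if no route matches.
function match_route(array $routes, $url_path)
{
    foreach ($routes as $route)
    {
        if (preg_match($route['path_pattern'], $url_path, $matches))
        {
            return array(
                'controller' => $route['controller'],
                'url_params' => $matches,
            );
        }
    }

    return null;
}

$result = match_route($routes, 'user/edit/4');
echo $result['controller'], "\n";            // ctrl/user-edit.ctrl.php
echo $result['url_params']['user_id'], "\n"; // 4

var_dump(match_route($routes, 'user/delete/4')); // NULL
```

With the lookup isolated like this, the caller decides what happens for an unknown path, for example sending a 404 status instead of exiting with plain text.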

The complete script

The router can now be called with a simple “router($routes);”.

For the URL “http://www.website.com/user/edit/4” the controller “ctrl/user-edit.ctrl.php” is loaded, and the associative array “url_params” now contains the entry “user_id” => 4, which the included file can access.

The value for “user_id” comes from the named group in the pattern “/^user\/edit\/(?P<user_id>\d+)$/”.
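The included controller can read “url_params” because a file included inside a function runs in that function's variable scope. A minimal sketch of this, assuming a made-up controller body: it writes a stand-in for the controller to a temporary file so the example runs without any project files. The helper load_controller() is hypothetical.

```php
<?php
// Hypothetical stand-in for ctrl/user-edit.ctrl.php, written to a temporary
// file. Its body is an assumption; a real controller would do more.
$controller = tempnam(sys_get_temp_dir(), 'ctrl');
file_put_contents(
    $controller,
    '<?php return "Editing user " . (int) $url_params["user_id"];'
);

// Hypothetical helper that mirrors the include step of router().
function load_controller($controller, array $url_params)
{
    // The included file runs in this function's scope, so it sees
    // $url_params. include passes on the file's return value.
    return include $controller;
}

$output = load_controller($controller, array('user_id' => '4'));
echo $output, "\n"; // Editing user 4

// Clean up the temporary file.
unlink($controller);
```

The flip side of this scoping is that the controller also sees the router's other local variables, such as “route” and “matches”, so it is worth picking names that do not collide.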

In my example I have shown the basic idea. With this approach you can build a very flexible router: the array of routes can be extended with additional information, and how the information obtained from the URL is handled is freely configurable.

I have implemented several projects with this system and I am very satisfied with the performance. What do you think about the script and the approach? I am looking forward to your feedback!


All-rounder who works passionately without Apple products. Founder of kreisform Design Agency in Esslingen / Neckar. Designer, Musician, Coder, lost Poet.