Joomla! Programmers Documentation
Routing
Routing is the process of mapping a URL to the right code in a system (and the other way around). Without SEF URLs (see below), Joomla simply reads the query elements of a URL and calls the code based on them. Routing is split into two main parts, application routing and component routing, where the latter happens inside the former.
Application routing
All routing starts with the Joomla\CMS\Router\Router class, which provides a parse() and a build() method. An application can provide specific behavior by extending that class and attaching callbacks via Router::attachBuildRule() and Router::attachParseRule(). You can see an example in the CMS in the constructor of the Joomla\CMS\Router\SiteRouter class. You can also attach additional rules from plugins. Just make sure that you attach them before Joomla runs the parsing step, for example in the onAfterInitialise system event, as the CMS does in the languagefilter system plugin.
Each callback receives the router object as its first parameter and a Joomla\CMS\Uri\Uri object as its second. All parsing and building is done on the Uri object, which contains the URL currently being processed. When parsing or building a URL, you are expected to read a part of the object, transform it, store the result back in the object and delete what you have processed. This lets the router build the right URL and also detect when a URL could not be parsed properly: when parsing, the router is expected to convert the whole path part of the URL into query elements. If the path part of the Uri object still contains characters at the end of parsing, the router assumes that no router code understood that part of the URL and throws a 404 error. Likewise, when building a SEF URL from a query string, you remove what you have processed so that the resulting URL doesn't contain unnecessary query parameters.
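The "consume what you recognized" contract can be illustrated with a small, language-neutral model. The sketch below is written in Python and is not Joomla's implementation: the Uri and Router classes, the RouteNotFound exception and both rules (parse_language, parse_blog) are hypothetical stand-ins that only mirror the behavior described above.

```python
from dataclasses import dataclass, field


class RouteNotFound(Exception):
    """Stands in for the 404 raised when part of the path is left over."""


@dataclass
class Uri:
    # Simplified stand-in for a URI object: a path plus query elements.
    path: str
    query: dict = field(default_factory=dict)


def parse_language(router, uri):
    # Hypothetical rule: consume a leading language segment such as "en".
    segments = uri.path.strip("/").split("/")
    if segments[0] in ("en", "de"):
        uri.query["lang"] = segments[0]
        uri.path = "/".join(segments[1:])  # delete what was processed


def parse_blog(router, uri):
    # Hypothetical rule: consume the single segment "blog".
    if uri.path.strip("/") == "blog":
        uri.query["option"] = "com_content"
        uri.path = ""


class Router:
    def __init__(self):
        self.parse_rules = []

    def attach_parse_rule(self, callback):
        self.parse_rules.append(callback)

    def parse(self, uri):
        # Each rule gets the router first and the Uri second.
        for rule in self.parse_rules:
            rule(self, uri)
        # Anything still in the path was understood by no rule.
        if uri.path.strip("/"):
            raise RouteNotFound(uri.path)
        return uri.query
```

With both rules attached, the path en/blog is fully consumed and becomes query elements, while en/unknown leaves the segment unknown behind and is rejected.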
Required query parameters and default values
Every URL in Joomla contains at least an &option= element, which defines the component to be called for that page. If &option= is not set when parsing a URL, it defaults to the component of the home menu item in the frontend or to com_cpanel in the backend. Likewise, every URL in the frontend contains an &Itemid= element, which defines the menu item to be used for that page and falls back to the default menu item if not set.
While components can do whatever they want, if you follow the best practice of MVC components in Joomla, every URL also contains a &task= element, which defines the controller to be used and the task of that controller, separated by a dot (.). An example would be &task=article.save, which calls the Article controller and runs the save() method in it. If no &task= is given, the default controller and the task display are used. In that case Joomla also reads the &view= element of the URL, which points to the respective view of the component.
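As a rough illustration of how the &task= element is interpreted, here is a minimal Python sketch. The function resolve_task is hypothetical and only models the rules stated above. Treating a task without a dot as a task on the default controller is an assumption made for this sketch.

```python
def resolve_task(query):
    """Split a task value into controller and method (illustrative only)."""
    task = query.get("task", "")
    if "." in task:
        # "article.save" -> controller "article", method "save"
        controller, method = task.split(".", 1)
    else:
        # Assumption: no dot means the default controller (None here);
        # an empty task falls back to "display".
        controller, method = None, task or "display"

    result = {"controller": controller, "method": method}
    if controller is None and method == "display":
        # Only the default display task looks at the view element.
        result["view"] = query.get("view")
    return result
```

For the query task=article.save this yields the controller article and the method save. For a query that only contains view=article it yields the default controller, the method display and the view article.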
Search engine friendly (SEF) URLs
Internally, Joomla is controlled by the query parameters of a URL, or in other terms, by an array of key-value pairs of configuration values. However, these are not very human-friendly. To achieve "speaking" URLs, the site router in Joomla has a SEF mode, in which these query elements are converted into speaking URLs. (Upon parsing, these URLs are converted back into query elements, that is, into this array of configuration values.)
This starts with discovering the right menu item. Joomla splits the path part of the URL into segments delimited by /, starting from the base URL of the site, and then tries to match these segments against the aliases of the menu items. If it finds a match, it compares the next segment with the aliases of that menu item's child items and repeats the process until it can't find a matching menu item anymore. The last matching menu item is the active menu item for this request and defines the component to be called. The router sets the &Itemid= parameter in the Uri object accordingly.
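The segment-by-segment menu lookup can be sketched as follows. This is a simplified Python model, not the actual site router code: the menu dictionary, its field names and the function find_menu_item are hypothetical, and the fallback to the default menu item is left out.

```python
# Hypothetical menu tree: id -> alias and parent id (0 = top level).
MENU = {
    42: {"alias": "lotr-wiki", "parent": 0},
    43: {"alias": "about", "parent": 42},
}


def find_menu_item(path, menu):
    """Return (active menu item id, unconsumed rest of the path)."""
    segments = [s for s in path.split("/") if s]
    parent, active, consumed = 0, None, 0

    for segment in segments:
        # Look only at the children of the last matched item.
        match = next(
            (item_id for item_id, item in menu.items()
             if item["parent"] == parent and item["alias"] == segment),
            None,
        )
        if match is None:
            break  # no deeper menu item, stop descending
        active, parent = match, match
        consumed += 1

    # The remaining segments are left for the component router.
    return active, "/".join(segments[consumed:])
```

For the path /lotr-wiki/about both segments match menu items, so item 43 becomes active and nothing is left over. For /lotr-wiki/cities-in-middleearth/minas-tirith only the first segment matches, so item 42 is active and the remaining two segments are handed to the component router.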
After this, the application instantiates the router of the component to be called, passes it the rest of the URL path and expects an array of key-value pairs back.
The other way around, when building a URL, the router asks the component router to preprocess the URL (where the component router can, for example, set the correct Itemid) and then to build the path of the component, which is then prefixed with the path of the menu item.
Example of a SEF URL
We want to transform the following URL into a SEF URL: index.php?option=com_content&view=article&id=23:minas-tirith&catid=66&Itemid=42
The router first looks up the menu item 42, which results in the path /lotr-wiki/. The component router then looks up the path of the category 66, arriving at /cities-in-middleearth, and finally appends the alias of the article 23. The router might also add a suffix or do additional transformations, but the final URL might look like this: /lotr-wiki/cities-in-middleearth/minas-tirith
Example |
---|
index.php?<span style={{color:'blue'}}>option=com_content&view=article</span>&<span style={{color:'green'}}>id=23:minas-tirith</span>&<span style={{color:'red'}}>catid=66</span>&<span style={{color:'yellow'}}>Itemid=42</span> |
/<span style={{color:'yellow'}}>lotr-wiki</span>/<span style={{color:'red'}}>cities-in-middleearth</span>/<span style={{color:'green'}}>minas-tirith</span> |
<span style={{color:'yellow'}}>Menu-part</span>, <span style={{color:'red'}}>Category-part (component)</span>, <span style={{color:'green'}}>Article alias (component)</span> |
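The build direction of this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of the content component router: the function build_sef and the two lookup tables are hypothetical, and real routers handle many more cases (suffixes, missing categories, other views).

```python
from urllib.parse import urlencode


def build_sef(query, menu_paths, category_paths):
    """Turn query elements into a SEF path, consuming what was used."""
    query = dict(query)  # work on a copy

    # Menu part: looked up from the Itemid.
    path = [menu_paths[int(query.pop("Itemid"))]]
    # Component part: category path, then the article alias.
    path.append(category_paths[int(query.pop("catid"))])
    _article_id, _, alias = query.pop("id").partition(":")
    path.append(alias)
    # option and view are implied by the menu item in this sketch.
    query.pop("option")
    query.pop("view")

    url = "/" + "/".join(path)
    if query:
        # Whatever was not processed stays as a query string.
        url += "?" + urlencode(query)
    return url
```

Called with the query elements from the example, a menu table {42: "lotr-wiki"} and a category table {66: "cities-in-middleearth"}, this returns /lotr-wiki/cities-in-middleearth/minas-tirith. An extra element that no step consumes, such as page=2, is kept as a query parameter.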
Parsing URLs and error handling in the frontend
An important part of parsing a URL is deciding when a URL is not correct. The Joomla router does not know which code might recognize a given part of a URL, and thus cannot tell by itself whether a part has been properly parsed. It therefore requires all code that recognizes and parses a part of a SEF URL (that is, the path part of the URL) to remove whatever it was able to identify. As a result, once all parsing code has run, the Uri object should contain only the array of key-value pairs in the query part of the URL, and the path should be completely empty. If the path is NOT empty, the URL contains parts that were not recognized and should not be accepted. Since Joomla 4.0 the router throws an exception with a 404 in that situation. If the router accepted URLs with unrecognized parts, multiple URLs would point to the same content.
Since Joomla 5.3 the router also allows for "softer" error handling. There are situations where the router can correctly parse a URL but recognizes that the URL does not fit the expected format. This could be the case when URLs should contain a suffix (.html) and the URL lacks it, or when the URL still contains the ID of the article as part of the segment (/23-minas-tirith from the above example) but the alias is not correct (for example /23-minas-morgul instead). In both cases the router can parse the URL, because it defaults to the suffix .html and loads the article with the ID 23, but the URL is not the correct one. The router can then set the tainted flag for the current parse process by calling \Joomla\CMS\Router\Router::setTainted(). Code that runs afterwards can check that flag by calling \Joomla\CMS\Router\Router::isTainted() and decide what to do. In a default installation the SEF system plugin checks that flag and, if it is set, builds a URL from the recognized parameters and redirects to that new URL with a 301 redirect.
The redirect is specifically NOT done directly in the router, because otherwise you could not parse another URL while running Joomla. It also allows more than one fix at a time. If, for example, the correct URL is https://domain.com/lotr-wiki/cities-in-middleearth/minas-tirith.html and the calling URL is http://www.domain.com/lotr-wiki/cities-in-middleearth/23-minas-morgul, we don't want one redirect to https, directly followed by another to the domain without www, then yet another to the article segment without the ID and with the right alias, and finally a last redirect to the URL with the suffix. Instead, each of these issues marks the URL as tainted and a single redirect is done at the end.
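The separation between "mark as tainted while parsing" and "redirect once afterwards" can be modeled like this. The Python sketch below is not Joomla code: the Router class only imitates the setTainted()/isTainted() pair mentioned above, and the article table, the parse_article_segment method and the handle function (standing in for what a plugin might do after parsing) are hypothetical.

```python
ARTICLES = {23: "minas-tirith"}  # hypothetical article id -> alias
BASE = "/lotr-wiki/cities-in-middleearth/"


class Router:
    def __init__(self):
        self._tainted = False

    def set_tainted(self):
        self._tainted = True

    def is_tainted(self):
        return self._tainted

    def parse_article_segment(self, segment):
        """Return the article id; flag URLs that are parseable but off."""
        if segment.endswith(".html"):
            segment = segment[: -len(".html")]
        else:
            self.set_tainted()  # suffix missing, parsed anyway

        prefix, dash, _alias = segment.partition("-")
        if dash and prefix.isdigit() and int(prefix) in ARTICLES:
            self.set_tainted()  # ID-style segment, alias not trusted
            return int(prefix)

        for article_id, alias in ARTICLES.items():
            if alias == segment:
                return article_id
        raise LookupError(segment)


def handle(segment):
    # Parsing never redirects; the caller decides once, at the end.
    router = Router()
    article_id = router.parse_article_segment(segment)
    if router.is_tainted():
        return 301, BASE + ARTICLES[article_id] + ".html"
    return 200, article_id
```

The segment 23-minas-morgul has two problems (no suffix, ID with a wrong alias), yet handle produces a single 301 to the canonical URL. The segment minas-tirith.html is parsed without setting the flag and is served directly.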
You may find it helpful to view 2 videos, covering parsing a URL and building a URL.