# Arabic to Roman Numerals Conversion with PHP and Regex

During this week’s CodeKata, we were asked to write a tool to convert Arabic numbers (1, 5, 100, etc.) into Roman Numerals (I, V, C, etc.). For those unfamiliar with CodeKata, the point of the exercise is not to find a working solution, but to practice our approaches to problem-solving, pair-programming, and TDD.

As a bit of a running joke once the serious exercise was completed, I tried to solve the problem using regular expressions in PHP. This is written in a single method and is certainly not supposed to be an efficient or best-practice way to approach the problem in a production environment.

The very first step is to represent the Arabic number using just the numeral `I`, e.g. 1 becomes `I` and 18 becomes `IIIIIIIIIIIIIIIIII`.
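As a minimal sketch of this first step, the tally string can be built with `str_repeat`. The function name `toTally` is a hypothetical helper used only for illustration here; it is not part of the original class.

```php
<?php

// Hypothetical helper (not from the original class): represent a
// non-negative integer as a run of "I" characters.
function toTally(int $n): string
{
    return str_repeat('I', $n);
}

echo toTally(1), PHP_EOL;          // I
echo strlen(toTally(18)), PHP_EOL; // 18
```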

If, for a moment, we concern ourselves just with sorting out the numerals `I`, `V`, and `X`, there are two phases of regex replacement that need to be completed:

Phase 1:

• All occurrences of `IIIII` (5) need to be replaced with `V`;
• Then, all occurrences of `VV` (10) need to be replaced with `X`;

Phase 2:

• All remaining occurrences of `IIII` (4) need to be replaced with `IV`;
• Then, any instances of `VIV` (9) need to be replaced with `IX`;
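The two phases above can be sketched for the `I`, `V`, `X` group alone. The function name `smallRoman` is hypothetical, and because no larger numerals are handled, the result is only a valid numeral for 1 to 39.

```php
<?php

// Hypothetical helper: applies both phases for I, V and X only,
// so the output is only meaningful for 1 to 39.
function smallRoman(int $n): string
{
    $s = str_repeat('I', $n);

    // Phase 1: collapse runs into the next numeral up.
    $s = preg_replace('/I{5}/', 'V', $s);
    $s = preg_replace('/V{2}/', 'X', $s);

    // Phase 2: introduce the subtractive forms.
    $s = preg_replace('/I{4}/', 'IV', $s);
    $s = preg_replace('/VIV/', 'IX', $s);

    return $s;
}

foreach ([4, 9, 14, 18, 39] as $n) {
    echo $n, ' => ', smallRoman($n), PHP_EOL;
}
// 4 => IV, 9 => IX, 14 => XIV, 18 => XVIII, 39 => XXXIX
```

Note how 9 first becomes `VIIII` in Phase 1, then `VIV` after the first Phase 2 replacement, and only then `IX`.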

The pattern of replacements in the phases above is exactly the same if we multiply all the numbers by 10.

What works for 1 (`I`), 5 (`V`) and 10 (`X`) also works for 10 (`X`), 50 (`L`) and 100 (`C`), and again for 100 (`C`), 500 (`D`) and 1000 (`M`).

Phase 1 needs to be completed for each of these groups (`IVX`, `XLC`, `CDM`) first, then Phase 2 can be completed for each group afterwards, so that the substitutions are made in the correct order.
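To make that ordering concrete, here is a hand-unrolled sketch in which every substitution is written out in the order just described. The function name `unrolledRoman` and the `$steps` table are illustrative assumptions, not the code used in the exercise.

```php
<?php

// Hypothetical unrolled version: all twelve substitutions listed by hand,
// Phase 1 for every group first, then Phase 2 for every group.
function unrolledRoman(int $n): string
{
    $steps = [
        // Phase 1
        ['/I{5}/', 'V'], ['/V{2}/', 'X'],
        ['/X{5}/', 'L'], ['/L{2}/', 'C'],
        ['/C{5}/', 'D'], ['/D{2}/', 'M'],
        // Phase 2
        ['/I{4}/', 'IV'], ['/VIV/', 'IX'],
        ['/X{4}/', 'XL'], ['/LXL/', 'XC'],
        ['/C{4}/', 'CD'], ['/DCD/', 'CM'],
    ];

    $s = str_repeat('I', $n);
    foreach ($steps as [$pattern, $replacement]) {
        $s = preg_replace($pattern, $replacement, $s);
    }

    return $s;
}

echo unrolledRoman(1994), PHP_EOL; // MCMXCIV
```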

In the code below, the first `foreach` loop iterates through each of the phases above. The strings in the array each contain four space-separated tokens, representing two find-replace pairs. E.g. `/I{5}/ V` will be used to replace `IIIII` with `V`, and `/V{2}/ X` will be used to replace `VV` with `X`.
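A quick sketch of how one of these phase strings breaks down into tokens. The variable names `$phase` and `$tokens` are chosen here for illustration only.

```php
<?php

$phase = '/I{5}/ V /V{2}/ X';

// explode() splits on the spaces, giving four tokens in the order
// pattern, replacement, pattern, replacement.
$tokens = explode(' ', $phase);

print_r($tokens);
// Indexes 0 and 2 hold the patterns; indexes 1 and 3 hold the replacements.
```

This works only because neither the patterns nor the replacements contain a space themselves.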

The second `foreach` loop iterates through our groups of numerals (each group's values are ten times those of the previous group).

We use `strtr($p, 'IVX', $r)` to translate all the find-replace tokens in the phase to the correct multiple-of-ten group, as the patterns are identical. E.g. the phase `/I{5}/ V /V{2}/ X` will become `/X{5}/ L /L{2}/ C`.
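The translation can be tried in isolation. This sketch uses the three-argument form of `strtr`, which maps each character of the second argument to the character at the same position in the third. The variable names are illustrative.

```php
<?php

$phase1 = '/I{5}/ V /V{2}/ X';
$phase2 = '/I{4}/ IV /VIV/ IX';

// Shift each phase from the I/V/X group to another group.
echo strtr($phase1, 'IVX', 'XLC'), PHP_EOL; // /X{5}/ L /L{2}/ C
echo strtr($phase1, 'IVX', 'CDM'), PHP_EOL; // /C{5}/ D /D{2}/ M
echo strtr($phase2, 'IVX', 'XLC'), PHP_EOL; // /X{4}/ XL /LXL/ XC

// Translating to 'IVX' itself leaves the phase unchanged.
echo strtr($phase1, 'IVX', 'IVX'), PHP_EOL; // /I{5}/ V /V{2}/ X
```

`strtr` translates in a single pass, so an `I` that becomes `X` is not translated again into `C`. A chained `str_replace` with arrays would cascade and give the wrong result.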

The last step after this translation is to `explode` the space-separated string and feed the tokens into the correct parameters of `preg_replace`.
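As a small sketch of that last step: when `preg_replace` is given arrays, it applies the patterns in order, each one to the result of the previous. The assumption here is that tokens 0 and 2 are the patterns and tokens 1 and 3 are the replacements, matching the layout of the phase strings.

```php
<?php

$a = explode(' ', '/I{5}/ V /V{2}/ X');

// Patterns are tokens 0 and 2; replacements are tokens 1 and 3.
$result = preg_replace(
    [$a[0], $a[2]],
    [$a[1], $a[3]],
    str_repeat('I', 18)
);

echo $result, PHP_EOL; // XVIII
```

The 18 tally marks first become `VVVIII`, and the second pattern then turns the first `VV` into `X`.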

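```php
<?php

namespace Seniorio;

class Numeralizor
{
    public function arabicToRoman($n)
    {
        // Start with a tally: $n copies of "I".
        $n = str_repeat('I', $n);

        // Each phase string holds two find-replace pairs:
        // pattern, replacement, pattern, replacement.
        foreach (array('/I{5}/ V /V{2}/ X', '/I{4}/ IV /VIV/ IX') as $p) {
            foreach (array('IVX', 'XLC', 'CDM') as $r) {
                // Shift the phase to the current numeral group, then split it into tokens.
                $a = explode(' ', strtr($p, 'IVX', $r));
                $n = preg_replace(array($a[0], $a[2]), array($a[1], $a[3]), $n);
            }
        }

        return $n;
    }
}
```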
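To watch the string evolve, here is a tracing sketch of the same loop as a standalone function. The name `traceArabicToRoman` and the returned array of intermediate strings are additions for illustration; the token indexes assume the pattern, replacement, pattern, replacement layout described earlier.

```php
<?php

// Hypothetical tracing variant: returns the starting tally followed by
// the string after each of the six phase/group passes.
function traceArabicToRoman(int $n): array
{
    $s = str_repeat('I', $n);
    $trace = [$s];

    foreach (['/I{5}/ V /V{2}/ X', '/I{4}/ IV /VIV/ IX'] as $p) {
        foreach (['IVX', 'XLC', 'CDM'] as $r) {
            $a = explode(' ', strtr($p, 'IVX', $r));
            $s = preg_replace([$a[0], $a[2]], [$a[1], $a[3]], $s);
            $trace[] = $s;
        }
    }

    return $trace;
}

$trace = traceArabicToRoman(49);
print_r($trace);
echo end($trace), PHP_EOL; // XLIX
```

For 49, Phase 1 leaves `XXXXVIIII`, the first Phase 2 pass turns that into `XXXXIX`, and the second yields `XLIX`. Since `M` is the largest numeral handled, inputs of 4000 and above simply produce a longer run of `M`s.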

Just a bit of fun. You can find the accompanying 3000-line(!) PhpSpec test on GitHub.