## Pages

### CodeEval Burrows-Wheeler transform

The CodeEval problem #184 asks us to generate the original text from an input line representing the result of a Burrows-Wheeler transformation.

The CodeEval page explains in a few words what is that transformation and what it useful for. Even if could look a bit unclear at first, it is quite easy to write a function that operates the transformation. Here is my python 3 code for it:
```def bw_transform(s):
size = len(s)
rots = sorted([s[i:size] + s[0:i] for i in range(size)])
result = [rot[-1] for rot in rots]
return ''.join(result)
```
The input line is rotated by one char at time, and all these rotations are stored in a list, that get sorted. Then I extract the last character in each rotation to another list that, joined, gives the result.

However our job is to reverse the result of the transformation, and that leads to a piece of code more complicated. Actually, I'm not sure the fuction I get could be considered a good result. It gets accepted by CodeEval, though.

Here is the test case I used to help me getting through the job:
```def test_provided_1(self):
self.assertEqual('Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo\$',
solution('oooooooo\$  ffffffff     ffffffffuuuuuuuuaaaaaaaallllllllbbBbbBBb'))

def test_provided_2(self):
solution('edarddddddddddntensr\$  ehhhhhhhhhhhJ aeaaaaaaaaaaalhtf thmbfe           tcwohiahoJ eeec t e '))

def test_provided_3(self):
self.assertEqual('Neko no ko koneko, shishi no ko kojishi\$', solution('ooooio,io\$Nnssshhhjo  ee  o  nnkkkkkkii '))

def test_extra(self):
self.assertEqual('easy-peasy\$', solution('yyeep\$-aass'))
```
As you can see also from the tests, in this implementation of the Burrows-Wheeler algorithm there is the trick of adding a marker, here a dollar sign, at the end of the line. That's why I defined a constant in my script:
```EOL = '\$'
```
Then I create a matrix, initialized as a list of empty lists, then, for each element, I repeatedly inserted at the beginning of each rotation each character in line.
But notice that, after each run of the inner for loop, I sort the rotations:
```def solution(line):
rots = [[] for _ in line]
for _ in line:
for ch, rot in zip(line, rots):
rot.insert(0, ch)
rots.sort()
```
So, in the end, rots contains all the rotations of the original string.

Now I should find out the right rotation. But that's easy, since I used the EOL marker to identify it.
```for candidate in rots:
if candidate[-1] == EOL:
return ''.join(candidate)
```
It is just a matter of selecting the rotation that has it as its last character.

CodeEval took quite a longish time to give its positive response on this solution. I think I should work more on it to get a better result. Anyway, I put the test case and the python script I have written on GitHub.