The CodeEval page explains in a few words what is that transformation and what it useful for. Even if could look a bit unclear at first, it is quite easy to write a function that operates the transformation. Here is my python 3 code for it:
def bw_transform(s): size = len(s) rots = sorted([s[i:size] + s[0:i] for i in range(size)]) result = [rot[-1] for rot in rots] return ''.join(result)The input line is rotated by one char at time, and all these rotations are stored in a list, that get sorted. Then I extract the last character in each rotation to another list that, joined, gives the result.
However our job is to reverse the result of the transformation, and that leads to a piece of code more complicated. Actually, I'm not sure the fuction I get could be considered a good result. It gets accepted by CodeEval, though.
Here is the test case I used to help me getting through the job:
def test_provided_1(self): self.assertEqual('Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo$', solution('oooooooo$ ffffffff ffffffffuuuuuuuuaaaaaaaallllllllbbBbbBBb')) def test_provided_2(self): self.assertEqual('James while John had had had had had had had had had had had a better effect on the teacher$', solution('edarddddddddddntensr$ ehhhhhhhhhhhJ aeaaaaaaaaaaalhtf thmbfe tcwohiahoJ eeec t e ')) def test_provided_3(self): self.assertEqual('Neko no ko koneko, shishi no ko kojishi$', solution('ooooio,io$Nnssshhhjo ee o nnkkkkkkii ')) def test_extra(self): self.assertEqual('easy-peasy$', solution('yyeep$-aass'))As you can see also from the tests, in this implementation of the Burrows-Wheeler algorithm there is the trick of adding a marker, here a dollar sign, at the end of the line. That's why I defined a constant in my script:
EOL = '$'Then I create a matrix, initialized as a list of empty lists, then, for each element, I repeatedly inserted at the beginning of each rotation each character in line.
But notice that, after each run of the inner for loop, I sort the rotations:
def solution(line): rots = [[] for _ in line] for _ in line: for ch, rot in zip(line, rots): rot.insert(0, ch) rots.sort()So, in the end, rots contains all the rotations of the original string.
Now I should find out the right rotation. But that's easy, since I used the EOL marker to identify it.
for candidate in rots: if candidate[-1] == EOL: return ''.join(candidate)It is just a matter of selecting the rotation that has it as its last character.
CodeEval took quite a longish time to give its positive response on this solution. I think I should work more on it to get a better result. Anyway, I put the test case and the python script I have written on GitHub.
No comments:
Post a Comment