Partitioning Souvenirs (patched)

Given a list of integers, we want to know if there is a way to split it evenly in three parts, so that the sum of each part is the same than the other ones.

Problem given in week six of the edX MOOC Algs200x Algorithmic Design and Techniques by the UC San Diego.

My first solution, as you can see in a previous post (please check it out for more background information), was accepted in the MOOC but didn't work in a few cases, as Shuai Zhao pointed out.

So, I added the Shuai test case to my testing suite, and then a couple of other ones, that I felt would help me in writing a more robust solution:
def test_shuai(self):
    self.assertFalse(solution([7, 2, 2, 2, 2, 2, 2, 2, 3]))

def test_confounding_choice(self):
    self.assertFalse(solution([3, 2, 2, 2, 3]))

def test_duplicates(self):
    self.assertTrue(solution([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]))
Firstly, I rewrote the initial check, that now looks in this way:
def solution(values):
    third, module = divmod(sum(values), 3)  # 1
    if len(values) < 3 or module or max(values) > third:  # 2
        return False
1. Add up the passed values, our target is its third, and surely we can't split it there is a remainder after the division. Using divmod() we get both values in a single shot.
2. If we have less than three values in input, a remainder by dividing their sum by three, or if there is a value bigger than a third of the sum, our job is already done.

Then, since I saw that the problem was due to the lack of check in the ownership for values, I added a shadow table to the one I'm using for the Dynamic Programming procedure. I named it 'taken', and it keeps the id of the owner for the relative value for the current row. Zero means the value is currently not taken.
table = [[0] * (len(values) + 1) for _ in range(third + 1)]
taken = [[0] * (len(values) + 1) for _ in range(third + 1)]
As before, I apply the standard DP procedure, looping on all the "valid" elements.
for i in range(1, third + 1):
    for j in range(1, len(values) + 1):
        # ...
Another not a substantial change, it occurred to me that I keep track of a maximum of two positive matches in the table, so, if I have already reached that value, there's no need in doing other checks, just set the current value to two.
if table[i][j-1] == 2:
    table[i][j] = 2
    continue
The next check was already in, I just refactored it out, because the combined if clause in which it was included was getting too complicated, and I added the first use of the taken table.
if values[j-1] == i:
    table[i][j] = 1 if table[i][j-1] == 0 else 2  # 1
    taken[i][j] = j  # 2
    continue

ii = i - values[j-1]  # 3
1. If the current value is the same of the current row value, I have found a match. So I increase its value in the DP table
2. And I mark it as taken for the j value in the current row.
3. If j is not an exact match for i, I split it, looking for the difference in the previous column.

Now it's getting complicated. If there is a good value in the previous column, before using it we have to ensure it is available. And, since it could have been the result of a splitting, we need to check on the full row to do our job correctly.
available = True
taken_id = taken[ii][j-1]  # 1
if taken_id:
    for jj in range(1, j):  # 2
        if taken[ii][jj] == taken_id:  # 3
            if taken[i][jj]:
                available = False
                break
1. The current value id from the taken table is zero if was not used in the 'ii' row.
2. If the current value has a valid id, we need to ensure it has not be already used in current 'i' row.
3. The current jj is marked as used in the ii row for the id we care about, if it is taken also in the i row, we have a problem. We are trying to using in 'i' a value that has been already used. This was the check I missed.

Good, now I can safely going on with my DP algorithm.
if ii > 0 and table[ii][j - 1] > 0 and not taken[i][j - 1] and available:
    table[i][j] = 1 if table[i][j - 1] == 0 else 2
    
    // ...
The annoying thing is that I have to set the taken table too:
taken[i][j] = j  # 1
taken[i][j-1] = j

matcher = values[j-1] + values[j-2]  # 2
if taken_id:
    for jj in range(1, len(values)):  # 3
        if taken[ii][jj] == taken_id:
            taken[i][jj] = j
            matcher += values[jj-1]
if matcher < i:
    for jj in range(j-2, 0, -1):  # 4
        if taken[i][jj] or matcher + values[jj-1] > i:
            continue
        matcher += values[jj-1]
        taken[i][jj] = j
        if matcher == i:
            break
1. This is nice and simple. The current j and the previous one should be marked as j.
2. We have to mark all the other values concurring to generate the current sum. These are two just marked 'j' on this row, all the ones marked as stored in taken_id on 'ii' and, if they are not enough, also the ones free on 'i', to the get the expected result. To perform this last check I need another variable, that I called 'matcher'.
3. Mark on 'i' as 'j' all the items marked taken_id on 'ii'. And adjust matcher.
4. If needed, loop on the left elements looking for the missing values.

And finally:
        # ...
        else:  # 1
            table[i][j] = table[i][j - 1]

return True if table[-1][-1] == 2 else False  # 2
1. This 'else' follows the 'if ii > 0 and ...' above, ensuring the current cell is adjourned correctly if nothing else happened before.
2. The bottom right cell in our DP table should be set to 2 only if we find a correct 3-way split.

See full python code and test cases on GitHub.

No comments:

Post a Comment