Pages

Partitioning Souvenirs (patched)

Given a list of integers, we want to know if there is a way to split it evenly in three parts, so that the sum of each part is the same than the other ones.

Problem given in week six of the edX MOOC Algs200x Algorithmic Design and Techniques by the UC San Diego.

My first solution, as you can see in a previous post (please check it out for more background information), was accepted in the MOOC but didn't work in a few cases, as Shuai Zhao pointed out.

So, I added the Shuai test case to my testing suite, and then a couple of other ones, that I felt would help me in writing a more robust solution:
def test_shuai(self):
    self.assertFalse(solution([7, 2, 2, 2, 2, 2, 2, 2, 3]))

def test_confounding_choice(self):
    self.assertFalse(solution([3, 2, 2, 2, 3]))

def test_duplicates(self):
    self.assertTrue(solution([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]))
Firstly, I rewrote the initial check, that now looks in this way:
def solution(values):
    third, module = divmod(sum(values), 3)  # 1
    if len(values) < 3 or module or max(values) > third:  # 2
        return False
1. Add up the passed values, our target is its third, and surely we can't split it there is a remainder after the division. Using divmod() we get both values in a single shot.
2. If we have less than three values in input, a remainder by dividing their sum by three, or if there is a value bigger than a third of the sum, our job is already done.

Then, since I saw that the problem was due to the lack of check in the ownership for values, I added a shadow table to the one I'm using for the Dynamic Programming procedure. I named it 'taken', and it keeps the id of the owner for the relative value for the current row. Zero means the value is currently not taken.
table = [[0] * (len(values) + 1) for _ in range(third + 1)]
taken = [[0] * (len(values) + 1) for _ in range(third + 1)]
As before, I apply the standard DP procedure, looping on all the "valid" elements.
for i in range(1, third + 1):
    for j in range(1, len(values) + 1):
        # ...
Another not a substantial change, it occurred to me that I keep track of a maximum of two positive matches in the table, so, if I have already reached that value, there's no need in doing other checks, just set the current value to two.
if table[i][j-1] == 2:
    table[i][j] = 2
    continue
The next check was already in, I just refactored it out, because the combined if clause in which it was included was getting too complicated, and I added the first use of the taken table.
if values[j-1] == i:
    table[i][j] = 1 if table[i][j-1] == 0 else 2  # 1
    taken[i][j] = j  # 2
    continue

ii = i - values[j-1]  # 3
1. If the current value is the same of the current row value, I have found a match. So I increase its value in the DP table
2. And I mark it as taken for the j value in the current row.
3. If j is not an exact match for i, I split it, looking for the difference in the previous column.

Now it's getting complicated. If there is a good value in the previous column, before using it we have to ensure it is available. And, since it could have been the result of a splitting, we need to check on the full row to do our job correctly.
available = True
taken_id = taken[ii][j-1]  # 1
if taken_id:
    for jj in range(1, j):  # 2
        if taken[ii][jj] == taken_id:  # 3
            if taken[i][jj]:
                available = False
                break
1. The current value id from the taken table is zero if was not used in the 'ii' row.
2. If the current value has a valid id, we need to ensure it has not be already used in current 'i' row.
3. The current jj is marked as used in the ii row for the id we care about, if it is taken also in the i row, we have a problem. We are trying to using in 'i' a value that has been already used. This was the check I missed.

Good, now I can safely going on with my DP algorithm.
if ii > 0 and table[ii][j - 1] > 0 and not taken[i][j - 1] and available:
    table[i][j] = 1 if table[i][j - 1] == 0 else 2
    
    // ...
The annoying thing is that I have to set the taken table too:
taken[i][j] = j  # 1
taken[i][j-1] = j

matcher = values[j-1] + values[j-2]  # 2
if taken_id:
    for jj in range(1, len(values)):  # 3
        if taken[ii][jj] == taken_id:
            taken[i][jj] = j
            matcher += values[jj-1]
if matcher < i:
    for jj in range(j-2, 0, -1):  # 4
        if taken[i][jj] or matcher + values[jj-1] > i:
            continue
        matcher += values[jj-1]
        taken[i][jj] = j
        if matcher == i:
            break
1. This is nice and simple. The current j and the previous one should be marked as j.
2. We have to mark all the other values concurring to generate the current sum. These are two just marked 'j' on this row, all the ones marked as stored in taken_id on 'ii' and, if they are not enough, also the ones free on 'i', to the get the expected result. To perform this last check I need another variable, that I called 'matcher'.
3. Mark on 'i' as 'j' all the items marked taken_id on 'ii'. And adjust matcher.
4. If needed, loop on the left elements looking for the missing values.

And finally:
        # ...
        else:  # 1
            table[i][j] = table[i][j - 1]

return True if table[-1][-1] == 2 else False  # 2
1. This 'else' follows the 'if ii > 0 and ...' above, ensuring the current cell is adjourned correctly if nothing else happened before.
2. The bottom right cell in our DP table should be set to 2 only if we find a correct 3-way split.

See full python code and test cases on GitHub.

7 comments:

  1. this is very long code, the problem has much easier solution, however the way you are solving the 2 partition problem should be different.
    please check below solving the 2 partition problem from Wikipedia

    https://en.wikipedia.org/wiki/Partition_problem

    from the above , you can expand it to 3 partition problem

    ReplyDelete
    Replies
    1. Thank you for commenting, Ayman. May I ask you to post your solution somewhere so to share it to everybody?

      As you can see in my first attempt to solve this problem, I originally came out with a slimmer and simpler solution, that was also closer to the well known 2-partition approach. Alas, it didn't always work right. Hence this patchy and a bit involute piece of code.

      Delete
  2. My solution was easy, but probably more complicated than it needed to be. Say we have z souvenirs to partition amongst three people, with x = sum(z) / 3 integer, len(z) > 3, and max(z) <= x.

    I created a 3D array T of size (z+1) by (x+1) by (x+1), where T[i][j][k] = true iff amongst the first i souvenirs, you can give person 1 a subset that equals j in value, and person 2 a subset that equals k in value.

    That made the recurrences really simple to define, and then of course the answer is in T[z][x][x], leaving the rest of the souvenirs of value x to the third person.

    It passed on the Data Structures and Algorithms course with good time, so I was content.

    ReplyDelete
  3. BTW, here's my solution in case anyone is curious:

    https://github.com/sraaphorst/data_structures_and_algorithms/blob/master/course_1_algorithmic_toolbox/week6_dynamic_programming2/2_partitioning_souvenirs/partition3.py

    ReplyDelete
    Replies
    1. Actually, I think I initially tried something like your approach to get a working solution. Unfortunately, I've got lost in the 3D structure, so I decided to move to the 2D approach, illusory simpler, that in the end required the patch showed above to work.

      These MOOC's are a lot of fun, aren't they? :)

      Thank you for sharing your code, I'm going to start soon working on it!

      Delete
  4. Also, you seem to like many of the same things I do... I love C++, Qt, Python, and Java. My job has me mostly doing Scala programming these days, but my personal projects are almost all C++ / Python.

    ReplyDelete
    Replies
    1. Yep, I guess our backgrounds are close. I know very little about Scala - I followed a just course on it, and my focus was more on functional programming than on the language itself - but I think we could have interesting chats on software development.

      Delete