[2023 Day 12] I feel like I might be missing a trick regarding combinations

purplemonkeymad@programming.dev · edit-2 11 months ago

hades@lemm.ee · 11 months ago

I can bail out of branches of combinations if the info so far won’t fit, but that still leads me to visiting every valid combination which in one of the examples is 500k.

By “every valid combination”, do you mean every substitution of ‘?’ with a ‘#’ or ‘.’? If so, that isn’t right: you can bail out early from branches that don’t fit, and cut a lot of them this way.

Consider the following example:

???????????? [1, 2, 2]

When you substitute the first two question marks with ##, the answer already doesn’t match the input string (the first group must have length 1), so you can throw away all 1024 (2^10) combinations of the remaining ten question marks at once.
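A minimal sketch of that kind of early bail-out, in case it helps. The names `prefix_consistent` and `count_arrangements` are made up for illustration and are not from anyone's actual solution. The prefix check only rejects a partial substitution when it can no longer match the group list.

```python
def prefix_consistent(prefix: str, groups: list[int]) -> bool:
    """Return False as soon as a fully substituted prefix contradicts groups."""
    runs = [len(run) for run in prefix.split(".") if run]
    if not runs:
        return True
    if len(runs) > len(groups):
        return False
    if prefix.endswith("#"):
        # The last run may still grow, so it only has to not exceed its group.
        *closed, last = runs
        return closed == groups[: len(closed)] and last <= groups[len(closed)]
    # Every run is closed, so each one must match its group exactly.
    return runs == groups[: len(runs)]


def count_arrangements(row: str, groups: list[int], prefix: str = "") -> int:
    """Count matching substitutions, abandoning a branch once its prefix fails."""
    if not prefix_consistent(prefix, groups):
        return 0  # prune: nothing below this prefix can match
    if len(prefix) == len(row):
        runs = [len(run) for run in prefix.split(".") if run]
        return int(runs == groups)
    ch = row[len(prefix)]
    options = "#." if ch == "?" else ch
    return sum(count_arrangements(row, groups, prefix + c) for c in options)


row, groups = "????????????", [1, 2, 2]
print(prefix_consistent("##", groups))   # False, so this whole branch is skipped
print(2 ** (len(row) - 2))               # 1024 substitutions never visited
print(count_arrangements(row, groups))
```

This still visits every valid arrangement once, so it only helps with dead branches, not with the sheer number of valid ones.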

Also, while you’re at it, avoid generic type annotations (e.g. list), try to always specify the generic argument (e.g. list[str]) :)

purplemonkeymad@programming.dev · edit-2 11 months ago

When you substitute the first two question marks with ##, the answer already doesn’t match the input string, so you can throw away all 1024 (2^10) combinations of the remaining ten question marks at once.

You know, I figured I was already doing that, but printing your example shows I was not. I also added some logic at the other end: 1,2,2 needs a space of 7 (1+2+2 plus two separating dots), so if everything checked so far is dots and I only have 6 chars left, I know it can’t fit.
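For reference, the remaining-space check can be sketched like this. `min_length` and `can_still_fit` are hypothetical helper names, and the check is only valid when no group is partially placed at the current position.

```python
def min_length(groups: list[int]) -> int:
    """Smallest number of characters that can hold all groups, one dot between each."""
    return sum(groups) + max(len(groups) - 1, 0)


def can_still_fit(remaining: str, groups: list[int]) -> bool:
    """True if the unprocessed tail is long enough for the groups still to place."""
    return len(remaining) >= min_length(groups)


print(min_length([1, 2, 2]))               # 7
print(can_still_fit("??????", [1, 2, 2]))  # False: only 6 chars left
```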

It’s still taking a long time for the real data, so it must be something inefficient in my code rather than the method.

Also, while you’re at it, avoid generic type annotations

Good point. Recently figured that one out, still not automatic as you can see.

Thanks for the pointers.

Gobbel2000@feddit.de · 11 months ago

I don’t think there are many significant optimizations with regard to reducing the search tree. It took me long enough to figure out, but the “solution” (not saying there aren’t other ways) to part 2 is to not calculate anything more than once. Instead, put partial solutions in a dict indexed by the current state and use that cached value if you need it again.

It seems like you are actually constructing all rows with each ? replaced. This won’t be viable for part 2; your memory usage will explode. I have a recursive function that calls itself twice whenever a ? is encountered, once assuming it’s a ., and once a #.
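Roughly what I mean, as a sketch rather than my actual code. The state `(pos, group, run)` and the helper names are just one possible choice: `pos` is the index into the row, `group` is the index of the group being filled, and `run` is the length of the current run of #.

```python
def count_arrangements(row: str, groups: list[int]) -> int:
    """Count arrangements without ever building the substituted rows."""
    cache: dict[tuple[int, int, int], int] = {}

    def go(pos: int, group: int, run: int) -> int:
        if pos == len(row):
            if run == 0:
                return int(group == len(groups))
            # An open run at the end must be exactly the last group.
            return int(group == len(groups) - 1 and run == groups[group])
        key = (pos, group, run)
        if key in cache:
            return cache[key]
        total = 0
        ch = row[pos]
        if ch in ".?":  # treat this character as '.'
            if run == 0:
                total += go(pos + 1, group, 0)
            elif run == groups[group]:
                total += go(pos + 1, group + 1, 0)  # the run closes a group
        if ch in "#?":  # treat this character as '#'
            if group < len(groups) and run < groups[group]:
                total += go(pos + 1, group, run + 1)
        cache[key] = total
        return total

    return go(0, 0, 0)


def unfold(row: str, groups: list[int]) -> tuple[str, list[int]]:
    """Part 2 input: five copies of the row joined by '?', five copies of the groups."""
    return "?".join([row] * 5), groups * 5


print(count_arrangements("?###????????", [3, 2, 1]))           # 10
print(count_arrangements(*unfold("?###????????", [3, 2, 1])))  # 506250
```

The number of distinct states is bounded by row length × number of groups × longest group, which is why the 500k case stops being a problem.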

purplemonkeymad@programming.dev · 11 months ago

Memory is fine but I think I get what you mean. In the example:

????.###.????.### 1,3,1,3

I’m checking the combinations of the second group of unknowns for each combination of the first, but if my state was, say,

data: '...#.###.????.###'
position ------^
check_list: [1,3]

and I get 4 combinations from the recursion, then I know that is the same number of combinations for any arrangement of the first unknowns.

So I can then cache ".????.###",[1,3] -> 4.
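A small sketch of caching on exactly that key, assuming the state is the remaining string plus the remaining groups. The function name `count` is illustrative, and the groups are a tuple here only because `functools.lru_cache` needs hashable arguments.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count(rest: str, groups: tuple[int, ...]) -> int:
    """Arrangements of the remaining string for the remaining groups."""
    if not groups:
        return int("#" not in rest)
    if not rest:
        return 0
    total = 0
    if rest[0] in ".?":  # leave this position empty
        total += count(rest[1:], groups)
    if rest[0] in "#?":  # start the next group here
        n = groups[0]
        fits = (
            len(rest) >= n
            and "." not in rest[:n]
            and (len(rest) == n or rest[n] != "#")
        )
        if fits:
            # Skip the group and the separator that must follow it.
            total += count(rest[n + 1:], groups[1:])
    return total


print(count(".????.###", (1, 3)))                # 4
print(count("????.###.????.###", (1, 3, 1, 3)))  # 16
```

Every way of placing the first 1 and the first 3 ends up asking for the same tail with (1, 3), so that tail is only computed once.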

Sekoia@lemmy.blahaj.zone · edit-2 11 months ago

I went with a completely different approach:

>! Iterate over our string. Whenever you hit a non-empty character (anything other than a .), check whether the next N could all be a # (N being the first element of our sequence) and that the (N+1)th isn’t a #. If so, we can truncate the first N+1 characters and the first element of our sequence, and recurse. If you hit a #, you know that the first element has to start here at the latest, so you can break. With this method, memoization is enough to get part 2 down to 25 ms. To make the memoization more efficient, you can also truncate all the way up to the next non-empty character when recursing. !<
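In Python that could look something like the sketch below. This is a reconstruction from the description above, not the original code (no language or timing is implied), and `count` is an illustrative name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count(row: str, groups: tuple[int, ...]) -> int:
    """Try every possible start position for the first group, then recurse."""
    if not groups:
        return int("#" not in row)
    n = groups[0]
    total = 0
    for i, ch in enumerate(row):
        if ch == ".":
            continue
        window = row[i:i + n]
        # The next n characters must all be able to be '#', and the one after must not be '#'.
        if len(window) == n and "." not in window and row[i + n:i + n + 1] != "#":
            # Drop the group plus its separator, then skip ahead to the next non-empty character.
            total += count(row[i + n + 1:].lstrip("."), groups[1:])
        if ch == "#":
            break  # the first group has to start here at the latest
    return total


row, groups = "?###????????", (3, 2, 1)
print(count(row, groups))                      # 10
print(count("?".join([row] * 5), groups * 5))  # 506250
```

Stripping the leading dots before recursing means more call paths land on the same cache key, which is the point of the last sentence above.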
