0

This algorithm (originally implemented in unl-aligner) calculates the longest list of increasing numbers with correspondingly increasing indices in the sequence, so given

seq = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]

it will return

[0, 2, 6, 9, 11, 15]

the implementation looks like

def subseq(seq, keyfn=lambda value: value):
    if not seq: return seq
    tail = []
    prev = []
    for i in range(len(seq)):
        for k in range(len(tail)-1, -1, -1):
            if keyfn(seq[tail[k]]) < keyfn(seq[i]):
                if len(tail) == k+1:
                    tail.append(i)
                elif keyfn(seq[tail[k+1]]) > keyfn(seq[i]):
                    tail[k+1] = i
                prev.append(tail[k])                    
                break
        else:
            tail.append(i)
            prev.append(None)

    i = tail[-1]
    subseq = [seq[i]]
    while prev[i] is not None:
        i = prev[i]
        subseq.append(seq[i])
    subseq.reverse()
    return subseq

The algorithm performs a linear scan, while a bisect (binary) search should be preferred. Which is the best approach to refactor it to perform a binary search?

8
  • 1
    I don't think there's a way to do this more efficiently with a binary search vs. linear. The reason is because a binary search works on a sorted list, since you can compare adjacent elements. It's not obvious to me if that can be applied here. Commented May 23, 2019 at 19:21
  • 2
    How can you bisect something that isn't sorted? Maybe I'm misunderstanding the question. Commented May 23, 2019 at 19:23
  • 1
    first sort your sequence / use bisect to keep the list sorted. Now you can apply binary search. Commented May 23, 2019 at 19:24
  • 1
    Also, your output is not the longest increasing subsequence - I'm not sure what that output represents. Commented May 23, 2019 at 19:25
  • 1
    @loretoparisi none of the numbers in the output are consecutive. I think what you're describing is the longest list of increasing numbers in the sequence with correspondingly increasing indices in the sequence. Also your sample code fails because tail is undefined. Commented May 23, 2019 at 19:59

1 Answer 1

1

With this answer:

bisect = "longest_subsequence([1,2,3,4,5,6,7,2,2,2,2,2,5,1,7,8])"
_subseq = "subseq([1,2,3,4,5,6,7,2,2,2,2,2,5,1,7,8])"

from timeit import timeit

print(timeit(bisect, globals=globals(), number=10000))  # 0.2994734
print(timeit(_subseq, globals=globals(), number=10000))  # 0.32428109999999993

This is the result on a totally random test, for your example they seem almost exact time-wise

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.