4

Problem

I was given this problem in my Algorithms class today:

Given function maxSubstring(s, t), where s is a string and t is a substring of s, find the maximum number of iterations you can delete either the first or last occurrence of substring t .

Concept

Here is a visualization of the function maxSubstring called on s = banababbaa and t = ba.

          b  a  n  a  b  b  a  a
1st move: n  a  b  a  b  b  a            or   b  a  n  a  b  a  b  a
2nd move: n a b b a a  or  n a b a b a        n a b a b a  or  b a n a b a
3rd move:    n a b a   or   n a b a             n a b a    or    n a b a
4th move:             n  a                                n  a

Thus, this operation takes four moves.

Attempt

Here is my solution to the problem. It works, but it is very slow when I use larger strings as arguments.

Attempt #1

function maxSubstring(s, t) {
    if (s.includes(t)) {
        var idxSubstr = s.replace(t, '');
        var lastIdxSubstr = s.substr(0, s.lastIndexOf(t)) + s.substr(s.lastIndexOf(t) + t.length, s.length);
        return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t)));
    }
    return 0;
}

Attempt #2

function maxSubstring(s, t) {
    if (s.includes(t)) {
        var idx = s.indexOf(t), lastIdx = s.lastIndexOf(t);
        var idxSubstr = s.substr(0, idx) + s.substr(idx + t.length, s.length);
        var lastIdxSubstr = s.substr(0, lastIdx) + s.substr(lastIdx + t.length, s.length);
        if (idx != lastIdx) {
            return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t));
        } else {
            return 1 + maxSubstring(idxSubstr, t);
        }
    }
    return 0;
}

Reason for update: Minor change in efficiency by storing values of indexOf and lastIndexOf in variables.

Attempt #3

function maxSubstring(s, t) {
    var idx = s.indexOf(t);
    if (idx >= 0) {
        var lastIdx = s.lastIndexOf(t);
        var idxSubstr = s.substr(0, idx) + s.substr(idx + t.length);
        if (idx != lastIdx) {
            var lastIdxSubstr = s.substr(0, lastIdx) + s.substr(lastIdx + t.length);
            return 1 + Math.max(maxSubstring(idxSubstr, t), maxSubstring(lastIdxSubstr, t));
        } else {
            return 1 + maxSubstring(idxSubstr, t);
        }
    }
    return 0;
}

Reason for update: Reduced instances in which certain values were redefined and prevented lastIndexOf calculation before the first index is checked.

Answer Requirements

Is there any algorithm or method I may use to optimize this code? Math.max is the main culprit, so I would appreciate it if anyone has an idea on how to avoid using this method altogether.

In other words, maxSubstring should only be called once inside of itself, but Math.max requires that it be called twice (once for the first index of a substring and another time for the last index of that substring).

Lastly, do you mind telling me what the Big O Notation is for my solution and what the Big O Notation is for yours? This is not part of the original challenge, but I am curious, myself. Thanks in advance.

8
  • If Math.max is you're worried about only, then why don't you use <,> to compare, after all there are only 2 params. Use ternary operator and comparison operators to find max between 2. Commented Sep 6, 2017 at 19:36
  • This concept is outlined here, but for Java. It doesn't show a great change in runtime of my function. stackoverflow.com/questions/2103606/… Commented Sep 6, 2017 at 19:39
  • 1
    This won't produce a big speedup, but your .includes() and .replace() are basically doing the same work twice of finding t within s, so you could get a bit of a speed up by combining them. In fact, your lastIndexOf(t) calls are also doing the same expensive operation, so maybe do it once instead 4 times! Commented Sep 6, 2017 at 19:41
  • hmm. Other complexity included are indexOf and lastIndexOf, which are loops ultimately. Commented Sep 6, 2017 at 19:42
  • 1
    Try some dynamic programming approach Commented Sep 6, 2017 at 20:11

1 Answer 1

3

The major problem with the naive recursive algorithm that you have presented is that it is called very often on the same input s - exponentially often even, and exactly that is what causes the noticeable slowdown on larger strings.

What you can do against this is to use memoisation - remember the result for a specific input in a lookup table.

Another optimisation you can do is check whether deleting the first vs the last lead to different results at all. In most cases, it will absolutely not matter in which sequence you remove them, the number of possible removals is always the same. However, that is not the case when the matched substring can overlap with itself. As an example, try maxSubstring('ababaa', 'aba').

function maxSubstring(s, t, prevResults = new Map()) {
    function result(x) { prevResults.set(s, x); return x; }
    if (prevResults.has(s))
        return prevResults.get(s); // memoisation

    const first = s.indexOf(t);
    if (first == -1)
        return result(0);
    const withoutFirst = s.slice(0, first) + s.slice(first + t.length);

    const last = s.lastIndexOf(t);
    if (last == first) // only one match
        return result(1 + maxSubstring(withoutFirst, t, prevResults));

    if (t.lastIndexOf(t.charAt(t.length-1), t.length-1) == -1 // last character of t is found nowhere else in t
        || !t.includes(s.charAt(first+t.length))) // character after the match can never be part of a match
        // so this match is always removed in the optimal sequence and it doesn't matter whether as first or last
        return result(1 + maxSubstring(withoutFirst, t, prevResults));

    const withoutLast = s.slice(0, last) + s.slice(last + t.length);
    if (t.indexOf(t.charAt(0), 1) == -1 // first character of t is found nowhere else in t
        || !t.includes(s.charAt(last - 1))) // character before the match can never be part of a match
        // so this match is always removed and it doesn't matter when
        return result(1 + maxSubstring(withoutLast, t, prevResults));

    return result(1 + Math.max(maxSubstring(withoutFirst, t, prevResults),
                               maxSubstring(withoutLast, t, prevResults)));
}

Time Complexity Analysis

The number of recursive calls should be roughly quadratic in the number of removals. With my second suggestion, it might get down to linear in the best cases (depending on the patterns).

For every call, factor in the linear searches (indexOf, slice, etc.) and the Map lookup, though their average complexity will be less than that as the input get smaller and the pattern is often found early in the input. In any case, the complexity is polynomial, not exponential.

Sign up to request clarification or add additional context in comments.

11 Comments

Thanks for the advice! I added a statement that makes sure Math.max is only called if the indexOf and lastIndexOf values are not equal.
@AnthonyKrivonos I thought of that as well, but it doesn't really help - that happens only for the last removal anyway.
@AnthonyKrivonos A better optimisation would be to place the var idx = …, lastIdx = …; in the first line of the function and then test if (idx == -1 && lastIdx == -1) return 0; else if (idx == lastIdx) return 1; else …, not calling includes at all.
@AnthonyKrivonos In your updated attempts (up to #3 as I write this) you are not including the bulk of what Bergi has suggested as performance improvements, notably memoization and not analyzing both paths if the substring does not overlap itself. I suspect adding those enhancements will likely resolve your timeouts in the test cases.
@AnthonyKrivonos I have no idea, the number of recursive calls should be roughly quadratic in the number of removals. With my second suggestion, it might get down to linear in the best cases (depending on the patterns). For every call, factor in the linear searches (indexOf, slice etc) and the map lookup, though their average complexity will be less than that as the inputs get smaller and the pattern is often found early in the input. In any case, the complexity is polynomial not exponential.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.