Longest substring without repeating characters

by nikoo28

Question: Given a string, find the length of the longest substring without repeating characters.
Input: “abcabcbb”
Output: 3 (the substring “abc”)

Input: “bbbbb”
Output: 1 (the substring “b”)

The longest substring without repeating letters for “abcabcbb” is “abc”, which has length 3.
For “bbbbb”, the longest substring is “b”, with length 1.

The brute-force method is always available: enumerate every substring and check each one for repeated characters. But we want to speed up the process.
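
For comparison, here is a minimal sketch of what such a brute-force approach might look like. The function name bruteForceLongest is hypothetical and not part of the original post; for every starting position it extends the substring until the next character already occurs in it, using a linear scan for the check.

```cpp
#include <algorithm>
#include <string>

// Hypothetical brute-force reference (name is an assumption, not from the post).
// For each start index, grow the substring until a character repeats.
int bruteForceLongest(const std::string& s) {
    int n = static_cast<int>(s.length());
    int best = 0;
    for (int start = 0; start < n; ++start) {
        int end = start;
        // Extend while s[end] does not already occur in s[start..end).
        while (end < n &&
               std::find(s.begin() + start, s.begin() + end, s[end]) == s.begin() + end) {
            ++end;
        }
        best = std::max(best, end - start);
    }
    return best;
}
```

Each extension rescans the current substring, so this does noticeably more work than a single pass, but it is handy for cross-checking a faster version on small inputs.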

How can we look up instantly whether a character already exists in the current substring? The answer is a simple table that records which characters have appeared. As you traverse the string, mark each character in the table, using its ASCII value as the index.
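
The table idea can be sketched in isolation as follows. The helper name hasRepeat is hypothetical, and the sketch assumes single-byte characters, so 256 entries cover every possible value; the cast to unsigned char keeps the index non-negative on platforms where char is signed.

```cpp
#include <string>

// Hypothetical helper (not from the original post): returns true if any
// character occurs more than once in s, using a 256-entry lookup table.
bool hasRepeat(const std::string& s) {
    bool seen[256] = { false };  // seen[v] is true once the byte value v has appeared
    for (char c : s) {
        unsigned char idx = static_cast<unsigned char>(c);
        if (seen[idx]) {
            return true;         // constant-time lookup found an earlier occurrence
        }
        seen[idx] = true;
    }
    return false;
}
```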

The next question to ask yourself is what happens when you find a repeated character. For example, if the string is “abcdcedf”, what happens when you reach the second appearance of ‘c’?

When you have found a repeated character (let’s say at index j), it means that the current substring (excluding the repeated character of course) is a potential maximum, so update the maximum if necessary. It also means that the repeated character must have appeared before at an index i, where i is less than j.

Any substring that starts at or before index i and reaches index j would contain the repeated character twice, so no substring starting there can beat your current maximum. You can therefore safely start looking for the next substring with its head exactly at index i+1.
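
To make the head movement concrete, here is a small sketch. The helper nextHead is hypothetical (it does not appear in the original post) and assumes that the character at index j really does occur earlier in the current window.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): given the head of the
// current window and the index j of a repeated character, return the head of
// the next window, i.e. i+1 where i is the earlier occurrence of s[j].
int nextHead(const std::string& s, int head, int j) {
    int i = head;
    // s[j] is assumed to occur in s[head..j), so this scan stops before j.
    while (i < j && s[i] != s[j]) {
        ++i;
    }
    return i + 1;
}
```

For “abcdcedf” with the head at index 0, the second ‘c’ is at j = 4 and the first at i = 2, so nextHead returns 3 and the next window begins at ‘d’.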

Therefore, you need two indices to record the head and the tail of the current substring. Since i and j each advance at most n steps, the worst case is 2n steps, so the running time is O(n).

Below is the implementation in C++. Beware of the common mistake of not updating the maximum after the main loop, which is easy to forget.

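#include <algorithm>
#include <string>
using namespace std;

int lengthOfLongestSubstring(string s)
{
	// Get the length of the string
	int n = static_cast<int>(s.length());
	
	// i is the head and j is the tail of the current substring
	int i = 0, j = 0;
	int maxLen = 0;
	
	// Set all characters as not-existing
	bool exist[256] = { false };
	
	while (j < n)
	{
		// Cast to unsigned char so the index is never negative
		unsigned char c = static_cast<unsigned char>(s[j]);
		
		// Check if the character exists in the current substring
		if (exist[c])
		{
			maxLen = max(maxLen, j-i);
			
			// Drop characters from the head up to the earlier occurrence
			while (s[i] != s[j])
			{
				exist[static_cast<unsigned char>(s[i])] = false;
				i++;
			}
			
			// Skip the earlier occurrence; s[j] itself stays marked
			i++;
			j++;
		}
		else
		{
			exist[c] = true;
			j++;
		}
	}
	
	// Account for the substring that runs to the end of the string
	maxLen = max(maxLen, n-i);
	return maxLen;
}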
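
As an aside, a common variant (not the method used in this post) stores the last index at which each character was seen instead of a boolean flag, so the head can jump straight to i+1 without an inner loop. The sketch below is offered under that assumption; the name longestWithLastIndex is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical variant (not the post's implementation): track the last index
// of every byte value so the head can jump directly past a repeated character.
int longestWithLastIndex(const std::string& s) {
    std::vector<int> last(256, -1);  // last[v] = most recent index of byte v, or -1
    int best = 0;
    int head = 0;                    // start of the current repeat-free window
    for (int j = 0; j < static_cast<int>(s.length()); ++j) {
        unsigned char c = static_cast<unsigned char>(s[j]);
        // Only an occurrence inside the current window forces the head to move.
        if (last[c] >= head) {
            head = last[c] + 1;
        }
        last[c] = j;
        best = std::max(best, j - head + 1);
    }
    return best;
}
```

Because the maximum is updated on every step, this variant has no separate update after the loop to forget.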

2 comments

Samuel Marks February 21, 2015 - 04:08

Inspired by your post to write one which finds the longest reoccurring character:

#include <iostream>
#include <string>
#include <utility>

std::pair<char, unsigned> longestSubstring(const std::string&);

int main() {
    std::pair<char, unsigned> longest_substring = longestSubstring("aaaabbbbbbcccccccnee");
    std::cout << "Longest reoccurring character is: '" << longest_substring.first
              << "', at: " << longest_substring.second << " times." << std::endl;
    return 0;
}

std::pair<char, unsigned> longestSubstring(const std::string &s){
    unsigned max_length = 0, curr_length = 0;
    char last = 0, max = 0;
    for(char c: s) {
        // Extend the current run, or start a new run of length 1
        curr_length = (curr_length > 0 && c == last) ? curr_length + 1 : 1;
        if(curr_length > max_length) {
            max_length = curr_length;
            max = c;
        }
        last = c;
    }
    return std::make_pair(max, max_length);
}
nikoo28 February 21, 2015 - 11:53

Thanks for the case Samuel. I will try to create a separate post for this. Here is the running link for your code.
http://ideone.com/D6vSCT
