Longest substring without repeating characters

Question: Given a string, find the length of the longest substring without repeating characters.
Input: “abcabcbb”
Output: 3 (the substring “abc”)

Input: “bbbbb”
Output: 1 (the substring “b”)

The longest substring without repeating characters in “abcabcbb” is “abc”, which has length 3.
For “bbbbb”, the longest substring is “b”, with length 1.

The brute-force method is always available: enumerate every substring, check each one for repeated characters, and keep the longest that passes. But we want to speed up the process.
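
For comparison, here is a minimal sketch of one possible brute-force approach. The function name bruteForceLongest is hypothetical, and this is only one reading of "brute force": for every starting position it extends the substring until a character repeats, rescanning the substring at each step.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical brute-force baseline: try every start position and
// extend the substring for as long as the next character is new.
int bruteForceLongest(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t best = 0;
    for (std::size_t start = 0; start < n; ++start)
    {
        std::size_t end = start;
        // s.find(c, start) returns the first index >= start holding c.
        // If that index is end itself, c does not occur in s[start..end).
        while (end < n && s.find(s[end], start) == end)
        {
            ++end;
        }
        best = std::max(best, end - start);
    }
    return static_cast<int>(best);
}
```

Every start position rescans characters that earlier start positions already examined, which is the repeated work a faster method avoids.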

How can we check instantly whether a character already exists in the current substring? The answer is a simple table that records which characters have appeared. As you traverse the string, use each character’s ASCII value as an index into the table and mark that entry.
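
As a small illustration of the table lookup on its own, the sketch below reports where the first repeated character shows up. The helper name firstRepeatIndex is hypothetical, and the 256-entry table assumes single-byte characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first character that has already
// appeared earlier in s, or -1 if every character is distinct.
int firstRepeatIndex(const std::string& s)
{
    // One flag per possible byte value (assumes single-byte characters)
    bool exist[256] = { false };
    for (std::size_t k = 0; k < s.size(); ++k)
    {
        // Cast so that bytes above 127 cannot become a negative index
        unsigned char c = static_cast<unsigned char>(s[k]);
        if (exist[c])
        {
            return static_cast<int>(k);
        }
        exist[c] = true;
    }
    return -1;
}
```

Each lookup is a single array access, so checking a character costs the same no matter how long the current substring is.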

The next question to ask yourself is what happens when you find a repeated character. For example, if the string is “abcdcedf”, what happens when you reach the second appearance of ‘c’?

When you have found a repeated character (let’s say at index j), it means that the current substring (excluding the repeated character of course) is a potential maximum, so update the maximum if necessary. It also means that the repeated character must have appeared before at an index i, where i is less than j.

Any substring that starts at or before index i either stops before j, in which case it is no longer than the candidate you just measured, or includes both copies of the repeated character. You can therefore safely start looking for the next substring with its head exactly at index i+1.

Therefore, you need two indices to record the head and the tail of the current substring. Since i and j each advance at most n steps, the worst case is 2n steps, so the running time is O(n).
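
The jump of the head to i+1 can also be made in a single step by storing, for each character, the index where it last appeared instead of a plain seen/not-seen flag. The sketch below is that variant, offered as an illustration under stated assumptions rather than as the definitive method: the name longestUniqueByLastIndex is hypothetical, and the 256-entry table assumes single-byte characters.

```cpp
#include <algorithm>
#include <string>

// Variant sketch: remember the last index of each character so the
// head can jump straight past the earlier occurrence.
int longestUniqueByLastIndex(const std::string& s)
{
    int last[256];
    std::fill(last, last + 256, -1);   // -1 means "not seen yet"

    int head = 0;   // start of the current repeat-free substring
    int best = 0;
    for (int j = 0; j < static_cast<int>(s.size()); ++j)
    {
        unsigned char c = static_cast<unsigned char>(s[j]);
        // Only an occurrence inside the current substring counts as a repeat
        if (last[c] >= head)
        {
            head = last[c] + 1;
        }
        last[c] = j;
        best = std::max(best, j - head + 1);
    }
    return best;
}
```

Because the maximum is updated on every step, this variant needs no extra update after the loop.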

Below is the implementation in C++. Beware of the common mistake of not updating the maximum after the main loop, which is easy to forget.

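#include <algorithm>
#include <string>
using namespace std;

int lengthOfLongestSubstring(string s)
{
	// Get the length of the string
	int n = static_cast<int>(s.length());
	
	// i is the head of the current substring, j is the tail
	int i = 0, j = 0;
	int maxLen = 0;
	
	// Set all characters as not-existing
	bool exist[256] = { false };
	
	while (j < n)
	{
		// Index with unsigned char so bytes above 127 cannot give a negative index
		unsigned char c = static_cast<unsigned char>(s[j]);
		
		// Check if the character exists in the current substring
		if (exist[c])
		{
			// s[i..j) has no repeats, so it is a candidate for the maximum
			maxLen = max(maxLen, j-i);
			
			// Drop characters from the head until the earlier copy of s[j]
			while (s[i] != s[j])
			{
				exist[static_cast<unsigned char>(s[i])] = false;
				i++;
			}
			
			// Step past the earlier copy; s[j] itself stays marked
			i++;
			j++;
		}
		else
		{
			exist[c] = true;
			j++;
		}
	}
	
	// The final substring never meets a repeat, so measure it here
	maxLen = max(maxLen, n-i);
	return maxLen;
}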
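
The function above returns only the length, while the examples at the top also quote the substring itself. As a hedged sketch, the same two-index idea can track where the best substring starts and return it. The name longestUniqueSubstring is hypothetical, and when several substrings tie for the maximum length this version returns the first one.

```cpp
#include <cstddef>
#include <string>

// Sketch: same head/tail window, but remember where the best one starts.
std::string longestUniqueSubstring(const std::string& s)
{
    bool exist[256] = { false };
    std::size_t i = 0;            // head of the current substring
    std::size_t bestStart = 0;
    std::size_t bestLen = 0;

    for (std::size_t j = 0; j < s.size(); ++j)
    {
        unsigned char c = static_cast<unsigned char>(s[j]);
        // Advance the head until c is no longer inside the window
        while (exist[c])
        {
            exist[static_cast<unsigned char>(s[i])] = false;
            ++i;
        }
        exist[c] = true;

        // Strict comparison keeps the earliest substring on ties
        if (j - i + 1 > bestLen)
        {
            bestLen = j - i + 1;
            bestStart = i;
        }
    }
    return s.substr(bestStart, bestLen);
}
```

For “abcabcbb” this yields “abc”, and for “bbbbb” it yields “b”, matching the examples in the question.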