Determine if two strings/phrases are valid anagrams

Given two strings str1 and str2. Write a function to determine if str1 is an anagram of str2.
Example 1:
str1 = “listen”, str2 = “silent”
Output = True
Example 2:
str1 = “mississippi”, str2 = “mips”
Output = False

Terminology:

Anagram: Two words or phrases are said to be anagrams of each other if they can be formed by re-shuffling of characters in one of them.

figure showing example of a valid anagram — Fig: Example of an anagram

Problem Statement:

In the given problem we have 2 strings as input. They may be single words or phrases. Return true if they anagrams of each other, else return false. These words may or may not be dictionary words.

Note that just having same characters do not qualify two phrases/strings to be anagrams. The number of characters should also be the same. For example in the second test case mississippi and mips share the same characters m, i, p, s. But these are not anagrams of each other. Some other good example of anagrams are:

fired & fried
gainly & laying
sadder & dreads
a decimal point & im a dot in place

Brute Force Method:

This is a very common problem asked in a lot of programming interviews. As a good rule of thumb, try to find a method by which a solution is guaranteed. We know for a fact that if two strings are anagram of each other, then they would have the same characters. The only thing different could be ordering. The word ordering gives a good clue on how to solve the problem. The brute force algorithm would look something like:

Sort str1 ignoring the case of all characters.
Sort str2 ignoring the case of all characters.
Compare the two sorted strings.
If they are same, then they are anagrams of each other and we can return true.

This method works perfectly and if you have no time boundary, it will provided you the right answer. But it would take up a long time in sorting the two strings. The best time complexity for sorting we know is via Quick Sort, and that gives us $O(n \log n)$ . Hence, this approach wouldn’t be viable for large work loads.

Optimized Approach:

Our brute force solution focused on the fact that the characters might be in jumbled up in both the strings. However, we can also take leverage of the fact that the all the characters are same and have same frequency in case of a valid anagram. We can create buckets for each character to track their counts. An efficient algorithm to solve the problem in this manner would be:

Algorithm:

Convert both the strings to lowercase such that we can make comparisons irrespective of the case.
Remove all the white-space and special characters if required. (You may be given strings that have special characters in them)
Create a bucket array of size 26. Each index represent a character of the English alphabet. 0 represents a, 1 represents b and so on.
Start scanning the first string. For each character, increment the count in the bucket array by 1 at the corresponding index.
Once this scan is complete, start a scan of the second string.
For each character, decrement the count in the bucket array by 1.
Iterate over the entire bucket array. If any of the buckets have a value greater than or less than 0, it means that the strings are not anagram.
If all elements are 0, then return true.

Code:

public boolean isAnagram(String str1, String str2) {

    // Convert both to lowercase to ignore case match
    str1 = str1.toLowerCase();
    str2 = str2.toLowerCase();

    // Strip of all the white spaces
    str1 = str1.replace(" ", "");
    str2 = str2.replace(" ", "");

    // Initialize the bucket array
    int[] counts = new int[26];

    // Fill the buckets
    for (int i = 0; i < str1.length(); i++) {
      counts[str1.charAt(i) - 'a']++;
    }

    // Empty the buckets
    for (int i = 0; i < str2.length(); i++) {
      counts[str2.charAt(i) - 'a']--;
    }

    // Check if all buckets are empty
    for (int count : counts) {
      if (count != 0)
        return false;
    }

    return true;
  }
Code language: Java (java)

Time Complexity: $O(n)$
Space Complexity: $O(1)$ since we wouldn’t be creating an array of more than size 26.

You can also find the code and test cases on Github.

Video Explanation:

Valid anagram strings

Terminology:

Problem Statement:

Brute Force Method:

Optimized Approach:

Algorithm:

Code:

Video Explanation:

Sharing is caring:

nikoo28

First unique character in a string

Algorithmic Paradigms – Divide and Conquer

You may also like

[HackerRank] – Two Characters

Given an array of characters formed with a’s...

Length of longest palindrome that can be built...