How to generalize my algorithm to detect if one string is a rotation of another

Question

So I've been going through various problems to review for upcoming interviews and one I encountered is determining whether two strings are rotations of each other. Obviously, I'm hardly the first person to solve this problem. In fact, I did discover that my idea for solving this seems similar to the approach taken in this question.

Full disclosure: I do have a related question on Math SE that's focused on the properties from a more mathematical perspective (although it's worth noting that the way that I tried to formulate the ideas behind this there end up being incorrect for reasons that are explained there).

Here's the idea (and this is similar to the approach taken in the linked question): suppose you have a string abcd and the rotation cdab. Clearly, both cd and ab are substrings of cdab, but if you concatenate them together you get abcd.

So basically, a rotation simply entails moving a substring from the end of the string to the beginning (e.g. we constructed cdab from abcd by moving cd from the end of the string to the beginning of the string).

I came up with an approach that works in a very restricted case (if both of the substrings consist of consecutive letters, like they do in the example there), but it fails otherwise (and I give an example of passing and failing cases and inputs/outputs below the code). I'm trying to figure out if it's possible (or even worthwhile) to try to fix it to work in the general case.

    public bool AreRotations(string a, string b)
    {
        if (a == null)
            throw new ArgumentNullException("a");
        else if (b == null)
            throw new ArgumentNullException("b");
        else if (a.Trim().Length == 0)
            throw new ArgumentException("a is empty or consists only of whitespace");
        else if (b.Trim().Length == 0)
            throw new ArgumentException("b is empty or consists only of whitespace");

        // Obviously, if the strings are of different lengths, they can't possibly be rotations of each other
        if (a.Length != b.Length)
            return false;

        int[] rotationLengths = new int[a.Length];

        /* For rotations of length -2, -2, -2, 2, 2, 2, the distinct rotation lengths are -2, 2
         * 
         * In the example I give below of a non-working input, this contains -16, -23, 16, 23
         * 
         * On the face of it, that would seem like a useful pattern, but it seems like this
         * could quickly get out of hand as I discover more edge cases
         */
        List<int> distinctRotationLengths = new List<int>();
        for (int i = 0; i < a.Length; i++)
        {
            rotationLengths[i] = a[i] - b[i];

            if (i == 0)
                distinctRotationLengths.Add(rotationLengths[0]);
            else if (rotationLengths[i] != rotationLengths[i - 1])
            {
                distinctRotationLengths.Add(rotationLengths[i]);
            }
        }

        return distinctRotationLengths.Count == 2;
    }

And now for the sample inputs/outputs:

        StringIsRotation rot = new StringIsRotation();

        // This is the case that doesn't work right - it gives "false" instead of "true"
        bool success = rot.AreRotations("acqz", "qzac");

        // True
        success = rot.AreRotations("abcdef", "cdefab");

        // True
        success = rot.AreRotations("ablm", "lmab");

        // False, but should be true - this is another illustration of the bug
        success = rot.AreRotations("baby", "byba");

        // True
        success = rot.AreRotations("abcdef", "defabc");

        //True
        success = rot.AreRotations("abcd", "cdab");

        // True
        success = rot.AreRotations("abc", "cab");

        // False
        success = rot.AreRotations("abcd", "acbd");

        // This is an odd situation - right now it returns "false" but you could
        // argue about whether that's correct
        success = rot.AreRotations("abcd", "abcd");

Is it possible/worthwhile to salvage this approach and have it still be O(n), or should I just go with one of the approaches described in the post I linked to? (Note that this isn't actually production code or homework, it's purely for my own learning).

Edit: For further clarification based on the comments, there are actually two questions here - first, is this algorithm fixable? Secondly, is it even worth fixing it (or should I just try another approach like one described in the answers or the other question I linked to)? I thought of a few potential fixes but they all involved either inelegant special-case reasoning or making this algorithm O(n^2), both of which would kill the point of the algorithm in the first place.


Show source
| c#   | string   | algorithm   2016-12-28 23:12 3 Answers

Answers ( 3 )

  1. 2016-12-29 00:12

    Would something like this work?:

    private bool IsRotation(string a, string b)
    {
        if (a.Length != b.Length) { return false; }
        for (int i = 0; i < b.Length; ++i)
        {
            if (GetCharactersLooped(b, i).SequenceEqual(a))
            {
                return true;
            }
        }
        return false;
    }
    
    private IEnumerable<char> GetCharactersLooped(string data, int startPos)
    {
        for (int i = startPos; i < data.Length; ++i)
        {
            yield return data[i];
        }
        for (int i = 0; i < startPos; ++i)
        {
            yield return data[i];
        }
    }
    

    P.S. This will return true for abcd = abcd, since you could consider it a full rotation. If this is not desired, change the start of the loop from 0 to 1 in the first function.

  2. 2016-12-29 01:12

    Let suppose the first string is S and the second is S', clearly if they have different length then we output they are not a rotation of each other. Create a string S''=SS. In fact concatenation of S to itself. Then if S,S' are rotation of each other we find a substring S' in S'' by KMP Algorithm in O(n), otherwise we output they are not a rotation of each other. BTW if you are looking for a fast practical algorithm then instead of KMP use Boyer Moore algorithm.

    To address the question more explicit, I'd say that I don't expect an easy algorithm for this special case of string matching problem. So having this background in mind, I don't think an easy modification on your algorithm can work. In fact the field of string matching algorithms is very well developed. If there is a somewhat simpler algorithm than sth like KMP or suffix tree based algorithms, for this special case, then still I think studying those general algorithms can help.

  3. 2016-12-29 06:12

    If you're looking just for a method that will check if a string is a rotation of another string, this is a C# implementation of the function you linked (which as far as I know is about the fastest way to solve this particular problem):

    bool IsRotation(string a, string b)
    {
        if (a == null || b == null || a.Length != b.Length)
            return false;
    
        return (a + a).Contains(b);
    }
    

    If you're asking for feedback on your algorithm, I'm not sure I understand what your algorithm is trying to do. It seems like you are trying to detect a rotation by storing the difference of the char values in the string and seeing if they sum to 0? Or if the list of unique differences contains mirror pairs (pairs (x,y) where x = -y)? Or simply if the number of unique differences is even? Or something else entirely that I am missing from your description?

    I'm not sure if what you're doing can be generalized, simply because it depends so heavily on the characters within the words that it may not adequately check for the order in which they are presented. And even if you could, it would be a scholarly exercise only, as the above method will be far faster and more efficient than your method could ever be.

◀ Go back