Gmail’s auto-quoting feature had me confused for a while, and I thought it was just broken. Parsing MIME messages can be tricky, so I didn’t really mind, but bugs are still annoying. For example, take a look at the stray line below, the source of our mutual confusion.
It’s marked in purple, which indicates it’s quoted from an earlier message, but there are no quote chars (“>”) like in the quote block above, so it looks a bit random. After some head-scratching and scrutiny, I think I know what’s going on.
It appears that Gmail, for every line in every message, scans backwards through the earlier messages in the conversation, and if an identical line is found, the line is considered a quote and marked as such.
At first I thought leading quote chars were a magic giveaway, but Gmail actually notices modifications in quoted lines, so things like snipped citations are rendered as content, not quotes. The duplicate detection seems to be the only rule at work.
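To make the guess concrete, here is a minimal Python sketch of the heuristic as I understand it. It is not Gmail’s actual code, just my reading of the behaviour. The function names `normalize` and `mark_quoted_lines` are made up for illustration, and stripping leading “>” chars before comparing is my own assumption, needed so that a conventionally quoted line still matches its original.

```python
import re

def normalize(line):
    # Assumption: drop leading ">" quote markers and surrounding
    # whitespace so "> foo" compares equal to "foo".
    return re.sub(r'^(\s*>)+\s*', '', line).strip()

def mark_quoted_lines(conversation):
    """conversation: list of message bodies (strings), oldest first.

    Returns one list per message holding (line, is_quoted) pairs.
    A line counts as quoted if an identical (normalized) line
    appeared in any earlier message of the conversation.
    """
    seen = set()   # normalized lines from all earlier messages
    result = []
    for body in conversation:
        lines = body.splitlines()
        marked = []
        for line in lines:
            key = normalize(line)
            # Blank lines are never treated as quotes.
            marked.append((line, bool(key) and key in seen))
        result.append(marked)
        # Only now add this message's lines, so a message
        # cannot quote itself.
        seen.update(normalize(l) for l in lines if normalize(l))
    return result
```

With this rule, a line retyped verbatim without any “>” in front still gets marked as a quote, which would explain the stray purple line, while a quoted line that has been edited or snipped no longer matches anything and shows up as regular content.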
I wonder if the same idea could be used to detect duplication in source code. Duplication comes in many guises, but simple line-by-line equality would be a nice first thing to detect in a code-base, on either a per-file or per-project basis. It would be interesting to see duplicate lines marked in Visual Studio, for example.
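As a rough first step, something like the following Python sketch might do. The name `find_duplicate_lines` and the `min_length` cut-off are my own inventions for illustration. The cut-off is an assumption meant to skip trivial lines such as a lone brace, which would otherwise match everywhere. Lines are compared after stripping indentation.

```python
from collections import defaultdict

def find_duplicate_lines(files, min_length=10):
    """files: dict mapping a file name to its source text.

    Returns a dict mapping each duplicated line (whitespace-stripped)
    to the list of (file name, line number) places it occurs.
    """
    locations = defaultdict(list)
    for name, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            key = line.strip()
            # Assumption: very short lines are noise, not duplication.
            if len(key) >= min_length:
                locations[key].append((name, number))
    # Keep only lines that occur more than once.
    return {key: where for key, where in locations.items() if len(where) > 1}
```

Passing a single file gives the per-file view, and passing every file in a project gives the per-project view. The resulting line numbers are what an editor would need in order to highlight the duplicates.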