A complete rewrite is never a good idea. While this article gives a good behind-the-scenes look at what (could have) happened with Google Docs, it doesn’t really go into the details of why rewrites from scratch don’t work. Although it is almost 15 years old, my reference on the subject remains Joel’s excellent “Things You Should Never Do”. A must-read.
My favorite quote from that article:
“When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.”
This plugin features an extension of Lucene’s StandardTokenizer that allows the user to customize the word boundary property values for any Unicode character.
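To make the idea of per-character word boundary overrides concrete, here is a minimal sketch in Python. It is not the plugin's API and not Lucene code: the two-class model (`LETTER` / `BREAK`), the `tokenize` function, and the `overrides` mapping are all hypothetical simplifications of the much richer Unicode word-break property set.

```python
# Hypothetical, simplified word-break classes (an assumption for illustration;
# the real Unicode word-break property has many more values).
LETTER = "letter"
BREAK = "break"


def default_class(ch):
    """Default rule: letters and digits belong to words, everything else breaks."""
    return LETTER if ch.isalnum() else BREAK


def tokenize(text, overrides=None):
    """Split text into tokens; per-character overrides win over the default rule."""
    overrides = overrides or {}
    tokens, current = [], []
    for ch in text:
        cls = overrides.get(ch, default_class(ch))
        if cls == LETTER:
            current.append(ch)
        elif current:
            # A break character ends the token being built.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


default_tokens = tokenize("wi-fi c++ @home")
custom_tokens = tokenize(
    "wi-fi c++ @home",
    overrides={"-": LETTER, "+": LETTER, "@": LETTER},
)
```

With the default rule, punctuation splits or drops out of tokens; with the overrides, `-`, `+`, and `@` are treated as word characters, so `wi-fi`, `c++`, and `@home` each survive as a single token.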