The larger a document is, the more likely it is that a word appearing exactly once in it is a typo.
The attached patch attempts to point out such singletons that are not recognized by the spell checker.
Issues with the patch:
- it shows false positives, e.g. Ceph appears multiple times in the cloud 3 deployment guide, but is still in the singleton list.
- it includes many word fragments, as we currently do not handle hyphenation well.
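
For illustration only (this is not the attached patch), here is a minimal sketch of the singleton idea in Python. The names `find_singletons` and `known_words` are hypothetical, and a plain word list stands in for the real spell checker. It counts words case-insensitively and re-joins words split by a hyphen at a line break, which are two plausible ways to address the issues listed above.

```python
import re
from collections import Counter

# A word is a run of letters, optionally joined by inner apostrophes or hyphens.
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def find_singletons(text, known_words):
    """Return words that occur exactly once and are not in known_words.

    known_words is a stand-in (assumption) for whatever the spell
    checker accepts; a real implementation would query the checker.
    """
    # Re-join words hyphenated across a line break ("deploy-\nment").
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Count case-insensitively so "Ceph" and "ceph" are one word.
    counts = Counter(w.lower() for w in WORD_RE.findall(text))
    known = {w.lower() for w in known_words}
    return sorted(w for w, n in counts.items() if n == 1 and w not in known)


sample = (
    "Ceph stores data. The ceph cluster uses a deploy-\n"
    "ment tool. Teh cluster is ready."
)
dictionary = ["stores", "data", "the", "uses", "a",
              "deployment", "tool", "is", "ready"]
print(find_singletons(sample, dictionary))
```

One limitation of this sketch: re-joining at line breaks also merges genuine compounds that happen to break at the hyphen (e.g. "well-\nknown" becomes "wellknown"), so real hyphenation handling would need more care.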
[ouch, is there no way to attach a file here?]