I'm really intrigued. Working in the OCR field, this is something that we do a l...

jeltz · on Jan 13, 2014

This algorithm seems to be based on edit distance, so it should be a poor fit for the OCR field, since OCR rarely swaps letters.

darklajid · on Jan 13, 2014

I'm not quite sure what you're saying here. Swap letters as in transpositions? Yes, correct. Usually the errors are simple substitutions or deletions/insertions (which arguably might include 'swapping' a 1 for an l).

But the broader field I'm working in doesn't just hand you raw OCR output and leave it at that. Most projects here include a way for typists to correct recognition mistakes or fill in the missing pieces of information on a document. For that step (a human typist, often backed by a database of valid/expected values for the fields), transpositions aren't rare at all.
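To make the distinction concrete, here is a rough sketch comparing plain Levenshtein distance (substitute/insert/delete only) with the optimal string alignment variant, which also counts an adjacent transposition as one edit. The function names and sample strings are made up for illustration, and this is not the algorithm from the linked article, just the textbook dynamic-programming versions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with substitutions, insertions and deletions only."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment: Levenshtein plus adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Two adjacent characters swapped: count it as a single edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical OCR-style error: a 1 recognized instead of an l.
print(levenshtein("he1lo", "hello"), osa_distance("he1lo", "hello"))
# Hypothetical typist error: two adjacent letters swapped.
print(levenshtein("hlelo", "hello"), osa_distance("hlelo", "hello"))
```

For the substitution case both measures agree, while for the swapped letters plain Levenshtein charges two edits and the transposition-aware variant charges one. That is why the typist-correction step benefits from the latter even if raw OCR output does not.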