OCR villains
From MobileRead
This page lists many of the typical OCR errors found when proofreading a book. Spell checking catches some of them and grammar checking a few more, but others just need a keen eye. In some cases you can search the document for a known error pattern and replace the occurrences that don't belong.
Numbers, symbols and letters
0 <--> O {zero <--> uppercase o}
1 l I i ! <--> each other {digit one, lowercase L, uppercase i, lowercase i, exclamation mark}
2 <--> Z
5 <--> S
6 <--> G {uppercase G}
7 <--> ? {question mark}
7 and / = I {uppercase I in italic}
] = J {square bracket = uppercase J}
  ]ane = Jane
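Several of the pairs above leave a digit or a bracket stuck to letters, which is easy to search for. The following is a minimal Python sketch, not a definitive tool: the function name `find_suspect_tokens` and the choice of characters to flag are assumptions made for illustration. It only reports tokens for a human to review, and it will also flag legitimate text such as "1st" or "2nd".

```python
import re

# A token is suspicious when one of the digits listed above touches a letter
# ("hell0", "w0rld") or when a closing square bracket starts a word ("]ane").
# Purely numeric tokens such as "1925" contain no letter and are ignored.
SUSPECT = re.compile(r"[A-Za-z][0125679]|[0125679][A-Za-z]|\][A-Za-z]")


def find_suspect_tokens(text):
    """Return whitespace-separated tokens that mix letters with likely OCR stand-ins."""
    return [token for token in text.split() if SUSPECT.search(token)]


if __name__ == "__main__":
    sample = "He met ]ane and said hell0 to the w0rld in 1925."
    print(find_suspect_tokens(sample))  # [']ane', 'hell0', 'w0rld']
```

Characters such as ! ? and / are left out of the pattern on purpose, because they sit next to letters in ordinary text far too often to be useful as a search.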
Letters
e <--> c
  are <--> arc
cl <--> d
  clock <--> dock
  close <--> dose
f ligatures are confused: ff, fi, fl, ffi
h <--> b
  back <--> hack
  harrow <--> barrow
H = ll
  weH = well
H or h = li
  Hbrary = library
  hke = like
hn = lm
  ahnost = almost
j <--> J {lowercase <--> uppercase J}
  jane = Jane
  Jury = jury
rn <--> m
  Mom <--> Morn
  stem <--> stern
  earnest = camest {this also had the e=c combo}
  modem = modern
  corner = comer
ri <--> n
  arid <--> and
r = f
  ringers = fingers
m <--> in
  stein <--> stem
  rmg = ring
  inoth = moth
im <--> un
  unport = import
  imdone = undone
n <--> u
  bnt = but
  teut = tent
  uest = nest
ii = u
  iinder = under
B <--> R {uppercase}
  DEABEST = DEAREST
  Robby <--> Bobby
F <--> P {uppercase}
  Full <--> Pull
ih = th
  feaiher = feather
di = th {weird, but it happens a lot}
  die = the
tii = th
  tiie = the
tli = th
  tlie = the
Tm = "I'm (also with no leading quote)
T = I {uppercase i}
U = double ell, li, il
  WeU = Well
  Ufe = life
  untU = until
vv = w
  vvhen = when
\V = W
y <--> v
  yery = very
  verv = very
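Many of the letter swaps above produce strings that are not words at all (tlie, vvhen, ahnost), so they can be found by testing each unknown word against the substitution list. The sketch below is one possible way to do that in Python, under stated assumptions: `SUBSTITUTIONS` is a hand-picked subset of the pairs above, and `suggest` and the `vocabulary` set are hypothetical names for illustration. In practice the vocabulary would come from a real word list. Swaps that turn one real word into another (modem/modern, die/the, arid/and) cannot be caught this way and still need a careful read.

```python
# (OCR reading, intended text) pairs, a subset of the list above.
SUBSTITUTIONS = [
    ("tli", "th"), ("tii", "th"), ("ih", "th"),
    ("vv", "w"), ("ii", "u"), ("hn", "lm"),
    ("H", "ll"), ("H", "li"), ("h", "li"),
    ("U", "ll"), ("U", "li"), ("U", "il"),
    ("rn", "m"), ("m", "rn"),
]


def suggest(word, vocabulary):
    """Return corrections for `word` that a single substitution turns into a known word.

    `vocabulary` is a set of lowercase words. Known words yield no suggestions.
    """
    if word.lower() in vocabulary:
        return []
    found = []
    for bad, good in SUBSTITUTIONS:
        start = word.find(bad)
        while start != -1:
            # Replace only this one occurrence and test the result.
            candidate = word[:start] + good + word[start + len(bad):]
            if candidate.lower() in vocabulary and candidate not in found:
                found.append(candidate)
            start = word.find(bad, start + 1)
    return found


if __name__ == "__main__":
    # Tiny illustrative vocabulary; a real run would load a full word list.
    vocabulary = {"the", "well", "when", "almost", "until", "library", "under"}
    for word in ["tlie", "WeU", "vvhen", "ahnost", "untU", "Hbrary", "the"]:
        print(word, suggest(word, vocabulary))
```

Returning suggestions instead of replacing text automatically keeps the final decision with the proofreader, which matters because several patterns (such as U or H) have more than one possible reading.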
Punctuation errors
/' = ," or .” {or single quote}
* = quote mark
** *' '* '' = " {two single quotes, should be a double quote}
Space following an opening quote mark.
Space preceding a closing quote or punctuation mark.
  He did this ; then he did that ; then he said : “ You aren’t ready ! ”
Apostrophe goes missing, stranding the last letter.
  I m = I’m, don t = don’t, Bob s = Bob’s
The following often occur with a "Smarten Punctuation" action:
  Backward quote marks:
    ” close quote at start of paragraph
    “ open quote at end of paragraph
  Reversed single and double quotes in nested quotations:
    “And I said to him, ‘Quit that!”’
    ‘“O what a tangled web we weave,’” she said.
  ’ Right single quote should replace a "straight" apostrophe, not ‘ left single quote. This happens often at the start of a word:
    ‘em should be ’em, ‘tis should be ’tis
- Hyphenation problems. The source has a hyphen where a word breaks at the end of a line, but the hyphen is left in when the document reflows. (A search can usually find these.)
) with a space in front. Sometimes ( will have a space after it. Search for these.
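The spacing, apostrophe and hyphenation problems above lend themselves to regular-expression searches. The following Python sketch shows one way to collect them into a review list. The check labels, the patterns and the function name `find_punctuation_issues` are assumptions chosen for illustration, not an established tool, and the hyphen and stranded-letter checks will also match legitimate text (for example "well-known"), so the output is a list to inspect, not a list to fix blindly.

```python
import re

# Each check pairs a human-readable label with a pattern to search for.
CHECKS = [
    ("space before punctuation", re.compile(r"\s[;:!?,.]")),
    ("space after opening quote", re.compile(r"“\s")),
    ("space before closing quote", re.compile(r"\s”")),
    ("space inside parenthesis", re.compile(r"\(\s|\s\)")),
    # A lone m, t, s, ll, re, ve or d after a word suggests a lost apostrophe.
    ("stranded letter after missing apostrophe",
     re.compile(r"\b[A-Za-z]+ (?:m|t|s|ll|re|ve|d)\b")),
    # A hyphen inside a word may be a leftover end-of-line break.
    ("hyphen inside word", re.compile(r"\b[A-Za-z]+-\s?[a-z]+\b")),
]


def find_punctuation_issues(text):
    """Return (label, matched text) pairs for every check that fires."""
    issues = []
    for label, pattern in CHECKS:
        for match in pattern.finditer(text):
            issues.append((label, match.group(0)))
    return issues


if __name__ == "__main__":
    sample = "He did this ; then he said : “ You aren t ready ! ” ( see the docu-ment )"
    for label, found in find_punctuation_issues(sample):
        print(f"{label}: {found!r}")
```

Curly quotes are matched literally, so the sketch assumes the text has already been through a "Smarten Punctuation" step. For straight quotes the direction of a quote mark cannot be told from the character alone, and the quote checks would need a different approach.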