Very minor but interesting nitpick: the font used on checks is not OCR (optical) but MICR (magnetic ink). The design objectives are different and different font families exist for the two purposes. MICR as used on checks (more properly called E-13B) bears unusual, distinctive character shapes emphasizing abnormally wide horizontal components due to the need for each character to have a distinctive waveform when read as density from left to right, essentially by a tape recorder read head. Fonts optimized for OCR are usually more normal looking to humans because they emphasize clear detection of lines instead.
E-13B is a bit of an ideal use case for this method because of the highly constrained character set used on checks and the unusually nonuniform density of E-13B. The same thing can be done on text more generally but gets significantly more difficult.
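To make the "waveform" idea concrete, here's a toy Python sketch: summing the ink in each vertical slice of a glyph bitmap approximates what the magnetic read head sees as the character moves past it left to right. The 5×7 bitmaps here are made-up stand-ins for illustration, not the real E-13B glyph shapes.

```python
# Toy 5-wide x 7-tall glyph bitmaps ("1" = ink). These are illustrative
# placeholders, NOT the actual E-13B character designs.
GLYPHS = {
    "0": [
        "01110",
        "10001",
        "10001",
        "10001",
        "10001",
        "10001",
        "01110",
    ],
    "1": [
        "00100",
        "01100",
        "00100",
        "00100",
        "00100",
        "00100",
        "01110",
    ],
}

def waveform(glyph):
    """Ink density per vertical slice, read left to right -- roughly the
    signal a single read head produces as the glyph sweeps past it."""
    return [sum(row[x] == "1" for row in glyph) for x in range(len(glyph[0]))]

for name, glyph in GLYPHS.items():
    print(name, waveform(glyph))
```

The design goal of E-13B is that these one-dimensional profiles are maximally distinct from each other, which is why the glyphs have those abnormally wide horizontal bars.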
But I doubt anyone would take the hassle of doing OCR, because it is still a lot of work to spot mistakes (which still happen, even with standard fonts).
Also, who would want to lower the perceived visual quality of their resume? And scanning a document means just that: it will still be readable, yes, but you can see that it was scanned in.
This seems to be the worst of both worlds. It's not easy for a human to read a square (compared to a line of text). The pixelated font is also not as easy to read as a vector font. And it's not easy for machines to read an optical encoding with no spatially distributed redundancy.
QR codes and bar codes are brilliant for machines because misreads due to a spurious reflection or speck of dust are mitigated by error correction.
I feel like this problem is already well served by bar codes which have a human readable text representation below them (e.g. serial number stickers).
That said, I can see the security advantage of the computer reading the same representation as a human, although this is probably not the best place to enforce security. Since there's no integrity check, there's little guarantee the computer will actually read what you see. Maybe linear OCR combined with a barcode checksum would be a better way to achieve these goals.
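For illustration of what such a checksum buys you, here is the standard UPC-A check-digit scheme: digits alternate between weights 3 and 1, and the check digit brings the weighted sum to a multiple of 10. Since both weights are units mod 10, any single misread digit changes the sum and fails verification. This is just the well-known barcode algorithm, not a proposal for how the font in question works.

```python
def upc_check_digit(digits11):
    """Compute the UPC-A check digit for the first 11 digits.
    Odd positions (1-indexed) get weight 3, even positions weight 1."""
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(digits11))
    return (10 - total % 10) % 10

def verify(code12):
    """Check a full 12-digit UPC-A code against its check digit."""
    digits = [int(c) for c in code12]
    return upc_check_digit(digits[:-1]) == digits[-1]

print(verify("036000291452"))  # valid code -> True
print(verify("036000291453"))  # one digit off -> False
```

A scheme like this catches every single-digit misread, which is exactly the failure mode OCR is prone to.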
Not sure I get the point of this. On digital documents it makes no difference at all, and if the NSA (or whoever else) really wanted to digitize printed documents written in this font, they could surely just make some minor modifications to their OCR to support it.
Not to mention that the kind of OCR software available to secret services is probably much better than what is on the consumer market.
The closest is: "He said the checks Thomas presented displayed a watermark that read VOID when they were scanned in a web viewer."
This is a feature of some security paper - https://en.wikipedia.org/wiki/Void_pantograph says "In security printing, void pantograph refers to a method of making copy-evident and tamper-resistant patterns in the background of a document. Normally these are invisible to the eye, but become obvious when the document is photocopied. Typically they spell out "void", "copy", "invalid" or some other indicator message"
What it means is that their system wasn't scanning at a high enough resolution, which should be expected. So either they are extremely poorly trained, or they are looking for an excuse.
Note the following from the freep article:
> According to TCF's Wennerberg ... Thomas wanted to deposit the two larger checks in his bank account, which, Wennerberg said, had only 52 cents in it. And he wanted to cash the $13,000 check,
Does your bank generally tell people how much is in your account, and what your bank transactions are?
How accurate is this? I tried OCR'ing my receipts a while back, and the misrecognized numbers were too frequent to be worth it. Unlike text, where a few typos are unlikely to change the meaning much, a single wrong digit on a receipt can be a big issue, and receipt printing isn't exactly high fidelity.
This looks like something I'd definitely use (and pay for, at that price).
I have been wondering the same thing. So many OCR engines spit out results that are obviously wrong, and while I don't want them to get too clever, a little bit of smarts would go a long way.
I guess that could be a form of lossy text compression, where the end result is not completely right (the letters not all being in the right order) but it's good enough to be able to read the text.
So the use case is primarily to produce something machine readable rather than human readable (it's not that unreadable, but still). I can see that. Is there a human-readable flag?