> Unfortunately — and Shel was well aware of this very quickly, but of course by that time, it was too late — ISBNs are terribly abused in the United States. The company that issues ISBNs, Bowker, charges a lot of money for ISBNs (from the perspective of small publishers, anyway), and publishers don’t necessarily read all the rules. Small publishers were re-using ISBNs, and they also took their range of ISBNs and numbered through the entire range, rather than respecting the rule that the final character is actually a checksum, and you can only iterate through some of the digits. (It’s actually worse than just not using the last digit, but I’m not getting into that here.)
If there's a single company issuing them how are there companies using ones with invalid checksums?
Because you get issued a range and you actually do whatever you want within that range. Like IP addresses.
What you do within that range is up to you; you don't have to ask the authority for every single number within the range. As a simplification: if you get the numbers from 100 to 200 to assign to your books but the last digit is a check digit, you didn't get 100 numbers to choose from. You got to choose from 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and the third digit is the check digit, which is computed from the other digits. With ISBNs (and other numbering schemes) the numbers are just larger, of course.
Going back to the IP address analogy: if you are issued 11.0.0.0/8, you can use anything from 11.0.0.1 to 11.255.255.255 but you don't have to start with 11.0.0.1 and you don't have to treat it as one big /8 either. You can subdivide it into multiple /28s for example.
The only difference with ISBNs (or anything else that has a check digit) is that the last digit is not freely usable. It's just a check digit. It's not really part of the range, so to speak. It's like subdividing your /8 into lots of /28s but ignoring the fact that the network address of each /28 is reserved, and having one of the computers in your network use it as its actual IP address. That's obviously going to wreak some havoc, because part of your network thinks it's the network address of one of the /28s and the rest thinks it's an individual computer.
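To make the analogy concrete, here is a rough sketch using Python's standard `ipaddress` module. The 11.0.0.0/8 block is taken from the comment above; the variable names are just for illustration. It carves the /8 into /28s and shows that the network address of the first /28 is not among its usable host addresses.

```python
import ipaddress

# The block from the comment above; any /8 would behave the same way.
block = ipaddress.ip_network("11.0.0.0/8")

# Carve the /8 into /28s and look at the first one.
first_subnet = next(block.subnets(new_prefix=28))
network_address = first_subnet.network_address

# hosts() leaves out the network and broadcast addresses.
usable_hosts = list(first_subnet.hosts())

# A /8 split into /28s gives 2**(28 - 8) subnets.
subnet_count = 2 ** (28 - block.prefixlen)

print(first_subnet)                      # the first /28
print(network_address in usable_hosts)   # is the network address assignable?
print(len(usable_hosts), subnet_count)
```

The check digit plays the role of the reserved address here: it is inside the numeric range, but it is not something you get to hand out.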
To elaborate on your example, if the last digit cast out nines from the previous two digits, valid numbers are 101 (1+0=1), 112, 123, 134, 145, 156, 167, 178, 189, 191 (1+9=10, 1+0=1), 202. So for example 102 is NOT a valid number — it doesn't exist! And if it is used, and someone somewhere tries to validate it, it will throw an error.
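A quick sketch of that toy scheme in Python, in case it helps. This is only the simplified two-digits-plus-check example from this thread, not the real ISBN algorithm, and the function names are made up.

```python
def toy_check_digit(prefix: int) -> int:
    """Repeatedly sum the decimal digits until one digit is left (18 -> 9, 19 -> 1)."""
    total = sum(int(ch) for ch in str(prefix))
    while total > 9:
        total = sum(int(ch) for ch in str(total))
    return total


def toy_is_valid(number: int) -> bool:
    """Split off the last digit and compare it with the computed check digit."""
    prefix, check = divmod(number, 10)
    return toy_check_digit(prefix) == check


# Every valid three-digit number whose first two digits are 10 through 20.
valid_numbers = [n for n in range(100, 203) if toy_is_valid(n)]
print(valid_numbers)
print(toy_is_valid(102))
```

Out of the 103 numbers between 100 and 202, only 11 pass, which is the whole point: a publisher who numbers straight through the range is mostly producing invalid numbers.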
Hence the article says:
> Shel very quickly removed all ‘checksum software checks’ (which would have made sure it was a legal ISBN)
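For reference, the kind of check being removed would look roughly like this. This is a minimal sketch of the standard ISBN-10 (weighted sum mod 11, with X standing for 10) and ISBN-13 (alternating weights 1 and 3, mod 10) rules, not whatever code Shel actually had. The function names are mine, and 0-306-40615-2 is just a commonly cited example ISBN.

```python
DIGITS = "0123456789"


def _strip(isbn: str) -> str:
    """Drop the dashes and spaces that are only there for readability."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_is_valid(isbn: str) -> bool:
    chars = _strip(isbn)
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10            # X is only allowed as the check character
        elif ch in DIGITS:
            value = int(ch)
        else:
            return False
        total += (10 - position) * value   # weights 10, 9, ..., 1
    return total % 11 == 0


def isbn13_is_valid(isbn: str) -> bool:
    chars = _strip(isbn)
    if len(chars) != 13 or any(ch not in DIGITS for ch in chars):
        return False
    # Weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(chars))
    return total % 10 == 0


print(isbn10_is_valid("0-306-40615-2"))
print(isbn10_is_valid("0-306-40615-3"))   # same prefix, next "number" in the range
print(isbn13_is_valid("978-0-306-40615-7"))
```

A publisher counting 0-306-40615-2, 0-306-40615-3, 0-306-40615-4 and so on would fail this check almost every time, so a catalogue that wants to accept those books has to switch it off.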
I find it incredibly fascinating (and a little bit depressing) that publishers would twist the system in that way, but maybe it has to be expected if something is sold for a large amount of money.
In France ISBN are free; there is a central agency (Electre) that issues them but they don't charge for it (at least not to small publishers).
But EAN/UPC codes are issued by a different agency, GS1, that does charge a recurring fee, and it is expensive and "infinite" (you're supposed to keep paying for as long as you're using the codes). But even they don't charge for the range size; the fee is based on the turnover of the company buying the codes.
For some reason, ISBNs having check digits seems really strange to me. I wonder why. I am sure that Australia's smart meter numbering scheme also uses a check digit, but when I learned that, it didn't strike me as odd. Maybe the association with books and "old stuff" makes it seem incongruous to me.
In general, are check digits common in schemes like these? Is it something you'd do if you expect lots of manual copying, or scanning or other potentially-error-producing conversions?
I get it for a smart meter, you don't want to mix up numbers and provide wrong readings. And they could be read manually and entered into a system, even if they're "smart".
But for books, the worst that can happen is that you get the wrong book. Maybe back in the days when people ordered books via phone it was good to have a checksum, but even then the seller would probably read back the name of the book. So I don't really see how it's necessary here.
Lots of things have check digits that may no longer need them. Credit card numbers have a check formula, for example. It was important when doing things by hand was common and barcode readers were perhaps weak.
Before everything was "online", a bookstore would order books from a wholesaler, often on paper. Giving the wrong number would result either in order forms being sent back and forth or in the wrong books being delivered, which then had to be returned (if that was possible at all). So having check digits seems like a clever idea to reduce errors that lead to unneeded transactions.
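The credit card formula is, as far as I know, the Luhn algorithm. Here is a small sketch of it; the function name is mine, and 4111 1111 1111 1111 is a widely used test number rather than a real card.

```python
def luhn_is_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if len(digits) < 2:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:        # every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9        # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))
print(luhn_is_valid("4111 1111 1111 1112"))   # one mistyped digit
```

It catches every single-digit typo, which is exactly the kind of error you get from hand entry.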
And International Bank Account Numbers (IBANs) have two check digits, but at the front, not at the back. And having check digits is important, because it might not be easy to get your money back, if transferred to some unintended account.
The IBAN consists of up to 34 alphanumeric characters comprising a country code; two check digits; and a number that includes the domestic bank account number, branch identifier, and potential routing information. The check digits enable a check of the bank account number to confirm its integrity before submitting a transaction. https://en.wikipedia.org/wiki/International_Bank_Account_Num...
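For the curious, the IBAN check works by moving the first four characters to the end, replacing letters with numbers (A=10 ... Z=35) and testing that the result is 1 modulo 97. A rough sketch follows, with a made-up function name and the example IBAN from the Wikipedia article. It only does the mod-97 test and does not check country-specific lengths or formats.

```python
import string

ALLOWED = string.digits + string.ascii_uppercase


def iban_is_valid(iban: str) -> bool:
    """Mod-97 check only; country-specific length and format rules are ignored."""
    chars = iban.replace(" ", "").upper()
    if not (5 <= len(chars) <= 34) or any(ch not in ALLOWED for ch in chars):
        return False
    # Move country code and check digits (first four characters) to the end.
    rearranged = chars[4:] + chars[:4]
    # int(ch, 36) maps 0-9 to 0-9 and A-Z to 10-35.
    as_digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(as_digits) % 97 == 1


print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))
print(iban_is_valid("GB82 WEST 1234 5698 7654 33"))   # last digit mistyped
```

With two check digits and a modulus of 97, a random typo has only about a 1 in 97 chance of slipping through, versus roughly 1 in 10 for a single check digit.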
I always assumed the check digit was included as safeguard for barcode scanning.
Upon a quick reading of the Wikipedia article on EAN-13 it appears that there is some parity to determine the direction of the scan (i.e. upside-down or upright) but the numbers are encoded as-is without any further error correction contrary to e.g. QR codes.
Alas, for a definitive answer on the reasoning one might have to buy the relevant ISO standard, if it is even described there.
Given that human adults are said to be able to keep 5-9 items in working memory, I would think it prudent to include a check digit in the ISBN, since people may have to enter it manually into search masks, forms, etc., and someone who only hears it (e.g. over the phone) doesn't benefit from the visual grouping provided by the interspersed dashes.
On the topic of correctly copying (manually):
Whenever I have to be certain that I have correctly transcribed a long, important number or string, I finish by covering up both the source and the transcription. I then reveal one character from the end of the source, then the second to last, and verify that the last and the second-to-last characters of the transcription match (in that order!). Then I uncover the next two on the transcription, verify them against the source, continue with the source, and so on.
This helps immensely to discover transposed digits (especially since my mother tongue is German where we swap the tens and ones while speaking (or thinking with the inner voice) a 2-digit number) and not glance over well known digit groups such as zip code or special dates, etc.
The Finnish identity code, sort of like a US social security number, has had a check digit since 1971 (when the system was digitized). UPC barcodes are from 1974, and have checksums. The ISBN system is from 1970. I'd argue check digits were definitely best current practice in the 1970s.