Validate GTIN Check Digits in Pure SQL
Your product table already has the barcodes. You do not need to export them to a script to find the broken ones — one SQL expression can audit every GTIN-8, UPC-A, EAN-13 and GTIN-14 in place.
What we are checking
Every GTIN ends in a check digit computed from the other digits with the GS1 mod-10 algorithm: starting from the right, digits are weighted alternately 3, 1, 3, 1… and a valid code's weighted sum (check digit included, with weight 1) is a multiple of 10. The full derivation, with worked examples, is in how check digits work; for a one-off code the GTIN check digit calculator is quicker. This guide is for the other situation: a products table with thousands of rows, some of them imported from spreadsheets years ago, and you want a query that flags the bad ones.
The one trick: pad to 14 first
The awkward part of doing this in SQL is that the weights are anchored to the right end of the number, while SUBSTRING counts from the left — and GTINs come in four lengths (8, 12, 13, 14). Write the weights left-to-right and they land on different digits depending on length.
The fix is standard GS1 practice: left-pad every code with zeros to exactly 14 digits. Zeros add nothing to a weighted sum, so validity is unchanged — and in a 14-character string the alternation always comes out the same way: positions 1, 3, 5, … 13 from the left carry weight 3, and positions 2, 4, … 14 carry weight 1 (the check digit, at position 14, gets 1, as it should). One fixed expression now covers all four formats.
Digit extraction is plain SUBSTRING(g, i, 1) — take 1 character at position i — cast to an integer. Remember SQL strings are 1-based: position 1 is the first character, a classic off-by-one trap if you are used to 0-based languages.
Portable version
This runs as-is on PostgreSQL and SQLite, and needs only the small substitutions listed below for MySQL, SQL Server and Oracle. It assumes a products table with a text column gtin:
SELECT gtin,
CASE
WHEN (
3 * CAST(SUBSTRING(g, 1, 1) AS INT) + CAST(SUBSTRING(g, 2, 1) AS INT) +
3 * CAST(SUBSTRING(g, 3, 1) AS INT) + CAST(SUBSTRING(g, 4, 1) AS INT) +
3 * CAST(SUBSTRING(g, 5, 1) AS INT) + CAST(SUBSTRING(g, 6, 1) AS INT) +
3 * CAST(SUBSTRING(g, 7, 1) AS INT) + CAST(SUBSTRING(g, 8, 1) AS INT) +
3 * CAST(SUBSTRING(g, 9, 1) AS INT) + CAST(SUBSTRING(g, 10, 1) AS INT) +
3 * CAST(SUBSTRING(g, 11, 1) AS INT) + CAST(SUBSTRING(g, 12, 1) AS INT) +
3 * CAST(SUBSTRING(g, 13, 1) AS INT) + CAST(SUBSTRING(g, 14, 1) AS INT)
) % 10 = 0 THEN 'valid'
ELSE 'bad check digit'
END AS gtin_status
FROM (
SELECT gtin,
SUBSTRING('00000000000000' || gtin, LENGTH(gtin) + 1, 14) AS g
FROM products
WHERE LENGTH(gtin) IN (8, 12, 13, 14)
) AS t;
Reading it inside-out:
- The inner query does the padding without
LPAD(which not every engine has): glue 14 zeros in front, then take 14 characters starting at positionLENGTH(gtin) + 1. For a 12-digit UPC that start position is 13, which keeps the last 2 of the zeros plus all 12 digits — a 14-character stringg. - The
WHEREkeeps only rows with a legal GTIN length. Anything else — including the infamous 11-digit UPC that lost its leading zero in Excel — is not a checksum problem, it is a length problem, and should be audited separately (WHERE LENGTH(gtin) NOT IN (8, 12, 13, 14)). - The big expression is the weighted sum written out once for the fixed 14-digit layout: odd positions ×3, even positions ×1.
% 10 = 0is the verdict. To list only the offenders, move the whole parenthesized sum into aWHERE … % 10 <> 0instead of theCASE.
Worked example: 036000291452
Take the UPC-A 036000291452. Padded to 14 it becomes 00036000291452, and the expression computes:
| Digit | 0 | 0 | 0 | 3 | 6 | 0 | 0 | 0 | 2 | 9 | 1 | 4 | 5 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 |
| Product | 0 | 0 | 0 | 3 | 18 | 0 | 0 | 0 | 6 | 9 | 3 | 4 | 15 | 2 |
Sum = 60, and 60 % 10 = 0, so the row comes back valid. Change the last digit to 3 and the sum is 61 — bad check digit. Running the query above over the test set 036000291452, 4006381333931 (EAN-13), 96385074 (EAN-8) and 10036000291459 (GTIN-14) marks all four valid, and flags 036000291453.
PostgreSQL flavor
Postgres can replace the fourteen spelled-out terms with generate_series, use LPAD for the padding, and guard against stray non-digit characters with a regex (on Postgres, CAST('x' AS INT) raises an error rather than returning something — filter first):
SELECT gtin,
CASE
WHEN gtin !~ '^([0-9]{8}|[0-9]{12,14})$' THEN 'bad length or character'
WHEN (
SELECT SUM( (CASE WHEN i % 2 = 1 THEN 3 ELSE 1 END)
* SUBSTRING(LPAD(gtin, 14, '0') FROM i FOR 1)::int )
FROM generate_series(1, 14) AS i
) % 10 = 0 THEN 'valid'
ELSE 'bad check digit'
END AS gtin_status
FROM products;
The correlated subquery walks positions 1–14, applies weight 3 to odd i and 1 to even i, and sums — exactly the arithmetic of the portable version, just generated instead of typed. Because CASE branches evaluate lazily, rows that fail the regex never reach the cast.
Other engines, briefly
- MySQL / MariaDB:
||is logical OR by default — pad withLPAD(gtin, 14, '0')instead, and cast withCAST(… AS SIGNED)(MySQL's CAST does not acceptINT). - SQL Server: pad with
RIGHT(REPLICATE('0', 14) + gtin, 14), useLEN()for length;%andSUBSTRINGwork as written. - Oracle:
SUBSTRinstead ofSUBSTRING,MOD(sum, 10)instead of%,TO_NUMBERinstead of the cast;LPADand||are native.
Edge cases that actually bite
- Leading zeros. If
gtinis anINTorBIGINTcolumn, every leading zero is already gone:036000291452was stored as36000291452, 11 digits. The padding step happens to repair the checksum math, but downstream systems (and GS1) will still see a malformed code. Store barcodes asVARCHAR. The same corruption happens upstream in spreadsheets — see fixing leading zeros in Excel. - Dirty input. Scanned or hand-keyed data carries spaces and dashes. Clean before checking:
REPLACE(REPLACE(gtin, ' ', ''), '-', ''), or Postgresregexp_replace(gtin, '[^0-9]', '', 'g')— but log what you cleaned; silent normalization hides data-quality problems. - Non-digit characters. Strict engines throw on the cast; SQLite quietly turns
CAST('X' AS INT)into 0 and may call a corrupt code valid. The length filter alone does not catch03600029145X— add a digits-only check (Postgres regex above; SQL Servergtin NOT LIKE '%[^0-9]%'). - The classic mistake: applying 3, 1, 3, 1… from the left without padding. It works for the even lengths (8, 12 and 14) and silently mis-validates only EAN-13, because 13 is odd and the weight parity flips. If your results look wrong for exactly the 13-digit codes, this is why.
Checksum is not a registry
A row marked valid is arithmetically self-consistent — nothing more. About 10% of random digit strings of a legal length pass mod-10 by chance, and a passing GTIN may never have been assigned by GS1 to any product. Treat this query as the cheap first gate of an audit (it is very good at catching typos and truncations); verifying that a code is genuinely licensed is a registry question, as explained in our methodology.
Validating at scale?
If the audit is recurring, or the data lives outside your database, the CodeClassify API validates up to 100 barcodes per call and also returns the format and the corrected check digit for failures:
curl -X POST "https://codeclassify-api.rosariovitale0096.workers.dev/v1/gtin/validate" \
-H "X-Api-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"codes":["036000291452","4006381333931","036000291453"]}'
{"ok":true,"count":3,"valid":2,"results":[
{"input":"036000291452","code":"036000291452","format":"UPC-A","valid":true},
{"input":"4006381333931","code":"4006381333931","format":"EAN-13","valid":true},
{"input":"036000291453","code":"036000291453","format":"UPC-A","valid":false,
"expected_check_digit":2,"corrected":"036000291452"}]}
The free tier includes 10 calls per month — at 100 codes per call, that is a thousand barcodes for nothing. Get a key at code-classify.com/api. For a one-off file, the browser-based bulk barcode validator needs no key at all.
Check a single barcode right now
Paste any UPC, EAN or GTIN-14 into the GTIN check digit calculator to verify or compute its final digit — no query required.
FAQ
Why pad the GTIN to 14 digits before applying the weights?
The GS1 weights (3, 1, 3, 1, …) are anchored to the right end of the number, but SQL's SUBSTRING counts positions from the left. Left-padding every GTIN with zeros to exactly 14 characters lines the two up: in a 14-digit string, odd positions from the left always carry weight 3 and even positions weight 1. The padding zeros contribute nothing to the sum, so one fixed expression validates GTIN-8, UPC-A, EAN-13 and GTIN-14 alike.
Can I validate GTINs stored in a numeric column?
You can, by casting to text first, and the left-padding step even restores the leading zeros the numeric type discarded, so the math still works. But it is a design smell: numeric columns silently strip leading zeros, and a UPC-A that starts with 0 becomes an 11-digit number that other systems will reject. Store barcodes as VARCHAR or TEXT, and treat the numeric column as something to migrate away from.
Does a passing check digit mean the barcode is real?
No. The check digit only proves the number is internally consistent — it was not obviously mistyped or truncated. Roughly one in ten random digit strings of the right length passes mod-10 by chance. It does not prove GS1 ever assigned that GTIN to a product, or that the company prefix is licensed. Checksum validation in SQL is a cheap first audit; registry verification is a separate step.
This guide is for general information only. Check digit validation confirms internal consistency of a number, not that a code is officially registered, assigned, or in use. For barcode assignment and licensing consult GS1.