Validate EU VAT Number Format in Python
There is no single "EU VAT format" — there are 28 of them. Here is a tested Python validator with the exact structure rules for every member state plus Northern Ireland's XI prefix.
What an EU VAT number looks like
Every VAT identification number starts with a two-letter country prefix followed by a country-specific body: Germany uses 9 digits, Italy 11, the Netherlands wedges a literal B into position 10, and Austria insists on a leading U. Because each tax authority invented its own scheme, a single generic regex like [A-Z]{2}[0-9A-Z]{8,12} will happily accept numbers that no country could ever have issued. The only correct approach is a per-country pattern table.
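To make the problem concrete, here is a small sketch that runs the generic pattern against two strings no tax authority could have issued. Both sample inputs are invented for illustration; only the regexes come from the text.

```python
import re

GENERIC = r"[A-Z]{2}[0-9A-Z]{8,12}"

# Invented samples: "ZZ" is not a VAT prefix, and a German body
# is 9 digits long, not 12.
samples = ["ZZ12345678", "DE123456789012"]

# The generic pattern accepts both impossible numbers.
generic_hits = [s for s in samples if re.fullmatch(GENERIC, s)]
print(generic_hits)

# A per-country rule (Germany: exactly 9 digits) rejects the second one.
german_body_ok = re.fullmatch(r"\d{9}", "123456789012") is not None
print(german_body_ok)
```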
Unlike barcodes or IBANs, VAT numbers share no common check-digit scheme. Some countries define their own checksum (Italy's 11-digit partita IVA passes the Luhn test, for example; see how check digits work), but the algorithms differ from state to state. So in Python the realistic first goal is a strict format check. If you just want to test one number right now, the free VAT number validator does this in the browser.
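For the Italian case mentioned above, a Luhn test can be sketched in a few lines. The helper names `luhn_ok` and `italian_body_ok` are made up for this example, and the sample body is a test value chosen only because its digits pass the check; passing says nothing about registration.

```python
def luhn_ok(digits: str) -> bool:
    """Return True if an ASCII digit string passes the Luhn test."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left and double every second digit, starting with
    # the digit next to the check digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def italian_body_ok(body: str) -> bool:
    """Hypothetical helper: 11 digits that also pass the Luhn test."""
    return len(body) == 11 and luhn_ok(body)


print(italian_body_ok("00743110157"))  # True: the Luhn sum is 30
print(italian_body_ok("00743110158"))  # False: last digit altered
```

Because the check needs only the body, it can be layered on top of a format check for the IT prefix without touching the other countries' rules.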
The validator: pattern table + function
The dictionary below encodes the structure published in the EU's VIES specifications for all 27 member states, plus XI for Northern Ireland. Note that Greece's key is EL, not its ISO code GR — more on that below.
import re

EU_VAT_PATTERNS = {
    "AT": r"U\d{8}",  # Austria: 'U' + 8 digits
    "BE": r"[01]\d{9}",  # Belgium: 10 digits, starts with 0 or 1
    "BG": r"\d{9,10}",  # Bulgaria: 9 or 10 digits
    "CY": r"\d{8}[A-Z]",  # Cyprus: 8 digits + letter
    "CZ": r"\d{8,10}",  # Czechia: 8, 9 or 10 digits
    "DE": r"\d{9}",  # Germany: 9 digits
    "DK": r"\d{8}",  # Denmark: 8 digits
    "EE": r"\d{9}",  # Estonia: 9 digits
    "EL": r"\d{9}",  # Greece: 9 digits (prefix EL, not GR)
    "ES": r"[A-Z]\d{8}|\d{8}[A-Z]|[A-Z]\d{7}[A-Z]",  # Spain: letter first, last, or both
    "FI": r"\d{8}",  # Finland: 8 digits
    "FR": r"[A-HJ-NP-Z0-9]{2}\d{9}",  # France: 2-char key (no O or I) + 9-digit SIREN
    "HR": r"\d{11}",  # Croatia: 11 digits
    "HU": r"\d{8}",  # Hungary: 8 digits
    "IE": r"\d{7}[A-Z]{1,2}|\d[A-Z+*]\d{5}[A-Z]",  # Ireland: new + legacy styles
    "IT": r"\d{11}",  # Italy: 11 digits
    "LT": r"\d{9}|\d{12}",  # Lithuania: 9 or 12 digits
    "LU": r"\d{8}",  # Luxembourg: 8 digits
    "LV": r"\d{11}",  # Latvia: 11 digits
    "MT": r"\d{8}",  # Malta: 8 digits
    "NL": r"\d{9}B\d{2}",  # Netherlands: 9 digits + 'B' + 2 digits
    "PL": r"\d{10}",  # Poland: 10 digits
    "PT": r"\d{9}",  # Portugal: 9 digits
    "RO": r"[1-9]\d{1,9}",  # Romania: 2-10 digits, no leading zero
    "SE": r"\d{10}01",  # Sweden: 12 digits, always ends in 01
    "SI": r"\d{8}",  # Slovenia: 8 digits
    "SK": r"\d{10}",  # Slovakia: 10 digits
    "XI": r"\d{9}|\d{12}|GD[0-4]\d{2}|HA[5-9]\d{2}",  # Northern Ireland
}

def validate_vat_format(vat):
    """Return a dict describing whether `vat` matches its country's format."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", str(vat)).upper()
    if cleaned.startswith("GR"):  # Greece's VAT prefix is EL, not GR
        cleaned = "EL" + cleaned[2:]
    country, body = cleaned[:2], cleaned[2:]
    pattern = EU_VAT_PATTERNS.get(country)
    if pattern is None:
        return {"input": vat, "valid": False, "reason": "unknown_country_prefix"}
    is_valid = re.fullmatch(pattern, body) is not None
    return {"input": vat, "country": country, "body": body, "valid": is_valid}
Four deliberate decisions in that function:
- Clean first, judge second. Real-world input arrives as DE 136 695 976 or BE 0123.456.749. The re.sub strips everything that is not a letter or digit, then uppercases, so punctuation never causes a false rejection.
- Normalise GR to EL. Users type Greece's ISO code constantly. Rewriting it costs one line and saves a support ticket.
- Use re.fullmatch, not re.match. re.match(r"\d{9}", body) only anchors at the start, so a 10-digit German body would slip through. fullmatch requires the pattern to consume the entire body.
- Return a dict, not a bare boolean. Downstream code (and your logs) will want to know which country pattern was applied and what the cleaned body was.
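The difference between the two anchoring behaviours is easy to see in isolation. This is a minimal sketch using the German pattern from the table; the ten-digit body is an invented over-long input.

```python
import re

body = "1366959760"  # ten digits: one too many for Germany

# re.match only anchors at the start, so the first nine digits satisfy it.
loose = re.match(r"\d{9}", body) is not None

# re.fullmatch requires the pattern to consume the whole string.
strict = re.fullmatch(r"\d{9}", body) is not None

print(loose, strict)  # True False
```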
Worked example: real input, real output
Running the function over a batch that mixes clean numbers, messy formatting and classic mistakes:
tests = [
"DE 136 695 976", # spaces are common in the wild
"GR123456789", # ISO code typed instead of EL
"BE 0123.456.749", # Belgian dot notation, leading zero intact
"BE123456789", # same country, leading zero lost
"AT13585627", # Austria without the mandatory U
"US123456789", # not an EU prefix at all
]
for t in tests:
    print(validate_vat_format(t))
Output (verbatim from Python 3.14):
{'input': 'DE 136 695 976', 'country': 'DE', 'body': '136695976', 'valid': True}
{'input': 'GR123456789', 'country': 'EL', 'body': '123456789', 'valid': True}
{'input': 'BE 0123.456.749', 'country': 'BE', 'body': '0123456749', 'valid': True}
{'input': 'BE123456789', 'country': 'BE', 'body': '123456789', 'valid': False}
{'input': 'AT13585627', 'country': 'AT', 'body': '13585627', 'valid': False}
{'input': 'US123456789', 'valid': False, 'reason': 'unknown_country_prefix'}
The interesting rows are the failures. BE123456789 is nine digits: Belgian numbers are ten and begin with 0 or 1, so a lost leading zero makes the number invalid — exactly the kind of typo a format check exists to catch. AT13585627 fails because Austrian bodies always start with U (the valid form is ATU13585627).
Edge cases that bite in production
- Leading zeros die in spreadsheets. If VAT numbers pass through Excel as numeric cells, 0123456749 becomes 123456749 and every Belgian number that starts with 0 turns invalid. Store identifiers as text; this is the same disease that mangles barcodes, covered in Excel leading zeros.
- Variable lengths are legitimate. Bulgaria (9 or 10), Czechia (8–10), Lithuania (9 or 12) and Romania (2–10) each accept several lengths. A validator that assumes one fixed length per country rejects real customers.
- Romania forbids leading zeros ([1-9]\d{1,9}), while Belgium requires a leading 0 or 1. There is no shortcut around per-country rules.
- France excludes O and I from its two-character key to avoid confusion with 0 and 1: the character class is [A-HJ-NP-Z0-9], not [A-Z0-9].
- GB is not an EU prefix anymore. Post-Brexit, only XI (Northern Ireland, goods only) remains in the EU system; the table above rejects GB on purpose.
- Empty and short input. cleaned[:2] on a one-character string simply yields a short prefix that misses the table, so the function degrades gracefully to unknown_country_prefix instead of raising.
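Several of these edge cases can be reproduced with nothing but the patterns themselves. The sketch below copies three patterns from the table; the `matches` helper and every sample body are invented for illustration.

```python
import re

# Patterns copied from the table above.
BE = r"[01]\d{9}"
RO = r"[1-9]\d{1,9}"
FR = r"[A-HJ-NP-Z0-9]{2}\d{9}"


def matches(pattern: str, body: str) -> bool:
    """Hypothetical helper: True if the whole body fits the pattern."""
    return re.fullmatch(pattern, body) is not None


as_text = "0123456749"
as_number = str(int(as_text))  # what a numeric cell keeps: "123456749"

checks = {
    "BE kept as text": matches(BE, as_text),
    "BE after numeric round-trip": matches(BE, as_number),
    "RO with leading zero": matches(RO, "0123456"),
    "FR key containing O": matches(FR, "XO123456789"),
    "FR key without O or I": matches(FR, "XX123456789"),
}
for label, result in checks.items():
    print(f"{label}: {result}")
```

Only the first and last checks pass, which is the point: the same digits can be valid or invalid depending on how they were stored and which country's rule applies.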
Format check vs. registry check: be honest with yourself
Everything above is syntax. A number can match its country's pattern perfectly and still belong to nobody — deregistered, fabricated, or simply never issued. If you are zero-rating an intra-EU B2B sale, tax authorities expect you to confirm the number against VIES, the EU's official registry, and keep evidence of the check. Use the format validator as the cheap first gate (it filters typos before you burn a slow VIES call, and VIES itself rejects malformed input), then confirm registration officially. If VIES says "invalid" for a number you believe is real, see why VIES rejects valid-looking VAT numbers — the cause is usually a non-registered domestic number or a stale cache, and our VAT number validator explains the same distinction interactively. Our methodology page documents exactly what each check does and does not prove.
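The "cheap gate first, registry second" order can be sketched as a small wrapper. Everything here is an assumption for illustration: `screen_vat` is a made-up name, and `registry_lookup` is a placeholder for however you call VIES, not a real client. The stubs below stand in for both checks so the flow can be run offline.

```python
from typing import Callable, Optional


def screen_vat(
    vat: str,
    format_ok: Callable[[str], bool],
    registry_lookup: Callable[[str], bool],
) -> dict:
    """Run the format check first; query the registry only if it passes."""
    if not format_ok(vat):
        # Malformed input never reaches the slow registry call.
        return {"input": vat, "format_valid": False, "registered": None}
    registered: Optional[bool] = registry_lookup(vat)
    return {"input": vat, "format_valid": True, "registered": registered}


# Stubs for the demo: a toy format rule and a fake registry that
# records which numbers it was asked about.
registry_calls = []


def fake_format_ok(vat: str) -> bool:
    return vat.startswith("ATU") and len(vat) == 11


def fake_registry(vat: str) -> bool:
    registry_calls.append(vat)
    return True  # a real lookup would return the registry's answer


good = screen_vat("ATU13585627", fake_format_ok, fake_registry)
bad = screen_vat("AT13585627", fake_format_ok, fake_registry)
print(good, bad, registry_calls)
```

In real use you would pass a function built on the validator from this guide as `format_ok`, and store the registry response as evidence of the check.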
Validating at scale?
If VAT numbers arrive by the thousand — supplier onboarding, invoice ingestion, marketplace KYB — the same per-country logic (plus the Italian Luhn checksum) is available as a JSON API that takes up to 100 numbers per call:
curl https://codeclassify-api.rosariovitale0096.workers.dev/v1/vat/validate \
-H "X-Api-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"vats":["ATU13585627","BE123456789"]}'
{"ok":true,"count":2,"valid":1,"results":[
{"input":"ATU13585627","country":"AT","format_valid":true,"valid":true,
"note":"Format/checksum only — confirm the number is active via the official VIES service."},
{"input":"BE123456789","country":"BE","format_valid":false,"valid":false,
"note":"Format/checksum only — confirm the number is active via the official VIES service."}]}
The free tier includes 10 calls per month with no card required — details and sign-up on the API page.
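If you call the API from Python, the 100-number limit means splitting larger lists into batches. The sketch below only builds the requests, mirroring the URL, headers and body from the curl example; the names `batches` and `build_request` are invented. Sending, timeouts, error handling and response parsing are deliberately left out.

```python
import json
import urllib.request

API_URL = "https://codeclassify-api.rosariovitale0096.workers.dev/v1/vat/validate"
BATCH_SIZE = 100  # per-call limit stated above


def batches(vats, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` numbers."""
    for start in range(0, len(vats), size):
        yield vats[start:start + size]


def build_request(batch, api_key):
    """Build (but do not send) one POST request for a batch."""
    payload = json.dumps({"vats": batch}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,  # supplying data makes urllib issue a POST
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
    )


# 250 numbers become three requests: 100 + 100 + 50.
numbers = ["ATU13585627"] * 250
requests_to_send = [build_request(b, "YOUR_KEY") for b in batches(numbers)]
print(len(requests_to_send))
# To send one for real: urllib.request.urlopen(requests_to_send[0])
```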
FAQ
Does a valid VAT format mean the number is actually registered?
No. A format check only confirms the number is shaped like a real VAT number for that country — right prefix, right length, right character pattern. It catches typos, truncations and lost leading zeros, but it cannot tell you whether the number was ever issued or is still active. Only the EU's official VIES service (or a national tax authority) can confirm registration, and for cross-border zero-rating you are expected to check VIES.
Why does Greece use EL instead of GR in VAT numbers?
Greece's ISO 3166 country code is GR, but its VAT prefix is EL, from the Greek name Elláda. VIES only accepts EL, so a validator should normalise GR to EL before looking up the pattern rather than rejecting the number outright — users type GR constantly because every other prefix matches the ISO code.
Can this validator check UK VAT numbers?
Only partially. Great Britain (GB prefix) left the EU VAT system in 2021, so GB numbers are no longer in VIES and must be checked against HMRC's own service. Northern Ireland still participates for goods under the XI prefix, which follows the old UK structure: 9 digits, 12 digits, or GD/HA plus 3 digits for government bodies. The validator in this guide includes XI but deliberately rejects GB.
Check a VAT number without writing code
Paste any EU VAT number into the free VAT number validator to see its country, expected structure, and format verdict instantly.
This guide is for general information only, not tax advice. A format match confirms structure, not registration: before zero-rating an intra-EU supply or relying on a counterparty's VAT status, verify the number through the official VIES service or the relevant national tax authority and retain proof of the check.