The Method

Why product data fails, and the rules that fix it.

Eight short chapters. They give the frustration you already have a vocabulary, and they explain why GPD/1 is shaped the way it is. Written to be cited.

  1. 01There is no record
  2. 02Claims, not facts
  3. 03Names are concepts
  4. 04A value is a dossier
  5. 05The level is the model
  6. 06The distribution is part of the data
  7. 07Schemas should grow
  8. 08Never break the reader
Chapter 01

There is no record.

Barcodes are forty years old and they still resolve to nothing. A GTIN names a box, not a product: scan one and you get whatever the nearest database happens to hold, which is usually a title, sometimes a price, rarely the truth. Forty years of identifiers and not one canonical record behind them.

Every company in commerce pays for this gap separately. Retailers re-type supplier feeds, manufacturers answer the same spec question a thousand times, and now shopping agents inherit the mess at machine speed. The gap is not a tooling problem; it is a missing institution. Someone has to keep the record.

gtin 4012345678901 Shop A · 15.000lm Feed B · 15,000 lm Shop C · brightness — the record missing
Fig. 1 · One identifier, three answers, and nothing canonical behind any of them

Therefore: one identity per product, one place to check, maintained as a public standard.

Chapter 02

Claims, not facts.

A fact is a claim that survived adjudication. The manufacturer states 15,000 lumens; a lab measures 14,280 under the same standard. Most databases store one number and discard the disagreement, which is exactly the part a buyer needed to see.

The record stores the contest, not only the winner. Every value keeps its sources, its method, its retrieval date, and a calibrated confidence; claimed and measured never merge into one flattering number. Honesty is cheaper to keep than to reconstruct.

claimed · 15,000 lm measured · 14,280 lm optical.light_output claimed 15,000 lm measured 14,280 lm delta −4.8% · both on record
Fig. 2 · Adjudication keeps the contest; nothing merges into one flattering number

Therefore: every response carries epistemic status, and contested is a respectable verdict.

Chapter 03

Names are concepts.

The industry calls one thing brightness, lumens output, ANSI lumens, Lichtstrom, and light intensity. A string-matching pipeline treats these as five attributes and corrupts every comparison built on them. The name was never the concept; it was one rendering of it.

In GPD/1 an attribute is a registered concept: optical.light_output, with a one-sentence definition, a canonical unit, required measurement conditions, and curated labels in eight languages. Synonyms map to it; they never multiply it.

brightness ANSI lumens Lichtstrom lumens output light intensity optical.light_output one concept · curated labels
Fig. 3 · Synonyms map in and feed search; the concept never multiplies

Therefore: locale changes labels and prose, never facts; mapping is cheap forever, creating is expensive forever.

Chapter 04

A value is a dossier.

100 watts means nothing alone. Into which impedance, across what bandwidth, at what distortion, measured by whom, valid for which firmware, in which market. A number stripped of its conditions is not data; it is decoration that happens to be numeric.

The record stores statements: value, unit, qualifiers, validity, epistemic status, confidence, sources. A number whose required conditions are missing is stored as incomplete and excluded from comparisons rather than quietly promoted.

15,000 lm unit · lm qualifier · iso_21118 epistemic · claimed validity · market de confidence · 0.97 claims · mfr datasheet
Fig. 4 · A statement: the number is one field of seven, and the other six make it usable

Therefore: filters can refuse to lie; comparing under unequal conditions returns an error, not a ranking.

Chapter 05

The level is the model.

A laptop's CPU belongs to the variant. Its plug belongs to the market execution. Its bundle price belongs to the composition; its scratches belong to the unit. Most catalogs flatten all of this onto one row and then wonder why Germany sees the US plug.

GPD/1 fixes six levels (family, variant, market execution, composition, offer, unit) and one law: information lives at the highest level where it is invariant, and is never duplicated downward. Every payload states its level.

family variant market execution composition offer unit shared capabilities · chipset color · storage · the axes plug · tuner · compliance bundle contents · sales unit price · availability · seller serial · wear · panel lottery
Fig. 5 · Information lives at the highest level where it is invariant, never duplicated downward

Therefore: you never guess where a value applies; the level is part of the answer.

Chapter 06

The distribution is part of the data.

A spec without context is an orphan. 28 kilograms: is that heavy? For a soundbar, catastrophically; for an installation projector, unremarkable. The question every buyer actually asks is not what the number is but where it stands.

The record computes distributions per type and attribute, under equal measurement conditions, and gives every value its percentile. Quieter than 78% of its class is not copywriting; it is a query result with evidence attached.

p10 p50 · 37 dB p90 this one · p22 · quieter than 78%
Fig. 6 · acoustic.noise_level within installation projector, n = 1,842

Therefore: percentile: is a filter, and comparison sentences ship with their population.

Chapter 07

Schemas should grow.

Committee schemas fossilize; scraped schemas sprawl. Both fail the same test: reality keeps inventing attributes, and a vocabulary either absorbs them in order or drowns in duplicates.

GPD/1 grows by evidence under a constitution. A new attribute needs five independent claims from two source domains; promotion to stable needs fifty families, three domains, and ninety days without a naming dispute. Releases ship monthly, dated, additive, with full diffs.

candidate stable deprecated retired ≥ 50 families · ≥ 3 domains · ≥ 90 days replaced_by named · ≥ 2 releases aliases resolve forever; retired is an address, not a hole
Fig. 7 · Promotion is earned on evidence; the gates are numbers, not opinions

Therefore: the Dictionary never feels finished, and never feels arbitrary; every entry can show its paperwork.

Chapter 08

Never break the reader.

A reference work that renames its entries is not a reference work. Identifiers in GPD/1 are forever: renames create aliases, merges are reversible with their reasoning on file, and the old address resolves until the heat death of commerce.

The API is pinned by date and never changes shape within a version; the vocabulary evolves additively and never breaks a parser. Corrections arrive as webhooks, not as silent edits discovered in production.

gpd:family:brand.x100 · old gpd:family:brand.x200 alias · resolves forever · 410 + superseded_by merge · reasoning on file unmerge · reversible window
Fig. 8 · Your stored identifiers keep working; corrections announce themselves by webhook

Therefore: stability is a moral position, and it is enforced in HTTP, not promised in prose.

The rules above are executable: validators reject what this document forbids. The Dictionary is where they become visible.