Open Source · MIT License · Paper + Code

Prime factorization
inside a language model.

TriadicGPT is a 40M-parameter GPT that learns algebraically verifiable prime-factor semantic representations end-to-end — as a side effect of language modeling.

GitHub Repo Read the paper Quick start
King = 2 × 3 × 5
Queen = 2 × 5 × 7
GCD → {2, 5} = Royalty, Authority
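The hero example above can be checked in one line: shared meaning is the greatest common divisor of the two composites. A toy sketch (the actual word-to-prime mapping is learned by the model; these composites just mirror the example):

```python
from math import gcd

# Illustrative only: toy composites mirroring the King/Queen example.
KING = 2 * 3 * 5    # 30
QUEEN = 2 * 5 * 7   # 70

shared = gcd(KING, QUEEN)   # common prime factors = shared meaning
print(shared)               # 10 = 2 × 5 → {Royalty, Authority}
```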

Benchmark Results

Zero cost. Real structure.

Adding a triadic head to a GPT costs nothing in language quality — and produces algebraically verifiable semantic representations.

Language cost of triadic head: zero (PPL 7.69 vs 7.56 ablation)
Semantic ordering emergence: phase transition at 40M params
Optimal bit width: k = 32–64 bits
Analogy verification: 69.2% (vs 50% random baseline)
Semantic compression: 8× (64 bits match 512D probe accuracy)
Signature uniqueness: 100% across all concepts
GPT-2 transfer (InfoNCE): closes 72% of the gap to Engine PCA

Architecture

Two heads, one forward pass.

Standard next-token prediction plus a triadic projection head that produces discrete prime-factor signatures.

Text → BPE Tokenizer (4096 vocab) → Token IDs
                    |
         TriadicGPT (12L/512D/8H)
                    |
           +--------+--------+
           |                 |
        LM Head        Triadic Head
           |                 |
      Next-Token      tanh(Wx) → bits
      Prediction      [+, -, +, +, ...]
           |                 |
        L_lang         PrimeMapper
         (CE)          Φ = 2 × 5 × 7
           |                 |
           +--------+--------+
                    |
        L = L_lang + α · L_triadic
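The two-head design can be sketched as follows. This is a minimal illustration, not the repo's actual API: a single `Linear` trunk stands in for the 12-layer/512-dim/8-head transformer, and all names are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the two-heads-one-forward-pass design.
class TwoHeadSketch(nn.Module):
    def __init__(self, d_model=512, vocab_size=4096, n_bits=64):
        super().__init__()
        self.trunk = nn.Linear(d_model, d_model)        # stand-in for the transformer
        self.lm_head = nn.Linear(d_model, vocab_size)   # next-token logits
        self.triadic_head = nn.Linear(d_model, n_bits)  # signature logits

    def forward(self, h):
        z = self.trunk(h)
        logits = self.lm_head(z)                  # feeds L_lang (cross-entropy)
        bits = torch.tanh(self.triadic_head(z))   # in (-1, 1); signs read as bits
        return logits, bits

# Joint objective from the diagram: L = L_lang + alpha * L_triadic
```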
🎯

Diversity

Bits fire ~50% of the time. No degenerate all-ones or all-zeros.

⚖️

Contrastive

Different sequences produce different bit patterns.

📈

Entropy

No dead bits — every bit carries information.

🔗

Alignment

Triadic similarity matches embedding similarity. The key innovation.
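The four objectives above (diversity, contrastive, entropy, alignment) could be realized roughly as below. These formulas are hedged stand-ins chosen to match the descriptions on this page, not the repo's exact losses; `auxiliary_terms` and its signature are hypothetical.

```python
import torch
import torch.nn.functional as F

def auxiliary_terms(bits, emb):
    """Illustrative sketch of the four auxiliary objectives.
    bits: (B, k) tanh outputs in (-1, 1); emb: (B, d) hidden states."""
    # Diversity: each bit should fire ~50% of the time (batch mean near 0).
    diversity = bits.mean(dim=0).pow(2).mean()

    # Entropy: no dead bits -- penalize bits with low variance over the batch.
    entropy = (1.0 - bits.var(dim=0)).clamp(min=0.0).mean()

    # Contrastive: different sequences should get different bit patterns.
    sim_bits = F.cosine_similarity(bits.unsqueeze(1), bits.unsqueeze(0), dim=-1)
    contrastive = (sim_bits - torch.eye(bits.size(0))).abs().mean()

    # Alignment: bit-space similarity should track embedding similarity.
    sim_emb = F.cosine_similarity(emb.unsqueeze(1), emb.unsqueeze(0), dim=-1)
    alignment = (sim_bits - sim_emb).pow(2).mean()

    return diversity, entropy, contrastive, alignment
```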

Experiment 10 — Transfer Learning

The bottleneck is the loss, not the model.

Same frozen GPT-2 embeddings; changing only the alignment loss produces a 9× difference in semantic gap.

MSE: +0.011
Rank: +0.047
InfoNCE: +0.099
Engine PCA: +0.136

Semantic gap (intra − inter group similarity). Higher = better domain separation. Engine PCA is the post-hoc upper bound.
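An InfoNCE-style alignment loss, the best-performing option above, can be sketched like this. Assumptions: `emb_proj` is the frozen GPT-2 embedding already projected to the bit dimensionality, and `tau` is a temperature; the function name and defaults are illustrative, not the repo's implementation.

```python
import torch
import torch.nn.functional as F

def infonce_alignment(bits, emb_proj, tau=0.07):
    """Each row's bit vector should be most similar to its own embedding
    among all embeddings in the batch (positives on the diagonal)."""
    b = F.normalize(bits, dim=-1)
    e = F.normalize(emb_proj, dim=-1)
    logits = b @ e.t() / tau                # (B, B) similarity matrix
    targets = torch.arange(bits.size(0))    # row i matches embedding i
    return F.cross_entropy(logits, targets)
```

Unlike MSE, which matches vectors pointwise, this contrasts each positive pair against every other embedding in the batch, which plausibly explains the larger semantic gap it recovers.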

Quick Start

Up and running in minutes.

Install

$ pip install triadic-head

Standalone PyPI package. Triadic algebra + HuggingFace wrapper. Works with any causal LM.

Train from scratch

$ git clone https://github.com/arturoornelasb/triadic-gpt
$ cd triadic-gpt
$ python src/torch_train.py --scale xl --steps 50000

~76 min on an RTX 5060 Ti. Produces a 40M-param model with 64-bit triadic signatures.

# Use triadic-head with any HuggingFace model
from triadic_head import TriadicWrapper
wrapper = TriadicWrapper("gpt2", n_bits=64, align="infonce")
result = wrapper.encode("king")
print(result["composite"]) # 2 × 3 × 5 × ...
print(result["bits"]) # [1, 0, 1, 1, 0, ...]

Desktop Explorer

Explore semantics visually.

A full PySide6 desktop application for encoding, comparing, and chatting with TriadicGPT. 3 backends, 7 tabs.

🔢Encoder
⚖️Compare
📈Explore
🔄Analogy
✅Validate
💬Chat
📊Benchmarks
$ pip install PySide6
$ python ui/app.py

Prime Inspector

Click any prime factor to see all 280 probe words that share it. Discover what each learned prime “means” empirically.

3 Backends

Load native .pt checkpoints, GPT-2 Transfer (Exp10), or any HuggingFace model with post-hoc projection.

Live Triadic Chat

Converse with the model and see the prime signature of every turn in real time. Compare prompt vs response algebraically.
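"Comparing algebraically" can be as simple as measuring agreement between two bit signatures. A self-contained sketch (the helper name and 8-bit example are hypothetical; real signatures are 64 bits):

```python
def bit_similarity(a, b):
    """Fraction of matching bits between two signatures (illustrative)."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

prompt_bits = [1, 0, 1, 1, 0, 1, 0, 0]
response_bits = [1, 0, 1, 0, 0, 1, 1, 0]
print(bit_similarity(prompt_bits, response_bits))  # 0.75
```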

Research Paper

The science behind the algebra.

End-to-End Prime Factorization in a Generative Language Model:
Emergent Algebraic Semantics from Joint Training

We present TriadicGPT, a 40M-parameter GPT augmented with a triadic projection head that produces discrete prime-factor signatures alongside standard next-token predictions. Unlike post-hoc approaches that project frozen embeddings into prime space, TriadicGPT learns algebraically verifiable semantic representations end-to-end as a side effect of language modeling. Through 29 training runs and 11 experiments, we demonstrate: (i) zero language cost from the triadic head, (ii) a phase transition in semantic ordering at 40M parameters, (iii) 8× semantic compression (64 bits match 512D embedding probes), and (iv) a loss–embedding interaction where InfoNCE closes 72% of the gap to post-hoc PCA when attached to GPT-2.

Arturo Ornelas Brand, 2026. Based on the Triadic-Neurosymbolic-Engine (Ornelas Brand, 2026).

PDF (GitHub) LaTeX source

Ecosystem

Part of the Triadic stack.

🔬

TriadicGPT (this repo)

End-to-end GPT + triadic head. 29 training runs, 11 experiments, full paper. MIT license.

GitHub
📦

triadic-head (PyPI)

Standalone package: triadic algebra + HuggingFace wrapper. Works with any causal LM out of the box.

PyPI

Triadic Cloud API

Commercial SaaS. Encode, search, audit, and stream via REST API. Free tier available.

Learn more
📚

Triadic-Neurosymbolic-Engine

The original library and paper. Post-hoc prime factorization for any embedding model. Published on Zenodo.

GitHub