Training Guide

Two-Phase Protocol

TriadicGPT uses a two-phase training strategy:

Phase 1: Frozen backbone -- Train only the triadic head while the language model backbone is frozen. This lets the head learn to map hidden states to meaningful binary vectors without disrupting language quality.

Phase 2: Joint optimization -- Unfreeze the last N layers of the backbone for joint language + triadic optimization. This allows the hidden representations to co-adapt with the triadic projection.

from triadic_head import TriadicWrapper

model = TriadicWrapper("gpt2", n_bits=64, align_mode="infonce")

# Phase 1: triadic head only
model.freeze_backbone()
# ... train for M steps ...

# Phase 2: joint optimization
model.unfreeze_last_n(2)
# ... train for remaining steps ...
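The two elided training segments above can be driven by a single step-triggered schedule. Below is a minimal sketch of that schedule, using a stub that merely records freeze/unfreeze calls in place of the real `TriadicWrapper`; the `run_two_phase` helper and the step counts are illustrative, not part of the package:

```python
class StubWrapper:
    """Stand-in for TriadicWrapper that just records freeze/unfreeze calls."""
    def __init__(self):
        self.calls = []

    def freeze_backbone(self):
        self.calls.append("freeze_backbone")

    def unfreeze_last_n(self, n):
        self.calls.append(f"unfreeze_last_{n}")


def run_two_phase(model, total_steps, phase1_steps, n_unfreeze=2):
    """Step-triggered two-phase schedule: head-only, then joint."""
    model.freeze_backbone()                    # Phase 1: triadic head only
    for step in range(total_steps):
        if step == phase1_steps:
            model.unfreeze_last_n(n_unfreeze)  # Phase 2: joint optimization
        # ... forward pass, loss computation, optimizer step go here ...
    return model
```

With the real wrapper, the loop body would compute the combined loss shown in the next section and step the optimizer; the stub only exercises the phase switch.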

Loss Function

The total loss is:

\[L = L_{\text{lang}} + \alpha \cdot L_{\text{triadic}}\]

The triadic loss combines four components:

| Component | Purpose | What it prevents |
| --- | --- | --- |
| Diversity | Bits fire ~50% of the time | All-zero or all-one collapse |
| Contrastive | Different sequences get different signatures | Degenerate constant encoding |
| Entropy | No dead bits | Individual bits getting stuck |
| Embedding alignment | Triadic similarity tracks embedding similarity | Signatures unrelated to semantics |

logits, triadic_proj, lang_loss = model(input_ids, labels=input_ids)

tri_loss = model.triadic_loss(
    triadic_proj,
    input_ids=input_ids,
    alpha=0.05,           # triadic weight
    entropy_weight=1.0,   # prevent dead bits
    align_weight=5.0,     # transfer semantic structure
    align_mode="infonce",
)

total_loss = lang_loss + tri_loss
total_loss.backward()
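To make two of the table's components concrete, here is a NumPy sketch of plausible formulations of the diversity and entropy terms. This is an illustration of the ideas only: the function names and exact formulas are assumptions, not the package's implementation.

```python
import numpy as np


def diversity_loss(bits):
    """Penalize per-bit mean activation drifting away from 0.5.

    bits: (batch, n_bits) array of activations in [0, 1]. Zero when every
    bit fires exactly half the time; large under all-zero/all-one collapse.
    """
    return float(np.mean((bits.mean(axis=0) - 0.5) ** 2))


def entropy_loss(bits, eps=1e-8):
    """Negated mean per-bit binary entropy.

    Minimizing this keeps each bit's firing rate uncertain (alive);
    a bit stuck at 0 or 1 has near-zero entropy and is penalized.
    """
    p = bits.mean(axis=0).clip(eps, 1 - eps)
    h = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return float(-h.mean())
```

A balanced batch (all bits near 50% activation) minimizes both terms, while a batch with bits stuck near 0 or 1 is penalized by both.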

Alpha Tuning

Sharp Pareto cliff at alpha > 0.05

The alpha hyperparameter controls the weight of the triadic loss relative to the language loss. Language quality degrades rapidly beyond alpha = 0.05 (a sharp Pareto cliff), so do not exceed 0.10.

| Alpha | Language quality | Triadic quality | Recommendation |
| --- | --- | --- | --- |
| 0.01 | Excellent | Weak | Too low |
| 0.05 | Excellent | Strong | Recommended |
| 0.10 | Good | Strong | Maximum safe value |
| 0.20+ | Degraded | Strong | Do not use |
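Rather than applying the full alpha from step zero, the from-scratch training command later in this guide ramps the triadic weight in over the first 30% of training (`--triadic-warmup-pct 0.3`). A linear-warmup sketch, assuming a simple ramp (the exact schedule in the codebase may differ):

```python
def alpha_at_step(step, total_steps, alpha_max=0.05, warmup_pct=0.3):
    """Linearly ramp the triadic weight from 0 to alpha_max, then hold."""
    warmup_steps = int(total_steps * warmup_pct)
    if warmup_steps == 0 or step >= warmup_steps:
        return alpha_max
    return alpha_max * step / warmup_steps
```

Warming alpha in lets the language loss dominate early, so the backbone settles before the triadic objective starts shaping the representations.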

Alignment Modes

The align_mode parameter controls how the triadic head learns to mirror the structure of the backbone's embeddings:

| Mode | Best for | Why |
| --- | --- | --- |
| infonce | Pre-trained models (GPT-2, LLaMA, ...) | Mines positive/negative pairs from rich embeddings |
| mse | From-scratch training | Dense local gradients work with weak embeddings |
| rank | Best analogy accuracy | Preserves similarity ordering, not absolute values |
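The InfoNCE idea can be illustrated with a small self-contained sketch: matching (projection, embedding) rows in a batch are treated as positive pairs and all other rows as negatives. The `infonce_align` function below is an illustrative reconstruction under those assumptions, not the package's actual code:

```python
import numpy as np


def infonce_align(proj, emb, temperature=0.1):
    """InfoNCE between triadic projections and backbone embeddings.

    proj, emb: (batch, d) arrays; row i of proj and row i of emb form a
    positive pair, and every other row of emb serves as a negative.
    """
    # Cosine-normalize both views so similarity is scale-invariant.
    p = proj / np.linalg.norm(proj, axis=1, keepdims=True)
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    logits = p @ e.T / temperature                    # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)       # stabilize softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Loss is -log p(positive), i.e. the diagonal, averaged over the batch.
    return float(-np.mean(np.diag(log_probs)))
```

When the projection mirrors the embedding geometry, the diagonal dominates each softmax and the loss approaches zero; mismatched pairings drive it up.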

Transfer Learning Results (GPT-2 124M)

| Alignment loss | Semantic gap | Analogy accuracy | Best for |
| --- | --- | --- | --- |
| MSE | +0.011 | 75.0% | From-scratch (weak embeddings) |
| Rank | +0.047 | 83.3% | Analogy tasks |
| InfoNCE | +0.076 | 100.0% | Pre-trained models (rich embeddings) |

Training TriadicGPT from Scratch

For training the full TriadicGPT model (not just the triadic-head package):

# XL model (40M params, ~76 min on RTX 5060 Ti)
python src/torch_train.py --scale xl --steps 50000

# Reproduce the paper's production model
python src/torch_train.py \
  --scale xl --steps 50000 \
  --alpha 0.05 --entropy-weight 1.0 --align-weight 5.0 \
  --triadic-warmup-pct 0.3 --no-distill \
  --checkpoint-dir checkpoints/torch_run15_strongalign

Requirements

  • Python 3.10+
  • CUDA 12.8+ (for GPU training)
  • TinyStories dataset (~1.8 GB)

conda env create -f environment.yml
conda activate triadic-microgpt

Evaluation

# Full evaluation (perplexity, generation, triadic analysis)
python src/evaluate.py --model checkpoints/torch_run15_strongalign/model_best.pt

# Relational bias audit
python src/auditor.py