Training Guide

Two-Phase Protocol

TriadicGPT uses a two-phase training strategy:

Phase 1: Frozen backbone -- Train only the triadic head while the language model backbone is frozen. This lets the head learn to map hidden states to meaningful binary vectors without disrupting language quality.

Phase 2: Joint optimization -- Unfreeze the last N layers of the backbone for joint language + triadic optimization. This allows the hidden representations to co-adapt with the triadic projection.

from triadic_head import TriadicWrapper

model = TriadicWrapper("gpt2", n_bits=64, align_mode="infonce")

# Phase 1: triadic head only
model.freeze_backbone()
# ... train for M steps ...

# Phase 2: joint optimization
model.unfreeze_last_n(2)
# ... train for remaining steps ...
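The two elided training segments above can be driven by a single step-triggered schedule. Below is a minimal sketch of that schedule, using a stub that merely records freeze/unfreeze calls in place of the real `TriadicWrapper`; the `run_two_phase` helper and the step counts are illustrative, not part of the package:

```python
class StubWrapper:
    """Stand-in for TriadicWrapper that just records freeze/unfreeze calls."""
    def __init__(self):
        self.calls = []

    def freeze_backbone(self):
        self.calls.append("freeze_backbone")

    def unfreeze_last_n(self, n):
        self.calls.append(f"unfreeze_last_{n}")


def run_two_phase(model, total_steps, phase1_steps, n_unfreeze=2):
    """Step-triggered two-phase schedule: head-only, then joint."""
    model.freeze_backbone()                    # Phase 1: triadic head only
    for step in range(total_steps):
        if step == phase1_steps:
            model.unfreeze_last_n(n_unfreeze)  # Phase 2: joint optimization
        # ... forward pass, loss computation, optimizer step go here ...
    return model
```

With the real wrapper, the loop body would compute the combined loss shown in the next section and step the optimizer; the stub only exercises the phase switch.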

Loss Function

The total loss is:

\[L = L_{\text{lang}} + \alpha \cdot L_{\text{triadic}}\]

The triadic loss combines four components:

| Component | Purpose | What it prevents |
| --- | --- | --- |
| Diversity | Bits fire ~50% of the time | All-zero or all-one collapse |
| Contrastive | Different sequences get different signatures | Degenerate constant encoding |
| Entropy | No dead bits | Individual bits getting stuck |
| Embedding alignment | Triadic similarity tracks embedding similarity | Signatures unrelated to semantics |

logits, triadic_proj, lang_loss = model(input_ids, labels=input_ids)

tri_loss = model.triadic_loss(
    triadic_proj,
    input_ids=input_ids,
    alpha=0.05,           # triadic weight
    entropy_weight=1.0,   # prevent dead bits
    align_weight=5.0,     # transfer semantic structure
    align_mode="infonce",
)

total_loss = lang_loss + tri_loss
total_loss.backward()
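To make two of the table's components concrete, here is a NumPy sketch of plausible formulations of the diversity and entropy terms. This is an illustration of the ideas only: the function names and exact formulas are assumptions, not the package's implementation.

```python
import numpy as np


def diversity_loss(bits):
    """Penalize per-bit mean activation drifting away from 0.5.

    bits: (batch, n_bits) array of activations in [0, 1]. Zero when every
    bit fires exactly half the time; large under all-zero/all-one collapse.
    """
    return float(np.mean((bits.mean(axis=0) - 0.5) ** 2))


def entropy_loss(bits, eps=1e-8):
    """Negated mean per-bit binary entropy.

    Minimizing this keeps each bit's firing rate uncertain (alive);
    a bit stuck at 0 or 1 has near-zero entropy and is penalized.
    """
    p = bits.mean(axis=0).clip(eps, 1 - eps)
    h = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return float(-h.mean())
```

A balanced batch (all bits near 50% activation) minimizes both terms, while a batch with bits stuck near 0 or 1 is penalized by both.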

Alpha Tuning

Sharp Pareto cliff at alpha > 0.05

The alpha hyperparameter controls the weight of the triadic loss relative to the language loss. Language quality degrades rapidly beyond alpha = 0.05 (a sharp Pareto cliff), so do not exceed 0.10.

| Alpha | Language quality | Triadic quality | Recommendation |
| --- | --- | --- | --- |
| 0.01 | Excellent | Weak | Too low |
| 0.05 | Excellent | Strong | Recommended |
| 0.10 | Good | Strong | Maximum safe value |
| 0.20+ | Degraded | Strong | Do not use |
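Rather than applying the full alpha from step zero, the from-scratch training command later in this guide ramps the triadic weight in over the first 30% of training (`--triadic-warmup-pct 0.3`). A linear-warmup sketch, assuming a simple ramp (the exact schedule in the codebase may differ):

```python
def alpha_at_step(step, total_steps, alpha_max=0.05, warmup_pct=0.3):
    """Linearly ramp the triadic weight from 0 to alpha_max, then hold."""
    warmup_steps = int(total_steps * warmup_pct)
    if warmup_steps == 0 or step >= warmup_steps:
        return alpha_max
    return alpha_max * step / warmup_steps
```

Warming alpha in lets the language loss dominate early, so the backbone settles before the triadic objective starts shaping the representations.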

Alignment Modes

The align_mode parameter controls how the triadic head learns to mirror the structure of the backbone's embeddings:

| Mode | Best for | Why |
| --- | --- | --- |
| infonce | Pre-trained models (GPT-2, LLaMA, ...) | Mines positive/negative pairs from rich embeddings |
| mse | From-scratch training | Dense local gradients work with weak embeddings |
| rank | Best analogy accuracy | Preserves similarity ordering, not absolute values |
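The InfoNCE idea can be illustrated with a small self-contained sketch: matching (projection, embedding) rows in a batch are treated as positive pairs and all other rows as negatives. The `infonce_align` function below is an illustrative reconstruction under those assumptions, not the package's actual code:

```python
import numpy as np


def infonce_align(proj, emb, temperature=0.1):
    """InfoNCE between triadic projections and backbone embeddings.

    proj, emb: (batch, d) arrays; row i of proj and row i of emb form a
    positive pair, and every other row of emb serves as a negative.
    """
    # Cosine-normalize both views so similarity is scale-invariant.
    p = proj / np.linalg.norm(proj, axis=1, keepdims=True)
    e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    logits = p @ e.T / temperature                    # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)       # stabilize softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Loss is -log p(positive), i.e. the diagonal, averaged over the batch.
    return float(-np.mean(np.diag(log_probs)))
```

When the projection mirrors the embedding geometry, the diagonal dominates each softmax and the loss approaches zero; mismatched pairings drive it up.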

Transfer Learning Results (GPT-2 124M)

| Alignment loss | Semantic gap | Analogy accuracy | Best for |
| --- | --- | --- | --- |
| MSE | +0.011 | 75.0% | From-scratch (weak embeddings) |
| Rank | +0.047 | 83.3% | Analogy tasks |
| InfoNCE | +0.076 | 100.0% | Pre-trained models (rich embeddings) |

Training TriadicGPT from Scratch

For training the full TriadicGPT model (not just the triadic-head package):

# XL model (40M params, ~76 min on RTX 5060 Ti)
python src/torch_train.py --scale xl --steps 50000

# Reproduce the paper's production model
python src/torch_train.py \
  --scale xl --steps 50000 \
  --alpha 0.05 --entropy-weight 1.0 --align-weight 5.0 \
  --triadic-warmup-pct 0.3 --no-distill \
  --checkpoint-dir checkpoints/torch_run15_strongalign

Requirements

  • Python 3.10+
  • CUDA 12.8+ (for GPU training)
  • TinyStories dataset (~1.8 GB)

conda env create -f environment.yml
conda activate triadic-microgpt

Evaluation

# Full evaluation (perplexity, generation, triadic analysis)
python src/evaluate.py --model checkpoints/torch_run15_strongalign/model_best.pt

# Relational bias audit
python src/auditor.py