Architecture
reptimeline is organized in four layers, from backend-specific to domain-specific.
The Four Layers
```mermaid
graph TD
    A[Extractors] --> B[Tracker]
    B --> C[Discovery]
    C --> D[Overlays]
    A:::layer1
    B:::layer2
    C:::layer3
    D:::layer4
    classDef layer1 fill:#6366f1,stroke:#333,color:#fff
    classDef layer2 fill:#8B5CF6,stroke:#333,color:#fff
    classDef layer3 fill:#a78bfa,stroke:#333,color:#fff
    classDef layer4 fill:#c4b5fd,stroke:#333,color:#000
```
Layer 1: Extractors (Backend-Specific)
Load model checkpoints and extract discrete representations. Each backend implements three methods: extract, similarity, and shared_features.
Layer 2: Tracker (Backend-Agnostic)
Analyzes how representations evolve across training snapshots. Detects births, deaths, and connections; computes entropy, churn, and utilization curves; and identifies phase transitions.
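To make these quantities concrete, here is a minimal sketch of how births, deaths, churn, and per-bit entropy could be computed from two snapshots. It is not the TimelineTracker implementation: the plain dict snapshot format, the function names, and the exact definitions (for example, churn as the fraction of flipped bits) are assumptions made for illustration.

```python
import math

# Assumption: a snapshot is a plain dict mapping concept -> binary code.
Snapshot = dict[str, list[int]]


def active_bits(snapshot: Snapshot) -> set[int]:
    """Indices of bits that are on for at least one concept."""
    return {i for code in snapshot.values() for i, bit in enumerate(code) if bit}


def births_and_deaths(prev: Snapshot, curr: Snapshot) -> tuple[set[int], set[int]]:
    """Bits that became active (births) or went silent (deaths) between snapshots."""
    before, after = active_bits(prev), active_bits(curr)
    return after - before, before - after


def churn(prev: Snapshot, curr: Snapshot) -> float:
    """Fraction of (concept, bit) positions that flipped between snapshots."""
    flipped = total = 0
    for concept in prev.keys() & curr.keys():
        for a, b in zip(prev[concept], curr[concept]):
            flipped += a != b
            total += 1
    return flipped / total if total else 0.0


def mean_bit_entropy(snapshot: Snapshot) -> float:
    """Average binary entropy (in bits) of each code position across concepts."""
    entropies = []
    for column in zip(*snapshot.values()):
        p = sum(column) / len(column)
        h = 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        entropies.append(h)
    return sum(entropies) / len(entropies) if entropies else 0.0


# Toy data (hypothetical): two concepts, four bits, two training steps.
step_1000 = {"cat": [1, 0, 0, 1], "dog": [1, 0, 0, 0]}
step_2000 = {"cat": [1, 1, 0, 0], "dog": [1, 0, 0, 0]}

born, died = births_and_deaths(step_1000, step_2000)
print(born, died)                    # {1} {3}
print(churn(step_1000, step_2000))   # 0.25
print(mean_bit_entropy(step_2000))   # 0.25
```

Evaluating functions like these at every consecutive pair of snapshots yields curves over training steps, and a sharp jump in such a curve is one plausible signal of a phase transition.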
Layer 3: Discovery (Backend-Agnostic)
Bottom-up ontology discovery without prior knowledge. Finds bit semantics, dual pairs, dependencies, triadic interactions, and hierarchical structure.
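As an illustration of what bottom-up discovery can mean for binary codes, the sketch below finds dual pairs and dependencies using nothing but bit co-occurrence. The definitions are assumptions chosen for clarity (a dual pair is two bits that always disagree; a dependency is a bit that never fires without another), and the function names and toy data are hypothetical. BitDiscovery itself may use different criteria, and noisy real codes would presumably need a tolerance rather than exact matches.

```python
from itertools import combinations, permutations

# Toy data (hypothetical): concept -> 4-bit binary code.
codes = {
    "hot":  [1, 0, 1, 0],
    "cold": [0, 1, 1, 0],
    "fire": [1, 0, 1, 1],
    "ice":  [0, 1, 1, 0],
}


def bit_columns(codes: dict[str, list[int]]) -> list[tuple[int, ...]]:
    """Transpose the codes so each entry holds one bit across all concepts."""
    return list(zip(*codes.values()))


def find_duals(codes: dict[str, list[int]]) -> list[tuple[int, int]]:
    """Pairs (i, j) where exactly one of the two bits is on for every concept."""
    cols = bit_columns(codes)
    return [
        (i, j)
        for i, j in combinations(range(len(cols)), 2)
        if all(a != b for a, b in zip(cols[i], cols[j]))
    ]


def find_dependencies(codes: dict[str, list[int]]) -> list[tuple[int, int]]:
    """Pairs (child, parent) where the child bit never fires without the parent."""
    cols = bit_columns(codes)
    deps = []
    for child, parent in permutations(range(len(cols)), 2):
        # Skip trivial cases: a child that never fires, or a parent that always does.
        if not any(cols[child]) or all(cols[parent]):
            continue
        if all(p for c, p in zip(cols[child], cols[parent]) if c):
            deps.append((child, parent))
    return deps


print(find_duals(codes))         # [(0, 1)]
print(find_dependencies(codes))  # [(3, 0)]
```

In the toy data, bits 0 and 1 behave like opposites (hot versus cold), and bit 3 only appears when bit 0 is already on, which is the kind of structure a hierarchy can be built from.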
Layer 4: Overlays (Domain-Specific)
Domain-specific analysis that adds meaning on top of the generic timeline. The triadic overlay maps results to the 63-primitive ontology.
Module Map
| Module | Layer | Purpose |
|---|---|---|
| reptimeline.extractors.base | Extractors | Abstract RepresentationExtractor interface |
| reptimeline.extractors.triadic | Extractors | TriadicGPT-specific implementation |
| reptimeline.core | Core | Data structures: ConceptSnapshot, CodeEvent, Timeline |
| reptimeline.tracker | Tracker | TimelineTracker: births, deaths, connections, curves |
| reptimeline.discovery | Discovery | BitDiscovery: duals, deps, triadic deps, hierarchy |
| reptimeline.autolabel | Discovery | AutoLabeler: 3 naming strategies |
| reptimeline.reconcile | Discovery | Reconciler: discovered vs. theoretical comparison |
| reptimeline.overlays.primitive_overlay | Overlays | Triadic: layer emergence, dual coherence |
| reptimeline.viz.* | Visualization | 4 plot types |
| reptimeline.cli | CLI | Command-line interface |
Adding a New Backend
To support a new discrete representation system, implement the RepresentationExtractor abstract class:
```python
from reptimeline.extractors.base import RepresentationExtractor
from reptimeline.core import ConceptSnapshot


class MyExtractor(RepresentationExtractor):
    def extract(self, checkpoint_path: str, concepts: list[str],
                device: str = "cpu") -> ConceptSnapshot:
        """Load model, run concepts, return snapshot with binary codes."""
        # Load your model from checkpoint_path
        # Run each concept through the model
        # Return ConceptSnapshot with codes dict
        ...

    def similarity(self, code_a: list[int], code_b: list[int]) -> float:
        """Compute similarity [0, 1] between two codes."""
        # e.g., Jaccard similarity for binary codes
        ...

    def shared_features(self, code_a: list[int], code_b: list[int]) -> list[int]:
        """Return indices where both codes are active (=1)."""
        ...
```
The discover_checkpoints and extract_sequence methods are inherited and work without changes: they find model_*step*.pt files and call your extract method on each.
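The similarity stub above mentions Jaccard similarity for binary codes. A minimal sketch of that measure, written as a standalone function that a similarity method could delegate to, might look like this. The function name and the convention for two all-zero codes are assumptions, not part of reptimeline.

```python
def jaccard(code_a: list[int], code_b: list[int]) -> float:
    """Jaccard similarity of two binary codes: shared active bits / all active bits."""
    both = sum(1 for a, b in zip(code_a, code_b) if a and b)
    either = sum(1 for a, b in zip(code_a, code_b) if a or b)
    # Assumed convention: two all-zero codes count as identical.
    return both / either if either else 1.0


print(jaccard([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
print(jaccard([1, 0], [0, 1]))              # 0.0
print(jaccard([0, 0], [0, 0]))              # 1.0
```

Jaccard ignores positions where both bits are off, which suits sparse codes; a Hamming-based similarity would instead reward shared zeros.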
Example: VQ-VAE Backend
```python
from reptimeline.extractors.base import RepresentationExtractor
from reptimeline.core import ConceptSnapshot

# load_vqvae, indices_to_binary, parse_step, and jaccard are placeholder
# helpers that you supply for your own model; reptimeline does not ship them.


class VQVAEExtractor(RepresentationExtractor):
    def extract(self, checkpoint_path, concepts, device="cpu"):
        model = load_vqvae(checkpoint_path)
        codes = {}
        for concept in concepts:
            # Get codebook indices for this input
            indices = model.encode(concept)
            # Convert to binary vector (one-hot per codebook entry)
            binary = indices_to_binary(indices, codebook_size=512)
            codes[concept] = binary
        return ConceptSnapshot(
            step=parse_step(checkpoint_path),
            codes=codes
        )

    def similarity(self, code_a, code_b):
        return jaccard(code_a, code_b)

    def shared_features(self, code_a, code_b):
        return [i for i, (a, b) in enumerate(zip(code_a, code_b)) if a and b]
```
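The VQ-VAE example calls indices_to_binary and parse_step without defining them. One possible way to write these two helpers is sketched below. Both are hypothetical: the multi-hot encoding is only one reading of "one-hot per codebook entry", and the filename format (a step number following the word "step") is an assumption based on the model_*step*.pt pattern.

```python
import re
from pathlib import Path


def indices_to_binary(indices, codebook_size: int) -> list[int]:
    """Multi-hot encode codebook indices: bit i is 1 if entry i was used at least once."""
    binary = [0] * codebook_size
    for index in indices:
        binary[int(index)] = 1
    return binary


def parse_step(checkpoint_path: str) -> int:
    """Pull the training step out of a filename such as model_step5000.pt (assumed format)."""
    match = re.search(r"step[_-]?(\d+)", Path(checkpoint_path).name)
    if match is None:
        raise ValueError(f"no step number found in {checkpoint_path!r}")
    return int(match.group(1))


print(indices_to_binary([3, 0, 3], codebook_size=8))  # [1, 0, 0, 1, 0, 0, 0, 0]
print(parse_step("checkpoints/model_step5000.pt"))    # 5000
```

Note that this encoding discards both the order of the indices and how often each entry was used. If either matters for your model, a longer code with one block of bits per position would preserve it.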
Data Flow
```
Checkpoints --> Extractor.extract_sequence() --> List[ConceptSnapshot]
                                                          |
                                              TimelineTracker.analyze()
                                                          |
                                                       Timeline
                                                     /    |    \
                                         BitDiscovery    Viz    PrimitiveOverlay
                                               |
                                          AutoLabeler
                                               |
                                           Reconciler
```