PR #1896
Record: Flower Brain 6-Cell Ternary Architecture — val_bpb 1.1155 (unlimited compute)
by G3sparky
val_bpb: 1.1155
Architecture: Transformer
Optimizer: —
Artifact Size: 10.4 MB
Training Techniques
Architecture
depth recurrence
6-cell Flower of Life hexagonal topology with specialized cells and recurrent layer reuse.
parameters: {"cells":6,"layers":12,"virtual_layers":17,"dimensions":512}
GQA
Grouped-query attention used in the attention cell.
parameters: {"kv_heads":"8/4"}
Quantization
STE QAT
bits: null
scope: weights
ternary
bits: 2
scope: MLP layers
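The two entries above combine naturally: a BitLinear-style layer whose weights are snapped to {-1, 0, +1} in the forward pass while the straight-through estimator routes gradients to the latent full-precision weights. The absmean scaling below is a common BitNet-style choice and an assumption here:

```python
import torch
import torch.nn as nn

class TernaryLinear(nn.Linear):
    """Ternary QAT layer: quantized forward, STE gradient to fp weights."""
    def forward(self, x):
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-8)           # absmean scale
        w_q = torch.clamp(torch.round(w / scale), -1, 1) * scale
        w_ste = w + (w_q - w).detach()                   # STE: identity grad
        return nn.functional.linear(x, w_ste, self.bias)
```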
int6
bits: 6
scope: attention weights
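For the attention weights, a round-to-nearest int6 quantizer looks like the sketch below. The record pairs int6 with GPTQ, whose error-compensating update order is omitted here, so this is only the naive baseline:

```python
import torch

def quantize_int6(w: torch.Tensor):
    """Symmetric per-tensor int6: codes in [-32, 31] plus one fp scale."""
    scale = w.abs().max().clamp(min=1e-8) / 31
    q = torch.clamp(torch.round(w / scale), -32, 31).to(torch.int8)
    return q, scale  # dequantize with q.float() * scale
```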
Regularization
weight decay
parameters: {"weight_decay":0.095}
Compression
lzma
level: null
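The final artifact compression step with Python's lzma module might look like this; the record leaves the level unset (level: null), so the preset is an assumption:

```python
import lzma

def compress_artifact(packed_bytes: bytes, path: str) -> None:
    # preset is a guess; the record does not specify a compression level.
    with lzma.open(path, "wb", preset=9 | lzma.PRESET_EXTREME) as f:
        f.write(packed_bytes)
```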
Novel Contributions
- 6-cell Flower of Life hexagonal topology for a language model
- Ternary BitLinear weights with STE-based quantization-aware training
- Depth recurrence producing 17 virtual layers from 12 physical layers
- Mixed ternary packing plus int6 GPTQ compression for a compact artifact (see the packing sketch after this list)
- Empirical finding that the void fraction appears to be architecture-determined
- Observation that STE worsened the quantization gap in this setup
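A sketch of the ternary packing referenced in the list above: four 2-bit codes per byte, matching the bits: 2 entry. The {-1, 0, +1} to {0, 1, 2} code mapping is a common convention and an assumption here; the int6 GPTQ path is not shown.

```python
import numpy as np

def pack_ternary(w: np.ndarray) -> np.ndarray:
    """Pack ternary weights {-1, 0, +1} as 2-bit codes, 4 per byte."""
    codes = (w.astype(np.int8) + 1).astype(np.uint8)    # {-1,0,1} -> {0,1,2}
    codes = np.pad(codes.ravel(), (0, -codes.size % 4))  # pad to multiple of 4
    codes = codes.reshape(-1, 4)
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)      # bit offsets in a byte
    return (codes << shifts).sum(axis=1).astype(np.uint8)
```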