mixed int5/int6/int8

Category: Quantization
Used in: 6 PRs
Best BPB: 1.1172
Avg BPB: 1.1934

Hyperparameters Across PRs

| pr_number | bits / scope |
| --- | --- |
| 272 | MLP matrices int5, attention matrices int6, elsewhere int8 |
| 349 | MLP weights int5, attention weights int6, embeddings int8 (FP16 for small tensors) |
| 623 | MLP weights int5, attention weights int6, bigram embeddings int6, token embeddings int8 |
| 678 | MLP int5, attention int6, bigram embeddings int6, token embeddings int8 |
| 1090 | MLP, attention, embeddings |
| 1422 | MLP, attention, embeddings |
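
The common pattern across these PRs is per-group mixed precision: MLP weights get the fewest bits (int5), attention weights slightly more (int6), and embeddings the most (int8). A minimal sketch of that idea, assuming simple symmetric per-tensor quantization (the actual PRs may use different scaling or per-channel schemes; the group names and bit assignments below mirror the table, but the functions themselves are illustrative, not the PRs' code):

```python
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int):
    """Quantize w to a signed `bits`-bit grid with a single per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1          # 15 for int5, 31 for int6, 127 for int8
    scale = float(np.abs(w).max()) / qmax if w.size else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map integer codes back to float for use in the forward pass."""
    return q.astype(np.float32) * scale

# Hypothetical bit assignment per parameter group, following the table above.
BITS_BY_GROUP = {"mlp": 5, "attn": 6, "embed": 8}

rng = np.random.default_rng(0)
params = {
    "mlp": rng.standard_normal((16, 16)).astype(np.float32),
    "attn": rng.standard_normal((16, 16)).astype(np.float32),
    "embed": rng.standard_normal((32, 16)).astype(np.float32),
}

for name, w in params.items():
    q, s = quantize_symmetric(w, BITS_BY_GROUP[name])
    err = np.abs(dequantize(q, s) - w).max()
    print(f"{name}: int{BITS_BY_GROUP[name]}, max abs reconstruction error {err:.4f}")
```

With a per-tensor scale, the worst-case reconstruction error is half a quantization step (scale / 2), so the int5 MLP group trades the most precision for the most size savings while the int8 embeddings stay nearly lossless.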