Commit graph

2686 commits

Author SHA1 Message Date
Pierrick HYMBERT
ac75fbd8c5 gguf-py: dbrx: reverse again the MOE tensors mapping:
    layer.ffn_up_exps   -> Up-projection weights (w1)
    layer.ffn_gate_exps -> Gating weights (v1)
    layer.ffn_down_exps -> Down-projection weights (w2)
2024-04-09 02:41:39 +02:00
Pierrick HYMBERT
ac82aa0e63 gguf-py: revert spaces 2024-04-09 01:26:57 +02:00
Pierrick HYMBERT
c7b9a2e85e llama: dbrx: fix ggml context of the attention outputs weight 2024-04-09 00:58:50 +02:00
Pierrick HYMBERT
55943a281f model: dbrx: convert fix mixed ffn_gate_exps and ffn_down_exps 2024-04-08 21:47:59 +02:00
Pierrick HYMBERT
ea8b58c6cd llama: dbrx: first add the residuals and then do the norm 2024-04-08 21:10:49 +02:00
Pierrick HYMBERT
f30a73bb01 llama: dbrx: rename layer_out_norm to attn_out_norm 2024-04-08 20:38:31 +02:00
Pierrick HYMBERT
e66f1e3448 llama: dbrx: document changes, permute only FFN_DOWN_EXPS. Add a check for ftype 2024-04-08 20:08:54 +02:00
Pierrick HYMBERT
9968952921 llama: dbrx: fix experts 3D tensor layout (again) 2024-04-08 19:37:23 +02:00
Pierrick HYMBERT
18a84fedda llama: dbrx: fix experts 3D tensor layout (again) 2024-04-08 19:12:53 +02:00
Pierrick HYMBERT
48909ed2a7 model: dbrx convert permute experts directly torch, log shape 2024-04-08 19:01:44 +02:00
Pierrick HYMBERT
f20c04f01f llama: factorize moe graph implementation between grok, mixtral and dbrx 2024-04-08 17:45:35 +02:00
Pierrick HYMBERT
21fb24aa45 model: dbrx: convert-hf-to-gguf.py fix experts tensors shapes 2024-04-08 16:55:56 +02:00
Pierrick HYMBERT
81f308ad64 llama: dbrx: fix experts tensor layout 2024-04-08 15:04:18 +02:00
Pierrick HYMBERT
eb0847e6b1 llama: dbrx: load norm eps in hparams 2024-04-08 14:38:21 +02:00
Pierrick HYMBERT
506cc2ea53 llama: dbrx: convert remove previous reverse 2024-04-08 14:09:06 +02:00
Pierrick HYMBERT
35dce3e145 llama: dbrx: rename tensor to actual meaning. Fix normalization in graph. Permute expert tensors to the llama.cpp layout 2024-04-08 14:02:08 +02:00
Pierrick HYMBERT
8e22688401 llama: dbrx: move norm epsilon to convert. Fix missing normalization. 2024-04-08 11:22:24 +02:00
Pierrick HYMBERT
52c6276e12 llama: dbrx: fix k scale 2024-04-08 10:43:36 +02:00
Pierrick HYMBERT
71f9e479aa llama: dbrx: Try another rope type 2024-04-08 01:29:00 +02:00
Pierrick HYMBERT
f8f97e74f9 llama: dbrx: hardcode nn.LayerNorm epsilon 2024-04-08 01:17:33 +02:00
Pierrick HYMBERT
74e6d876f6 llama: dbrx: fix build kv att out tensor name 2024-04-08 00:37:28 +02:00
Pierrick HYMBERT
b01b062ab5 llama: dbrx: fix build kv att out 2024-04-08 00:25:54 +02:00
Pierrick HYMBERT
993f836029 llama: dbrx: move norm2 after attention, fix build kv 2024-04-08 00:11:19 +02:00
Pierrick HYMBERT
2897aa628c llama: dbrx: revert 2024-04-07 23:47:26 +02:00
Pierrick HYMBERT
830e46d7ae llama: dbrx: fix last normalization 2024-04-07 23:40:12 +02:00
Pierrick HYMBERT
0ab1bae854 llama: dbrx: output norm dim 2024-04-07 20:56:53 +02:00
Pierrick HYMBERT
50b4373673 model: dbrx: weird fix expert reshape 2024-04-07 20:14:43 +02:00
Pierrick HYMBERT
e2c919962b model: dbrx: fix again sic expert reshape 2024-04-07 20:10:16 +02:00
Pierrick HYMBERT
c9bddbf253 model: dbrx: fix expert reshape 2024-04-07 19:38:35 +02:00
Pierrick HYMBERT
7dd84b0924 model: dbrx: fix expert reshape 2024-04-07 19:12:24 +02:00
Pierrick HYMBERT
dbfd59114f model: dbrx: fix tensor names mapping broken 2024-04-07 18:52:28 +02:00
Pierrick HYMBERT
f062b834ed model: dbrx: convert experts to f16 2024-04-07 18:47:37 +02:00
Pierrick HYMBERT
d151d8fad9 model: dbrx: convert reshape expert tensors to 3D 2024-04-07 18:41:33 +02:00
Pierrick HYMBERT
e9987c66d0 llama: dbrx: fix tensor qkv number of elements 2024-04-07 18:21:57 +02:00
Pierrick HYMBERT
1bd94270e5 llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
model: dbrx: convert to gguf force experts tensors to have .weight suffix
2024-04-07 17:55:33 +02:00
Pierrick HYMBERT
2449ef48a9 llama: dbrx: no weight suffix in ffn_gate_exps, ffn_up_exps and ffn_down_exps. Output tensor not optional. 2024-04-07 17:55:33 +02:00
Pierrick HYMBERT
8154617ff2 model: dbrx: convert-hf-to-gguf.py support python 3.8 2024-04-07 17:25:39 +02:00
Pierrick HYMBERT
3a9dc2eee2 model: dbrx: convert-hf-to-gguf.py fix 'token_embd.weight' has wrong shape, fix special tokens 2024-04-07 17:21:35 +02:00
Pierrick HYMBERT
d7546fda64 llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix 2024-04-07 15:59:07 +02:00
Pierrick HYMBERT
9e17dad087 model: dbrx: convert-hf-to-gguf.py add chat template 2024-04-07 15:57:36 +02:00
Pierrick HYMBERT
200ce21436 model: dbrx: convert-hf-to-gguf.py fix fix ftype missing, fix tensor names does not suffix with .weight 2024-04-07 15:54:19 +02:00
Pierrick HYMBERT
1fb6d95c1d model: convert-hf-to-gguf.py fix classname conflict with qwen2 2024-04-07 15:40:21 +02:00
Pierrick HYMBERT
61be4b91a6 model: convert-hf-to-gguf.py add _set_vocab_tiktoken gpt2 backed on llama.cpp 2024-04-07 12:15:16 +02:00
Pierrick HYMBERT
dccb012637 llama: dbrx: quantize fix n_attention_wv tensor name 2024-04-07 05:09:17 +02:00
Pierrick HYMBERT
b6522a9f5b model: dbrx: convert fix tokenizer 2024-04-07 05:02:14 +02:00
Pierrick HYMBERT
305ac3b61b llama: dbrx: quantize fix n_attention_wv tensor name 2024-04-07 05:01:33 +02:00
Pierrick HYMBERT
06a59abf0a model: dbrx: convert add n_ff 2024-04-07 03:17:24 +02:00
Pierrick HYMBERT
52c403355f llama: increase maximum experts allowed 2024-04-07 03:16:33 +02:00
Pierrick HYMBERT
7e7cd53ca6 llama: dbrx: remove unnecessary optional tensor on FFN_GATE_EXPS 2024-04-06 23:55:37 +02:00
Pierrick HYMBERT
69856297b9 Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx 2024-04-06 23:53:11 +02:00