Commit graph

2668 commits

Author SHA1 Message Date
Pierrick HYMBERT
71f9e479aa llama: dbrx: Try another rope type 2024-04-08 01:29:00 +02:00
Pierrick HYMBERT
f8f97e74f9 llama: dbrx: hardcode nn.LayerNorm epsilon 2024-04-08 01:17:33 +02:00
Pierrick HYMBERT
74e6d876f6 llama: dbrx: fix build kv att out tensor name 2024-04-08 00:37:28 +02:00
Pierrick HYMBERT
b01b062ab5 llama: dbrx: fix build kv att out 2024-04-08 00:25:54 +02:00
Pierrick HYMBERT
993f836029 llama: dbrx: move norm2 after attention, fix build kv 2024-04-08 00:11:19 +02:00
Pierrick HYMBERT
2897aa628c llama: dbrx: revert 2024-04-07 23:47:26 +02:00
Pierrick HYMBERT
830e46d7ae llama: dbrx: fix last normalization 2024-04-07 23:40:12 +02:00
Pierrick HYMBERT
0ab1bae854 llama: dbrx: output norm dim 2024-04-07 20:56:53 +02:00
Pierrick HYMBERT
50b4373673 model: dbrx: weird fix expert reshape 2024-04-07 20:14:43 +02:00
Pierrick HYMBERT
e2c919962b model: dbrx: fix again sic expert reshape 2024-04-07 20:10:16 +02:00
Pierrick HYMBERT
c9bddbf253 model: dbrx: fix expert reshape 2024-04-07 19:38:35 +02:00
Pierrick HYMBERT
7dd84b0924 model: dbrx: fix expert reshape 2024-04-07 19:12:24 +02:00
Pierrick HYMBERT
dbfd59114f model: dbrx: fix tensor names mapping broken 2024-04-07 18:52:28 +02:00
Pierrick HYMBERT
f062b834ed model: dbrx: convert experts to f16 2024-04-07 18:47:37 +02:00
Pierrick HYMBERT
d151d8fad9 model: dbrx: convert reshape expert tensors to 3D 2024-04-07 18:41:33 +02:00
Pierrick HYMBERT
e9987c66d0 llama: dbrx: fix tensor qkv number of elements 2024-04-07 18:21:57 +02:00
Pierrick HYMBERT
1bd94270e5 llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
model: dbrx: convert to gguf force experts tensors to have .weight suffix
2024-04-07 17:55:33 +02:00
Pierrick HYMBERT
2449ef48a9 llama: dbrx: no weight suffix in ffn_gate_exps, ffn_up_exps and ffn_down_exps. Output tensor not optional. 2024-04-07 17:55:33 +02:00
Pierrick HYMBERT
8154617ff2 model: dbrx: convert-hf-to-gguf.py support python 3.8 2024-04-07 17:25:39 +02:00
Pierrick HYMBERT
3a9dc2eee2 model: dbrx: convert-hf-to-gguf.py fix 'token_embd.weight' has wrong shape, fix special tokens 2024-04-07 17:21:35 +02:00
Pierrick HYMBERT
d7546fda64 llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix 2024-04-07 15:59:07 +02:00
Pierrick HYMBERT
9e17dad087 model: dbrx: convert-hf-to-gguf.py add chat template 2024-04-07 15:57:36 +02:00
Pierrick HYMBERT
200ce21436 model: dbrx: convert-hf-to-gguf.py fix fix ftype missing, fix tensor names does not suffix with .weight 2024-04-07 15:54:19 +02:00
Pierrick HYMBERT
1fb6d95c1d model: convert-hf-to-gguf.py fix classname conflict with qwen2 2024-04-07 15:40:21 +02:00
Pierrick HYMBERT
61be4b91a6 model: convert-hf-to-gguf.py add _set_vocab_tiktoken gpt2 backed on llama.cpp 2024-04-07 12:15:16 +02:00
Pierrick HYMBERT
dccb012637 llama: dbrx: quantize fix n_attention_wv tensor name 2024-04-07 05:09:17 +02:00
Pierrick HYMBERT
b6522a9f5b model: dbrx: convert fix tokenizer 2024-04-07 05:02:14 +02:00
Pierrick HYMBERT
305ac3b61b llama: dbrx: quantize fix n_attention_wv tensor name 2024-04-07 05:01:33 +02:00
Pierrick HYMBERT
06a59abf0a model: dbrx: convert add n_ff 2024-04-07 03:17:24 +02:00
Pierrick HYMBERT
52c403355f llama: increase maximum experts allowed 2024-04-07 03:16:33 +02:00
Pierrick HYMBERT
7e7cd53ca6 llama: dbrx: remove unnecessary optional tensor on FFN_GATE_EXPS 2024-04-06 23:55:37 +02:00
Pierrick HYMBERT
69856297b9 Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx 2024-04-06 23:53:11 +02:00
Pierrick HYMBERT
4f12a580d9 llama: dbrx: remove not existing condition on empty output layer 2024-04-06 23:35:23 +02:00
Pierrick HYMBERT
fe8089871e model: dbrx: fix missing embedding tensor, mix with output layer 2024-04-06 23:27:29 +02:00
Pierrick HYMBERT
9c7dedb0f3 llama: dbrx: no attention output layer 2024-04-06 22:25:37 +02:00
Pierrick HYMBERT
76f266beef scripts: get-wikitext-2 add unzip 2024-04-06 21:10:19 +02:00
Pierrick HYMBERT
03da419fc0 llama: dbrx: remove wrong attn output layer in model arch 2024-04-06 20:43:46 +02:00
Pierrick HYMBERT
916b91852b convert: dbrx: fix remove wrong ATTN_OUT_NORM tensor, add output layer mapping 2024-04-06 20:30:30 +02:00
Pierrick HYMBERT
c8e6f903e0 doc: dbrx: add the model as supported 2024-04-06 20:09:01 +02:00
Pierrick HYMBERT
0a35f5881b convert: dbrx: fix mixed up and down expert tensors
llama: dbrx: review graph
2024-04-06 19:56:37 +02:00
Pierrick HYMBERT
e3c1e8127c convert: dbrx: fix mixed up and down expert tensors 2024-04-06 19:21:43 +02:00
Pierrick HYMBERT
a7f9a3eafc dbrx: minor 2024-04-06 19:09:04 +02:00
Georgi Gerganov
54ea0698fb sync : ggml 2024-04-06 18:27:46 +03:00
Daniel Bevenius
b66aec675c backend : fix typo in scheduler documentation (ggml/781)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-04-06 17:42:26 +03:00
Clint Herron
57dd02c44b Tests: Added integration tests for GBNF parser (#6472)
* Added integration tests for GBNF parser to validate correctness of parsing, as well as correctness of string matching. Intended for use to pin behavior while working on performance improvements.

* Fixing whitespace errors and cleaning error message alert to be clearer.

* Removing hacky include to llama.cpp from grammar integration test now that needed functions are available via internal API.

* Comment cleanup.

* Reorganizing tests for readability.

* Cleaning up debug message to make a bit more sense.
2024-04-06 10:31:33 -04:00
Pierrick HYMBERT
e4f8ee4f48 llama: support dbrx fix norm type 2024-04-06 16:14:58 +02:00
Pierrick HYMBERT
09210334bf model: dbrx fix python linter in convert-hf-to-gguf.py 2024-04-06 16:00:32 +02:00
Pierrick HYMBERT
c0beb3cf7e llama: add label for model 132B 2024-04-06 15:58:17 +02:00
Pierrick HYMBERT
3937100adb model: dbrx, trust remote code 2024-04-06 15:57:57 +02:00
Pierrick HYMBERT
3e3d2d127c gguf-py: remove wrong clip -> clamp 2024-04-06 15:46:47 +02:00