Pierrick HYMBERT
|
74e6d876f6
|
llama: dbrx: fix build kv att out tensor name
|
2024-04-08 00:37:28 +02:00 |
|
Pierrick HYMBERT
|
b01b062ab5
|
llama: dbrx: fix build kv att out
|
2024-04-08 00:25:54 +02:00 |
|
Pierrick HYMBERT
|
993f836029
|
llama: dbrx: move norm2 after attention, fix build kv
|
2024-04-08 00:11:19 +02:00 |
|
Pierrick HYMBERT
|
2897aa628c
|
llama: dbrx: revert
|
2024-04-07 23:47:26 +02:00 |
|
Pierrick HYMBERT
|
830e46d7ae
|
llama: dbrx: fix last normalization
|
2024-04-07 23:40:12 +02:00 |
|
Pierrick HYMBERT
|
0ab1bae854
|
llama: dbrx: output norm dim
|
2024-04-07 20:56:53 +02:00 |
|
Mark Fairbairn
|
855f54402e
|
Change Windows AMD example to release build to make inference much faster. (#6525)
|
2024-04-07 20:52:19 +02:00 |
|
Georgi Gerganov
|
b909236c0b
|
flake.lock: Update (#6517)
Flake lock file updates:
• Updated input 'flake-parts':
'github:hercules-ci/flake-parts/f7b3c975cf067e56e7cda6cb098ebe3fb4d74ca2' (2024-03-01)
→ 'github:hercules-ci/flake-parts/9126214d0a59633752a136528f5f3b9aa8565b7d' (2024-04-01)
• Updated input 'flake-parts/nixpkgs-lib':
'github:NixOS/nixpkgs/1536926ef5621b09bba54035ae2bb6d806d72ac8?dir=lib' (2024-02-29)
→ 'github:NixOS/nixpkgs/d8fe5e6c92d0d190646fb9f1056741a229980089?dir=lib' (2024-03-29)
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/d8fe5e6c92d0d190646fb9f1056741a229980089' (2024-03-29)
→ 'github:NixOS/nixpkgs/fd281bd6b7d3e32ddfa399853946f782553163b5' (2024-04-03)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
|
2024-04-07 11:25:30 -07:00 |
|
Pierrick HYMBERT
|
50b4373673
|
model: dbrx: weird fix expert reshape
|
2024-04-07 20:14:43 +02:00 |
|
Pierrick HYMBERT
|
e2c919962b
|
model: dbrx: fix again sic expert reshape
|
2024-04-07 20:10:16 +02:00 |
|
Pierrick HYMBERT
|
c9bddbf253
|
model: dbrx: fix expert reshape
|
2024-04-07 19:38:35 +02:00 |
|
DAN™
|
e0717e751e
|
Add GritLM as supported models. (#6513)
|
2024-04-07 19:33:59 +02:00 |
|
Pierrick HYMBERT
|
7dd84b0924
|
model: dbrx: fix expert reshape
|
2024-04-07 19:12:24 +02:00 |
|
Pierrick HYMBERT
|
dbfd59114f
|
model: dbrx: fix tensor names mapping broken
|
2024-04-07 18:52:28 +02:00 |
|
Pierrick HYMBERT
|
f062b834ed
|
model: dbrx: convert experts to f16
|
2024-04-07 18:47:37 +02:00 |
|
Pierrick HYMBERT
|
d151d8fad9
|
model: dbrx: convert reshape expert tensors to 3D
|
2024-04-07 18:41:33 +02:00 |
|
Pierrick HYMBERT
|
e9987c66d0
|
llama: dbrx: fix tensor qkv number of elements
|
2024-04-07 18:21:57 +02:00 |
|
Pierrick HYMBERT
|
1bd94270e5
|
llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
model: dbrx: convert to gguf force experts tensors to have .weight suffix
|
2024-04-07 17:55:33 +02:00 |
|
Pierrick HYMBERT
|
2449ef48a9
|
llama: dbrx: no weight suffix in ffn_gate_exps, ffn_up_exps and ffn_down_exps. Output tensor not optional.
|
2024-04-07 17:55:33 +02:00 |
|
Pierrick HYMBERT
|
8154617ff2
|
model: dbrx: convert-hf-to-gguf.py support python 3.8
|
2024-04-07 17:25:39 +02:00 |
|
Pierrick HYMBERT
|
3a9dc2eee2
|
model: dbrx: convert-hf-to-gguf.py fix 'token_embd.weight' has wrong shape, fix special tokens
|
2024-04-07 17:21:35 +02:00 |
|
Georgi Gerganov
|
c37247796b
|
sync : ggml
|
2024-04-07 17:05:51 +03:00 |
|
Slava Primenko
|
f77261a7c5
|
ggml: bypass code incompatible with CUDA < 11.1 (whisper/2020)
`cudaHostRegisterReadOnly` parameter was only introduced in CUDA 11.1
See this issue for more details:
https://github.com/ggerganov/examples/whisper/whisper.cpp/issues/2007
|
2024-04-07 17:05:40 +03:00 |
|
Pierrick HYMBERT
|
d7546fda64
|
llama: quantize: remove wrong look for tensor qkv name as it was badly missing the .weight suffix
|
2024-04-07 15:59:07 +02:00 |
|
Pierrick HYMBERT
|
9e17dad087
|
model: dbrx: convert-hf-to-gguf.py add chat template
|
2024-04-07 15:57:36 +02:00 |
|
Pierrick HYMBERT
|
200ce21436
|
model: dbrx: convert-hf-to-gguf.py fix ftype missing, fix tensor names does not suffix with .weight
|
2024-04-07 15:54:19 +02:00 |
|
Pierrick HYMBERT
|
1fb6d95c1d
|
model: convert-hf-to-gguf.py fix classname conflict with qwen2
|
2024-04-07 15:40:21 +02:00 |
|
Georgi Gerganov
|
43e8995e75
|
scripts : sync ggml-cuda folder
|
2024-04-07 16:08:12 +03:00 |
|
limitedAtonement
|
9472bce308
|
Run make to build the project (#6457)
|
2024-04-07 13:05:40 +02:00 |
|
Pierrick HYMBERT
|
61be4b91a6
|
model: convert-hf-to-gguf.py add _set_vocab_tiktoken gpt2 backed on llama.cpp
|
2024-04-07 12:15:16 +02:00 |
|
Pierrick HYMBERT
|
dccb012637
|
llama: dbrx: quantize fix n_attention_wv tensor name
|
2024-04-07 05:09:17 +02:00 |
|
Pierrick HYMBERT
|
b6522a9f5b
|
model: dbrx: convert fix tokenizer
|
2024-04-07 05:02:14 +02:00 |
|
Pierrick HYMBERT
|
305ac3b61b
|
llama: dbrx: quantize fix n_attention_wv tensor name
|
2024-04-07 05:01:33 +02:00 |
|
Neo Zhang Jianyu
|
d4f220a5cc
|
support/fix OPs GGML_TYPE_IQ4_NL, GGML_TYPE_IQ4_XS, GGML_TYPE_IQ3_XXS, GGML_TYPE_IQ3_S, GGML_TYPE_IQ2_XXS, GGML_TYPE_IQ2_XS, GGML_TYPE_IQ2_S, GGML_TYPE_IQ1_S, GGML_TYPE_IQ1_M (#6521)
|
2024-04-07 10:55:59 +08:00 |
|
Pierrick HYMBERT
|
06a59abf0a
|
model: dbrx: convert add n_ff
|
2024-04-07 03:17:24 +02:00 |
|
Pierrick HYMBERT
|
52c403355f
|
llama: increase maximum experts allowed
|
2024-04-07 03:16:33 +02:00 |
|
Pierrick HYMBERT
|
7e7cd53ca6
|
llama: dbrx: remove unnecessary optional tensor on FFN_GATE_EXPS
|
2024-04-06 23:55:37 +02:00 |
|
Pierrick HYMBERT
|
69856297b9
|
Merge remote-tracking branch 'origin/master' into hp/model/support-dbrx
|
2024-04-06 23:53:11 +02:00 |
|
Pierrick HYMBERT
|
4f12a580d9
|
llama: dbrx: remove not existing condition on empty output layer
|
2024-04-06 23:35:23 +02:00 |
|
Pierrick HYMBERT
|
fe8089871e
|
model: dbrx: fix missing embedding tensor, mix with output layer
|
2024-04-06 23:27:29 +02:00 |
|
Pierrick HYMBERT
|
9c7dedb0f3
|
llama: dbrx: no attention output layer
|
2024-04-06 22:25:37 +02:00 |
|
Pierrick HYMBERT
|
76f266beef
|
scripts: get-wikitext-2 add unzip
|
2024-04-06 21:10:19 +02:00 |
|
Pierrick HYMBERT
|
03da419fc0
|
llama: dbrx: remove wrong attn output layer in model arch
|
2024-04-06 20:43:46 +02:00 |
|
Pierrick HYMBERT
|
916b91852b
|
convert: dbrx: fix remove wrong ATTN_OUT_NORM tensor, add output layer mapping
|
2024-04-06 20:30:30 +02:00 |
|
Pierrick HYMBERT
|
c8e6f903e0
|
doc: dbrx: add the model as supported
|
2024-04-06 20:09:01 +02:00 |
|
Pierrick HYMBERT
|
0a35f5881b
|
convert: dbrx: fix mixed up and down expert tensors
llama: dbrx: review graph
|
2024-04-06 19:56:37 +02:00 |
|
Pierrick HYMBERT
|
e3c1e8127c
|
convert: dbrx: fix mixed up and down expert tensors
|
2024-04-06 19:21:43 +02:00 |
|
Pierrick HYMBERT
|
a7f9a3eafc
|
dbrx: minor
|
2024-04-06 19:09:04 +02:00 |
|
Georgi Gerganov
|
54ea0698fb
|
sync : ggml
|
2024-04-06 18:27:46 +03:00 |
|
Daniel Bevenius
|
b66aec675c
|
backend : fix typo in scheduler documentation (ggml/781)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
|
2024-04-06 17:42:26 +03:00 |
|