Commit graph

3686 commits

Author SHA1 Message Date
Molly Sophia
5f00c52be0 llama: rwkv6: Remove unused nodes
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
e0ea51144e llama: rwkv6: Keep `time_mix_w1/w2` as F32
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
601b5920c6 converter: Match `new_name` instead of `name` for float32 explicit tensors
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
6d69fd77b1 llama: rwkv6: Add kv `time_mix_extra_dim` and `time_decay_extra_dim`
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
c414a24a5a llama: rwkv6: Make use of key `feed_forward_length`
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
87a29014a4 converter: Use class name `Rwkv6Model`
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
7756afd8dd llama: rwkv6: Apply code style and misc changes
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
e94778ade0 llama: rwkv6: Use `ggml_norm` instead of `ggml_group_norm`
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:33 +08:00
Molly Sophia
57decb4a38 Update src/llama.cpp
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:33 +08:00
Molly Sophia
f5d955d2fe llama: rwkv6: Use the new advanced batch splits
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:32 +08:00
Molly Sophia
6da6aa48b0 llama: rwkv6: Add quantization tensor exclusion
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:06 +08:00
Molly Sophia
c165e34629 llama: rwkv6: Clean up
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:06 +08:00
Molly Sophia
ee1b78c091 llama: rwkv6: Fix group_norm assertion failure with Metal
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
683d70cb68 llama: rwkv6: Fix tensor loading for 7B/14B models
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
b0f4fe5279 llama: rwkv6: Detect model.type
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
276d53b18f build_rwkv6: Simplify graph
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
12fbe1ade2 Use MODEL_ARCH.RWKV6 instead of MODEL_ARCH.RWKV
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
5afa3eff3a Update convert_hf_to_gguf.py
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
ae9936a80d Update convert_hf_to_gguf.py
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
8aa711ad98 ggml: Add backward computation for unary op `exp`
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
c6955525b4 Update convert_hf_to_gguf.py
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
7f2e370fa2 convert_hf_to_gguf: rwkv tokenizer: Don't escape sequences manually
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
18decea3ed convert_hf_to_gguf: rwkv: Avoid using `eval`
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
8bc1f9ae80 build_rwkv: Avoid using inplace operations
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
6ae2f4866f Remove trailing whitespaces
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
01dcf4bb77 Fix parallel inferencing for RWKV
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:04 +08:00
Molly Sophia
98ce5f43f0 Fix offloading layers to CUDA
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
903089b5eb Add `wkv.head_size` key for RWKV
so it doesn't reuse Mamba ssm parameters

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
8d498c7075 Add `rescale_every_n_layers` parameter
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
0784a0cf26 RWKV v6 graph building
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:20 +08:00
Molly Sophia
5732de89b7 ggml: Add unary operator Exp
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
0e5ac349f8 Fix rwkv tokenizer
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
a180b63b49 Load more tensors for rwkv v6
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
700dad1b86 Fix build
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Layl Bongers
b3b17e05fe Add placeholder llm_build_time_mix 2024-08-28 10:20:24 +08:00
Layl Bongers
3cbeffc50f Add time mix output loading 2024-08-28 10:20:24 +08:00
Layl Bongers
b409fd8e11 Add remaining time mix parameters 2024-08-28 10:20:24 +08:00
Layl Bongers
dd3aa3d40e Add time mix KVRG & correct merge mistake 2024-08-28 10:20:24 +08:00
Layl Bongers
5479588569 Add rwkv5 layer norms 2024-08-28 10:20:24 +08:00
Layl Bongers
4e23d9715b Add logits conversion to rwkv5 2024-08-28 10:20:24 +08:00
Layl Bongers
a866789603 Add workaround for kv cache 2024-08-28 10:20:24 +08:00
Layl Bongers
a0aae8d671 Add (broken) placeholder graph builder for RWKV 2024-08-28 10:20:24 +08:00
Layl Bongers
e92c74f4a1 Fix model loading 2024-08-28 10:20:24 +08:00
Layl Bongers
7cac72a80b Do not use special tokens when matching in RWKV tokenizer 2024-08-28 10:20:24 +08:00
Molly Sophia
865167d01a Fix build
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Layl Bongers
dc0767f4b3 Add RWKV tokenization 2024-08-28 10:20:24 +08:00
Molly Sophia
8d2eca3507 convert_hf_to_gguf: Add support for RWKV v6
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Georgi Gerganov
20f1789dfb vulkan : fix build (#0)
ggml-ci
2024-08-27 22:41:27 +03:00
Georgi Gerganov
231cff5f6f sync : ggml 2024-08-27 22:41:27 +03:00
Xie Yanbo
3246fe84d7 Fix minicpm example directory (#9111) 2024-08-27 14:33:08 +02:00