Molly Sophia
7f2ef56639
llama: rwkv6: Add lora for some supported tensors
...
Currently att.key/receptance/value/gate/output, ffn.receptance/key/value, as well as head.weight
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-30 12:11:31 +08:00
Molly Sophia
7444046c47
llama: rwkv6: Apply code format changes
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
5f00c52be0
llama: rwkv6: Remove unused nodes
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
e0ea51144e
llama: rwkv6: Keep `time_mix_w1/w2` as F32
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
601b5920c6
converter: Match `new_name` instead of `name` for float32 explicit tensors
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
6d69fd77b1
llama: rwkv6: Add kv `time_mix_extra_dim` and `time_decay_extra_dim`
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:34 +08:00
Molly Sophia
c414a24a5a
llama: rwkv6: Make use of key `feed_forward_length`
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
87a29014a4
converter: Use class name `Rwkv6Model`
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
7756afd8dd
llama: rwkv6: Apply code style and misc changes
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:33 +08:00
Molly Sophia
e94778ade0
llama: rwkv6: Use `ggml_norm` instead of `ggml_group_norm`
...
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:33 +08:00
Molly Sophia
57decb4a38
Update src/llama.cpp
...
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:33 +08:00
Molly Sophia
f5d955d2fe
llama: rwkv6: Use the new advanced batch splits
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:32 +08:00
Molly Sophia
6da6aa48b0
llama: rwkv6: Add quantization tensor exclusion
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:06 +08:00
Molly Sophia
c165e34629
llama: rwkv6: Clean up
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:06 +08:00
Molly Sophia
ee1b78c091
llama: rwkv6: Fix group_norm assertion failure with Metal
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
683d70cb68
llama: rwkv6: Fix tensor loading for 7B/14B models
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
b0f4fe5279
llama: rwkv6: Detect model.type
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
276d53b18f
build_rwkv6: Simplify graph
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
12fbe1ade2
Use MODEL_ARCH.RWKV6 instead of MODEL_ARCH.RWKV
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
5afa3eff3a
Update convert_hf_to_gguf.py
...
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
ae9936a80d
Update convert_hf_to_gguf.py
...
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
8aa711ad98
ggml: Add backward computation for unary op `exp`
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
c6955525b4
Update convert_hf_to_gguf.py
...
Co-authored-by: compilade <git@compilade.net>
2024-08-28 10:22:05 +08:00
Molly Sophia
7f2e370fa2
convert_hf_to_gguf: rwkv tokenizer: Don't escape sequences manually
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
18decea3ed
convert_hf_to_gguf: rwkv: Avoid using `eval`
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
8bc1f9ae80
build_rwkv: Avoid using inplace operations
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
6ae2f4866f
Remove trailing whitespaces
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:05 +08:00
Molly Sophia
01dcf4bb77
Fix parallel inferencing for RWKV
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:22:04 +08:00
Molly Sophia
98ce5f43f0
Fix offloading layers to CUDA
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
903089b5eb
Add `wkv.head_size` key for RWKV
...
so it doesn't reuse Mamba ssm parameters
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
8d498c7075
Add `rescale_every_n_layers` parameter
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:21 +08:00
Molly Sophia
0784a0cf26
RWKV v6 graph building
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:21:20 +08:00
Molly Sophia
5732de89b7
ggml: Add unary operator Exp
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
0e5ac349f8
Fix rwkv tokenizer
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
a180b63b49
Load more tensors for rwkv v6
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Molly Sophia
700dad1b86
Fix build
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Layl Bongers
b3b17e05fe
Add placeholder llm_build_time_mix
2024-08-28 10:20:24 +08:00
Layl Bongers
3cbeffc50f
Add time mix output loading
2024-08-28 10:20:24 +08:00
Layl Bongers
b409fd8e11
Add remaining time mix parameters
2024-08-28 10:20:24 +08:00
Layl Bongers
dd3aa3d40e
Add time mix KVRG & correct merge mistake
2024-08-28 10:20:24 +08:00
Layl Bongers
5479588569
Add rwkv5 layer norms
2024-08-28 10:20:24 +08:00
Layl Bongers
4e23d9715b
Add logits conversion to rwkv5
2024-08-28 10:20:24 +08:00
Layl Bongers
a866789603
Add workaround for kv cache
2024-08-28 10:20:24 +08:00
Layl Bongers
a0aae8d671
Add (broken) placeholder graph builder for RWKV
2024-08-28 10:20:24 +08:00
Layl Bongers
e92c74f4a1
Fix model loading
2024-08-28 10:20:24 +08:00
Layl Bongers
7cac72a80b
Do not use special tokens when matching in RWKV tokenizer
2024-08-28 10:20:24 +08:00
Molly Sophia
865167d01a
Fix build
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Layl Bongers
dc0767f4b3
Add RWKV tokenization
2024-08-28 10:20:24 +08:00
Molly Sophia
8d2eca3507
convert_hf_to_gguf: Add support for RWKV v6
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-28 10:20:24 +08:00
Georgi Gerganov
20f1789dfb
vulkan : fix build (#0)
...
ggml-ci
2024-08-27 22:41:27 +03:00