hongruichen
ff601abc1c
add todo
2024-07-16 00:05:40 +08:00
hongruichen
f32327e2b2
remove multiple declarations of log in unit test
2024-07-15 12:06:12 +08:00
hongruichen
cd5a7331f7
add cpu backend as cross reference
2024-07-15 10:55:17 +08:00
hongruichen
4410fd6563
format with clang-format
2024-07-15 10:30:57 +08:00
hongruichen
c46b4deea9
[unit test] init all tensor by one function
2024-07-15 10:23:19 +08:00
hongruichen
30b40006cc
remove unused declarations
2024-07-14 23:50:11 +08:00
hongruichen
148ceab70c
add log op
2024-07-14 23:00:50 +08:00
hongruichen
c1e2283887
expose op at unit test
2024-07-13 11:07:06 +08:00
hongruichen
100ccd5e7f
add unary op template and more ops
2024-07-13 00:55:34 +08:00
hongruichen
7cbc4fbd8c
add mul
2024-07-12 23:26:38 +08:00
hongruichen
e3aa43adbd
suppress warning
2024-07-12 23:26:11 +08:00
hongruichen
0eb595cc6e
use table to simplify the op mapping
2024-07-12 23:22:29 +08:00
hongruichen
f0894d897a
wip
...
wip
2024-07-12 19:57:34 +08:00
hongruichen
be3aa9631f
use template function directly
2024-07-11 11:18:06 +08:00
hongruichen
8932135fdb
add sqrt and mul ops
2024-07-11 00:08:08 +08:00
hongruichen
7ea28a6fac
add helper function for binary op
2024-07-10 23:39:03 +08:00
hongruichen
b6f29273f0
add function to get graph from cache
2024-07-10 23:08:32 +08:00
hongruichen
80051cfc4d
remove unused variables
2024-07-10 19:57:47 +08:00
hongruichen
b49b501e26
fix sprintf type
2024-07-10 19:48:57 +08:00
hongruichen
3feb574bf0
merge register_rpc_mem into alloc_rpc_mem
2024-07-10 19:40:02 +08:00
hongruichen
e97d3a6c48
fix tensor buffer allocation
...
add log
commit qnn buffer after changed
add log
register_rpc_mem 2 times
update input tensors before graph finalize
default to QNN_TENSORMEMTYPE_RAW
set new tensors at execute
move write input tensors to exec
check if mem registered before actual do
register rpc mem once allocated
2024-07-10 19:32:39 +08:00
hongruichen
dc7d83e121
add log
2024-07-10 00:33:23 +08:00
hongruichen
9add256efe
use helper function instead
2024-07-10 00:31:39 +08:00
hongruichen
a7be0693ba
add log
2024-07-10 00:29:43 +08:00
hongruichen
af869fd636
fix compiling error in debug build
2024-07-10 00:23:51 +08:00
Hongrui Chen
5f2e3918f6
refactoring ggml_qnn_tensor
2024-07-09 19:58:46 +08:00
Hongrui Chen
874216b9c8
remove unused members
2024-07-07 22:32:43 +08:00
hongruichen
263ffa962e
small opt of the qnn graph config init
2024-07-05 23:07:27 +08:00
hongruichen
4b0f6b0cd6
add helper function to get Qnn_TensorType_t from ggml_tensor
2024-07-05 19:37:58 +08:00
hongruichen
0f2e68713c
move tensor related function to utils
2024-07-05 19:02:38 +08:00
hongruichen
58cec14092
reformat
2024-07-05 17:38:54 +08:00
hongruichen
13dc3a02c3
use qnn graph inside add and mul ops
2024-07-05 13:27:16 +08:00
hongruichen
a688ed324b
add op param to add_nodes
2024-07-05 13:07:48 +08:00
hongruichen
4b2ee61f62
move graph map to backend object
2024-07-05 11:58:47 +08:00
hongruichen
ca0d999c2a
add ggml_qnn_graph
2024-07-05 11:35:18 +08:00
hongruichen
000240cf62
add clang format file and reformatting
2024-07-04 23:29:31 +08:00
hongruichen
38f88d5fb1
fix compiling error after merge latest master
2024-07-03 00:13:53 +08:00
hongruichen
8b677d1b2f
move qnn backend into sub folder
2024-07-02 19:42:14 +08:00
hongruichen
3808a4c1e0
Merge branch 'master' into dev-refactoring
2024-07-01 22:52:08 +08:00
Roni
0ddeff1023
readme : update tool list ( #8209 )
...
* Added gppm to Tool list in README
* Update README.md
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-01 15:48:16 +03:00
Michael Francis
3840b6f593
nix : enable curl ( #8043 )
...
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-01 14:47:04 +03:00
Georgi Gerganov
257f8e41e2
nix : remove OpenCL remnants ( #8235 )
...
* nix : remove OpenCL remnants
* minor : remove parentheses
2024-07-01 14:46:18 +03:00
iacore
694c59cb42
Document BERT support. ( #8205 )
...
* Update README.md
document BERT support
* Update README.md
2024-07-01 13:40:58 +02:00
zhentaoyu
197fe6c1d7
[SYCL] Update SYCL-Rope op and Refactor ( #8157 )
...
* align with rope.cu and move sycl-op to a single file
2024-07-01 19:39:06 +08:00
Georgi Gerganov
d0a7145ba9
flake.lock: Update ( #8218 )
2024-06-30 16:09:34 -07:00
Xuan Son Nguyen
9ef0780062
Fix new line issue with chat template, disable template when in-prefix/suffix is set ( #8203 )
...
* preserve new line llama_chat_format_single
* disable chat template if in-prefix/suffix is set
* remove redundant change
2024-06-30 20:27:13 +02:00
Andrei
1c5eba6f8e
llama: Add attention and final logit soft-capping, update scaling factor to Gemma2 ( #8197 )
...
* Add attention and final logit softcapping.
* fix
* Add custom add_ functions
* Disable flash attention for Gemma2
* Update src/llama.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* Add default value for attention and final logit softcap value
* Add custom kq scaling from Gemma2Attention
* Remove custom pre attention scaling and use computed value instead.
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-06-29 23:44:08 -04:00
Xuan Son Nguyen
72272b83a3
fix code typo in llama-cli ( #8198 )
2024-06-29 00:14:20 +02:00
Olivier Chafik
8748d8ac6f
json: attempt to skip slow tests when running under emulator ( #8189 )
2024-06-28 18:02:05 +01:00
Xuan Son Nguyen
26a39bbd6b
Add MiniCPM, Deepseek V2 chat template + clean up llama_chat_apply_template_internal ( #8172 )
...
* tmp_contains
* minicpm chat template
* add DeepSeek Lite template
* change deepseek-lite to deepseek2
* correct code comment
* correct code from master branch
2024-06-28 15:11:44 +02:00