Georgi Gerganov
b8efb0725d
llama.vim : minor [no ci]
2024-10-21 11:00:22 +03:00
Georgi Gerganov
fe78c39399
llama.vim : fix large chunk accept + comments [no ci]
2024-10-21 11:00:22 +03:00
Georgi Gerganov
6bb6e6dd80
llama.vim : display ring capacity [no ci]
2024-10-21 11:00:22 +03:00
Georgi Gerganov
1600d846b6
llama.vim : complete only within the local scope [no ci]
2024-10-21 11:00:22 +03:00
Georgi Gerganov
d1b8b215d5
llama.vim : fix repetitions of existing text
2024-10-21 11:00:21 +03:00
Georgi Gerganov
4583aef12b
llama.vim : final touches
...
ggml-ci
2024-10-21 11:00:21 +03:00
Georgi Gerganov
847c8c023e
llama.vim : update infill API params [no ci]
2024-10-21 11:00:21 +03:00
Georgi Gerganov
060573f7e8
llama.vim : add comments [no ci]
2024-10-21 11:00:21 +03:00
Georgi Gerganov
42a9008b31
llama.vim : process extra chunks in the background [no ci]
2024-10-21 11:00:21 +03:00
Georgi Gerganov
0c1f51b73e
llama : improve infill sampler
...
ggml-ci
2024-10-21 11:00:20 +03:00
Georgi Gerganov
e4be74b4b7
llama.vim : add top_p + improve responsiveness + fix edge cases
2024-10-21 11:00:20 +03:00
Georgi Gerganov
25ecb35c4f
llama.vim : simplify job logic + improve robustness and responsiveness
2024-10-21 11:00:20 +03:00
Georgi Gerganov
9f8fa900f6
llama.vim : fix repetitions [no ci]
2024-10-21 11:00:20 +03:00
Georgi Gerganov
ae76a092b8
llama.vim : pass filenames for each chunk
...
ggml-ci
2024-10-21 11:00:20 +03:00
Georgi Gerganov
916c2ee3fd
llama : simplify infill sampler
2024-10-21 11:00:19 +03:00
Georgi Gerganov
bc2857b88c
llama.vim : async context processing
...
ggml-ci
2024-10-21 11:00:19 +03:00
Georgi Gerganov
2960510153
llama.vim : do not auto-fim when far from the end of the line [no ci]
2024-10-21 11:00:19 +03:00
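The rule in the commit above ("do not auto-fim when far from the end of the line") can be sketched as follows. This is an illustrative Python model, not the plugin's actual Vimscript; the helper name and the trailing-text threshold are assumptions:

```python
# Hedged sketch: only trigger automatic fill-in-the-middle (FIM) when the
# cursor is near the end of the current line, i.e. when the text to the
# right of the cursor is short. Threshold value is an assumption.

def should_auto_fim(line, col, max_suffix_len=3):
    """line: current line text; col: 0-based cursor column."""
    suffix = line[col:].strip()
    return len(suffix) <= max_suffix_len
```
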
Georgi Gerganov
d81a0ac185
llama.vim : do not evict certain chunks [no ci]
2024-10-21 11:00:19 +03:00
Georgi Gerganov
27d53cb4ee
llama.vim : logic to evict old chunks that are similar to the new one
2024-10-21 11:00:19 +03:00
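The eviction logic named in the commit above can be sketched as follows. The similarity measure, threshold, and ring capacity below are illustrative assumptions, not the plugin's actual values:

```python
# Hedged sketch: before adding a new chunk to the ring context, evict old
# chunks that are too similar to it, then evict the oldest chunk if the
# ring exceeds its capacity.

def similarity(a, b):
    """Fraction of lines in chunk a that also appear in chunk b."""
    lines_a, lines_b = a.splitlines(), set(b.splitlines())
    if not lines_a:
        return 0.0
    return sum(1 for l in lines_a if l in lines_b) / len(lines_a)

def add_chunk(ring, new_chunk, capacity=16, threshold=0.5):
    # drop old chunks that mostly duplicate the incoming one
    ring[:] = [c for c in ring if similarity(c, new_chunk) < threshold]
    ring.append(new_chunk)
    if len(ring) > capacity:
        ring.pop(0)  # evict the oldest chunk when over capacity
```
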
Georgi Gerganov
f794549bae
llama.vim : gather chunk on leaving buffer [no ci]
2024-10-21 11:00:18 +03:00
Georgi Gerganov
27bc11da0f
llama.vim : update server command [no ci]
2024-10-21 11:00:18 +03:00
Georgi Gerganov
b8890229b6
llama.vim : add ring context from opened files and yanked text
2024-10-21 11:00:18 +03:00
Georgi Gerganov
4f46e29b09
llama : print more info about control tokens
2024-10-21 11:00:18 +03:00
Georgi Gerganov
491f211b4c
llama : improve infill sampler
...
ggml-ci
2024-10-21 11:00:18 +03:00
Georgi Gerganov
5624e919df
llama.vim : fix docs [no ci]
2024-10-21 11:00:17 +03:00
Georgi Gerganov
c9a46f4bd7
llama.vim : minor [no ci]
2024-10-21 11:00:17 +03:00
Georgi Gerganov
865d9bc48a
llama : clean-up
...
ggml-ci
2024-10-21 11:00:17 +03:00
Georgi Gerganov
4b1bd81661
llama : simplify infill sampler
2024-10-21 11:00:17 +03:00
Georgi Gerganov
2e8c350a5f
llama.vim : fix edge cases
2024-10-21 11:00:16 +03:00
Georgi Gerganov
6669b550db
llama.vim : set time limit for the generation phase
2024-10-21 11:00:16 +03:00
Georgi Gerganov
c507a65af5
llama.vim : async
2024-10-21 11:00:16 +03:00
Georgi Gerganov
41053f92d3
llama.vim : simplify init and cancel + auto-fim
2024-10-21 11:00:16 +03:00
Georgi Gerganov
7e0b5062af
llama.vim : reduce scope of ids to local [no ci]
2024-10-21 11:00:16 +03:00
Georgi Gerganov
26a0c61e8a
llama.vim : allow repeated suggestions [no ci]
2024-10-21 11:00:15 +03:00
Georgi Gerganov
6e82a03b9d
llama.vim : display realtime [no ci]
2024-10-21 11:00:15 +03:00
Georgi Gerganov
9d13e87b1b
llama.vim : add processing info overlay
2024-10-21 11:00:15 +03:00
Georgi Gerganov
07e7dd47f2
llama.vim : handle space
2024-10-21 11:00:15 +03:00
Georgi Gerganov
0c649c8967
llama.vim : fix suffix construction + fix virt text offset
2024-10-21 11:00:15 +03:00
Georgi Gerganov
0566c69531
llama.vim : neovim plugin
2024-10-21 11:00:14 +03:00
Georgi Gerganov
5aaf24766a
llama : add infill sampler
2024-10-21 11:00:14 +03:00
Georgi Gerganov
55e47786e3
llama : default sampling changes + greedy update ( #9897 )
...
* llama : deprecate softmax sampler + fix dist sampler
ggml-ci
* tests : replace macros with functions
ggml-ci
* sampling : change temperature sampler logic
For t <= 0.0f, keep the max logit intact and set the rest to -inf
* cont : no need for special "greedy" logic
top-k == 1 is the same
* tests : init prob correctly
* llama : handle temp <= 0.0 in the temp_ext sampler too
ggml-ci
* cont : avoid extra loop in temperature sampler for sub-zero temp
ggml-ci
2024-10-21 09:46:40 +03:00
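The temperature change described in the commit above ("for t <= 0.0f, keep the max logit intact and set the rest to -inf", which makes a special greedy path unnecessary since it is equivalent to top-k == 1) can be sketched as follows. This is an illustrative Python model of the described behavior, not the actual C++ sampler in llama.cpp:

```python
# Hedged sketch: for temp <= 0.0, keep the max logit and set all others to
# -inf (equivalent to greedy / top-k == 1); otherwise divide logits by temp.
import math

def apply_temperature(logits, temp):
    if temp <= 0.0:
        max_i = max(range(len(logits)), key=lambda i: logits[i])
        return [l if i == max_i else -math.inf for i, l in enumerate(logits)]
    return [l / temp for l in logits]
```
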
Georgi Gerganov
bc21975084
speculative : fix handling of some input params ( #9963 )
...
* speculative : fix batch sizes at initialization
ggml-ci
* speculative : handle params.n_predict == -1
* speculative : limit batch size to llama_n_batch
2024-10-21 09:37:12 +03:00
Neo Zhang Jianyu
1db8c84fc6
fix mul_mat_vec_q and *_vec_q error ( #9939 )
...
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-10-21 14:26:09 +08:00
Loïc Carrère
45f097645e
readme : update bindings list ( #9951 )
...
Update the binding list by adding LM-Kit.NET (C# & VB.NET)
2024-10-20 19:25:41 +03:00
icppWorld
7cab2083c7
readme : update infra list ( #9942 )
...
llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.
2024-10-20 19:01:34 +03:00
Xuan Son Nguyen
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch ( #9745 )
...
* refactor llama_batch_get_one
* adapt all examples
* fix simple.cpp
* fix llama_bench
* fix
* fix context shifting
* free batch before return
* use common_batch_add, reuse llama_batch in loop
* null terminated seq_id list
* fix save-load-state example
* fix perplexity
* correct token pos in llama_batch_allocr
2024-10-18 23:18:01 +02:00
Radoslav Gerganov
afd9909a64
rpc : backend refactoring ( #9912 )
...
* rpc : refactor backend
Use structs for RPC request/response messages
* rpc : refactor server
2024-10-18 14:33:58 +03:00
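The "structs for RPC request/response messages" refactor above can be illustrated with a fixed-layout message sketch. The field names, command layout, and byte format below are assumptions for illustration only, not the actual llama.cpp RPC wire format:

```python
# Hedged sketch: each RPC request/response is a fixed-layout struct
# serialized verbatim, instead of ad-hoc byte packing. Hypothetical
# request layout: command id (uint8), tensor id (uint64), size (uint64).
import struct

RPC_REQ = struct.Struct("<BQQ")  # little-endian, 1 + 8 + 8 = 17 bytes

def encode_req(cmd, tensor_id, size):
    return RPC_REQ.pack(cmd, tensor_id, size)

def decode_req(buf):
    return RPC_REQ.unpack(buf)
```
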
Ouadie EL FAROUKI
87421a23e8
[SYCL] Add SYCL Backend registry, device and Event Interfaces ( #9705 )
...
* implemented missing SYCL event APIs
* sycl : Added device and backend reg interfaces
* Restructured ggml-sycl.cpp
2024-10-18 06:46:16 +01:00
Ma Mingfei
60ce97c9d8
add amx kernel for gemm ( #8998 )
...
add intel amx isa detection
add vnni kernel for gemv cases
add vnni and amx kernel support for block_q8_0
code cleanup
fix packing B issue
enable openmp
fine tune amx kernel
switch to aten parallel pattern
add error message for nested parallelism
code cleanup
add f16 support in ggml-amx
add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS
update CMakeList
update README
fix some compilation warning
fix compiler warning when amx is not enabled
minor change
ggml-ci
move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp
ggml-ci
update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16
ggml-ci
add amx as a ggml-backend
update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h
minor change
update CMakeLists.txt
minor change
apply weight prepacking in set_tensor method in ggml-backend
fix compile error
ggml-ci
minor change
ggml-ci
update CMakeLists.txt
ggml-ci
add march dependency
minor change
ggml-ci
change ggml_backend_buffer_is_host to return false for amx backend
ggml-ci
fix supports_op
use device reg for AMX backend
ggml-ci
minor change
ggml-ci
minor change
fix rebase
set .buffer_from_host_ptr to be false for AMX backend
2024-10-18 13:34:36 +08:00
Georgi Gerganov
8901755ba3
server : add n_indent parameter for line indentation requirement ( #9929 )
...
ggml-ci
2024-10-18 07:32:19 +03:00
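A hedged sketch of a server request using the new parameter from the commit above. Apart from `n_indent` itself (added in #9929), the surrounding field names are taken from the llama.cpp server completion API as commonly documented and should be treated as assumptions; the exact stop semantics of `n_indent` are as specified in the PR:

```python
# Hedged sketch: build a completion request body that sets the new
# n_indent line-indentation requirement alongside common fields.
import json

payload = {
    "prompt": "def fib(n):\n",
    "n_predict": 128,
    "n_indent": 4,  # line indentation requirement per PR #9929
}
body = json.dumps(payload)
```
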