Commit graph

3988 commits

Author SHA1 Message Date
Georgi Gerganov
b8efb0725d
llama.vim : minor [no ci] 2024-10-21 11:00:22 +03:00
Georgi Gerganov
fe78c39399
llama.vim : fix large chunk accept + comments [no ci] 2024-10-21 11:00:22 +03:00
Georgi Gerganov
6bb6e6dd80
llama.vim : display ring capacity [no ci] 2024-10-21 11:00:22 +03:00
Georgi Gerganov
1600d846b6
llama.vim : complete only within the local scope [no ci] 2024-10-21 11:00:22 +03:00
Georgi Gerganov
d1b8b215d5
llama.vim : fix repetitions of existing text 2024-10-21 11:00:21 +03:00
Georgi Gerganov
4583aef12b
llama.vim : final touches
ggml-ci
2024-10-21 11:00:21 +03:00
Georgi Gerganov
847c8c023e
llama.vim : update infill API params [no ci] 2024-10-21 11:00:21 +03:00
Georgi Gerganov
060573f7e8
llama.vim : add comments [no ci] 2024-10-21 11:00:21 +03:00
Georgi Gerganov
42a9008b31
llama.vim : process extra chunks in the background [no ci] 2024-10-21 11:00:21 +03:00
Georgi Gerganov
0c1f51b73e
llama : improve infill sampler
ggml-ci
2024-10-21 11:00:20 +03:00
Georgi Gerganov
e4be74b4b7
llama.vim : add top_p + improve responsiveness + fix edge cases 2024-10-21 11:00:20 +03:00
Georgi Gerganov
25ecb35c4f
llama.vim : simplify job logic + improve robustness and responsiveness 2024-10-21 11:00:20 +03:00
Georgi Gerganov
9f8fa900f6
llama.vim : fix repetitions [no ci] 2024-10-21 11:00:20 +03:00
Georgi Gerganov
ae76a092b8
llama.vim : pass filenames for each chunk
ggml-ci
2024-10-21 11:00:20 +03:00
Georgi Gerganov
916c2ee3fd
llama : simplify infill sampler 2024-10-21 11:00:19 +03:00
Georgi Gerganov
bc2857b88c
llama.vim : async context processing
ggml-ci
2024-10-21 11:00:19 +03:00
Georgi Gerganov
2960510153
llama.vim : do not auto-fim when far from the end of the line [no ci] 2024-10-21 11:00:19 +03:00
Georgi Gerganov
d81a0ac185
llama.vim : do not evict certain chunks [no ci] 2024-10-21 11:00:19 +03:00
Georgi Gerganov
27d53cb4ee
llama.vim : logic to evict old chunks that are similar to the new one 2024-10-21 11:00:19 +03:00
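The eviction logic this commit describes (drop old ring-context chunks that overlap heavily with a newly gathered one) might look roughly like the sketch below. The line-overlap similarity measure, the 0.5 threshold, and all names are assumptions for illustration, not the plugin's actual implementation (which lives in Vimscript):

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// A context chunk is treated here as a list of lines. If an old chunk shares
// more than a threshold fraction of its lines with the incoming chunk, it is
// evicted from the ring; measure and threshold are illustrative assumptions.
using chunk = std::vector<std::string>;

static double similarity(const chunk & a, const chunk & b) {
    if (a.empty()) {
        return 0.0;
    }
    std::unordered_set<std::string> lines_b(b.begin(), b.end());
    size_t shared = 0;
    for (const auto & line : a) {
        if (lines_b.count(line)) {
            shared++;
        }
    }
    return double(shared) / double(a.size());
}

static void evict_similar(std::vector<chunk> & ring, const chunk & incoming, double thresh = 0.5) {
    // erase-remove idiom: drop every old chunk too similar to the new one
    ring.erase(std::remove_if(ring.begin(), ring.end(),
        [&](const chunk & c) { return similarity(c, incoming) > thresh; }),
        ring.end());
}
```

The next entry ("do not evict certain chunks") suggests some chunks are additionally pinned and exempt from this check.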
Georgi Gerganov
f794549bae
llama.vim : gather chunk on leaving buffer [no ci] 2024-10-21 11:00:18 +03:00
Georgi Gerganov
27bc11da0f
llama.vim : update server command [no ci] 2024-10-21 11:00:18 +03:00
Georgi Gerganov
b8890229b6
llama.vim : add ring context from opened files and yanked text 2024-10-21 11:00:18 +03:00
Georgi Gerganov
4f46e29b09
llama : print more info about control tokens 2024-10-21 11:00:18 +03:00
Georgi Gerganov
491f211b4c
llama : improve infill sampler
ggml-ci
2024-10-21 11:00:18 +03:00
Georgi Gerganov
5624e919df
llama.vim : fix docs [no ci] 2024-10-21 11:00:17 +03:00
Georgi Gerganov
c9a46f4bd7
llama.vim : minor [no ci] 2024-10-21 11:00:17 +03:00
Georgi Gerganov
865d9bc48a
llama : clean-up
ggml-ci
2024-10-21 11:00:17 +03:00
Georgi Gerganov
4b1bd81661
llama : simplify infill sampler 2024-10-21 11:00:17 +03:00
Georgi Gerganov
2e8c350a5f
llama.vim : fix edge cases 2024-10-21 11:00:16 +03:00
Georgi Gerganov
6669b550db
llama.vim : set time limit for the generation phase 2024-10-21 11:00:16 +03:00
Georgi Gerganov
c507a65af5
llama.vim : async 2024-10-21 11:00:16 +03:00
Georgi Gerganov
41053f92d3
llama.vim : simplify init and cancel + auto-fim 2024-10-21 11:00:16 +03:00
Georgi Gerganov
7e0b5062af
llama.vim : reduce scope of ids to local [no ci] 2024-10-21 11:00:16 +03:00
Georgi Gerganov
26a0c61e8a
llama.vim : allow repeated suggestions [no ci] 2024-10-21 11:00:15 +03:00
Georgi Gerganov
6e82a03b9d
llama.vim : display realtime [no ci] 2024-10-21 11:00:15 +03:00
Georgi Gerganov
9d13e87b1b
llama.vim : add processing info overlay 2024-10-21 11:00:15 +03:00
Georgi Gerganov
07e7dd47f2
llama.vim : handle space 2024-10-21 11:00:15 +03:00
Georgi Gerganov
0c649c8967
llama.vim : fix suffix construction + fix virt text offset 2024-10-21 11:00:15 +03:00
Georgi Gerganov
0566c69531
llama.vim : neovim plugin 2024-10-21 11:00:14 +03:00
Georgi Gerganov
5aaf24766a
llama : add infill sampler 2024-10-21 11:00:14 +03:00
Georgi Gerganov
55e47786e3
llama : default sampling changes + greedy update (#9897)
* llama : deprecate softmax sampler + fix dist sampler

ggml-ci

* tests : replace macros with functions

ggml-ci

* sampling : change temperature sampler logic

For t <= 0.0f, keep the max logit intact and set the rest to -inf

* cont : no need for special "greedy" logic

top-k == 1 is the same

* tests : init prob correctly

* llama : handle temp <= 0.0 in the temp_ext sampler too

ggml-ci

* cont : avoid extra loop in temperature sampler for sub-zero temp

ggml-ci
2024-10-21 09:46:40 +03:00
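The temperature rule described in this commit (for t <= 0.0f, keep the max logit intact and set the rest to -inf, so no separate "greedy" path is needed because it matches top-k == 1) can be sketched as follows. The function name and flat logit vector are illustrative, not the actual llama.cpp sampler API:

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Sketch of the commit's temperature logic: for t <= 0.0f the maximum logit
// is kept intact and every other logit is set to -inf, which makes sampling
// from the result equivalent to greedy (top-k == 1) selection.
static void apply_temp(std::vector<float> & logits, float t) {
    if (t <= 0.0f) {
        const auto it_max = std::max_element(logits.begin(), logits.end());
        for (auto & l : logits) {
            if (&l != &*it_max) {
                l = -std::numeric_limits<float>::infinity();
            }
        }
        return; // no extra loop needed for sub-zero temperatures
    }
    for (auto & l : logits) {
        l /= t; // regular temperature scaling for t > 0
    }
}
```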
Georgi Gerganov
bc21975084
speculative : fix handling of some input params (#9963)
* speculative : fix batch sizes at initialization

ggml-ci

* speculative : handle params.n_predict == -1

* speculative : limit batch size to llama_n_batch
2024-10-21 09:37:12 +03:00
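The two parameter fixes listed above can be sketched roughly as below. The struct and helper names are hypothetical; the only facts taken from the commit message are that n_predict == -1 must be handled (conventionally "no limit" in the llama.cpp examples) and that the batch size is clamped to llama_n_batch:

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative sketch, not the actual speculative example code.
struct spec_params {
    int32_t n_predict; // -1 is treated as "no limit"
    int32_t n_draft;   // requested draft batch size
};

// Resolve n_predict against the caller's own remaining budget.
static int32_t effective_n_predict(const spec_params & p, int32_t n_remaining) {
    return p.n_predict < 0 ? n_remaining : std::min(p.n_predict, n_remaining);
}

// Never submit a batch larger than what the context was created with.
static int32_t effective_batch(const spec_params & p, int32_t n_batch) {
    return std::min(p.n_draft, n_batch);
}
```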
Neo Zhang Jianyu
1db8c84fc6
fix mul_mat_vec_q and *_vec_q error (#9939)
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-10-21 14:26:09 +08:00
Loïc Carrère
45f097645e
readme : update bindings list (#9951)
Update the binding list by adding LM-Kit.NET (C# & VB.NET)
2024-10-20 19:25:41 +03:00
icppWorld
7cab2083c7
readme : update infra list (#9942)
llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.
2024-10-20 19:01:34 +03:00
Xuan Son Nguyen
cda0e4b648
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
* refactor llama_batch_get_one

* adapt all examples

* fix simple.cpp

* fix llama_bench

* fix

* fix context shifting

* free batch before return

* use common_batch_add, reuse llama_batch in loop

* null terminated seq_id list

* fix save-load-state example

* fix perplexity

* correct token pos in llama_batch_allocr
2024-10-18 23:18:01 +02:00
Radoslav Gerganov
afd9909a64
rpc : backend refactoring (#9912)
* rpc : refactor backend

Use structs for RPC request/response messages

* rpc : refactor server
2024-10-18 14:33:58 +03:00
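"Use structs for RPC request/response messages" refers to replacing ad-hoc byte packing with fixed-layout message types; a minimal sketch of that pattern is below. The message and field names are assumptions for illustration, not the actual rpc backend types:

```cpp
#include <cstdint>

// Illustrative only: a request/response pair with a fixed wire layout, so
// both client and server can serialize by copying the struct directly
// instead of hand-packing individual fields into a byte buffer.
struct rpc_msg_alloc_buffer_req {
    uint64_t size;        // requested buffer size in bytes
};

struct rpc_msg_alloc_buffer_rsp {
    uint64_t remote_ptr;  // handle to the buffer on the server
    uint64_t remote_size; // actual allocated size
};
```

Fixed-width integer fields keep the layout predictable across client and server builds.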
Ouadie EL FAROUKI
87421a23e8
[SYCL] Add SYCL Backend registry, device and Event Interfaces (#9705)
* implemented missing SYCL event APIs

* sycl : Added device and backend reg interfaces

* Restructured ggml-sycl.cpp
2024-10-18 06:46:16 +01:00
Ma Mingfei
60ce97c9d8
add amx kernel for gemm (#8998)
add intel amx isa detection

add vnni kernel for gemv cases

add vnni and amx kernel support for block_q8_0

code cleanup

fix packing B issue

enable openmp

fine tune amx kernel

switch to aten parallel pattern

add error message for nested parallelism

code cleanup

add f16 support in ggml-amx

add amx kernels for QK_K quant formats: Q4_K, Q5_K, Q6_K and IQ4_XS

update CMakeList

update README

fix some compilation warning

fix compiler warning when amx is not enabled

minor change

ggml-ci

move ggml_amx_init from ggml.c to ggml-amx/mmq.cpp

ggml-ci

update CMakeLists with -mamx-tile, -mamx-int8 and -mamx-bf16

ggml-ci

add amx as a ggml-backend

update header file, the old path for immintrin.h has changed to ggml-cpu-impl.h

minor change

update CMakeLists.txt

minor change

apply weight prepacking in set_tensor method in ggml-backend

fix compile error

ggml-ci

minor change

ggml-ci

update CMakeLists.txt

ggml-ci

add march dependency

minor change

ggml-ci

change ggml_backend_buffer_is_host to return false for amx backend

ggml-ci

fix supports_op

use device reg for AMX backend

ggml-ci

minor change

ggml-ci

minor change

fix rebase

set .buffer_from_host_ptr to be false for AMX backend
2024-10-18 13:34:36 +08:00
Georgi Gerganov
8901755ba3
server : add n_indent parameter for line indentation requirement (#9929)
ggml-ci
2024-10-18 07:32:19 +03:00