Commit graph

1597 commits

Author SHA1 Message Date
Yazan Agha-Schrader
9dcb514b1d update start server scripts 2023-11-28 06:57:29 +01:00
Yazan Agha-Schrader
4fa32ad0e3 update 2023-11-27 21:45:12 +01:00
Yazan Agha-Schrader
1b6d4226b8 add start scripts to root path 2023-11-27 21:35:31 +01:00
Yazan Agha-Schrader
ae096d0a92 Merge branch 'ggerganov:master' into master 2023-11-27 20:10:11 +01:00
Kasumi
0dab8cd7cc readme : add Amica to UI list (#4230) 2023-11-27 19:39:42 +02:00
Yazan Agha-Schrader
6c318b54c8 Update README.md 2023-11-27 18:28:32 +01:00
Yazan Agha-Schrader
ecb39732e6 add min-p image 2023-11-27 18:25:51 +01:00
Yazan Agha-Schrader
082b33550f Update README.md 2023-11-27 18:19:26 +01:00
Yazan Agha-Schrader
c48f3f2042 Merge pull request #3 from mounta11n/server-ui-improvements
add min-p
2023-11-27 17:58:23 +01:00
Yazan Agha-Schrader
464f073307 add min-p 2023-11-27 17:56:30 +01:00
Yazan Agha-Schrader
d55b482361 Merge pull request #2 from mounta11n/server-ui-improvements
Server UI improvements
2023-11-27 17:26:43 +01:00
Yazan Agha-Schrader
809b2697fe Merge branch 'ggerganov:master' into master 2023-11-27 17:24:35 +01:00
Yazan Agha-Schrader
c161ad20db add mmproj function 2023-11-27 17:17:38 +01:00
Yazan Agha-Schrader
d5683279b1 fix wrong translation 2023-11-27 16:19:08 +01:00
Bailey Chittle
bb03290c17 examples : iOS example with swift ui (#4159)
* copy to llama.cpp as subdir

* attempt enabling metal, fails

* ggml metal compiles!

* Update README.md

* initial conversion to new format, utf8 errors?

* bug fixes, but now has an invalid memory access :(

* added O3, now has insufficient memory access

* begin sync with master

* update to match latest code, new errors

* fixed it!

* fix for loop conditionals, increase result size

* fix current workflow errors

* attempt a llama.swiftui workflow

* Update .github/workflows/build.yml

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 16:56:52 +02:00
Yazan Agha-Schrader
09e3b50f62 fix wrong formattings 2023-11-27 15:54:21 +01:00
Yazan Agha-Schrader
cf8cb0d303 fix multi-modal-selection 2023-11-27 15:05:23 +01:00
Yazan Agha-Schrader
49d7c07210 Update README.md
add description
2023-11-27 14:23:51 +01:00
Yazan Agha-Schrader
1bb2df7367 Update README.md
add pictures of the ui
2023-11-27 14:22:31 +01:00
Yazan Agha-Schrader
25ed0c4f6b add ui and tui pics 2023-11-27 14:18:58 +01:00
Yazan Agha-Schrader
1bc9ca6a9c add ui and tui pics 2023-11-27 14:17:04 +01:00
Yazan Agha-Schrader
a28935febe Update README.md 2023-11-27 14:14:46 +01:00
Yazan Agha-Schrader
ca22eb6cc7 Merge pull request #1 from mounta11n/server-ui-improvements
Server UI improvements
2023-11-27 14:11:48 +01:00
Yazan Agha-Schrader
e7cfe1f5d9 add favicon 2023-11-27 13:58:54 +01:00
Yazan Agha-Schrader
9abb31011b Update index.html
add atlas
2023-11-27 13:47:08 +01:00
Yazan Agha-Schrader
4d15130fda add start script 2023-11-27 13:06:27 +01:00
Yazan Agha-Schrader
2566e53945 ic 2023-11-27 11:33:06 +01:00
Jared Van Bortel
f3b269813f ggml : fix -Warray-bounds warning with gcc (#4231) 2023-11-26 22:58:43 -05:00
Georgi Gerganov
3e73d31d9c lookahead : support -n -1 infinite generation 2023-11-26 21:52:23 +02:00
Georgi Gerganov
9656026b53 readme : update hot topics 2023-11-26 20:42:51 +02:00
Georgi Gerganov
922754a8d6 lookahead : add example for lookahead decoding (#4207)
* lookahead : init

* lookahead : generate and store n-grams

* lookahead : use loop instead recursion to generate n-grams

* lookahead : initial working implementation

* lookahead : filter repeating n-grams

* lookahead : use deterministic init

* lookahead : add to Makefile

* lookahead : fix a bug in the seq_id of the lookahead tokens

* lookahead : add comments

---------

Co-authored-by: slaren <slarengh@gmail.com>
2023-11-26 20:33:07 +02:00
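The lookahead-decoding example above generates, stores, and deduplicates n-grams from the token stream to use as speculative continuations. A toy sketch of that n-gram pooling idea (illustrative only, assuming integer token IDs; this is not the example's actual code or data structures):

```python
def collect_ngrams(tokens, n=3):
    # Pool of n-grams keyed by first token: for each n-gram seen in the
    # stream, store its continuation (the remaining n-1 tokens).
    # Using a set drops duplicates, mirroring the "filter repeating
    # n-grams" step listed in the commit.
    pool = {}
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        pool.setdefault(gram[0], set()).add(gram[1:])
    return pool

pool = collect_ngrams([1, 2, 3, 1, 2, 3, 4], n=3)
```

During decoding, such a pool lets the model look up candidate continuations for the current token and verify them in parallel.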
Xiao-Yong Jin
22da05536f metal : fix yarn (#4220)
get the correct n_orig_ctx in metal
2023-11-26 10:30:02 +02:00
Galunid
1ddb52ec38 scripts : Use mmap in torch load (#4202)
* Use mmap in torch load, prefer .bin files when loading

* Revert .bin > .safetensors preference
2023-11-25 22:45:02 +01:00
Marcus Dunn
f837c3a992 llama : grammar reserve space in decode_utf8 (#4210)
* reserve space for codepoints

* improvement for the appended 0
2023-11-25 18:58:23 +02:00
crasm
3014b5415d Update docs for yarn_ext_factor <0.0 as unspecified instead of NaN (#4189) 2023-11-25 10:47:07 -05:00
Georgi Gerganov
04814e718e readme : update hot topics 2023-11-25 12:02:13 +02:00
Georgi Gerganov
af19d35734 server : OAI API compatibility (#4198)
* Add openai-compatible POST /v1/chat/completions API endpoint to server example

* fix code style

* Update server README.md

* Improve server README.md

* Fix server.cpp code style according to review

* server : some style changes

* server : indentation

* server : enable special tokens during tokenization by default

* server : minor code style

* server : change random string generator

* straightforward /v1/models endpoint

---------

Co-authored-by: kir-gadjello <111190790+kir-gadjello@users.noreply.github.com>
Co-authored-by: Tobi Lütke <tobi@Tobis-MacBook-Pro.local>
2023-11-25 11:29:06 +02:00
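The server commit above adds an OpenAI-compatible POST /v1/chat/completions endpoint (plus a /v1/models endpoint) to the server example. A minimal sketch of the request payload such a client would send — the model name, temperature default, and helper function here are placeholder assumptions, not values from the commit:

```python
import json

def build_chat_request(messages, model="local-model", temperature=0.7):
    # Shape follows the OpenAI chat-completions schema the endpoint
    # mimics; "local-model" and the defaults are placeholders.
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

payload = build_chat_request([{"role": "user", "content": "Hello"}])
body = json.dumps(payload)  # send as the POST body to /v1/chat/completions
```

The serialized body can then be posted to the running server with any HTTP client pointed at its /v1/chat/completions route.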
slaren
e9c13ff781 llama : set metal log callback correctly (#4204) 2023-11-24 18:10:01 +01:00
slaren
8a052c131e ggml-cuda : support stablelm rope (#4156)
* ggml-cuda : support stablelm rope

* remove unused freq_base kernel parameter

* add n_dims parameter to llm_build_k_shift, default to n_rot via overload

* llama : fix llm_build_k_shift args

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-24 18:04:31 +01:00
Galunid
189d68446e convert : fix tensors using grad in some models (#4173) 2023-11-24 15:02:49 +01:00
eastriver
2568a4bf54 main.swift : fix eos checking (#4197)
llama_token_eos(const struct llama_model *) is currently getting struct llama_context type variable context as a parameter.
2023-11-24 11:25:10 +02:00
Aaryaman Vasishta
b35f3d0def readme : use PATH for Windows ROCm (#4195)
* Update README.md to use PATH for Windows ROCm

* Update README.md

* Update README.md
2023-11-24 09:52:39 +02:00
Haohui Mai
55978ce09b Fix incorrect format strings and uninitialized variables. (#4133)
* Fix incorrect format strings and uninitialized variables.

* Address comments

* Add the missing include statement
2023-11-23 22:56:53 +01:00
Georgi Gerganov
6b0a7420d0 llama : KV cache view API + better KV cache management (#4170)
* llama : keep track of used KV cells + better KV cache management

* llama : zero KV cache used upon clear

ggml-ci

* llama : allow exporting a view of the KV cache (#4180)

* Allow exporting a view of the KV cache

* Allow dumping the sequences per cell in common

* Track max contiguous cells value and position as well

* Fix max contiguous empty cells index calculation

Make dump functions deal with lengths or sequences counts > 10 better

* Fix off by one error in dump_kv_cache_view

* Add doc comments for KV cache view functions

Eliminate cell sequence struct; use llama_seq_id directly

Minor cleanups

* common : add -dkvc arg for enabling kv cache dumps

---------

Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
2023-11-23 19:07:56 +02:00
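The KV-cache-view commit above tracks used cells and the maximum contiguous run of empty cells. A toy version of that bookkeeping, assuming the cache is represented as a per-cell count of occupying sequences (illustrative only, not the llama.cpp API itself):

```python
def max_contiguous_free(cells):
    # cells[i] = number of sequences occupying KV cell i (0 = free).
    # Returns (length, start index) of the longest contiguous free run,
    # mirroring the "max contiguous cells value and position" tracking.
    best_len, best_start = 0, 0
    run_len, run_start = 0, 0
    for i, used in enumerate(cells):
        if used == 0:
            if run_len == 0:
                run_start = i  # a new free run begins here
            run_len += 1
            if run_len > best_len:
                best_len, best_start = run_len, run_start
        else:
            run_len = 0  # run broken by an occupied cell
    return best_len, best_start

result = max_contiguous_free([1, 0, 0, 0, 2, 0])
```

Knowing the largest free run lets the cache allocator place a new sequence without fragmenting the remaining cells.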
Georgi Gerganov
d103d935c0 readme : update hot topics 2023-11-23 13:51:22 +02:00
Daniel Bevenius
9d5949f04b examples : fix typo in parallel example doc comment (#4181)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2023-11-23 13:34:20 +02:00
Georgi Gerganov
ff8238f71d docs : add llama-star arch idea 2023-11-23 11:35:04 +02:00
Galunid
8e672efe63 stablelm : simplify + speedup generation (#4153) 2023-11-21 16:22:30 +01:00
Galunid
0b871f1a04 finetune - update readme to mention llama support only (#4148) 2023-11-20 19:30:00 +01:00
Aaryaman Vasishta
dfc7cd48b1 readme : update ROCm Windows instructions (#4122)
* Update README.md

* Update README.md

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

---------

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-20 17:02:46 +02:00