llama.cpp

Author	SHA1	Message	Date
Kawrakow	e8d9158925	metal: somewhat faster f16 x f32 matrix multiply kernel (#2951 ) * Somewhat faster f16 x f32 matrix multiply kernel * Better use 32 thread groups for f16 x f32 --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2023-09-01 11:15:57 +03:00
Concedo	81abd3cb1f	Merge remote-tracking branch 'elbios/concat_output_mutex' into concedo_experimental	2023-09-01 15:24:13 +08:00
Concedo	d7fed4732f	fix for typical sampler	2023-09-01 15:24:00 +08:00
Elbios	30588617fb	Fix race condition by locking concat_output string Writer thread was appending to concat_output global string without a lock, while another thread could be reading the string invoked by HTTP API. Appending to std::string is not an atomic operation. Worst case would be if string was reallocated while being read. Fix it by locking the access in writer and reader with a mutex.	2023-09-01 07:18:48 +02:00
Cebtenzzre	bce1fef328	convert : fix another python 3.8 issue (#2949 )	2023-08-31 22:13:51 -04:00
slaren	528134dd02	remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py (#2906 )	2023-09-01 01:32:09 +02:00
Kerfuffle	aeefac4ff7	scripts: Use local gguf package when running from repo (#2927 ) * scripts: Use local gguf when running from repo	2023-08-31 16:49:24 -06:00
Concedo	0c3a265187	fixed incorrect buffer size values	2023-09-01 01:31:09 +08:00
Concedo	35ba699a7c	Merge remote-tracking branch 'vxii/concedo' into concedo_experimental	2023-09-01 01:28:16 +08:00
Concedo	0fe3c9cf96	stronger banning bias	2023-09-01 01:25:23 +08:00
Concedo	fe4a233d79	Merge branch 'master' into concedo_experimental # Conflicts: # .devops/tools.sh # llama.cpp	2023-09-01 00:47:06 +08:00
vxiiduu	f2985a070b	Add support for 34B GGML models	2023-09-01 01:29:09 +10:00
DannyDaemonic	e8422de39e	@vxiiduu's fix for PrefetchVirtualMemory (#2930 ) Reimplement fix for `PrefetchVirtualMemory`. Co-authored-by: vxiiduu <73044267+vxiiduu@users.noreply.github.com>	2023-08-31 04:21:45 -07:00
Concedo	bc02f7663f	allow sse3 in failsafe	2023-08-31 18:07:17 +08:00
Concedo	07b02af8bc	fixed tab ordering , update lite for panel alignment	2023-08-31 16:33:00 +08:00
Concedo	e2fd30b5d1	reverted the failsafe removal, since they dropped support for dll check	2023-08-31 15:39:32 +08:00
Cebtenzzre	92d0b751a7	convert : fix python 3.8 support, modernize type annotations (#2916 ) * convert : fix python 3.8 support * convert : sort imports * convert : fix required parameters in convert-llama-ggmlv3-to-gguf * convert : fix mypy errors in convert-llama-ggmlv3-to-gguf * convert : use PEP 585 generics and PEP 604 unions Now that we have `from __future__ import annotations`, we can use this modern syntax in Python 3.7 instead of restricting support to Python 3.9 or 3.10 respectively. * gguf.py : a tuple is already a tuple * add mypy.ini * convert : add necessary `type: ignore` comments * gguf-py: bump version	2023-08-31 08:02:23 +03:00
Johannes Gäßler	8afe228000	CUDA: mul_mat_q=true llama_context_params default (#2912 )	2023-08-30 21:46:19 +02:00
Concedo	b6914ebd04	hotfix to revert the auto ctx scaling first, i didnt do it properly	2023-08-31 00:58:52 +08:00
Henri Vasserman	71d6975559	[Docker] fix tools.sh argument passing. (#2884 ) * [Docker] fix tools.sh argument passing. This should allow passing multiple arguments to containers with the full image that are using the tools.sh frontend. Fix from https://github.com/ggerganov/llama.cpp/issues/2535#issuecomment-1697091734	2023-08-30 19:14:53 +03:00
Concedo	5cd0309610	renamed incorrect identifier	2023-08-30 23:06:39 +08:00
Concedo	0ee394ae1b	falcon disable offload only for clblast	2023-08-30 22:35:24 +08:00
Concedo	29757de61f	cmake disable buggy logs	2023-08-30 22:15:33 +08:00
Concedo	aa4ad830e2	log.h is broken so disable it first Merge branch 'master' into concedo_experimental # Conflicts: # .github/workflows/build.yml # .gitignore # Makefile # README.md # tests/CMakeLists.txt	2023-08-30 21:58:54 +08:00
Concedo	a2a4eefa07	slight change to logits	2023-08-30 21:27:51 +08:00
Georgi Gerganov	b532a69b2f	convert.py : use dir name to name the llama	2023-08-30 13:29:40 +03:00
Concedo	1301bd7e29	Fix to skip GPU offloading so falcon models work correctly	2023-08-30 18:26:41 +08:00
Georgi Gerganov	c90d135eb4	examples : fix underscore in beam-search + .gitignore (close #2900 )	2023-08-30 12:53:24 +03:00
M. Yusuf Sarıgöz	0d1c706181	gguf : add workflow for Pypi publishing (#2896 ) * gguf : add workflow for Pypi publishing * gguf : add workflow for Pypi publishing * fix trailing whitespace	2023-08-30 12:47:40 +03:00
alonfaraj	9509294420	make : add test and update CI (#2897 ) * build ci: run make test * makefile: - add all - add test * enable tests/test-tokenizer-0-llama * fix path to model * remove gcc-8 from macos build test * Update Makefile * Update Makefile	2023-08-30 12:42:51 +03:00
Concedo	d4c22a8b02	updated lite, added autorope config based on trained ctxlen, hotfix for falcon gpu broken	2023-08-30 16:50:55 +08:00
Gilad S	35092fb547	docs : add `node-llama-cpp` to `README.md` (#2885 )	2023-08-30 11:40:12 +03:00
Kerfuffle	dc07dc492e	convert : various script cleanups/fixes + merges and special token handling (#2842 ) * convert: Fix permute calls and method/func definitions * Cleanups for gguf-py * Minor types cleanups. * Initial implementation of handling merges and special tokens * convert: Handle special tokens and merges in vocab only mode convert: Vocab only mode no longer requires loading model tensors * gguf: Refactor tensor name mapping * convert: Fix type hint for special_token_types in SpecialVocab * Use common special vocab handling in various conversion scripts * First pass at implementing suggested changes * Second pass * gguf: SpecialVocab: Fix issue with special token content not in a dict gguf: SpecialVocab: Allow skipping handling of merges * convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json * convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer * gguf: SpecialVocab: Actually set load_merges in object * Uniform args parsing and vocab only mode for convert examples * convert.py: Set gpt2 as tokenizer model when using BPE * Squish last type warning in gguf.py - yay!	2023-08-30 11:25:50 +03:00
chaihahaha	ad9ddcff6e	llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879 )	2023-08-30 09:50:55 +03:00
staviq	8341a25957	main : log file (#2748 ) * initial, base LOG macro * add .log to .gitignore added basic log file handler * reverted log auto endline to better mimic printf * remove atomics and add dynamic log target * log_enable/disable, LOG_TEE, basic usage doc * update .gitignore * mv include to common, params, help msg * log tostring helpers, token vectors pretty prints * main: replaced fprintf/LOG_TEE, some trace logging * LOG_DISABLE_LOGS compile flag, wrapped f in macros * fix LOG_TEELN and configchecker * stub LOG_DUMP_CMDLINE for WIN32 for now * fix msvc * cleanup main.cpp:273 * fix stray whitespace after master sync * log : fix compile warnings - do not use C++20 stuff - use PRIu64 to print uint64_t - avoid string copies by using const ref - fix ", ##__VA_ARGS__" warnings - compare strings with == and != * log : do not append to existing log + disable file line func by default * log : try to fix Windows build * main : wip logs * main : add trace log * review: macro f lowercase, str append to sstream * review: simplify ifs and str comparisons * fix MSVC, formatting, FMT/VAL placeholders * review: if/else cleanup * review: if/else cleanup (2) * replace _ prefix with _impl suffix --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-08-30 09:29:32 +03:00
Cebtenzzre	849408957c	tests : add a C compliance test (#2848 ) * tests : add a C compliance test * make : build C compliance test by default * make : fix clean and make sure C test fails on clang * make : move -Werror=implicit-int to CFLAGS	2023-08-30 09:20:26 +03:00
Concedo	89495c0716	handle token unbanning over api	2023-08-30 10:51:49 +08:00
Concedo	f2c02dd06d	Merge branch 'master' into concedo_experimental # Conflicts: # .gitignore # CMakeLists.txt # Makefile # README.md # tests/test-grad0.cpp	2023-08-30 10:51:28 +08:00
YellowRoseCx	d7bdfbdd78	Update Makefile for misc amd gpu targetting (#407 ) adds the hipBlas gpu_target $(shell $(ROCM_PATH)/llvm/bin/amdgpu-arch) back to the gpu_target line, possibly allowing misc gpu arch's like gfx1031 or gfx1032 etc to be built	2023-08-30 09:54:15 +08:00
slaren	06abf8eeba	ggml : add view_src and view_offs to ggml_tensor for views (#2874 ) * ggml : add view_src and view_offs * update ggml-alloc to use view_src * update ggml_diag_mask to work correctly with automatic inplace * exclude other ops that set an inplace flag from automatic inplace	2023-08-29 23:24:42 +02:00
slaren	c03a243abf	remove outdated references to -eps and -gqa from README (#2881 )	2023-08-29 23:17:34 +02:00
Kawrakow	fa3582f509	Tell users attempting to run perplexity with too few tokens to use more (#2882 ) Closes #2858 Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2023-08-29 23:55:45 +03:00
Kawrakow	e37e69dcc3	10X faster BPE tokenizer (#2876 ) * 10X faster BPE tokenizer * Remove comment that no longer applies --------- Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>	2023-08-29 23:55:03 +03:00
Concedo	380fa0f0ca	fixed broken typical sampler issues	2023-08-29 23:50:59 +08:00
maddes8cht	53885d7256	py : fix "usage" messages (#2873 ) convert-to-gguf python scripts	2023-08-29 16:51:02 +03:00
jameswu2014	bcce96ba4d	convert.py : fix baichuan7B support (#2870 ) * [Fix]: convert.py support baichuan7B * convert.py : fix trailing whitespaces --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-08-29 12:48:41 +03:00
Jhen-Jie Hong	74e0caeb82	readme : add react-native binding (#2869 )	2023-08-29 12:30:10 +03:00
Cebtenzzre	d4b5e16c32	make : fix clang tests build, add missing examples (#2859 ) * make : do not pass headers to the compiler This fixes building tests with clang. * make : add missing examples * make : fix build-info.h dependencies	2023-08-29 11:42:41 +03:00
Georgi Gerganov	3a007648f2	metal : add option to disable debug logs (close #2764 )	2023-08-29 11:33:46 +03:00
Georgi Gerganov	611363ac79	scripts : add pipefail	2023-08-29 10:50:30 +03:00


2069 commits
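
Commit 30588617fb above describes fixing a race on the global `concat_output` string by locking both the writer and the reader with a mutex. The following is a minimal sketch of that pattern, not the code from the commit: the mutex name `concat_output_mtx` and the helpers `append_output`, `read_output`, and `run_demo` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>

// Shared state. `concat_output` is named in the commit message;
// `concat_output_mtx` is a hypothetical name for the mutex guarding it.
static std::string concat_output;
static std::mutex concat_output_mtx;

// Writer side (hypothetical helper): append while holding the lock, so a
// reallocation of the string buffer cannot overlap with a read.
void append_output(const std::string &token) {
    std::lock_guard<std::mutex> lock(concat_output_mtx);
    concat_output += token;
}

// Reader side (hypothetical helper): take a copy under the lock and return
// it, so the caller never touches the shared buffer directly.
std::string read_output() {
    std::lock_guard<std::mutex> lock(concat_output_mtx);
    return concat_output;
}

// Hypothetical demo: one thread appends n_tokens single-character tokens
// while another thread reads concurrently. Returns the final string length.
std::size_t run_demo(int n_tokens) {
    {
        std::lock_guard<std::mutex> lock(concat_output_mtx);
        concat_output.clear();
    }
    std::thread writer([n_tokens] {
        for (int i = 0; i < n_tokens; ++i) {
            append_output("x");
        }
    });
    std::thread reader([n_tokens] {
        for (int i = 0; i < n_tokens; ++i) {
            std::string snapshot = read_output();
            (void)snapshot.size(); // the snapshot is safe to use without the lock
        }
    });
    writer.join();
    reader.join();
    return read_output().size();
}
```

Returning a copy from the reader keeps the critical section short and means an HTTP handler, for example, can serialize the snapshot without holding the lock. Whether the actual fix copies the string or holds the lock for the duration of the read is not stated in the commit message.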