llama.cpp

Author	SHA1	Message	Date
Concedo	1369b46bb7	notice about false positives	2023-04-08 12:20:48 +08:00
Concedo	d1c957ee64	strip symbols	2023-04-08 00:59:34 +08:00
Concedo	289c40df94	updated embedded kobold	2023-04-07 22:39:20 +08:00
Concedo	1abcdb2394	should not be static	2023-04-07 20:35:19 +08:00
Concedo	43949f7c7c	Merge branch 'master' into concedo	2023-04-07 20:34:06 +08:00
Concedo	f322a5820e	fixed positional port arg	2023-04-07 17:46:33 +08:00
Concedo	1d48db4f63	dont build quantize	2023-04-07 17:11:26 +08:00
Sergey Alirzaev	cc9cee8e9e	Do not crash when it has nothing to say. (#796) Otherwise observing this in the interactive mode: /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.	2023-04-06 17:59:11 +02:00
Concedo	4f5faf9612	some users report that this repo is now being flagged as malicious? no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.	2023-04-06 21:49:43 +08:00
Concedo	b56f872b61	update embedded kobold lite	2023-04-06 16:34:51 +08:00
Pavol Rusnak	d2beca95dc	Make docker instructions more explicit (#785)	2023-04-06 08:56:58 +02:00
Concedo	0e889ed6db	Merge branch 'master' into concedo (conflicts: .gitignore, Makefile, README.md)	2023-04-06 11:14:44 +08:00
Concedo	3d650d0e25	remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold	2023-04-06 11:08:19 +08:00
Georgi Gerganov	eeaa7b0492	ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781)	2023-04-05 22:11:03 +03:00
Georgi Gerganov	986b6ce9f9	ggml, llama : avoid heavy V transpose + improvements (#775). ggml: added ggml_view_3d(); ggml_view_tensor() now inherits the stride too; reimplement ggml_cpy() to account for dst stride; no longer require tensor->data to be memory aligned. llama: compute RoPE on 32-bit tensors (should be more accurate); store RoPE-ed K in the KV cache; store transposed V in the KV cache (significant speed-up); avoid unnecessary Q copy	2023-04-05 22:07:33 +03:00
Georgi Gerganov	3416298929	Update README.md	2023-04-05 19:54:30 +03:00
Ivan Stepanov	5a8c4f6240	llama : define non-positive top_k; top_k range check (#779) * Define non-positive top_k; top_k range check * minor : brackets. Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-04-05 19:20:05 +03:00
at8u	ff05d05c96	miku.sh : add executable bit (#780)	2023-04-05 18:59:13 +03:00
Georgi Gerganov	62b3e81aae	media : add logos and banners	2023-04-05 18:58:31 +03:00
Georgi Gerganov	8d10406d6e	readme : change logo + add bindings + add uis + add wiki	2023-04-05 18:56:20 +03:00
iacore	ed1c214e66	zig : add build.zig (#773) Co-authored-by: Locria Cyber <74560659+locriacyber@users.noreply.github.com>	2023-04-05 18:06:02 +03:00
Ivan Stepanov	0c44427df1	make : missing host optimizations in CXXFLAGS (#763)	2023-04-05 17:38:37 +03:00
Adithya Balaji	594cc95fab	readme : update with CMake and windows example (#748) * README: Update with CMake and windows example * README: update with code-review for cmake build	2023-04-05 17:36:12 +03:00
at8u	88ed5761b8	examples : add Miku.sh (#724) * Add Miku.sh to examples * Add missing line to prompt in Miku.sh * Add --keep param to Miku.sh * Remove '[end_of_conversation]' line from Miku.sh. No longer is necessary.	2023-04-05 17:32:42 +03:00
Andrew Duffy	58c438cf7d	Add Accelerate/BLAS when using Swift (#765)	2023-04-05 06:44:24 -04:00
Concedo	5c1920df43	why nobody ever told me the makefile doesnt work outside x86 xD	2023-04-05 17:15:42 +08:00
Concedo	1490cdd71d	change GPT-J and GPT2 KVs to use fp16 instead	2023-04-05 15:53:07 +08:00
Concedo	57e9f929ee	renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags	2023-04-05 15:22:13 +08:00
Concedo	14273fea7a	integrated gpt2 support	2023-04-04 23:15:47 +08:00
Concedo	52de932842	removed main.exe to reduce clutter, added support for rep pen in gptj	2023-04-04 20:43:13 +08:00
Concedo	9c0dbbb08b	Merge branch 'master' into concedo	2023-04-04 00:51:05 +08:00
Concedo	dd2abd8bc7	lower default thread threshold	2023-04-04 00:42:49 +08:00
mgroeber9110	53dbba7695	Windows: reactive sigint handler after each Ctrl-C (#736)	2023-04-03 18:00:55 +02:00
SebastianApel	437e77855a	10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654) * Performance improvement of AVX2 code * Fixed problem with MSVC compiler * Reviewer comments: removed double semicolon, deleted empty line 1962	2023-04-03 09:52:28 +02:00
Concedo	06c711d770	Merge branch 'master' into concedo (conflicts: .devops/full.Dockerfile, README.md)	2023-04-03 15:10:08 +08:00
Concedo	eb5b22dda2	rebrand to koboldcpp	2023-04-03 10:35:18 +08:00
Ivan Stepanov	cd7fa95690	Define non-positive temperature behavior (#720)	2023-04-03 02:19:04 +02:00
bsilvereagle	a0c0516416	Remove torch GPU dependencies from the Docker.full image (#665) By using `pip install torch --index-url https://download.pytorch.org/whl/cpu` instead of `pip install torch` we can specify we want to install a CPU-only version of PyTorch without any GPU dependencies. This reduces the size of the Docker image from 7.32 GB to 1.62 GB	2023-04-03 00:13:03 +02:00
Concedo	8dd8ab1659	Various enhancement and integration pygmalion.cpp	2023-04-03 00:04:43 +08:00
Thatcher Chamberlin	d8d4e865cd	Add a missing step to the gpt4all instructions (#690) `migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.	2023-04-02 12:48:57 +02:00
Christian Falch	e986f94829	Added api for getting/setting the kv_cache (#685) The api provides access methods for retrieving the current memory buffer for the kv_cache and its token number. It also contains a method for setting the kv_cache from a memory buffer. This makes it possible to load/save history - maybe support --cache-prompt parameter as well? Co-authored-by: Pavol Rusnak <pavol@rusnak.io>	2023-04-02 12:23:04 +02:00
Marian Cepok	c0bb1d3ce2	ggml : change ne to int64_t (#626)	2023-04-02 13:21:31 +03:00
Concedo	3f4967b827	added new binaries	2023-04-02 17:14:38 +08:00
Concedo	bb965cc120	Merge branch 'master' into concedo (conflicts: README.md)	2023-04-02 17:13:28 +08:00
Concedo	9aabb0d9db	massive refactor completed, GPT-J integrated	2023-04-02 17:03:30 +08:00
Leonardo Neumann	6e7801d08d	examples : add gpt4all script (#658)	2023-04-02 10:56:20 +03:00
Stephan Walter	81040f10aa	llama : do not allocate KV cache for "vocab_only == true" (#682) Fixes sanitizer CI	2023-04-02 10:18:53 +03:00
Fabian	c4f89d8d73	make : use -march=native -mtune=native on x86 (#609)	2023-04-02 10:17:05 +03:00
Murilo Santana	5b70e7de4c	fix default params for examples/main (#697)	2023-04-02 04:41:12 +02:00
Concedo	b1f08813e3	added support for gpt4all original format	2023-04-02 00:53:46 +08:00

396 commits