llama.cpp

Author	SHA1	Message	Date
Concedo	1543c700d8	added a missing endpoint for tavern	2023-04-09 17:41:33 +08:00
Concedo	b91abc3316	increase default blas batch size	2023-04-09 15:27:43 +08:00
Concedo	4d1825263b	Merge branch 'master' into concedo # Conflicts: # CMakeLists.txt # flake.nix	2023-04-09 13:22:40 +08:00
Concedo	26a7933084	hide the tiny tkinter window	2023-04-09 01:01:34 +08:00
Tomáš Pazdiora	aaf3b23deb	fix for windows utf-8 input (#840) Use UTF-16 as input on Windows, since UTF-8 does not work and reads multibyte characters as zeros	2023-04-08 17:49:39 +02:00
eiery	f2d1c47294	cmake should link openblas properly with -lopenblas like how it's done in the makefile (#839)	2023-04-08 11:15:17 +00:00
lon	317fb12fbd	Add new binaries to flake.nix (#847)	2023-04-08 12:04:23 +02:00
Concedo	d335fae7c4	missed a print statement	2023-04-08 17:59:53 +08:00
Concedo	0b904e12db	Merge branch 'master' into concedo # Conflicts: # Makefile	2023-04-08 17:42:09 +08:00
LostRuins	5dd610032e	Merge pull request #27 from ariez-xyz/patch-1 add more precise instructions for arch	2023-04-08 17:37:39 +08:00
Concedo	d8e37bfe75	new gpt2 format supported	2023-04-08 17:35:36 +08:00
ariez-xyz	b48255db19	add more precise instructions for arch	2023-04-08 10:41:57 +02:00
Concedo	1369b46bb7	notice about false positives	2023-04-08 12:20:48 +08:00
unbounded	62cfc54f77	Add quantize-stats command for testing quantization (#728) Command that calculates some statistics over the errors introduced by quantization, like mean square error, max error and some percentile errors for layer weights. Should be useful for testing quantization improvements. Exposes some internal state from ggml and llama for testing	2023-04-08 00:09:18 +02:00
Concedo	d1c957ee64	strip symbols	2023-04-08 00:59:34 +08:00
bhubbb	698f7b5d63	make : add libllama.so target for llama-cpp-python (#797) I was able to get llama-cpp-python working but only when I build libllama.so with make.	2023-04-07 19:11:58 +03:00
iacore	c1950c3431	zig : don't link examples/common.cpp for non-example (#814)	2023-04-07 19:05:29 +03:00
Ivan Stepanov	4953e9007f	llama : always sort logits before nucleus sampling (#812) * Always sort logits before nucleus sampling * remove second normalization - fix windows build - remove normalization since std::discrete_distribution does not require it	2023-04-07 19:02:12 +03:00
Concedo	289c40df94	updated embedded kobold	2023-04-07 22:39:20 +08:00
Concedo	1abcdb2394	should not be static	2023-04-07 20:35:19 +08:00
Concedo	43949f7c7c	Merge branch 'master' into concedo	2023-04-07 20:34:06 +08:00
Concedo	f322a5820e	fixed positional port arg	2023-04-07 17:46:33 +08:00
Concedo	1d48db4f63	dont build quantize	2023-04-07 17:11:26 +08:00
Sergey Alirzaev	cc9cee8e9e	Do not crash when it has nothing to say. (#796) Otherwise observing this in the interactive mode: /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.	2023-04-06 17:59:11 +02:00
Concedo	4f5faf9612	some users report that this repo is now being flagged as malicious? no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.	2023-04-06 21:49:43 +08:00
Concedo	b56f872b61	update embedded kobold lite	2023-04-06 16:34:51 +08:00
Pavol Rusnak	d2beca95dc	Make docker instructions more explicit (#785)	2023-04-06 08:56:58 +02:00
Concedo	0e889ed6db	Merge branch 'master' into concedo # Conflicts: # .gitignore # Makefile # README.md	2023-04-06 11:14:44 +08:00
Concedo	3d650d0e25	remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold	2023-04-06 11:08:19 +08:00
Georgi Gerganov	eeaa7b0492	ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781)	2023-04-05 22:11:03 +03:00
Georgi Gerganov	986b6ce9f9	ggml, llama : avoid heavy V transpose + improvements (#775) ggml : - added ggml_view_3d() - ggml_view_tensor() now inherits the stride too - reimplement ggml_cpy() to account for dst stride - no longer require tensor->data to be memory aligned llama : - compute RoPE on 32-bit tensors (should be more accurate) - store RoPE-ed K in the KV cache - store transposed V in the KV cache (significant speed-up) - avoid unnecessary Q copy	2023-04-05 22:07:33 +03:00
Georgi Gerganov	3416298929	Update README.md	2023-04-05 19:54:30 +03:00
Ivan Stepanov	5a8c4f6240	llama : define non-positive top_k; top_k range check (#779) * Define non-positive top_k; top_k range check * minor : brackets --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-04-05 19:20:05 +03:00
at8u	ff05d05c96	miku.sh : add executable bit (#780)	2023-04-05 18:59:13 +03:00
Georgi Gerganov	62b3e81aae	media : add logos and banners	2023-04-05 18:58:31 +03:00
Georgi Gerganov	8d10406d6e	readme : change logo + add bindings + add uis + add wiki	2023-04-05 18:56:20 +03:00
iacore	ed1c214e66	zig : add build.zig (#773) Co-authored-by: Locria Cyber <74560659+locriacyber@users.noreply.github.com>	2023-04-05 18:06:02 +03:00
Ivan Stepanov	0c44427df1	make : missing host optimizations in CXXFLAGS (#763)	2023-04-05 17:38:37 +03:00
Adithya Balaji	594cc95fab	readme : update with CMake and windows example (#748) * README: Update with CMake and windows example * README: update with code-review for cmake build	2023-04-05 17:36:12 +03:00
at8u	88ed5761b8	examples : add Miku.sh (#724) * Add Miku.sh to examples * Add missing line to prompt in Miku.sh * Add --keep param to Miku.sh * Remove '[end_of_conversation]' line from Miku.sh No longer is necessary.	2023-04-05 17:32:42 +03:00
Andrew Duffy	58c438cf7d	Add Accelerate/BLAS when using Swift (#765)	2023-04-05 06:44:24 -04:00
Concedo	5c1920df43	why nobody ever told me the makefile doesnt work outside x86 xD	2023-04-05 17:15:42 +08:00
Concedo	1490cdd71d	change GPT-J and GPT2 KVs to use fp16 instead	2023-04-05 15:53:07 +08:00
Concedo	57e9f929ee	renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags	2023-04-05 15:22:13 +08:00
Concedo	14273fea7a	integrated gpt2 support	2023-04-04 23:15:47 +08:00
Concedo	52de932842	removed main.exe to reduce clutter, added support for rep pen in gptj	2023-04-04 20:43:13 +08:00
Concedo	9c0dbbb08b	Merge branch 'master' into concedo	2023-04-04 00:51:05 +08:00
Concedo	dd2abd8bc7	lower default thread threshold	2023-04-04 00:42:49 +08:00
mgroeber9110	53dbba7695	Windows: reactive sigint handler after each Ctrl-C (#736)	2023-04-03 18:00:55 +02:00
SebastianApel	437e77855a	10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654) * Performance improvement of AVX2 code * Fixed problem with MSVC compiler * Reviewer comments: removed double semicolon, deleted empty line 1962	2023-04-03 09:52:28 +02:00

412 commits