llama.cpp

Author	SHA1	Message	Date
Concedo	79f9743347	improved console info, fixed utf encoding bugs	2023-03-31 15:38:38 +08:00
Concedo	354d4f232f	fixed linux openblas build errors	2023-03-30 11:55:35 +08:00
Concedo	977a9a246f	Merge remote-tracking branch 'origin/master' into concedo # Conflicts: # .github/workflows/build.yml # README.md	2023-03-30 09:42:51 +08:00
Concedo	0f5b470c04	more library checks	2023-03-30 09:28:04 +08:00
anzz1	9cbc404ba6	ci : re-enable AVX512 testing (Windows-MSVC) (#584 ) * CI: Re-enable AVX512 testing (Windows-MSVC) Now with 100% less base64 encoding * plain __cpuid is enough here	2023-03-29 23:44:39 +03:00
Georgi Gerganov	b51c717d5c	ggml : init time on first ggml_init() call	2023-03-29 22:15:34 +03:00
Georgi Gerganov	0ba76c1e73	llama : fix compile warnings when reading the vocab	2023-03-29 22:13:12 +03:00
Georgi Gerganov	cea1c85948	ggml : add ARM_NEON dequantize_row_q4_1()	2023-03-29 22:10:01 +03:00
Georgi Gerganov	f202ada131	ggml : add ARM_NEON quantize_row_q4_1()	2023-03-29 22:03:07 +03:00
Georgi Gerganov	3b44d30d9b	ggml : add ARM_NEON ggml_vec_dot_q4_1()	2023-03-29 22:03:07 +03:00
Pavol Rusnak	61cbfff5c9	rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600 ) to match filenames of other converters	2023-03-29 20:09:25 +02:00
Thérence	d9ad104440	Create chat-13B.bat (#592 ) * Create chat-13B.bat Same script than chat-13B.sh, but for windows users. Tested and working on windows 10/11 v 22H2 * Apply suggestions from code review --------- Co-authored-by: anzz1 <anzz1@live.com>	2023-03-29 20:21:09 +03:00
Concedo	d8febc8653	renamed main python script	2023-03-30 00:48:44 +08:00
Concedo	664b277c27	integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed.	2023-03-30 00:43:52 +08:00
Georgi Gerganov	b467702b87	readme : fix typos	2023-03-29 19:38:31 +03:00
Georgi Gerganov	516d88e75c	readme : add GPT4All instructions (close #588 )	2023-03-29 19:37:20 +03:00
Georgi Gerganov	53635c081c	py : add GPT4All conversion script For now: copy-paste Too much time for me to deduplicate the python code	2023-03-29 19:29:52 +03:00
Maël Kerbiriou	41318d708e	llama : use the same threshold for OpenBLAS and ggml thread limiting (#577 )	2023-03-29 19:10:07 +03:00
Tobias Lütke	a6956b25a1	add example of re-act pattern (#583 ) * add example of re-act pattern * spelling... * fixed whitespace in reverse prompt issue	2023-03-29 10:10:24 -05:00
anzz1	83df5639eb	Fix GCC warning about binary literal (#595 ) 0b10101010 -> 0xAA /* 0b10101010 */	2023-03-29 13:20:07 +00:00
anzz1	a5c42c4b13	Fix typo in llama.h (#593 )	2023-03-29 13:19:29 +00:00
Concedo	49c4c225b5	Merge branch 'master' into concedo # Conflicts: # .github/workflows/build.yml # .gitignore # CMakeLists.txt # Makefile	2023-03-29 21:08:03 +08:00
Concedo	271307232c	Merged PR with a few changes: - Thread count set equal to cpu_count() if it's < 6, otherwise set to cpu_count()-2 instead. This can be forcibly overwritten by the --threads parameter. Setting all threads=cpu_count() chokes my own PC and slows it down badly, so I'd rather make it optional. - Added localmodehost as a URL parameter in Kobold Lite instead, to avoid monkeypatching the embedded kobold lite directly. It should be parsed via ?localmodehost=(host). Also your updated klite file has the wrong encoding, it should be UTF-8, some of the symbols are incorrect such as the palette icon in settings. Repackaged the new version of Kobold Lite correctly with changes. - Reverting the TK GUI filedialog if no model is provided, because I want to keep it noob friendly for those who don't know how to use command line args. The file dialog only loads if there are no command line args. If command line args are present, the GUI will not trigger. - Modified the argparser to also take positional arguments for backwards compatibility, in addition to the optional argparse flags specified. - Your code does not work if embedded kobold is removed. The embedded KAI variable was not declared in the correct scope, and also Python f-string formatted variables cannot work with raw byte strings. You also have incorrect indentation when returning the response body - have corrected all the above but please do test all codepaths if possible. - There is a good reason to bind to "" (0.0.0.0) instead of a specific IP. It allows receiving requests from all routable interfaces. I don't know why you need an explicitly defined --host flag, but I will leave it there as an optional parameter, though the default should still be to accept from all interfaces. In that way, even if the displayed url is localhost, connecting via 192.168.x.x will also work, for example.	2023-03-29 20:38:57 +08:00
InconsolableCellist	13b4c05d66	Some more code cleanup	2023-03-28 16:59:27 -06:00
anzz1	5a5f8b1501	Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375 ) * Enable Fused-Multiply-Add (FMA) instructions on MSVC __FMA__ macro does not exist in MSVC * Enable F16C/CVT16 vector extensions on MSVC __F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512 * MSVC cvt intrinsics * Add __SSE3__ macro for MSVC too because why not even though it's not currently used for anything when AVX is defined	2023-03-28 22:44:29 +03:00
anzz1	f1217055ea	CI: fix subdirectory path globbing (#546 ) - Changes in subdirectories will now be detected properly - (Windows-MSVC) AVX512 tests temporarily disabled	2023-03-28 22:43:25 +03:00
InconsolableCellist	13addf2a78	Merge branch 'concedo' of github.com:InconsolableCellist/llamacpp-for-kobold into concedo	2023-03-28 13:43:19 -06:00
InconsolableCellist	f7c905b0d0	Minor overhaul of code: * Set number of utilized llama.cpp threads back to os.cpu_count, which had better performance on my machine (20 threads vs. 6, 3m12s vs. 4m42s on 65B) * Using argparse for command line args * Supports binding to a specific interface, for use on LANs/WANs (no longer limited to just 127.0.0.1). Requires modified klite.embd * General code cleanup and passing some parameters around without globals	2023-03-28 13:39:34 -06:00
InconsolableCellist	003365907d	updating to version 17 of embedded koboldAI, and adding host address support	2023-03-28 13:39:10 -06:00
anzz1	7f4c5c6651	llama : fix linkage with mingw (#551 ) * Revert `7e53955` (#542) Still needs to be fixed properly * Fix linking on mingw32	2023-03-28 21:23:09 +03:00
slaren	2a98bc18ea	ggml : add AVX2 implementation of quantize_row_q4_1 (#515 ) * Add AVX2 implementation of quantize_row_q4_1 * Actually use AVX2 * Make quantize_row_q4_1 static Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 21:06:03 +03:00
thement	d0aaff571c	py : add temporary script to convert old ggml files to newer version (#539 ) Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>	2023-03-28 20:55:42 +03:00
Tai Duc Nguyen	d0330fd783	py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403 )	2023-03-28 20:51:29 +03:00
Stephan Walter	99c5b27654	ggml : refactor quantized processing functions (#509 ) * Refactor quantized processing functions * ggml : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 20:13:01 +03:00
DooWoong Lee (David)	692ce3164e	py : removed unused `model` variable and verified that the code functions correctly with `vocab_only` setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547 )	2023-03-28 20:02:34 +03:00
Georgi Gerganov	96f9c0506f	ci : make ctest verbose, hopefully we see what is wrong with the sanitizer	2023-03-28 20:01:09 +03:00
Georgi Gerganov	d502bc7c9d	tests : free llama context at the end of the test	2023-03-28 19:51:55 +03:00
Stephan Walter	436e561931	all : be more strict about converting float to double (#458 ) * Be more strict about converting float to double * Test equivalence of round, SILU implementations Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes. * Fix softmax in perplexity.cpp * all : prefer float over double where appropriate * perplexity : add <cmath> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 19:48:20 +03:00
Jed Fox	20e1e84884	deploy : add a Package.swift for SwiftPM support (#393 ) * Add a Package.swift for SwiftPM support * Swap from exclusions to allowlist	2023-03-28 19:39:01 +03:00
Stephan Walter	c1f885067c	ggml : introduce structs for the q4 data blocks (#356 ) * Introduce structs for the q4 data blocks * ggml : rename quant struct variables + fix ARM_NEON --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-28 18:56:03 +03:00
Georgi Gerganov	e0670260fb	gitignore : add "embedding"	2023-03-28 18:34:35 +03:00
dotpy314	28ba975aea	Check the existence of f16_model_path_base in quantize.py (#574 ) Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>	2023-03-28 18:06:28 +03:00
slaren	a6bdc47cba	Fix usage of F16C intrinsics in AVX code (#563 ) * Fix usage of F16C intrinsics in AVX code when F16C is not defined	2023-03-28 17:26:55 +03:00
anzz1	7b8dbcb78b	main.cpp fixes, refactoring (#571 ) - main: entering empty line passes back control without new input in interactive/instruct modes - instruct mode: keep prompt fix - instruct mode: duplicate instruct prompt fix - refactor: move common console code from main->common	2023-03-28 17:09:55 +03:00
Concedo	bf30406f50	Merge branch 'master' into concedo # Conflicts: # .github/workflows/build.yml # .github/workflows/docker.yml # Makefile # README.md	2023-03-28 17:13:38 +08:00
RJ Adriaansen	4b8efff0e3	Add embedding example to Makefile (#540 )	2023-03-28 09:11:09 +03:00
Concedo	46ddbb22bf	allow url params	2023-03-27 17:40:05 +08:00
Marco Matthies	7e5395575a	Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542 )	2023-03-27 07:55:26 +03:00
Erik Scholz	34c1072e49	ci: add debug build to sanitizer build matrix (#527 )	2023-03-26 15:48:40 +00:00
Stephan Walter	939ad2d3a5	Fix undefined variables in debug build, remove unused variables (#531 )	2023-03-26 15:34:02 +00:00


314 commits
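
Commit 271307232c describes the default thread-count rule in words: use cpu_count() when it is below 6, otherwise cpu_count()-2, with --threads able to override the default. A minimal sketch of that rule follows. The function names default_thread_count and parse_threads are hypothetical and are not taken from the repository; only the threshold, the subtraction, and the --threads flag come from the commit message.

```python
import argparse
import os


def default_thread_count(cpu_count):
    """Default from commit 271307232c: all cores below 6, otherwise leave 2 free."""
    if cpu_count < 6:
        return cpu_count
    return cpu_count - 2


def parse_threads(argv, cpu_count=None):
    """Return the thread count, letting --threads override the computed default.

    cpu_count is injectable so the rule can be checked without depending on
    the machine running the code (an assumption of this sketch).
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    parser = argparse.ArgumentParser()
    parser.add_argument("--threads", type=int,
                        default=default_thread_count(cpu_count))
    return parser.parse_args(argv).threads


if __name__ == "__main__":
    print(parse_threads([]))
```

Under this sketch, the 20-thread machine mentioned in commit f7c905b0d0 would default to 18 threads unless --threads 20 is passed explicitly.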