llama.cpp

Author	SHA1	Message	Date
Concedo	fb3bcac368	handle memory separately for kcpp	2023-11-07 17:15:14 +08:00
Concedo	f277ed0e8c	Merge branch 'master' into concedo_experimental # Conflicts: # Makefile	2023-11-07 15:23:08 +08:00
Meng Zhang	46876d2a2c	cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946 ) * protyping the idea that supports running on CPU for a GGML_USE_CUBLAS=on build * doc: add comments to ggml_cublas_loaded() * fix defined(...)	2023-11-07 08:49:08 +02:00
Damian Stewart	381efbf480	llava : expose as a shared library for downstream projects (#3613 ) * wip llava python bindings compatibility * add external llava API * add base64 in-prompt image support * wip refactor image loading * refactor image load out of llava init * cleanup * further cleanup; move llava-cli into its own file and rename * move base64.hpp into common/ * collapse clip and llava libraries * move llava into its own subdir * wip * fix bug where base64 string was not removed from the prompt * get libllava to output in the right place * expose llava methods in libllama.dylib * cleanup memory usage around clip_image_* * cleanup and refactor again * update headerdoc * build with cmake, not tested (WIP) * Editorconfig * Editorconfig * Build with make * Build with make * Fix cyclical depts on Windows * attempt to fix build on Windows * attempt to fix build on Windows * Upd TODOs * attempt to fix build on Windows+CUDA * Revert changes in cmake * Fix according to review comments * Support building as a shared library * address review comments --------- Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	2023-11-07 00:36:23 +03:00
Concedo	feb60bc447	tokenizer tweaks (+2 squashed commit) Squashed commit: [18c70621] tokenizer tweaks [8002f897] handle if localstorage is inaccessible	2023-11-06 23:51:26 +08:00
Concedo	372cfef2c3	Merge branch 'concedo' into concedo_experimental	2023-11-06 20:16:07 +08:00
Concedo	2102942121	testing LLAMA_PORTABLE flag for building	2023-11-06 20:15:15 +08:00
Concedo	78ca0667a4	Merge branch 'master' into concedo_experimental	2023-11-06 16:58:58 +08:00
Concedo	93c4b2a9c6	add force rebuild	2023-11-06 14:33:42 +08:00
Concedo	2f16eccb89	special colab build	2023-11-06 01:46:58 +08:00
slaren	2833a6f63c	ggml-cuda : fix f16 mul mat (#3961 ) * ggml-cuda : fix f16 mul mat ggml-ci * silence common.cpp warning (bonus)	2023-11-05 18:45:16 +01:00
Kerfuffle	d9ccce2e33	Allow common process_escapes to handle \x sequences (#3928 ) * Allow common process_escapes to handle \x sequences * Fix edge case when second hex digit is NUL	2023-11-05 10:06:06 -07:00
Thái Hoàng Tâm	bb60fd0bf6	server : fix typo for --alias shortcut from -m to -a (#3958 )	2023-11-05 18:15:27 +02:00
Jared Van Bortel	132d25b8a6	cuda : fix disabling device with --tensor-split 1,0 (#3951 ) Co-authored-by: slaren <slarengh@gmail.com>	2023-11-05 10:08:57 -05:00
Concedo	2b32b170a1	clang 15 check for macOS	2023-11-05 22:57:05 +08:00
Concedo	ea81eae189	cleanup, up ver (+1 squashed commits) Squashed commits: [1ea303d6] cleanup , up ver (+1 squashed commits) Squashed commits: [79f09b22] cleanup	2023-11-05 22:49:23 +08:00
YellowRoseCx	e2e5fe56a8	KCPP Fetches AMD ROCm Memory without a stick, CC_TURING Gets the Boot, koboldcpp_hipblas.dll Talks To The Hand, and hipBLAS Compiler Finds Its Independence! (#517 ) * AMD ROCm memory fetching and max mem setting * Update .gitignore with koboldcpp_hipblas.dll * Update CMakeLists.txt remove CC_TURING for AMD * separate hipBLAS compiler, update MMV_Y, move CXX/CC print separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name	2023-11-05 22:23:18 +08:00
Concedo	a62468ec4c	Merge branch 'master' into concedo_experimental should fix multigpu	2023-11-05 22:14:40 +08:00
Concedo	bdf16d7a3c	aria2 needs to show more info	2023-11-05 22:13:22 +08:00
Meng Zhang	3d48f42efc	llama : mark LLM_ARCH_STARCODER as full offload supported (#3945 ) as done in https://github.com/ggerganov/llama.cpp/pull/3827	2023-11-05 14:40:08 +02:00
Eve	c41ea36eaa	cmake : MSVC instruction detection (fixed up #809 ) (#3923 ) * Add detection code for avx * Only check hardware when option is ON * Modify per code review sugguestions * Build locally will detect CPU * Fixes CMake style to use lowercase like everywhere else * cleanup * fix merge * linux/gcc version for testing * msvc combines avx2 and fma into /arch:AVX2 so check for both * cleanup * msvc only version * style * Update FindSIMD.cmake --------- Co-authored-by: Howard Su <howard0su@gmail.com> Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>	2023-11-05 10:03:09 +02:00
Eve	a7fac013cf	ci : use intel sde when ci cpu doesn't support avx512 (#3949 )	2023-11-05 09:46:44 +02:00
slaren	48ade94538	cuda : revert CUDA pool stuff (#3944 ) * Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)" This reverts commit `629f917cd6`. * Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)" This reverts commit `d6069051de`. ggml-ci	2023-11-05 09:12:13 +02:00
Concedo	351dcabd3e	lite fix	2023-11-05 14:47:02 +08:00
Concedo	faae84ee1d	removed c flag in wget	2023-11-05 10:21:28 +08:00
henk717	02595f9d21	Colabcpp improvements (#512 ) * Aria2 * Aria2 Typo fix * Streamlined Wget * Streamlining Fix * Back to .so downloading * Crash colab if no GPU is present * Created using Colaboratory * Restore proper link Colab overwrite the link, manually changing it back so people don't land on my branch. * Restore file juggle * Fixing the colab link... again	2023-11-05 10:19:09 +08:00
Concedo	5e5be717c3	fix for removing inaccessible backends in gui	2023-11-05 10:12:12 +08:00
Kerfuffle	f28af0d81a	gguf-py: Support 01.AI Yi models (#3943 )	2023-11-04 16:20:34 -06:00
Concedo	1e7088a80b	autopick cublas in gui if possible, better layer picking logic	2023-11-05 01:35:27 +08:00
Concedo	7a8c0df2e5	Merge branch 'master' into concedo_experimental	2023-11-04 09:18:28 +08:00
Concedo	135001abc4	try to make the tunnel more reliable	2023-11-04 09:18:19 +08:00
Concedo	38471fbe06	tensor core info better printout (+1 squashed commits) Squashed commits: [be4ef93f] tensor core info better printout	2023-11-04 08:38:25 +08:00
Peter Sugihara	d9b33fe95b	metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938 )	2023-11-03 21:18:18 +02:00
Xiao-Yong Jin	5ba3746171	ggml-metal: fix yarn rope (#3937 )	2023-11-03 14:00:31 -04:00
Concedo	36f43ae834	syntax correction	2023-11-04 00:03:45 +08:00
Concedo	9bc2e35b2e	Merge branch 'master' into concedo_experimental	2023-11-03 23:51:32 +08:00
Concedo	373c20ad51	print error log if tunnel fails	2023-11-03 23:48:21 +08:00
slaren	abb77e7319	ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921 )	2023-11-03 12:13:09 +01:00
Concedo	c794fd5ceb	sampler seed added (+1 squashed commits) Squashed commits: [8a1b0d3d] sampler seed added	2023-11-03 17:30:16 +08:00
Concedo	d7729ac3eb	Merge branch 'master' into concedo_experimental	2023-11-03 16:00:05 +08:00
Georgi Gerganov	8f961abdc4	speculative : change default p_accept to 0.5 + CLI args (#3919 ) ggml-ci	2023-11-03 09:41:56 +02:00
Georgi Gerganov	05816027d6	common : YAYF (yet another YARN fix) (#3925 ) ggml-ci	2023-11-03 09:24:00 +02:00
cebtenzzre	3fdbe6b66b	llama : change yarn_ext_factor placeholder to -1 (#3922 )	2023-11-03 08:31:58 +02:00
Concedo	8c14c81b33	hopefully this fixes the dotnet nonsense	2023-11-03 11:23:56 +08:00
Concedo	bc2027b008	Merge remote-tracking branch 'ceb/fix-fast-ext-factor' into concedo_experimental	2023-11-03 11:21:14 +08:00
Concedo	c07c9b857d	Merge branch 'master' into concedo_experimental # Conflicts: # README.md	2023-11-03 11:17:07 +08:00
cebtenzzre	25fef506cf	llama : change yarn_ext_factor placeholder to -1	2023-11-02 21:53:59 -04:00
Kerfuffle	629f917cd6	cuda : add ROCM aliases for CUDA pool stuff (#3918 )	2023-11-02 21:58:22 +02:00
Andrei	51b2fc11f7	cmake : fix relative path to git submodule index (#3915 )	2023-11-02 21:40:31 +02:00
Georgi Gerganov	224e7d5b14	readme : add notice about #3912	2023-11-02 20:44:12 +02:00

2571 commits