llama.cpp

Author	SHA1	Message	Date
Concedo	ea81eae189	cleanup, up ver (+1 squashed commits) Squashed commits: [1ea303d6] cleanup , up ver (+1 squashed commits) Squashed commits: [79f09b22] cleanup	2023-11-05 22:49:23 +08:00
YellowRoseCx	e2e5fe56a8	KCPP Fetches AMD ROCm Memory without a stick, CC_TURING Gets the Boot, koboldcpp_hipblas.dll Talks To The Hand, and hipBLAS Compiler Finds Its Independence! (#517) * AMD ROCm memory fetching and max mem setting * Update .gitignore with koboldcpp_hipblas.dll * Update CMakeLists.txt remove CC_TURING for AMD * separate hipBLAS compiler, update MMV_Y, move CXX/CC print separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name	2023-11-05 22:23:18 +08:00
Concedo	a62468ec4c	Merge branch 'master' into concedo_experimental should fix multigpu	2023-11-05 22:14:40 +08:00
Concedo	bdf16d7a3c	aria2 needs to show more info	2023-11-05 22:13:22 +08:00
Meng Zhang	3d48f42efc	llama : mark LLM_ARCH_STARCODER as full offload supported (#3945) as done in https://github.com/ggerganov/llama.cpp/pull/3827	2023-11-05 14:40:08 +02:00
Eve	c41ea36eaa	cmake : MSVC instruction detection (fixed up #809) (#3923) * Add detection code for avx * Only check hardware when option is ON * Modify per code review suggestions * Build locally will detect CPU * Fixes CMake style to use lowercase like everywhere else * cleanup * fix merge * linux/gcc version for testing * msvc combines avx2 and fma into /arch:AVX2 so check for both * cleanup * msvc only version * style * Update FindSIMD.cmake --------- Co-authored-by: Howard Su <howard0su@gmail.com> Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>	2023-11-05 10:03:09 +02:00
Eve	a7fac013cf	ci : use intel sde when ci cpu doesn't support avx512 (#3949)	2023-11-05 09:46:44 +02:00
slaren	48ade94538	cuda : revert CUDA pool stuff (#3944) * Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)" This reverts commit `629f917cd6`. * Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)" This reverts commit `d6069051de`. ggml-ci	2023-11-05 09:12:13 +02:00
Concedo	351dcabd3e	lite fix	2023-11-05 14:47:02 +08:00
Concedo	faae84ee1d	removed c flag in wget	2023-11-05 10:21:28 +08:00
henk717	02595f9d21	Colabcpp improvements (#512) * Aria2 * Aria2 Typo fix * Streamlined Wget * Streamlining Fix * Back to .so downloading * Crash colab if no GPU is present * Created using Colaboratory * Restore proper link Colab overwrite the link, manually changing it back so people don't land on my branch. * Restore file juggle * Fixing the colab link... again	2023-11-05 10:19:09 +08:00
Concedo	5e5be717c3	fix for removing inaccessible backends in gui	2023-11-05 10:12:12 +08:00
Kerfuffle	f28af0d81a	gguf-py: Support 01.AI Yi models (#3943)	2023-11-04 16:20:34 -06:00
Concedo	1e7088a80b	autopick cublas in gui if possible, better layer picking logic	2023-11-05 01:35:27 +08:00
Concedo	7a8c0df2e5	Merge branch 'master' into concedo_experimental	2023-11-04 09:18:28 +08:00
Concedo	135001abc4	try to make the tunnel more reliable	2023-11-04 09:18:19 +08:00
Concedo	38471fbe06	tensor core info better printout (+1 squashed commits) Squashed commits: [be4ef93f] tensor core info better printout	2023-11-04 08:38:25 +08:00
Peter Sugihara	d9b33fe95b	metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938)	2023-11-03 21:18:18 +02:00
Xiao-Yong Jin	5ba3746171	ggml-metal: fix yarn rope (#3937)	2023-11-03 14:00:31 -04:00
Concedo	36f43ae834	syntax correction	2023-11-04 00:03:45 +08:00
Concedo	9bc2e35b2e	Merge branch 'master' into concedo_experimental	2023-11-03 23:51:32 +08:00
Concedo	373c20ad51	print error log if tunnel fails	2023-11-03 23:48:21 +08:00
slaren	abb77e7319	ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921)	2023-11-03 12:13:09 +01:00
Concedo	c794fd5ceb	sampler seed added (+1 squashed commits) Squashed commits: [8a1b0d3d] sampler seed added	2023-11-03 17:30:16 +08:00
Concedo	d7729ac3eb	Merge branch 'master' into concedo_experimental	2023-11-03 16:00:05 +08:00
Georgi Gerganov	8f961abdc4	speculative : change default p_accept to 0.5 + CLI args (#3919) ggml-ci	2023-11-03 09:41:56 +02:00
Georgi Gerganov	05816027d6	common : YAYF (yet another YARN fix) (#3925) ggml-ci	2023-11-03 09:24:00 +02:00
cebtenzzre	3fdbe6b66b	llama : change yarn_ext_factor placeholder to -1 (#3922)	2023-11-03 08:31:58 +02:00
Concedo	8c14c81b33	hopefully this fixes the dotnet nonsense	2023-11-03 11:23:56 +08:00
Concedo	bc2027b008	Merge remote-tracking branch 'ceb/fix-fast-ext-factor' into concedo_experimental	2023-11-03 11:21:14 +08:00
Concedo	c07c9b857d	Merge branch 'master' into concedo_experimental # Conflicts: # README.md	2023-11-03 11:17:07 +08:00
cebtenzzre	25fef506cf	llama : change yarn_ext_factor placeholder to -1	2023-11-02 21:53:59 -04:00
Kerfuffle	629f917cd6	cuda : add ROCM aliases for CUDA pool stuff (#3918)	2023-11-02 21:58:22 +02:00
Andrei	51b2fc11f7	cmake : fix relative path to git submodule index (#3915)	2023-11-02 21:40:31 +02:00
Georgi Gerganov	224e7d5b14	readme : add notice about #3912	2023-11-02 20:44:12 +02:00
Georgi Gerganov	c7743fe1c1	cuda : fix const ptrs warning causing ROCm build issues (#3913)	2023-11-02 20:32:11 +02:00
Oleksii Maryshchenko	d6069051de	cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903) * Using cuda memory pools for async alloc/dealloc. * If cuda device doesn't support memory pool then use old implementation. * Removed redundant cublasSetStream --------- Co-authored-by: Oleksii Maryshchenko <omaryshchenko@dtis.com>	2023-11-02 19:10:39 +02:00
Concedo	879061c5d5	noavx2 clblast selector	2023-11-02 23:13:16 +08:00
Concedo	c7c3f3d9ab	updated lite	2023-11-02 22:46:54 +08:00
Concedo	b0c7b88eac	try fix cloudflare tunnel (+2 squashed commit) Squashed commit: [87d96bf2] update remote option [c30bc909] updated fixed colab (+1 squashed commits) Squashed commits: [97b77563] updated fixed colab (+2 squashed commit) Squashed commit: [d851b04c] replaced cloudflare manual dl with remotetunnel in colab [90ff1790] updated lite	2023-11-02 22:27:35 +08:00
Georgi Gerganov	4ff1046d75	gguf : print error for GGUFv1 files (#3908)	2023-11-02 16:22:30 +02:00
Concedo	6dbb8d82b0	Merge branch 'master' into concedo_experimental # Conflicts: # CMakeLists.txt # models/ggml-vocab-llama.gguf	2023-11-02 20:51:45 +08:00
Concedo	42eabf2f2f	rope fixes	2023-11-02 20:41:16 +08:00
slaren	21958bb393	cmake : disable LLAMA_NATIVE by default (#3906)	2023-11-02 14:10:33 +02:00
Concedo	bc4ff72317	not working merge	2023-11-02 17:52:40 +08:00
Georgi Gerganov	2756c4fbff	gguf : remove special-case code for GGUFv1 (#3901) ggml-ci	2023-11-02 11:20:21 +02:00
Georgi Gerganov	1efae9b7dc	llm : prevent from 1-D tensors being GPU split (#3697)	2023-11-02 09:54:44 +02:00
Concedo	fca7a4c054	added noavx2 model for clblast (+1 squashed commits) Squashed commits: [291ecae6] added noavx2 mode for clblast (+1 squashed commits) Squashed commits: [562bc872] wip adding noavx2 cl	2023-11-02 15:22:34 +08:00
cebtenzzre	b12fa0d1c1	build : link against build info instead of compiling against it (#3879) * cmake : fix build when .git does not exist * cmake : simplify BUILD_INFO target * cmake : add missing dependencies on BUILD_INFO * build : link against build info instead of compiling against it * zig : make build info a .cpp source instead of a header Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com> * cmake : revert change to CMP0115 --------- Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>	2023-11-02 08:50:16 +02:00
Georgi Gerganov	4d719a6d4e	cuda : check if this fixes Pascal card regression (#3882)	2023-11-02 08:35:10 +02:00

2556 commits