Commit graph

1033 commits

Author SHA1 Message Date
Concedo
4afa38e744 Revert "opencl : no need to allocate cl_mem on heap (#1612)"
This reverts commit bb051d9723.
2023-05-31 10:20:23 +08:00
Concedo
56456797f4 Merge branch 'master' into concedo_experimental 2023-05-30 22:15:58 +08:00
Georgi Gerganov
7552ac5863 ggml : sync cgraph import / export API 2023-05-29 19:31:44 +03:00
Georgi Gerganov
5d1830b99d ggml : fix bug in ggml_alibi 2023-05-29 19:30:49 +03:00
Concedo
ea336bfa33 rwkv eos 2023-05-29 22:40:27 +08:00
Concedo
6b3373cb81 revert bad fix 2023-05-29 22:06:12 +08:00
DannyDaemonic
248367605e Work around for recalculating logits in cached prompts (Fixes #1585) (#1609)
* Work around for recalculating logits in cached prompts
2023-05-29 05:13:40 -07:00
Concedo
ef16d09a51 fix for older gcc, updated lite 2023-05-29 18:54:15 +08:00
Concedo
3a73ebe8d2 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.devops/full.Dockerfile
#	.devops/main.Dockerfile
#	Makefile
2023-05-29 16:47:32 +08:00
Concedo
254a9ff12c Merge commit 'ebc5d0651a' into concedo_experimental
# Conflicts:
#	ggml-opencl.cpp
2023-05-29 16:26:24 +08:00
Concedo
30ff1133f5 allow users to rename models for use in horde 2023-05-29 16:01:05 +08:00
Concedo
97b39f875c fixed fstat64 build error on mac 2023-05-29 15:50:07 +08:00
Jiří Podivín
0e730dd23b Adding git in container package dependencies (#1621)
Git added to build packages for version information in docker image

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
2023-05-28 21:45:50 -07:00
Johannes Gäßler
3b126f654f LLAMA_DEBUG adds debug symbols (#1617) 2023-05-28 21:01:02 +02:00
Kerfuffle
1b78ed2081 Only show -ngl option when relevant + other doc/arg handling updates (#1625)
1. Add a `LLAMA_SUPPORTS_GPU_OFFLOAD` define to `llama.h` (defined when compiled with CLBlast or cuBLAS)
2. Update the argument handling in the common example code to only show the `-ngl`, `--n-gpu-layers` option when GPU offload is possible.
3. Add an entry for the `-ngl`, `--n-gpu-layers` option to the `main` and `server` examples documentation
4. Update `main` and `server` examples documentation to use the new style dash separator argument format
5. Update the `server` example to use dash separators for its arguments and add `-ngl` to `--help` (only shown when compiled with appropriate support). It will still support `--memory_f32` and `--ctx_size` for compatibility.
6. Add a warning discouraging use of `--memory-f32` to the `--help` text of the `main` and `server` examples, as well as to the documentation. Rationale: https://github.com/ggerganov/llama.cpp/discussions/1593#discussioncomment-6004356
2023-05-28 11:48:57 -06:00
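A minimal sketch of the compile-time gate described in points 1 and 2 of the commit above: a define such as `LLAMA_SUPPORTS_GPU_OFFLOAD` hides the GPU-only option from `--help` and from argument handling when the binary was built without GPU support. The helper names (`print_usage`, `parse_arg`, `example_params`) are hypothetical, not the actual patch.

```cpp
// Illustrative sketch, not the actual llama.cpp patch: a compile-time define
// such as LLAMA_SUPPORTS_GPU_OFFLOAD gating a GPU-only option.
// print_usage, parse_arg and example_params are hypothetical names.
#include <cstdio>
#include <cstdlib>
#include <string>

struct example_params {
    int n_gpu_layers = 0;
};

static void print_usage() {
    std::printf("  -h, --help               show this help message\n");
#ifdef LLAMA_SUPPORTS_GPU_OFFLOAD
    // Only advertise the option when the binary was built with GPU support.
    std::printf("  -ngl, --n-gpu-layers N   number of layers to offload to the GPU\n");
#endif
}

static bool parse_arg(const std::string & arg, const char * value, example_params & params) {
    if (arg == "-ngl" || arg == "--n-gpu-layers") {
#ifdef LLAMA_SUPPORTS_GPU_OFFLOAD
        params.n_gpu_layers = std::atoi(value);
#else
        // Built without GPU offload: accept the flag but warn and ignore it.
        std::fprintf(stderr, "warning: built without GPU offload, ignoring %s\n", arg.c_str());
        (void) value; (void) params;
#endif
        return true;
    }
    return false;
}
```

Per the commit message, the define itself lives in `llama.h` and is set when compiling with CLBlast or cuBLAS, so the same example source produces the right `--help` output for each build configuration.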
Vladimir Zorin
337aea1139 examples : add --alias option to gpt_params to set a user-friendly model name (#1614) 2023-05-28 20:14:24 +03:00
Howard Su
bb051d9723 opencl : no need to allocate cl_mem on heap (#1612) 2023-05-28 20:13:36 +03:00
Howard Su
ca74884f66 opencl : use strstr to check if fp16 supported (#1611)
* Use strstr to check if fp16 supported

* Ensure ext_buffer is null terminated
2023-05-28 20:09:56 +03:00
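The general technique behind this change, sketched below under assumed buffer sizes (not the exact `ggml-opencl.cpp` code): read `CL_DEVICE_EXTENSIONS` into a buffer, guarantee null termination, then `strstr` for the fp16 extension.

```cpp
// Sketch of the general technique, not the exact ggml-opencl.cpp change:
// query the device extension string, make sure it is null terminated,
// then use strstr to look for the fp16 extension.
#include <CL/cl.h>
#include <cstring>

static bool device_supports_fp16(cl_device_id device) {
    char ext_buffer[1024] = {0};
    // Leave the last byte untouched so the string stays null terminated
    // even if the driver fills the whole buffer.
    clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS,
                    sizeof(ext_buffer) - 1, ext_buffer, NULL);
    return strstr(ext_buffer, "cl_khr_fp16") != NULL;
}
```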
Concedo
28f1196f65 adjust default rep pen range 2023-05-28 19:36:21 +08:00
Concedo
7d159bacd7 updated kobold lite 2023-05-28 11:23:20 +08:00
apcameron
a6704643b6 ggml : add support for the RISCV architecture (#1616) 2023-05-27 23:03:25 +03:00
Concedo
dcc426e2de Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
2023-05-28 01:08:39 +08:00
Kerfuffle
0df7d63e5b Include server in releases + other build system cleanups (#1610)
Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets built. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases.

Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default)

Fix issue where `vdot` binary wasn't removed when running `make clean`.

Fix compile warnings in `server` example.

Add `.hpp` files to trigger workflow (the server example has one).
2023-05-27 11:04:14 -06:00
Concedo
5d9f5b28a6 rwkv integration completed 2023-05-28 00:48:56 +08:00
Henri Vasserman
97c9b77c4f Add documentation about CLBlast (#1604)
Installing, compiling and using.
2023-05-27 18:47:55 +03:00
Concedo
55e0fbf024 wip integrating new rwkv 2023-05-27 22:45:28 +08:00
Henri Vasserman
0ecb1bbbeb [CI] Fix openblas (#1613)
* Fix OpenBLAS build

* Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.
2023-05-27 17:24:06 +03:00
Georgi Gerganov
93618031c7 ggml : add ggml_tensor_overhead() 2023-05-27 16:19:56 +03:00
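`ggml_tensor_overhead()` reports the fixed per-tensor bookkeeping cost, which callers typically add on top of the raw data size when budgeting a context buffer. A minimal sketch of that usage (the tensor count and sizes are arbitrary example values):

```cpp
// Sketch of typical usage: size a ggml context as
// n_tensors * (per-tensor overhead + per-tensor data).
#include "ggml.h"

int main(void) {
    const int    n_tensors = 2;
    const size_t data_sz   = 1024 * sizeof(float); // payload of one 1024-element F32 tensor

    struct ggml_init_params params = {
        /* .mem_size   = */ n_tensors * (ggml_tensor_overhead() + data_sz),
        /* .mem_buffer = */ NULL,
        /* .no_alloc   = */ false,
    };

    struct ggml_context * ctx = ggml_init(params);
    struct ggml_tensor  * a   = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1024);
    struct ggml_tensor  * b   = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1024);
    (void) a; (void) b;

    ggml_free(ctx);
    return 0;
}
```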
Henri Vasserman
83c54e6da5 [CI] CLBlast: Fix directory name (#1606) 2023-05-27 14:18:25 +02:00
Concedo
fe63bfdb0f Revert "allow 2048 blasbatchsize"
This reverts commit 94dc5c2324.
2023-05-27 18:13:27 +08:00
Concedo
94dc5c2324 allow 2048 blasbatchsize 2023-05-27 17:47:18 +08:00
Concedo
92a0d77712 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
2023-05-27 17:44:14 +08:00
Concedo
abfdfb702e added top_a sampler 2023-05-27 17:32:37 +08:00
Georgi Gerganov
bdbda1b17a ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name()) 2023-05-27 12:23:16 +03:00
0cc4m
ebc5d0651a Use events instead of clFinish, where possible 2023-05-27 10:03:35 +02:00
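The pattern this commit moves toward, sketched generically below (not the ggml-opencl code itself): instead of `clFinish()`, which blocks until every command in the queue has completed, attach a `cl_event` to the enqueued transfer and let only the dependent command wait on it.

```cpp
// Generic pattern, not the ggml-opencl code itself: replace a queue-wide
// clFinish with a per-command event dependency.
#include <CL/cl.h>

static void upload_then_run(cl_command_queue queue, cl_mem dst, const void * src,
                            size_t size, cl_kernel kernel) {
    cl_event write_done;
    // Non-blocking copy; completion is signalled through write_done.
    clEnqueueWriteBuffer(queue, dst, CL_FALSE, 0, size, src, 0, NULL, &write_done);

    // The kernel waits only on this one transfer instead of draining the queue.
    size_t global = 256;
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL, 1, &write_done, NULL);

    clReleaseEvent(write_done);
}
```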
Concedo
01a0f206df added support for starcoder, which is basically gpt2 2023-05-27 13:35:40 +08:00
Concedo
6d7749c98f no difference 2023-05-27 12:42:19 +08:00
Concedo
bd4fe936f5 cleanup sampling code 2023-05-27 11:58:39 +08:00
Concedo
3c8f404243 integrated token probability viewer in debugmode 2023-05-26 16:40:26 +08:00
Kerfuffle
66874d4fbc Some improvements to loading the session with --prompt-cache (#1550)
Improvements to loading the session with `--prompt-cache` in the `main` example.

1. Fix an issue where the `--seed` parameter was ignored when loading a cached prompt.
2. When loading a cached prompt, you previously had to specify the saved prompt (or a prefix of it) again. This pull changes that behavior to default to the prompt that was cached if a prompt wasn't specified by the user.
2023-05-25 20:18:01 -06:00
Johannes Gäßler
1fcdcc28b1 cuda : performance optimizations (#1530)
* xor hack

* block y dim

* loop unrolling

* Fixed cmake LLAMA_CUDA_BY option

* Removed hipblas compatibility code

* Define GGML_CUDA_DMMV_BLOCK_Y if not defined

* Fewer iters, more ops per iter

* Renamed DMMV X/Y compilation options
2023-05-26 00:07:29 +03:00
Concedo
8b8f2f4cf5 up ver to 1.25.1 2023-05-25 14:49:30 +08:00
Concedo
e6eeb234f1 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
2023-05-25 10:34:43 +08:00
Concedo
d2da155661 upgraded clblast 2023-05-25 10:18:12 +08:00
Concedo
37a34deaa0 added a second pyinstaller for my own use that uses a different python version. don't use this. 2023-05-24 23:34:11 +08:00
Concedo
bf482d1786 revert klite newline bug, trying to add win7 support 2023-05-24 22:21:01 +08:00
Concedo
844f92688a subpattern fix 2023-05-24 16:48:39 +08:00
Henri Vasserman
ac7876ac20 Update CLBlast to 1.6.0 (#1580)
* Update CLBlast to 1.6.0
2023-05-24 10:30:09 +03:00
Concedo
d04b3bbe5e disable mmap when failsafe mode selected from GUI 2023-05-24 15:04:17 +08:00
Evan Jones
c31bbe934b readme : add docs for chat-persistent.sh (#1568)
* readme : add docs for chat-persistent.sh

* Update README.md
2023-05-24 09:24:01 +03:00