llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	3693449c07	Update llama.cpp	2023-05-31 15:56:55 -07:00
Andrei Betlen	d595f330e2	Update llama.cpp	2023-05-31 15:56:55 -07:00
Andrei Betlen	ce0ca60b56	Update llama.cpp (llama_mmap_supported)	2023-05-31 15:56:49 -07:00
Mug	d0a7ce9abf	Make windows users happy (hopefully)	2023-05-31 15:25:57 -07:00
Mug	848b4021a3	Better custom library debugging	2023-05-31 15:25:57 -07:00
Mug	c8b5d0b963	Use environment variable for library override	2023-05-31 15:25:57 -07:00
Mug	d1b3517477	Allow local llama library usage	2023-05-31 15:25:57 -07:00
Mug	b36c04c99e	Added iterative search to prevent instructions from being echoed, add ignore eos, add no-mmap, fixed 1 character echo too much bug	2023-05-31 15:25:57 -07:00
Andrei Betlen	f25a81309e	Update model paths to be more clear they should point to file	2023-05-31 15:25:57 -07:00
Mug	e19909249d	More interoperability to the original llama.cpp, and arguments now work	2023-05-31 15:25:57 -07:00
Andrei Betlen	d5680144c5	Bugfix: Wrong size of embeddings. Closes #47	2023-05-31 15:25:57 -07:00
Mug	29e9fb66a3	Better llama.cpp interoperability Has some too many newline issues so WIP (Update) Fixed too many newlines, now onto args. Still needs shipping work so you could do "python -m llama_cpp.examples." etc.	2023-05-31 15:25:57 -07:00
Andrei Betlen	ce66405da1	Add quantize example	2023-05-31 15:25:57 -07:00
Mug	739e8d4c9b	Fix bug in init_break not being set when exited via antiprompt and others.	2023-05-31 15:25:57 -07:00
Mug	ae1f37f505	Fix repeating instructions and an antiprompt bug	2023-05-31 15:25:57 -07:00
Mug	3c1020b866	Fix stripping instruction prompt	2023-05-31 15:25:57 -07:00
Mug	0bfad75406	Added instruction mode, fixed infinite generation, and various other fixes	2023-05-31 15:25:57 -07:00
Mug	9e872410da	Add instruction mode	2023-05-31 15:25:57 -07:00
Mug	15bea0946b	Chat llama.cpp example implementation	2023-05-31 15:25:57 -07:00
MillionthOdin16	2b8147e7a8	Update llama_cpp.py	2023-05-31 15:25:57 -07:00
Andrei Betlen	62ce167b22	Update low level api example	2023-05-31 15:25:57 -07:00
Andrei Betlen	a71cda6546	Update llama.cpp	2023-05-31 15:25:57 -07:00
Andrei Betlen	a279acd680	Update llama.cpp (llama_n_embd)	2023-05-31 15:25:57 -07:00
Andrei Betlen	ef3c152257	Update llama.cpp (llama_progress_callback)	2023-05-31 15:25:57 -07:00
Andrei Betlen	def46dd9a6	Add example based on stripped down version of main.cpp from llama.cpp	2023-05-31 15:25:57 -07:00
Andrei Betlen	5bb1bc74d1	Fix type signature of token_to_str	2023-05-31 15:25:57 -07:00
Andrei Betlen	a7a6d88793	Fix ctypes typing issue for Arrays	2023-05-31 15:25:57 -07:00
Andrei Betlen	019650f416	Fix array type signatures	2023-05-31 15:25:57 -07:00
Andrei Betlen	a3da39af79	Bugfix: cross-platform method to find shared lib	2023-05-31 15:24:39 -07:00
Andrei Betlen	bd1c657f80	Bugfix: wrong signature for quantize function	2023-05-31 15:24:10 -07:00
Andrei Betlen	ef5a9a6160	Update llama.cpp and re-organize low-level api	2023-05-31 15:16:27 -07:00
Andrei Betlen	d9dfdec2bd	Initial commit (llama_cpp.py, llama-cpp-python)	2023-05-31 15:16:11 -07:00
Henri Vasserman	ffb06a345e	OpenLLaMA 3B support (#1588) This adds support to llama.cpp to load the model. Currently missing are changes that are required from convert.py to convert the model correctly. It needs some changes to start reading the JSON configuration for HF models instead of deriving the values by guessing. Co-authored-by: FNsi <125447286+FNsi@users.noreply.github.com>	2023-05-30 21:24:22 +03:00
Georgi Gerganov	7552ac5863	ggml : sync cgraph import / export API	2023-05-29 19:31:44 +03:00
Georgi Gerganov	5d1830b99d	ggml : fix bug in ggml_alibi	2023-05-29 19:30:49 +03:00
DannyDaemonic	248367605e	Work around for recalculating logits in cached prompts (Fixes #1585) (#1609) * Work around for recalculating logits in cached prompts	2023-05-29 05:13:40 -07:00
Jiří Podivín	0e730dd23b	Adding git in container package dependencies (#1621) Git added to build packages for version information in docker image Signed-off-by: Jiri Podivin <jpodivin@gmail.com>	2023-05-28 21:45:50 -07:00
Johannes Gäßler	3b126f654f	LLAMA_DEBUG adds debug symbols (#1617)	2023-05-28 21:01:02 +02:00
Kerfuffle	1b78ed2081	Only show -ngl option when relevant + other doc/arg handling updates (#1625) 1. Add a `LLAMA_SUPPORTS_GPU_OFFLOAD` define to `llama.h` (defined when compiled with CLBlast or cuBLAS) 2. Update the argument handling in the common example code to only show the `-ngl`, `--n-gpu-layers` option when GPU offload is possible. 3. Add an entry for the `-ngl`, `--n-gpu-layers` option to the `main` and `server` examples documentation 4. Update `main` and `server` examples documentation to use the new style dash separator argument format 5. Update the `server` example to use dash separators for its arguments and adds `-ngl` to `--help` (only shown when compiled with appropriate support). It will still support `--memory_f32` and `--ctx_size` for compatibility. 6. Add a warning discouraging use of `--memory-f32` for the `main` and `server` examples `--help` text as well as documentation. Rationale: https://github.com/ggerganov/llama.cpp/discussions/1593#discussioncomment-6004356	2023-05-28 11:48:57 -06:00
Vladimir Zorin	337aea1139	examples : add --alias option to gpt_params to set use friendly model name (#1614)	2023-05-28 20:14:24 +03:00
Howard Su	bb051d9723	opencl : no need to allocate cl_mem on heap (#1612)	2023-05-28 20:13:36 +03:00
Howard Su	ca74884f66	opencl : use strstr to check if fp16 supported (#1611) * Use strstr to check if fp16 supported * Ensure ext_buffer is null terminated	2023-05-28 20:09:56 +03:00
apcameron	a6704643b6	ggml : add support for the RISCV architecture (#1616)	2023-05-27 23:03:25 +03:00
Kerfuffle	0df7d63e5b	Include server in releases + other build system cleanups (#1610) Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets build. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases. Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default) Fix issue where `vdot` binary wasn't removed when running `make clean`. Fix compile warnings in `server` example. Add `.hpp` files to trigger workflow (the server example has one).	2023-05-27 11:04:14 -06:00
Henri Vasserman	97c9b77c4f	Add documentation about CLBlast (#1604) Installing, compiling and using.	2023-05-27 18:47:55 +03:00
Henri Vasserman	0ecb1bbbeb	[CI] Fix openblas (#1613) * Fix OpenBLAS build * Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.	2023-05-27 17:24:06 +03:00
Georgi Gerganov	93618031c7	ggml : add ggml_tensor_overhead()	2023-05-27 16:19:56 +03:00
Henri Vasserman	83c54e6da5	[CI] CLBlast: Fix directory name (#1606)	2023-05-27 14:18:25 +02:00
Georgi Gerganov	bdbda1b17a	ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name())	2023-05-27 12:23:16 +03:00
Kerfuffle	66874d4fbc	Some improvements to loading the session with --prompt-cache (#1550) Improvements to loading the session with `--prompt-cache` in the `main` example. 1. Fix an issue where the `--seed` parameter was ignored when loading a cached prompt. 2. When loading a cached prompt, you previously had to specify the saved prompt (or a prefix of it) again. This pull changes that behavior to default to the prompt that was cached if a prompt wasn't specified by the user.	2023-05-25 20:18:01 -06:00

639 commits