Commit graph

1065 commits

Author SHA1 Message Date
Concedo
7d159bacd7 updated kobold lite 2023-05-28 11:23:20 +08:00
apcameron
a6704643b6
ggml : add support for the RISCV architecture (#1616) 2023-05-27 23:03:25 +03:00
Concedo
dcc426e2de Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
2023-05-28 01:08:39 +08:00
Kerfuffle
0df7d63e5b
Include server in releases + other build system cleanups (#1610)
Set `LLAMA_BUILD_SERVER` in workflow so the `server` example gets built. This currently only applies to Windows builds because it seems like only Windows binary artifacts are included in releases.

Add `server` example target to `Makefile` (still uses `LLAMA_BUILD_SERVER` define and does not build by default)

Fix issue where `vdot` binary wasn't removed when running `make clean`.

Fix compile warnings in `server` example.

Add `.hpp` files to trigger workflow (the server example has one).
2023-05-27 11:04:14 -06:00
Concedo
5d9f5b28a6 rwkv integration completed 2023-05-28 00:48:56 +08:00
Henri Vasserman
97c9b77c4f
Add documentation about CLBlast (#1604)
Installing, compiling and using.
2023-05-27 18:47:55 +03:00
Concedo
55e0fbf024 wip integrating new rwkv 2023-05-27 22:45:28 +08:00
Henri Vasserman
0ecb1bbbeb
[CI] Fix openblas (#1613)
* Fix OpenBLAS build

* Fix `LLAMA_BLAS_VENDOR` CMake variable that should be a string and not a boolean.
2023-05-27 17:24:06 +03:00
Georgi Gerganov
93618031c7
ggml : add ggml_tensor_overhead() 2023-05-27 16:19:56 +03:00
Henri Vasserman
83c54e6da5
[CI] CLBlast: Fix directory name (#1606) 2023-05-27 14:18:25 +02:00
Concedo
fe63bfdb0f Revert "allow 2048 blasbatchsize"
This reverts commit 94dc5c2324.
2023-05-27 18:13:27 +08:00
0cc4m
97c5cca4e5 OpenCL: Don't load gpu layers into RAM, add mul_f32 kernel 2023-05-27 12:00:56 +02:00
Concedo
94dc5c2324 allow 2048 blasbatchsize 2023-05-27 17:47:18 +08:00
Concedo
92a0d77712 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
2023-05-27 17:44:14 +08:00
Concedo
abfdfb702e added top_a sampler 2023-05-27 17:32:37 +08:00
Georgi Gerganov
bdbda1b17a
ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name()) 2023-05-27 12:23:16 +03:00
0cc4m
ebc5d0651a Use events instead of clFinish, where possible 2023-05-27 10:03:35 +02:00
Concedo
01a0f206df added support for starcoder, which is basically gpt2 2023-05-27 13:35:40 +08:00
Concedo
6d7749c98f no difference 2023-05-27 12:42:19 +08:00
Concedo
bd4fe936f5 cleanup sampling code 2023-05-27 11:58:39 +08:00
Concedo
3c8f404243 integrated token probability viewer in debugmode 2023-05-26 16:40:26 +08:00
Kerfuffle
66874d4fbc
Some improvements to loading the session with --prompt-cache (#1550)
Improvements to loading the session with `--prompt-cache` in the `main` example.

1. Fix an issue where the `--seed` parameter was ignored when loading a cached prompt.
2. When loading a cached prompt, you previously had to specify the saved prompt (or a prefix of it) again. This pull changes that behavior to default to the prompt that was cached if a prompt wasn't specified by the user.
2023-05-25 20:18:01 -06:00
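The second change in the commit above (defaulting to the cached prompt when none is given) can be sketched as follows. This is only an illustration of the described behavior, not the code from the pull request: the function name `resolve_prompt` and its parameters are hypothetical, and how the `main` example actually stores and compares prompts is not shown in the log.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the actual llama.cpp code): pick the prompt to
// use after a session has been loaded from a prompt cache.
std::string resolve_prompt(const std::string & user_prompt,
                           const std::string & cached_prompt) {
    // No prompt supplied by the user: default to the prompt that was cached.
    if (user_prompt.empty()) {
        return cached_prompt;
    }
    // Otherwise the user's prompt takes precedence, as before the change.
    return user_prompt;
}
```

Under these assumptions, `resolve_prompt("", cached)` returns `cached`, while any non-empty user prompt is returned unchanged.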
Johannes Gäßler
1fcdcc28b1
cuda : performance optimizations (#1530)
* xor hack

* block y dim

* loop unrolling

* Fixed cmake LLAMA_CUDA_BY option

* Removed hipblas compatibility code

* Define GGML_CUDA_DMMV_BLOCK_Y if not defined

* Fewer iters, more ops per iter

* Renamed DMMV X/Y compilation options
2023-05-26 00:07:29 +03:00
Concedo
8b8f2f4cf5 up ver to 1.25.1 2023-05-25 14:49:30 +08:00
Concedo
e6eeb234f1 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
2023-05-25 10:34:43 +08:00
Concedo
d2da155661 upgraded clblast 2023-05-25 10:18:12 +08:00
Concedo
37a34deaa0 added a second pyinstaller for my own use that uses a different python version. don't use this. 2023-05-24 23:34:11 +08:00
Concedo
bf482d1786 revert klite newline bug, trying to add win7 support 2023-05-24 22:21:01 +08:00
Concedo
844f92688a subpattern fix 2023-05-24 16:48:39 +08:00
Henri Vasserman
ac7876ac20
Update CLBlast to 1.6.0 (#1580)
* Update CLBlast to 1.6.0
2023-05-24 10:30:09 +03:00
Concedo
d04b3bbe5e disable mmap when failsafe mode selected from GUI 2023-05-24 15:04:17 +08:00
Evan Jones
c31bbe934b
readme : add docs for chat-persistent.sh (#1568)
* readme : add docs for chat-persistent.sh

* Update README.md
2023-05-24 09:24:01 +03:00
Senemu
1359b6aba5
chat-persistent.sh : use bracket expressions in grep (#1564) 2023-05-24 09:16:22 +03:00
Concedo
b314cbfb60 updated lite to support variable streaming lengths 2023-05-24 11:28:35 +08:00
Concedo
c97e10c50c Merge branch 'master' into concedo_experimental 2023-05-24 00:36:30 +08:00
Concedo
abb9ad789c fixed other arch 2023-05-24 00:20:43 +08:00
Maarten ter Huurne
7d873811f3
Fix handling of "invalid property" when creating OpenCL command queue (#1565)
The `clCreateCommandQueue()` function will return the code
`CL_INVALID_QUEUE_PROPERTIES` when passed unsupported properties,
not `CL_INVALID_PROPERTY` as the original code was checking for.
2023-05-23 19:01:15 +03:00
Concedo
0c0009e4b4 updated lite 2023-05-23 23:18:52 +08:00
Concedo
355007b019 added sampler seed 2023-05-23 21:52:26 +08:00
Concedo
cd4012c3ed minor fixes to debug logging, fixed a typo, added a new failsafe mode 2023-05-23 21:31:42 +08:00
Concedo
5bf9784381 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	ggml-opencl.cpp
#	llama.cpp
2023-05-23 18:19:16 +08:00
0cc4m
2e6cd4b025
OpenCL Token Generation Acceleration (#1459)
* Move back to C++ for OpenCL

* Refactor OpenCL code to work more like the CUDA code, add missing functions

* Deduplicate dequant kernels

* Add OpenCL compile options

* Use compile args for preprocessing constants

* Restore default platform + device selection by id behavior

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Henri Vasserman <henv@hot.ee>
2023-05-23 00:33:24 +03:00
Concedo
7894e85788 fixed a bug in previous klite 2023-05-22 21:54:24 +08:00
Concedo
a05da31fe7 updated embedded lite 2023-05-22 20:58:54 +08:00
Concedo
e20e302e87 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
2023-05-22 17:05:34 +08:00
Concedo
b9f06a7670 mavx only for windows by default, let them eat march native. 2023-05-22 16:48:55 +08:00
Concedo
981d5ba866 Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	CMakeLists.txt
#	Makefile
#	README.md
#	ggml-opencl.cpp
#	llama.cpp
#	otherarch/ggml_v2-opencl-legacy.c
2023-05-22 16:16:48 +08:00
Concedo
169a26d15f removed unused build targets 2023-05-22 13:53:10 +08:00
Concedo
587308a202 fixed some build errors on linux, changed icon resolution, added more error printing 2023-05-22 12:18:42 +08:00
Steward Garcia
7e4ea5beff
examples : add server example with REST API (#1443)
* Added httplib support

* Added readme for server example

* fixed some bugs

* Fix the build error on Macbook

* changed json11 to nlohmann-json

* removed some whitespaces

* remove trailing whitespace

* added support for custom prompts and more functions

* some corrections and added as cmake option
2023-05-21 20:51:18 +03:00