Concedo
fe63bfdb0f
Revert "allow 2048 blasbatchsize"
This reverts commit 94dc5c2324.
2023-05-27 18:13:27 +08:00
Concedo
94dc5c2324
allow 2048 blasbatchsize
2023-05-27 17:47:18 +08:00
Concedo
92a0d77712
Merge branch 'master' into concedo_experimental
# Conflicts:
# CMakeLists.txt
# Makefile
2023-05-27 17:44:14 +08:00
Concedo
abfdfb702e
added top_a sampler
2023-05-27 17:32:37 +08:00
Georgi Gerganov
bdbda1b17a
ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name())
2023-05-27 12:23:16 +03:00
Concedo
01a0f206df
added support for starcoder, which is basically gpt2
2023-05-27 13:35:40 +08:00
Concedo
6d7749c98f
no difference
2023-05-27 12:42:19 +08:00
Concedo
bd4fe936f5
cleanup sampling code
2023-05-27 11:58:39 +08:00
Concedo
3c8f404243
integrated token probability viewer in debugmode
2023-05-26 16:40:26 +08:00
Kerfuffle
66874d4fbc
Some improvements to loading the session with --prompt-cache ( #1550 )
Improvements to loading the session with `--prompt-cache` in the `main` example.
1. Fix an issue where the `--seed` parameter was ignored when loading a cached prompt.
2. When loading a cached prompt, you previously had to specify the saved prompt (or a prefix of it) again. This pull changes that behavior to default to the prompt that was cached if a prompt wasn't specified by the user.
2023-05-25 20:18:01 -06:00
Johannes Gäßler
1fcdcc28b1
cuda : performance optimizations ( #1530 )
* xor hack
* block y dim
* loop unrolling
* Fixed cmake LLAMA_CUDA_BY option
* Removed hipblas compatibility code
* Define GGML_CUDA_DMMV_BLOCK_Y if not defined
* Fewer iters, more ops per iter
* Renamed DMMV X/Y compilation options
2023-05-26 00:07:29 +03:00
Concedo
8b8f2f4cf5
up ver to 1.25.1
2023-05-25 14:49:30 +08:00
Concedo
e6eeb234f1
Merge branch 'master' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# README.md
2023-05-25 10:34:43 +08:00
Concedo
d2da155661
upgraded clblast
2023-05-25 10:18:12 +08:00
Concedo
37a34deaa0
added a second pyinstaller for my own use that uses a different python version. don't use this.
2023-05-24 23:34:11 +08:00
Concedo
bf482d1786
revert klite newline bug, trying to add win7 support
2023-05-24 22:21:01 +08:00
Concedo
844f92688a
subpattern fix
2023-05-24 16:48:39 +08:00
Henri Vasserman
ac7876ac20
Update CLBlast to 1.6.0 ( #1580 )
* Update CLBlast to 1.6.0
2023-05-24 10:30:09 +03:00
Concedo
d04b3bbe5e
disable mmap when failsafe mode selected from GUI
2023-05-24 15:04:17 +08:00
Evan Jones
c31bbe934b
readme : add docs for chat-persistent.sh ( #1568 )
* readme : add docs for chat-persistent.sh
* Update README.md
2023-05-24 09:24:01 +03:00
Senemu
1359b6aba5
chat-persistent.sh : use bracket expressions in grep ( #1564 )
2023-05-24 09:16:22 +03:00
Concedo
b314cbfb60
updated lite to support variable streaming lengths
2023-05-24 11:28:35 +08:00
Concedo
c97e10c50c
Merge branch 'master' into concedo_experimental
2023-05-24 00:36:30 +08:00
Concedo
abb9ad789c
fixed other arch
2023-05-24 00:20:43 +08:00
Maarten ter Huurne
7d873811f3
Fix handling of "invalid property" when creating OpenCL command queue ( #1565 )
The `clCreateCommandQueue()` function will return the code
`CL_INVALID_QUEUE_PROPERTIES` when passed unsupported properties,
not `CL_INVALID_PROPERTY` as the original code was checking for.
2023-05-23 19:01:15 +03:00
Concedo
0c0009e4b4
updated lite
2023-05-23 23:18:52 +08:00
Concedo
355007b019
added sampler seed
2023-05-23 21:52:26 +08:00
Concedo
cd4012c3ed
minor fixes to debug logging, fixed a typo, added a new failsafe mode
2023-05-23 21:31:42 +08:00
Concedo
5bf9784381
Merge branch 'master' into concedo_experimental
# Conflicts:
# CMakeLists.txt
# Makefile
# ggml-opencl.cpp
# llama.cpp
2023-05-23 18:19:16 +08:00
0cc4m
2e6cd4b025
OpenCL Token Generation Acceleration ( #1459 )
* Move back to C++ for OpenCL
* Refactor OpenCL code to work more like the CUDA code, add missing functions
* Deduplicate dequant kernels
* Add OpenCL compile options
* Use compile args for preprocessing constants
* Restore default platform + device selection by id behavior
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Henri Vasserman <henv@hot.ee>
2023-05-23 00:33:24 +03:00
Concedo
7894e85788
fixed a bug in previous klite
2023-05-22 21:54:24 +08:00
Concedo
a05da31fe7
updated embedded lite
2023-05-22 20:58:54 +08:00
Concedo
e20e302e87
Merge branch 'master' into concedo_experimental
# Conflicts:
# CMakeLists.txt
# Makefile
2023-05-22 17:05:34 +08:00
Concedo
b9f06a7670
mavx only for windows by default, let them eat march native.
2023-05-22 16:48:55 +08:00
Concedo
981d5ba866
Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# Makefile
# README.md
# ggml-opencl.cpp
# llama.cpp
# otherarch/ggml_v2-opencl-legacy.c
2023-05-22 16:16:48 +08:00
Concedo
169a26d15f
removed unused build targets
2023-05-22 13:53:10 +08:00
Concedo
587308a202
fixed some build errors on linux, changed icon resolution, added more error printing
2023-05-22 12:18:42 +08:00
Steward Garcia
7e4ea5beff
examples : add server example with REST API ( #1443 )
* Added httplib support
* Added readme for server example
* fixed some bugs
* Fix the build error on Macbook
* changed json11 to nlohmann-json
* removed some whitespaces
* remove trailing whitespace
* added support custom prompts and more functions
* some corrections and added as cmake option
2023-05-21 20:51:18 +03:00
Concedo
fea84c3cf5
fix for stupid msvc compiler
2023-05-21 22:41:33 +08:00
Stefan Sydow
7780e4f479
make : .PHONY clean ( #1553 )
2023-05-21 17:03:44 +03:00
Concedo
60e0c67874
fix compile errors on cuda
2023-05-21 21:13:17 +08:00
Concedo
33528f5b1d
fix for cublas
2023-05-21 21:03:36 +08:00
Concedo
994be9a4db
fix for cublas
2023-05-21 21:02:21 +08:00
Concedo
24127ebf98
updated lite, fixed some encoding issues
2023-05-21 17:29:00 +08:00
Georgi Gerganov
265db9834e
ggml : output 3d sizes in ggml_graph_dump_dot()
2023-05-21 11:56:23 +03:00
0cc4m
18e9dd87da
Explicitly set GEMM type
2023-05-21 08:34:17 +02:00
0cc4m
b6b39960c0
Use compile args for preprocessing constants
2023-05-21 08:17:17 +02:00
0cc4m
a1657d0233
Add OpenCL compile options
2023-05-21 07:53:22 +02:00
0cc4m
e41a7ae40c
Fix convert_row_f16 kernel issue
2023-05-21 07:53:22 +02:00
0cc4m
457eff920e
Deduplicate dequant kernels
2023-05-21 07:53:22 +02:00