Hendrik Langer
8131bc8b56
add new sampling algorithm mirostat
2023-05-05 13:23:47 +02:00
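For context, Mirostat adaptively tunes truncation during sampling so that the surprise (negative log-probability, in bits) of emitted tokens stays near a target value tau. A minimal Python sketch of the Mirostat 2.0 feedback loop follows; it illustrates the published algorithm, not the code added in this commit, and all names are illustrative.

```python
import numpy as np

def mirostat_v2_sample(probs, mu, tau=5.0, eta=0.1, rng=None):
    """One Mirostat 2.0 step: drop tokens whose surprise exceeds mu,
    sample from the remainder, then nudge mu toward the target tau.
    probs is a 1-D array of token probabilities (assumed all > 0)."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]             # token ids, most probable first
    surprise = -np.log2(probs[order])           # per-token surprise in bits
    keep = surprise < mu                        # Mirostat 2.0 truncation rule
    if not keep.any():
        keep[0] = True                          # always keep the most likely token
    kept_ids = order[keep]
    kept_p = probs[kept_ids] / probs[kept_ids].sum()
    token = int(rng.choice(kept_ids, p=kept_p))
    mu -= eta * (-np.log2(probs[token]) - tau)  # feedback update toward tau
    return token, mu
```

Callers carry mu across steps, conventionally initialized to 2 * tau.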
Concedo
c8f7eeb7fd
update kobold lite
2023-05-04 14:43:35 +08:00
Concedo
e01dc631f7
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# README.md
2023-05-04 14:04:41 +08:00
Concedo
7c129305f5
derp (+1 squashed commits)
...
Squashed commits:
[8fa8af7] suppress the rwkv -Wwrite-strings warnings
2023-05-04 12:16:25 +08:00
Georgi Gerganov
799fdc1b5d
ggml : vectorize Q8_0 quantization
...
https://github.com/ggerganov/ggml/pull/127#issuecomment-1533648531
2023-05-03 23:24:20 +03:00
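For reference, Q8_0 quantizes weights in blocks of 32 floats, each stored as one scale plus 32 signed 8-bit values; this commit vectorizes exactly that conversion. A scalar NumPy sketch of the scheme (amax/127 scaling with round-to-nearest, per ggml's reference implementation; not the vectorized C code itself):

```python
import numpy as np

QK8_0 = 32  # ggml Q8_0 block size

def quantize_q8_0(x):
    """Scalar Q8_0 reference: per 32-float block, scale = amax / 127,
    quantized values = round(x / scale) stored as int8."""
    x = np.asarray(x, dtype=np.float32).reshape(-1, QK8_0)
    amax = np.abs(x).max(axis=1)    # per-block absolute maximum
    scale = amax / 127.0
    inv = np.divide(1.0, scale, out=np.zeros_like(scale), where=scale > 0)
    q = np.rint(x * inv[:, None]).astype(np.int8)
    return scale, q

def dequantize_q8_0(scale, q):
    return (q.astype(np.float32) * scale[:, None]).reshape(-1)
```

A round trip loses at most half a quantization step per value, which is why Q8_0 is close to lossless in practice.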
khimaros
6daa09d879
examples : read chat prompts from a template file (#1196)
2023-05-03 20:58:11 +03:00
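Reading chat prompts from a template file generally amounts to loading a text file and substituting placeholders for the participant names. A hypothetical sketch of the pattern (the placeholder syntax here is Python's string.Template, not necessarily what the example uses):

```python
from string import Template

def load_chat_prompt(path, user_name="User", ai_name="Assistant"):
    """Load a chat prompt template and fill in the participant names."""
    with open(path, encoding="utf-8") as f:
        return Template(f.read()).safe_substitute(
            USER_NAME=user_name, AI_NAME=ai_name
        )
```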
Georgi Gerganov
bca9ad938a
minor : fix whitespaces (#1302)
2023-05-03 20:09:42 +03:00
Georgi Gerganov
e2a937ca6a
minor : fix trailing whitespaces
2023-05-03 18:43:23 +03:00
Concedo
ede8e4edbb
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# README.md
2023-05-03 23:34:50 +08:00
KASR
b0c71c7b6d
scripts : platform-independent script to verify sha256 checksums (#1203)
...
* Python script to verify the checksums of the llama models
Added a Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.
* Update README.md
Updated the README for improved readability and to explain the usage of the Python checksum verification script.
* update the verification script
I've extended the script based on suggestions by @prusnak.
The script now checks the available RAM; if there is enough to check the file at once, it does so. If not, the file is read in chunks (see the chunked-hashing sketch after this entry).
* minor improvement
Small change so that the available RAM is checked rather than the total RAM.
* remove the part of the code that reads the file at once if enough RAM is available
Based on suggestions from @prusnak, I removed the part of the code that checks whether the user has enough RAM to read the entire model at once. The file is now always read in chunks.
* Update verify-checksum-models.py
Quick fix to pass the git check.
2023-05-03 18:31:28 +03:00
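The chunked hashing this entry describes is the standard way to checksum multi-gigabyte model files with constant memory: feed the hash object fixed-size pieces instead of reading the whole file. A minimal sketch (the 1 MiB chunk size is an arbitrary choice, not necessarily the script's):

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream a file through SHA256 in fixed-size chunks so memory
    use stays constant regardless of file size."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```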
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
...
* fix dan.txt
* miku prompt improvements
* use common characters
2023-05-03 18:26:47 +03:00
Concedo
105f818d45
integrated new version of rwkv from upstream
2023-05-03 23:26:39 +08:00
Concedo
4857739ab5
allow specifying a different thread count for GPU blas
2023-05-03 21:19:59 +08:00
Concedo
89044502fe
just use RT
2023-05-03 11:07:36 +08:00
Evan Jones
e216aa0463
llama : only copy used KV cache in get / set state (#1272)
...
* llama : only copy used KV cache in get / set state
* switch to ggml for copying k, v
* avoid designated initializers
2023-05-02 22:26:13 -04:00
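The optimization here rests on the fact that the KV cache is allocated for the full context window but only the positions already evaluated contain live data, so saving and restoring state only needs that prefix. A rough NumPy illustration of the idea (shapes and names are hypothetical, not llama.cpp's):

```python
import numpy as np

n_layer, n_ctx, n_embd = 32, 2048, 4096
k_cache = np.zeros((n_layer, n_ctx, n_embd), dtype=np.float16)
n_past = 100  # positions actually evaluated so far

# Save: copy only the used prefix, not the whole n_ctx window.
saved_k = k_cache[:, :n_past, :].copy()

# Restore: write the prefix back; everything past n_past is dead data.
k_cache[:, :n_past, :] = saved_k
```

With a mostly empty context this shrinks the copied state by an order of magnitude or more.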
Concedo
f43a63235b
fixed priority adjustment for Linux
2023-05-03 10:16:43 +08:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173)
2023-05-02 18:46:20 -07:00
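Processing escape sequences means converting literal two-character sequences such as \n in a command-line prompt into the control characters they denote, since shells pass them through verbatim. A hedged sketch of the idea (the commit does this in C over its own set of escapes; the table below is illustrative):

```python
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"', "'": "'"}

def process_escapes(prompt: str) -> str:
    """Replace literal \\n, \\t, ... in a prompt with real control characters."""
    out, i = [], 0
    while i < len(prompt):
        if prompt[i] == "\\" and i + 1 < len(prompt) and prompt[i + 1] in _ESCAPES:
            out.append(_ESCAPES[prompt[i + 1]])
            i += 2
        else:
            out.append(prompt[i])
            i += 1
    return "".join(out)

assert process_escapes(r"User:\nHello") == "User:\nHello"
```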
DannyDaemonic
13b0c68ed7
Handle signals properly on Windows (#1123)
2023-05-02 18:01:57 -07:00
DannyDaemonic
55bc5f0900
Call sh on build-info.sh (#1294)
2023-05-02 17:52:35 -07:00
kuvaus
9daff419f6
fix build-info.h for git submodules (#1289)
...
* make git build info work with submodules
---------
Co-authored-by: Green Sky <green@g-s.xyz>
2023-05-03 02:43:43 +02:00
slaren
bf4b22ffe4
fix missing parameters in llama_init_from_gpt_params (#1293)
2023-05-03 01:36:45 +02:00
Ron Evans
67c77799e0
examples : add llama_init_from_gpt_params() common function (#1290)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 23:39:51 +03:00
Georgi Gerganov
0e6cbff1b7
llama : fix compile warnings
2023-05-02 23:09:08 +03:00
Georgi Gerganov
5d5817ca60
ggml : fix 32-bit ARM
2023-05-02 22:14:50 +03:00
Ron Evans
8c9be35ff9
examples : improve vertical alignment of a few variables (#1286)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 20:53:52 +03:00
Marvin Gießing
cc0bb7235c
ggml : fix ppc64le build error and make cmake detect Power processors (#1284)
...
* Fix ppc64le build issue
* Added support to detect ppc64* processors
2023-05-02 19:42:16 +03:00
Robert Brisita
2bb992f034
llama : allow 0 as a seed number (#1275)
2023-05-02 19:23:44 +03:00
Ron Evans
e2cd506999
main : switch input_noecho to input_echo to remove negation (#979)
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 19:13:26 +03:00
Concedo
966cd2ce91
Merge remote-tracking branch 'temp/concedo' into concedo_experimental
...
# Conflicts:
# koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
58f25dce86
added flag to increase process priority
2023-05-02 22:26:55 +08:00
slaren
2d099e5193
ggml: add names to tensors (#1268)
...
* ggml: add names to tensors
* minor improvements to dot file formatting
2023-05-02 16:03:00 +02:00
Sergey Kucher
069b3d4c37
Adds --mlock argument
2023-05-02 16:19:37 +03:00
Concedo
5a10ea50da
up ver
2023-05-02 18:19:08 +08:00
Concedo
9a9b217e57
updated embedded kobold lite with multiuser chat
2023-05-02 18:18:05 +08:00
Concedo
6f702f2700
fixed stop sequence crash
2023-05-02 14:56:50 +08:00
Concedo
94827172e0
Merge branch 'master' into concedo
...
# Conflicts:
# CMakeLists.txt
# Makefile
# ggml-cuda.cu
# ggml-cuda.h
2023-05-02 14:38:31 +08:00
Concedo
433fa1e8b2
fix for missing stop sequence, added print for exception when loading GUI
2023-05-02 14:18:04 +08:00
Concedo
0703cdf2eb
remove cloudflare insights
2023-05-02 00:38:10 +08:00
DannyDaemonic
f4cef87edf
Add git-based build information for better issue tracking (#1232)
...
* Add git-based build information for better issue tracking
* macOS fix
* "build (hash)" and "CMAKE_SOURCE_DIR" changes
* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
* Fix conditional dependency on missing target
* Broke out build-info.cmake, added a find_package fallback, added build info to all examples, and added dependencies to the Makefile
* 4 space indenting for cmake, attempt to clean up my mess in Makefile
* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01 18:23:47 +02:00
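Git-based build info means capturing the current commit at build time and compiling it into the binary so issue reports pin down the exact source. The commit does this through CMake and the Makefile; a hypothetical Python equivalent of the generation step, for illustration only:

```python
import subprocess

def git_build_info():
    """Ask the working tree for the short commit hash and commit count,
    the two values a generated build-info header would embed."""
    def git(*args):
        return subprocess.check_output(["git", *args], text=True).strip()
    return {
        "BUILD_COMMIT": git("rev-parse", "--short", "HEAD"),
        "BUILD_NUMBER": int(git("rev-list", "--count", "HEAD")),
    }

print(git_build_info())  # e.g. {'BUILD_COMMIT': 'abc1234', 'BUILD_NUMBER': 512}
```

Regenerating the header only when the hash changes (as the last bullet notes) avoids needless rebuilds of everything that includes it.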
slaren
58b367c2d7
cuBLAS: refactor and optimize f16 mat mul performance (#1259)
...
* cuBLAS: refactor, convert fp16 to fp32 on device
* cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16
* fix build
* cuBLAS: update block_q5_1
2023-05-01 18:11:07 +02:00
xloem
ea3a0ad6b6
llama : update stubs for systems without mmap and mlock (#1266)
...
Co-authored-by: John Doe <john.doe@example.com>
2023-05-01 15:58:51 +03:00
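Stubbing mmap/mlock means compiling no-op fallbacks on platforms that lack those calls, so the rest of the code can call them unconditionally. A hedged Python analogue of the graceful-fallback pattern (the commit itself does this with C preprocessor guards):

```python
import ctypes, ctypes.util

def try_mlock(buf: bytearray) -> bool:
    """Best-effort mlock: pin the buffer in RAM where supported,
    otherwise act as a no-op stub and report failure."""
    libc_path = ctypes.util.find_library("c")
    if libc_path is None:
        return False                  # no libc to call: behave as a stub
    libc = ctypes.CDLL(libc_path)
    if not hasattr(libc, "mlock"):
        return False                  # symbol missing: behave as a stub
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    return libc.mlock(addr, len(buf)) == 0
```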
Kerfuffle
2bdc09646d
ggml : fix ggml_used_mem() (#1264)
2023-05-01 14:56:07 +03:00
Georgi Gerganov
70269cae37
llama : fix session load / save (#1263)
2023-05-01 14:54:59 +03:00
slaren
b925f1f1b0
cuBLAS: fall back to pageable memory if pinned alloc fails (#1233)
...
* cuBLAS: fall back to pageable memory if pinned alloc fails
* cuBLAS: do not use pinned memory if env variable GGML_CUDA_NO_PINNED is set
2023-05-01 13:32:22 +02:00
Alex Klinkhamer
90b19bd6ee
llama : let context be const when accessing const data (#1261)
2023-05-01 10:24:20 +03:00
Concedo
4d38795563
add UI for token unbanning
2023-05-01 12:10:21 +08:00
Concedo
3de34ee492
Merge branch 'master' into concedo_experimental
...
# Conflicts:
# CMakeLists.txt
# Makefile
# ggml-opencl.c
2023-05-01 12:03:46 +08:00
Concedo
560dacedbd
update readme
2023-05-01 11:41:25 +08:00
Georgi Gerganov
7ff0dcd320
ggml : fix UB (int << 31)
2023-04-30 22:28:51 +03:00
Pavol Rusnak
6f79699286
build: add armv{6,7,8} support to cmake (#1251)
...
- flags copied from Makefile
- updated comments in both CMakeLists.txt and Makefile to match reality
2023-04-30 20:48:38 +02:00