Commit graph

781 commits

Author SHA1 Message Date
Concedo
8a964e76c8 integrated mirostat as a launch parameter, works on all models 2023-05-06 00:47:17 +08:00
Concedo
851f55325a Merge remote-tracking branch 'temp/concedo' into concedo_experimental 2023-05-05 23:55:53 +08:00
Concedo
2edbcebe27 added optional force versioning flag 2023-05-05 22:02:00 +08:00
Concedo
39f3d1cf48 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
#	examples/quantize/quantize.cpp
2023-05-05 21:34:33 +08:00
Ionoclast Laboratories
2d13786e91
Fix for OpenCL / clbast builds on macOS. (#1329) 2023-05-05 14:18:21 +02:00
Hendrik Langer
8131bc8b56 add new sampling algorithm mirostat 2023-05-05 13:23:47 +02:00
Benjamin Lecaillon
a90e96b266
Convert.py @staticmethod (#1327)
* Line 698 has one #staticmethod and should not

otherwise throw error at unpickle.load() as not callable

* Update convert.py

---------

Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com>
2023-05-05 03:17:07 +03:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) 2023-05-05 00:58:56 +02:00
Ivan Stepanov
34d9f22f44
Wrap exceptions in std::exception to verbose output on exception. (#1316) 2023-05-04 18:56:27 +02:00
Ivan Stepanov
d3e8093e9b
convert: support DT_BF16 tensors (#1309)
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-05-04 18:54:37 +02:00
44670
360cfe5bec
readme : add OpenBuddy link (#1321) 2023-05-04 19:33:31 +03:00
44670
2edbdb0f99
main : add --in-suffix option (#1318)
* adding --in-suffix option

* print input suffix before generation
2023-05-04 18:41:12 +03:00
Ron Jailall
20fbf2a2a0
ggml : change immintrin.h to intrin.h for compatibility (#1307)
* change immintrin.h to intrin.h for compatibility

Building on windows11 arm throws an error on this line. Seems like using intrin.h covers x86 and and arm

* conditional def of intrin.h

* fix typo in ggml.c
2023-05-04 18:05:59 +03:00
DannyDaemonic
db1080876a
Only escape prompts when used with -e (#1311) 2023-05-04 05:08:25 -07:00
DannyDaemonic
c65a7fbfa9
Update main's README.md with new features (#1296) 2023-05-04 03:02:59 -07:00
Tomas
f647ce040f
fix #1224 reverse prompt and multi line (#1297)
* fix reverse prompt and multi line

* Code Formatting

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-04 03:02:30 -07:00
Concedo
c8f7eeb7fd update kobold lite 2023-05-04 14:43:35 +08:00
Concedo
e01dc631f7 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-05-04 14:04:41 +08:00
Concedo
7c129305f5 derp (+1 squashed commits)
Squashed commits:

[8fa8af7] suppress the rwkv Wwrite-strings warnings
2023-05-04 12:16:25 +08:00
Georgi Gerganov
799fdc1b5d
ggml : vectorize Q8_0 quantization
https://github.com/ggerganov/ggml/pull/127#issuecomment-1533648531
2023-05-03 23:24:20 +03:00
khimaros
6daa09d879
examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
Georgi Gerganov
bca9ad938a
minor : fix whitespaces (#1302) 2023-05-03 20:09:42 +03:00
Georgi Gerganov
e2a937ca6a
minor : fix trailing whitespaces 2023-05-03 18:43:23 +03:00
Concedo
ede8e4edbb Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
2023-05-03 23:34:50 +08:00
KASR
b0c71c7b6d
scripts : platform independent script to verify sha256 checksums (#1203)
* python script to verify the checksum of the llama models

Added Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.

* Update README.md

update to the readme for improved readability and to explain the usage of the python checksum verification script

* update the verification script

I've extended the script based on suggestions by @prusnak

The script now checks the available RAM, is there is enough to check the file at once it will do so. If not the file is read in chunks.

* minor improvment

small change so that the available ram is checked and not the total ram

* remove the part of the code that reads the file at once if enough ram is available

based on suggestions from @prusnak i removed the part of the code that checks whether the user had enough ram to read the entire model at once. the file is now always read in chunks.

* Update verify-checksum-models.py

quick fix to pass the git check
2023-05-03 18:31:28 +03:00
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
* fix dan.txt

* miku prompt improvements

* use common characters
2023-05-03 18:26:47 +03:00
Concedo
105f818d45 integrated new version of rwkv from upstream 2023-05-03 23:26:39 +08:00
Concedo
4857739ab5 allow specifying a different thread count for GPU blas 2023-05-03 21:19:59 +08:00
Concedo
89044502fe just use RT 2023-05-03 11:07:36 +08:00
Evan Jones
e216aa0463
llama : only copy used KV cache in get / set state (#1272)
* llama : only copy used KV cache in get / set state

* switch to ggml for copying k, v

* avoid designated initializers
2023-05-02 22:26:13 -04:00
Concedo
f43a63235b priority adjustment for linux fixed 2023-05-03 10:16:43 +08:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173) 2023-05-02 18:46:20 -07:00
DannyDaemonic
13b0c68ed7
Handle signals properly on Windows (#1123) 2023-05-02 18:01:57 -07:00
DannyDaemonic
55bc5f0900
Call sh on build-info.sh (#1294) 2023-05-02 17:52:35 -07:00
kuvaus
9daff419f6
fix build-info.h for git submodules (#1289)
* make git build info work with submodules

---------

Co-authored-by: Green Sky <green@g-s.xyz>
2023-05-03 02:43:43 +02:00
slaren
bf4b22ffe4
fix missing parameters in llama_init_from_gpt_params (#1293) 2023-05-03 01:36:45 +02:00
Ron Evans
67c77799e0
examples : add llama_init_from_gpt_params() common function (#1290)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 23:39:51 +03:00
Georgi Gerganov
0e6cbff1b7
llama : fix compile warnings 2023-05-02 23:09:08 +03:00
Georgi Gerganov
5d5817ca60
ggml : fix 32-bit ARM 2023-05-02 22:14:50 +03:00
Ron Evans
8c9be35ff9
examples : improve vertical alignment of a few variables (#1286)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 20:53:52 +03:00
Marvin Gießing
cc0bb7235c
ggml : fix ppc64le build error and make cmake detect Power processors (#1284)
* Fix ppc64le build issue

* Added support to detect ppc64* processors
2023-05-02 19:42:16 +03:00
Robert Brisita
2bb992f034
llama : allow 0 as a seed number. (#1275) 2023-05-02 19:23:44 +03:00
Ron Evans
e2cd506999
main : switch input_noecho to input_echo to remove negation (#979)
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 19:13:26 +03:00
Concedo
966cd2ce91 Merge remote-tracking branch 'temp/concedo' into concedo_experimental
# Conflicts:
#	koboldcpp.py
2023-05-02 22:43:34 +08:00
Concedo
58f25dce86 added flag to increase processs priority 2023-05-02 22:26:55 +08:00
slaren
2d099e5193
ggml: add names to tensors (#1268)
* ggml: add names to tensors

* minor improvements to dot file formatting
2023-05-02 16:03:00 +02:00
Sergey Kucher
069b3d4c37 Adds --mlock argument 2023-05-02 16:19:37 +03:00
Concedo
5a10ea50da up ver 2023-05-02 18:19:08 +08:00
Concedo
9a9b217e57 updated embedded kobold lite with multiuser chat 2023-05-02 18:18:05 +08:00
Concedo
6f702f2700 fixed stop sequence crash 2023-05-02 14:56:50 +08:00