Commit graph

518 commits

Author SHA1 Message Date
Gustavo Rocha Dias
a9253cdfba
fix - on some OSs the PyInstaller command is case sensitive; in lowercase it doesn't work. (#81) 2023-04-18 17:39:06 +08:00
Concedo
8e923dc6e9 updated kobold lite 2023-04-17 21:33:57 +08:00
Concedo
1f4a69c051 version number api 2023-04-17 19:31:15 +08:00
Concedo
364e2736c9 Merge branch 'master' into concedo 2023-04-17 17:34:50 +08:00
Concedo
763ad172c0 arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation 2023-04-17 17:31:45 +08:00
slaren
47f61aaa5f
Fix: do not close file on mmap (#1017) 2023-04-16 21:27:38 +02:00
Concedo
9581171a9f updated embedded lite again 2023-04-16 22:42:51 +08:00
Concedo
bee6a401fd slight clarity fix 2023-04-16 22:04:19 +08:00
Concedo
96fb12cfa2 Merge branch 'master' into concedo 2023-04-16 21:59:05 +08:00
Concedo
c757fbee1d fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite 2023-04-16 21:54:18 +08:00
Concedo
6548d3b3fb Added prints for stopping sequences, made makefile 1% friendlier to arch linux users 2023-04-16 20:43:17 +08:00
Georgi Gerganov
3173a62eb9
stdout : vertical align outputs for better readability 2023-04-16 13:59:27 +03:00
Concedo
525184930d added a kobold API compatible implementation of stopping sequences 2023-04-16 18:37:49 +08:00
Pavol Rusnak
489537e6cf
examples: add missing <ctime> include for time() (#1011) 2023-04-16 10:13:00 +00:00
nanahi
2d3481c721
Fix msys2 build error and warnings (#1009) 2023-04-16 11:13:42 +02:00
Concedo
8bf2e50a11 converted the cl file to be a string literal instead 2023-04-16 15:57:30 +08:00
Concedo
5a4d1b5d15 Merge branch 'master' into concedo
# Conflicts:
#	CMakeLists.txt
#	Makefile
2023-04-16 14:08:23 +08:00
comex
74f5899df4
convert.py: Fix loading safetensors and ggml format on Windows (#991)
Calling `mmap.mmap` on Windows apparently resets the file offset of the
raw file object (and makes the BufferedReader return a *negative* file
offset).  For safetensors, avoid using the file offset after calling
mmap.  For GGML format, explicitly save and restore the offset.

Fixes #966.
2023-04-15 23:53:21 +02:00
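The save-and-restore approach described in the convert.py commit above (74f5899df4) can be sketched as follows. This is a minimal illustration, not the code from that commit: the helper name `mmap_preserving_offset` is hypothetical, and it assumes a non-empty binary file object opened for reading.

```python
import mmap


def mmap_preserving_offset(fp):
    """Memory-map all of fp read-only without disturbing its position.

    Hypothetical helper: it records the offset before mapping and seeks
    back afterwards, so later reads continue where they left off even on
    platforms where mmap.mmap moves the file position.
    """
    saved_offset = fp.tell()  # remember where the reader currently is
    # A length of 0 maps the whole file.
    mapped = mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
    fp.seek(saved_offset)  # restore explicitly rather than trusting the OS
    return mapped
```

A caller that has already read a header from `fp` can map the file with this helper and then keep reading from the same position.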
Stephan Walter
2f7c8e014e
Fix potential int8 overflow in non-SIMD vec_dot (#986) 2023-04-15 18:28:56 +00:00
Concedo
ad5676810a merge CLBlast improvements - GPU dequant 2023-04-16 01:17:40 +08:00
Concedo
3e992eabb4 Merge remote-tracking branch 'occam/clblast-gpu-dequant' into concedo 2023-04-16 00:26:54 +08:00
Stephan Walter
0ad964631f
Refactor ggml.c for future tensor types (#1001) 2023-04-15 16:25:38 +00:00
Concedo
3eb1c1850e accept non positional model arg 2023-04-16 00:23:07 +08:00
0cc4m
57d046eeb6 Enable dequantization on GPU for ClBlast 2023-04-15 18:04:24 +02:00
Georgi Gerganov
e95b6554b4
ggml : add Q8_0 quantization for intermediate results (#951)
* ggml : add Q8_0 quantization for intermediate results

* quantize-stats : fix test + add it to Makefile default

* Q8: use int8_t, AVX/AVX2 optimizations

* ggml : fix quantize_row_q8_0() ARM_NEON rounding

* minor : updates after rebase to latest master

* quantize-stats : delete obsolete strings

* ggml : fix q4_1 dot func

---------

Co-authored-by: Stephan Walter <stephan@walter.name>
2023-04-15 17:53:22 +03:00
Georgi Gerganov
aa485cee33
ggml : use posix_memalign on non-Windows env 2023-04-15 14:25:45 +03:00
0cc4m
8fbfc80e03 Fix clblast device selection on Linux 2023-04-15 12:02:36 +02:00
Ivan Komarov
c12b14b77f
benchmark : fix result validation in benchmark-q4_0-matmult (#987) 2023-04-15 08:51:54 +03:00
katsu560
106faaf297
cmake : add finding the OpenBLAS header file (#992) 2023-04-15 08:51:11 +03:00
Concedo
d00b865eb1 Merge branch 'master' into concedo
# Conflicts:
#	.devops/full.Dockerfile
#	Makefile
#	flake.nix
2023-04-15 11:33:43 +08:00
Pavol Rusnak
c85e03d12e
Revert "main : alternative instruct mode (Vicuna support, etc.) (#863)" (#982)
This reverts commit f4d277ae17.
2023-04-14 22:58:43 +03:00
Pavol Rusnak
489093548c
py : bump sentencepiece to 0.1.98 to support Python 3.11 (#976) 2023-04-14 19:46:49 +00:00
Stephan Walter
93265e988a
make : fix dependencies, use auto variables (#983) 2023-04-14 22:39:48 +03:00
Pavol Rusnak
c56b715269
Expose type name from ggml (#970)
Avoid duplication of type names in utils

Co-authored-by: Håkon H. Hitland <haakon@likedan.net>
2023-04-14 20:05:37 +02:00
Concedo
ea5d01002f Merge branch 'concedo' of https://github.com/LostRuins/llamacpp-for-kobold into concedo 2023-04-15 01:14:10 +08:00
Concedo
8dc06c7ab3 Fixed compile error in OSX 2023-04-15 01:13:56 +08:00
AlpinDale
624dc8809e
Added openblas and clblas package names for debian (#63) 2023-04-15 01:08:56 +08:00
Concedo
c3b810868d fixed an offset bug? 2023-04-15 00:30:00 +08:00
Tomáš Pazdiora
f4d277ae17
main : alternative instruct mode (Vicuna support, etc.) (#863)
* Add support for configs, add configurable prefixes / suffixes, deprecate instruct mode, add stop prompt

* Add multiline mode, update text input.

* bugfix

* update implementation

* typos

* Change --multiline implementation to be toggled by EOF.

* bugfix

* default multiline mode

* add more configs

* update formatting

* update formatting

* apply suggestions
2023-04-14 18:19:17 +03:00
Concedo
1b1c0730f5 Idk why people keep thinking its an error lol. 2023-04-14 22:58:45 +08:00
Concedo
1003c971ad update embedded kobold lite 2023-04-14 22:54:16 +08:00
Kerfuffle
c9a59b70a5
ggml : add unary and binary map operations (#874)
* GGML map ops proof of concept.

* Various cleanups.

Add handling for task setting.

Add handling for ggml_compute_backward.

Rename functions to ggml_map_unary_f32 and ggml_map_binary_f32

Fix compiler warnings related to casting function pointers and `void *`

Reorder functions and definitions based on the GGML op number.

Use typedefs for map op function pointer types.

* Fix position of map ops cases in ggml_compute_forward
2023-04-14 17:43:55 +03:00
Concedo
932d981222 more make targets 2023-04-14 21:54:18 +08:00
Concedo
a819f22cac Merge branch 'master' into concedo
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
#	flake.nix
2023-04-14 21:40:33 +08:00
Pavol Rusnak
a32f7acc9f
py : cleanup dependencies (#962)
after #545 we do not need torch, tqdm and requests in the dependencies
2023-04-14 15:37:11 +02:00
Concedo
8ad42a1102 read from inputs 2023-04-14 21:30:26 +08:00
Concedo
adb4df78d6 Added SmartContext mode, a way of prompt context manipulation that avoids frequent context recalculation. 2023-04-14 21:24:16 +08:00
Pavol Rusnak
43ffdefb74
py : fix flake8 and isort nitpicks (#960) 2023-04-14 14:23:21 +02:00
Georgi Gerganov
1623a6e9b4
ggml : minor 2023-04-14 13:31:29 +03:00
Georgi Gerganov
c14e0d2f23
ggml : always allocate buffers with size multiple of GGML_MEM_ALIGN 2023-04-14 13:31:15 +03:00
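Rounding an allocation size up to a multiple of an alignment, as the GGML_MEM_ALIGN commit above describes, is commonly done with integer arithmetic. A minimal sketch, assuming an alignment of 16 bytes purely for illustration (the log does not state the actual value of GGML_MEM_ALIGN) and using the hypothetical helper name `round_up_to_alignment`:

```python
MEM_ALIGN = 16  # assumed alignment for illustration; not taken from the commit


def round_up_to_alignment(size, align=MEM_ALIGN):
    """Return the smallest multiple of align that is >= size."""
    if size < 0 or align <= 0:
        raise ValueError("size must be non-negative and align positive")
    # Adding align - 1 before the floor division bumps any partial block
    # up to the next full multiple; exact multiples are left unchanged.
    return (size + align - 1) // align * align
```

Under this assumption a 17-byte request becomes 32 bytes, while a 16-byte request stays at 16.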