Commit graph

2233 commits

Author SHA1 Message Date
Concedo
ca8b315202 increase context for gguf to 32k, horde worker stats, fixed glitch in horde launcher ui, oai freq penalty, updated lite 2023-09-28 23:50:08 +08:00
Kevin Ji
45855b3f1c
docs : mark code as Bash (#3375) 2023-09-28 09:11:32 -04:00
Pierre Alexandre SCHEMBRI
4aea3b846e
readme : add Mistral AI release 0.1 (#3362) 2023-09-28 15:13:37 +03:00
slaren
da0400344b
ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370)
* ggml-cuda : perform cublas fp16 matrix multiplication as fp16

* try to fix rocm build

* restrict fp16 mat mul to volta and up
2023-09-28 13:08:28 +03:00
Concedo
6a821b268a improved SSE streaming 2023-09-28 17:33:34 +08:00
Zhang Peiyuan
e519621010
convert : remove bug in convert.py permute function (#3364) 2023-09-27 20:45:20 +02:00
Richard Roberson
ac43576124
make-ggml.py : compatibility with more models and GGUF (#3290)
* Resync my fork with new llama.cpp commits

* examples : rename to use dash instead of underscore

* New model conversions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-27 19:25:12 +03:00
Cebtenzzre
20c7e1e804
gguf : fix a few general keys (#3341) 2023-09-27 12:18:07 -04:00
Rickard Hallerbäck
dc6897404e
metal : reusing llama.cpp logging (#3152)
* metal : reusing llama.cpp logging

* cmake : build fix

* metal : logging callback

* metal : logging va_args memory fix

* metal : minor cleanup

* metal : setting function like logging macro to capital letters

* llama.cpp : trailing whitespace fix

* ggml : log level enum used by llama

* Makefile : cleanup ggml-metal recipe

* ggml : ggml_log_callback typedef

* ggml : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-27 18:48:33 +03:00
Jag Chadha
527e57cfd8
build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) 2023-09-27 18:34:32 +03:00
BarfingLemurs
ffe88a36a9
readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340)
* Update README.md

* Update README.md

* Update README.md with k-quants bpw measurements
2023-09-27 18:30:36 +03:00
Concedo
38d4c6cedd updated lite 2023-09-27 16:06:17 +08:00
Concedo
cf31658cbf added a flag to keep console in foreground 2023-09-27 01:53:30 +08:00
Concedo
74edc401c1 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	README.md
#	flake.nix
#	scripts/build-info.cmake
#	scripts/verify-checksum-models.py
2023-09-27 01:30:15 +08:00
Concedo
eb86cd4027 bump token limits 2023-09-27 01:26:00 +08:00
Concedo
8bf6f7f8b0 added simulated OAI endpoint 2023-09-27 00:49:24 +08:00
Concedo
7f112e2cd4 support genkeys in polled streaming 2023-09-26 23:46:07 +08:00
DAN™
99115f3fa6
cmake : fix build-info.h on MSVC (#3309) 2023-09-25 18:45:33 -04:00
2f38b454
1726f9626f
docs: Fix typo CLBlast_DIR var. (#3330) 2023-09-25 20:24:52 +02:00
Concedo
6c2134a860 improved makefile, allowing building without k quants 2023-09-25 22:10:47 +08:00
Erik Scholz
a98b1633d5
nix : add cuda, use a symlinked toolkit for cmake (#3202) 2023-09-25 13:48:30 +02:00
Concedo
17ee719c56 improved remotelink cmd, fixed lib unload, updated class.py 2023-09-25 17:50:00 +08:00
Concedo
fdadbd0fbb updated lite (+1 squashed commits)
Squashed commits:

[b4408c79] updated lite
2023-09-24 23:07:37 +08:00
Concedo
8ecf505d5d improved embedded horde worker (+2 squashed commit)
Squashed commit:

[99234379] improved embedded horde worker

[ebcd1968] update lite
2023-09-24 15:16:49 +08:00
slaren
c091cdfb24
llama-bench : add README (#3317)
* llama-bench : add README

* minor edit
2023-09-23 21:48:24 +02:00
Concedo
32cf02487e colab use mmq, update lite and ver 2023-09-23 23:32:00 +08:00
Cebtenzzre
51a7cf5c6e
examples : fix RoPE defaults to match PR #3240 (#3315) 2023-09-23 12:28:50 +03:00
Concedo
60098a176b update colab model 2023-09-23 16:30:40 +08:00
Concedo
bfc696fcc4 update lite, update ver 2023-09-23 12:35:23 +08:00
Kevin Ji
bedb92b603
scripts : use /usr/bin/env in shebang (#3313) 2023-09-22 23:52:23 -04:00
Concedo
bd2500db36 Merge branch 'master' into concedo_experimental
# Conflicts:
#	.github/workflows/build.yml
#	README.md
#	build.zig
#	flake.nix
2023-09-23 10:51:34 +08:00
Concedo
a64d182b8b sched yield fix again 2023-09-23 10:44:41 +08:00
Concedo
1f9e36c733 minor lite fixes 2023-09-23 09:37:49 +08:00
Concedo
de4e27904d clear reader copy on new gen 2023-09-23 00:13:19 +08:00
Lee Drake
bc9d3e3971
Update README.md (#3289)
* Update README.md

* Update README.md

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>
2023-09-21 21:00:24 +02:00
shibe2
36b904e200
ggml-opencl.cpp: Make private functions static (#3300) 2023-09-21 14:10:26 -04:00
Concedo
14295922f9 updated ver, updated lite (+1 squashed commits)
Squashed commits:

[891291bc] updated lite to v67
2023-09-21 17:44:01 +08:00
Edward Taylor
324f3403d5
zig : fix for updated c lib (#3259) 2023-09-21 12:08:20 +03:00
yuiseki
f56c418ab0
embedding : update README.md (#3224) 2023-09-21 11:57:40 +03:00
Johannes Gäßler
8185710a80
CUDA: use only 1 thread if fully offloaded (#2915) 2023-09-21 11:43:53 +03:00
Georgi Gerganov
7eb41179ed
readme : update hot topics 2023-09-20 20:48:22 +03:00
Cebtenzzre
a5661d7e71
llama : allow gguf RoPE keys to be overridden with defaults (#3240) 2023-09-20 12:12:47 -04:00
Cebtenzzre
65c2c1c5ab
benchmark-matmult : do not use integer abs() on a float (#3277) 2023-09-20 12:06:08 -04:00
Concedo
2dda63a4eb add tensor split field 2023-09-20 22:46:47 +08:00
kang
80834daecf
flake : Restore default package's buildInputs (#3262) 2023-09-20 15:48:22 +02:00
Concedo
712b8423f6 class.py changes 2023-09-20 21:27:49 +08:00
Concedo
b63cf223c9 add queue info 2023-09-20 21:07:21 +08:00
Concedo
0eb52cf6c2 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
2023-09-20 21:01:34 +08:00
Concedo
006e87cb56 requirements txt 2023-09-20 21:00:23 +08:00
Alon
a40f2b656f
CI: FreeBSD fix (#3258)
* - freebsd ci: use qemu
2023-09-20 14:06:36 +02:00