Commit graph

397 commits

Author SHA1 Message Date
ariez-xyz
b48255db19
add more precise instructions for arch 2023-04-08 10:41:57 +02:00
Concedo
1369b46bb7 notice about false positives 2023-04-08 12:20:48 +08:00
Concedo
d1c957ee64 strip symbols 2023-04-08 00:59:34 +08:00
Concedo
289c40df94 updated embedded kobold 2023-04-07 22:39:20 +08:00
Concedo
1abcdb2394 should not be static 2023-04-07 20:35:19 +08:00
Concedo
43949f7c7c Merge branch 'master' into concedo 2023-04-07 20:34:06 +08:00
Concedo
f322a5820e fixed positional port arg 2023-04-07 17:46:33 +08:00
Concedo
1d48db4f63 dont build quantize 2023-04-07 17:11:26 +08:00
Sergey Alirzaev
cc9cee8e9e
Do not crash when it has nothing to say. (#796)
Otherwise observing this in the interactive mode:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.
2023-04-06 17:59:11 +02:00
Concedo
4f5faf9612 some users report that this repo is now being flagged as malicious?
no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.
2023-04-06 21:49:43 +08:00
Concedo
b56f872b61 update embedded kobold lite 2023-04-06 16:34:51 +08:00
Pavol Rusnak
d2beca95dc
Make docker instructions more explicit (#785) 2023-04-06 08:56:58 +02:00
Concedo
0e889ed6db Merge branch 'master' into concedo
# Conflicts:
#	.gitignore
#	Makefile
#	README.md
2023-04-06 11:14:44 +08:00
Concedo
3d650d0e25 remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold 2023-04-06 11:08:19 +08:00
Georgi Gerganov
eeaa7b0492
ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781) 2023-04-05 22:11:03 +03:00
Georgi Gerganov
986b6ce9f9
ggml, llama : avoid heavy V transpose + improvements (#775)
ggml :

- added ggml_view_3d()
- ggml_view_tensor() now inherits the stride too
- reimplement ggml_cpy() to account for dst stride
- no longer require tensor->data to be memory aligned

llama :

- compute RoPE on 32-bit tensors (should be more accurate)
- store RoPE-ed K in the KV cache
- store transposed V in the KV cache (significant speed-up)
- avoid unnecessary Q copy
2023-04-05 22:07:33 +03:00
Georgi Gerganov
3416298929
Update README.md 2023-04-05 19:54:30 +03:00
Ivan Stepanov
5a8c4f6240
llama : define non-positive top_k; top_k range check (#779)
* Define non-positive top_k; top_k range check

* minor : brackets

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-05 19:20:05 +03:00
at8u
ff05d05c96
miku.sh : add executable bit (#780) 2023-04-05 18:59:13 +03:00
Georgi Gerganov
62b3e81aae
media : add logos and banners 2023-04-05 18:58:31 +03:00
Georgi Gerganov
8d10406d6e
readme : change logo + add bindings + add uis + add wiki 2023-04-05 18:56:20 +03:00
iacore
ed1c214e66
zig : add build.zig (#773)
Co-authored-by: Locria Cyber <74560659+locriacyber@users.noreply.github.com>
2023-04-05 18:06:02 +03:00
Ivan Stepanov
0c44427df1
make : missing host optimizations in CXXFLAGS (#763) 2023-04-05 17:38:37 +03:00
Adithya Balaji
594cc95fab
readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example

* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
at8u
88ed5761b8
examples : add Miku.sh (#724)
* Add Miku.sh to examples

* Add missing line to prompt in Miku.sh

* Add --keep param to Miku.sh

* Remove '[end_of_conversation]' line from Miku.sh

No longer is necessary.
2023-04-05 17:32:42 +03:00
Andrew Duffy
58c438cf7d
Add Accelerate/BLAS when using Swift (#765) 2023-04-05 06:44:24 -04:00
Concedo
5c1920df43 why nobody ever told me the makefile doesnt work outside x86 xD 2023-04-05 17:15:42 +08:00
Concedo
1490cdd71d change GPT-J and GPT2 KVs to use fp16 instead 2023-04-05 15:53:07 +08:00
Concedo
57e9f929ee renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags 2023-04-05 15:22:13 +08:00
Concedo
14273fea7a integrated gpt2 support 2023-04-04 23:15:47 +08:00
Concedo
52de932842 removed main.exe to reduce clutter, added support for rep pen in gptj 2023-04-04 20:43:13 +08:00
Concedo
9c0dbbb08b Merge branch 'master' into concedo 2023-04-04 00:51:05 +08:00
Concedo
dd2abd8bc7 lower default thread threshold 2023-04-04 00:42:49 +08:00
mgroeber9110
53dbba7695
Windows: reactive sigint handler after each Ctrl-C (#736) 2023-04-03 18:00:55 +02:00
SebastianApel
437e77855a
10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654)
* Performance improvement of AVX2 code
* Fixed problem with MSVC compiler
* Reviewer comments: removed double semicolon, deleted empty line 1962
2023-04-03 09:52:28 +02:00
Concedo
06c711d770 Merge branch 'master' into concedo
# Conflicts:
#	.devops/full.Dockerfile
#	README.md
2023-04-03 15:10:08 +08:00
Concedo
eb5b22dda2 rebrand to koboldcpp 2023-04-03 10:35:18 +08:00
Ivan Stepanov
cd7fa95690
Define non-positive temperature behavior (#720) 2023-04-03 02:19:04 +02:00
bsilvereagle
a0c0516416
Remove torch GPU dependencies from the Docker.full image (#665)
By using `pip install torch --index-url https://download.pytorch.org/whl/cpu`
instead of `pip install torch` we can specify we want to install a CPU-only version
of PyTorch without any GPU dependencies. This reduces the size of the Docker image
from 7.32 GB to 1.62 GB
2023-04-03 00:13:03 +02:00
Concedo
8dd8ab1659 Various enhancement and integration pygmalion.cpp 2023-04-03 00:04:43 +08:00
Thatcher Chamberlin
d8d4e865cd
Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
2023-04-02 12:48:57 +02:00
Christian Falch
e986f94829
Added api for getting/setting the kv_cache (#685)
The api provides access methods for retrieving the current memory buffer for the kv_cache and its token number.
It also contains a method for setting the kv_cache from a memory buffer.

This makes it possible to load/save history - maybe support --cache-prompt paramater as well?

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-04-02 12:23:04 +02:00
Marian Cepok
c0bb1d3ce2
ggml : change ne to int64_t (#626) 2023-04-02 13:21:31 +03:00
Concedo
3f4967b827 added new binaries 2023-04-02 17:14:38 +08:00
Concedo
bb965cc120 Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-04-02 17:13:28 +08:00
Concedo
9aabb0d9db massive refactor completed, GPT-J integrated 2023-04-02 17:03:30 +08:00
Leonardo Neumann
6e7801d08d
examples : add gpt4all script (#658) 2023-04-02 10:56:20 +03:00
Stephan Walter
81040f10aa
llama : do not allocate KV cache for "vocab_only == true" (#682)
Fixes sanitizer CI
2023-04-02 10:18:53 +03:00
Fabian
c4f89d8d73
make : use -march=native -mtune=native on x86 (#609) 2023-04-02 10:17:05 +03:00
Murilo Santana
5b70e7de4c
fix default params for examples/main (#697) 2023-04-02 04:41:12 +02:00