Concedo
1369b46bb7
notice about false positives
2023-04-08 12:20:48 +08:00
Concedo
d1c957ee64
strip symbols
2023-04-08 00:59:34 +08:00
Concedo
289c40df94
updated embedded kobold
2023-04-07 22:39:20 +08:00
Concedo
1abcdb2394
should not be static
2023-04-07 20:35:19 +08:00
Concedo
43949f7c7c
Merge branch 'master' into concedo
2023-04-07 20:34:06 +08:00
Concedo
f322a5820e
fixed positional port arg
2023-04-07 17:46:33 +08:00
Concedo
1d48db4f63
dont build quantize
2023-04-07 17:11:26 +08:00
Sergey Alirzaev
cc9cee8e9e
Do not crash when it has nothing to say. (#796)
Otherwise observing this in the interactive mode:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.
2023-04-06 17:59:11 +02:00
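Note on #796: the assertion above fires when `back()` is called on an empty vector. The guard below is a minimal Python sketch of that idea, not the code from the commit. The helper name `ends_with_eos` and the end-of-sequence check are assumptions made for illustration.

```python
def ends_with_eos(tokens, eos_id):
    """Return True only if the token list is non-empty and ends with eos_id.

    Hypothetical helper: checking for emptiness first avoids reading the
    last element of an empty sequence, which is the failure mode shown in
    the assertion message above.
    """
    return bool(tokens) and tokens[-1] == eos_id
```

With an empty token list the function returns False instead of raising.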
Concedo
4f5faf9612
some users report that this repo is now being flagged as malicious?
no idea why, but I am removing all prebuilt binaries except libopenblas. windows users can still obtain it from /releases and osx and linux users can rebuild from source code.
2023-04-06 21:49:43 +08:00
Concedo
b56f872b61
update embedded kobold lite
2023-04-06 16:34:51 +08:00
Pavol Rusnak
d2beca95dc
Make docker instructions more explicit (#785)
2023-04-06 08:56:58 +02:00
Concedo
0e889ed6db
Merge branch 'master' into concedo
# Conflicts:
# .gitignore
# Makefile
# README.md
2023-04-06 11:14:44 +08:00
Concedo
3d650d0e25
remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold
2023-04-06 11:08:19 +08:00
Georgi Gerganov
eeaa7b0492
ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781)
2023-04-05 22:11:03 +03:00
Georgi Gerganov
986b6ce9f9
ggml, llama : avoid heavy V transpose + improvements (#775)
ggml :
- added ggml_view_3d()
- ggml_view_tensor() now inherits the stride too
- reimplement ggml_cpy() to account for dst stride
- no longer require tensor->data to be memory aligned
llama :
- compute RoPE on 32-bit tensors (should be more accurate)
- store RoPE-ed K in the KV cache
- store transposed V in the KV cache (significant speed-up)
- avoid unnecessary Q copy
2023-04-05 22:07:33 +03:00
Georgi Gerganov
3416298929
Update README.md
2023-04-05 19:54:30 +03:00
Ivan Stepanov
5a8c4f6240
llama : define non-positive top_k; top_k range check (#779)
* Define non-positive top_k; top_k range check
* minor : brackets
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-05 19:20:05 +03:00
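Note on #779: the commit message does not spell out the chosen semantics. The Python sketch below shows one plausible reading, stated as an assumption: a non-positive `top_k` means "keep every candidate", and a value larger than the vocabulary is clamped to the vocabulary size. The function names are hypothetical and do not come from the repository.

```python
def effective_top_k(top_k, n_vocab):
    """Resolve how many candidates to keep (assumed semantics, see note)."""
    if n_vocab <= 0:
        raise ValueError("n_vocab must be positive")
    if top_k <= 0:
        # Assumption: non-positive top_k disables the filter entirely.
        return n_vocab
    # Range check: never ask for more candidates than exist.
    return min(top_k, n_vocab)


def top_k_filter(logits, top_k):
    """Return the indices of the k highest logits, best first."""
    k = effective_top_k(top_k, len(logits))
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return ranked[:k]
```

For example, `top_k_filter(logits, 0)` and `top_k_filter(logits, 10**6)` both return every index under these assumptions.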
at8u
ff05d05c96
miku.sh : add executable bit (#780)
2023-04-05 18:59:13 +03:00
Georgi Gerganov
62b3e81aae
media : add logos and banners
2023-04-05 18:58:31 +03:00
Georgi Gerganov
8d10406d6e
readme : change logo + add bindings + add uis + add wiki
2023-04-05 18:56:20 +03:00
iacore
ed1c214e66
zig : add build.zig (#773)
Co-authored-by: Locria Cyber <74560659+locriacyber@users.noreply.github.com>
2023-04-05 18:06:02 +03:00
Ivan Stepanov
0c44427df1
make : missing host optimizations in CXXFLAGS (#763)
2023-04-05 17:38:37 +03:00
Adithya Balaji
594cc95fab
readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example
* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
at8u
88ed5761b8
examples : add Miku.sh (#724)
* Add Miku.sh to examples
* Add missing line to prompt in Miku.sh
* Add --keep param to Miku.sh
* Remove '[end_of_conversation]' line from Miku.sh
No longer is necessary.
2023-04-05 17:32:42 +03:00
Andrew Duffy
58c438cf7d
Add Accelerate/BLAS when using Swift (#765)
2023-04-05 06:44:24 -04:00
Concedo
5c1920df43
why nobody ever told me the makefile doesnt work outside x86 xD
2023-04-05 17:15:42 +08:00
Concedo
1490cdd71d
change GPT-J and GPT2 KVs to use fp16 instead
2023-04-05 15:53:07 +08:00
Concedo
57e9f929ee
renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags
2023-04-05 15:22:13 +08:00
Concedo
14273fea7a
integrated gpt2 support
2023-04-04 23:15:47 +08:00
Concedo
52de932842
removed main.exe to reduce clutter, added support for rep pen in gptj
2023-04-04 20:43:13 +08:00
Concedo
9c0dbbb08b
Merge branch 'master' into concedo
2023-04-04 00:51:05 +08:00
Concedo
dd2abd8bc7
lower default thread threshold
2023-04-04 00:42:49 +08:00
mgroeber9110
53dbba7695
Windows: reactive sigint handler after each Ctrl-C (#736)
2023-04-03 18:00:55 +02:00
SebastianApel
437e77855a
10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654)
* Performance improvement of AVX2 code
* Fixed problem with MSVC compiler
* Reviewer comments: removed double semicolon, deleted empty line 1962
2023-04-03 09:52:28 +02:00
Concedo
06c711d770
Merge branch 'master' into concedo
# Conflicts:
# .devops/full.Dockerfile
# README.md
2023-04-03 15:10:08 +08:00
Concedo
eb5b22dda2
rebrand to koboldcpp
2023-04-03 10:35:18 +08:00
Ivan Stepanov
cd7fa95690
Define non-positive temperature behavior (#720)
2023-04-03 02:19:04 +02:00
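Note on #720: dividing logits by a temperature of zero or less is undefined, so some fallback is needed. The Python sketch below assumes the fallback is greedy selection (take the highest logit), which is a common convention. The commit message itself does not confirm this, and `sample_token` is a hypothetical name.

```python
import math
import random


def sample_token(logits, temperature, rng=None):
    """Pick a token index from logits at the given temperature."""
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        # Assumption: non-positive temperature means greedy (argmax) selection.
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

Under this assumption, a temperature of `0.0` or `-1.0` always returns the index of the largest logit, while positive temperatures sample from the softmax distribution.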
bsilvereagle
a0c0516416
Remove torch GPU dependencies from the Docker.full image (#665)
By using `pip install torch --index-url https://download.pytorch.org/whl/cpu`
instead of `pip install torch` we can specify we want to install a CPU-only version
of PyTorch without any GPU dependencies. This reduces the size of the Docker image
from 7.32 GB to 1.62 GB
2023-04-03 00:13:03 +02:00
Concedo
8dd8ab1659
Various enhancement and integration pygmalion.cpp
2023-04-03 00:04:43 +08:00
Thatcher Chamberlin
d8d4e865cd
Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
2023-04-02 12:48:57 +02:00
Christian Falch
e986f94829
Added api for getting/setting the kv_cache (#685)
The api provides access methods for retrieving the current memory buffer for the kv_cache and its token number.
It also contains a method for setting the kv_cache from a memory buffer.
This makes it possible to load/save history - maybe support --cache-prompt parameter as well?
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-04-02 12:23:04 +02:00
Marian Cepok
c0bb1d3ce2
ggml : change ne to int64_t (#626)
2023-04-02 13:21:31 +03:00
Concedo
3f4967b827
added new binaries
2023-04-02 17:14:38 +08:00
Concedo
bb965cc120
Merge branch 'master' into concedo
# Conflicts:
# README.md
2023-04-02 17:13:28 +08:00
Concedo
9aabb0d9db
massive refactor completed, GPT-J integrated
2023-04-02 17:03:30 +08:00
Leonardo Neumann
6e7801d08d
examples : add gpt4all script (#658)
2023-04-02 10:56:20 +03:00
Stephan Walter
81040f10aa
llama : do not allocate KV cache for "vocab_only == true" (#682)
Fixes sanitizer CI
2023-04-02 10:18:53 +03:00
Fabian
c4f89d8d73
make : use -march=native -mtune=native on x86 (#609)
2023-04-02 10:17:05 +03:00
Murilo Santana
5b70e7de4c
fix default params for examples/main (#697)
2023-04-02 04:41:12 +02:00
Concedo
b1f08813e3
added support for gpt4all original format
2023-04-02 00:53:46 +08:00