Concedo
1543c700d8
added a missing endpoint for tavern
2023-04-09 17:41:33 +08:00
Concedo
b91abc3316
increase default blas batch size
2023-04-09 15:27:43 +08:00
Concedo
4d1825263b
Merge branch 'master' into concedo
# Conflicts:
# CMakeLists.txt
# flake.nix
2023-04-09 13:22:40 +08:00
Concedo
26a7933084
hide the tiny tkinter window
2023-04-09 01:01:34 +08:00
Tomáš Pazdiora
aaf3b23deb
fix for windows utf-8 input ( #840 )
Use UTF-16 as input on Windows, since UTF-8 does not work and reads multibyte characters as zeros
2023-04-08 17:49:39 +02:00
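The transcoding step that the UTF-16 fix above implies can be sketched as follows. This is a minimal illustration in Python, not the code from the commit; the function name `utf16_input_to_utf8` is hypothetical, and little-endian UTF-16 is assumed because that is the usual wide-character encoding on Windows.

```python
def utf16_input_to_utf8(raw: bytes) -> bytes:
    """Convert UTF-16LE input bytes into UTF-8 bytes.

    Hypothetical helper: assumes `raw` holds complete UTF-16LE code units.
    """
    # Decode the wide-character input into a Python string first ...
    text = raw.decode("utf-16-le")
    # ... then re-encode it as the UTF-8 that the rest of the program expects.
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = "Tomáš".encode("utf-16-le")
    print(utf16_input_to_utf8(sample))
```

Multibyte characters survive the round trip because the decode step interprets whole UTF-16 code units instead of treating the input as a stream of single bytes.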
eiery
f2d1c47294
cmake should link openblas properly with -lopenblas like how it's done in the makefile ( #839 )
2023-04-08 11:15:17 +00:00
lon
317fb12fbd
Add new binaries to flake.nix ( #847 )
2023-04-08 12:04:23 +02:00
Concedo
d335fae7c4
missed a print statement
2023-04-08 17:59:53 +08:00
Concedo
0b904e12db
Merge branch 'master' into concedo
# Conflicts:
# Makefile
2023-04-08 17:42:09 +08:00
LostRuins
5dd610032e
Merge pull request #27 from ariez-xyz/patch-1
add more precise instructions for arch
2023-04-08 17:37:39 +08:00
Concedo
d8e37bfe75
new gpt2 format supported
2023-04-08 17:35:36 +08:00
ariez-xyz
b48255db19
add more precise instructions for arch
2023-04-08 10:41:57 +02:00
Concedo
1369b46bb7
notice about false positives
2023-04-08 12:20:48 +08:00
unbounded
62cfc54f77
Add quantize-stats command for testing quantization ( #728 )
Adds a command that calculates statistics over the errors introduced by
quantization, such as mean squared error, maximum error, and selected percentile
errors for layer weights. Should be useful for testing quantization improvements.
Exposes some internal state from ggml and llama for testing.
2023-04-08 00:09:18 +02:00
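The kind of statistics described above can be sketched in a few lines. This is an illustration only, not the quantize-stats implementation: `fake_quantize` is a hypothetical symmetric quantizer (not a ggml format), and `error_stats` uses a simple nearest-rank percentile chosen for clarity.

```python
import math


def fake_quantize(values, levels=7):
    """Hypothetical quantizer: snap each value to a grid of 2*levels+1 points.

    Returns the restored (dequantized) values and the grid step `scale`.
    Assumes `values` is non-empty.
    """
    scale = max(abs(v) for v in values) / levels
    if scale == 0.0:
        # All inputs are zero, so quantization is lossless.
        return list(values), scale
    restored = [
        max(-levels, min(levels, round(v / scale))) * scale for v in values
    ]
    return restored, scale


def error_stats(original, restored, percentile=0.95):
    """Mean squared error, maximum error, and a nearest-rank percentile error."""
    errors = sorted(abs(a - b) for a, b in zip(original, restored))
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    # Nearest-rank percentile: the smallest error that is >= `percentile` of all errors.
    rank = min(n, max(1, math.ceil(percentile * n)))
    return {"mse": mse, "max": errors[-1], "percentile": errors[rank - 1]}


if __name__ == "__main__":
    weights = [-1.0, -0.3, 0.2, 0.9]
    restored, scale = fake_quantize(weights)
    print(error_stats(weights, restored))
```

Comparing these numbers before and after a change to the quantizer gives a quick signal of whether the change reduced the error.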
Concedo
d1c957ee64
strip symbols
2023-04-08 00:59:34 +08:00
bhubbb
698f7b5d63
make : add libllama.so target for llama-cpp-python ( #797 )
I was able to get llama-cpp-python working, but only when I built libllama.so with make.
2023-04-07 19:11:58 +03:00
iacore
c1950c3431
zig : don't link examples/common.cpp for non-example ( #814 )
2023-04-07 19:05:29 +03:00
Ivan Stepanov
4953e9007f
llama : always sort logits before nucleus sampling ( #812 )
* Always sort logits before nucleus sampling
* remove second normalization
- fix windows build
- remove normalization since std::discrete_distribution does not require it
2023-04-07 19:02:12 +03:00
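Why the sort matters for nucleus (top-p) sampling can be shown with a short sketch. This is a Python illustration under stated assumptions, not the llama.cpp code: the function names are hypothetical, the input is assumed to already be a list of probabilities, and `random.choices` stands in for `std::discrete_distribution` because it likewise accepts weights that do not sum to 1.

```python
import random


def nucleus_candidates(probs, top_p):
    """Return (token_id, prob) pairs forming the smallest top-p nucleus."""
    # Sort highest first; the cumulative cut-off below is only meaningful
    # when the candidates are visited in descending order.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    return kept


def sample_top_p(probs, top_p, rng=random):
    """Draw one token id from the nucleus."""
    kept = nucleus_candidates(probs, top_p)
    ids = [token_id for token_id, _ in kept]
    weights = [p for _, p in kept]
    # No renormalization needed: the weights are only used relative to each other.
    return rng.choices(ids, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(sample_top_p([0.1, 0.5, 0.15, 0.25], top_p=0.7))
```

Without the sort, the cumulative sum could reach `top_p` using low-probability tokens first, which would keep the wrong candidates.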
Concedo
289c40df94
updated embedded kobold
2023-04-07 22:39:20 +08:00
Concedo
1abcdb2394
should not be static
2023-04-07 20:35:19 +08:00
Concedo
43949f7c7c
Merge branch 'master' into concedo
2023-04-07 20:34:06 +08:00
Concedo
f322a5820e
fixed positional port arg
2023-04-07 17:46:33 +08:00
Concedo
1d48db4f63
don't build quantize
2023-04-07 17:11:26 +08:00
Sergey Alirzaev
cc9cee8e9e
Do not crash when it has nothing to say. ( #796 )
Otherwise, the following assertion failure is observed in interactive mode:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/stl_vector.h:1230: reference std::vector<int>::back() [_Tp = int, _Alloc = std::allocator<int>]: Assertion '!this->empty()' failed.
2023-04-06 17:59:11 +02:00
Concedo
4f5faf9612
some users report that this repo is now being flagged as malicious?
No idea why, but I am removing all prebuilt binaries except libopenblas. Windows users can still obtain the binaries from /releases, and OSX and Linux users can rebuild from source code.
2023-04-06 21:49:43 +08:00
Concedo
b56f872b61
update embedded kobold lite
2023-04-06 16:34:51 +08:00
Pavol Rusnak
d2beca95dc
Make docker instructions more explicit ( #785 )
2023-04-06 08:56:58 +02:00
Concedo
0e889ed6db
Merge branch 'master' into concedo
# Conflicts:
# .gitignore
# Makefile
# README.md
2023-04-06 11:14:44 +08:00
Concedo
3d650d0e25
removed the psutil dependency, fixed a compile error on WSL, added exception handling when sending HTTP responses, added multiline for embedded kobold
2023-04-06 11:08:19 +08:00
Georgi Gerganov
eeaa7b0492
ggml : multi-thread ggml_rope() (~3-4 times faster on M1) ( #781 )
2023-04-05 22:11:03 +03:00
Georgi Gerganov
986b6ce9f9
ggml, llama : avoid heavy V transpose + improvements ( #775 )
ggml :
- added ggml_view_3d()
- ggml_view_tensor() now inherits the stride too
- reimplement ggml_cpy() to account for dst stride
- no longer require tensor->data to be memory aligned
llama :
- compute RoPE on 32-bit tensors (should be more accurate)
- store RoPE-ed K in the KV cache
- store transposed V in the KV cache (significant speed-up)
- avoid unnecessary Q copy
2023-04-05 22:07:33 +03:00
Georgi Gerganov
3416298929
Update README.md
2023-04-05 19:54:30 +03:00
Ivan Stepanov
5a8c4f6240
llama : define non-positive top_k; top_k range check ( #779 )
* Define non-positive top_k; top_k range check
* minor : brackets
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-05 19:20:05 +03:00
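One plausible reading of "define non-positive top_k" plus a range check is sketched below. The log does not say which definition the commit chose, so the rule used here (a non-positive `top_k` means "no limit", and values above the vocabulary size are clamped) is an assumption, and both function names are hypothetical.

```python
def effective_top_k(top_k, n_vocab):
    """Map a user-supplied top_k onto the valid range 1..n_vocab.

    Assumption (not confirmed by the log): top_k <= 0 means "keep every token".
    """
    if top_k <= 0:
        return n_vocab
    # Range check: never ask for more candidates than the vocabulary holds.
    return min(top_k, n_vocab)


def top_k_candidates(logits, top_k):
    """Return the (token_id, logit) pairs with the k highest logits."""
    k = effective_top_k(top_k, len(logits))
    ranked = sorted(enumerate(logits), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    print(top_k_candidates([0.1, 2.0, -1.0], top_k=0))
```

Clamping up front keeps later code from slicing or partially sorting past the end of the candidate list.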
at8u
ff05d05c96
miku.sh : add executable bit ( #780 )
2023-04-05 18:59:13 +03:00
Georgi Gerganov
62b3e81aae
media : add logos and banners
2023-04-05 18:58:31 +03:00
Georgi Gerganov
8d10406d6e
readme : change logo + add bindings + add uis + add wiki
2023-04-05 18:56:20 +03:00
iacore
ed1c214e66
zig : add build.zig ( #773 )
Co-authored-by: Locria Cyber <74560659+locriacyber@users.noreply.github.com>
2023-04-05 18:06:02 +03:00
Ivan Stepanov
0c44427df1
make : missing host optimizations in CXXFLAGS ( #763 )
2023-04-05 17:38:37 +03:00
Adithya Balaji
594cc95fab
readme : update with CMake and windows example ( #748 )
* README: Update with CMake and windows example
* README: update with code-review for cmake build
2023-04-05 17:36:12 +03:00
at8u
88ed5761b8
examples : add Miku.sh ( #724 )
* Add Miku.sh to examples
* Add missing line to prompt in Miku.sh
* Add --keep param to Miku.sh
* Remove '[end_of_conversation]' line from Miku.sh (no longer necessary)
2023-04-05 17:32:42 +03:00
Andrew Duffy
58c438cf7d
Add Accelerate/BLAS when using Swift ( #765 )
2023-04-05 06:44:24 -04:00
Concedo
5c1920df43
why did nobody ever tell me the makefile doesn't work outside x86 xD
2023-04-05 17:15:42 +08:00
Concedo
1490cdd71d
change GPT-J and GPT2 KVs to use fp16 instead
2023-04-05 15:53:07 +08:00
Concedo
57e9f929ee
renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags
2023-04-05 15:22:13 +08:00
Concedo
14273fea7a
integrated gpt2 support
2023-04-04 23:15:47 +08:00
Concedo
52de932842
removed main.exe to reduce clutter, added support for rep pen in gptj
2023-04-04 20:43:13 +08:00
Concedo
9c0dbbb08b
Merge branch 'master' into concedo
2023-04-04 00:51:05 +08:00
Concedo
dd2abd8bc7
lower default thread threshold
2023-04-04 00:42:49 +08:00
mgroeber9110
53dbba7695
Windows: reactivate sigint handler after each Ctrl-C ( #736 )
2023-04-03 18:00:55 +02:00
SebastianApel
437e77855a
10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 ( #654 )
* Performance improvement of AVX2 code
* Fixed problem with MSVC compiler
* Reviewer comments: removed double semicolon, deleted empty line 1962
2023-04-03 09:52:28 +02:00