Commit graph

2399 commits

Author SHA1 Message Date
teddybear082
7d120f2794
Add context size parameter to google colab notebook (#489)
- Add configurable context size to the parameters, alongside models and layers, for ease of use.

- This can already be done with a simple edit by experienced LLM users, but new users may not know this is a parameter they should set.

Co-authored-by: LostRuins <39025047+LostRuins@users.noreply.github.com>
2023-10-24 17:13:01 +08:00
Concedo
7744aa6a9c updated colab 2023-10-24 15:37:47 +08:00
Concedo
5f1f8a5a89 adjust 2023-10-22 21:53:54 +08:00
Concedo
ccf8334651 remove script (+8 squashed commits)
Squashed commits:

[bde2e3da] should be working

[1cde82c0] update

[bb6c8676] wip

[66b698d1] wip colab

[9953466a] wip colab

[ae0bedea] json fix

[0aac144f] wip on optimized colab

[ec9f8e96] prepare colab binaries notebook
2023-10-22 21:38:38 +08:00
Concedo
fafe999ff9 update lite and colab (+1 squashed commit)
Squashed commit:

[06b6ca6d] updated lite and colab
2023-10-22 14:03:18 +08:00
Concedo
cff75061fe fixed some old models failing due to tokenizer changes, updated lite (+1 squashed commit)
Squashed commit:

[9dee81ec] fixed some old models failing due to tokenizer changes, updated lite tooltip (+3 squashed commits)

Squashed commits:

[5ab95a79] fixes

[a561d5e2] fixed some old models failing due to tokenizer changes

[95e65daf] lite updates
2023-10-22 11:04:59 +08:00
Concedo
dd1d61ea6b colab is fixed (+1 squashed commit)
Squashed commit:

[0b2a51f3] fix colab (+1 squashed commit)

Squashed commit:

[a6b832d0] fix colab (+1 squashed commit)

Squashed commit:

[8f88f210] updated colab (+1 squashed commit)

Squashed commit:

[75552e0d] try new colab
2023-10-21 10:08:32 +08:00
Concedo
6119a2b5b2 revert lite change 2023-10-20 22:13:56 +08:00
Concedo
6fa681b692 fixed a race condition with SSE streaming 2023-10-20 22:01:09 +08:00
Concedo
5f5d5f1d86 quick fix 2023-10-20 19:43:56 +08:00
Concedo
012c53367d minor lite fixes 2023-10-20 18:41:17 +08:00
Concedo
d3c7b7cc71 colab fix 2023-10-20 16:34:45 +08:00
Concedo
d5016fdc8f fixed a lite bug 2023-10-20 16:03:06 +08:00
Concedo
ee93213218 updated lite 2023-10-20 15:44:52 +08:00
Concedo
cd3bb3ede2 update colab link 2023-10-20 13:49:34 +08:00
Concedo
8947142c46 updated lite and colab 2023-10-20 11:35:44 +08:00
Concedo
8d31550d48 fix groupchat 2023-10-19 23:40:15 +08:00
Concedo
957e245285 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
2023-10-19 23:32:52 +08:00
kalomaze
ddce116ec9
Fix for Top K disabling (#480)
* Update gpttype_adapter.cpp

* use n_vocab instead of 32000 when top-k is off
2023-10-19 23:20:44 +08:00
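
A minimal sketch of the idea behind the fix above, assuming a llama.cpp-style sampler (the helper name is illustrative, not the actual gpttype_adapter.cpp code): when top-k is disabled, the sampler should consider the model's full vocabulary rather than a hardcoded 32000 entries.

```cpp
#include <algorithm>

// top_k <= 0 conventionally means "top-k disabled": fall back to the
// model's true vocabulary size instead of assuming the LLaMA-1 default
// of 32000, which breaks models with larger vocabularies.
static int effective_top_k(int top_k, int n_vocab) {
    return (top_k <= 0) ? n_vocab : std::min(top_k, n_vocab);
}
```
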
Concedo
8c6001de2a updated lite 2023-10-19 23:18:14 +08:00
Concedo
fd770bb105 patch 2023-10-19 23:04:26 +08:00
Concedo
4382e51719 updated lite and default horde ctx amount 2023-10-19 22:49:59 +08:00
M. Yusuf Sarıgöz
60abea9798
llava : avoid segfault in case of non-existent mmproj file (#3674) 2023-10-19 16:59:11 +03:00
Georgi Gerganov
004797f6ac
readme : update hot topics 2023-10-18 21:44:43 +03:00
Georgi Gerganov
4e82b2ea3f
speculative : bug fixes 2023-10-18 18:49:40 +03:00
Georgi Gerganov
0e89203b51
speculative : add tree-based sampling example (#3624)
* sampling : one sequence per sampling context

ggml-ci

* speculative : add tree-based sampling support

ggml-ci

* speculative : reuse the n_parallel CLI param

* speculative : refactor sampling

* examples : fix build after sampling refactoring

ggml-ci

* batched : fix n_seq_id

* sampling : fix malloc

ggml-ci

* swift : fix build

ggml-ci

* swift : try to fix build

ggml-ci

* prompts : add assistant.txt

* common : add llama_batch_add() and llama_batch_clear() helpers (see the usage sketch after this entry)

* speculative : minor refactor

ggml-ci

* minor : comments + rename

ggml-ci

* speculative : fix off-by-one for n_drafted

* speculative : fix the n_drafted fix + p constants
2023-10-18 16:21:57 +03:00
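
A hedged usage sketch of the llama_batch_add() and llama_batch_clear() helpers added in the change above, based on llama.cpp's common.h as of this commit (check the current headers before relying on these signatures):

```cpp
#include <vector>
#include "common.h"
#include "llama.h"

// Build a single-sequence batch from a tokenized prompt, requesting
// logits only for the last token.
static void fill_batch(llama_batch & batch, const std::vector<llama_token> & prompt) {
    llama_batch_clear(batch);  // resets batch.n_tokens to 0
    for (size_t i = 0; i < prompt.size(); ++i) {
        // token id, position, sequence ids, whether to compute logits
        llama_batch_add(batch, prompt[i], (llama_pos) i, { 0 }, false);
    }
    batch.logits[batch.n_tokens - 1] = true;
}
```
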
Jhen-Jie Hong
c67fe68e41
metal : implement q5_0 and q5_1 kernels (#3648)
* metal : implement dequantize_q5_0

* metal : block_q_n_dot_y for block_q5_0 (broken)

* metal : revert unnecessary change

* metal : implement dequantize_q5_1

* metal : block_q_n_dot_y for q5_1 (broken)

* metal : fix block_q_n_dot_y

* minor : spaces / formatting

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-18 15:21:48 +03:00
shibe2
1117d06607
opencl : fix element-wise multiplication (#3656) 2023-10-18 15:09:22 +03:00
Concedo
c1ca1de2ac fixed support for old falcon models 2023-10-18 17:20:44 +08:00
Concedo
700951dbd4 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-10-18 16:33:09 +08:00
Concedo
53b7cdf8a3 Merge branch 'concedo' into concedo_experimental 2023-10-18 13:51:13 +08:00
slaren
cb33f43a2a
fix embeddings when using CUDA (#3657) 2023-10-17 22:24:50 +02:00
Georgi Gerganov
e1675d133c
llama : avoid fprintf in favor of LLAMA_LOG (#3538) 2023-10-17 22:34:26 +03:00
BarfingLemurs
8402566a7c
readme : update hot-topics & models, detail windows release in usage (#3615)
* Update README.md

* Update README.md

* Update README.md

* move "Running on Windows" section below "Prepare data and run"

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 21:13:21 +03:00
LostRuins
6e34d31c44
Update README.md (#479) 2023-10-18 01:24:14 +08:00
shibe2
40e5ce054f CLBlast: Fix temporary buffer size for f16 conversion (wsize)
Fix buffer overflow.
Reduce the size to fit just one 2D slice.
Assert sufficient size.
2023-10-17 21:02:30 +04:00
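
An illustrative sketch of the sizing logic described above (names hypothetical; not the literal ggml-opencl.cpp change): the temporary buffer for the f32-to-f16 conversion only needs to hold one 2D slice of the tensor, since slices are converted and uploaded one at a time, and the allocation is asserted sufficient at the point of use.

```cpp
#include "ggml.h"

// Size of the scratch buffer needed to convert one 2D slice to fp16.
static size_t f16_conv_wsize(const struct ggml_tensor * t) {
    return (size_t) t->ne[0] * (size_t) t->ne[1] * sizeof(ggml_fp16_t);
}

// At the call site, assert the allocated scratch buffer is big enough:
//   GGML_ASSERT(wsize >= f16_conv_wsize(src1));
```
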
slaren
a5e8c1d8c7
train-text-from-scratch : fix assert failure in ggml-alloc (#3618) 2023-10-17 20:00:58 +03:00
Georgi Gerganov
e74c705e15
editorconfig : remove trailing spaces 2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1
server : documentation of JSON return value of /completion endpoint (#3632)
* Added documentation of JSON return value of /completion endpoint

* Update examples/server/README.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
Georgi Gerganov
1142013da4
save-load-state : fix example + add ci test (#3655)
* save-load-state : fix example (close #3606)

* ci : add test for save-load-state example

ggml-ci
2023-10-17 19:12:46 +03:00
ldwang
5fe268a4d9
readme : add Aquila2 links (#3610)
Signed-off-by: ldwang <ftgreat@gmail.com>
Co-authored-by: ldwang <ftgreat@gmail.com>
2023-10-17 18:52:33 +03:00
staviq
1a159553f9
tokenizer : special token handling (#3538)
* Rewrite special token handling from #1931

* shorten param name, add special-token verification by type

* use offsets instead of copy by substr

* formatting, remove copying iterator on delete

* llama : normalize code-style

* swift fix

* print prefix/suffix if verbose; main: split prefix/input/suffix

* don't add space when using special tokens

* minor : comment + spacing

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 18:11:01 +03:00
Concedo
6f8fe88f10 fix for lite (+5 squashed commits)
Squashed commits:

[f9ce9855] catch more exceptions

[8cdaf149] tweaked horde worker timeouts, updated lite

[619ebef4] fixed abort no response if failed

[a54a66a2] fixed time overflow

[9affdc3e] updated lite
2023-10-17 23:04:32 +08:00
Georgi Gerganov
281ef73c25
k-quants : fix quantization ranges (#3646) 2023-10-17 09:19:28 +03:00
Georgi Gerganov
940efa95fe
llava : fix tokenization to not add bos between image embeddings and user prompt (#3645)
* llava : fix tokenization to not add bos after system prompt

* set seed

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
2023-10-16 23:58:00 +03:00
Concedo
ee0681f0d9 converted some asserts into non-terminating ones since they are overzealous 2023-10-15 16:12:20 +08:00
Concedo
5cfabaee25 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
#	docs/BLIS.md
2023-10-15 15:50:20 +08:00
cebtenzzre
11bff29045
MPT : support GQA for replit-code-v1.5 (#3627) 2023-10-15 09:32:06 +03:00
M. Yusuf Sarıgöz
11dc1091f6
Honor -ngl option for Cuda offloading in llava (#3621) 2023-10-14 04:52:44 -06:00
Daniel Bevenius
2a4bcbacea
llama : remove n_threads from llama_decode_internal (#3614)
This commit removes `n_threads` from the `llama_decode_internal`
function's doc comment, as the parameter no longer exists.

It looks like this parameter was removed in
commit 16bc66d947 ("llama.cpp : split
llama_context_params into model and context params").

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2023-10-13 13:33:16 +03:00