Commit graph

1360 commits

Author SHA1 Message Date
Georgi Gerganov
c90d135eb4
examples : fix underscore in beam-search + .gitignore (close #2900) 2023-08-30 12:53:24 +03:00
M. Yusuf Sarıgöz
0d1c706181
gguf : add workflow for Pypi publishing (#2896)
* gguf : add workflow for Pypi publishing

* gguf : add workflow for Pypi publishing

* fix trailing whitespace
2023-08-30 12:47:40 +03:00
alonfaraj
9509294420
make : add test and update CI (#2897)
* build ci: run make test

* makefile:
- add all
- add test

* enable tests/test-tokenizer-0-llama

* fix path to model

* remove gcc-8 from macos build test

* Update Makefile

* Update Makefile
2023-08-30 12:42:51 +03:00
Gilad S
35092fb547
docs : add node-llama-cpp to README.md (#2885) 2023-08-30 11:40:12 +03:00
Kerfuffle
dc07dc492e
convert : various script cleanups/fixes + merges and special token handling (#2842)
* convert: Fix permute calls and method/func definitions

* Cleanups for gguf-py

* Minor types cleanups.

* Initial implementation of handling merges and special tokens

* convert: Handle special tokens and merges in vocab only mode

convert: Vocab only mode no longer requires loading model tensors

* gguf: Refactor tensor name mapping

* convert: Fix type hint for special_token_types in SpecialVocab

* Use common special vocab handling in various conversion scripts

* First pass at implementing suggested changes

* Second pass

* gguf: SpecialVocab: Fix issue with special token content not in a dict

gguf: SpecialVocab: Allow skipping handling of merges

* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json

* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer

* gguf: SpecialVocab: Actually set load_merges in object

* Uniform args parsing and vocab only mode for convert examples

* convert.py: Set gpt2 as tokenizer model when using BPE

* Squish last type warning in gguf.py - yay!
2023-08-30 11:25:50 +03:00
chaihahaha
ad9ddcff6e
llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) 2023-08-30 09:50:55 +03:00
staviq
8341a25957
main : log file (#2748)
* initial, base LOG macro

* add *.log to .gitignore

* added basic log file handler

* reverted log auto endline to better mimic printf

* remove atomics and add dynamic log target

* log_enable/disable, LOG_TEE, basic usage doc

* update .gitignore

* mv include to common, params, help msg

* log tostring helpers, token vectors pretty prints

* main: replaced fprintf/LOG_TEE, some trace logging

* LOG_DISABLE_LOGS compile flag, wrapped f in macros

* fix LOG_TEELN and configchecker

* stub LOG_DUMP_CMDLINE for WIN32 for now

* fix msvc

* cleanup main.cpp:273

* fix stray whitespace after master sync

* log : fix compile warnings

- do not use C++20 stuff
- use PRIu64 to print uint64_t
- avoid string copies by using const ref
- fix ", ##__VA_ARGS__" warnings
- compare strings with == and !=

* log : do not append to existing log + disable file line func by default

* log : try to fix Windows build

* main : wip logs

* main : add trace log

* review: macro f lowercase, str append to sstream

* review: simplify ifs and str comparisons

* fix MSVC, formatting, FMT/VAL placeholders

* review: if/else cleanup

* review: if/else cleanup (2)

* replace _ prefix with _impl suffix

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-30 09:29:32 +03:00
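Two of the portability points in the log above — the ", ##__VA_ARGS__" warnings and printing uint64_t with PRIu64 — recur in any C/C++ logging macro. A minimal sketch (illustrative only; `log_target` and `format_u64` are invented names, not the actual common/log.h API):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Runtime-selectable log target, as in the "dynamic log target" step above.
static FILE *log_target = NULL; // e.g. stderr, or a file opened at startup

// Minimal variadic LOG macro. The "##" before __VA_ARGS__ is a GNU
// extension that swallows the preceding comma when no arguments are
// passed -- the source of the ", ##__VA_ARGS__" warnings mentioned
// above; C++20's __VA_OPT__(,) is the standard replacement.
#define LOG(fmt, ...) fprintf(log_target ? log_target : stderr, fmt, ##__VA_ARGS__)

// uint64_t must be printed via PRIu64 from <inttypes.h>, because its
// underlying type (unsigned long vs unsigned long long) varies by platform.
int format_u64(char *buf, size_t n, uint64_t v) {
    return snprintf(buf, n, "%" PRIu64, v);
}
```

Writing through a single macro makes it easy to later add LOG_TEE-style duplication to a second stream or to compile logging out entirely behind a flag such as LOG_DISABLE_LOGS.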
Cebtenzzre
849408957c
tests : add a C compliance test (#2848)
* tests : add a C compliance test

* make : build C compliance test by default

* make : fix clean and make sure C test fails on clang

* make : move -Werror=implicit-int to CFLAGS
2023-08-30 09:20:26 +03:00
slaren
06abf8eeba
ggml : add view_src and view_offs to ggml_tensor for views (#2874)
* ggml : add view_src and view_offs

* update ggml-alloc to use view_src

* update ggml_diag_mask to work correctly with automatic inplace

* exclude other ops that set an inplace flag from automatic inplace
2023-08-29 23:24:42 +02:00
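The idea behind tracking views can be sketched generically (illustrative only — these are not ggml's actual structs): a view records the tensor whose buffer it aliases plus a byte offset, so an allocator can distinguish views from owning tensors and avoid unsafe automatic inplace rewrites.

```c
#include <stddef.h>

// Illustrative tensor with explicit view bookkeeping. A tensor that
// owns its buffer has view_src == NULL; a view stores its source
// tensor and a byte offset into the source's data.
typedef struct tensor {
    struct tensor *view_src;  // NULL if this tensor owns its buffer
    size_t         view_offs; // byte offset into the source's data
    char          *data;      // only meaningful for owning tensors here
} tensor;

// Resolve a (possibly chained) view to an absolute data pointer by
// walking to the owning tensor and summing the offsets along the way.
char *tensor_data(tensor *t) {
    size_t offs = 0;
    while (t->view_src) {
        offs += t->view_offs;
        t = t->view_src;
    }
    return t->data + offs;
}
```

With this bookkeeping, an allocator can tell that a view and its source share memory even though they are distinct tensor objects.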
slaren
c03a243abf
remove outdated references to -eps and -gqa from README (#2881) 2023-08-29 23:17:34 +02:00
xaedes
bf70e27cd6
fix check_gradient
ggml_build_backward_expand was previously replaced by ggml_build_backward, but the assignment of the forward graph to the backward graph was missing
2023-08-29 23:08:30 +02:00
Kawrakow
fa3582f509
Tell users attempting to run perplexity with too few tokens to use more (#2882)
Closes #2858

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:45 +03:00
Kawrakow
e37e69dcc3
10X faster BPE tokenizer (#2876)
* 10X faster BPE tokenizer

* Remove comment that no longer applies

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:03 +03:00
xaedes
5854f51188
fix error message in ggml_allocr_alloc to display actual max_avail 2023-08-29 22:49:01 +02:00
xaedes
281245a48f
Merge branch 'master' into finetune-lora 2023-08-29 21:47:28 +02:00
xaedes
8a96d4c2aa
add missing argument 'int i0' to ggml_get_i32_nd & ggml_set_i32_nd header declarations 2023-08-29 21:24:37 +02:00
xaedes
dd4e4bca09
remove unused 'inplace' argument from ggml_compute_backward function
inplace operations to add gradients are no longer created by ggml_compute_backward
the allocator is used to automatically make operations inplace
2023-08-29 21:21:10 +02:00
xaedes
a76e66ac8d
fix ggml_acc_or_set to return tensor of correct shape 2023-08-29 21:02:10 +02:00
xaedes
b1aa26f718
add sanity check to ggml_compute_backward, asserting the correct shape of gradients 2023-08-29 21:01:17 +02:00
xaedes
5fcfa7e49e
increase test-grad0 context mem size to accommodate bigger cgraphs 2023-08-29 21:00:19 +02:00
xaedes
82c5247a20
add ggml API functions ggml_unravel_index, ggml_get_i32_nd and its analogs for set and for f32
ggml_get_i32_1d, ggml_set_i32_1d, ggml_get_f32_1d and ggml_set_f32_1d now support non-contiguous tensors.
For a non-contiguous tensor, the 1d index is unraveled into a multi-index using ggml_unravel_index and passed to the '_nd' function equivalent.

This fixes a bug in test-grad0 caused by ggml_build_backward no longer building purely contiguous tensors.
2023-08-29 20:59:31 +02:00
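The unraveling described above can be illustrated with a standalone sketch (not the ggml implementation; the function name and signature here are invented for illustration): given a flat 1-d index and the element counts per dimension, recover the per-dimension multi-index.

```c
#include <stdint.h>

// Unravel a flat 1-d index into a multi-index for a 4-d tensor,
// dimension 0 varying fastest (matching ggml's ne[] ordering).
// 'ne' holds the number of elements in each dimension.
void unravel_index(int64_t i, const int64_t ne[4], int64_t out[4]) {
    for (int d = 0; d < 4; ++d) {
        out[d] = i % ne[d]; // coordinate in dimension d
        i /= ne[d];         // carry the remainder to the next dimension
    }
}
```

With the multi-index in hand, an '_nd' accessor can apply per-dimension byte strides rather than assuming contiguous element spacing, which is what makes non-contiguous tensors addressable.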
xaedes
5f0a4e971f
avoid stack overflow of large cgraphs in test-grad0 2023-08-29 19:59:41 +02:00
xaedes
794bb7ea42
implement ggml_compute_forward_repeat_f16 2023-08-29 19:59:14 +02:00
xaedes
e28cf7e9ce
update README.md 2023-08-29 19:38:23 +02:00
xaedes
a6165dafcd
remove trailing whitespace 2023-08-29 19:30:42 +02:00
xaedes
5813ac832f
omit tokenization when training is disabled, only save llama lora adapter
training can be disabled by passing '-n 0' to finetune
2023-08-29 19:21:45 +02:00
xaedes
ebff3a14c3
remove code to print data checksums which was used to verify correctness of new gguf code 2023-08-29 18:31:20 +02:00
xaedes
1425968ead
remove old checkpoint save & load code 2023-08-29 18:30:16 +02:00
xaedes
6134ad4de7
add python script to convert old finetune checkpoint files to gguf 2023-08-29 18:24:06 +02:00
xaedes
0564f4ed1f
add load & save lora finetune checkpoints via gguf 2023-08-29 18:20:39 +02:00
maddes8cht
53885d7256
py : fix "usage" messages (#2873)
convert-to-gguf python scripts
2023-08-29 16:51:02 +03:00
jameswu2014
bcce96ba4d
convert.py : fix baichuan7B support (#2870)
* [Fix]: convert.py : support baichuan7B

* convert.py : fix trailing whitespaces

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-29 12:48:41 +03:00
Jhen-Jie Hong
74e0caeb82
readme : add react-native binding (#2869) 2023-08-29 12:30:10 +03:00
Cebtenzzre
d4b5e16c32
make : fix clang tests build, add missing examples (#2859)
* make : do not pass headers to the compiler

This fixes building tests with clang.

* make : add missing examples

* make : fix build-info.h dependencies
2023-08-29 11:42:41 +03:00
Georgi Gerganov
3a007648f2
metal : add option to disable debug logs (close #2764) 2023-08-29 11:33:46 +03:00
Georgi Gerganov
611363ac79
scripts : add pipefail 2023-08-29 10:50:30 +03:00
Marcus Dunn
95b6e5212f
added struct to llama_dump_timing_info_yaml's llama_context (#2857)
fixes C compat.
2023-08-29 09:33:27 +03:00
xaedes
ecb1b20c85
add gguf constants and load/save functions from train-text-from-scratch 2023-08-29 01:40:02 +02:00
xaedes
e030f7b2c5
add LLM_KV_TRAINING_TYPE to train-text-from-scratch checkpoints
so that they can be differentiated from lora finetune checkpoints
2023-08-29 01:27:28 +02:00
xaedes
ca97583f0b
remove vocab related code as it is unnecessary 2023-08-29 01:19:45 +02:00
xaedes
a3b45298f1
remove unused code 2023-08-29 01:12:51 +02:00
xaedes
1faee64db9
handle rms_norm and rope parameters the same as in train-text-from-scratch 2023-08-29 01:09:35 +02:00
xaedes
007280c82f
make default value of float member a float literal 2023-08-29 01:04:57 +02:00
xaedes
49af7fbe12
add comment explaining why finetune checkpoints are allocated in one block 2023-08-29 00:57:39 +02:00
xaedes
9a28bce29a
reduce large memory overhead in train-text-from-scratch
All gradients had to be pinned so that graph_reset works correctly.
This is no longer necessary with the changes to ggml_compute_backward introduced in this PR.
2023-08-29 00:56:44 +02:00
xaedes
271c0300de
remove prediction-related code to reduce duplication with main
use main instead
2023-08-29 00:50:59 +02:00
xaedes
5ce92aed37
finetune bug fixes to compile with merged in code from master 2023-08-29 00:41:19 +02:00
xaedes
daedc6f419
replace llama_n_mult by llama_n_ff 2023-08-29 00:40:53 +02:00
xaedes
aa8016e95d
bug fix: replace GGML_TYPE_SIZE[t] by ggml_type_size(t) 2023-08-29 00:40:30 +02:00
xaedes
aecc3b3890
fix dump_non_result_info_yaml to output multiple lora adapters 2023-08-29 00:39:59 +02:00