Georgi Gerganov
b532a69b2f
convert.py : use dir name to name the llama
2023-08-30 13:29:40 +03:00
Georgi Gerganov
c90d135eb4
examples : fix underscore in beam-search + .gitignore ( close #2900 )
2023-08-30 12:53:24 +03:00
M. Yusuf Sarıgöz
0d1c706181
gguf : add workflow for Pypi publishing ( #2896 )
...
* gguf : add workflow for Pypi publishing
* gguf : add workflow for Pypi publishing
* fix trailing whitespace
2023-08-30 12:47:40 +03:00
alonfaraj
9509294420
make : add test and update CI ( #2897 )
...
* build ci: run make test
* makefile:
- add all
- add test
* enable tests/test-tokenizer-0-llama
* fix path to model
* remove gcc-8 from macos build test
* Update Makefile
* Update Makefile
2023-08-30 12:42:51 +03:00
Gilad S
35092fb547
docs : add node-llama-cpp to README.md ( #2885 )
2023-08-30 11:40:12 +03:00
Kerfuffle
dc07dc492e
convert : various script cleanups/fixes + merges and special token handling ( #2842 )
...
* convert: Fix permute calls and method/func definitions
* Cleanups for gguf-py
* Minor types cleanups.
* Initial implementation of handling merges and special tokens
* convert: Handle special tokens and merges in vocab only mode
convert: Vocab only mode no longer requires loading model tensors
* gguf: Refactor tensor name mapping
* convert: Fix type hint for special_token_types in SpecialVocab
* Use common special vocab handling in various conversion scripts
* First pass at implementing suggested changes
* Second pass
* gguf: SpecialVocab: Fix issue with special token content not in a dict
gguf: SpecialVocab: Allow skipping handling of merges
* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json
* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer
* gguf: SpecialVocab: Actually set load_merges in object
* Uniform args parsing and vocab only mode for convert examples
* convert.py: Set gpt2 as tokenizer model when using BPE
* Squish last type warning in gguf.py - yay!
2023-08-30 11:25:50 +03:00
chaihahaha
ad9ddcff6e
llm.vim : stop generation at multiple linebreaks, bind to <F2> ( #2879 )
2023-08-30 09:50:55 +03:00
staviq
8341a25957
main : log file ( #2748 )
...
* initial, base LOG macro
* add *.log to .gitignore
* added basic log file handler
* reverted log auto endline to better mimic printf
* remove atomics and add dynamic log target
* log_enable/disable, LOG_TEE, basic usage doc
* update .gitignore
* mv include to common, params, help msg
* log tostring helpers, token vectors pretty prints
* main: replaced fprintf/LOG_TEE, some trace logging
* LOG_DISABLE_LOGS compile flag, wrapped f in macros
* fix LOG_TEELN and configchecker
* stub LOG_DUMP_CMDLINE for WIN32 for now
* fix msvc
* cleanup main.cpp:273
* fix stray whitespace after master sync
* log : fix compile warnings
- do not use C++20 stuff
- use PRIu64 to print uint64_t
- avoid string copies by using const ref
- fix ", ##__VA_ARGS__" warnings
- compare strings with == and !=
* log : do not append to existing log + disable file line func by default
* log : try to fix Windows build
* main : wip logs
* main : add trace log
* review: macro f lowercase, str append to sstream
* review: simplify ifs and str comparisons
* fix MSVC, formatting, FMT/VAL placeholders
* review: if/else cleanup
* review: if/else cleanup (2)
* replace _ prefix with _impl suffix
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-30 09:29:32 +03:00
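The log-file commit above describes printf-style LOG / LOG_TEE macros, a dynamic log target, and a LOG_DISABLE_LOGS compile flag. As a rough, self-contained sketch of that pattern only (hypothetical EXAMPLE_* names, not the actual common/log.h added by the commit):

    // Illustrative sketch only -- not the llama.cpp common/log.h API.
    // Shows the pattern the commit describes: printf-style macros,
    // a dynamic log target, and a compile flag that turns logging into no-ops.
    #include <cstdio>

    static FILE *g_example_log_target = stderr;   // hypothetical dynamic target

    inline void example_log_set_target(FILE *f) { g_example_log_target = f; }

    #ifndef EXAMPLE_DISABLE_LOGS
        // write only to the current log target
        #define EXAMPLE_LOG(...) fprintf(g_example_log_target, __VA_ARGS__)
        // "tee": write to the log target and duplicate to stderr
        #define EXAMPLE_LOG_TEE(...) \
            do { fprintf(g_example_log_target, __VA_ARGS__); \
                 if (g_example_log_target != stderr) fprintf(stderr, __VA_ARGS__); } while (0)
    #else
        #define EXAMPLE_LOG(...)     ((void)0)
        #define EXAMPLE_LOG_TEE(...) ((void)0)
    #endif

In this shape, EXAMPLE_LOG_TEE mirrors what the commit calls LOG_TEE: output goes to the current log file and is also echoed to stderr so it remains visible on the console.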
Cebtenzzre
849408957c
tests : add a C compliance test ( #2848 )
...
* tests : add a C compliance test
* make : build C compliance test by default
* make : fix clean and make sure C test fails on clang
* make : move -Werror=implicit-int to CFLAGS
2023-08-30 09:20:26 +03:00
slaren
06abf8eeba
ggml : add view_src and view_offs to ggml_tensor for views ( #2874 )
...
* ggml : add view_src and view_offs
* update ggml-alloc to use view_src
* update ggml_diag_mask to work correctly with automatic inplace
* exclude other ops that set an inplace flag from automatic inplace
2023-08-29 23:24:42 +02:00
slaren
c03a243abf
remove outdated references to -eps and -gqa from README ( #2881 )
2023-08-29 23:17:34 +02:00
xaedes
bf70e27cd6
fix check_gradient
...
ggml_build_backward_expand was previously replaced by ggml_build_backward, but the assignment of the forward graph to the backward graph was missing
2023-08-29 23:08:30 +02:00
Kawrakow
fa3582f509
Tell users attempting to run perplexity with too few tokens to use more ( #2882 )
...
Closes #2858
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:45 +03:00
Kawrakow
e37e69dcc3
10X faster BPE tokenizer ( #2876 )
...
* 10X faster BPE tokenizer
* Remove comment that no longer applies
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:03 +03:00
xaedes
5854f51188
fix error message in ggml_allocr_alloc to display actual max_avail
2023-08-29 22:49:01 +02:00
xaedes
281245a48f
Merge branch 'master' into finetune-lora
2023-08-29 21:47:28 +02:00
xaedes
8a96d4c2aa
add missing argument 'int i0' to ggml_get_i32_nd & ggml_set_i32_nd header declarations
2023-08-29 21:24:37 +02:00
xaedes
dd4e4bca09
remove unused 'inplace' argument from ggml_compute_backward function
...
inplace operations to add gradients are no longer created by ggml_compute_backward;
the allocator is now used to automatically make operations inplace
2023-08-29 21:21:10 +02:00
xaedes
a76e66ac8d
fix ggml_acc_or_set to return tensor of correct shape
2023-08-29 21:02:10 +02:00
xaedes
b1aa26f718
add sanity check to ggml_compute_backward, asserting the correct shape of gradients
2023-08-29 21:01:17 +02:00
xaedes
5fcfa7e49e
increase test-grad0 context mem size to accommodate a bigger cgraph
2023-08-29 21:00:19 +02:00
xaedes
82c5247a20
add ggml API functions ggml_unravel_index, ggml_get_i32_nd and its analogs for set and for f32
...
ggml_get_i32_1d, ggml_set_i32_1d, ggml_get_f32_1d, ggml_set_f32_1d now support non-contiguous tensors.
in the case of a non-contiguous tensor, the 1d index is unraveled into a multi-index using ggml_unravel_index, which is then passed to the '_nd' function equivalent.
this fixes a bug in test-grad0 caused by ggml_build_backward no longer building purely contiguous tensors
2023-08-29 20:59:31 +02:00
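To make the unraveling described in the commit body concrete, here is a minimal standalone sketch (hypothetical example_unravel_index, not the real ggml_unravel_index signature) of mapping a flat 1-D element index to per-dimension coordinates; the '_nd' accessors would presumably combine such coordinates with the tensor's strides:

    // Illustrative sketch only -- not the actual ggml API.
    // Unravels a flat 1-D element index into per-dimension coordinates,
    // innermost dimension first, given the number of elements per dimension.
    #include <cstdint>
    #include <cstdio>

    static void example_unravel_index(int64_t i, const int64_t ne[4], int64_t out[4]) {
        for (int d = 0; d < 4; ++d) {
            out[d] = i % ne[d];   // coordinate in dimension d
            i     /= ne[d];
        }
    }

    int main() {
        const int64_t ne[4] = {3, 4, 2, 1};   // tensor with 3*4*2*1 = 24 elements
        int64_t idx[4];
        example_unravel_index(17, ne, idx);   // 17 = 2 + 3*(1 + 4*1)  ->  {2, 1, 1, 0}
        printf("%lld %lld %lld %lld\n",
               (long long)idx[0], (long long)idx[1], (long long)idx[2], (long long)idx[3]);
        return 0;
    }

Running this prints 2 1 1 0, matching the decomposition shown in the comment.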
xaedes
5f0a4e971f
avoid stack overflow of large cgraphs in test-grad0
2023-08-29 19:59:41 +02:00
xaedes
794bb7ea42
implement ggml_compute_forward_repeat_f16
2023-08-29 19:59:14 +02:00
xaedes
e28cf7e9ce
update README.md
2023-08-29 19:38:23 +02:00
xaedes
a6165dafcd
remove trailing whitespace
2023-08-29 19:30:42 +02:00
xaedes
5813ac832f
omit tokenization when training is disabled, only save llama lora adapter
...
training can be disabled by passing '-n 0' to finetune
2023-08-29 19:21:45 +02:00
xaedes
ebff3a14c3
remove code to print data checksums which was used to verify correctness of new gguf code
2023-08-29 18:31:20 +02:00
xaedes
1425968ead
remove old checkpoint save & load code
2023-08-29 18:30:16 +02:00
xaedes
6134ad4de7
add python script to convert old finetune checkpoint files to gguf
2023-08-29 18:24:06 +02:00
xaedes
0564f4ed1f
add load & save lora finetune checkpoints via gguf
2023-08-29 18:20:39 +02:00
maddes8cht
53885d7256
py : fix "usage" messages ( #2873 )
...
in the convert-to-gguf python scripts
2023-08-29 16:51:02 +03:00
jameswu2014
bcce96ba4d
convert.py : fix baichuan7B support ( #2870 )
...
* [Fix]: convert.py support baichuan7B
* convert.py : fix trailing whitespaces
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-29 12:48:41 +03:00
Jhen-Jie Hong
74e0caeb82
readme : add react-native binding ( #2869 )
2023-08-29 12:30:10 +03:00
Cebtenzzre
d4b5e16c32
make : fix clang tests build, add missing examples ( #2859 )
...
* make : do not pass headers to the compiler
This fixes building tests with clang.
* make : add missing examples
* make : fix build-info.h dependencies
2023-08-29 11:42:41 +03:00
Georgi Gerganov
3a007648f2
metal : add option to disable debug logs ( close #2764 )
2023-08-29 11:33:46 +03:00
Georgi Gerganov
611363ac79
scripts : add pipefail
2023-08-29 10:50:30 +03:00
Marcus Dunn
95b6e5212f
added struct to llama_dump_timing_info_yaml's llama_context ( #2857 )
...
fixes C compat.
2023-08-29 09:33:27 +03:00
xaedes
ecb1b20c85
add gguf constants and load/save functions from train-text-from-scratch
2023-08-29 01:40:02 +02:00
xaedes
e030f7b2c5
add LLM_KV_TRAINING_TYPE to train-text-from-scratch checkpoints
...
so that they can be differentiated from lora finetune checkpoints
2023-08-29 01:27:28 +02:00
xaedes
ca97583f0b
remove vocab related code as it is unnecessary
2023-08-29 01:19:45 +02:00
xaedes
a3b45298f1
remove unused code
2023-08-29 01:12:51 +02:00
xaedes
1faee64db9
handle rms_norm and rope parameters the same as in train-text-from-scratch
2023-08-29 01:09:35 +02:00
xaedes
007280c82f
make default value of float member a float literal
2023-08-29 01:04:57 +02:00
xaedes
49af7fbe12
add comment explaining why finetune checkpoints are allocated in one block
2023-08-29 00:57:39 +02:00
xaedes
9a28bce29a
reduce large memory overhead in train-text-from-scratch
...
all gradients had to be pinned so that graph_reset would work correctly.
this is no longer necessary with the changes to ggml_compute_backward introduced in this PR.
2023-08-29 00:56:44 +02:00
xaedes
271c0300de
remove prediction related code to reduce duplicated code with main
...
use main instead
2023-08-29 00:50:59 +02:00
xaedes
5ce92aed37
finetune bug fixes to compile with merged-in code from master
2023-08-29 00:41:19 +02:00
xaedes
daedc6f419
replace llama_n_mult by llama_n_ff
2023-08-29 00:40:53 +02:00
xaedes
aa8016e95d
bug fix: replace GGML_TYPE_SIZE[t] by ggml_type_size(t)
2023-08-29 00:40:30 +02:00