Commit graph

1479 commits

Author SHA1 Message Date
xaedes
934ad8d35d
move some params from lora hparams into model hparams and load model params from gguf
this equalizes the model definition in finetune and train-text-from-scratch and removes the need for additional llama API functions to get model parameters
2023-09-17 16:51:15 +02:00
xaedes
b0ee563748
assert correct base model tensor shapes 2023-09-17 16:43:12 +02:00
xaedes
5ed309810e
align code 2023-09-17 16:41:25 +02:00
xaedes
1dbd6bc3d5
remove n_rot hparam, as it must always equal hparams.n_embd_head() 2023-09-17 16:40:40 +02:00
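
For illustration, a minimal sketch of the relation this commit relies on; the struct and helpers below are hypothetical stand-ins, not the repository's actual hparams definitions.

    #include <cstdint>

    // Hypothetical sketch, not the repository's struct: the per-head embedding
    // width is derived from the total width and the head count, so a separate
    // n_rot hyperparameter is redundant when rotary embeddings span a full head.
    struct hparams_sketch {
        uint32_t n_embd = 4096; // total embedding width
        uint32_t n_head = 32;   // number of attention heads

        uint32_t n_embd_head() const { return n_embd / n_head; } // = 128
        uint32_t n_rot()       const { return n_embd_head(); }   // always equal by definition
    };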
xaedes
56a03faf5f
deduplicate code into function 2023-09-17 16:37:21 +02:00
xaedes
d1bb6fb349
add train option "--sample-random-offsets"
Use samples beginning at random offsets.
The offset is only applied to the first sample in each batch context window.
Together with "--fill-with-next-samples" this may help when training for endless text generation.

For example, given a dataset containing the samples "abcd", "ABCD", "0123":
With a context size of 8 and the options "--fill-with-next-samples", "--no-separate-with-eos", "--no-separate-with-bos",
the context windows of batches could only be filled with "abcdABCD", "ABCDabcd", "0123abcd", etc.

With "--sample-random-offsets" it can also be filled with "23abcdAB", "bcd0123A", etc.
2023-09-17 14:37:41 +02:00
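
As a rough illustration of the behaviour described in this commit, here is a hypothetical sketch (the helper and names are invented, not the repository's code) of how a random offset into the first sample changes what ends up in a context window.

    #include <random>
    #include <string>
    #include <vector>

    // Sketch: concatenate samples to fill a window of n_ctx tokens (characters
    // here, for simplicity; assumes a non-empty sample list). With random
    // offsets enabled, only the first sample of the window may start mid-sample;
    // subsequent samples start at offset 0.
    static std::string fill_context(const std::vector<std::string> & samples,
                                    size_t n_ctx, bool random_offsets,
                                    std::mt19937 & rng) {
        std::string ctx;
        size_t i   = rng() % samples.size();
        size_t off = (random_offsets && !samples[i].empty()) ? rng() % samples[i].size() : 0;
        while (ctx.size() < n_ctx) {
            ctx += samples[i].substr(off);
            off = 0;
            i   = (i + 1) % samples.size();
        }
        return ctx.substr(0, n_ctx); // e.g. "23abcdAB" instead of "0123abcd"
    }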
xaedes
bf2ad65836
fix frand to return value in interval [0,1) 2023-09-17 14:28:58 +02:00
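
For context, a minimal sketch of one way to produce a uniform float strictly below 1 (an assumed implementation, not necessarily the fix in this commit): naive scaling such as (float)rand()/RAND_MAX can yield exactly 1.0f, which breaks callers that index arrays with floor(frand()*n).

    #include <cstdint>
    #include <random>

    // Sketch: draw 24 random bits and scale by 2^-24. Every result is exactly
    // representable as a float and strictly less than 1.
    static float frand_sketch(std::mt19937 & rng) {
        const uint32_t r = rng() >> 8;       // keep the top 24 bits
        return r * (1.0f / 16777216.0f);     // 16777216 = 2^24
    }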
xaedes
151bfe9ee1
assert that sample_count > 0, avoiding division by zero 2023-09-17 13:07:17 +02:00
xaedes
ddf5ac257a
use new/delete for train_state instead of malloc/free
using malloc may result in seg faults when trying to assign string fields
2023-09-17 12:48:17 +02:00
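
The reasoning in this commit can be illustrated with a small standalone sketch (the struct below is a hypothetical stand-in for train_state): malloc returns raw memory, so non-trivial members such as std::string are never constructed, and assigning to them is undefined behaviour.

    #include <cstdlib>
    #include <string>

    struct state_sketch {
        int         iter = 0;
        std::string checkpoint_path;   // non-trivial member, needs its constructor
    };

    int main() {
        // Undefined behaviour: no constructor ran for bad->checkpoint_path,
        // so assigning to it reads garbage internal pointers (often a segfault).
        // state_sketch * bad = (state_sketch *) malloc(sizeof(state_sketch));
        // bad->checkpoint_path = "out.gguf";

        state_sketch * good = new state_sketch(); // constructors run
        good->checkpoint_path = "out.gguf";       // safe
        delete good;                              // destructor releases the string
        return 0;
    }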
xaedes
8721785c52
fix compile warnings 2023-09-16 22:28:23 +02:00
xaedes
83061fbdbe
fix compile warnings 2023-09-16 22:19:46 +02:00
xaedes
dd3e7634f0
remove terminating '\0' from tokenization
(llama_tokenize is now passed the string length instead of relying on terminating '\0')
2023-09-16 21:31:50 +02:00
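
Roughly, the caller-side change looks like the sketch below; the wrapper and the commented-out call shape are assumptions based on the API of that period, and the authoritative declaration is in llama.h.

    #include <cstdint>
    #include <string>
    #include <vector>

    typedef int32_t llama_token;  // placeholder for the typedef in llama.h

    // Hypothetical wrapper: instead of appending a terminating '\0' and letting
    // the tokenizer scan for it, the caller passes the text length explicitly,
    // so input containing embedded NUL bytes is tokenized correctly.
    static std::vector<llama_token> tokenize_sketch(const std::string & text) {
        std::vector<llama_token> tokens(text.size() + 1);
        // Approximate call shape from that era (check llama.h for the real one):
        // int n = llama_tokenize(ctx, text.data(), (int) text.size(),
        //                        tokens.data(), (int) tokens.size(), /*add_bos*/ true);
        int n = 0; // placeholder result
        tokens.resize(n > 0 ? (size_t) n : 0);
        return tokens;
    }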
xaedes
9db2664dd1
fix saving and loading of training type 2023-09-16 21:21:04 +02:00
xaedes
1d09965179
use die("msg") instead of GGML_ASSERT(!"msg") or throw std::runtime_error("msg") 2023-09-16 21:13:03 +02:00
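
A plausible shape for such a die() helper, sketched here for illustration (the repository's actual implementation may differ):

    #include <cstdio>
    #include <cstdlib>

    // Print an error message and terminate: a single unrecoverable-error path
    // instead of GGML_ASSERT(!"msg") tricks or thrown exceptions.
    [[noreturn]] static void die(const char * msg) {
        fprintf(stderr, "error: %s\n", msg);
        exit(1);
    }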
xaedes
1d33ec5b1c
fix condition in load_train_state_gguf 2023-09-16 21:13:02 +02:00
xaedes
9139fec7ff
fix code formatting of long function declarations 2023-09-16 20:38:23 +02:00
xaedes
8d82d4c8e6
remove static from process_escape since we need it exposed in header 2023-09-16 20:37:56 +02:00
xaedes
7930caf24c
fix usage of llama_tokenize 2023-09-16 20:36:43 +02:00
xaedes
d3e06d3e73
Merge branch 'master' into finetune-lora
# Conflicts:
#	Makefile
#	examples/baby-llama/baby-llama.cpp
#	examples/train-text-from-scratch/train-text-from-scratch.cpp
#	llama.cpp
2023-09-16 20:31:58 +02:00
xaedes
571dc94da9
increase train_samples by used_samples instead of number of batches
one batch can contain more than one sample when the option "fill_with_next_samples" is used
2023-09-16 20:23:05 +02:00
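
The counting change amounts to the following sketch (variable names hypothetical): with sample packing, one batch can consume several samples, so progress is accumulated per sample consumed rather than per batch.

    #include <cstddef>
    #include <cstdio>

    int main() {
        size_t train_samples = 0;
        const size_t n_batches = 10;
        for (size_t b = 0; b < n_batches; ++b) {
            const size_t used_samples = 3;  // e.g. three samples packed into this batch
            // before: train_samples += 1;  // undercounts when batches are packed
            train_samples += used_samples;  // count what was actually consumed
        }
        printf("%zu\n", train_samples);     // 30, not 10
        return 0;
    }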
xaedes
48d3509190
save and load head_count_kv in lora checkpoints 2023-09-16 20:20:23 +02:00
IsaacDynamo
b541b4f0b1
Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) 2023-09-16 19:35:25 +02:00
xaedes
7aa9ea7f20
fix consume_common_train_arg 2023-09-16 19:08:51 +02:00
xaedes
bef1e97875
move common opt_callback into common/train 2023-09-16 18:54:57 +02:00
xaedes
e9758ae1d2
move common train params into common/train 2023-09-16 18:45:59 +02:00
xaedes
ee27333b16
move train data saving code into callback to unify code of opt_callback
train_params are still different in finetune and train-text-from-scratch, so they can't yet be moved to train.h|cpp
2023-09-16 17:50:16 +02:00
xaedes
a8c8907c62
move train state into struct train_state 2023-09-16 17:30:38 +02:00
Vlad
5dbc2b3213
Enable build with CUDA 11.0 (make) (#3132)
* CUDA 11.0 fixes

* Cleaner CUDA/host flags separation

Also renamed GGML_ASSUME to GGML_CUDA_ASSUME
2023-09-16 16:55:43 +02:00
xaedes
9f4b1bf88d
move common train functions into common/train.[h|cpp] 2023-09-16 16:17:13 +02:00
xaedes
00b656f6db
remove lbfgs related train parameters 2023-09-16 15:59:46 +02:00
goerch
b08e75baea
Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170)
* Fix for #2721

* Reenable tokenizer test for LLaMa

* Add `console.cpp` dependency

* Fix dependency to `common`

* Fixing wrong fix.

* Make console usage platform specific

Work on compiler warnings.

* Adapting makefile

* Remove trailing whitespace

* Adapting the other parts of the makefile

* Fix typo.

* Fixing the last deviations from sentencepiece indicated by test-tokenizer-1

* Simplify logic

* Add missing change...

* Fix ugly compiler warning

* llama_tokenize should accept strings containing NUL now

* Adding huichen's test case
2023-09-16 13:41:33 +02:00
xaedes
ab56b63b27
update train-text-from-scratch with tokenization, sample selection and shuffling from finetune 2023-09-15 23:45:54 +02:00
xaedes
cc60b3f639
remove commented-out old code 2023-09-15 23:45:05 +02:00
xaedes
4f2ce91b9e
add static keywords 2023-09-15 23:44:53 +02:00
Cebtenzzre
e6616cf0db
examples : add compiler version and target to build info (#2998) 2023-09-15 16:59:49 -04:00
Cebtenzzre
3aefaab9e5
check C++ code with -Wmissing-declarations (#3184) 2023-09-15 15:38:27 -04:00
Cebtenzzre
69eb67e282
fix build numbers by setting fetch-depth=0 (#3197) 2023-09-15 15:18:15 -04:00
Meng Zhang
4fe09dfe66
llama : add support for StarCoder model architectures (#3187)
* add placeholder of starcoder in gguf / llama.cpp

* support convert starcoder weights to gguf

* convert MQA to MHA

* fix ffn_down name

* add LLM_ARCH_STARCODER to llama.cpp

* set head_count_kv = 1

* load starcoder weight

* add max_position_embeddings

* set n_positions to max_position_embeddings

* properly load all starcoder params

* fix head count kv

* fix comments

* fix vram calculation for starcoder

* store mqa directly

* add input embeddings handling

* add TBD

* working in cpu, metal buggy

* cleanup useless code

* metal : fix out-of-bounds access in soft_max kernels

* llama : make starcoder graph build more consistent with others

* refactor: cleanup comments a bit

* add other starcoder models: 3B, 7B, 15B

* support-mqa-directly

* fix: remove max_position_embeddings, use n_train_ctx

* Update llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* fix: switch to space from tab

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-15 22:02:13 +03:00
Cebtenzzre
80291a1d02
common : do not use GNU zero-length __VA_ARGS__ extension (#3195) 2023-09-15 21:02:01 +03:00
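
For context, a sketch of the portability issue (illustrative macros, not the repository's): the "##__VA_ARGS__" comma-swallowing trick is a GNU extension, and a common portable workaround is to fold the format string into the variadic arguments so the list is never empty.

    #include <cstdio>

    // GNU extension (non-portable): the ## swallows the comma when no extra
    // arguments are given.
    //   #define LOG_GNU(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
    //
    // Portable alternative: the format string is part of __VA_ARGS__, so the
    // variadic list is never empty.
    #define LOG(...) fprintf(stderr, __VA_ARGS__)

    int main() {
        LOG("no extra args\n");
        LOG("value = %d\n", 42);
        return 0;
    }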
Georgi Gerganov
c6f1491da0
metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) 2023-09-15 20:17:24 +03:00
Cebtenzzre
e3d87a6c36
convert : make ftype optional in simple scripts (#3185) 2023-09-15 12:29:02 -04:00
Georgi Gerganov
8c00b7a6ff
sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192)
* sync : ggml (Metal F32 support + reduce ggml-alloc size)

ggml-ci

* llama-bench : fix ggml_cpu_has_metal() duplicate function

ggml-ci
2023-09-15 19:06:03 +03:00
Engininja2
7e50d34be6
cmake : fix building shared libs for clang (rocm) on windows (#3176) 2023-09-15 15:24:30 +03:00
Evgeny Kurnevsky
235f7c193b
flake : use pkg-config instead of pkgconfig (#3188)
pkgconfig is an alias, it got removed from nixpkgs:
295a5e1e2b/pkgs/top-level/aliases.nix (L1408)
2023-09-15 11:10:22 +03:00
Georgi Gerganov
a51b687657
metal : relax conditions on fast matrix multiplication kernel (#3168)
* metal : relax conditions on fast matrix multiplication kernel

* metal : revert the concurrency change because it was wrong

* llama : remove experimental stuff
2023-09-15 11:09:24 +03:00
Andrei
76164fe2e6
cmake : fix llama.h location when built outside of root directory (#3179) 2023-09-15 11:07:40 +03:00
Ali Tariq
c2ab6fe661
ci : Cloud-V for RISC-V builds (#3160)
* Added Cloud-V File

* Replaced Makefile with original one

---------

Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
2023-09-15 11:06:56 +03:00
Roland
2d770505a8
llama : remove mtest (#3177)
* Remove mtest

* remove from common/common.h and examples/main/main.cpp
2023-09-15 10:28:45 +03:00
Cebtenzzre
98311c4277
llama : make quantize example up to 2.7x faster (#3115) 2023-09-14 21:09:53 -04:00
xaedes
76804fab1d
exclude some more known zero values from computations in flash_attn_f32 & flash_attn_back_f32 2023-09-14 22:19:39 +02:00