Commit graph

603 commits

Author SHA1 Message Date
xaedes
5d9fed7e7f
remove shape annotations in llama_eval_internal 2023-05-07 21:45:29 +02:00
xaedes
d20ba6f6e6
update static assert of GGML_OP_COUNT 2023-05-07 21:42:42 +02:00
xaedes
e643fa1619
smaller default values for baby llama model parameters 2023-05-07 21:38:00 +02:00
xaedes
ee565f34e3
Merge branch 'master' into train-example
# Conflicts:
#	ggml.c
#	llama.cpp
2023-05-07 21:24:12 +02:00
xaedes
4764842120
change name of GGML_OP_ADD_AT to GGML_OP_ACC 2023-05-07 21:14:57 +02:00
xaedes
e0de09d77e
shorten code using a variable 2023-05-07 19:48:38 +02:00
xaedes
49d6daa11e
vastly improve training results
instead of logit targets 0 and 1, use -1 and +1.
2023-05-07 19:46:05 +02:00
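A minimal sketch of the target change described above, with assumed names (the actual example code differs in detail): the correct token's logit target becomes +1 and all other entries become -1, instead of 1 and 0.

```c
#include "ggml.h"

// sketch (assumed names): fill one row of logit targets so the correct
// token gets +1 and every other vocabulary entry gets -1 (previously 1/0)
static void set_logit_targets(struct ggml_tensor * targets, int n_vocab, int token_id) {
    float * row = (float *) targets->data;
    for (int i = 0; i < n_vocab; ++i) {
        row[i] = -1.0f;        // was 0.0f before this change
    }
    row[token_id] = +1.0f;     // was 1.0f
}
```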
xaedes
93201abdb7
add trainable lora-only model with all big matrices C split into A,B with A*B=C
this is not a lora finetune; rather, the whole model is changed to contain only low-rank "lora" matrices.

training this instead of the normal model gave much worse results, though...
2023-05-07 19:44:51 +02:00
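Independent of ggml specifics, the idea is classic low-rank factorization: each big weight matrix C (n x m) is replaced by trainable factors A (n x r) and B (r x m) with r much smaller than n and m, so only n*r + r*m values are stored and trained instead of n*m. A self-contained illustration (names hypothetical):

```c
#include <stddef.h>

// illustrative only: reconstruct C = A*B from a rank-r factorization;
// with r << min(n, m) the factors hold far fewer parameters than C itself
void matmul_lowrank(const float * A, const float * B, float * C,
                    size_t n, size_t r, size_t m) {
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < r; ++k) {
                acc += A[i*r + k] * B[k*m + j]; // C[i][j] = sum_k A[i][k]*B[k][j]
            }
            C[i*m + j] = acc;
        }
    }
}
```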
Henri Vasserman
e1295513a4
CI: add Windows CLBlast and OpenBLAS builds (#1277)
* Add OpenCL and CLBlast support

* Add OpenBLAS support

* Remove testing from matrix

* change build name to 'clblast'
2023-05-07 13:20:09 +02:00
swittk
1b0fd45465
ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336)
Minor edit in ggml.c which originally would prevent OpenCL from loading completely if GGML_USE_ACCELERATE was defined.
Minor speedup in prompt eval time.
2023-05-06 23:03:23 -04:00
xaedes
e91b83b899
add GGML_ASSERT to catch ggml_rope and back value errors 2023-05-07 01:47:14 +02:00
xaedes
561fbe0d1b
replace inplace operations for training with copying operations to allow gradient propagation 2023-05-07 01:33:42 +02:00
xaedes
956511b248
fix kv_self gradients for training
use ggml_set instead of ggml_cpy to set the kv_self cache while properly propagating gradients
2023-05-07 01:32:46 +02:00
xaedes
47561de7d8
add ggml_set(ctx, a, b) to set b in a view of a and return the modified a
necessary to set values in the kv_self cache and properly propagate the gradients
2023-05-07 01:30:34 +02:00
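A hedged sketch of how the new op might be used for the kv cache, per the two commits above; the helper, its names, and the use of the 1-d variant ggml_set_1d are assumptions, not the committed code:

```c
#include "ggml.h"

// sketch (assumed names): store the values computed this step into the
// cache via ggml_set_1d rather than ggml_cpy into a view. ggml_set records
// a regular op in the graph and returns the modified tensor, so gradients
// can propagate to both 'cache' and 'cur'.
static struct ggml_tensor * store_in_cache(struct ggml_context * ctx,
                                           struct ggml_tensor * cache, // e.g. kv_self.k
                                           struct ggml_tensor * cur,
                                           size_t offset_bytes) {
    return ggml_set_1d(ctx, cache, cur, offset_bytes);
}
```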
xaedes
48bcc4dcf9
fix backward pass for add_at and change arguments to have same order as in view 2023-05-07 01:27:11 +02:00
xaedes
226521a4f1
optimize loss over multiple samples
this increases the size of the computation graph; a parallel batched forward pass is needed for more efficiency.
2023-05-07 01:23:51 +02:00
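A sketch of the approach described: per-sample losses are summed into one scalar so a single ggml_opt call optimizes over all samples; forward_and_loss is a hypothetical stand-in for the example's forward pass plus loss.

```c
#include "ggml.h"

// hypothetical helper: build the forward pass for one sample and
// return its scalar loss tensor
struct ggml_tensor * forward_and_loss(struct ggml_context * ctx, int sample_id);

// sketch: each sample appends another forward pass to the same graph,
// which is why the graph grows and a batched forward would be cheaper
struct ggml_tensor * total_loss(struct ggml_context * ctx, int n_samples) {
    struct ggml_tensor * loss = NULL;
    for (int i = 0; i < n_samples; ++i) {
        struct ggml_tensor * l = forward_and_loss(ctx, i);
        loss = (loss == NULL) ? l : ggml_add(ctx, loss, l);
    }
    return loss;
}
```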
xaedes
7a5dec24f8
add square_error_loss and cross_entropy_loss functions 2023-05-07 01:21:26 +02:00
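Plausible shapes of the two functions named here, built only from ggml ops that exist on this branch (ggml_log and ggml_sum_rows are introduced in commits further down this log); a hedged reconstruction, not necessarily the committed code:

```c
#include "ggml.h"

// sum((a - b)^2) over all elements
struct ggml_tensor * square_error_loss(struct ggml_context * ctx,
                                       struct ggml_tensor * a,
                                       struct ggml_tensor * b) {
    return ggml_sum(ctx, ggml_sqr(ctx, ggml_sub(ctx, a, b)));
}

// -sum(softmax(b) * log(softmax(a) + eps)); eps keeps log() away from zero
struct ggml_tensor * cross_entropy_loss(struct ggml_context * ctx,
                                        struct ggml_tensor * a,   // predicted logits
                                        struct ggml_tensor * b) { // target logits
    const float eps = 1e-3f;
    return ggml_sum(ctx,
               ggml_neg(ctx,
                   ggml_sum_rows(ctx,
                       ggml_mul(ctx,
                           ggml_soft_max(ctx, b),
                           ggml_log(ctx,
                               ggml_add1(ctx,
                                   ggml_soft_max(ctx, a),
                                   ggml_new_f32(ctx, eps)))))));
}
```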
xaedes
73fd66e9e5
fix training get_example_targets
predict the next token, not the current token!
2023-05-07 01:18:17 +02:00
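The fix in miniature, simplified to token-id targets and assumed names: the target for input position i is the token at i + 1.

```c
// sketch (assumed names): fill inputs and targets for next-token prediction;
// the bug was using data[begin + i] (the current token) as the target
void get_example_targets_sketch(const int * data, int begin, int n_tokens,
                                int * tokens_input, int * targets) {
    for (int i = 0; i < n_tokens; ++i) {
        tokens_input[i] = data[begin + i];
        targets[i]      = data[begin + i + 1]; // the next token, not the current one
    }
}
```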
Jed Fox
3924088512
Remove default arguments from sampling functions (#1343) 2023-05-06 17:01:47 -04:00
xaedes
80223d98fd
add test for ggml_sum_rows gradients 2023-05-06 18:01:32 +02:00
xaedes
e6186d98a5
implement ggml_repeat support for rank > 2 tensors 2023-05-06 18:01:17 +02:00
xaedes
7a15a8370c
implement backward pass for ggml_sum_rows, necessary for cross entropy loss 2023-05-06 17:37:51 +02:00
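Why this pairs with the rank > 2 ggml_repeat commit just above: the gradient of a row-sum broadcasts the incoming gradient back across each summed row, which is exactly what ggml_repeat expresses. A hedged sketch of the backward step:

```c
#include "ggml.h"

// sketch: backward of y = sum_rows(x); dL/dx is dL/dy broadcast back over
// each row, accumulated into the existing gradient of x
struct ggml_tensor * sum_rows_backward(struct ggml_context * ctx,
                                       struct ggml_tensor * grad_y,   // dL/dy, ne[0] == 1
                                       struct ggml_tensor * x_grad) { // accumulator shaped like x
    return ggml_add(ctx, x_grad, ggml_repeat(ctx, grad_y, x_grad));
}
```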
xaedes
5724628d31
add test for ggml_log gradients 2023-05-06 17:36:21 +02:00
xaedes
65d9f7349d
add ggml_log operation necessary for cross entropy loss 2023-05-06 17:35:40 +02:00
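For reference, the calculus behind the new op: y = log(x) has dy/dx = 1/x, so the backward step divides the incoming gradient by the input. A hedged sketch in ggml terms:

```c
#include "ggml.h"

// sketch: backward of y = log(x) is dL/dx = dL/dy / x,
// accumulated into the existing gradient of x
struct ggml_tensor * log_backward(struct ggml_context * ctx,
                                  struct ggml_tensor * grad_y, // dL/dy
                                  struct ggml_tensor * x,
                                  struct ggml_tensor * x_grad) {
    return ggml_add(ctx, x_grad, ggml_div(ctx, grad_y, x));
}
```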
xaedes
8cf04fec9d
fix soft_max backward pass for input->ne[1] != 1 2023-05-06 17:32:13 +02:00
xaedes
b4c273f7a3
add ggml_reshape_1d, ggml_reshape_4d and ggml_view_4d 2023-05-06 17:29:41 +02:00
xaedes
f1d51d144b
train on multiple examples, generate & print tokens with trained model afterwards
ctx0 for evaluation and optimization is renewed for each sample
2023-05-06 14:16:40 +02:00
xaedes
83ee1cd741
fix bug when using ggml_opt to optimize params in one context while using a renewable context for eval and opt
when gradients of model parameters are not kept, they are overwritten by tensors created by opt, which may become invalid after the opt context is renewed.
so we need to keep the original gradients and make dups for opt
2023-05-06 13:05:29 +02:00
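A sketch of the workaround described: keep the parameters' original gradient tensors alive and hand ggml_opt duplicates allocated in the short-lived context, so renewing that context cannot invalidate them (names and structure assumed):

```c
#include "ggml.h"

// sketch (assumed names): save the original gradients, then swap in dups
// allocated in the opt context before calling ggml_opt
void dup_grads_for_opt(struct ggml_context * opt_ctx,
                       struct ggml_tensor ** params,
                       struct ggml_tensor ** saved_grads, int n_params) {
    for (int i = 0; i < n_params; ++i) {
        saved_grads[i]  = params[i]->grad;                           // keep original
        params[i]->grad = ggml_dup_tensor(opt_ctx, params[i]->grad); // dup for opt
    }
}
```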
DaniAndTheWeb
173d0e6419
makefile: automatic Arch Linux detection (#1332)
This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux
2023-05-05 23:57:14 +02:00
Erik Scholz
a3b85b28da
ci : add cublas to windows release (#1271) 2023-05-05 22:56:09 +02:00
Pavol Rusnak
921dcee00a
readme: add missing info (#1324) 2023-05-05 16:43:36 +02:00
Ionoclast Laboratories
2d13786e91
Fix for OpenCL / CLBlast builds on macOS. (#1329) 2023-05-05 14:18:21 +02:00
Benjamin Lecaillon
a90e96b266
Convert.py @staticmethod (#1327)
* Line 698 has one @staticmethod too many; it should be removed,

otherwise unpickle.load() throws an error because the object is not callable

* Update convert.py

---------

Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com>
2023-05-05 03:17:07 +03:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) 2023-05-05 00:58:56 +02:00
Ivan Stepanov
34d9f22f44
Wrap exceptions in std::exception to produce verbose output on exception. (#1316) 2023-05-04 18:56:27 +02:00
Ivan Stepanov
d3e8093e9b
convert: support DT_BF16 tensors (#1309)
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-05-04 18:54:37 +02:00
44670
360cfe5bec
readme : add OpenBuddy link (#1321) 2023-05-04 19:33:31 +03:00
44670
2edbdb0f99
main : add --in-suffix option (#1318)
* adding --in-suffix option

* print input suffix before generation
2023-05-04 18:41:12 +03:00
Ron Jailall
20fbf2a2a0
ggml : change immintrin.h to intrin.h for compatibility (#1307)
* change immintrin.h to intrin.h for compatibility

Building on Windows 11 ARM throws an error on this line. It seems that using intrin.h covers both x86 and arm

* conditional def of intrin.h

* fix typo in ggml.c
2023-05-04 18:05:59 +03:00
DannyDaemonic
db1080876a
Only escape prompts when used with -e (#1311) 2023-05-04 05:08:25 -07:00
DannyDaemonic
c65a7fbfa9
Update main's README.md with new features (#1296) 2023-05-04 03:02:59 -07:00
Tomas
f647ce040f
fix #1224 reverse prompt and multi line (#1297)
* fix reverse prompt and multi line

* Code Formatting

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-04 03:02:30 -07:00
Georgi Gerganov
799fdc1b5d
ggml : vectorize Q8_0 quantization
https://github.com/ggerganov/ggml/pull/127#issuecomment-1533648531
2023-05-03 23:24:20 +03:00
khimaros
6daa09d879
examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
Georgi Gerganov
bca9ad938a
minor : fix whitespaces (#1302) 2023-05-03 20:09:42 +03:00
Georgi Gerganov
e2a937ca6a
minor : fix trailing whitespaces 2023-05-03 18:43:23 +03:00
KASR
b0c71c7b6d
scripts : platform independent script to verify sha256 checksums (#1203)
* python script to verify the checksum of the llama models

Added Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.

* Update README.md

update to the readme for improved readability and to explain the usage of the python checksum verification script

* update the verification script

I've extended the script based on suggestions by @prusnak

The script now checks the available RAM; if there is enough to check the file at once, it will do so. If not, the file is read in chunks.

* minor improvement

small change so that the available RAM is checked and not the total RAM

* remove the part of the code that reads the file at once if enough ram is available

Based on suggestions from @prusnak, I removed the part of the code that checks whether the user has enough RAM to read the entire model at once. The file is now always read in chunks.

* Update verify-checksum-models.py

quick fix to pass the git check
2023-05-03 18:31:28 +03:00
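The chunked-read idea the script settled on, sketched here in C rather than Python; sha256_update is a hypothetical stand-in for any streaming SHA-256 implementation:

```c
#include <stdio.h>

void sha256_update(const unsigned char * data, size_t len); // hypothetical streaming hash

// sketch: hash a file in fixed-size chunks so memory use stays constant,
// regardless of available RAM or model size
int hash_file_chunked(const char * path) {
    enum { CHUNK = 1 << 20 }; // 1 MiB per read
    static unsigned char buf[CHUNK];
    FILE * f = fopen(path, "rb");
    if (!f) return -1;
    size_t n;
    while ((n = fread(buf, 1, CHUNK, f)) > 0) {
        sha256_update(buf, n);
    }
    fclose(f);
    return 0;
}
```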
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
* fix dan.txt

* miku prompt improvements

* use common characters
2023-05-03 18:26:47 +03:00
Evan Jones
e216aa0463
llama : only copy used KV cache in get / set state (#1272)
* llama : only copy used KV cache in get / set state

* switch to ggml for copying k, v

* avoid designated initializers
2023-05-02 22:26:13 -04:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173) 2023-05-02 18:46:20 -07:00