Commit graph

1209 commits

Author SHA1 Message Date
xaedes
f6828cba9e
remove GGML_ALIGNED_REALLOC and use normal malloc/realloc/free for gguf ctx->kv & ctx->infos 2023-08-28 20:21:03 +02:00
xaedes
440d221c62
add missing blank line at end of file 2023-08-28 19:17:47 +02:00
xaedes
a925e9304a
fix non-windows GGML_ALIGNED_REALLOC 2023-08-28 19:16:27 +02:00
xaedes
12c4e5b50f
Merge branch 'master' into pr-train-mem-usage-improvements 2023-08-28 19:14:18 +02:00
xaedes
17ab46dffc
update train-text-from-scratch README.md 2023-08-28 19:13:20 +02:00
xaedes
3e7dfd08c4
remove prediction related code
use main for prediction, as it is better optimized
2023-08-28 19:11:27 +02:00
xaedes
3155019b53
remove trailing whitespace 2023-08-28 18:39:50 +02:00
xaedes
63bf200b87
remove code used to verify correctness of checkpoint file conversion 2023-08-28 18:38:52 +02:00
xaedes
31c093c2cc
bug fixes for convert-train-checkpoint-to-gguf.py loading checkpoints with opt_version=0 2023-08-28 18:33:00 +02:00
xaedes
e8df9e6815
temporarily add code to write old checkpoint files
used to verify that old checkpoint files are correctly converted to gguf
2023-08-28 18:17:51 +02:00
Johannes Gäßler
6b73ef1201
YAML result logging + preset script (#2657) 2023-08-28 17:59:39 +02:00
xaedes
5f27ade48e
bug fixes for convert-train-checkpoint-to-gguf 2023-08-28 17:57:10 +02:00
alonfaraj
75fafcbccc
make : fix tests build (#2855)
* makefile:
- fix test name
- add missing tests build

* editorconfig : fixes

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-28 18:38:35 +03:00
grahameth
be475f60af
llama.cpp : fix wrong vsnprintf call in MS compiler (#2856)
Co-authored-by: grahameth <->
2023-08-28 18:38:12 +03:00
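The MS-compiler issue above is a classic one: a va_list cannot be traversed twice, so a sizing pass and a formatting pass each need their own copy via va_copy. A minimal sketch of the portable pattern (illustrative only, not necessarily the exact change in #2856):

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

// Portable printf-style formatting into a heap buffer. A va_list may
// only be traversed once, so the sizing pass gets its own copy via
// va_copy; reusing the same va_list for both vsnprintf calls is what
// breaks under MSVC. (Sketch of the general pattern, not necessarily
// the exact fix in #2856.)
static char * format_alloc(const char * fmt, ...) {
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);
    const int n = vsnprintf(NULL, 0, fmt, ap); // sizing pass
    va_end(ap);
    if (n < 0) {
        va_end(ap2);
        return NULL;
    }
    char * buf = (char *) malloc((size_t) n + 1);
    if (buf) {
        vsnprintf(buf, (size_t) n + 1, fmt, ap2); // formatting pass
    }
    va_end(ap2);
    return buf;
}
```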
xaedes
c690c20362
print data checksums before saving and after loading to verify correctness 2023-08-28 16:09:53 +02:00
xaedes
f97f92bce5
remove trailing whitespace 2023-08-28 15:28:19 +02:00
xaedes
daa0b6c6a4
set name of tensors with empty name from what was read from gguf 2023-08-28 15:27:26 +02:00
xaedes
e86b3e3257
avoid printing lots of spaces in the unusual case that loss gets nan 2023-08-28 15:26:44 +02:00
xaedes
3d8d884049
bug fix in load_opt_context_gguf 2023-08-28 15:07:00 +02:00
Ronny Brendel
3af6b86301
ggml : tiny ggml_vec_dot_q4_K_q8_K AVX2 improvement (#2819) 2023-08-28 15:51:08 +03:00
Georgi Gerganov
35feac6560
ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) (#2852)
* ggml : sync (mem align to header + conv_transpose_2d fixes)

ggml-ci

* ggml-alloc : minor fix

* ggml-alloc : sync more fixes
2023-08-28 14:24:53 +03:00
Johannes Gäßler
92b1bbd2ec
CUDA: fix RoPE asserts, block sizes (#2833) 2023-08-28 14:23:55 +03:00
igarnier
dd0dc366da
llama.h : add missing struct keyword for C compat in callback type (#2847) 2023-08-28 11:19:59 +03:00
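Background for the fix above: C++ treats a struct name as a type, but plain C does not, so a header consumed from C must qualify the tag. A tiny illustration (the callback name below is hypothetical, not the one in llama.h):

```c
struct llama_context; /* opaque forward declaration */

/* In C++ the bare name "llama_context *" would compile, but a C
   translation unit needs the struct keyword. A C-compatible header
   therefore declares callback types like this (hypothetical name): */
typedef void (*my_progress_callback)(struct llama_context * ctx, float progress, void * user_data);
```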
Georgi Gerganov
f55538c3cc
metal : fix memory leak (#2762)
* metal : fix memory leak

* metal : fix encoders memory leak

* metal : clean up more memory resources

* metal : fix more leaks

* metal : reuse dispatch queue + autoreleasepool

* metal : reuse array for command buffers and encoders

* ggml : assert for odd number of blocks on ARM

15M tinyllama is an example
2023-08-28 10:59:08 +03:00
Cebtenzzre
ebcee207b6
quantize : make output filename optional again (#2823)
* quantize : make output filename optional again

* quantize : fix path parsing on Windows

suggested by @slaren
2023-08-28 09:32:25 +03:00
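The Windows path-parsing part of the fix above comes down to accepting both separators when deriving a default output name from the input path. A hedged sketch of that idea (illustrative, not the exact code from #2823):

```c
// Return the basename of a path, accepting both '/' and '\\' so that
// Windows paths such as "C:\\models\\7B\\f16.gguf" work too.
// (Illustrative sketch, not the exact code from #2823.)
static const char * path_basename(const char * path) {
    const char * base = path;
    for (const char * p = path; *p; ++p) {
        if (*p == '/' || *p == '\\') {
            base = p + 1;
        }
    }
    return base;
}
```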
JohnnyB
3e8ff47af6
devops : added systemd units and set versioning to use date. (#2835)
* Corrections and systemd units

* Missing dependency clblast
2023-08-28 09:31:24 +03:00
xaedes
1f83343498
bug fix in read_tensor_by_name 2023-08-28 02:02:05 +02:00
xaedes
152cfaac36
bug fix: init model when no checkpoint was loaded 2023-08-28 01:49:18 +02:00
xaedes
4882ff0c59
bug fixes in load_llama_model_gguf 2023-08-28 01:49:17 +02:00
xaedes
76d2794e11
bug fixes in tokenize_file 2023-08-28 01:49:17 +02:00
xaedes
5d94997a09
add gguf example cmake file 2023-08-28 01:49:17 +02:00
xaedes
ca5b344fb1
fix memory corruption bug in gguf
ctx->kv and ctx->infos were reallocated with a plain (unaligned) realloc, but freed with an aligned free.
To fix this, a GGML_ALIGNED_REALLOC was added, but there is no posix_memalign_realloc function,
so on non-Windows and non-mingw32 platforms we fall back to an aligned malloc, followed by copying
and freeing the old data.
2023-08-28 01:49:17 +02:00
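The fallback this message describes, sketched in C under the assumption that posix_memalign is available (the caller must track the old allocation size; as the newer commit at the top of this page notes, the macro was later removed again in favor of plain malloc/realloc/free):

```c
#define _POSIX_C_SOURCE 200112L // for posix_memalign
#include <stdlib.h>
#include <string.h>

// Aligned-realloc fallback for platforms without a native one:
// aligned malloc, copy, free the old block, exactly as the commit
// message describes. (Sketch of the described approach only.)
static void * aligned_realloc_fallback(void * ptr, size_t old_size, size_t new_size, size_t align) {
    void * result = NULL;
    if (posix_memalign(&result, align, new_size) != 0) {
        return NULL;
    }
    if (ptr != NULL) {
        memcpy(result, ptr, old_size < new_size ? old_size : new_size);
        free(ptr); // memory from posix_memalign is released with free()
    }
    return result;
}
```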
xaedes
0b2c85b025
use norm_rms_eps and rope parameters, and command line options to set them 2023-08-27 23:39:21 +02:00
xaedes
91a4ccaf96
use same GGUF_GET_KEY macro as in llama.cpp 2023-08-27 23:32:49 +02:00
xaedes
d71069c4fb
add layer_norm_rms_eps to checkpoint convert script 2023-08-27 23:25:41 +02:00
xaedes
ef899fbe89
add gguf key and tensor names for optimizer and training 2023-08-27 23:21:59 +02:00
xaedes
495a62a142
save opt parameter counter as uint64 2023-08-27 23:21:08 +02:00
xaedes
cb42324d6a
add gguf arch and ftype 2023-08-27 23:20:18 +02:00
xaedes
a6f3a47c39
Merge branch 'master' into pr-train-mem-usage-improvements 2023-08-27 23:11:47 +02:00
xaedes
3a91c975a6
add first draft for checkpoint conversion script 2023-08-27 22:05:36 +02:00
xaedes
0c494cc60e
save & load opt->just_initialized value 2023-08-27 22:05:24 +02:00
Georgi Gerganov
103cfafc77
gguf : fix strings to not be null-terminated (#2839)
* gguf : fix strings to not be null-terminated

ggml-ci

* gguf : fix gguf_add_tensor name
2023-08-27 21:50:22 +03:00
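Context for the string fix above: GGUF stores strings as a length prefix plus raw bytes, with no trailing NUL, so readers must never treat them as C strings. Roughly the shape used by ggml's gguf reader at the time (field names illustrative):

```c
#include <stdint.h>
#include <stdio.h>

// A GGUF string is length-prefixed and NOT null-terminated.
// (Roughly the shape in ggml's gguf reader; names illustrative.)
struct gguf_str {
    uint64_t n;    // byte count, no trailing '\0'
    char   * data; // n raw bytes
};

// Printing therefore needs an explicit length: "%.*s", never "%s".
static void print_gguf_str(const struct gguf_str * s) {
    printf("%.*s\n", (int) s->n, s->data);
}
```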
Georgi Gerganov
c10704d01e
llama : fix MPI threads (close #2827) 2023-08-27 18:55:41 +03:00
Olivier Chafik
230d46c723
examples : update llama2.c converter to read vocab and write models in GGUF format (#2751)
* llama2.c: direct gguf output (WIP)

* Simplify vector building logic

* llama2.c gguf conversion: fix token types in converter

* llama2.c: support copying vocab from a llama gguf model file

* llama2.c: update default path for vocab model + readme

* llama2.c: use defines for gguf keys

* llama2.c: escape whitespaces w/ U+2581 in vocab converter the llama.cpp way

* llama2.c converter: cleanups + take n_ff from config
2023-08-27 17:13:31 +03:00
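The whitespace escaping mentioned above follows the SentencePiece convention: spaces in vocab entries are replaced by U+2581 ("▁", UTF-8 bytes 0xE2 0x96 0x81). The converter in the commit is Python; a C sketch of the same transformation:

```c
#include <stdlib.h>
#include <string.h>

// Replace each ' ' with U+2581 (UTF-8: 0xE2 0x96 0x81), the
// SentencePiece convention llama.cpp uses for vocab tokens.
// (C sketch of what the Python converter does.)
static char * escape_whitespace(const char * text) {
    const size_t len = strlen(text);
    char * out = (char *) malloc(len * 3 + 1); // worst case: all spaces
    if (out == NULL) {
        return NULL;
    }
    char * p = out;
    for (const char * s = text; *s; ++s) {
        if (*s == ' ') {
            *p++ = (char) 0xE2;
            *p++ = (char) 0x96;
            *p++ = (char) 0x81;
        } else {
            *p++ = *s;
        }
    }
    *p = '\0';
    return out;
}
```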
Kawrakow
463173a6c0
llama : speedup tokenization (#2831)
* Speedup tokenization

On current master it takes ~3.2 seconds to tokenize
Wikitext. With this change it becomes ~525 ms.

* Fixit: it was missing the piece after the last found occurrence

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-27 16:50:33 +03:00
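The follow-up fix in this commit is the classic tail bug of find-based splitting: after the loop over matches, the remainder after the last occurrence still has to be emitted. A hedged sketch of the pattern (not the tokenizer code itself):

```c
#include <string.h>

// Split `text` on `delim`, passing each piece to `emit`. The call
// after the loop emits the piece after the last occurrence -- the
// part the "Fixit" in #2831 was about. (Illustrative pattern only.)
static void split_on(const char * text, const char * delim,
                     void (*emit)(const char * piece, size_t len)) {
    const size_t dlen = strlen(delim);
    if (dlen == 0) { // empty delimiter: the whole text is one piece
        emit(text, strlen(text));
        return;
    }
    const char * start = text;
    const char * hit;
    while ((hit = strstr(start, delim)) != NULL) {
        emit(start, (size_t) (hit - start));
        start = hit + dlen;
    }
    emit(start, strlen(start)); // don't forget the final piece
}
```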
Georgi Gerganov
eaa13a48ff
falcon : fix CUDA inference by making K and Q contiguous (#2830)
* falcon : fix CUDA inference by making K and Q contiguous

ggml-ci

* cuda : add assert to guard from non-cont ropes
2023-08-27 16:40:48 +03:00
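The contiguity fix above follows a common ggml pattern: ggml_permute and ggml_reshape produce non-contiguous views, and kernels that assume contiguity (the CUDA RoPE kernel here) need the data materialized first with ggml_cont. A sketch of the pattern (names illustrative, not the falcon graph code):

```c
#include "ggml.h"

// Materialize a permuted view before a kernel that assumes contiguous
// input. ggml_permute returns a non-contiguous view; ggml_cont copies
// it into contiguous memory. (Pattern sketch; names illustrative.)
static struct ggml_tensor * contiguous_rope_input(struct ggml_context * ctx, struct ggml_tensor * t) {
    struct ggml_tensor * view = ggml_permute(ctx, t, 0, 2, 1, 3);
    return ggml_cont(ctx, view);
}
```

On the CUDA side the same commit adds an assert along the lines of GGML_ASSERT(ggml_is_contiguous(...)) so that non-contiguous ropes fail loudly instead of computing garbage.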
Georgi Gerganov
da7455d046
readme : fix headings 2023-08-27 15:52:34 +03:00
Georgi Gerganov
25423e9185
scripts : helper convert script 2023-08-27 15:24:58 +03:00
Kawrakow
a6d1189fdd
k_quants tuning for Falcon-7b (#2816)
* Make ggml-cuda.cu build with QK_K = 64

Using LLAMA_CUDA_FORCE_DMMV = ON and -nommq it runs and produces
a meaningful result.

* k_quants tuning for Falcon-7b

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-27 15:19:59 +03:00
Georgi Gerganov
c48c5bb0b0
readme : update hot topics 2023-08-27 14:44:35 +03:00