xaedes
f6828cba9e
remove GGML_ALIGNED_REALLOC and use normal malloc/realloc/free for gguf ctx->kv & ctx->infos
2023-08-28 20:21:03 +02:00
xaedes
440d221c62
add missing blank line at end of file
2023-08-28 19:17:47 +02:00
xaedes
a925e9304a
fix non-windows GGML_ALIGNED_REALLOC
2023-08-28 19:16:27 +02:00
xaedes
12c4e5b50f
Merge branch 'master' into pr-train-mem-usage-improvements
2023-08-28 19:14:18 +02:00
xaedes
17ab46dffc
update train-text-from-scratch README.md
2023-08-28 19:13:20 +02:00
xaedes
3e7dfd08c4
remove prediction related code
...
use main for prediction; it is better optimized
2023-08-28 19:11:27 +02:00
xaedes
3155019b53
remove trailing whitespace
2023-08-28 18:39:50 +02:00
xaedes
63bf200b87
remove code used to verify correctness of checkpoint file conversion
2023-08-28 18:38:52 +02:00
xaedes
31c093c2cc
bug fixes for convert-train-checkpoint-to-gguf.py loading checkpoints with opt_version=0
2023-08-28 18:33:00 +02:00
xaedes
e8df9e6815
temporarily add code to write old checkpoint files
...
used to verify that old checkpoint files are correctly converted to gguf
2023-08-28 18:17:51 +02:00
Johannes Gäßler
6b73ef1201
YAML result logging + preset script ( #2657 )
2023-08-28 17:59:39 +02:00
xaedes
5f27ade48e
bug fixes for convert-train-checkpoint-to-gguf
2023-08-28 17:57:10 +02:00
alonfaraj
75fafcbccc
make : fix tests build ( #2855 )
...
* makefile:
- fix test name
- add missing tests build
* editorconfig : fixes
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-28 18:38:35 +03:00
grahameth
be475f60af
llama.cpp : fix wrong vsnprintf call in MS compiler ( #2856 )
...
Co-authored-by: grahameth <->
2023-08-28 18:38:12 +03:00
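For context on the class of bug this commit fixes: MSVC's pre-C99 vsnprintf semantics (returning -1 on truncation) and reuse of a consumed va_list are common sources of such errors. The helper below is a generic portable sketch, not the exact change from the commit:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

// Generic sketch: format via a copy of the va_list. With C99 semantics,
// vsnprintf returns the length the full output would have had, enabling a
// measure-then-format pattern; MSVC's older vsnprintf returns -1 instead.
static int format_into(char * buf, size_t buf_size, const char * fmt, va_list args) {
    va_list args_copy;
    va_copy(args_copy, args);  // never reuse a consumed va_list
    int needed = vsnprintf(buf, buf_size, fmt, args_copy);
    va_end(args_copy);
    return needed;
}
```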
xaedes
c690c20362
print data checksums before saving and after loading to verify correctness
2023-08-28 16:09:53 +02:00
xaedes
f97f92bce5
remove trailing whitespace
2023-08-28 15:28:19 +02:00
xaedes
daa0b6c6a4
set the name of tensors with an empty name from what was read from gguf
2023-08-28 15:27:26 +02:00
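A minimal sketch of what such a fallback can look like with ggml's public API; the function and variable names are illustrative, not the commit's actual code:

```c
#include <string.h>
#include "ggml.h"

// Sketch: adopt the name read from the gguf file when the tensor's own
// name is empty (function and variable names are illustrative).
static void set_name_if_empty(struct ggml_tensor * tensor, const char * name_from_gguf) {
    if (strlen(ggml_get_name(tensor)) == 0) {
        ggml_set_name(tensor, name_from_gguf);
    }
}
```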
xaedes
e86b3e3257
avoid printing lots of spaces in the unusual case that loss gets nan
2023-08-28 15:26:44 +02:00
xaedes
3d8d884049
bug fix in load_opt_context_gguf
2023-08-28 15:07:00 +02:00
Ronny Brendel
3af6b86301
ggml : tiny ggml_vec_dot_q4_K_q8_K AVX2 improvement ( #2819 )
2023-08-28 15:51:08 +03:00
Georgi Gerganov
35feac6560
ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) ( #2852 )
...
* ggml : sync (mem align to header + conv_transpose_2d fixes)
ggml-ci
* ggml-alloc : minor fix
* ggml-alloc : sync more fixes
2023-08-28 14:24:53 +03:00
Johannes Gäßler
92b1bbd2ec
CUDA: fix RoPE asserts, block sizes ( #2833 )
2023-08-28 14:23:55 +03:00
igarnier
dd0dc366da
llama.h : add missing struct keyword for C compat in callback type ( #2847 )
2023-08-28 11:19:59 +03:00
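For context: C, unlike C++, does not let a bare struct tag act as a type name, so a public header consumed from C must spell out the `struct` keyword (or provide a typedef). The callback below illustrates the shape of the problem; it is not the exact declaration from llama.h:

```c
struct llama_context; // forward declaration; in C the tag alone is not a type name

// Compiles as C++ either way, but plain C requires the struct keyword
// (callback name and parameters here are illustrative):
typedef void (*example_callback)(struct llama_context * ctx, void * user_data);
```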
Georgi Gerganov
f55538c3cc
metal : fix memory leak ( #2762 )
...
* metal : fix memory leak
* metal : fix encoders memory leak
* metal : clean up more memory resources
* metal : fix more leaks
* metal : reuse dispatch queue + autoreleasepool
* metal : reuse array for command buffers and encoders
* ggml : assert for odd number of blocks on ARM
15M tinyllama is an example
2023-08-28 10:59:08 +03:00
Cebtenzzre
ebcee207b6
quantize : make output filename optional again ( #2823 )
...
* quantize : make output filename optional again
* quantize : fix path parsing on Windows
suggested by @slaren
2023-08-28 09:32:25 +03:00
JohnnyB
3e8ff47af6
devops : added systemd units and set versioning to use date. ( #2835 )
...
* Corrections and systemd units
* Missing dependency clblast
2023-08-28 09:31:24 +03:00
xaedes
1f83343498
bug fix in read_tensor_by_name
2023-08-28 02:02:05 +02:00
xaedes
152cfaac36
bug fix: init model when no checkpoint was loaded
2023-08-28 01:49:18 +02:00
xaedes
4882ff0c59
bug fixes in load_llama_model_gguf
2023-08-28 01:49:17 +02:00
xaedes
76d2794e11
bug fixes in tokenize_file
2023-08-28 01:49:17 +02:00
xaedes
5d94997a09
add gguf example cmake file
2023-08-28 01:49:17 +02:00
xaedes
ca5b344fb1
fix memory corruption bug in gguf
...
ctx->kv and ctx->infos were reallocated using a non-aligned realloc but freed with an aligned free.
to fix this a GGML_ALIGNED_REALLOC was added, but there is no posix_memalign_realloc function,
so on non-windows and non-mingw32 platforms we fall back to an aligned malloc, followed by copying
and freeing the old data (see the sketch after this entry).
2023-08-28 01:49:17 +02:00
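A minimal C sketch of the fallback this entry describes, assuming posix_memalign is available; the helper name and signature are hypothetical, and as the entry at the top of this log notes, the workaround was later dropped in favor of plain malloc/realloc/free:

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical aligned-realloc fallback: there is no posix_memalign_realloc,
// so allocate a fresh aligned block, copy the old contents, free the old
// block. Memory from posix_memalign may be released with free().
static void * aligned_realloc_fallback(void * ptr, size_t old_size, size_t new_size, size_t align) {
    void * new_ptr = NULL;
    // align must be a power of two and a multiple of sizeof(void *)
    if (posix_memalign(&new_ptr, align, new_size) != 0) {
        return NULL;
    }
    if (ptr != NULL) {
        memcpy(new_ptr, ptr, old_size < new_size ? old_size : new_size);
        free(ptr);
    }
    return new_ptr;
}
```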
xaedes
0b2c85b025
use norm_rms_eps and rope parameters, and command line options to set them
2023-08-27 23:39:21 +02:00
xaedes
91a4ccaf96
use same GGUF_GET_KEY macro as in llama.cpp
2023-08-27 23:32:49 +02:00
xaedes
d71069c4fb
add layer_norm_rms_eps to checkpoint convert script
2023-08-27 23:25:41 +02:00
xaedes
ef899fbe89
add gguf key and tensor names for optimizer and training
2023-08-27 23:21:59 +02:00
xaedes
495a62a142
save opt parameter counter as uint64
2023-08-27 23:21:08 +02:00
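A hedged sketch of what saving the counter can look like with the gguf API; using a 64-bit value avoids overflowing a 32-bit count for large models (the key string and names are illustrative):

```c
#include <stdint.h>
#include "ggml.h"

// Sketch: record the optimizer parameter count as an unsigned 64-bit
// value (key name is illustrative, not necessarily the one used).
static void save_param_count(struct gguf_context * fctx, uint64_t total_params) {
    gguf_set_val_u64(fctx, "optimizer.parameter_count", total_params);
}
```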
xaedes
cb42324d6a
add gguf arch and ftype
2023-08-27 23:20:18 +02:00
xaedes
a6f3a47c39
Merge branch 'master' into pr-train-mem-usage-improvements
2023-08-27 23:11:47 +02:00
xaedes
3a91c975a6
add first draft for checkpoint conversion script
2023-08-27 22:05:36 +02:00
xaedes
0c494cc60e
save & load opt->just_initialized value
2023-08-27 22:05:24 +02:00
Georgi Gerganov
103cfafc77
gguf : fix strings to not be null-terminated ( #2839 )
...
* gguf : fix strings to not be null-terminated
ggml-ci
* gguf : fix gguf_add_tensor name
2023-08-27 21:50:22 +03:00
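The practical consequence of this fix: a gguf string is length-prefixed, and readers must not assume a trailing NUL. A sketch of the in-memory shape, modeled on ggml's internal gguf code:

```c
#include <stdint.h>

// Length-prefixed string as used by gguf: n bytes of data, with no
// guarantee of a trailing '\0'; always pair data with n when reading.
struct gguf_str {
    uint64_t n;    // number of bytes in data
    char   * data; // not necessarily NUL-terminated
};
```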
Georgi Gerganov
c10704d01e
llama : fix MPI threads ( close #2827 )
2023-08-27 18:55:41 +03:00
Olivier Chafik
230d46c723
examples : update llama2.c converter to read vocab and write models in GGUF format ( #2751 )
...
* llama2.c: direct gguf output (WIP)
* Simplify vector building logic
* llama2.c gguf conversion: fix token types in converter
* llama2.c: support copying vocab from a llama gguf model file
* llama2.c: update default path for vocab model + readme
* llama2.c: use defines for gguf keys
* llama2.c: escape whitespaces w/ U+2581 in vocab converter the llama.cpp way
* llama2.c converter: cleanups + take n_ff from config
2023-08-27 17:13:31 +03:00
Kawrakow
463173a6c0
llama : speedup tokenization ( #2831 )
...
* Speedup tokenization
On current master it takes ~3.2 seconds to tokenize
Wikitext. With this change it becomes ~525 ms.
* Fixit: it was missing the piece after the last found occurrence
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-27 16:50:33 +03:00
Georgi Gerganov
eaa13a48ff
falcon : fix CUDA inference by making K and Q contiguous ( #2830 )
...
* falcon : fix CUDA inference by making K and Q contiguous
ggml-ci
* cuda : add assert to guard from non-cont ropes
2023-08-27 16:40:48 +03:00
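The CUDA rope kernel asserts contiguous input, so the fix is essentially to materialize K and Q contiguously before rope. A hedged sketch using ggml_cont; the function and tensor names are illustrative:

```c
#include "ggml.h"

// Sketch: ggml_cont copies a tensor into contiguous memory, satisfying
// the contiguity assert added to the CUDA rope path.
static void make_kq_contiguous(struct ggml_context * ctx0,
                               struct ggml_tensor ** Kcur,
                               struct ggml_tensor ** Qcur) {
    *Kcur = ggml_cont(ctx0, *Kcur);
    *Qcur = ggml_cont(ctx0, *Qcur);
    // rope is then applied to the now-contiguous tensors
}
```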
Georgi Gerganov
da7455d046
readme : fix headings
2023-08-27 15:52:34 +03:00
Georgi Gerganov
25423e9185
scripts : helper convert script
2023-08-27 15:24:58 +03:00
Kawrakow
a6d1189fdd
k_quants tuning for Falcon-7b ( #2816 )
...
* Make ggml-cuda.cu build with QK_K = 64
Using LLAMA_CUDA_FORCE_DMMV = ON and -nommq it runs and produces
a meaningful result.
* k_quants tuning for Falcon-7b
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-27 15:19:59 +03:00
Georgi Gerganov
c48c5bb0b0
readme : update hot topics
2023-08-27 14:44:35 +03:00