Commit graph

  • 2c6985f79e
    bug fixes for cross entropy loss xaedes 2023-07-02 20:55:54 +02:00
  • 97964a4cc9
    change default AdamW weight decay parameter defined in ggml to 0.0, making Adam default instead of AdamW xaedes 2023-06-29 21:36:28 +02:00
  • f175ead6ef
    change default AdamW weight decay parameter used in training to 0.1 as used in nanoGPT xaedes 2023-06-29 21:33:39 +02:00
  • a80f184e6d
    change AdamW decay parameter to work like the torch AdamW decay parameter xaedes 2023-06-29 21:31:25 +02:00
  • ed4319e1a7
    add and use function ggml_build_backward_expand to avoid stack overflows with large maximum number of nodes xaedes 2023-07-28 23:08:11 +02:00
  • e05e4414ac
    remove unused compute buffer 3 xaedes 2023-06-27 17:43:00 +02:00
  • 6e3f95bf06
    implement gradient checkpointing for training xaedes 2023-07-28 23:06:05 +02:00
  • 3492f848d7
    gguf : add gguf_find_key (#2438) klosax 2023-07-28 22:45:24 +02:00
  • 765fab636a
    ggml.c : add gguf_find_key klosax 2023-07-28 22:31:14 +02:00
  • 78f57ffd90
    ggml.h : add gguf_find_key klosax 2023-07-28 22:28:56 +02:00
  • e9d2990c3d
    gguf.cpp : find key example klosax 2023-07-28 22:26:49 +02:00
  • 5a87675db4
    output vector is not part of llama.c model file Aniket 2023-07-28 16:17:44 -04:00
  • 817cc20f4c
    updating gitignore to ignore additional binaries Aniket 2023-07-28 16:09:33 -04:00
  • af9caca434
    updating makefile to compile finalized version Aniket 2023-07-28 16:08:51 -04:00
  • b3aa1073ab
    saving the file with all the variables found in llama.c model Aniket 2023-07-28 16:08:09 -04:00
  • d7003a98cc
    Fix reset of unused g->nodes and g->grads to NULL xaedes 2023-06-17 18:56:27 +02:00
  • d395b19c8c
    add gradient clipping to AdamW xaedes 2023-06-15 23:48:46 +02:00
  • d39c8e6863
    remove unnecessary Adam(W) optimizer tensors. xaedes 2023-06-15 21:07:56 +02:00
  • 5d124d0cb4
    fix track_max_mem in forward_batch_wo_cache_flash_attn_train xaedes 2023-06-15 20:34:56 +02:00
  • 8a88e5855c
    perplexity : add Hellaswag calculation (#2389) master-8a88e58 klosax 2023-07-28 20:25:36 +02:00
  • a9559bf77b
    ggml : workaround for missing _mm256_setr_m128i in GCC < 8 in k_quants.c (#2405) master-a9559bf Lee 2023-07-29 02:17:45 +08:00
  • ee1b497c98
    llama : support more diverse tokenizers? (#2420) master-ee1b497 eric8607242 2023-07-29 02:10:05 +08:00
  • a8ee520eb8
    Update llama.cpp Georgi Gerganov 2023-07-28 21:09:19 +03:00
  • d73b8d48b4
    examples : fix whitespace Georgi Gerganov 2023-07-28 21:05:08 +03:00
  • 34ae1caf7f
    examples : server chat mode with llama2 (#2400) nhamanasu 2023-07-29 03:02:10 +09:00
  • abed446346
    q2_K sc_high JohannesGaessler 2023-07-28 19:27:44 +02:00
  • cc5c67be9b
    adding the rough attempt to convert the model Aniket 2023-07-28 12:26:44 -04:00
  • 485e62b1e9
    Adding a doc that shows mappings that are coded in between llama.c <-> gg Aniket 2023-07-28 12:26:11 -04:00
  • 912fc590c4
    Updated makefile to compile rough tests Aniket 2023-07-28 12:25:21 -04:00
  • 58daf95aa3
    add __restrict__ JohannesGaessler 2023-07-28 18:01:34 +02:00
  • 6808800c17
    loop unrolling JohannesGaessler 2023-07-28 17:10:42 +02:00
  • a3505fac64
    faster q8_1 loading JohannesGaessler 2023-07-28 14:21:25 +02:00
  • dead8f4b5b
    Fix misaligned memory access in Q4_1 kernel Iwan Kawrakow 2023-07-28 17:27:01 +03:00
  • 72af25998c
    Fix misaligned memory access in Q4_1 kernel Iwan Kawrakow 2023-07-28 17:12:27 +03:00
  • e5d23f2e7e
    ggml : fix ARM build + speed-up ggml_mul Georgi Gerganov 2023-07-28 16:31:59 +03:00
  • 2231618450
    Split off matrix vector multiplication for separate optimization 0cc4m 2023-07-28 14:58:07 +02:00
  • a4d1eb72c6
    ggml : add q4_1 normalized quants Georgi Gerganov 2023-07-28 14:37:52 +03:00
  • bf60b6a149
    common.cpp : alter wording klosax 2023-07-28 11:57:31 +02:00
  • 630fa8d86f
    common.h : alter wording klosax 2023-07-28 11:56:38 +02:00
  • d100e9afe2
    perplexity.cpp : alter wording klosax 2023-07-28 11:55:22 +02:00
  • b53e713883
    q5_K JohannesGaessler 2023-07-28 11:26:14 +02:00
  • 5d8b3de4e5
    vdr JohannesGaessler 2023-07-28 10:24:54 +02:00
  • b59cd1dc1c
    q4_K JohannesGaessler 2023-07-28 09:24:14 +02:00
  • a62bcc891c
    q3_k JohannesGaessler 2023-07-27 21:39:04 +02:00
  • 5bff3df032
    q2_K JohannesGaessler 2023-07-27 20:44:16 +02:00
  • 4b3af63ee8
    q6_K JohannesGaessler 2023-07-26 21:46:03 +02:00
  • ddb37bf8a0
    mmq implementation for non k-quants JohannesGaessler 2023-07-08 19:12:39 +02:00
  • d91f3f0c55
    readme : fix the description of the Tail free sampling (TFS) method (#2431) Weird Constructor 2023-07-28 10:44:43 +02:00
  • 65cdf34bdc
    llama : use n_embd_gqa instead of n_embd to handle llama-2 70B (#2433) Rand Xie 2023-07-28 01:42:53 -07:00
  • 11ef380c2a
    GGUF : write tensor (#2426) M. Yusuf Sarıgöz 2023-07-28 11:34:16 +03:00
  • 675425563c
    ggml : poc for normalizing weights for better quantization Georgi Gerganov 2023-07-27 21:16:10 +03:00
  • 0a8ad8f1d8
    use n_embd_gqa instead of n_embd to handle llama-2 70B randxie 2023-07-28 00:21:58 -07:00
  • 511055722e
    undo formatting gguf-write-tensor M. Yusuf Sarıgöz 2023-07-28 09:09:14 +03:00
  • 6bd9bd9026
    Add .spv to gitignore 0cc4m 2023-07-28 07:08:28 +02:00
  • f6b241e803
    Batch submissions 0cc4m 2023-07-28 07:07:58 +02:00
  • b40550cf1a
    change wiki link Concedo 2023-07-28 13:01:12 +08:00
  • 44065df367
    Add F32 dmmv shaders 0cc4m 2023-07-28 06:38:23 +02:00
  • d0bd120814
    Use F16 kernel for most things, replace q_f32 with mul_mat_q_f16 function 0cc4m 2023-07-28 05:41:47 +02:00
  • 31486ebc8d
    updated readme Concedo 2023-07-28 11:32:55 +08:00
  • 6e8c89df94
    Fix the description of the Tail free sampling (TFS) method. Weird Constructor 2023-07-28 05:13:11 +02:00
  • 766ec56642
    ensure primitive types can be used as root of schema Evan Jones 2023-07-27 21:24:05 -04:00
  • edcc7ae7d2
    Obtaining LLaMA 2 instructions (#2308) niansa/tuxifan 2023-07-28 03:14:11 +02:00
  • 029d911cd1
    Added links to LLaMA 2 70B models niansa/tuxifan 2023-07-28 02:01:29 +02:00
  • 0c74b82f2c
    Added LLaMA 2 usage instructions niansa/tuxifan 2023-07-28 01:59:55 +02:00
  • 56b121e6d6
    Add LLaMA 2 to list of supported models niansa/tuxifan 2023-07-28 01:56:40 +02:00
  • cd4a8cd28c
    llama.cpp : better memory usage prints with allocator slaren 2023-07-28 00:36:48 +02:00
  • 0c43a3b7d8
    gitignore *.gguf M. Yusuf Sarıgöz 2023-07-28 00:07:28 +03:00
  • 8e62d2b214
    rm example.gguf M. Yusuf Sarıgöz 2023-07-28 00:06:47 +03:00
  • 62f4926bde
    fix : fix errors upd writing example M. Yusuf Sarıgöz 2023-07-28 00:04:19 +03:00
  • 7c529cede6
    convert.py : Update to support 70B HF format model files (#2427) mj-shifu 2023-07-27 22:39:17 +02:00
  • 9411250564
    refactor : rm unused import and upd todos M. Yusuf Sarıgöz 2023-07-27 23:25:47 +03:00
  • bb54d1700e
    GGUF : Support writing tensors in Python M. Yusuf Sarıgöz 2023-07-27 23:09:53 +03:00
  • 464192b9be
    WIP: Write tensor M. Yusuf Sarıgöz 2023-07-27 22:25:04 +03:00
  • 9442c34f49
    convert.py : shorten and simplify permute Maximilian Markewitz 2023-07-27 20:59:43 +02:00
  • 01d16e1a1e
    convert.py : fix of type and shorter code Maximilian Markewitz 2023-07-27 20:03:43 +02:00
  • e15a67d6b2
    convert.py : fix llama 2 70b conversion from Huggingface Maximilian Markewitz 2023-07-27 19:16:58 +02:00
  • 966c069b3f
    llama.cpp : fix embeddings input slaren 2023-07-27 19:03:31 +02:00
  • ba0ab56b63
    llama.cpp : fix embeddings output slaren 2023-07-27 18:54:06 +02:00
  • e592a17a75
    ggml : refactor ggml_view_Nd into ggml_view_tensor_offset slaren 2023-07-27 18:40:52 +02:00
  • e39e62ba4a
    replace n_views and n_children in ggml_tensor with a hash table in the allocator slaren 2023-07-27 18:34:21 +02:00
  • af7bd42b2a
    llama.cpp : free allocator when deleting context, cleanup slaren 2023-07-27 18:02:53 +02:00
  • 64584d56a7
    ggml : don't calculate data pointer of unallocated tensors when creating a view with an offset slaren 2023-07-27 17:46:05 +02:00
  • f67179aaf2
    add list of ops that support in-place slaren 2023-07-27 16:11:32 +02:00
  • 8fa548377a
    allow using the allocator with opencl slaren 2023-07-27 12:18:03 +02:00
  • 8afe392398
    fix mpi build slaren 2023-07-27 12:15:49 +02:00
  • 598a9ada8f
    adjust buffer size to account for alignment slaren 2023-07-27 12:14:51 +02:00
  • 768ecfcc28
    ggml : add graph tensor allocator slaren 2023-07-26 17:13:58 +02:00
  • ca4650afdb
    common.cpp : Change default param klosax 2023-07-27 16:48:53 +02:00
  • 90b2ce3549
    common.h : change default param value klosax 2023-07-27 16:46:18 +02:00
  • 01bdda2574
    Update index.html JackJollimore 2023-07-27 11:35:17 -03:00
  • d2bb3ac10b
    convert.py : remove GGML vocab + other obsolete stuff Georgi Gerganov 2023-07-27 16:36:35 +03:00
  • 68f53485e4
    convert.py : start a new simplified implementation by removing old stuff Georgi Gerganov 2023-07-27 15:56:53 +03:00
  • 158be8f7f4
    gguf.py : some code style changes Georgi Gerganov 2023-07-27 15:37:06 +03:00
  • d2b6ca13ad
    gguf : add array support Georgi Gerganov 2023-07-27 14:53:07 +03:00
  • a55b102e21
    Update index.html.hpp by running ./deps.sh Ebrahim Byagowi 2023-07-27 13:36:37 +03:30
  • 6500d95af4
    Merge 5f04a5d877 into 1a941869cb Howard Su 2023-07-27 09:11:44 +00:00
  • efb5dac337
    supporting more diverse tokenizers eric8607242 2023-07-27 16:59:42 +08:00
  • d89533dff6
    gguf : expose the gguf_type enum through the API for now Georgi Gerganov 2023-07-27 11:10:34 +03:00
  • 1a941869cb
    metal : disable graph concurrency optimization due to bug (#2413) master-1a94186 Georgi Gerganov 2023-07-27 11:00:54 +03:00
  • a6c25ebf3e
    supporting more diverse tokenizers eric.huang 2023-07-27 15:37:14 +08:00