Commit graph

  • 42d9b4cfc2 store optimizer state in training checkpoint and add learning schedule xaedes 2023-05-21 21:36:04 +02:00
  • 37c69435f0 print suppressed newline tokens as string "\n" xaedes 2023-05-21 21:17:46 +02:00
  • 93eb8f7752 add forward function without using cache, for more performant training xaedes 2023-05-21 21:14:49 +02:00
  • 2afd218479 fix bug in llama_sample_token_mirostat_v2 xaedes 2023-05-21 21:12:10 +02:00
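For background on the sampler touched by the fix above: mirostat v2 truncates candidates whose surprise exceeds a running threshold `mu`, samples from the rest, and adjusts `mu` toward a target surprise `tau`. A minimal Python sketch of that rule from the mirostat paper — not the llama.cpp implementation, and it does not reproduce the specific bug; the `tau`/`eta` defaults are illustrative:

```python
import math
import random

def mirostat_v2_sample(probs, mu, tau=5.0, eta=0.1, rng=random):
    """One mirostat v2 step: drop tokens whose surprise exceeds mu,
    sample from the renormalized rest, then adjust mu toward tau."""
    # Surprise of token i is -log2(p_i).
    kept = [(i, p) for i, p in enumerate(probs) if -math.log2(p) <= mu]
    if not kept:
        # Always keep at least the most probable token.
        best = max(range(len(probs)), key=lambda i: probs[i])
        kept = [(best, probs[best])]
    # Sample from the truncated, renormalized distribution.
    total = sum(p for _, p in kept)
    r = rng.random() * total
    for i, p in kept:
        r -= p
        if r <= 0:
            break
    observed = -math.log2(probs[i])   # surprise of the sampled token
    mu = mu - eta * (observed - tau)  # move mu toward the target tau
    return i, mu
```

Each call prunes high-surprise candidates, samples, and nudges `mu` so the observed surprise tracks `tau` over the generated sequence.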
  • ec1783c3e0 add ggml_opt_context, so that we can properly resume training xaedes 2023-05-21 15:16:07 +02:00
  • 6fc5f17e21 detect NUMA systems and pin work threads to nodes (linux) zrm 2023-05-21 14:09:52 -04:00
  • 7e4ea5beff examples : add server example with REST API (#1443) master-7e4ea5b Steward Garcia 2023-05-21 11:51:18 -06:00
  • 2257f9f691 Remove trailing space Howard Su 2023-05-21 23:03:36 +08:00
  • fea84c3cf5 fix for stupid msvc compiler Concedo 2023-05-21 22:41:33 +08:00
  • 80f1faac87 format fix Howard Su 2023-05-21 22:31:19 +08:00
  • 006d5707e8 Support V3 format upgrade Howard Su 2023-05-21 22:14:27 +08:00
  • 7780e4f479 make : .PHONY clean (#1553) master-7780e4f Stefan Sydow 2023-05-21 16:03:44 +02:00
  • b16c085c49 examples : fix benchmark-matmult Georgi Gerganov 2023-05-21 16:56:33 +03:00
  • f0f6824994 fix make clean Stefan Sydow 2023-05-21 15:50:54 +02:00
  • 60e0c67874 fix compile errors on cuda Concedo 2023-05-21 21:13:17 +08:00
  • 1eee9255e7 add missing default parameters for adam optimizer xaedes 2023-05-21 15:03:51 +02:00
  • 33528f5b1d fix for cublas Concedo 2023-05-21 21:03:36 +08:00
  • 994be9a4db fix for cublas Concedo 2023-05-21 21:02:21 +08:00
  • 57c2f4f909 fix random weight initialization scale xaedes 2023-05-21 12:18:47 +02:00
  • 96514971dd use inplace operations in cross_entropy_loss xaedes 2023-05-21 12:17:57 +02:00
  • 24127ebf98 updated lite, fixed some encoding issues Concedo 2023-05-21 17:29:00 +08:00
  • 265db9834e ggml : output 3d sizes in ggml_graph_dump_dot() master-265db98 Georgi Gerganov 2023-05-21 11:56:23 +03:00
  • 10cbc311e3 Support more data types Howard Su 2023-05-18 09:49:25 +08:00
  • d521d09380 Support Q4_1 Howard Su 2023-05-17 23:42:17 +08:00
  • b8d69650dc Upgrade v1 format to v2 by leveraging quantize Howard Su 2023-05-17 23:39:39 +08:00
  • 18e9dd87da Explicitely set GEMM type 0cc4m 2023-05-21 08:34:17 +02:00
  • b6b39960c0 Use compile args for preprocessing constants 0cc4m 2023-05-21 08:17:17 +02:00
  • a1657d0233 Add OpenCL compile options 0cc4m 2023-05-19 21:18:57 +02:00
  • e41a7ae40c Fix convert_row_f16 kernel issue 0cc4m 2023-05-18 08:05:19 +02:00
  • 457eff920e Deduplicate dequant kernels 0cc4m 2023-05-18 07:35:40 +02:00
  • 42e1a2ba3d Fix tensor load to device 0cc4m 2023-05-16 18:49:49 +02:00
  • cda2d488f9 Fix error in convert f16 to f32 kernel call 0cc4m 2023-05-16 13:05:33 +02:00
  • 915d0d1168 Generate dequant_mul_mat kernels from simple templates 0cc4m 2023-05-16 07:42:01 +02:00
  • 1968380373 Fix CMakeLists.txt 0cc4m 2023-05-15 19:51:23 +02:00
  • cb588e2aa4 Add remaining dequant_mul_mat functions 0cc4m 2023-05-14 22:19:54 +02:00
  • 8c7a7cea2e Fix dequant_mul_mat kernel 0cc4m 2023-05-14 21:26:07 +02:00
  • 5f610c90bf Fix bugs in dequant_mul_mat code 0cc4m 2023-05-14 21:14:05 +02:00
  • 17e53dbb7e Refactor OpenCL code to work more like the CUDA code, add missing functions 0cc4m 2023-05-14 17:01:46 +02:00
  • a7e3bee4cc Move back to C++ for OpenCL 0cc4m 2023-05-14 17:00:37 +02:00
  • 651f50f6ca merge-hf-and-lora-to-hf.py FNsi 2023-05-21 11:28:00 +08:00
  • 5dbdc65700 merge-hf-and-lora-to-hf.py FNsi 2023-05-21 11:19:18 +08:00
  • 28bec1eb25 merge-hf-and-lora-to-hf.py FNsi 2023-05-21 11:17:39 +08:00
  • d892edcf7d Update merge-hf-and-lora-to-hf.py FNsi 2023-05-21 11:14:52 +08:00
  • 84d3432a98 Rename merge-HF-and-lora-to-HF.py to merge-hf-and-lora-to-hf.py FNsi 2023-05-21 11:09:29 +08:00
  • e970d41095 Update and rename merge.py to merge-HF-and-lora-to-HF.py FNsi 2023-05-21 11:09:08 +08:00
  • 600ace39c8 update warp size Henri Vasserman 2023-05-20 23:42:20 +03:00
  • b19fefef94 Forwardcompat Henri Vasserman 2023-05-20 23:28:08 +03:00
  • 75e4548821 missed out gpt2 Concedo 2023-05-21 01:44:47 +08:00
  • 2ead735f08 initial integration completed Concedo 2023-05-21 01:29:20 +08:00
  • d6123f738a Merge commit 'ea600071cb' into concedo_experimental Concedo 2023-05-21 01:27:27 +08:00
  • fab49c685e ggml : update WASM SIMD master-fab49c6 Georgi Gerganov 2023-05-20 20:00:41 +03:00
  • d418146535 fixed a token decoding bug Concedo 2023-05-21 00:53:20 +08:00
  • d1824f1e88 Merge branch 'master' into concedo_experimental Concedo 2023-05-21 00:30:06 +08:00
  • 5032e0fd64 trying to fix ggjt v3 Concedo 2023-05-21 00:29:50 +08:00
  • c048bcfec4 remove old filever checks (+7 squashed commit) Concedo 2023-05-20 16:47:44 +08:00
  • c66115b833 Merge 'origin/master' into hipblas Henri Vasserman 2023-05-20 18:29:31 +03:00
  • b8ee340abe feature : support blis and other blas implementation (#1536) master-b8ee340 Zenix 2023-05-20 23:58:31 +09:00
  • 9ecb30f959 OpenCL: Fixes for older devices. (#1435) master-9ecb30f Henri Vasserman 2023-05-20 17:57:39 +03:00
  • 6b5a4ab957 Fix: blas changes on ci zenix 2023-05-20 22:03:33 +09:00
  • c29378e5a8 clang-tidi Henri Vasserman 2023-05-20 16:03:25 +03:00
  • 29cf5596fe llama : define magic numbers as integer constants (#1518) (#1520) master-29cf559 Juuso Alasuutari 2023-05-20 15:58:15 +03:00
  • ef17d99f65 implement AdamW in ggml_opt_adam by adding weight decay parameter (default 0.001f) xaedes 2023-05-20 14:54:40 +02:00
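The AdamW commit above adds decoupled weight decay to the Adam update: the decay term is applied directly to the weights rather than folded into the gradient. A minimal NumPy sketch of that update rule — not the ggml code; the quadratic objective and step count are illustrative:

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, wd=0.001):
    """One AdamW update: Adam moment estimates plus decoupled weight decay."""
    m = b1 * m + (1 - b1) * g        # first moment (running mean of gradients)
    v = b2 * v + (1 - b2) * g * g    # second moment (running mean of squares)
    m_hat = m / (1 - b1 ** t)        # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    # AdamW: weight decay acts on w directly, outside the adaptive step.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
    return w, m, v

# Minimize f(w) = 0.5 * ||w||^2, whose gradient at w is simply w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 501):
    w, m, v = adamw_step(w, w.copy(), m, v, t)
print("final |w| =", np.linalg.norm(w))
```

Plain Adam with L2 regularization would instead add `wd * w` to `g` before the moment updates; decoupling it keeps the decay strength independent of the adaptive scaling.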
  • c69f0cd6e4 Define magic numbers as integer constants (#1518) Juuso Alasuutari 2023-05-20 15:54:03 +03:00
  • f4e9ce7998 enable gradient propagation for inplace add1 and scale operations xaedes 2023-05-20 14:49:19 +02:00
  • a6aafdd719 add ggml_add1_inplace to header xaedes 2023-05-20 14:47:56 +02:00
  • 3de84b2606 ggml : add ggml_clamp() (#1539) master-3de84b2 Georgi Gerganov 2023-05-20 15:34:45 +03:00
  • 71ac58ae53 make clang-tidy happy Henri Vasserman 2023-05-20 15:29:26 +03:00
  • ad9ab0e3fe editorconfig fixes Henri Vasserman 2023-05-20 15:27:24 +03:00
  • 4f97f73db2 fix indexing issue Henri Vasserman 2023-05-20 15:21:38 +03:00
  • affc76edfd cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483) master-affc76e Johannes Gäßler 2023-05-20 14:19:28 +02:00
  • 37f2c6c251 Add forgotten fclose() JohannesGaessler 2023-05-20 14:16:40 +02:00
  • a4da072d39 llama : fix vram size computation Georgi Gerganov 2023-05-20 15:14:53 +03:00
  • 1de31d55cb Fixed llama_set_state_data declaration not matching definition niansa/tuxifan 2023-05-20 14:12:41 +02:00
  • fadcd583fc Attempt clang-tidy fix JohannesGaessler 2023-05-20 14:07:46 +02:00
  • c9cef3917d ggml : indentation Georgi Gerganov 2023-05-20 15:02:22 +03:00
  • e4640eec70 Merge 'origin/master' into clfixes Henri Vasserman 2023-05-20 15:01:41 +03:00
  • 23467680f6 ggml : add ggml_clamp() Georgi Gerganov 2023-05-20 14:59:34 +03:00
  • e71bba90b8 rewrite platform selection code. Henri Vasserman 2023-05-20 14:58:33 +03:00
  • b81f662e9d Loop in llama.cpp, fixed progress callback JohannesGaessler 2023-05-20 13:42:19 +02:00
  • fee87f6558 gg rebase fixup JohannesGaessler 2023-05-20 13:27:21 +02:00
  • 909acb3e3f Merge branch 'master' into gpu-norms Georgi Gerganov 2023-05-20 13:26:16 +03:00
  • a3586c526f cmake : workarounds for cufile when CMake version < 3.25 Georgi Gerganov 2023-05-20 13:22:47 +03:00
  • 3ec7941bad ggml : ggml_mul better broadcast support Georgi Gerganov 2023-05-20 13:09:01 +03:00
  • f67bc3c363 llama : code style fixes + progress print fix Georgi Gerganov 2023-05-20 12:29:08 +03:00
  • ffe9652bc1 GPU weights not in RAM, direct loading with cuFile JohannesGaessler 2023-05-17 16:35:50 +02:00
  • 977e74d70e Revert "feature : add blis and other BLAS implementation support (#1502)" Georgi Gerganov 2023-05-20 12:03:48 +03:00
  • 667c57f11a feature : add blis and other BLAS implementation support (#1502) Zenix 2023-05-20 18:02:48 +09:00
  • 54ec8a963b llama : add llama_init_backend() API (close #1527) Georgi Gerganov 2023-05-20 11:06:11 +03:00
  • f401d5ffa2 Fix for mingw (#1462) DannyDaemonic 2023-05-20 00:40:02 -07:00
  • df512bbb49 llama : fix name shadowing and C4146 (#1526) Maxime 2023-05-20 09:22:37 +02:00
  • f14673ad56 llama : fix compile warnings in llama_set_state_data() Georgi Gerganov 2023-05-20 10:14:31 +03:00
  • 9a7af6c2a5 ggml : fix scalar implementation of Q4_1 dot Georgi Gerganov 2023-05-20 10:13:19 +03:00
  • 211aa6aff0 ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508) Georgi Gerganov 2023-05-19 22:17:18 +03:00
  • 9fd8187215 tests : add missing header Georgi Gerganov 2023-05-19 21:17:28 +03:00
  • 0226d491af examples : add persistent chat (#1495) Evan Jones 2023-05-19 13:39:51 -04:00
  • c51c64a8fe main : make reverse prompt option act as a stop token in non-interactive mode (#1032) Jason McCartney 2023-05-19 10:24:59 -07:00
  • 75c017fc5a readme : adds WizardLM to the list of supported models (#1485) David Kennedy 2023-05-19 13:16:30 -04:00
  • 6b5776b0a7 minor : fix compile warnings Georgi Gerganov 2023-05-19 20:14:51 +03:00
  • e22541a49e make kv_f16 the default for api users (#1517) Erik Scholz 2023-05-18 19:31:01 +02:00
  • a94b334591 Fixes #1511 lambda issue for w64devkit (mingw) (#1513) DannyDaemonic 2023-05-18 10:30:40 -07:00