Commit graph

  • 6a498f0d79 Remove torch GPU dependencies bsilvereagle 2023-03-31 15:51:41 -07:00
  • b09da81d52 Use getopts for example scripts Ben Siraphob 2023-03-31 17:37:33 -05:00
  • 3525899277 Enable -std= for cmake builds, fix warnings (#598) master-3525899 Stephan Walter 2023-03-31 19:19:16 +00:00
  • e80b06305d Enable -std= for cmake builds, fix warnings Stephan Walter 2023-03-29 17:44:04 +02:00
  • 7e30f52600 examples: add gpt4all script Leonardo Neumann 2023-03-31 15:51:20 -03:00
  • 2d2d61568c Show error message when -f fails Slaren 2023-03-31 20:03:48 +02:00
  • 41e8d2b434 Move constant out of loop Howard Su 2023-04-01 01:51:44 +08:00
  • 8febfc73af Fix inplace version of operators Howard Su 2023-04-01 01:26:48 +08:00
  • 6b86f5ea22 halfway refactoring, wip adding other model types Concedo 2023-04-01 01:13:05 +08:00
  • 61c6b1a4e0 Add comparison against reference implementation script, implement state & logits saving saharNooby 2023-03-31 20:23:42 +04:00
  • d00f28581a Add reference implementation of RWKV RNN saharNooby 2023-03-31 19:57:16 +04:00
  • 1d08882afa Optimize AVX2 ggml_vec_dot_q4_0 (#642) master-1d08882 slaren 2023-03-31 17:55:52 +02:00
  • fd2f59a03d Reviewer requests: added parameter for threads, switched to ggml_time_us() Sebastian Apel 2023-03-31 17:38:19 +02:00
  • 02c9946b57 Update README.md saharNooby 2023-03-31 19:06:31 +04:00
  • 01d667f066 Implement exp, max, 1_minus_x, sigmoid operators in ggml saharNooby 2023-03-31 19:04:35 +04:00
  • 3b7dcc0fc8 Bugfix: Added dependency to ggml.o to benchmark Sebastian Apel 2023-03-31 16:58:54 +02:00
  • 2877517fe8 Merge branch 'ggerganov:master' into master SebastianApel 2023-03-31 16:44:48 +02:00
  • bcf363cb53 Optimize model to leverage inplace to avoid create new tensor Howard Su 2023-03-31 22:42:09 +08:00
  • ed5f4fe00e Initial version of q4_0 matrix multiplication benchmark Sebastian Apel 2023-03-31 16:39:39 +02:00
  • fed6b5da76 Fix memory bugs in loading code Justine Tunney 2023-03-30 19:43:41 -07:00
  • 02c5b27e91 Add AVX acceleration (#617) master-02c5b27 perserk 2023-03-31 16:55:44 +05:00
  • 56949197fe added HF converter base Concedo 2023-03-31 19:10:21 +08:00
  • 17044257a0 Merge branch 'master' into concedo Concedo 2023-03-31 19:04:47 +08:00
  • 559a1967f7 Backwards compatibility formats all done Concedo 2023-03-31 19:01:33 +08:00
  • 9eab39fe6d prepare legacy functions (+1 squashed commits) Concedo 2023-03-31 16:37:39 +08:00
  • 40c7dd19e3 Use -march=native -mtune=native on x86. Also enables AVX512 on macOS. Fabian 2023-03-30 00:08:23 +02:00
  • cbef542879 py : cleanup the code Pavol Rusnak 2023-03-29 21:31:24 +02:00
  • 79f9743347 improved console info, fixed utf encoding bugs Concedo 2023-03-31 15:38:38 +08:00
  • fe272dc3d3 Minor changes saharNooby 2023-03-31 10:24:12 +04:00
  • 3e90f37626 Optimize AVX2 ggml_vec_dot_q4_0 Slaren 2023-03-31 02:51:49 +02:00
  • 1604abdad2 py : cleanup the code Pavol Rusnak 2023-03-29 21:31:24 +02:00
  • 9733104be5 drop quantize.py (now that models are using a single file) Pavol Rusnak 2023-03-31 00:52:06 +02:00
  • f4c4d29d72 drop quantize.py (now that models are using a single file) Pavol Rusnak 2023-03-31 00:52:06 +02:00
  • e968c80f5d Link with cblas when LLAMA_OPENBLAS is enabled. KerfuffleV2 2023-03-30 13:42:15 -06:00
  • 3df890aef4 readme : update supported models Georgi Gerganov 2023-03-30 22:31:54 +03:00
  • ee0c40dd6d Introduce GGML migration tool for new file format master-ee0c40d Justine Tunney 2023-03-30 05:42:56 -07:00
  • 6f23ba5ee2 Ensure --mlock works properly with mmap() support Justine Tunney 2023-03-30 01:53:36 -07:00
  • 78ca9838ee Make loading weights 10-100x faster Justine Tunney 2023-03-29 13:51:37 -07:00
  • a017390358 Initial windows support (untested) Slaren 2023-03-29 22:22:36 +02:00
  • ac184d5147 Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +02:00
  • 276e5b7811 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +02:00
  • d68c5dc435 Make mmap_file static Slaren 2023-03-29 06:18:18 +02:00
  • 64bde3ffd4 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +02:00
  • c03ae8dca1 Add mmap support for model files Slaren 2023-03-29 02:03:43 +02:00
  • 516474b465 Introduce GGML migration tool for new file format Justine Tunney 2023-03-30 05:42:56 -07:00
  • 85e8395944 SWAP info added Jaime R 2023-03-30 21:25:21 +03:00
  • 3bcc129ba8 cmake : properly invoke CTest (#629) master-3bcc129 Stephan Walter 2023-03-30 17:56:59 +00:00
  • a4755cf288 Remove unused variable (#607) master-a4755cf Casey Primozic 2023-03-30 10:53:35 -07:00
  • 1f0414feec make : fix darwin f16c flags check (#615) master-1f0414f david raistrick 2023-03-30 13:34:45 -04:00
  • 77efdf5a50 ggml : fix NEON signs (close #620, #622) master-77efdf5 Georgi Gerganov 2023-03-30 20:27:32 +03:00
  • 44aea7752b Properly invoke CTest Stephan Walter 2023-03-30 19:06:54 +02:00
  • 80dad7923e ggml : refactor AVX part of ggml_vec_dot_q4_0() Sergey Pershukov 2023-03-30 21:44:34 +05:00
  • 93c8dcae75 Update README.md saharNooby 2023-03-30 20:37:09 +04:00
  • 56bf4fc856 Implement time mixing, fix matrix shape mismatch saharNooby 2023-03-30 20:29:41 +04:00
  • 873cb954d0 Make ln0 work correctly saharNooby 2023-03-30 20:01:26 +04:00
  • 9bbf2180a8 Merge branch 'ggerganov:master' into black Siddhesh Thakur 2023-03-30 10:15:42 -04:00
  • 2f51451561 Initial commit saharNooby 2023-03-30 17:55:30 +04:00
  • ed3c680bcd Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) master-ed3c680 slaren 2023-03-30 11:16:30 +02:00
  • b223ef18b8 Fix GGML_F32Cx8_STORE in AVX without F16C path Slaren 2023-03-30 10:57:53 +02:00
  • a45e843efa Ensure --mlock works properly with mmap() support Justine Tunney 2023-03-30 01:53:36 -07:00
  • 75d1e55134 Make loading weights 10-100x faster Justine Tunney 2023-03-29 13:51:37 -07:00
  • 93a3169284 ggml : add AVX ggml_vec_dot_q4_0() Sergey Pershukov 2023-03-30 09:50:40 +05:00
  • 79e14129e1 ggml : add AVX quantize_row_q4_0() Sergey Pershukov 2023-03-30 09:43:27 +05:00
  • 354d4f232f fixed linux openblas build errors Concedo 2023-03-30 11:55:35 +08:00
  • 977a9a246f Merge remote-tracking branch 'origin/master' into concedo Concedo 2023-03-30 09:42:51 +08:00
  • 0f5b470c04 more library checks Concedo 2023-03-30 09:28:04 +08:00
  • 4bab8cb243 fix darwin f16c flags check david raistrick 2023-03-29 20:31:18 -04:00
  • 80c2178d04 Initial windows support (untested) Slaren 2023-03-29 22:22:36 +02:00
  • 812cfa1995 Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +02:00
  • 4daaa5e792 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +02:00
  • 4ae12d0824 Make mmap_file static Slaren 2023-03-29 06:18:18 +02:00
  • a1e0f17a05 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +02:00
  • 2a6cef62b3 Add mmap support for model files Slaren 2023-03-29 02:03:43 +02:00
  • dc5adf173a Windows: convert prompt in system locale to UTF-8. Allows to use others languages without tambourine dancing... Dmitriy Prikhodko 2023-03-30 04:22:45 +05:00
  • 81b8748c98 blacked flake.lock Geeks-sid 2023-03-29 19:17:45 -04:00
  • 172febff3f blacked quantize Geeks-sid 2023-03-29 19:16:48 -04:00
  • 382c0c6100 blacked unversioned-ggml-to-ggml Geeks-sid 2023-03-29 19:16:32 -04:00
  • efab7f8bad blacked gptq-to-ggml Geeks-sid 2023-03-29 19:16:04 -04:00
  • eec105a86a blacked gpt4all-to-ggml Geeks-sid 2023-03-29 19:15:37 -04:00
  • dfa2d707e9 apply black to ggml-to-pth Geeks-sid 2023-03-29 19:15:06 -04:00
  • 842abc7da9 Remove unused variable Casey Primozic 2023-03-29 14:11:29 -07:00
  • 9cbc404ba6 ci : re-enable AVX512 testing (Windows-MSVC) (#584) master-9cbc404 anzz1 2023-03-29 23:44:39 +03:00
  • f789d8d50a Initial windows support (untested) Slaren 2023-03-29 22:22:36 +02:00
  • b51c717d5c ggml : init time on first ggml_init() call master-b51c717 Georgi Gerganov 2023-03-29 22:15:34 +03:00
  • 0ba76c1e73 llama : fix compile warnings when reading the vocab master-0ba76c1 Georgi Gerganov 2023-03-29 22:13:12 +03:00
  • cea1c85948 ggml : add ARM_NEON dequantize_row_q4_1() master-cea1c85 Georgi Gerganov 2023-03-29 22:10:01 +03:00
  • f202ada131 ggml : add ARM_NEON quantize_row_q4_1() master-f202ada Georgi Gerganov 2023-03-29 22:03:02 +03:00
  • 3b44d30d9b ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +03:00
  • 61cbfff5c9 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +02:00
  • d9ad104440 Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +02:00
  • 536971bade Apply suggestions from code review anzz1 2023-03-29 20:20:02 +03:00
  • c447de6937 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py Pavol Rusnak 2023-03-29 19:06:19 +02:00
  • d8febc8653 renamed main python script Concedo 2023-03-30 00:48:44 +08:00
  • 664b277c27 integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed. Concedo 2023-03-30 00:43:52 +08:00
  • b467702b87 readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +03:00
  • 516d88e75c readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +03:00
  • 53635c081c py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +03:00
  • 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +02:00
  • a6956b25a1 add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +02:00
  • 83df5639eb Fix GCC warning about binary literal (#595) master-83df563 anzz1 2023-03-29 16:20:07 +03:00