Commit graph

  • 7a87d31f4f [main] fix infinite generation (-n == -1) (#523) master-7a87d31 anzz1 2023-03-26 16:06:10 +03:00
  • e2152356dc Update README and comments for standalone perplexity tool Stephan Walter 2023-03-26 14:59:13 +02:00
  • 7312191d1f Update README and comments for standalone perplexity tool Stephan Walter 2023-03-26 14:59:13 +02:00
  • d4fd8ccbf8 [main] fix infinite generation (-n == -1) anzz1 2023-03-26 15:35:01 +03:00
  • 8e1fb49abf Merge branch 'ggerganov:master' into master R.Kaufmann 2023-03-26 13:00:52 +02:00
  • 348d6926ee Add logo to README.md Georgi Gerganov 2023-03-26 10:20:49 +03:00
  • 053b20c8ca merged complete Concedo 2023-03-26 14:55:43 +08:00
  • 33b5d2c376 Merge branch 'master' into concedo Concedo 2023-03-26 14:52:14 +08:00
  • 57474944d6 Merge branch 'master' into concedo Concedo 2023-03-26 14:52:08 +08:00
  • 33e35b8fe8 Exit from interactive mode if input stream is bad (#491) master-33e35b8 Harald Fernengel 2023-03-26 07:25:46 +02:00
  • 7dca16bcfd Add AVX2 implementation of quantize_row_q4_1 Slaren 2023-03-26 01:01:16 +01:00
  • 1582a04085 Exit from interactive mode if input stream is bad Harald Fernengel 2023-03-25 12:48:39 +01:00
  • 7a7e0acf1a Merge branch 'ggerganov:master' into master R.Kaufmann 2023-03-26 00:23:18 +01:00
  • 633dec89af Add platform to versioned builds Juan Calderon-Perez 2023-03-25 19:18:10 -04:00
  • 840645dea7 trace logits to a file Maël Kerbiriou 2023-03-24 21:59:07 +01:00
  • bbf5b04a95 Add support for linux/arm64 platform Juan Calderon-Perez 2023-03-25 19:15:41 -04:00
  • 19726169b3 CI: Run other sanitizer builds even if one fails (#511) master-1972616 anzz1 2023-03-26 00:13:28 +02:00
  • f732695cd5 Clarify console output in convert-pth-to-ggml.py (#512) jp-x-g 2023-03-25 14:53:55 -07:00
  • a157fe69ca Clarify console output in convert-pth-to-ggml.py jp-x-g 2023-03-25 14:50:25 -07:00
  • aca5a9e74d CI: Run other sanitizer builds even if one fails anzz1 2023-03-25 23:44:47 +02:00
  • 2f7bf7dd7c CMake / CI additions (#497) master-2f7bf7d anzz1 2023-03-25 23:38:11 +02:00
  • 43523220a4 Remove perplexity from main Gary Linscott 2023-03-25 13:33:42 -07:00
  • 7392ad629d update from merge Gary Linscott 2023-03-25 13:30:40 -07:00
  • 34ab526843 (Windows) Set console to UTF-8 on init (#420) master-34ab526 anzz1 2023-03-25 22:29:22 +02:00
  • c3d3cd2d45 Merge branch 'master' into batch_perplexity Gary Linscott 2023-03-25 13:24:22 -07:00
  • 098eb922b8 Merge branch 'ggerganov:master' into master R.Kaufmann 2023-03-25 21:23:56 +01:00
  • c3dc4dbb1e merge master anzz1 2023-03-25 22:20:45 +02:00
  • c2b25b6912 Fix colors enabling on WIN32 master-c2b25b6 Georgi Gerganov 2023-03-25 21:53:39 +02:00
  • 79b2b266db If n_predict == -1, generate forever Georgi Gerganov 2023-03-25 21:51:41 +02:00
  • 004fddfed7 CI: Add sanitizer build (Ubuntu) anzz1 2023-03-25 21:44:27 +02:00
  • 779c37f916 cmake: make sanitizers linking #468 anzz1 2023-03-25 21:39:05 +02:00
  • e2d490dafd Inifinite generation via context swapping (#71) Georgi Gerganov 2023-03-25 21:36:22 +02:00
  • 2d27013343 test avx-512f only when possible anzz1 2023-03-25 21:28:31 +02:00
  • 03f7e33560 Cleanup STL headers + fix embedding examples + minor stuff master-03f7e33 Georgi Gerganov 2023-03-25 20:51:14 +02:00
  • 55ad42af84 Move chat scripts into "./examples" Georgi Gerganov 2023-03-25 20:36:52 +02:00
  • 459e93cce0 Add AVX2 implementation of dequantize_row_q4_1 (#505) master-459e93c slaren 2023-03-25 19:31:48 +01:00
  • a316a425d0 Overhaul the examples structure master-a316a42 Georgi Gerganov 2023-03-25 20:26:40 +02:00
  • 70ff2062df Add AVX2 implementation of dequantize_row_q4_1 Slaren 2023-03-25 19:02:07 +01:00
  • ecbe466a36 Retire the ggml_mul_mat() branch for transposed src0 (#500) master-ecbe466 Georgi Gerganov 2023-03-25 19:47:21 +02:00
  • face8082ea SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502) Georgi Gerganov 2023-03-25 19:31:53 +02:00
  • b83ddbd768 Fix dequantization - forgot to interleave the quants Georgi Gerganov 2023-03-25 19:31:23 +02:00
  • c2916bb4a0 disable avx512 test on runner for now anzz1 2023-03-25 19:09:34 +02:00
  • a52f6b43b7 CI: option 2 anzz1 2023-03-25 18:54:28 +02:00
  • 04be5b0ba4 Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON Georgi Gerganov 2023-03-25 18:40:13 +02:00
  • d234a8643f Merge branch 'ggerganov:master' into patch-1 RSereno 2023-03-25 16:01:53 +00:00
  • 8daa71d958 Update tools.sh RSereno 2023-03-25 16:01:11 +00:00
  • 1e39d2bf77 Retire the ggml_mul_mat() for transposed src0 Georgi Gerganov 2023-03-25 17:55:31 +02:00
  • 2279cd25f7 Enable avx for Linux only if also fp16c available. R.Kaufmann 2023-03-25 16:54:42 +01:00
  • ea546b5f8d with logits_all == true, seek to the last logits vector Maël Kerbiriou 2023-03-25 14:58:57 +01:00
  • 502a400192 Disable prompt verbosity by default and add option to enable (#480) master-502a400 Georgi Gerganov 2023-03-25 17:16:50 +02:00
  • 09aecbf628 Add AVX2 implementation of dequantize_row_q4_0 (#467) master-09aecbf slaren 2023-03-25 16:06:49 +01:00
  • 4640eff23d Don't interefe with BLAS for large prompts by running only 1 thread master-4640eff Georgi Gerganov 2023-03-25 17:03:10 +02:00
  • ab77d76312 Add longer DAN prompt for testing big batch numbers Georgi Gerganov 2023-03-25 16:47:59 +02:00
  • 29b7baab67 Add timings for the prompt evaluation (#478) master-29b7baa slaren 2023-03-25 15:34:23 +01:00
  • 4a7129acd2 Remove obsolete information from README Georgi Gerganov 2023-03-25 16:30:32 +02:00
  • 43e1cf8693 CI: (Windows) Add AVX / AVX512 builds anzz1 2023-03-25 16:30:25 +02:00
  • d213f9d52b CMake: add AVX512 option anzz1 2023-03-25 16:28:09 +02:00
  • 6b6dbc8910 Remove obsolete assert and fix compiler warning master-6b6dbc8 Georgi Gerganov 2023-03-25 16:22:05 +02:00
  • 2a2e63ce05 Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS master-2a2e63c Georgi Gerganov 2023-03-25 16:09:54 +02:00
  • 4eae17153c Merge branch 'ggerganov:master' into patch-1 RSereno 2023-03-25 13:59:12 +00:00
  • e899bf54b2 bounds checking for input prefix (#492) master-e899bf5 anzz1 2023-03-25 14:42:09 +02:00
  • d37af8d7c0 bounds checking for input prefix anzz1 2023-03-25 14:07:23 +02:00
  • fbd4d38c64 feat: '--in-prefix STRING' option (#426) master-fbd4d38 anzz1 2023-03-25 14:03:19 +02:00
  • 58e6c9f36f Add support for file load progress reporting callbacks (#434) master-58e6c9f Jed Fox 2023-03-25 01:26:28 -04:00
  • 36d07532ef Add missing struct annotation (#483) master-36d0753 Doomsdayrs 2023-03-25 01:21:24 -04:00
  • 6f1ee4b640 Fix crash for 65B model with pre-allocated memory (#485) master-6f1ee4b Chris Kuehl 2023-03-24 23:38:14 -05:00
  • 8a339bd75c update gitignore Concedo 2023-03-25 11:23:40 +08:00
  • 3c78124aac Merge branch 'master' into concedo Concedo 2023-03-25 11:20:04 +08:00
  • 119392f6f2 defaulting to f32 kv, and 4 threads seem to produce better results Concedo 2023-03-25 11:11:40 +08:00
  • 506cd62638 changed some defaults to hopefully increase compatibility Concedo 2023-03-25 10:40:11 +08:00
  • b13a768813 added softprompt endpoint Concedo 2023-03-25 10:12:47 +08:00
  • 8347bede58 Add missing struct annotation Doomsdayrs 2023-03-24 20:25:50 -04:00
  • 5d909a377c enable sanitizers in linux ci Green Sky 2023-03-25 01:16:14 +01:00
  • fe5af95ef8 cmake: make sanitizers link Green Sky 2023-03-23 21:46:04 +01:00
  • 743ec9b221 Fix crash for 65B model with pre-allocated memory Chris Kuehl 2023-03-24 19:17:34 -05:00
  • 186ecfd8a4 Remove printing of prompt and prompt tokenization at startup Slaren 2023-03-24 23:46:02 +01:00
  • 8666c5aa43 Add timings for the prompt evaluation Slaren 2023-03-24 23:12:15 +01:00
  • 8520fc310e Disable BLAS altogether - the bug is not just for qunatized mat mul master-8520fc3 Georgi Gerganov 2023-03-24 23:47:06 +02:00
  • b3f460e941 Disable BLAS branch in mul_mat - seems there is a bug master-b3f460e Georgi Gerganov 2023-03-24 23:39:17 +02:00
  • 3a8e8b7a0f Fix typo Jed Fox 2023-03-24 17:28:34 -04:00
  • a8096f3d81 Merge branch 'master' into jed/spm Jed Fox 2023-03-24 17:27:46 -04:00
  • ae3d0ff68f Call progress callback more frequently Jed Fox 2023-03-24 17:26:19 -04:00
  • 1e3fd898a3 Merge branch 'master' into jed/load-progress Jed Fox 2023-03-24 17:25:38 -04:00
  • 04c6f5ed6f Immediately start processing the prompt before user input has been provided (#476) master-7a9b6c3 master-04c6f5e Georgi Gerganov 2023-03-24 23:17:58 +02:00
  • 7a9b6c3a8b Reduce memory usage and allocate enough memory for largest context (#473) Georgi Gerganov 2023-03-24 23:17:37 +02:00
  • 6feb572b36 Merge branch 'master' into mem-fix Georgi Gerganov 2023-03-24 23:17:19 +02:00
  • d0f7519338 Fix KV cache size for F32 Georgi Gerganov 2023-03-24 22:58:00 +02:00
  • d26a3994f4 Immediately start processing the prompt before user input has been provided Georgi Gerganov 2023-03-24 22:44:02 +02:00
  • 4aeee216fd Regroup q4_1 dot addition for better numerics. q4_1_more_accel Matvey Soloviev 2023-03-23 04:56:21 +01:00
  • 580991bbed Squeeze out about 5% more performance in Q4_1 inference Matvey Soloviev 2023-03-21 22:55:35 +01:00
  • 0b4e849a24 Fix number of layers in 30B and 65B Georgi Gerganov 2023-03-24 22:15:06 +02:00
  • 3634c312bc Reenable BLAS for quantized mul_mat Georgi Gerganov 2023-03-24 22:03:56 +02:00
  • ea60d2193a Simpler scratch buffer usage Georgi Gerganov 2023-03-24 21:41:47 +02:00
  • 9330ff0f35 Reduce memory usage and allocate enough memory for large contexts Georgi Gerganov 2023-03-24 18:22:48 +02:00
  • 8f2b6d222d Add AVX2 implementation of dequantize_row_q4_0 Slaren 2023-03-24 17:13:50 +01:00
  • 31572d9665 Temporary bump the memory buffer size - hopefully fix issues from 483bab2e master-31572d9 Georgi Gerganov 2023-03-24 18:23:56 +02:00
  • f4f5362edb Update README.md (#444) master-863f65e Gary Mulder 2023-03-24 15:23:09 +00:00
  • 863f65e2e3 fix instruct mode (#445) rabidcopy 2023-03-24 10:22:39 -05:00
  • afd220d9c6 Properly free llama_context on failure master-afd220d master-563cdc3 master-481044d Georgi Gerganov 2023-03-24 17:21:01 +02:00
  • 481044d50c additional optimizations for POWER9 (#454) Cameron Kaiser 2023-03-24 08:19:26 -07:00