Commit graph

  • 3d650d0e25 remove dependency of psutil, fixed compile error on WSL, handle exceptions when sending http response, added multiline for embedded kobold Concedo 2023-04-06 11:08:19 +08:00
  • 79ed023891 Do not crash when it has nothing to say. Sergey Alirzaev 2023-04-06 00:57:31 +02:00
  • e65e832082 ADD libllama.so target for llama-cpp-python Brendan Hubble 2023-04-06 08:56:53 +10:00
  • 41d4a863c9 Remove "internal" header files Håkon H. Hitland 2023-04-05 22:18:58 +02:00
  • 4778f93611 Merge branch 'master' into eval-thread-count ml6 2023-04-05 12:44:50 -07:00
  • 36ddd12924 llama : add flash attention (demo) flash-attn Georgi Gerganov 2023-04-05 18:28:01 +03:00
  • eeaa7b0492 ggml : multi-thread ggml_rope() (~3-4 times faster on M1) (#781) master-eeaa7b0 Georgi Gerganov 2023-04-05 22:11:03 +03:00
  • 372b70c39d ggml : multi-thread ggml_rope() (~3-4 times faster on M1) Georgi Gerganov 2023-04-05 19:15:21 +03:00
  • 986b6ce9f9 ggml, llama : avoid heavy V transpose + improvements (#775) master-986b6ce Georgi Gerganov 2023-04-05 22:07:33 +03:00
  • 754b5f30e5 fixed the git repo name on the usage instructions Emile 2023-04-05 19:50:07 +02:00
  • 1c4c634db4 Make docker instructions more explicit Pavol Rusnak 2023-04-05 19:20:04 +02:00
  • 3416298929 Update README.md Georgi Gerganov 2023-04-05 19:54:30 +03:00
  • 27d6178698 Add examples/common.cpp in CMake library 'llama' Locria Cyber 2023-04-05 16:37:40 +00:00
  • 5a8c4f6240 llama : define non-positive top_k; top_k range check (#779) master-5a8c4f6 Ivan Stepanov 2023-04-05 19:20:05 +03:00
  • 724e5ec124 minor : brackets Georgi Gerganov 2023-04-05 19:19:04 +03:00
  • ff05d05c96 miku.sh : add executable bit (#780) at8u 2023-04-05 15:59:13 +00:00
  • 62b3e81aae media : add logos and banners Georgi Gerganov 2023-04-05 18:58:06 +03:00
  • 8d10406d6e readme : change logo + add bindings + add uis + add wiki Georgi Gerganov 2023-04-05 18:56:20 +03:00
  • a52f51ada2 Add executable bit to Miku.sh at8u 2023-04-05 16:42:45 +01:00
  • c6479a3cda Define non-positive top_k; top_k range check Ivan Stepanov 2023-04-05 18:40:17 +03:00
  • ed1c214e66 zig : add build.zig (#773) iacore 2023-04-05 15:06:02 +00:00
  • 0c44427df1 make : missing host optimizations in CXXFLAGS (#763) master-0c44427 Ivan Stepanov 2023-04-05 17:38:37 +03:00
  • 594cc95fab readme : update with CMake and windows example (#748) Adithya Balaji 2023-04-05 16:36:12 +02:00
  • 88ed5761b8 examples : add Miku.sh (#724) at8u 2023-04-05 14:32:42 +00:00
  • 1868f6c84f ggml, llama : avoid heavy V transpose + improvements Georgi Gerganov 2023-04-05 17:04:16 +03:00
  • 6f171ec28d Add build.zig Locria Cyber 2023-04-05 11:39:10 +00:00
  • d12088e164 Minor formatting changes saharNooby 2023-04-05 15:31:23 +04:00
  • 65c0af359f Build static lib Locria Cyber 2023-04-05 11:18:55 +00:00
  • 58c438cf7d Add Accelerate/BLAS when using Swift (#765) Andrew Duffy 2023-04-05 11:44:24 +01:00
  • 5c1920df43 why nobody ever told me the makefile doesnt work outside x86 xD Concedo 2023-04-05 17:15:42 +08:00
  • 3415e292b1 Update Package.swift Andrew Duffy 2023-04-05 09:36:28 +01:00
  • 1490cdd71d change GPT-J and GPT2 KVs to use fp16 instead Concedo 2023-04-05 15:53:07 +08:00
  • 86286cb318 Fix magic in convert-gptq-to-ggml.py Pavol Rusnak 2023-04-05 09:23:01 +02:00
  • 57e9f929ee renamed misnamed ACCELERATE define, and removed all -march=native and -mtune=native flags Concedo 2023-04-05 15:22:13 +08:00
  • 63cfa43200 quantize-stats: add option to test against reference quantization Håkon H. Hitland 2023-04-05 03:30:23 +02:00
  • b1fa386c11 Update ggml.c Sylvie 2023-04-04 21:50:15 +02:00
  • dc679bf971 Merge pull request #14 from hypnopump/update_macos Alexander 2023-04-04 21:42:45 +05:00
  • d3801340f3 streaming output hypnopump 2023-04-04 18:27:14 +02:00
  • a9cb9adfd6 streaming output hypnopump 2023-04-04 18:27:04 +02:00
  • c320573b5e verify instructions can be followed hypnopump 2023-04-04 17:45:55 +02:00
  • f5feb7470b verify instructions can be followed hypnopump 2023-04-04 17:45:06 +02:00
  • b75a805563 working on macos. no point in fp32 if all weights distributed in fp16 hypnopump 2023-04-04 17:39:21 +02:00
  • 14273fea7a integrated gpt2 support Concedo 2023-04-04 23:15:47 +08:00
  • 0349d03134 Add Accelerate framework dependency from Swift in this test. Andrew Duffy 2023-04-04 15:52:22 +01:00
  • 42ad59fe41 Bugfix: We can handle the situation where matrix rows / thread count is not a multiple of TILESIZE_X Sebastian Apel 2023-04-04 16:23:51 +02:00
  • 52de932842 removed main.exe to reduce clutter, added support for rep pen in gptj Concedo 2023-04-04 20:43:13 +08:00
  • ce58bfc3ac Missing host optimizations in CXXFLAGS Ivan Stepanov 2023-04-04 15:33:01 +03:00
  • 77e19980e9 Merge pull request #13 from pixelkaiser/rwkv-macos Alexander 2023-04-04 14:24:21 +05:00
  • 888db62c80 Advise the kernel to preload the mapped memory Pavol Rusnak 2023-04-03 12:28:49 +02:00
  • 5f1e91677c README: update with code-review for cmake build Adithya Balaji 2023-04-04 10:32:22 +02:00
  • 977efba905 we actually build a dylib on macos PXLKSR 2023-04-04 10:19:06 +02:00
  • a33cbbe03b Makefile: Added defaults for TILESIZE_X and _Y Sebastian Apel 2023-04-04 09:26:29 +02:00
  • 9e4e917a96 Merge branch 'ggerganov:master' into main barton ⊛ 2023-04-04 03:36:14 +00:00
  • 890af8cacf Remove '[end_of_conversation]' line from Miku.sh at8u 2023-04-04 03:12:01 +00:00
  • 6a4f137805 Fix wrongly copy-pasted mmap flags trollkotze 2023-04-04 03:19:09 +02:00
  • 2c9910cd99 Change mmap parameters to avoid much swap thrashing trollkotze 2023-04-04 01:52:15 +02:00
  • d4915074c4 quantize-stats: misc improvements Håkon H. Hitland 2023-04-04 00:33:09 +02:00
  • 6d479decd7 Added support to compile MPI on Darwin Chad Brewbaker 2023-04-03 17:21:51 -05:00
  • a7d3c3f304 quantize-stats: use less scratch memory Håkon H. Hitland 2023-04-04 00:21:01 +02:00
  • 32d0fe7e92 Trying again to fix error on windows compilation C2589: '(': illegal token CoderRC 2023-04-03 17:48:52 -04:00
  • 634b09c9a4 add mpi Chad Brewbaker 2023-04-03 16:31:29 -05:00
  • b90a3bf15e Trying to fix error on windows compilation C2589: '(': illegal token on right side of CoderRC 2023-04-03 17:13:25 -04:00
  • 75eea96d01 Add benchmark script Sebastian Apel 2023-04-03 22:50:03 +02:00
  • 8889c3be01 Remove redundant duplicate #include <windows.h> CoderRC 2023-04-03 16:26:32 -04:00
  • 10d758b917 Change static pthread_create and pthread_join to non static pthread_create and pthread_join CoderRC 2023-04-03 16:08:42 -04:00
  • 8c2ffe8559 README: Update with CMake and windows example Adithya Balaji 2023-04-03 22:06:55 +02:00
  • 0de310a159 Remove deletions of Patch 2: Added threading for non posix systems CoderRC 2023-04-03 15:45:49 -04:00
  • 361632264c Working version of tiled implementation Sebastian Apel 2023-04-03 21:20:55 +02:00
  • 9881cb3301 Improve ifdef logic Marco Matthies 2023-04-03 21:13:36 +02:00
  • f43aca0f63 Simplify to include lower-case windows.h always, fix compilation on mingw32 Marco Matthies 2023-04-03 21:00:19 +02:00
  • 8a7dd2c682 Patch 2: Added threading for non posix systems CoderRC 2023-04-03 14:21:46 -04:00
  • 68623ee175 Merge branch 'ggerganov:master' into master CoderRC 2023-04-03 14:20:07 -04:00
  • 37264707c2 Add "-e"/"--eval-threads" command-line parameter to set a different number of threads for single-token eval than for prompt eval. ml6 2023-04-03 11:17:07 -07:00
  • 9c0dbbb08b Merge branch 'master' into concedo Concedo 2023-04-04 00:51:05 +08:00
  • dd2abd8bc7 lower default thread threshold Concedo 2023-04-04 00:42:49 +08:00
  • 53dbba7695 Windows: reactive sigint handler after each Ctrl-C (#736) master-53dbba7 mgroeber9110 2023-04-03 18:00:55 +02:00
  • 5b1143ed93 quantize-stats: show percentiles Håkon H. Hitland 2023-04-03 17:32:03 +02:00
  • dc1c5ae7ec Experimental code that achives 30k FLOPS Sebastian Apel 2023-04-03 13:49:15 +02:00
  • 437e77855a 10+% performance improvement of ggml_vec_dot_q4_0 on AVX2 (#654) master-437e778 SebastianApel 2023-04-03 09:52:28 +02:00
  • 1ed8878a4c Reviewer comments: removed double semicolon, deleted empty line 1962 Sebastian Apel 2023-04-03 09:31:15 +02:00
  • 06c711d770 Merge branch 'master' into concedo Concedo 2023-04-03 15:10:08 +08:00
  • aacc8b6872 Minor formatting changes saharNooby 2023-04-03 10:39:28 +04:00
  • 4f1df7c89e Merge pull request #9 from hypnopump/more_instructions_works_linux Alexander 2023-04-03 11:35:38 +05:00
  • fa74b016c6 more details for macos/linux hypnopump 2023-04-03 08:33:57 +02:00
  • b589e34f92 Fixed problem with MSVC compiler Sebastian Apel 2023-04-03 08:33:03 +02:00
  • bea02c4b4c Merge branch 'master' into more_instructions_works_linux Eric Alcaide 2023-04-03 08:29:55 +02:00
  • 0a0cabc4c7 for consistency hypnopump 2023-04-03 08:27:00 +02:00
  • 6f3fb01913 suggestions hypnopump 2023-04-03 08:25:54 +02:00
  • 3535476987 Update README.md: include info about pre-compiled library saharNooby 2023-04-03 09:48:53 +04:00
  • 5b2830ed30 Increase memory for overhead from 32 MB to 256 MB saharNooby 2023-04-03 09:32:58 +04:00
  • 61cd520cd6 Patch 1: Added threading for non posix systems CoderRC 2023-04-03 00:53:39 -04:00
  • ec59387899 Added threading for non posix systems CoderRC 2023-04-03 00:41:39 -04:00
  • 0a1c308d04 Sync CoderRC 2023-04-02 23:52:27 -04:00
  • 578c327dd4 Fixed loading time by reading the file while letting the code execute CoderRC 2023-04-02 23:48:54 -04:00
  • 864dcb26fb updates Gary Linscott 2023-04-02 20:16:15 -07:00
  • eb5b22dda2 rebrand to koboldcpp Concedo 2023-04-03 10:35:18 +08:00
  • c23078f57a Add --keep param to Miku.sh at8u 2023-04-03 02:15:33 +00:00
  • bf18c6f4f9 Update README.md Pi 2023-04-02 19:01:56 -07:00
  • 0a5354fb1c Added cat translator Pi 2023-04-02 18:59:07 -07:00
  • ed667e9581 quantize-stats command Håkon H. Hitland 2023-04-02 15:59:14 +02:00