Commit graph

  • f7882e2d69 Fixed a crash caused by erasing from empty last_n_tokens digiwombat 2023-05-31 20:35:28 -04:00
  • 5f6e16da36 Merge pull request #9 from anon998/stopping-strings Randall Fitzgerald 2023-05-31 20:05:18 -04:00
  • e9b1f0bf5c fix stopping strings anon 2023-05-31 20:31:58 -03:00
  • 342604bb81 Added a super simple CORS header as default for all endpoints. digiwombat 2023-05-31 19:54:05 -04:00
  • e5dad2afa0 Look for libllama in parent directory Don Mahurin 2023-05-23 06:21:31 -07:00
  • 4ad62c489d fix "missing 1 required positional argument: 'min_keep'" Don Mahurin 2023-05-22 23:54:57 -07:00
  • 60a7c76339 Update llama.cpp Andrei Betlen 2023-05-21 17:47:21 -04:00
  • fda33ddbd5 Fix llama_cpp and Llama type signatures. Closes #221 Andrei Betlen 2023-05-19 11:59:33 -04:00
  • 601b19203f Check for CUDA_PATH before adding Andrei Betlen 2023-05-17 15:26:38 -04:00
  • 66c27f3120 Fixed CUBLAS DLL load issue on Windows Aneesh Joy 2023-05-17 18:04:58 +01:00
  • aae6c03e94 Update llama.cpp Andrei Betlen 2023-05-14 00:04:22 -04:00
  • a83d117507 Add winmode arg only on windows if python version supports it Andrei Betlen 2023-05-15 09:15:01 -04:00
  • 7609c73ee6 Update llama.cpp (remove min_keep default value) Andrei Betlen 2023-05-07 00:12:47 -04:00
  • 59f80d2a0d Fix mlock_supported and mmap_supported return type Andrei Betlen 2023-05-07 03:04:22 -04:00
  • 3808a73751 Fix obscure Windows DLL issue. Closes #208 Andrei Betlen 2023-05-14 22:08:11 -04:00
  • 690588410e Fix return type Andrei Betlen 2023-05-07 19:30:14 -04:00
  • 4885e55ccd Fix: runtime type errors Andrei Betlen 2023-05-05 14:12:26 -04:00
  • 0c2fb05361 Fix: types Andrei Betlen 2023-05-05 14:04:12 -04:00
  • ff31330d7f Fix candidates type Andrei Betlen 2023-05-05 14:00:30 -04:00
  • 7862b520ec Fix llama_cpp types Andrei Betlen 2023-05-05 13:54:22 -04:00
  • f20b34a3be Add return type annotations for embeddings and logits Andrei Betlen 2023-05-05 14:22:55 -04:00
  • 731c71255b Add types for all low-level api functions Andrei Betlen 2023-05-05 12:22:27 -04:00
  • a439fe1529 Allow model to tokenize strings longer than context length and set add_bos. Closes #92 Andrei Betlen 2023-05-12 14:28:22 -04:00
  • b5531e1435 low_level_api_chat_cpp.py: Fix missing antiprompt output in chat. Don Mahurin 2023-05-26 06:35:15 -07:00
  • fb79c567d2 Fix session loading and saving in low level example chat Mug 2023-05-08 15:27:03 +02:00
  • 0bf36a77ae Fix mirostat requiring c_float Mug 2023-05-06 13:35:50 +02:00
  • f8ba031576 Fix lora Mug 2023-05-08 15:27:42 +02:00
  • bbf6848cb0 Fix wrong parsed type for logit_bias Mug 2023-05-06 13:27:52 +02:00
  • 335cd8d947 Rename postfix to suffix to match upstream Mug 2023-05-06 13:18:25 +02:00
  • 32cf0133c9 Update low level examples Mug 2023-05-04 18:33:08 +02:00
  • 9e79465b21 Prefer explicit imports Andrei Betlen 2023-05-05 14:05:31 -04:00
  • d15578e63e Update llama.cpp (session version) Andrei Betlen 2023-05-03 09:33:30 -04:00
  • c26e9bf1c1 Update sampling api Andrei Betlen 2023-05-01 14:47:55 -04:00
  • 78531e5d05 Fix return types and import comments Andrei Betlen 2023-05-01 14:02:06 -04:00
  • d0031edbd2 Update llama.cpp Andrei Betlen 2023-05-01 10:44:28 -04:00
  • 441d30811a Detect multi-byte responses and wait Mug 2023-04-28 12:50:30 +02:00
  • 36b3494332 Also ignore errors on input prompts Mug 2023-04-26 14:45:51 +02:00
  • c8e6ac366a Update llama.cpp (llama_load_session_file) Andrei Betlen 2023-04-28 15:32:43 -04:00
  • 66ad132575 Update llama.cpp Andrei Betlen 2023-04-26 20:00:54 -04:00
  • 656190750d Update llama.cpp Andrei Betlen 2023-04-25 19:03:41 -04:00
  • 80c18cb665 Update llama.cpp (remove llama_get_kv_cache) Andrei Betlen 2023-04-24 09:30:10 -04:00
  • bf9f02d8ee Update llama.cpp Andrei Betlen 2023-04-22 19:50:28 -04:00
  • 5bbf40aa47 Update llama.cpp Andrei Betlen 2023-04-21 17:40:27 -04:00
  • fd64310276 Fix decode errors permanently Mug 2023-04-26 14:37:06 +02:00
  • bdbaf5dc76 Fixed wrong end-of-text type, and fixed n_predict behaviour Mug 2023-04-17 14:45:28 +02:00
  • 81c4c10389 Update type signature to allow for null pointer to be passed. Andrei Betlen 2023-04-18 23:44:46 -04:00
  • 8229410a4e More reasonable defaults Mug 2023-04-10 16:38:45 +02:00
  • b6ce5133d9 Add bindings for LoRA adapters. Closes #88 Andrei Betlen 2023-04-18 01:30:04 -04:00
  • 3693449c07 Update llama.cpp Andrei Betlen 2023-04-12 14:29:00 -04:00
  • d595f330e2 Update llama.cpp Andrei Betlen 2023-04-11 11:59:03 -04:00
  • ce0ca60b56 Update llama.cpp (llama_mmap_supported) Andrei Betlen 2023-04-09 22:01:33 -04:00
  • d0a7ce9abf Make Windows users happy (hopefully) Mug 2023-04-10 17:12:25 +02:00
  • 848b4021a3 Better custom library debugging Mug 2023-04-10 17:06:58 +02:00
  • c8b5d0b963 Use environment variable for library override Mug 2023-04-10 17:00:35 +02:00
  • d1b3517477 Allow local llama library usage Mug 2023-04-05 14:23:01 +02:00
  • b36c04c99e Added iterative search to prevent instructions from being echoed, added ignore-eos and no-mmap options, and fixed a bug that echoed one character too many Mug 2023-04-10 16:35:38 +02:00
  • f25a81309e Update model paths to be more clear they should point to file Andrei Betlen 2023-04-09 22:45:55 -04:00
  • e19909249d More interoperability with the original llama.cpp, and arguments now work Mug 2023-04-07 13:32:19 +02:00
  • d5680144c5 Bugfix: Wrong size of embeddings. Closes #47 Andrei Betlen 2023-04-08 15:05:33 -04:00
  • 29e9fb66a3 Better llama.cpp interoperability Mug 2023-04-06 15:30:57 +02:00
  • ce66405da1 Add quantize example Andrei Betlen 2023-04-05 04:17:26 -04:00
  • 739e8d4c9b Fix bug where init_break was not set when exiting via antiprompt, and other fixes. Mug 2023-04-05 14:47:24 +02:00
  • ae1f37f505 Fix repeating instructions and an antiprompt bug Mug 2023-04-04 17:54:47 +02:00
  • 3c1020b866 Fix stripping instruction prompt Mug 2023-04-04 16:20:27 +02:00
  • 0bfad75406 Added instruction mode, fixed infinite generation, and various other fixes Mug 2023-04-04 16:18:26 +02:00
  • 9e872410da Add instruction mode Mug 2023-04-04 11:48:48 +02:00
  • 15bea0946b Chat llama.cpp example implementation Mug 2023-04-03 22:54:46 +02:00
  • 2b8147e7a8 Update llama_cpp.py MillionthOdin16 2023-04-02 21:50:13 -04:00
  • 62ce167b22 Update low level api example Andrei Betlen 2023-04-01 13:02:10 -04:00
  • a71cda6546 Update llama.cpp Andrei Betlen 2023-03-28 21:10:23 -04:00
  • a279acd680 Update llama.cpp (llama_n_embd) Andrei Betlen 2023-03-25 16:26:03 -04:00
  • ef3c152257 Update llama.cpp (llama_progress_callback) Andrei Betlen 2023-03-25 12:12:09 -04:00
  • def46dd9a6 Add example based on stripped down version of main.cpp from llama.cpp Andrei Betlen 2023-03-24 18:57:25 -04:00
  • 5bb1bc74d1 Fix type signature of token_to_str Andrei Betlen 2023-03-31 03:25:12 -04:00
  • a7a6d88793 Fix ctypes typing issue for Arrays Andrei Betlen 2023-03-31 03:20:15 -04:00
  • 019650f416 Fix array type signatures Andrei Betlen 2023-03-31 02:08:20 -04:00
  • a3da39af79 Bugfix: cross-platform method to find shared lib Andrei Betlen 2023-03-24 18:43:29 -04:00
  • bd1c657f80 Bugfix: wrong signature for quantize function Andrei Betlen 2023-04-04 22:36:59 -04:00
  • ef5a9a6160 Update llama.cpp and re-organize low-level api Andrei Betlen 2023-03-24 14:58:42 -04:00
  • d9dfdec2bd Initial commit (llama_cpp.py, llama-cpp-python) Andrei Betlen 2023-03-23 05:33:06 -04:00
  • bed308c69c Apply suggestions from code review Henri Vasserman 2023-06-01 01:15:48 +03:00
  • 8478e59b08 Merge pull request #8 from SlyEcho/server_refactor Randall Fitzgerald 2023-05-31 18:03:40 -04:00
  • 9104fe5a7c Change how the token buffers work. Henri Vasserman 2023-05-31 11:47:55 +03:00
  • f2e1130901 Merge pull request #7 from anon998/logging-reuse Randall Fitzgerald 2023-05-31 17:08:12 -04:00
  • 497160a60d remove old log function anon 2023-05-31 18:01:07 -03:00
  • 1bd7cc60a8 reuse format_generation_settings for logging anon 2023-05-31 17:58:43 -03:00
  • 43d295fddc filter empty stopping strings anon 2023-05-31 16:54:12 -03:00
  • 276fa99873 Misunderstood the instructions, I think. Back to the raw JSON output only. digiwombat 2023-05-31 16:45:57 -04:00
  • 1b96df2b5f Spacing fix. Nothing to see here. digiwombat 2023-05-31 16:42:43 -04:00
  • 86337e3a9b Server console logs now come in one flavor: Verbose. digiwombat 2023-05-31 16:41:34 -04:00
  • dda4c10d64 Switch to the CPPHTTPLIB logger. Verbose adds body dump as well as request info. digiwombat 2023-05-31 16:23:39 -04:00
  • 7ca81e9e65 mtl : add reshape and transpose handling Georgi Gerganov 2023-05-31 22:38:40 +03:00
  • 7332b41f9f Simple single-line server log for requests digiwombat 2023-05-31 15:56:27 -04:00
  • 1213af76ce mtl : add rope kernel Georgi Gerganov 2023-05-31 22:28:59 +03:00
  • 6af6a05663 ggml : fix handling of "view" ops in ggml_graph_import() Georgi Gerganov 2023-05-31 22:28:15 +03:00
  • 96fa480147 Merge pull request #6 from anon998/fix-multibyte Randall Fitzgerald 2023-05-31 12:14:43 -04:00
  • 234270bd83 back to 32 block size, not better Concedo 2023-06-01 00:14:22 +08:00
  • 3edaf6bd8b print timings by default anon 2023-05-31 12:55:19 -03:00
  • d58e48663d default penalize_nl to false + format anon 2023-05-31 11:56:12 -03:00
  • 40e13805d9 print timings + build info anon 2023-05-31 10:41:47 -03:00