Commit graph

  • a5b5d9a101
    llama.android : fix build (#9350) b3687 Georgi Gerganov 2024-09-08 00:33:50 +03:00
  • f12295b8a9
    llama : fix empty ring buffer push (#9358) b3686 Georgi Gerganov 2024-09-08 00:33:33 +03:00
  • faf69d4237
    llama : sanitize invalid tokens (#9357) b3685 Georgi Gerganov 2024-09-08 00:33:13 +03:00
  • df4a8695cc
    llama : fix empty ring buffer push Georgi Gerganov 2024-09-07 23:30:26 +03:00
  • 748d516e34
    tests : fix batch size of bert model Georgi Gerganov 2024-09-07 23:19:07 +03:00
  • e536426ded
    llamafile : disable sgemm for batch-size 1 (#9330) b3684 Eve 2024-09-07 19:02:26 +00:00
  • 6726e3f29a
    llama : check that the input tokens are valid Georgi Gerganov 2024-09-07 21:52:25 +03:00
  • 1b9ae5189c
    common : refactor arg parser (#9308) b3683 Xuan Son Nguyen 2024-09-07 20:43:51 +02:00
  • ba6a97c390
    common : do not add null tokens during warmup Georgi Gerganov 2024-09-07 21:32:43 +03:00
  • e32d0816ed
    ggml : always check bounds on get_rows operations (#9354) b3682 slaren 2024-09-07 20:23:07 +02:00
  • 4bc7dbe738
    ggml : always check bounds on get_rows operations slaren 2024-09-07 19:23:31 +02:00
  • 4b96c69a08
    export-docs --> gen-docs Xuan Son Nguyen 2024-09-07 19:20:56 +02:00
  • 65b736f9fd
    Merge branch 'master' into xsn/argparser_v3 Xuan Son Nguyen 2024-09-07 19:07:34 +02:00
  • e625f5fd1e
    optimize more Xuan Son Nguyen 2024-09-07 18:41:42 +02:00
  • eb7d8f85a2
    params.sparams Xuan Son Nguyen 2024-09-07 18:24:44 +02:00
  • ceddafa0e1
    no more lamba capture Xuan Son Nguyen 2024-09-07 18:19:41 +02:00
  • 9438cf69b1
    llama.android : fix build Georgi Gerganov 2024-09-07 18:54:55 +03:00
  • 04c9c9d811
    add phi2 tokenizer hoangdz 2024-09-07 23:50:02 +09:00
  • 48de4da428
    fix editorconfig slaren 2024-09-07 14:52:59 +02:00
  • 91695ad41a
    llama : set attrs of mislabelled EOT/EOM tokens Kevin Gibbons 2024-09-07 05:30:30 -07:00
  • df270ef745
    llama : refactor sampling v2 (#9294) b3681 Georgi Gerganov 2024-09-07 15:16:19 +03:00
  • 4ac186aece
    llama : update doc [no ci] Georgi Gerganov 2024-09-07 15:14:37 +03:00
  • 2387dbea7d
    sampling : fix repeat penalty out-of-bounds access Georgi Gerganov 2024-09-07 14:50:43 +03:00
  • 8a82f388cd
    sampling : fix state cloning Georgi Gerganov 2024-09-07 14:38:00 +03:00
  • 0e6d170a50
    sampling : avoid llama_model in few samplers Georgi Gerganov 2024-09-07 14:16:21 +03:00
  • 947538acb8
    ggml : fix missing cpu_set_t on emscripten (#9336) b3680 Xuan Son Nguyen 2024-09-07 12:01:34 +02:00
  • 19c36962f7
    batched.swift : fix build Georgi Gerganov 2024-09-07 12:49:56 +03:00
  • 4b27235624
    style : rearrange code + add comments and TODOs Georgi Gerganov 2024-09-07 12:22:27 +03:00
  • 4a4530b7ff
    examples : add missing samplers Georgi Gerganov 2024-09-07 12:21:45 +03:00
  • 9ce9210ef1
    batched.swift : fix build [no ci] Georgi Gerganov 2024-09-06 14:57:44 +03:00
  • befcfe7a31
    common : simplify gpt_sampler Georgi Gerganov 2024-09-06 14:02:17 +03:00
  • 757a9bf868
    llama : add new llama_perf API Georgi Gerganov 2024-09-06 13:47:27 +03:00
  • 5ab52c1f64
    sampling : remove _context suffix [no ci] Georgi Gerganov 2024-09-06 12:52:19 +03:00
  • b448c753b9
    sampling : remove redundant indirection calls Georgi Gerganov 2024-09-06 12:39:43 +03:00
  • 809bdcf767
    sampling : allow passing m to mirostat sampler Georgi Gerganov 2024-09-06 12:06:00 +03:00
  • 8c972b69c1
    grammar : restore llama_grammar_accept signature Georgi Gerganov 2024-09-06 11:58:11 +03:00
  • 5b01cc8c8e
    swift : fix example Georgi Gerganov 2024-09-05 18:29:53 +03:00
  • 82a89df960
    sampling : improve mirostat implementation Georgi Gerganov 2024-09-05 18:07:47 +03:00
  • bd88352834
    ios : try to fix build Georgi Gerganov 2024-09-05 18:10:09 +03:00
  • 34f4bd02da
    sampling : fix cloning of samplers with null ctx Georgi Gerganov 2024-09-05 17:08:46 +03:00
  • 0b6dfcebb2
    llama : remove llama_constraint Georgi Gerganov 2024-09-05 16:49:14 +03:00
  • a2d8b27a4b
    llama : restore comments in llama.h Georgi Gerganov 2024-09-05 10:38:31 +03:00
  • 595711417a
    sampling : add name API + option to disable timings Georgi Gerganov 2024-09-05 10:33:04 +03:00
  • ebeb65194b
    sampling : change _cp/copy to clone Georgi Gerganov 2024-09-05 10:25:33 +03:00
  • 69551ffd60
    sampling : remove top-k min_keep, fix mirostat init and state Georgi Gerganov 2024-09-05 10:18:04 +03:00
  • b2b36e9e95
    example : fix build + fix speculative Georgi Gerganov 2024-09-04 22:16:30 +03:00
  • 9b950671f4
    sampling : fix grammar apply Georgi Gerganov 2024-09-04 21:48:57 +03:00
  • 8e80a1cf6b
    sampling : simplify sample API Georgi Gerganov 2024-09-04 21:23:35 +03:00
  • e7a11cac0e
    sampling : simplify new llama_sampler calls Georgi Gerganov 2024-09-04 20:21:02 +03:00
  • 784a644040
    sampler : API to iterate constraints Georgi Gerganov 2024-09-04 17:13:15 +03:00
  • 0e1378c844
    sampling : convert mirostat samplers to constraints Georgi Gerganov 2024-09-04 16:57:43 +03:00
  • 1a0de0b781
    constraint : add name API Georgi Gerganov 2024-09-04 15:37:51 +03:00
  • c024fe45b0
    constraint : clean-up and simplify Georgi Gerganov 2024-09-04 15:01:31 +03:00
  • ca5d21c17a
    grammar : fix reset call Georgi Gerganov 2024-09-04 14:26:23 +03:00
  • fdb52aa657
    common : fix gpt_sampler_cp Georgi Gerganov 2024-09-04 14:17:19 +03:00
  • ad436e9284
    examples : fix build Georgi Gerganov 2024-09-04 14:07:26 +03:00
  • a0b91214b4
    cont : use new API in examples Georgi Gerganov 2024-09-04 13:54:32 +03:00
  • 437376e708
    cont : add n_prev to llama_sampler_params Georgi Gerganov 2024-09-04 11:54:49 +03:00
  • 91cbb40b29
    cont : common/sampling use the new API [no ci] Georgi Gerganov 2024-09-04 11:21:37 +03:00
  • 1e8e26c155
    cont : leaner constraint initialization [no ci] Georgi Gerganov 2024-09-04 10:03:14 +03:00
  • 09ceb68caa
    cont : add comments [no ci] Georgi Gerganov 2024-09-03 17:27:40 +03:00
  • a2ce91cbef
    cont : add penalties and logit-bias constraints [no ci] Georgi Gerganov 2024-09-03 16:04:22 +03:00
  • 0daebc6b8d
    cont : fix [no ci] Georgi Gerganov 2024-09-03 15:19:32 +03:00
  • 71293a6456
    cont : add rest of the existing samplers [no ci] Georgi Gerganov 2024-09-03 15:17:02 +03:00
  • 1b07dc51c6
    cont : fixes, naming [no ci] Georgi Gerganov 2024-09-03 14:45:07 +03:00
  • cf4dd10ea5
    cont : initial implementation sketch [no ci] Georgi Gerganov 2024-09-03 14:33:10 +03:00
  • 5116b3681c
    cont : add llama_constraint_i [no ci] Georgi Gerganov 2024-09-03 13:12:50 +03:00
  • 86b07ccbb3
    llama : sketching new sampling API Georgi Gerganov 2024-09-03 12:09:08 +03:00
  • ab545c8380
    llama : add llama_sampling API + move grammar in libllama Georgi Gerganov 2024-08-05 10:08:25 +03:00
  • f39b66deb5
    add check malloc result on device arthw 2024-09-07 16:42:52 +08:00
  • 6c89eb0b47
    ci : disable rocm image creation (#9340) slaren 2024-09-07 09:48:54 +02:00
  • 481cb3a0c5
    fix compiling error hongruichen 2024-09-07 12:22:53 +08:00
  • 67e8af7d87
    Merge branch 'master' into dev-refactoring hongruichen 2024-09-07 11:15:11 +08:00
  • 52e47aadca
    Merge 5ee96aa7e0 into 9b2c24c099 Rune Berg 2024-09-06 19:52:27 -07:00
  • c3e2bb6dcf
    rpc : fix nkvo sl/fix-rpc-nkvo slaren 2024-09-07 03:24:47 +02:00
  • 9b2c24c099
    server : simplify state machine for slot (#9283) b3678 Xuan Son Nguyen 2024-09-06 23:21:29 +02:00
  • 3de9300c37
    imatrix : use GGUF to store imatrix data Francis Couture-Harpin 2024-09-06 17:17:25 -04:00
  • 134bc38ecf
    llama-bench : log benchmark progress (#9287) b3677 Aarni Koskela 2024-09-07 00:03:01 +03:00
  • 2dd838af7e
    ci : disable rocm image creation slaren 2024-09-06 22:35:36 +02:00
  • 5f3047fc00
    fixes slaren 2024-09-06 22:19:48 +02:00
  • afb016c138
    Update CMakeLists.txt, spaces fix Michael Podvitskiy 2024-09-06 22:17:34 +02:00
  • 6b4923bde0
    GGML_TARGET_DEFINES-NOTFOUND fix for builds without GGML_CDEF_PUBLIC Michael Podvitskiy 2024-09-06 20:50:50 +02:00
  • 5ae09fd9f6
    Merge branch 'master' into xsn/argparser_v3 Xuan Son Nguyen 2024-09-06 18:11:19 +02:00
  • 815b1fb20a
    batched-bench : add --output-format jsonl option (#9293) b3676 Aarni Koskela 2024-09-06 18:59:58 +03:00
  • 89e70fe3ae
    bring back android part Xuan Son Nguyen 2024-09-06 17:58:21 +02:00
  • c4a1276ba7
    better version Xuan Son Nguyen 2024-09-06 16:23:29 +02:00
  • 87d31e9662
    ggml : fix missing cpu_set_t on emscripten Xuan Son Nguyen 2024-09-06 16:15:27 +02:00
  • 409dc4f8bb
    ggml : fix build break for the vulkan-debug (#9265) b3675 Changyeon Kim 2024-09-06 21:54:50 +09:00
  • c8f28909dc
    implements a retry function to avoid duplication farbod 2024-09-06 15:58:12 +03:30
  • e0fdcda961
    Merge branch 'master' into xsn/slot_state_machine Xuan Son Nguyen 2024-09-06 14:07:34 +02:00
  • 38b14cd3ad
    Update examples/server/server.cpp Xuan Son Nguyen 2024-09-06 14:06:38 +02:00
  • 4a1411b4f1
    server : fix missing lock (#9334) b3674 Xuan Son Nguyen 2024-09-06 14:06:04 +02:00
  • e1281d0d7a
    refine example-specific args Xuan Son Nguyen 2024-09-06 14:05:51 +02:00
  • 53244f9c58
    fix args with 2 values Xuan Son Nguyen 2024-09-06 13:47:10 +02:00
  • 961bd19da1
    add comments Xuan Son Nguyen 2024-09-06 13:42:20 +02:00
  • dd7e853b41
    pop_deferred_task: also notify Xuan Son Nguyen 2024-09-06 13:18:45 +02:00
  • 85ae0920cd
    Add newline to the end of docs/build.md Dan Johansson 2024-09-06 09:57:01 +02:00
  • eb9f945d30
    llama-bench : add optional progress messages Aarni Koskela 2024-09-06 12:42:15 +03:00
  • e93125e7d8
    server : fix missing lock Xuan Son Nguyen 2024-09-06 11:22:05 +02:00
  • 295e9945f9
    common: warmup: Handle situation when eos=bos=-1 Molly Sophia 2024-09-06 16:34:47 +08:00