Commit graph

  • 09186fabbe llama : remove check flash_attn with lora (#11104) Xuan Son Nguyen 2025-01-06 13:41:12 +01:00
  • f75349c27d tts : minor cleanup Georgi Gerganov 2025-01-06 13:35:39 +02:00
  • 90cb25c9a2 no need backticks [no ci] Xuan Son Nguyen 2025-01-06 12:34:25 +01:00
  • a160819b83 rm 2 [no ci] Xuan Son Nguyen 2025-01-06 12:33:37 +01:00
  • 9fe79c06d0 rm cmd in log output [no ci] Xuan Son Nguyen 2025-01-06 12:33:10 +01:00
  • 6bd90cf862 Apply suggestions from code review [no ci] Xuan Son Nguyen 2025-01-06 12:30:00 +01:00
  • 8dbd0880c4 llama : remove check flash_attn with lora Xuan Son Nguyen 2025-01-06 12:28:25 +01:00
  • 96a1dc27c3 llama : prevent system info string accumulation across calls (#11101) b4426 Asghar Ghorbani 2025-01-06 12:21:46 +01:00
  • 9605c5fb28 cmake : remove explicit _XOPEN_SOURCE shards-lang/gio/visionos-ci Georgi Gerganov 2025-01-06 13:02:48 +02:00
  • 1487d32b46 Add streaming for omnivlm (#39) T 2025-01-06 18:10:50 +08:00
  • f77b79d1c5 omni vlm add streaming Te993 2025-01-06 18:06:11 +08:00
  • 7aa97b97ce omni vlm add streaming Te993 2025-01-06 17:59:17 +08:00
  • 6fb7f6a5cd fix: prevent system info string accumulation across calls a-ghorbani 2025-01-06 10:58:13 +01:00
  • 6369f867a4 llama : rename missed batch params/vars to ubatch (#10059) b4425 Daniel Bevenius 2025-01-06 10:28:17 +01:00
  • 47182dd03f llama : update llama_model API names (#11063) b4424 Georgi Gerganov 2025-01-06 10:55:18 +02:00
  • 3e6e7a6bc2 tokenize : escape the prompt (#11058) b4423 Georgi Gerganov 2025-01-06 10:54:25 +02:00
  • c3a473d421 tokenize : update help Georgi Gerganov 2025-01-06 10:54:11 +02:00
  • ae2f606bb5 mmap : fix fileno macro clash (#11076) b4422 Georgi Gerganov 2025-01-06 10:52:38 +02:00
  • 727368c60f llama : use LLAMA_TOKEN_NULL (#11062) b4421 Georgi Gerganov 2025-01-06 10:52:15 +02:00
  • 5047dd3546 llama : use _impl suffix instead of _internal (#11060) b4420 Georgi Gerganov 2025-01-06 10:52:01 +02:00
  • 8d01c89362 Merge remote-tracking branch 'upstream/master' into Remove_obsolete_HIP_workaround Nikita Sarychev 2025-01-05 20:33:10 -08:00
  • 7aba1f9cf3 Remove more references to rocBLAS Nikita Sarychev 2025-01-05 19:51:00 -08:00
  • bb936ea4b8 Update CMakeLists.txt ag2s20150909 2025-01-06 09:56:52 +08:00
  • 19e9ca1e9e Update CMakeLists.txt ag2s20150909 2025-01-06 09:49:22 +08:00
  • b9224cb32e fix: Vulkan shader gen binary path when cross compiling ag2s20150909 2025-01-06 09:34:54 +08:00
  • 46e3556e01 CUDA: add BF16 support (#11093) b4419 Johannes Gäßler 2025-01-06 02:33:52 +01:00
  • 62fe73d4b5 Context size by default should be 2048, not 512 Eric Curtin 2025-01-05 23:03:06 +00:00
  • db8d6b71ec try MUSA fix Johannes Gäßler 2025-01-05 21:41:53 +01:00
  • fa77f7a41e try MUSA fix Johannes Gäßler 2025-01-05 21:38:24 +01:00
  • 6a5cdad219 try MUSA fix Johannes Gäßler 2025-01-05 21:35:12 +01:00
  • ab20aa99c2 CUDA: add BF16 support Johannes Gäßler 2025-01-03 16:36:27 +01:00
  • 1714f5ed4e codeowners : (@ngxson) only watch dockerfile Xuan Son Nguyen 2025-01-05 16:07:20 +01:00
  • 29cbff7360 github : cmd line to bug report Xuan Son Nguyen 2025-01-05 16:05:46 +01:00
  • 5d7bb10ee5 take effect only on windows and force it to icl 蕭澧邦 2025-01-05 23:04:02 +08:00
  • f62dc45f31 SYCL: Use get_multi_ptr instead of deprecated get_pointer in wkv6 Akarshan Biswas 2025-01-05 18:11:38 +05:30
  • 6e9fd00f14 llama : rename missed batch params/vars to ubatch Daniel Bevenius 2025-01-05 07:04:34 +01:00
  • c01ccf8288 little stuff Eve 2025-01-05 02:31:28 +00:00
  • d70a731639 q2_k Eve 2025-01-04 20:48:27 -05:00
  • 07d0d58bef q3_k Eve 2025-01-04 16:24:29 -05:00
  • b0e4ccbeb9 revert it Eve 2025-01-04 14:57:40 -05:00
  • 21c6b805c9 q4_k test (slow) Eve 2025-01-04 14:57:28 -05:00
  • 9ad2e7dfd2 Remove obsolete HIP workaround Nikita Sarychev 2025-01-04 16:38:35 -08:00
  • b56f079e28 Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver (#11074) b4418 0cc4m 2025-01-04 21:09:59 +01:00
  • 9394bbd484 llama : Add support for DeepSeek V3 (#11049) b4417 fairydreaming 2025-01-04 21:06:11 +01:00
  • 6b06d16890 16 bit unpack Eve 2025-01-04 13:32:44 -05:00
  • d122d5c987 q6_k scale caching Eve 2025-01-03 21:57:55 -05:00
  • 40e3363bd4 Add (TM) to AMD name check 0cc4m 2025-01-04 19:55:02 +01:00
  • 4a58b99777 llama : move llama_expert_gating_func_type to llama-hparams.h Stanisław Szymczyk 2025-01-04 17:28:17 +01:00
  • f922a9c542 [GGML][RPC] Support for models with non-512-aligned tensors over RPC. (#11047) b4416 matt23654 2025-01-04 16:10:30 +00:00
  • 964a345e61 cont Georgi Gerganov 2025-01-04 16:46:47 +02:00
  • 46be942214 llama : add support for the cohere2 model architecture (#10900) b4415 DAN™ 2025-01-04 09:33:31 -05:00
  • c98eb635d6 mmap : fix fileno macro clash Georgi Gerganov 2025-01-04 16:21:52 +02:00
  • 78c6785175 sync : ggml b4414 Georgi Gerganov 2025-01-04 10:54:01 +02:00
  • 5e3b08d606 ggml : do not install metal source when embed library (ggml/1054) Georgi Gerganov 2025-01-04 10:53:54 +02:00
  • db68c93b57 ggml : improve inputs log sched_print_assignments (ggml/1053) Daniel Bevenius 2024-12-19 03:50:12 +01:00
  • a48c3df3df llama : add DeepSeek-V3 chat template Stanisław Szymczyk 2025-01-04 14:36:03 +01:00
  • 3cecef7def webui: fix printing timings [no ci] Xuan Son Nguyen 2025-01-04 12:54:46 +01:00
  • e4d3364e93 add download btn Xuan Son Nguyen 2025-01-04 12:35:45 +01:00
  • a8153cc681 ok Xuan Son Nguyen 2025-01-04 12:11:30 +01:00
  • b24d93460b sync : ggml Georgi Gerganov 2025-01-04 10:54:01 +02:00
  • 89f9f2c05a ggml : do not install metal source when embed library (ggml/1054) Georgi Gerganov 2025-01-04 10:53:54 +02:00
  • c993e5965c ggml : improve inputs log sched_print_assignments (ggml/1053) Daniel Bevenius 2024-12-19 03:50:12 +01:00
  • 54b11a3948 Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver 0cc4m 2024-12-29 20:21:18 +00:00
  • c31fc8b966 fix: Vulkan shader gen binary path (#11037) b4411 Gilad S. 2025-01-04 10:17:31 +02:00
  • 4973a298b6 Apply suggestions from code review matt23654 2025-01-03 22:46:35 +00:00
  • 3744bc4bc5 server : POC OAI-compat TTS using OuteTTS Xuan Son Nguyen 2025-01-03 23:39:09 +01:00
  • c111e8a5b2 Handle potentially dangerous edge cases. matt23654 2025-01-03 22:14:21 +00:00
  • ddb2dddac1 Merge remote-tracking branch 'origin/master' into deepseek-v3 Stanisław Szymczyk 2025-01-03 20:14:15 +01:00
  • dfffe67611 llama : add support for ACCENT_MARK (\p{M}) and SYMBOL (\p{S}) unicode categories in pre-tokenization regex Stanisław Szymczyk 2025-01-03 18:39:44 +01:00
  • 0857839001 fix Windows return statement Johannes Gäßler 2025-01-03 15:46:41 +01:00
  • e8ac0945ca fix print datatypes Johannes Gäßler 2025-01-03 15:40:08 +01:00
  • b66e91b1b2 Cleanup and use GGML error logging functions. matt23654 2025-01-03 14:39:32 +00:00
  • b456e10966 fix gguf_set_kv Johannes Gäßler 2025-01-03 15:34:48 +01:00
  • b87784d63a GGUF: C++ refactor, backend support, misc fixes Johannes Gäßler 2024-12-03 21:43:57 +01:00
  • eb76b84252 feat(ci): add visionOS build workflow Giovanni Petrantoni 2025-01-03 23:02:59 +09:00
  • 5b4673b3dd llama : rename expert_weights_b to exp_probs_b Stanisław Szymczyk 2025-01-03 14:57:56 +01:00
  • 4e37cf1d9a Add support for the cohere2 model architecture. DAN™ 2025-01-03 08:35:06 -05:00
  • 140eb29264 gguf-py, llama : rename expert_weights to exp_probs in tensor and variable names Stanisław Szymczyk 2025-01-03 13:51:14 +01:00
  • 138255e761 llama : change llama_load_model_from_file -> llama_model_load_from_file Georgi Gerganov 2025-01-03 14:42:28 +02:00
  • 0261d4f02f llama : deprecate llama_free_model, add llama_model_free Georgi Gerganov 2025-01-03 14:37:28 +02:00
  • c7b006fc1a llama : use LLAMA_TOKEN_NULL Georgi Gerganov 2025-01-03 14:26:46 +02:00
  • 4b0c638b9a common : disable KV cache shifting automatically for unsupported models (#11053) Molly Sophia 2025-01-03 20:13:18 +08:00
  • f03c717a46 llama : avoid hardcoded QK_K Georgi Gerganov 2025-01-03 14:02:24 +02:00
  • 9c529a7939 Update common/common.cpp Molly Sophia 2025-01-03 20:06:48 +08:00
  • dc32e8f03e llama : use _impl suffix instead of _internal Georgi Gerganov 2025-01-03 13:54:38 +02:00
  • 996dc4cdd2 Added Relevant comments dhruvanand24 2025-01-03 16:15:47 +05:30
  • 967987f287 Merge remote-tracking branch 'origin/master' dhruvanand24 2025-01-03 16:13:09 +05:30
  • 5f546b860e Created JNI Binding for applying chat template. Created 2 more utility function- mapListToJSONString and format_chat. Added Function to apply chat template in LLamaAndroid.kt dhruvanand24 2025-01-03 16:12:55 +05:30
  • fc49c3230a tokenize : escape the prompt Georgi Gerganov 2025-01-03 12:05:16 +02:00
  • ca7ebcd575 Merge d7de64bc2b into e7da954ecc Judd 2025-01-03 17:33:24 +08:00
  • e7da954ecc metal : avoid uint (#11019) b4409 Georgi Gerganov 2025-01-03 11:26:14 +02:00
  • 331581b2e3 Update README.md Molly Sophia 2024-12-28 19:26:56 +08:00
  • 08cf56060b Fix cuda warning Molly Sophia 2024-12-28 19:24:17 +08:00
  • 00930e6fe5 Fix wkv test & add gla test Molly Sophia 2024-12-28 19:16:29 +08:00
  • aaa870e80e code format changes Molly Sophia 2024-12-28 18:25:13 +08:00
  • f2c1a5c918 Fix some typos Molly Sophia 2024-12-28 18:09:38 +08:00
  • bc930cd59a RWKV6[QWEN2]: Concat lerp weights together to reduce cpu overhead Molly Sophia 2024-12-28 17:30:53 +08:00
  • fab0aa7b1a Add support for RWKV6Qwen2 with cpu and cuda GLA Molly Sophia 2024-12-28 00:14:05 +08:00
  • 385b611d45 RWKV: Some graph simplification Molly Sophia 2024-12-25 15:29:19 +08:00
  • f298f03970 WIP: Add support for RWKV6Qwen2 Molly Sophia 2024-12-24 09:36:30 +08:00