Commit graph

  • 2926089c5d fix lints ochafik 2024-09-26 19:06:29 +01:00
  • 5840e10069 tool-call: merge & fix jinja template tests into test-chat-template ochafik 2024-09-26 19:05:00 +01:00
  • 50685f837f minja: add str.title() ochafik 2024-09-26 19:03:59 +01:00
  • 9958b6b291 cmake : add option for common library Borislav Stanimirov 2024-09-26 20:47:15 +03:00
  • 296331bba3 minja: update chat template goldens w/ llama.3.1 arguments workaround ochafik 2024-09-26 18:10:27 +01:00
  • 9cfe4d7202 tool-call: refactor llama_chat_template class + use in validate_model_chat_template ochafik 2024-09-26 18:06:03 +01:00
  • d4c57cd641 test-backend-ops : use flops for some performance tests slaren 2024-09-26 18:24:42 +02:00
  • 4eba23299e Merge faaac59d16 into 95bc82fbc0 compilade 2024-09-26 18:19:57 +02:00
  • cf7bece6a7 tool-call: factor chat template away from legacy API ochafik 2024-09-26 17:19:29 +01:00
  • 4d9ebd1b7f sycl: initial cmake support of SYCL for AMD GPUs Alberto Cabrera 2024-09-26 16:37:55 +01:00
  • 1ae8376d40 change test data Xuan Son Nguyen 2024-09-26 17:52:57 +02:00
  • f27dd6990d add reranking test Xuan Son Nguyen 2024-09-26 17:43:02 +02:00
  • 6f0ab1783b Set ROCM_DOCKER_ARCH as string due it incorrectly build and cause OOM exit code serhii.n 2024-09-26 18:01:09 +03:00
  • 351a4d85c1 Docs: Add akx/ollama-dl Aarni Koskela 2024-09-26 15:55:48 +03:00
  • f19554f453 ci : add rerank tests Georgi Gerganov 2024-09-26 15:20:32 +03:00
  • ca99a6ce70 llama : fix uninitialized tensors Georgi Gerganov 2024-09-26 15:20:11 +03:00
  • 4d457755c0 llama : add comment [no ci] Georgi Gerganov 2024-09-26 14:36:14 +03:00
  • 877a04ccff server : add docs Georgi Gerganov 2024-09-26 14:31:03 +03:00
  • 00b33760aa server : initiate tests for later Georgi Gerganov 2024-09-26 13:17:22 +03:00
  • 0a6a9e6742 Merge branch 'master' into chameleon nopperl 2024-09-26 12:10:07 +02:00
  • 95bc82fbc0 [SYCL] add missed dll file in package (#9577) b3828 Neo Zhang Jianyu 2024-09-26 17:38:31 +08:00
  • a48284c5f3 ggml: Move definition of ggml_arm_arch_features to the global data section Dan Johansson 2024-09-25 12:42:57 +02:00
  • 8fd848dd30 ggml: Extend feature detection to include non aarch64 Arm arch Dan Johansson 2024-09-19 12:45:11 +02:00
  • ce926fe879 ggml: Added run-time detection of neon, i8mm and sve Dan Johansson 2024-08-08 13:52:59 +02:00
  • d7ec84f78c tool-call: allow <|python_tag|> in functionary-medium-3.1 ochafik 2024-09-26 06:51:46 +01:00
  • 3d2650ce65 fix gcc build ochafik 2024-09-26 06:50:51 +01:00
  • 749a21c67a gcc appeasement ochafik 2024-09-26 06:08:18 +01:00
  • 0c870133d8 tool-call: test/fix functionary-medium-v3.1's template (can "look" like llama3.1 template) ochafik 2024-09-26 05:56:15 +01:00
  • 8e4a9bad8a minja: allow none input to selectattr, and add safe passthrough filter ochafik 2024-09-26 05:53:12 +01:00
  • 5f5be9cde7 minja: gcc tweaks ochafik 2024-09-26 05:06:11 +01:00
  • 2eb29bf8b8 tool-call: update chat templates/goldens ochafik 2024-09-26 04:00:10 +01:00
  • 4cd82d61dd tool-call: fix pyright type errors ochafik 2024-09-26 03:59:38 +01:00
  • 059babdd9b minja: try to please gcc ochafik 2024-09-26 03:58:18 +01:00
  • 94377d743c server: catch errors in format_final_response_oaicompat instead of taking server down ochafik 2024-09-26 03:42:36 +01:00
  • 595e11cb11 tool-call: fix/test functionary v3 ochafik 2024-09-26 03:42:05 +01:00
  • c9ae1916ec add tensor parallel support Chen Xi 2024-09-26 02:14:04 +00:00
  • cb8507b3b4 add tensor parallelism support to SYCL Chen Xi 2024-08-22 06:31:34 +00:00
  • c124ab48ea minja: add str.endswith ochafik 2024-09-26 03:21:23 +01:00
  • 76d2938ef8 fix flake8 lints ochafik 2024-09-26 02:30:17 +01:00
  • 1b6280102b fix editorconfig lints ochafik 2024-09-26 02:27:46 +01:00
  • 7691654c68 mtgpu: enable VMM (#9597) b3827 R0CKSTAR 2024-09-26 09:27:40 +08:00
  • ab25e3fbf9 tool-call: allow empty message content when there's tool_calls in format_chat ochafik 2024-09-26 02:19:04 +01:00
  • d928ff4dfd server: catch errors in oaicompat_completion_params_parse instead of taking server down ochafik 2024-09-26 02:18:01 +01:00
  • a774093a99 tool-call: add server tests for llama 3.1 ochafik 2024-09-26 02:17:30 +01:00
  • 9e366b3d03 server: fix tailing comma in completions_seed ochafik 2024-09-26 02:15:48 +01:00
  • 45b243b4a5 minja: fix llama_chat_apply_template + adde use_jinja param to validate_model_chat_template ochafik 2024-09-26 02:14:42 +01:00
  • 8d76b0c72c profiler: add support for different outputs Max Krasnyansky 2024-09-25 17:48:48 -07:00
  • 4578c37a92 profiler: make profiler optional with GGML_GRAPH_PROFILER Max Krasnyansky 2024-09-25 15:42:46 -07:00
  • b7ae2d176e profiler: initial support for profiling graph ops Max Krasnyansky 2024-09-25 14:25:13 -07:00
  • e983c9d0de tool-call: fix llama_chat_apply_template signature / test-chat-template ochafik 2024-09-25 22:02:58 +01:00
  • 6ada7e48b5 Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS serhii.n 2024-09-25 22:31:21 +03:00
  • 97d0620968 minja: fetch more templates (add models from test-chat-template) ochafik 2024-09-25 19:22:43 +01:00
  • d15dcfb09d tool-call: add output example to readme ochafik 2024-09-25 19:22:16 +01:00
  • 8b0d3ab5ab fix formatting ZXED 2024-09-25 21:00:46 +03:00
  • 33ea20edd1 Merge remote-tracking branch 'origin/master' into tool-call ochafik 2024-09-25 18:58:54 +01:00
  • 84f56f3c45 vocab : minor style Georgi Gerganov 2024-09-25 20:39:37 +03:00
  • 866c0113fb jina : support v1 reranker Georgi Gerganov 2024-09-25 20:39:25 +03:00
  • c62a39d91e embedding : parse special tokens Georgi Gerganov 2024-09-25 20:36:38 +03:00
  • 8f25531c44 tool-call: add basic usage example to server readme ochafik 2024-09-25 18:00:31 +01:00
  • c795d8b82b add tests ZXED 2024-09-25 19:57:52 +03:00
  • 3722c729b8 server: add repeat penalty sigmoid ZXED 2024-09-15 15:00:52 +03:00
  • 4706bdbae1 tool-call: support Functionary v3 vs. v3-llama3.1 variants ochafik 2024-09-25 17:33:00 +01:00
  • ea9c32be71 ci : fix docker build number and tag name (#9638) Xuan Son Nguyen 2024-09-25 17:26:01 +02:00
  • 41103c0ed6 server: add --chat-template-file ochafik 2024-09-25 16:12:21 +01:00
  • e309c6a47f tool-call: integrate minja & tool-call to server when --jinja is set ochafik 2024-09-25 16:11:58 +01:00
  • 3cfc21ea71 tool-call: basic Functionary 3.2, Llama 3.1, Hermes 2 Pro grammar generators + parsers ochafik 2024-09-25 16:08:29 +01:00
  • 26c175b416 json: build_grammar helper ochafik 2024-09-25 16:06:28 +01:00
  • eaca756ecc minja: minimalist Jinja templating engine for LLM chat templates ochafik 2024-09-25 16:01:18 +01:00
  • 5b6d5040d5 grammar: trigger words + refactor of antiprompts ochafik 2024-09-25 15:51:37 +01:00
  • 41efa86198 singular Sigbjørn Skjæret 2024-09-25 16:17:22 +02:00
  • 7bde9a0452 server : accept /rerank endpoint in addition to /v1/rerank [no ci] Georgi Gerganov 2024-09-25 17:12:32 +03:00
  • 670935f166 fine-grant permissions Xuan Son Nguyen 2024-09-25 16:07:44 +02:00
  • 87793d4666 bump version Sigbjørn Skjæret 2024-09-25 16:06:07 +02:00
  • 28393079fa support inverse chat template Sigbjørn Skjæret 2024-09-25 16:04:49 +02:00
  • ebcbc45711 Add inverse chat template metadata Sigbjørn Skjæret 2024-09-25 16:03:50 +02:00
  • 62a45d12ef rerank : cleanup + comments Georgi Gerganov 2024-09-25 16:58:54 +03:00
  • 6916ed1606 llama : aboud ggml_repeat during classification Georgi Gerganov 2024-09-23 20:20:38 +03:00
  • 6235c62ac9 server : add rerank endpoint Georgi Gerganov 2024-09-19 16:18:30 +03:00
  • 125a0671ab llama : add "rank" pooling type Georgi Gerganov 2024-09-19 13:21:15 +03:00
  • d0a7bf9382 llama : add classigication head (wip) [no ci] Georgi Gerganov 2024-09-18 21:20:21 +03:00
  • dc0cdd8760 llama : read new cls tensors [no ci] Georgi Gerganov 2024-09-17 16:38:38 +03:00
  • 49f90de363 py : fix position embeddings chop [no ci] Georgi Gerganov 2024-09-17 13:53:19 +03:00
  • 77723ed69e py : fix scalar-tensor conversion [no ci] Georgi Gerganov 2024-09-17 13:40:52 +03:00
  • 3453e62bb9 py : add XLMRobertaForSequenceClassification [no ci] Georgi Gerganov 2024-09-16 16:59:17 +03:00
  • a00f6544b8 ci : fix docker build number and tag name Xuan Son Nguyen 2024-09-25 15:37:03 +02:00
  • cd145b11c0 Update src/llama.cpp Georgi Gerganov 2024-09-25 16:19:55 +03:00
  • 95e433ceb5 Update src/llama.cpp Georgi Gerganov 2024-09-25 16:19:49 +03:00
  • 1e43630218 ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels (#9217) b3825 Charles Xu 2024-09-25 15:12:20 +02:00
  • afbbfaa537 server : add more env vars, improve gen-docs (#9635) b3824 Xuan Son Nguyen 2024-09-25 14:05:13 +02:00
  • 900e27f026 Rebase to the latest upstream Charles Xu 2024-09-25 13:58:08 +02:00
  • 7276e2b31c remove prints from the low-level code Charles Xu 2024-09-11 19:10:30 +02:00
  • 8e3ca11184 fix for build errors Charles Xu 2024-09-06 14:01:08 +02:00
  • ef6d397d99 LLAMA_ARG_NO_CONTEXT_SHIFT Xuan Son Nguyen 2024-09-25 12:33:17 +02:00
  • 5d4ca61c58 update server docs Xuan Son Nguyen 2024-09-25 12:27:20 +02:00
  • 3d4c45064d server : add more env vars, improve gen-docs Xuan Son Nguyen 2024-09-25 12:26:12 +02:00
  • baad96a363 ggml : remove assert for AArch64 GEMV and GEMM Q4 kernels Charles Xu 2024-08-28 10:18:31 +02:00
  • d3df98d6ea compress: add cmath Stéphane du Hamel 2024-09-25 12:07:41 +02:00
  • da444fafd7 compress: remove sampling.cpp dependency Stéphane du Hamel 2024-09-25 11:56:47 +02:00
  • 3d6bf6919f llama : add IBM Granite MoE architecture (#9438) b3823 Gabe Goodhart 2024-09-25 01:06:52 -06:00
  • bd95f2c3c2 mtgpu: enable VMM Xiaodong Ye 2024-09-23 08:33:06 +08:00