ochafik
d274ffcc95
build: Add missing optional include for gcc
2025-01-28 09:29:31 +00:00
ochafik
0a51e514f6
Update test-chat-handler.cpp
2025-01-28 09:24:35 +00:00
Olivier Chafik
2f99236f77
Tool-call: do last partial parse upon limit stop
2025-01-28 09:23:19 +00:00
Olivier Chafik
6d5682909f
Clean up dead code in llama_3_1 tool call code
2025-01-28 09:22:26 +00:00
Olivier Chafik
62717145f7
Allow tool use + streaming
2025-01-28 09:22:03 +00:00
Michael Engel
2b8525d5c8
Handle missing model in CLI parameters for llama-run (#11399)
The HTTP client in llama-run only prints an error when the download of
a resource fails. If the model name is missing from the CLI parameter list,
this causes the application to crash.
To prevent this, a check for the required model parameter has been
added, and errors from resource downloads are now propagated to the caller.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-01-28 08:32:40 +00:00
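A minimal sketch of the pattern this entry describes, using hypothetical helper names (Options, parse_args, download_resource) rather than llama-run's actual internals: the required model parameter is validated up front, and download failures are reported through return values so the caller can propagate them instead of crashing.

```cpp
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for llama-run internals (not the real code):
// a parsed-options struct and a downloader that reports failure through
// its return value instead of only printing an error.
struct Options {
    std::string model; // required
};

static bool download_resource(const std::string & url, std::string & out_path) {
    // stub: a real implementation would perform the HTTP download and set
    // out_path, returning false if anything goes wrong
    (void) url; (void) out_path;
    return false;
}

static std::optional<Options> parse_args(const std::vector<std::string> & args) {
    Options opts;
    for (size_t i = 0; i < args.size(); i++) {
        if (args[i] == "--model" && i + 1 < args.size()) {
            opts.model = args[++i];
        }
    }
    if (opts.model.empty()) {
        fprintf(stderr, "error: the required model parameter is missing\n");
        return std::nullopt; // reject early instead of crashing later on
    }
    return opts;
}

int main(int argc, char ** argv) {
    const std::vector<std::string> args(argv + 1, argv + argc);
    const auto opts = parse_args(args);
    if (!opts) {
        return 1;
    }
    std::string path;
    if (!download_resource(opts->model, path)) {
        fprintf(stderr, "error: failed to download '%s'\n", opts->model.c_str());
        return 1; // propagate the failure to the caller via the exit code
    }
    return 0;
}
```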
ochafik
ef9efc9ed3
Fix Llama 3.1 (incl. constrained builtin tools e.g. <|python_tag|>foo.call(arg=value))
2025-01-28 01:04:06 +00:00
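The commit title above shows the builtin-tool syntax used by Llama 3.1: <|python_tag|> followed by a name.call(...) expression. As a rough illustration only, the sketch below pulls the pieces out of such a string with a regex; the tool name, argument, and the regex itself are illustrative assumptions, not llama.cpp's actual parsing code.

```cpp
#include <cstdio>
#include <regex>
#include <string>

// Rough illustration only: pull the tool name and a single keyword argument
// out of a Llama-3.1-style builtin tool call such as
//   <|python_tag|>brave_search.call(query="llama.cpp")
// The real handling in llama.cpp is grammar-constrained and more general.
int main() {
    const std::string output = "<|python_tag|>brave_search.call(query=\"llama.cpp\")";

    static const std::regex re(R"re(<\|python_tag\|>(\w+)\.call\((\w+)="([^"]*)"\))re");
    std::smatch m;
    if (std::regex_match(output, m, re)) {
        printf("tool: %s, %s = \"%s\"\n",
               m[1].str().c_str(), m[2].str().c_str(), m[3].str().c_str());
    }
    return 0;
}
```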
ochafik
2d607f1a68
Update test-chat-handler.cpp
2025-01-27 23:29:28 +00:00
ochafik
b565ab2ab1
comment out broken tests in test_tool_call.py
2025-01-27 23:02:15 +00:00
ochafik
cafea60922
Split e2e test_tool_call from test_chat_completion
2025-01-27 22:46:33 +00:00
ochafik
90effb845f
Pass grammar laziness all the way down to sampler (need to print special trigger tokens e.g. for Nemo even w/ tool_choice=required)
2025-01-27 22:46:17 +00:00
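"Grammar laziness" here means the grammar does not constrain sampling until a trigger token has been produced, so the trigger (e.g. Mistral Nemo's [TOOL_CALLS]) can still be generated and printed even with tool_choice=required. A conceptual sketch, with illustrative names rather than llama.cpp's sampler API:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Conceptual sketch of a "lazy" grammar: it is only activated once a trigger
// word has been emitted, so the trigger itself is sampled and printed
// unconstrained. Names are illustrative, not the actual llama.cpp API.
struct lazy_grammar {
    bool lazy      = true;   // do not constrain until triggered
    bool triggered = false;
    std::vector<std::string> trigger_words = { "[TOOL_CALLS]" };

    bool constrains(const std::string & piece) {
        if (!lazy || triggered) {
            return true;            // grammar is active: filter candidate tokens
        }
        for (const auto & w : trigger_words) {
            if (piece.find(w) != std::string::npos) {
                triggered = true;   // from now on, apply the grammar
            }
        }
        return false;               // before the trigger: sample freely
    }
};

int main() {
    lazy_grammar g;
    for (const std::string piece : { "Sure, ", "[TOOL_CALLS]", "{\"name\":" }) {
        printf("%-14s -> grammar %s\n", piece.c_str(), g.constrains(piece) ? "active" : "inactive");
    }
    return 0;
}
```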
ochafik
ad229783c5
updated tool call example to be less ambiguous (deepseek likes to rant about hello world)
2025-01-27 22:44:44 +00:00
ochafik
fa065eb095
Rehabilitate test_format_detection
2025-01-27 20:46:03 +00:00
ochafik
add9124115
fix test-chat-handler grammar tests
2025-01-27 20:13:09 +00:00
Eric Curtin
a4417ddda9
Add new hf protocol for ollama (#11449)
https://huggingface.co/docs/hub/en/ollama
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-27 19:36:10 +01:00
ochafik
118f799ae4
DeepSeek-R1: implement grammar constraints
2025-01-27 17:52:46 +00:00
ochafik
92ac336dfa
Prepare DeepSeek-R1-Distill-Llama-8B support
2025-01-27 17:26:43 +00:00
ochafik
09971e626c
Update test_chat_completion.py
2025-01-27 15:43:03 +00:00
ochafik
67709552ad
tool-call: compact json output to cap # tokens generated
2025-01-27 15:42:27 +00:00
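The point of the compact output is that serialized tool-call JSON with no indentation or newlines costs fewer generated tokens for the same payload. A small sketch using nlohmann::json, which llama.cpp's server code uses; the example object is arbitrary:

```cpp
#include <nlohmann/json.hpp>
#include <cstdio>

// Compact vs. pretty-printed serialization: the compact form contains no
// indentation or newlines, so fewer characters (and thus fewer tokens) are
// generated for the same payload.
int main() {
    nlohmann::json args = {
        {"location", "Istanbul"},
        {"unit", "celsius"},
    };

    const std::string compact = args.dump();   // {"location":"Istanbul","unit":"celsius"}
    const std::string pretty  = args.dump(2);  // multi-line, indented by 2 spaces

    printf("compact: %zu chars\npretty : %zu chars\n", compact.size(), pretty.size());
    return 0;
}
```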
ochafik
57f40e366b
tool-call: fix lazy grammar & mixed content + tool calls parsing
2025-01-27 15:41:54 +00:00
ochafik
2efa0c27bf
tool-call: add weather tool e2e tests
2025-01-27 15:02:09 +00:00
ochafik
15ec01e896
jinja: only add special tokens if template doesn't seem to handle them
2025-01-27 14:28:11 +00:00
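A sketch of the heuristic this commit title describes, assuming a simple substring check (the actual llama.cpp logic may differ): if the Jinja chat template already emits the BOS token itself, the caller should not add it again, otherwise the prompt would start with a duplicated BOS.

```cpp
#include <cstdio>
#include <string>

// Illustrative heuristic: does the chat template appear to handle the BOS
// token itself? If so, do not add it a second time when tokenizing.
static bool template_handles_bos(const std::string & tmpl, const std::string & bos_token) {
    return !bos_token.empty() && tmpl.find(bos_token) != std::string::npos;
}

int main() {
    const std::string bos  = "<|begin_of_text|>";
    const std::string tmpl = "{{- '<|begin_of_text|>' }}{%- for message in messages %}...{%- endfor %}";

    const bool add_bos = !template_handles_bos(tmpl, bos);
    printf("add_special(bos) = %s\n", add_bos ? "true" : "false"); // false: the template adds it
    return 0;
}
```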
ochafik
da606d8d41
tool-call: remove nonsensical code_interpreter code
2025-01-27 14:19:20 +00:00
Haus1
d6d24cd9ed
AMD: parse the architecture as supplied by gcnArchName (#11244)
The value provided by minor doesn't include the stepping for AMD; parse the value returned by gcnArchName instead to retrieve an accurate ID.
2025-01-27 14:58:17 +01:00
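A rough sketch of what parsing gcnArchName can look like, assuming strings of the form "gfx90a:sramecc+:xnack-"; the exact parsing in the HIP backend may differ. It illustrates why the full string is needed: the minor device property alone drops the stepping.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Rough sketch of deriving an architecture ID from gcnArchName (as returned
// in hipDeviceProp_t), e.g. "gfx90a:sramecc+:xnack-" -> 0x90a.
static int parse_arch_id(const std::string & gcn_arch_name) {
    const std::string prefix = "gfx";
    if (gcn_arch_name.rfind(prefix, 0) != 0) {
        return -1; // unexpected format
    }
    // the ID runs from after "gfx" up to the first ':' (feature flags follow)
    const size_t end = gcn_arch_name.find(':');
    const std::string id = gcn_arch_name.substr(prefix.size(),
        end == std::string::npos ? std::string::npos : end - prefix.size());
    return (int) strtol(id.c_str(), nullptr, 16); // "90a" -> 0x90a
}

int main() {
    printf("0x%x\n", parse_arch_id("gfx90a:sramecc+:xnack-")); // 0x90a
    printf("0x%x\n", parse_arch_id("gfx906:sramecc+:xnack-")); // 0x906
    return 0;
}
```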
lexasub
a5203b4465
llama : minor fixes to speed up llama model loading (#11448)
* impl::load: change the bpe_ranks map to an unordered map, reducing impl::load time by 30%
* llama_model_loader::init_mapping: replace new llama_mmap with std::make_unique<llama_mmap> for cleaner code and to roughly halve the time spent in init_mappings (see the sketch after this entry)
* Update src/llama-vocab.cpp
---------
Co-authored-by: lexasub <empty@empty.ru>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-27 14:42:09 +01:00
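A condensed sketch of the two changes listed in the entry above, using simplified stand-in types rather than the real llama.cpp structures: an unordered map (with a hash for the pair key) makes BPE-rank lookups O(1) on average instead of O(log n), and std::make_unique replaces a bare new for the mmap object.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-ins; the real llama.cpp types are more involved.
struct llama_mmap_stub { /* ... */ };

struct pair_hash {
    size_t operator()(const std::pair<std::string, std::string> & p) const {
        return std::hash<std::string>{}(p.first) ^ (std::hash<std::string>{}(p.second) << 1);
    }
};

int main() {
    // bpe_ranks: pair of merge parts -> rank; an unordered_map makes the
    // per-merge lookups during vocab load O(1) on average
    std::unordered_map<std::pair<std::string, std::string>, int, pair_hash> bpe_ranks;
    bpe_ranks[{"h", "e"}]   = 0;
    bpe_ranks[{"he", "llo"}] = 1;

    // init_mapping: prefer make_unique over a raw new
    std::unique_ptr<llama_mmap_stub> mapping = std::make_unique<llama_mmap_stub>();
    (void) mapping;

    return bpe_ranks.at({"h", "e"});
}
```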
ochafik
bddc1bebcc
tool-call: fix special handling of special trigger tokens (Nemo)
2025-01-27 11:37:41 +00:00
Johannes Gäßler
df984e0147
llama: refactor llama_decode_impl (#11381)
2025-01-27 12:07:12 +01:00
Ihar Hrachyshka
acd38efee3
metal: Handle null returned from MTLCreateSystemDefaultDevice() (#11441)
This fixes a segmentation fault when running the tests and no Metal
devices are available (for example, when not linked against the Core
Graphics framework).
2025-01-27 09:41:59 +02:00
ochafik
ca0c837b6a
nits
2025-01-27 01:08:29 +00:00
ochafik
f7078cab36
tool-call: fix functionary v3.1 required test
2025-01-26 23:23:09 +00:00
Xuan Son Nguyen
caf773f249
docker : fix ARM build and Vulkan build (#11434)
* ci : do not fail-fast for docker
* build arm64/amd64 separately
* fix pip
* no fast fail
* vulkan: try jammy
2025-01-26 22:45:32 +01:00
ochafik
5ec4c5e4d3
reshuffle chat handlers
2025-01-26 21:38:07 +00:00
ochafik
43385b2ff2
sync: minja
2025-01-26 21:36:25 +00:00
Georgi Gerganov
178a7eb952
metal : use residency sets (#11427)
* metal : use residency sets
ggml-ci
* metal : restore commandBufferWithUnretainedReferences calls [no ci]
* metal : release descriptors
ggml-ci
* metal : check env GGML_METAL_NO_RESIDENCY
ggml-ci
* metal : fix build + clean-up
ggml-ci
2025-01-26 20:06:16 +02:00
Nuno
6f53d8a6b4
docker: add missing vulkan library to base layer and update to 24.04 (#11422)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-26 18:22:43 +01:00
bandoti
19f65187cb
cmake: add ggml find package (#11369)
* Add initial ggml cmake package
* Add build numbers to ggml find-package
* Expand variables with GGML_ prefix
* Guard against adding to cache variable twice
* Add git to msys2 workflow
* Handle ggml-cpu-* variants
* Link ggml/ggml-base libraries to their targets
* Replace main-cmake-pkg with simple-cmake-pkg
* Interface features require c_std_90
* Fix typo
* Removed unnecessary bracket from status message
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-26 12:07:48 -04:00
ochafik
11594557e3
Merge branch 'tool-call' into tool-call-handler
2025-01-26 15:32:53 +00:00
ochafik
3f3fc03983
nit: trailing spaces
2025-01-26 15:32:13 +00:00
Frank Mai
1d8ee06000
rpc: fix register position (#11424)
Signed-off-by: thxCode <thxcode0824@gmail.com>
2025-01-26 16:20:34 +01:00
Georgi Gerganov
2cc9b8c32c
readme : update hot topics
2025-01-26 14:30:15 +02:00
Jeff Bolz
f35726c2fb
build: apply MSVC /bigobj option to c/cpp files only (#11423)
2025-01-26 03:10:03 +01:00
Jeff Bolz
4a75d19376
vulkan: compile shaders on-demand (#11406)
Reduce first-run startup time and memory consumption.
Should fix #11339.
2025-01-25 22:29:57 +01:00
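A generic sketch of the on-demand idea behind this change (not the actual Vulkan backend code): pipelines are compiled and cached the first time they are used instead of all at startup, which is where the first-run time and memory savings come from. Names and the integer "handle" are stand-ins.

```cpp
#include <cstdio>
#include <string>
#include <unordered_map>

// Generic lazy-compilation cache; "compile" here stands in for building a
// Vulkan pipeline from a shader. Nothing is compiled at startup; each
// pipeline is built on first use and then reused.
struct pipeline_cache {
    std::unordered_map<std::string, int> pipelines; // name -> handle (stub)

    int get(const std::string & name) {
        auto it = pipelines.find(name);
        if (it != pipelines.end()) {
            return it->second;                       // already compiled
        }
        printf("compiling %s on first use\n", name.c_str());
        const int handle = (int) pipelines.size();   // stand-in for real pipeline creation
        pipelines[name] = handle;
        return handle;
    }
};

int main() {
    pipeline_cache cache;
    cache.get("matmul_f16");  // compiled here, not at startup
    cache.get("matmul_f16");  // cache hit
    cache.get("softmax_f32"); // compiled only because it is actually used
    return 0;
}
```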
uvos
26771a1491
Hip: disable VMM on HIP as it seems it doesn't work in some configurations (#11420)
2025-01-25 21:01:12 +01:00
Jeff Bolz
ca6baf76c1
build: add /bigobj to MSVC build (#11407)
2025-01-25 11:26:37 -06:00
Diego Devesa
6e264a905b
docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419)
2025-01-25 17:22:41 +01:00
Xuan Son Nguyen
49b0e3cec4
server : fix cleaning up stream task (#11418)
* server : fix cleaning up stream task
* one more spot
2025-01-25 16:36:44 +01:00
Diego Devesa
20a758155b
docker : fix CPU ARM build (#11403)
* docker : fix CPU ARM build
* add CURL to other builds
2025-01-25 15:22:29 +01:00
Georgi Gerganov
00c24acb2a
ci : fix line breaks on windows builds (#11409)
* ci : fix line breaks on windows builds
* cont : another try
* ci : fix powershell line breaks
2025-01-25 13:36:48 +02:00
Olivier Chafik
51b7aab841
Update test_chat_completion.py
2025-01-25 04:57:40 +00:00
Olivier Chafik
a6463c1e35
jinja: don't add bos when jinja enabled
2025-01-25 04:52:42 +00:00