Michal Moskal
437ff3178c
add final newline
2025-01-26 13:04:36 -08:00
Michal Moskal
00fcd984d5
include <cmath> for INFINITY
2025-01-26 12:36:06 -08:00
Michal Moskal
1afc53a338
fix warning
2025-01-26 12:33:11 -08:00
Michal Moskal
08fefd1d7c
fix whitespace
2025-01-26 12:30:02 -08:00
Michal Moskal
efc36c9acf
add $LLGUIDANCE_LOG_LEVEL support
2025-01-26 10:15:22 -08:00
Michal Moskal
c9e9853e6c
format file
2025-01-26 10:11:39 -08:00
Michal Moskal
44e1973af0
update llg
2025-01-26 10:09:57 -08:00
Michal Moskal
ca88ce7b77
llama_tokenizer() in fact requires valid utf8
2025-01-26 10:09:51 -08:00
Georgi Gerganov
178a7eb952
metal : use residency sets ( #11427 )
* metal : use residency sets
ggml-ci
* metal : restore commandBufferWithUnretainedReferences calls [no ci]
* metal : release descriptors
ggml-ci
* metal : check env GGML_METAL_NO_RESIDENCY
ggml-ci
* metal : fix build + clean-up
ggml-ci
2025-01-26 20:06:16 +02:00
Michal Moskal
8e027f8dcd
align tests with LLG grammar syntax and JSON Schema spec
2025-01-26 09:59:31 -08:00
Nuno
6f53d8a6b4
docker: add missing vulkan library to base layer and update to 24.04 ( #11422 )
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-26 18:22:43 +01:00
Michal Moskal
0a211fcb9d
add gh action for llg test
2025-01-26 09:06:38 -08:00
Michal Moskal
c7ebf57822
rename llguidance test file to test-grammar-llguidance.cpp
2025-01-26 08:54:56 -08:00
Michal Moskal
29375376fe
conditionally include llguidance test based on LLAMA_LLGUIDANCE flag
2025-01-26 08:53:49 -08:00
Michal Moskal
16a5484048
gbnf -> lark syntax
2025-01-26 08:50:59 -08:00
Michal Moskal
f245ca26f5
build and run test
2025-01-26 08:49:05 -08:00
Michal Moskal
036b91fbc3
fix ref-count bug
2025-01-26 08:48:53 -08:00
Michal Moskal
58006ddb13
clang fmt
2025-01-26 08:20:26 -08:00
Michal Moskal
3675050804
copy test-grammar-integration.cpp to test-llguidance.cpp
2025-01-26 08:18:10 -08:00
Michal Moskal
a7be6669b1
pass vocab not model to llama_sampler_init_llg()
2025-01-26 08:16:56 -08:00
bandoti
19f65187cb
cmake: add ggml find package ( #11369 )
* Add initial ggml cmake package
* Add build numbers to ggml find-package
* Expand variables with GGML_ prefix
* Guard against adding to cache variable twice
* Add git to msys2 workflow
* Handle ggml-cpu-* variants
* Link ggml/ggml-base libraries to their targets
* Replace main-cmake-pkg with simple-cmake-pkg
* Interface features require c_std_90
* Fix typo
* Removed unnecessary bracket from status message
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-26 12:07:48 -04:00
Michal Moskal
de269a1833
fix tests when llg is enabled
2025-01-26 08:02:37 -08:00
Frank Mai
1d8ee06000
rpc: fix register position ( #11424 )
Signed-off-by: thxCode <thxcode0824@gmail.com>
2025-01-26 16:20:34 +01:00
Georgi Gerganov
2cc9b8c32c
readme : update hot topics
2025-01-26 14:30:15 +02:00
Michal Moskal
8cb12d43d6
remove llguidance.h from .gitignore
2025-01-25 20:45:59 -08:00
Michal Moskal
2a92bfbe06
code style fixes
2025-01-25 20:43:33 -08:00
Michal Moskal
adc4aed0af
clarify docs
2025-01-25 20:35:41 -08:00
Michal Moskal
b5399d44c2
add some docs
2025-01-25 20:27:07 -08:00
Jeff Bolz
f35726c2fb
build: apply MSVC /bigobj option to c/cpp files only ( #11423 )
2025-01-26 03:10:03 +01:00
Michal Moskal
afb6cac5ab
use '%llguidance' as marker to enable llg lark syntax
2025-01-25 16:57:28 -08:00
Michal Moskal
f4dc4b89fa
build: integrate llguidance as an external project
2025-01-25 15:49:23 -08:00
Michal Moskal
f19655c4c0
update for new APIs
2025-01-25 15:49:07 -08:00
Michal Moskal
76290d9ea0
initial porting of previous LLG patch
2025-01-25 14:43:57 -08:00
Jeff Bolz
4a75d19376
vulkan: compile shaders on-demand ( #11406 )
Reduce first-run startup time and memory consumption.
Should fix #11339.
2025-01-25 22:29:57 +01:00
uvos
26771a1491
Hip: disable VMM on hip as it seems that it doesn't work in some configurations ( #11420 )
2025-01-25 21:01:12 +01:00
Jeff Bolz
ca6baf76c1
build: add /bigobj to MSVC build ( #11407 )
2025-01-25 11:26:37 -06:00
Diego Devesa
6e264a905b
docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for ( #11419 )
2025-01-25 17:22:41 +01:00
Xuan Son Nguyen
49b0e3cec4
server : fix cleaning up stream task ( #11418 )
* server : fix cleaning up stream task
* one more spot
2025-01-25 16:36:44 +01:00
Diego Devesa
20a758155b
docker : fix CPU ARM build ( #11403 )
* docker : fix CPU ARM build
* add CURL to other builds
2025-01-25 15:22:29 +01:00
Georgi Gerganov
00c24acb2a
ci : fix line breaks on windows builds ( #11409 )
* ci : fix line breaks on windows builds
* cont : another try
* ci : fix powershell line breaks
2025-01-25 13:36:48 +02:00
jiahao su
466ea66f33
CANN: Add Ascend CANN build ci ( #10217 )
* CANN: Add Ascend CANN build ci
* Update build.yml
* Modify cann image version
* Update build.yml
* Change to run on x86 system
* Update build.yml
* Update build.yml
* Modify format error
* Update build.yml
* Add 'Ascend NPU' label restrictions
* Exclude non PR event
Co-authored-by: Yuanhao Ji <jiyuanhao@apache.org>
* Update build.yml
---------
Co-authored-by: Yuanhao Ji <jiyuanhao@apache.org>
2025-01-25 00:26:01 +01:00
uvos
5f0db9522f
hip : Add hipGraph and VMM support to ROCM ( #11362 )
* Add hipGraph support
* Enable VMM on rocm
2025-01-25 00:02:23 +01:00
Johannes Gäßler
c5d9effb49
CUDA: fix FP16 cuBLAS GEMM ( #11396 )
2025-01-24 21:02:43 +01:00
uvos
9fbadaef4f
rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna ( #11356 )
2025-01-24 17:50:49 +01:00
Georgi Gerganov
9755129c27
release : pack /lib in the packages ( #11392 )
* release : pack /lib and /include in the packages
* cmake : put libs in /bin
* TMP : push artifacts
* Revert "TMP : push artifacts"
This reverts commit 4decf2c4df.
* ci : fix HIP cmake compiler options to be on first line
* ci : restore the original HIP commands
* ci : change ubuntu build from latest to 20.04
* ci : try to fix macos build rpaths
* ci : remove obsolete MacOS build
* TMP : push artifacts
* ci : change back to ubuntu latest
* ci : macos set build rpath to "@loader_path"
* ci : fix typo
* ci : change ubuntu package to 22.04
* Revert "TMP : push artifacts"
This reverts commit 537b09e70f.
2025-01-24 18:41:30 +02:00
Jafar Uruç
a07c2c8a52
docs : Update readme to build targets for local docker build ( #11368 )
2025-01-24 14:30:13 +01:00
Johannes Gäßler
8137b4bb2b
CPU/CUDA: fix (GQA) mul mat back, add CUDA support ( #11380 )
2025-01-24 12:38:31 +01:00
Bernhard M. Wiedemann
1af6945eb0
cmake : avoid -march=native when reproducible build is wanted ( #11366 )
See https://reproducible-builds.org/ for why this is good
and https://reproducible-builds.org/specs/source-date-epoch/
for the definition of this variable.
Without this patch, compiling on different machines produced different binaries, which made verification of results difficult.
Fixes : #11317
This patch was done while working on reproducible builds for openSUSE.
2025-01-24 13:21:35 +02:00
Eric Curtin
01f37edf1a
Update llama-run README.md ( #11386 )
For consistency
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-24 09:39:24 +00:00
stduhpf
c07e87f38b
server : (webui) put DeepSeek R1 CoT in a collapsible <details> element ( #11364 )
* webui : put DeepSeek R1 CoT in a collapsible <details> element
* webui: refactor split
* webui: don't use regex to split cot and response
* webui: format+qol
* webui: no loading icon if the model isn't generating
* ui fix, add configs
* add jsdoc types
* only filter </think> for assistant msg
* build
* update build
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-01-24 09:02:38 +01:00