- Package.swift now supports conditional compilation based on OS
- Allows for package to be used by SPM on Non-Apple platforms
Co-authored-by: Steven Prichard <steven.prichard@justeattakeaway.com>
* Add chat template for command-r model series
* Fix indentation
* Add chat template test for command-r models and update the implementation to trim whitespaces
* Remove debug print
* Fix --split-max-size
Byte size calculation was done on int and overflowed.
* add tests.sh
* add examples test scripts to ci run
Will autodiscover examples/*/tests.sh scripts and run them.
* move WORK_PATH to a subdirectory
* clean up before and after test
* explicitly define which scripts to run
* add --split-max-size to readme
* disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax
* refactor to disable mmap for SYCL backend
* fix compile error in other os
* refactor the solution, use host buf to fix it, instead of disable mmap
* keep to support mmap()
* use host buff to reduce malloc times
* revert to malloc/free solution, for threaad safe
* infill : add download instructions for model
This commit adds instructions on how to download a CodeLlama model
using the `hf.sh` script. This will download the model and place it
in the `models` directory which is the same model use later by the
infill example.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* squash! infill : add download instructions for model
Clarify the reason for using CodeLlama.
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
---------
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
* Remove split metadata when quantize model shards
* Find metadata key by enum
* Correct loop range for gguf_remove_key and code format
* Free kv memory
---------
Co-authored-by: z5269887 <z5269887@unsw.edu.au>
* Refactor Error Handling for CUDA
Add guidance for setting CUDA_DOCKER_ARCH to match GPU compute capability for CUDA versions < 11.7. Include link to NVIDIA's CUDA GPUs documentation for compute capability reference.
* Update Makefile
Improved wording
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>