2484 commits

Author SHA1 Message Date
Pierrick HYMBERT
dbd969142e build: move the make build with env LLAMA_CURL to a dedicated place 2024-03-16 22:01:19 +01:00
Pierrick HYMBERT
d81acb6847 build: introduce cmake option LLAMA_CURL to trigger libcurl linking to be coherent with the make toolchain 2024-03-16 21:59:53 +01:00
Pierrick HYMBERT
e6848ab0e6 build: move the make build with env LLAMA_CURL to a dedicated place 2024-03-16 21:53:07 +01:00
Pierrick HYMBERT
22b3bb3ceb common: fix windows build caused by double windows.h import 2024-03-16 21:50:37 +01:00
Pierrick HYMBERT
1ad5a45210 ci: build: add libcurl in default make toolchain step for tests 2024-03-16 20:06:18 +01:00
Pierrick HYMBERT
78812c6d63 llama_load_model_from_url: PR feedback, use snprintf instead of strncpy and strncat 2024-03-16 20:02:34 +01:00
Pierrick HYMBERT
5df5605b02 ci: build: add libcurl in default make toolchain step 2024-03-16 19:52:11 +01:00
Pierrick HYMBERT
176f039a91 ci: tests: windows tests add libcurl 2024-03-16 19:51:44 +01:00
Pierrick HYMBERT
838178a196 ci: tests: windows tests add libcurl 2024-03-16 18:34:53 +01:00
Pierrick HYMBERT
064dc076bb common: CMakeLists.txt fix typo in logging when lib curl is not found 2024-03-16 18:34:36 +01:00
Pierrick HYMBERT
124c474bba llama_load_model_from_url: coherent clearer logging 2024-03-16 18:24:21 +01:00
Pierrick HYMBERT
4fadb072e9 server: tests: add --model-url tests 2024-03-16 18:15:41 +01:00
Pierrick HYMBERT
545fef6e0e llama_load_model_from_url: fix compilation warning, clearer logging 2024-03-16 18:01:55 +01:00
Pierrick Hymbert
b0b49e0bb8
Update examples/main/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:48 +01:00
Pierrick Hymbert
eb9e52a218
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:38 +01:00
Pierrick Hymbert
be561a7ffd
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:32 +01:00
Pierrick Hymbert
89ab37a261
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:27 +01:00
Pierrick Hymbert
330e28df08
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:20 +01:00
Pierrick Hymbert
9565ae3187
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:10 +01:00
Pierrick Hymbert
f22456d8c3
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:48:02 +01:00
Pierrick Hymbert
b088122719
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:47:04 +01:00
Pierrick Hymbert
f53bfd56af
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:46:53 +01:00
Pierrick Hymbert
8751bd0c82
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:46:46 +01:00
Pierrick Hymbert
4bc47b75ca
Update common/common.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:46:34 +01:00
Pierrick Hymbert
e84206d132
Update examples/server/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 17:46:18 +01:00
Pierrick HYMBERT
1430e895fc Merge branch 'master' into hp/download-model-from-hf
# Conflicts:
#	common/common.cpp
2024-03-16 16:57:24 +01:00
AmirAli Mirian
c47cf414ef
ggml : add AVX512F SIMD (#6088) 2024-03-16 17:52:02 +02:00
Pierrick HYMBERT
6633689fa5 llama_load_model_from_url: cleanup code 2024-03-16 16:49:44 +01:00
Daniel Bevenius
b5f4ae09c3
gritlm : add initial README.md (#6086)
* gritlm: add initial README.md to examples/gritlm

This commit adds a suggestion for an initial README.md for the gritlm
example.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! gritlm: add initial README.md to examples/gritlm

Use the `scripts/hf.sh` script to download the model file.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! gritlm: add initial README.md to examples/gritlm

Fix editorconfig-checker error in examples/gritlm/README.md.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

---------

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-03-16 17:46:29 +02:00
Xuan Son Nguyen
dfbfdd60f9
readme : add wllama as a wasm binding (#6100) 2024-03-16 17:42:08 +02:00
DAN™
15961ec04d
common : refactor nested if causing error C1061 on MSVC (#6101)
* Refactor nested if causing error C1061 on MSVC.

* Revert back and remove else's.

* Add flag to track found arguments.
2024-03-16 17:39:15 +02:00
Pierrick HYMBERT
921e4af930 ci: build, fix the default build to use LLAMA_CURL 2024-03-16 16:29:08 +01:00
Pierrick HYMBERT
5d99f3224f llama_load_model_from_url: download the file only if modified based on etag and last-modified http headers 2024-03-16 16:27:48 +01:00
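The commit above downloads the model only when the remote file has changed, judging by the `etag` and `last-modified` HTTP headers. The check can be sketched as below. This is a minimal illustration and not the code from the commit: the function name `needs_download`, the dictionary layout, and the assumption that header names are already lower-cased are all hypothetical.

```python
from typing import Mapping, Optional


def needs_download(cached: Optional[Mapping[str, str]],
                   remote: Mapping[str, str]) -> bool:
    """Decide whether a cached file must be fetched again.

    `cached` holds the validator headers saved with the previous download
    (None if nothing was downloaded yet); `remote` holds the headers from a
    fresh response. Both layouts are assumptions made for this sketch.
    """
    if cached is None:
        return True  # nothing on disk yet
    # Prefer the ETag; fall back to Last-Modified when no ETag is available.
    for key in ("etag", "last-modified"):
        old, new = cached.get(key), remote.get(key)
        if old and new:
            return old != new
    # No validator that both sides share: download to be safe.
    return True
```

For example, `needs_download({"etag": '"abc"'}, {"etag": '"abc"'})` returns `False`, so the cached file would be reused, while a differing `etag` or a missing validator triggers a new download.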
Pierrick HYMBERT
4135d4a505 llama_load_model_from_url: typo 2024-03-16 16:27:48 +01:00
Pierrick Hymbert
2c3a00e270
Update Makefile
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-16 15:40:29 +01:00
Pierrick HYMBERT
80bec9890a llama_load_model_from_url: try to make the windows build passing 2024-03-16 14:08:21 +01:00
Pierrick HYMBERT
df0d82289c ci: compile the server with curl, add make option curl example in default cmake 2024-03-16 13:52:45 +01:00
Pierrick HYMBERT
7e782856bd common: LLAMA_USE_CURL in make toolchain 2024-03-16 13:45:09 +01:00
Pierrick HYMBERT
42b25dacab common: PR feedback, rename the definition to LLAMA_USE_CURL 2024-03-16 13:27:05 +01:00
Pierrick Hymbert
a56d09a440
ci : close inactive issue with workflow (#6053)
* issues: ci - close inactive issue with workflow

* ci: close issue, change workflow schedule time
2024-03-16 14:20:53 +02:00
Pierrick HYMBERT
a0ebdfcc5d common: llama_load_model_from_url switch to libcurl dependency 2024-03-16 12:27:08 +01:00
Pierrick HYMBERT
3221ab01ad common: introduce llama_load_model_from_url to download model from hf url using libopenssl only 2024-03-16 09:59:14 +01:00
slaren
d84c48505f
llama : fix Baichuan2 13B (#6092) 2024-03-15 23:14:16 +02:00
Theia Vogel
877b4d0c62
llama : add support for control vectors (#5970)
* control vector api and implementation

* control-vectors : minor code style updates

* disable control vector when data == nullptr

use -1 for disabled range (also on init) in case we ever support controlling layer 0 (embeddings)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-15 22:43:02 +02:00
Andrew Canis
12247f4c69
llama : add Command-R support (#6033)
Information about the Command-R 35B model (128k context) can be found at:
	https://huggingface.co/CohereForAI/c4ai-command-r-v01

Based on the llama2 model with a few changes:

1) New hyper parameter to scale output logits (logit_scale)
2) Uses LayerNorm instead of RMSNorm
3) Transformer layers have a single shared LayerNorm that feeds into both the
   self-attention and FFN layers in parallel. There is no post-attention LayerNorm.
4) No support for Rotary Position Embeddings (RoPE) scaling
5) No biases used

Find GGUF files here:
	https://huggingface.co/andrewcanis/c4ai-command-r-v01-GGUF

To convert model to GGUF format yourself:

1) Download Command-R Hugging Face safetensors:
	git lfs install
	git clone https://huggingface.co/CohereForAI/c4ai-command-r-v01

2) Run:
	python3 convert-hf-to-gguf.py --outtype f16 ./c4ai-command-r-v01
2024-03-15 22:41:22 +02:00
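Points 1) to 3) of the Command-R commit message above can be illustrated with a toy sketch. It is not the llama.cpp implementation: vectors are plain Python lists, learned norm weights are left out, `attn` and `ffn` are hypothetical stand-ins for the real sub-layers, and multiplying the logits by `logit_scale` is an assumption about how the scale is applied.

```python
import math
from typing import Callable, List

Vec = List[float]


def rms_norm(x: Vec, eps: float = 1e-5) -> Vec:
    """RMSNorm: rescale only, no mean subtraction."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def layer_norm(x: Vec, eps: float = 1e-5) -> Vec:
    """LayerNorm without bias: subtract the mean, then rescale."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def parallel_block(x: Vec,
                   attn: Callable[[Vec], Vec],
                   ffn: Callable[[Vec], Vec]) -> Vec:
    """One shared LayerNorm feeds attention and FFN in parallel.

    Both branch outputs are added to the residual; there is no second
    (post-attention) norm between them.
    """
    h = layer_norm(x)
    a, f = attn(h), ffn(h)
    return [xi + ai + fi for xi, ai, fi in zip(x, a, f)]


def scale_logits(logits: Vec, logit_scale: float) -> Vec:
    """Apply the extra hyperparameter to the output logits (assumed multiply)."""
    return [l * logit_scale for l in logits]
```

A sequential llama2-style block would instead normalize, run attention, add the residual, normalize again, and only then run the FFN; the sketch shows why the parallel form needs just one norm per layer.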
Ting Lou
4e9a7f7f7f
llava : change API to pure C style for Rust FFI bindgen (#6079)
Co-authored-by: Lou Ting <louting.t@alibaba-inc.com>
2024-03-15 16:31:05 +02:00
slaren
3020327f6c
cuda : disable unused cudaLaunchHostFunc code (#6078) 2024-03-15 14:24:03 +02:00
Neo Zhang Jianyu
46acb36767
fix set main gpu error (#6073) 2024-03-15 18:53:53 +08:00
Georgi Gerganov
131b058409
make : ggml-metal.o depends on ggml.h 2024-03-15 11:38:40 +02:00
AidanBeltonS
753e36f650
[SYCL] Fix non-intel device selection (#6042)
* Fix non-intel device selection

* Update ggml-sycl.cpp

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

* Update ggml-sycl.cpp

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-03-15 14:56:20 +05:30