Commit graph

2715 commits

Author SHA1 Message Date
Julia Longtin
7a00422fa3 try to use vectorized zeroing function. 2024-06-09 18:01:49 +00:00
Julia Longtin
2870bfc6dd add missing variable. 2024-06-09 18:01:48 +00:00
Julia Longtin
656bf28c91 copy right block. 2024-06-09 18:01:48 +00:00
Julia Longtin
e99f3a9bf4 fix typo. 2024-06-09 18:01:48 +00:00
Julia Longtin
84093a6be6 promote aux16 into a vector. (part three) 2024-06-09 18:01:48 +00:00
Julia Longtin
66d26d4914 promote aux16 into a vector. 2024-06-09 18:01:48 +00:00
Julia Longtin
2f0a949ae0 promote aux16 into a vector. 2024-06-09 18:01:48 +00:00
Julia Longtin
ff29b659c8 formatting improvement. 2024-06-09 18:01:48 +00:00
Julia Longtin
b3ec86e59c first fixes. 2024-06-09 18:01:48 +00:00
Julia Longtin
7f5adf3b5c attempt to speed up float clearing. 2024-06-09 18:01:48 +00:00
Julia Longtin
a015d8485e allow using code from ggml-phi-knc-dot_q5_K_q8_K.c 2024-06-09 18:01:48 +00:00
Julia Longtin
aee550af6c force to compile. 2024-06-09 18:01:48 +00:00
Julia Longtin
a7f8abeb9b tell ggml-common.h to export what we want. 2024-06-09 18:01:48 +00:00
Julia Longtin
8703abe225 pull in ggml specific types. 2024-06-09 18:01:48 +00:00
Julia Longtin
62e354354c import stdio.h for size_t. 2024-06-09 18:01:48 +00:00
Julia Longtin
3edaaca993 import stdint.h for sizeSt. 2024-06-09 18:01:48 +00:00
Julia Longtin
669ce9b720 begin work on targeting dot_q5_K_q8_K. 2024-06-09 18:01:48 +00:00
Julia Longtin
c9730c0e04 be more specific about the length of our list of run amounts. 2024-06-09 18:01:48 +00:00
Julia Longtin
a48d3b96d7 spacing changes. 2024-06-09 18:01:48 +00:00
Julia Longtin
bb73cb319c formatting changes. 2024-06-09 18:01:48 +00:00
Julia Longtin
a06fa4b1b5 use the same header as ggml.c, and remove some warnings. 2024-06-09 18:01:48 +00:00
Julia Longtin
5a9d2f5f71 remove intrinsics import, and use upConv to save 12 bytes of memory transit. 2024-06-09 18:01:48 +00:00
Julia Longtin
d095d8e9c7 Update ggml-phi-knc.c 2024-06-09 18:01:48 +00:00
Julia Longtin
a56a6f31fa add a benchmark / test binary. 2024-06-09 18:01:48 +00:00
Julia Longtin
d7d679e41a merge from upstream 2024-06-09 18:01:48 +00:00
Julia Longtin
c70b5f211b Update ggml.c 2024-06-09 18:01:48 +00:00
Julia Longtin
114e7dd762 Update ggml.c 2024-06-09 18:01:48 +00:00
Julia Longtin
83be3dbab7 Update ggml.c 2024-06-09 18:01:48 +00:00
Julia Longtin
192e4ad857 implement F32 dot products. 2024-06-09 18:01:48 +00:00
Julia Longtin
7fce3f6b67 import intrinsics. 2024-06-09 18:01:48 +00:00
Julia Longtin
b5ea05f003 use right type, and define GGML_F32_VEC_ZERO. 2024-06-09 18:01:48 +00:00
Julia Longtin
429d69fd22 try to implement one intrinsic 2024-06-09 18:01:48 +00:00
Julia Longtin
7fb8d477ca try to detect the PHI cross compiler in make. 2024-06-09 18:01:48 +00:00
Julia Longtin
366279e09e try to detect the PHI cross compiler in make. 2024-06-09 18:01:48 +00:00
Julia Longtin
5c0d49cde4 instead of checking on glibc, check on SYS_getcpu 2024-06-09 18:01:48 +00:00
Julia Longtin
a83e2cadc0 handle the case that we have no glibc on the PHI. 2024-06-09 18:01:48 +00:00
Julia Longtin
9ec8635a06 add detection of Xeon PHI: Knights Corner. 2024-06-09 18:01:47 +00:00
compilade
132f55795e llama : fix restoring the number of outputs from state files (#6687) 2024-04-15 15:56:55 +03:00
Pierrick Hymbert
3272896d79 server : revert "minor layout improvements" (#6684)
This reverts commit b3a96f27f0.
2024-04-15 15:18:47 +03:00
Steven Prichard
7fc16a2c32 swift : linux support (#6590)
- Package.swift now supports conditional compilation based on OS
- Allows for package to be used by SPM on Non-Apple platforms

Co-authored-by: Steven Prichard <steven.prichard@justeattakeaway.com>
2024-04-15 13:14:46 +03:00
Neo Zhang Jianyu
17e98d4c96 fix mul_mat_id() for new input, make the ut pass (#6682) 2024-04-15 17:12:26 +08:00
David Renshaw
1958f7e06c llama : add missing kv clear in llama_beam_search (#6664) 2024-04-14 15:24:15 -04:00
Chao Jiang
04fbc5f23e Add Command R chat template (#6650)
* Add chat template for command-r model series

* Fix indentation

* Add chat template test for command-r models and update the implementation to trim whitespaces

* Remove debug print
2024-04-14 18:16:34 +02:00
Georgi Gerganov
f184dd9208 flake.lock: Update (#6669) 2024-04-14 06:55:30 -07:00
Dave
422c2aff1c Added support for GGML_OP_CLAMP in Metal (#6662)
* Added support for GGML_OP_CLAMP in Metal

* Corrected size

---------

Co-authored-by: dave-fl <dave@Davids-MacBook-Pro.local>
2024-04-14 13:14:19 +02:00
Sigbjørn Skjæret
8800226d65 Fix --split-max-size (#6655)
* Fix --split-max-size

Byte size calculation was done on int and overflowed.

* add tests.sh

* add examples test scripts to ci run

Will autodiscover examples/*/tests.sh scripts and run them.

* move WORK_PATH to a subdirectory

* clean up before and after test

* explicitly define which scripts to run

* add --split-max-size to readme
2024-04-14 13:12:59 +02:00
Jaemin Son
e689fc4e91 [bug fix] convert github repository_owner to lowercase (#6673) 2024-04-14 13:12:36 +02:00
James A Capozzoli
a4ec34e1cd convert : enable the --use-temp-file cli flag (#6645) 2024-04-14 11:40:18 +03:00
Neo Zhang Jianyu
de17e3f745 fix memcpy() crash, add missed cmd in guide, fix softmax (#6622)
* disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax

* refactor to disable mmap for SYCL backend

* fix compile error in other os

* refactor the solution, use host buf to fix it, instead of disable mmap

* keep to support mmap()

* use host buff to reduce malloc times

* revert to malloc/free solution, for threaad safe
2024-04-14 10:42:29 +08:00
Johannes Gäßler
b5e7285baf CUDA: fix matrix multiplication logic for tests (#6667) 2024-04-14 00:21:55 +02:00