Commit graph

2919 commits

Author SHA1 Message Date
Julia Longtin    7e3eb5c01d    perform 16 operations at a time.    2024-05-13 22:12:55 +00:00
Julia Longtin    6d4535e829    use proper mov operator, and pass addresses.    2024-05-13 22:12:54 +00:00
Julia Longtin    e72539bcc5    attempt our first FMA.    2024-05-13 22:12:54 +00:00
Julia Longtin    b22e3e021e    add I32 vector memory clearing.    2024-05-13 22:12:54 +00:00
Julia Longtin    1446a724df    promote aux32 to a vector.    2024-05-13 22:12:54 +00:00
Julia Longtin    a9cc0e74d3    add missing address of operators.    2024-05-13 22:12:54 +00:00
Julia Longtin    bff7b695b3    promote aux16 to a vector.    2024-05-13 22:12:54 +00:00
Julia Longtin    df33835700    use quotes properly.    2024-05-13 22:12:54 +00:00
Julia Longtin    2dc7991809    use better memory save operator.    2024-05-13 22:12:54 +00:00
Julia Longtin    588a0b19cc    expand mask, and align memory.    2024-05-13 22:12:54 +00:00
Julia Longtin    3994d81bf0    try to use vectorized zeroing function.    2024-05-13 22:12:54 +00:00
Julia Longtin    e227717136    add missing variable.    2024-05-13 22:12:54 +00:00
Julia Longtin    d5a27eb507    copy right block.    2024-05-13 22:12:54 +00:00
Julia Longtin    9f92f9730e    fix typo.    2024-05-13 22:12:54 +00:00
Julia Longtin    484c4abf8d    promote aux16 into a vector. (part three)    2024-05-13 22:12:54 +00:00
Julia Longtin    fb0fb9ff1b    promote aux16 into a vector.    2024-05-13 22:12:54 +00:00
Julia Longtin    405b5fa731    promote aux16 into a vector.    2024-05-13 22:12:54 +00:00
Julia Longtin    b92e06456c    formatting improvement.    2024-05-13 22:12:54 +00:00
Julia Longtin    ea858eee03    first fixes.    2024-05-13 22:12:54 +00:00
Julia Longtin    feed51c3f4    attempt to speed up float clearing.    2024-05-13 22:12:54 +00:00
Julia Longtin    2ed306623c    allow using code from ggml-phi-knc-dot_q5_K_q8_K.c    2024-05-13 22:12:50 +00:00
Julia Longtin    d5f39c3caa    force to compile.    2024-05-13 22:11:16 +00:00
Julia Longtin    b794e48ff8    tell ggml-common.h to export what we want.    2024-05-13 22:11:16 +00:00
Julia Longtin    2c5daab90f    pull in ggml specific types.    2024-05-13 22:11:16 +00:00
Julia Longtin    7080280c5b    import stdio.h for size_t.    2024-05-13 22:11:16 +00:00
Julia Longtin    96dce97091    import stdint.h for sizeSt.    2024-05-13 22:11:16 +00:00
Julia Longtin    0e6c910db9    begin work on targeting dot_q5_K_q8_K.    2024-05-13 22:11:16 +00:00
Julia Longtin    16cbe5dd81    be more specific about the length of our list of run amounts.    2024-05-13 22:11:16 +00:00
Julia Longtin    c605e951dc    spacing changes.    2024-05-13 22:11:16 +00:00
Julia Longtin    56be29fc58    formatting changes.    2024-05-13 22:11:16 +00:00
Julia Longtin    97c69835dc    use the same header as ggml.c, and remove some warnings.    2024-05-13 22:11:16 +00:00
Julia Longtin    580a347e59    remove intrinsics import, and use upConv to save 12 bytes of memory transit.    2024-05-13 22:11:15 +00:00
Julia Longtin    9ba28eaed3    Update ggml-phi-knc.c    2024-05-13 22:11:15 +00:00
Julia Longtin    72e2b13185    add a benchmark / test binary.    2024-05-13 22:11:15 +00:00
Julia Longtin    6f699fc98d    merge from upstream    2024-05-13 22:11:15 +00:00
Julia Longtin    926b0e8076    Update ggml.c    2024-05-13 22:11:15 +00:00
Julia Longtin    6e1b77ad58    Update ggml.c    2024-05-13 22:11:15 +00:00
Julia Longtin    f940c96aac    Update ggml.c    2024-05-13 22:11:15 +00:00
Julia Longtin    2458643dac    implement F32 dot products.    2024-05-13 22:11:15 +00:00
Julia Longtin    59ce785f61    import intrinsics.    2024-05-13 22:11:15 +00:00
Julia Longtin    c08ddb831f    use right type, and define GGML_F32_VEC_ZERO.    2024-05-13 22:11:15 +00:00
Julia Longtin    25095cac23    try to implement one intrinsic    2024-05-13 22:11:15 +00:00
Julia Longtin    8f6e535edc    try to detect the PHI cross compiler in make.    2024-05-13 22:11:15 +00:00
Julia Longtin    f7f174ecc9    try to detect the PHI cross compiler in make.    2024-05-13 22:11:15 +00:00
Julia Longtin    b9e2f2a332    instead of checking on glibc, check on SYS_getcpu    2024-05-13 22:11:10 +00:00
Julia Longtin    78291d93b9    handle the case that we have no glibc on the PHI.    2024-05-13 22:05:33 +00:00
Julia Longtin    757f952046    add detection of Xeon PHI: Knights Corner.    2024-05-13 22:03:26 +00:00
compilade    ee52225067    convert-hf : support direct Q8_0 conversion (#7234)    2024-05-13 14:10:51 -04:00
    * convert-hf : support q8_0 conversion
    * convert-hf : add missing ftype
      This was messing with the checksums otherwise.
    * convert-hf : add missing ftype to Baichuan and Xverse
      I didn't notice these on my first pass.
Georgi Gerganov    614d3b914e    llama : less KV padding when FA is off (#7257)    2024-05-13 17:15:15 +03:00
    ggml-ci
k.h.lai    30e70334f7    llava-cli: fix base64 prompt (#7248)    2024-05-14 00:02:36 +10:00