Commit graph

2515 commits

Author SHA1 Message Date
Julia Longtin
8c17353717 minor changes. 2024-04-02 16:55:40 +00:00
Julia Longtin
9f569ca50b massively rewrite assembly routines. 2024-04-02 15:41:56 +00:00
Julia Longtin
12c9576aec fix vector sizes. 2024-03-25 19:43:37 +00:00
Julia Longtin
bc3d6db862 separate filling aux16 from consuming aux16 by making it an array of vectors. 2024-03-24 14:18:08 +00:00
Julia Longtin
ca0dc26704 loosen alignment requirements for zeros, add missing function, and promote aux8 to an array of vectors. 2024-03-24 13:35:05 +00:00
Julia Longtin
cf481cf901 promote aux8 into a vector. 2024-03-24 12:50:01 +00:00
Julia Longtin
169a145409 fix our reference to src in the second place, and use a more accurate comment. 2024-03-24 12:41:21 +00:00
Julia Longtin
c28bfe4552 spacing changes, eliminate dead references to k1 or zero, and use the right type when referring to src. 2024-03-24 12:37:47 +00:00
Julia Longtin
ba4f4129b3 better comments, and fix some small errors. 2024-03-24 12:17:06 +00:00
Julia Longtin
03a3e0eb7a perform 16 operations at a time. 2024-03-24 12:04:44 +00:00
Julia Longtin
5935bb34f4 use proper mov operator, and pass addresses. 2024-03-23 23:46:36 +00:00
Julia Longtin
a5132a1507 attempt our first FMA. 2024-03-23 22:16:57 +00:00
Julia Longtin
4477b8e123 add I32 vector memory clearing. 2024-03-23 21:16:23 +00:00
Julia Longtin
ea1edb0600 promote aux32 to a vector. 2024-03-23 21:12:35 +00:00
Julia Longtin
f967690a41 add missing address of operators. 2024-03-23 21:05:50 +00:00
Julia Longtin
2fdd11fe3a promote aux16 to a vector. 2024-03-23 21:00:51 +00:00
Julia Longtin
f09b3ed79e use quotes properly. 2024-03-23 20:53:16 +00:00
Julia Longtin
bb5eb95816 use better memory save operator. 2024-03-23 20:49:11 +00:00
Julia Longtin
9d7ca41703 expand mask, and align memory. 2024-03-23 20:48:43 +00:00
Julia Longtin
bd6d7e6238 try to use vectorized zeroing function. 2024-03-23 19:55:12 +00:00
Julia Longtin
f985372e3a add missing variable. 2024-03-23 19:49:16 +00:00
Julia Longtin
31d4f9312b copy right block. 2024-03-23 19:47:21 +00:00
Julia Longtin
e43a63e7c6 fix typo. 2024-03-23 16:29:30 +00:00
Julia Longtin
f092a10dc9 promote aux16 into a vector. (part three) 2024-03-23 16:27:11 +00:00
Julia Longtin
c72157a5a6 promote aux16 into a vector. 2024-03-23 16:24:11 +00:00
Julia Longtin
e3503c924a promote aux16 into a vector. 2024-03-23 16:21:20 +00:00
Julia Longtin
edb76ffddb formatting improvement. 2024-03-23 16:19:17 +00:00
Julia Longtin
6face8a0be first fixes. 2024-03-23 15:56:47 +00:00
Julia Longtin
0a2051aa88 attempt to speed up float clearing. 2024-03-23 15:55:00 +00:00
Julia Longtin
0b012c03ef allow using code from ggml-phi-knc-dot_q5_K_q8_K.c 2024-03-23 15:02:56 +00:00
Julia Longtin
0b3f17127f force to compile. 2024-03-23 14:58:33 +00:00
Julia Longtin
18f353987c tell ggml-common.h to export what we want. 2024-03-23 14:49:35 +00:00
Julia Longtin
cd20404250 pull in ggml specific types. 2024-03-23 14:38:15 +00:00
Julia Longtin
8f57803f58 import stdio.h for size_t. 2024-03-23 14:29:59 +00:00
Julia Longtin
9bcb8350d5 import stdint.h for sizeSt. 2024-03-23 14:28:29 +00:00
Julia Longtin
a7bd64c130 begin work on targeting dot_q5_K_q8_K. 2024-03-23 14:19:47 +00:00
Julia Longtin
9185e14922 be more specific about the length of our list of run amounts. 2024-03-21 20:38:49 +00:00
Julia Longtin
0979522fbe spacing changes. 2024-03-21 18:36:25 +00:00
Julia Longtin
ac3637142d formatting changes. 2024-03-20 21:34:12 +00:00
Julia Longtin
76e66e77c2 use the same header as ggml.c, and remove some warnings. 2024-03-20 21:12:22 +00:00
Julia Longtin
ee27148629 remove intrinsics import, and use upConv to save 12 bytes of memory transit. 2024-03-20 20:15:30 +00:00
Julia Longtin
ab6f3a8a8d Update ggml-phi-knc.c 2024-03-17 21:36:14 +00:00
Julia Longtin
f882673ba6 add a benchmark / test binary. 2024-03-17 21:20:14 +00:00
Julia Longtin
fe663c1b63 merge from upstream 2024-03-17 21:15:32 +00:00
Julia Longtin
eac00a72d5 Update ggml.c 2024-03-16 14:17:21 +00:00
Julia Longtin
e216a2f133 Update ggml.c 2024-03-16 14:15:51 +00:00
Julia Longtin
257ffd9955 Update ggml.c 2024-03-16 14:13:22 +00:00
Julia Longtin
717e164dd7 implement F32 dot products. 2024-03-16 14:05:03 +00:00
Julia Longtin
7a57feba0c import intrinsics. 2024-03-13 19:26:54 +00:00
Julia Longtin
a1ae649662 use right type, and define GGML_F32_VEC_ZERO. 2024-03-13 19:23:53 +00:00