Commit graph

2948 commits

Author SHA1 Message Date
Julia Longtin 201566c965 use different restrict syntax, to make g++ happy. 2024-05-13 22:17:31 +00:00
Julia Longtin bf674be34f fix typo 2024-05-13 22:17:31 +00:00
Julia Longtin 30e8b37f33 remove a warning. 2024-05-13 22:17:31 +00:00
Julia Longtin de44c6633e add batch fp16<->fp32 conversion functions. 2024-05-13 22:17:27 +00:00
Julia Longtin b33cd8d614 minor spacing and comment changes. 2024-05-13 22:12:55 +00:00
Julia Longtin e108564e2d spacing and capitalization changes. Fix the register list of GGML_5bit_Unpacked_Unaligned. 2024-05-13 22:12:55 +00:00
Julia Longtin 1ba6534846 spacing and capitalization changes. 2024-05-13 22:12:55 +00:00
Julia Longtin 93d0a0ae7a use or, instead of and. bug fix? 2024-05-13 22:12:55 +00:00
Julia Longtin 2cfc15b0a9 comment and spacing fixes. 2024-05-13 22:12:55 +00:00
Julia Longtin d27cd93d11 fix an offset error, and get rid of tabs. 2024-05-13 22:12:55 +00:00
Julia Longtin 5b2023bb12 fix some small errors. 2024-05-13 22:12:55 +00:00
Julia Longtin 934f869a51 further optimizations. 0.99 tokens per second. 2024-05-13 22:12:55 +00:00
Julia Longtin a33c82b6bb replace tabs with spaces. 2024-05-13 22:12:55 +00:00
Julia Longtin 039685d78c reformat, and label what these files are. 2024-05-13 22:12:55 +00:00
Julia Longtin feb8bccfab use GGML_F32_EPR, and remove some dead code. 2024-05-13 22:12:55 +00:00
Julia Longtin 7214391ff7 whoops. missing tab. 2024-05-13 22:12:55 +00:00
Julia Longtin 10f06379d7 add Makefile rule for generation .s file, for manual inspection. 2024-05-13 22:12:55 +00:00
Julia Longtin e544a3faa2 formatting changes. 2024-05-13 22:12:55 +00:00
Julia Longtin 481f1746c0 indent headers consistently. 2024-05-13 22:12:55 +00:00
Julia Longtin aa33f281e3 formatting. 2024-05-13 22:12:55 +00:00
Julia Longtin 021ae03bd6 minor changes. 2024-05-13 22:12:55 +00:00
Julia Longtin efcd202f0f massively rewrite assembly routines. 2024-05-13 22:12:55 +00:00
Julia Longtin e66a97f765 fix vector sizes. 2024-05-13 22:12:55 +00:00
Julia Longtin 5a6024279f separate filling aux16 from consuming aux16 by making it an array of vectors. 2024-05-13 22:12:55 +00:00
Julia Longtin d351d995b0 loosen alignment requirements for zeros, add missing function, and promote aux8 to an array of vectors. 2024-05-13 22:12:55 +00:00
Julia Longtin 185d4b8bf7 promote aux8 into a vector. 2024-05-13 22:12:55 +00:00
Julia Longtin a95c7b0138 fix our reference to src in the second place, and use a more accurate comment. 2024-05-13 22:12:55 +00:00
Julia Longtin babe051eaa spacing changes, eliminate dead references to k1 or zero, and use the right type when referring to src. 2024-05-13 22:12:55 +00:00
Julia Longtin b5c1135f4d better comments, and fix some small errors. 2024-05-13 22:12:55 +00:00
Julia Longtin 7e3eb5c01d perform 16 operations at a time. 2024-05-13 22:12:55 +00:00
Julia Longtin 6d4535e829 use proper mov operator, and pass addresses. 2024-05-13 22:12:54 +00:00
Julia Longtin e72539bcc5 attempt our first FMA. 2024-05-13 22:12:54 +00:00
Julia Longtin b22e3e021e add I32 vector memory clearing. 2024-05-13 22:12:54 +00:00
Julia Longtin 1446a724df promote aux32 to a vector. 2024-05-13 22:12:54 +00:00
Julia Longtin a9cc0e74d3 add missing address of operators. 2024-05-13 22:12:54 +00:00
Julia Longtin bff7b695b3 promote aux16 to a vector. 2024-05-13 22:12:54 +00:00
Julia Longtin df33835700 use quotes properly. 2024-05-13 22:12:54 +00:00
Julia Longtin 2dc7991809 use better memory save operator. 2024-05-13 22:12:54 +00:00
Julia Longtin 588a0b19cc expand mask, and align memory. 2024-05-13 22:12:54 +00:00
Julia Longtin 3994d81bf0 try to use vectorized zeroing function. 2024-05-13 22:12:54 +00:00
Julia Longtin e227717136 add missing variable. 2024-05-13 22:12:54 +00:00
Julia Longtin d5a27eb507 copy right block. 2024-05-13 22:12:54 +00:00
Julia Longtin 9f92f9730e fix typo. 2024-05-13 22:12:54 +00:00
Julia Longtin 484c4abf8d promote aux16 into a vector. (part three) 2024-05-13 22:12:54 +00:00
Julia Longtin fb0fb9ff1b promote aux16 into a vector. 2024-05-13 22:12:54 +00:00
Julia Longtin 405b5fa731 promote aux16 into a vector. 2024-05-13 22:12:54 +00:00
Julia Longtin b92e06456c formatting improvement. 2024-05-13 22:12:54 +00:00
Julia Longtin ea858eee03 first fixes. 2024-05-13 22:12:54 +00:00
Julia Longtin feed51c3f4 attempt to speed up float clearing. 2024-05-13 22:12:54 +00:00
Julia Longtin 2ed306623c allow using code from ggml-phi-knc-dot_q5_K_q8_K.c 2024-05-13 22:12:50 +00:00