Julia Longtin
429d69fd22
try to implement one intrinsic
2024-06-09 18:01:48 +00:00
Julia Longtin
7fb8d477ca
try to detect the PHI cross compiler in make.
2024-06-09 18:01:48 +00:00
Julia Longtin
366279e09e
try to detect the PHI cross compiler in make.
2024-06-09 18:01:48 +00:00
Julia Longtin
5c0d49cde4
instead of checking on glibc, check on SYS_getcpu
2024-06-09 18:01:48 +00:00
Julia Longtin
a83e2cadc0
handle the case that we have no glibc on the PHI.
2024-06-09 18:01:48 +00:00
Julia Longtin
9ec8635a06
add detection of Xeon PHI: Knights Corner.
2024-06-09 18:01:47 +00:00
Julia Longtin
0add3107f7
spacing changes.
2024-05-12 09:36:08 +00:00
Julia Longtin
a20edbf300
do 2 rounds of 4, instead of 4 rounds of 2. and properly offset unaligned reads across a 64 byte boundary.
2024-05-11 20:28:47 +00:00
Julia Longtin
b23ab86eda
make offset available in a register.
2024-05-11 19:57:45 +00:00
Julia Longtin
1072686dcf
load from identical addresses for low and high side.
2024-05-11 19:48:53 +00:00
Julia Longtin
3449b0f359
minor comment fixes.
2024-05-11 19:47:20 +00:00
Julia Longtin
efdb4116d1
make the offset of q4 available.
2024-05-11 19:39:53 +00:00
Julia Longtin
9550ca516f
add missing vector.
2024-05-11 19:29:09 +00:00
Julia Longtin
653a565a02
fill and increment r12 and r13.
2024-05-11 19:24:11 +00:00
Julia Longtin
7fa2d73b0a
relabel some other labels.
2024-05-11 19:02:48 +00:00
Julia Longtin
047defea41
rename some labels.
2024-05-11 17:56:10 +00:00
Julia Longtin
a1d0da669d
rename label 1 to 3.
2024-05-11 14:24:30 +00:00
Julia Longtin
0a0bb9b7db
introduce r10 and r11, for vloadunpackhd.
2024-05-11 14:02:36 +00:00
Julia Longtin
9d7f967e88
spacing changes.
2024-05-11 13:35:50 +00:00
Julia Longtin
6c4e687b85
spacing changes.
2024-05-11 13:26:00 +00:00
Julia Longtin
b34575b1f3
add missing jump.
2024-05-11 12:53:23 +00:00
Julia Longtin
fa0226c8df
look at the right final memory location.
2024-05-11 11:27:52 +00:00
Julia Longtin
fba57c125c
subtract the correct amount.
2024-05-11 11:11:15 +00:00
Julia Longtin
3156e639bf
change from handling three iterations per loop to four.
2024-05-11 11:07:16 +00:00
Julia Longtin
a82ada7dcd
comment clarification.
2024-05-10 21:57:16 +00:00
Julia Longtin
4a3c42c82c
correct a comment, and use jz when comparing to zero.
2024-05-10 20:30:56 +00:00
Julia Longtin
806472787d
use values inside of the loop as soon as we have them.
2024-05-10 19:33:58 +00:00
Julia Longtin
21a1e740c2
fix loop.
2024-05-10 17:07:27 +00:00
Julia Longtin
7e44eabe0f
move sub earlier, and move the compare of iterations to outside, and at the end of the loop.
2024-05-10 17:03:41 +00:00
Julia Longtin
7966c8e443
spacing and comment changes.
2024-05-10 16:50:39 +00:00
Julia Longtin
650094e17b
remove useless prefetches.
2024-05-10 16:28:53 +00:00
Julia Longtin
0ff7d5dd1a
perform better prefetches, and invert the test of our clear flag for clarity.
2024-05-10 16:14:28 +00:00
Julia Longtin
b00607d1ab
use vbroadcastss in place of vbroadcast32x4.
2024-05-10 15:52:35 +00:00
Julia Longtin
f6edcc4061
Use a vectorized assembly function to handle remaining chunks narrower than a full vector.
2024-05-10 14:52:46 +00:00
Julia Longtin
2282ac4d9f
broadcast a single int8, instead of 4 of them.
2024-05-10 14:19:27 +00:00
Julia Longtin
867de5edce
use different restrict syntax, to make g++ happy.
2024-05-09 23:08:43 +00:00
Julia Longtin
e1fdfaae45
fix typo
2024-05-09 20:41:50 +00:00
Julia Longtin
a283551db0
remove a warning.
2024-05-09 20:40:50 +00:00
Julia Longtin
af4ee51fa7
add batch fp16<->fp32 conversion functions.
2024-05-09 19:31:28 +00:00
Julia Longtin
81ca166ecd
minor spacing and comment changes.
2024-05-09 16:57:59 +00:00
Julia Longtin
047291fb42
spacing and capitalization changes. Fix the register list of GGML_5bit_Unpacked_Unaligned.
2024-04-26 14:44:08 +00:00
Julia Longtin
77d4ca906b
spacing and capitalization changes.
2024-04-25 21:23:22 +00:00
Julia Longtin
d69cf87fce
use or, instead of and. bug fix?
2024-04-24 17:50:12 +00:00
Julia Longtin
8cae9a9ef6
comment and spacing fixes.
2024-04-24 17:38:42 +00:00
Julia Longtin
90e99eaf1c
fix an offset error, and get rid of tabs.
2024-04-22 18:29:31 +00:00
Julia Longtin
6d16090246
fix some small errors.
2024-04-22 18:22:22 +00:00
Julia Longtin
e298d9e65e
further optimizations. 0.99 tokens per second.
2024-04-22 18:16:28 +00:00
compilade
132f55795e
llama : fix restoring the number of outputs from state files (#6687)
2024-04-15 15:56:55 +03:00
Pierrick Hymbert
3272896d79
server : revert "minor layout improvements" (#6684)
This reverts commit b3a96f27f0.
2024-04-15 15:18:47 +03:00
Steven Prichard
7fc16a2c32
swift : linux support (#6590)
- Package.swift now supports conditional compilation based on OS
- Allows the package to be used by SPM on non-Apple platforms
Co-authored-by: Steven Prichard <steven.prichard@justeattakeaway.com>
2024-04-15 13:14:46 +03:00