Make more ML improvements

- Fix UX issues with llama.com
- Do housekeeping on libm code
- Add more vectorization to GGML
- Get GGJT quantizer programs working well
- Have the quantizer keep the output layer as f16
- Prefetching improves performance by 15% when using fewer threads (a sketch of the technique follows this list)
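
The prefetching bullet refers to issuing software prefetch hints ahead of
GGML's streaming reads. Below is a minimal sketch of the technique, not the
commit's actual code: the dotf function, the 64-element lookahead, and the
use of the GCC/Clang __builtin_prefetch intrinsic are illustrative
assumptions.

#include <stddef.h>

/* Dot product with software prefetching: while the current elements are
   being multiplied, hint the cache to begin fetching data several cache
   lines ahead, hiding memory latency behind the arithmetic. Prefetch
   hints never fault, so reading past the end of the arrays near the tail
   of the loop is harmless. */
float dotf(const float *x, const float *y, size_t n) {
  float s = 0;
  for (size_t i = 0; i < n; ++i) {
    __builtin_prefetch(x + i + 64);  /* read hint, ~4 cache lines ahead */
    __builtin_prefetch(y + i + 64);
    s += x[i] * y[i];
  }
  return s;
}

A plausible reading of the "fewer threads" caveat: with fewer threads each
core has spare memory bandwidth for the prefetches to exploit, whereas with
many threads the memory bus is already saturated.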
Justine Tunney  2023-05-16 08:07:23 -07:00
commit e7eb0b3070 (parent 80db9de173)
46 changed files with 340 additions and 289 deletions


@@ -22,8 +22,8 @@
 /**
  * Adds floats in array.
  */
-float fsumf(const float *p, size_t n) {
-  float s;
+double fsumf(const float *p, size_t n) {
+  double s;
   size_t i;
   if (n > 8) return fsumf(p, n / 2) + fsumf(p + n / 2, n - n / 2);
   for (s = i = 0; i < n; ++i) s += p[i];
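
The hunk above widens fsumf's accumulator and return type from float to
double, on top of the function's existing divide-and-conquer recursion
(pairwise summation); both measures reduce rounding error. A self-contained
demo follows: the main driver is hypothetical, and the trailing return s;
falls outside the hunk but is implied by the function body.

#include <stddef.h>
#include <stdio.h>

double fsumf(const float *p, size_t n) {
  double s;
  size_t i;
  if (n > 8) return fsumf(p, n / 2) + fsumf(p + n / 2, n - n / 2);
  for (s = i = 0; i < n; ++i) s += p[i];
  return s;
}

int main(void) {
  /* 0.1f is inexact in binary, so the true sum of 10000 copies is
     10000 * (double)0.1f ≈ 1000.0000149. The pairwise recursion with a
     double accumulator reproduces that closely, whereas a naive float
     loop visibly drifts after thousands of additions. */
  float v[10000];
  for (int i = 0; i < 10000; ++i) v[i] = 0.1f;
  printf("%.9f\n", fsumf(v, 10000));
  return 0;
}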