Make more ML improvements

- Fix UX issues with llama.com
- Do housekeeping on libm code
- Add more vectorization to GGML
- Get GGJT quantizer programs working well
- Have the quantizer keep the output layer as f16
- Prefetching improves performance by 15% when using fewer threads (a sketch of the technique follows this list)
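
The prefetching bullet refers to issuing software prefetch hints ahead of
GGML's streaming reads. Below is a minimal sketch of the technique, not the
commit's actual code: the dotf function, the 64-element lookahead, and the
use of the GCC/Clang __builtin_prefetch intrinsic are illustrative
assumptions.

#include <stddef.h>

/* Dot product with software prefetching: while the current elements are
   being multiplied, hint the cache to begin fetching data several cache
   lines ahead, hiding memory latency behind the arithmetic. Prefetch
   hints never fault, so reading past the end of the arrays near the tail
   of the loop is harmless. */
float dotf(const float *x, const float *y, size_t n) {
  float s = 0;
  for (size_t i = 0; i < n; ++i) {
    __builtin_prefetch(x + i + 64);  /* read hint, ~4 cache lines ahead */
    __builtin_prefetch(y + i + 64);
    s += x[i] * y[i];
  }
  return s;
}

A plausible reading of the "fewer threads" caveat: with fewer threads each
core has spare memory bandwidth for the prefetches to exploit, whereas with
many threads the memory bus is already saturated.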
Justine Tunney  2023-05-16 08:07:23 -07:00
commit e7eb0b3070 (parent 80db9de173)
46 changed files with 340 additions and 289 deletions


@@ -22,8 +22,8 @@
 /**
  * Adds floats in array.
  */
-float fsumf(const float *p, size_t n) {
-  float s;
+double fsumf(const float *p, size_t n) {
+  double s;
   size_t i;
   if (n > 8) return fsumf(p, n / 2) + fsumf(p + n / 2, n - n / 2);
   for (s = i = 0; i < n; ++i) s += p[i];
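
The hunk above widens fsumf's accumulator and return type from float to
double, on top of the function's existing divide-and-conquer recursion
(pairwise summation); both measures reduce rounding error. A self-contained
demo follows: the main driver is hypothetical, and the trailing return s;
falls outside the hunk but is implied by the function body.

#include <stddef.h>
#include <stdio.h>

double fsumf(const float *p, size_t n) {
  double s;
  size_t i;
  if (n > 8) return fsumf(p, n / 2) + fsumf(p + n / 2, n - n / 2);
  for (s = i = 0; i < n; ++i) s += p[i];
  return s;
}

int main(void) {
  /* 0.1f is inexact in binary, so the true sum of 10000 copies is
     10000 * (double)0.1f ≈ 1000.0000149. The pairwise recursion with a
     double accumulator reproduces that closely, whereas a naive float
     loop visibly drifts after thousands of additions. */
  float v[10000];
  for (int i = 0; i < 10000; ++i) v[i] = 0.1f;
  printf("%.9f\n", fsumf(v, 10000));
  return 0;
}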