llama.cpp

Author	SHA1	Message	Date
Julia Longtin	7a00422fa3	try to use vectorized zeroing function.	2024-06-09 18:01:49 +00:00
Julia Longtin	2870bfc6dd	add missing variable.	2024-06-09 18:01:48 +00:00
Julia Longtin	656bf28c91	copy right block.	2024-06-09 18:01:48 +00:00
Julia Longtin	e99f3a9bf4	fix typo.	2024-06-09 18:01:48 +00:00
Julia Longtin	84093a6be6	promote aux16 into a vector. (part three)	2024-06-09 18:01:48 +00:00
Julia Longtin	66d26d4914	promote aux16 into a vector.	2024-06-09 18:01:48 +00:00
Julia Longtin	2f0a949ae0	promote aux16 into a vector.	2024-06-09 18:01:48 +00:00
Julia Longtin	ff29b659c8	formatting improvement.	2024-06-09 18:01:48 +00:00
Julia Longtin	b3ec86e59c	first fixes.	2024-06-09 18:01:48 +00:00
Julia Longtin	7f5adf3b5c	attempt to speed up float clearing.	2024-06-09 18:01:48 +00:00
Julia Longtin	a015d8485e	allow using code from ggml-phi-knc-dot_q5_K_q8_K.c	2024-06-09 18:01:48 +00:00
Julia Longtin	aee550af6c	force to compile.	2024-06-09 18:01:48 +00:00
Julia Longtin	a7f8abeb9b	tell ggml-common.h to export what we want.	2024-06-09 18:01:48 +00:00
Julia Longtin	8703abe225	pull in ggml specific types.	2024-06-09 18:01:48 +00:00
Julia Longtin	62e354354c	import stdio.h for size_t.	2024-06-09 18:01:48 +00:00
Julia Longtin	3edaaca993	import stdint.h for sizeSt.	2024-06-09 18:01:48 +00:00
Julia Longtin	669ce9b720	begin work on targeting dot_q5_K_q8_K.	2024-06-09 18:01:48 +00:00
Julia Longtin	c9730c0e04	be more specific about the length of our list of run amounts.	2024-06-09 18:01:48 +00:00
Julia Longtin	a48d3b96d7	spacing changes.	2024-06-09 18:01:48 +00:00
Julia Longtin	bb73cb319c	formatting changes.	2024-06-09 18:01:48 +00:00
Julia Longtin	a06fa4b1b5	use the same header as ggml.c, and remove some warnings.	2024-06-09 18:01:48 +00:00
Julia Longtin	5a9d2f5f71	remove intrinsics import, and use upConv to save 12 bytes of memory transit.	2024-06-09 18:01:48 +00:00
Julia Longtin	d095d8e9c7	Update ggml-phi-knc.c	2024-06-09 18:01:48 +00:00
Julia Longtin	a56a6f31fa	add a benchmark / test binary.	2024-06-09 18:01:48 +00:00
Julia Longtin	d7d679e41a	merge from upstream	2024-06-09 18:01:48 +00:00
Julia Longtin	c70b5f211b	Update ggml.c	2024-06-09 18:01:48 +00:00
Julia Longtin	114e7dd762	Update ggml.c	2024-06-09 18:01:48 +00:00
Julia Longtin	83be3dbab7	Update ggml.c	2024-06-09 18:01:48 +00:00
Julia Longtin	192e4ad857	implement F32 dot products.	2024-06-09 18:01:48 +00:00
Julia Longtin	7fce3f6b67	import intrinsics.	2024-06-09 18:01:48 +00:00
Julia Longtin	b5ea05f003	use right type, and define GGML_F32_VEC_ZERO.	2024-06-09 18:01:48 +00:00
Julia Longtin	429d69fd22	try to implement one intrinsic	2024-06-09 18:01:48 +00:00
Julia Longtin	7fb8d477ca	try to detect the PHI cross compiler in make.	2024-06-09 18:01:48 +00:00
Julia Longtin	366279e09e	try to detect the PHI cross compiler in make.	2024-06-09 18:01:48 +00:00
Julia Longtin	5c0d49cde4	instead of checking on glibc, check on SYS_getcpu	2024-06-09 18:01:48 +00:00
Julia Longtin	a83e2cadc0	handle the case that we have no glibc on the PHI.	2024-06-09 18:01:48 +00:00
Julia Longtin	9ec8635a06	add detection of Xeon PHI: Knights Corner.	2024-06-09 18:01:47 +00:00
compilade	132f55795e	llama : fix restoring the number of outputs from state files (#6687)	2024-04-15 15:56:55 +03:00
Pierrick Hymbert	3272896d79	server : revert "minor layout improvements" (#6684) This reverts commit `b3a96f27f0`.	2024-04-15 15:18:47 +03:00
Steven Prichard	7fc16a2c32	swift : linux support (#6590) - Package.swift now supports conditional compilation based on OS - Allows for package to be used by SPM on Non-Apple platforms Co-authored-by: Steven Prichard <steven.prichard@justeattakeaway.com>	2024-04-15 13:14:46 +03:00
Neo Zhang Jianyu	17e98d4c96	fix mul_mat_id() for new input, make the ut pass (#6682)	2024-04-15 17:12:26 +08:00
David Renshaw	1958f7e06c	llama : add missing kv clear in llama_beam_search (#6664)	2024-04-14 15:24:15 -04:00
Chao Jiang	04fbc5f23e	Add Command R chat template (#6650) * Add chat template for command-r model series * Fix indentation * Add chat template test for command-r models and update the implementation to trim whitespaces * Remove debug print	2024-04-14 18:16:34 +02:00
Georgi Gerganov	f184dd9208	flake.lock: Update (#6669)	2024-04-14 06:55:30 -07:00
Dave	422c2aff1c	Added support for GGML_OP_CLAMP in Metal (#6662) * Added support for GGML_OP_CLAMP in Metal * Corrected size --------- Co-authored-by: dave-fl <dave@Davids-MacBook-Pro.local>	2024-04-14 13:14:19 +02:00
Sigbjørn Skjæret	8800226d65	Fix --split-max-size (#6655) * Fix --split-max-size Byte size calculation was done on int and overflowed. * add tests.sh * add examples test scripts to ci run Will autodiscover examples//tests.sh scripts and run them. move WORK_PATH to a subdirectory * clean up before and after test * explicitly define which scripts to run * add --split-max-size to readme	2024-04-14 13:12:59 +02:00
Jaemin Son	e689fc4e91	[bug fix] convert github repository_owner to lowercase (#6673)	2024-04-14 13:12:36 +02:00
James A Capozzoli	a4ec34e1cd	convert : enable the `--use-temp-file` cli flag (#6645)	2024-04-14 11:40:18 +03:00
Neo Zhang Jianyu	de17e3f745	fix memcpy() crash, add missed cmd in guide, fix softmax (#6622) * disable mmap to fix memcpy crash, add missed cmd in guide, fix softmax * refactor to disable mmap for SYCL backend * fix compile error in other os * refactor the solution, use host buf to fix it, instead of disable mmap * keep to support mmap() * use host buff to reduce malloc times * revert to malloc/free solution, for threaad safe	2024-04-14 10:42:29 +08:00
Johannes Gäßler	b5e7285baf	CUDA: fix matrix multiplication logic for tests (#6667)	2024-04-14 00:21:55 +02:00

2715 commits