llama.cpp

Author	SHA1	Message	Date
teleprint-me	2fa2c7a86c	chore: Move enums and model map to constants	2024-05-20 14:51:03 -04:00
teleprint-me	d9ba963cd4	refactor: Restructure tokenizer model metadata	2024-05-20 14:42:59 -04:00
teleprint-me	18bb36e496	chore: Allow the user to config the logger	2024-05-20 14:06:21 -04:00
teleprint-me	bdd0286bd0	refactor: Use proper names for referenced member variables	2024-05-20 01:39:09 -04:00
teleprint-me	a1951e27dc	refactor: Add proper names for remote model references	2024-05-20 01:36:44 -04:00
teleprint-me	6fc4492b3f	chore: Add english pangram to vocab tests	2024-05-20 00:51:35 -04:00
teleprint-me	381dad5eb3	fix: Add missing model architectures	2024-05-20 00:50:42 -04:00
teleprint-me	9a2834e24e	fix: Use __name__ as logger name	2024-05-19 22:39:30 -04:00
teleprint-me	a0362ea475	patch: Fix nested quotes for dict refs	2024-05-19 22:39:05 -04:00
teleprint-me	89a46fe818	feat: Attempt to mirror the llama.cpp API for compatibility	2024-05-19 22:31:05 -04:00
teleprint-me	c6f2a48af7	feat: Add prototype for identifying the vocab type	2024-05-19 22:30:37 -04:00
teleprint-me	dcc5d4241d	fix: Remove dangling if statement	2024-05-19 00:06:30 -04:00
teleprint-me	5840b6f0b0	refactor: Simplify the get_vocab_base_pre method	2024-05-18 23:59:52 -04:00
teleprint-me	316b404d94	patch: Fix CLI option for generating vocab tests	2024-05-18 23:59:22 -04:00
teleprint-me	da5deebda1	fix: Apply fix to verbose help description and generating vocab tests option	2024-05-18 23:34:33 -04:00
teleprint-me	ce777c8910	Merge branch 'master' into auto-model-support	2024-05-18 22:46:00 -04:00
teleprint-me	d02a0f42f9	feat: Add vocab generation script	2024-05-18 22:15:12 -04:00
teleprint-me	bd32266c87	feat: Add function for generating vocab script and fix CLI opts	2024-05-18 22:14:58 -04:00
teleprint-me	0479e9695f	patch: Add exception handling for non-existent vocab related files	2024-05-18 22:14:19 -04:00
teleprint-me	4b3735ca50	chore: Remove cluttered vocab files	2024-05-18 22:13:21 -04:00
teleprint-me	1a82573126	feat: Add example script for automating generating tokenizer model checksums and tests	2024-05-18 20:49:22 -04:00
teleprint-me	006bb60d27	chore: Fix model path references	2024-05-18 19:20:19 -04:00
fraxy-v	f5bf761747	Capture CUDA logging output (#7298) * logging: output capture in cuda module * fix compile error * fix: vsnprintf terminates with 0, string use not correct * post review * Update llama.cpp Co-authored-by: slaren <slarengh@gmail.com> * Update llama.cpp Co-authored-by: slaren <slarengh@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-05-19 00:44:42 +02:00
teleprint-me	b6f70b8a0e	chore: Fix line spacing	2024-05-18 16:59:20 -04:00
teleprint-me	832b449cbd	feat: Add pre-tokenizer CLI tooling	2024-05-18 14:33:56 -04:00
teleprint-me	04fb7886c5	chore: Apply isort to package gguf init	2024-05-18 14:33:22 -04:00
teleprint-me	2ef73ee6e4	refactor: Apply SoC for HF requests, vocab, and weights	2024-05-18 13:45:21 -04:00
teleprint-me	5eda2c9485	feat: Add pre-tokenizer logging	2024-05-18 13:21:22 -04:00
Georgi Gerganov	059031b8c4	ci : re-enable sanitizer runs (#7358) * Revert "ci : temporary disable sanitizer builds (#6128)" This reverts commit `4f6d1337ca`. * ci : trigger	2024-05-18 18:55:54 +03:00
Georgi Gerganov	511182eabb	android : use "ci-android" branch for CI (#7341) * android : use "ci-android" branch for CI * ggml : disable SIMD exp and silu for 32-bit ARM ggml-ci * android : do not fetch, use add_subdirectory instead * cmake : provide binary dir	2024-05-18 20:40:39 +10:00
Johannes Gäßler	133d99c599	CUDA: deduplicate FlashAttention code (#7352)	2024-05-18 12:36:25 +02:00
Johannes Gäßler	cb42c29427	server: correct --threads documentation [no ci] (#7362)	2024-05-18 11:10:47 +02:00
Engininja2	d233b507cd	cuda : add half2 __shfl_xor() for ROCm 5.5 (#7263)	2024-05-18 10:05:17 +02:00
Steffen Röcker	0f98acfac6	llama : add support for larger Granite Code Models (20B, 34B) (#7324) Tie the weights for ARCH_STARCODER to support the larger Granite code models. Partially addresses ggerganov/issues/7116 There still remains to be a few things to fix. Currently requires `--override-kv tokenizer.ggml.add_bos_token=bool:false`	2024-05-18 11:04:55 +03:00
strawberrymelonpanda	ca57e0f35e	perplexity : ndot progress and show stats with < 100 tasks (#7348) Fix floating point error with ndot printing, allow end stats on lower task numbers if multiple-choice tasks.	2024-05-18 10:57:08 +03:00
0cc4m	c1b295eea5	Update and fix Vulkan soft_max and argsort implementations (#7237) * Update and fix Vulkan softmax implementation * Update and fix Vulkan argsort implementation	2024-05-18 08:10:58 +02:00
Brian	de73196344	github-actions-labeler: initial commit (#7330) * github-actions-labeler: initial commit [no ci] * github actions: remove priority auto labeling [no ci]	2024-05-18 16:04:23 +10:00
Georgi Gerganov	b49a13dd2f	convert : fix set_vocab_sentencepiece (#6866) * convert : fix set_vocab_sentencepiece * Update convert-hf-to-gguf.py	2024-05-18 08:46:20 +03:00
teleprint-me	b2ca23c746	feat: Add method for generating the checksums and writing the results to a json file	2024-05-18 01:46:13 -04:00
teleprint-me	302258721b	refactor: Apply model schema to tokenizer downloads - Add imports for json and hashlib - Add missing models: phi, stablelm, mistral, and mixtral - Fix constructor logic - Fix how models are accessed - Apply model schema to download_model method	2024-05-18 01:26:39 -04:00
teleprint-me	f7515abf49	feat: Add tokenizer types, model types, and model repos	2024-05-18 00:37:19 -04:00
teleprint-me	3ba01c7a0e	chore: Fix spacing	2024-05-18 00:10:42 -04:00
teleprint-me	1a286c8e21	refactor: Clean up variable names and separate concerns when downloading tokenizers	2024-05-17 23:27:30 -04:00
teleprint-me	5c8144e645	feat: Add download_model method and fix references for clarity to mitigate confusion	2024-05-17 23:00:12 -04:00
teleprint-me	4790f76740	feat: Add prototype for requesting vocab related files	2024-05-17 21:08:39 -04:00
teleprint-me	98cf788990	patch: Apply minor fixes for handling headers and writing content	2024-05-17 21:07:51 -04:00
slaren	05834841dc	ggml : fix quants nans when all the group weights are very close to zero (#7313)	2024-05-18 02:39:54 +02:00
Engininja2	ef277de2ad	cmake : fix typo in AMDGPU_TARGETS (#7356)	2024-05-18 02:39:25 +02:00
teleprint-me	742abebb39	refactor: Add log for status and fix url path variable name	2024-05-17 20:37:59 -04:00
teleprint-me	ba13d64bb3	feat: Add utils for logging and writing when interacting with HuggingFaceHub	2024-05-17 20:26:21 -04:00

2967 commits