Commit graph

2967 commits

Author SHA1 Message Date
teleprint-me
2fa2c7a86c
chore: Move enums and model map to constants 2024-05-20 14:51:03 -04:00
teleprint-me
d9ba963cd4
refactor: Restructure tokenizer model metadata 2024-05-20 14:42:59 -04:00
teleprint-me
18bb36e496
chore: Allow the user to config the logger 2024-05-20 14:06:21 -04:00
teleprint-me
bdd0286bd0
refactor: Use proper names for referenced member variables 2024-05-20 01:39:09 -04:00
teleprint-me
a1951e27dc
refactor: Add proper names for remote model references 2024-05-20 01:36:44 -04:00
teleprint-me
6fc4492b3f
chore: Add english pangram to vocab tests 2024-05-20 00:51:35 -04:00
teleprint-me
381dad5eb3
fix: Add missing model architectures 2024-05-20 00:50:42 -04:00
teleprint-me
9a2834e24e
fix: Use __name__ as logger name 2024-05-19 22:39:30 -04:00
teleprint-me
a0362ea475
patch: Fix nested quotes for dict refs 2024-05-19 22:39:05 -04:00
teleprint-me
89a46fe818
feat: Attempt to mirror the llama.cpp API for compatibility 2024-05-19 22:31:05 -04:00
teleprint-me
c6f2a48af7
feat: Add prototype for identifying the vocab type 2024-05-19 22:30:37 -04:00
teleprint-me
dcc5d4241d
fix: Remove dangling if statement 2024-05-19 00:06:30 -04:00
teleprint-me
5840b6f0b0
refactor: Simplify the get_vocab_base_pre method 2024-05-18 23:59:52 -04:00
teleprint-me
316b404d94
patch: Fix CLI option for generating vocab tests 2024-05-18 23:59:22 -04:00
teleprint-me
da5deebda1
fix: Apply fix to verbose help description and generating vocab tests option 2024-05-18 23:34:33 -04:00
teleprint-me
ce777c8910
Merge branch 'master' into auto-model-support 2024-05-18 22:46:00 -04:00
teleprint-me
d02a0f42f9
feat: Add vocab generation script 2024-05-18 22:15:12 -04:00
teleprint-me
bd32266c87
feat: Add function for generating vocab script and fix CLI opts 2024-05-18 22:14:58 -04:00
teleprint-me
0479e9695f
patch: Add exception handling for non-existent vocab related files 2024-05-18 22:14:19 -04:00
teleprint-me
4b3735ca50
chore: Remove cluttered vocab files 2024-05-18 22:13:21 -04:00
teleprint-me
1a82573126
feat: Add example script for automating generating tokenizer model checksums and tests 2024-05-18 20:49:22 -04:00
teleprint-me
006bb60d27
chore: Fix model path references 2024-05-18 19:20:19 -04:00
fraxy-v
f5bf761747
Capture CUDA logging output (#7298)
* logging: output capture in cuda module

* fix compile error

* fix: vsnprintf terminates with 0, string use not correct

* post review

* Update llama.cpp

Co-authored-by: slaren <slarengh@gmail.com>

* Update llama.cpp

Co-authored-by: slaren <slarengh@gmail.com>

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-05-19 00:44:42 +02:00
teleprint-me
b6f70b8a0e
chore: Fix line spacing 2024-05-18 16:59:20 -04:00
teleprint-me
832b449cbd
feat: Add pre-tokenizer CLI tooling 2024-05-18 14:33:56 -04:00
teleprint-me
04fb7886c5
chore: Apply isort to package gguf init 2024-05-18 14:33:22 -04:00
teleprint-me
2ef73ee6e4
refactor: Apply SoC for HF requests, vocab, and weights 2024-05-18 13:45:21 -04:00
teleprint-me
5eda2c9485
feat: Add pre-tokenizer logging 2024-05-18 13:21:22 -04:00
Georgi Gerganov
059031b8c4
ci : re-enable sanitizer runs (#7358)
* Revert "ci : temporary disable sanitizer builds (#6128)"

This reverts commit 4f6d1337ca.

* ci : trigger
2024-05-18 18:55:54 +03:00
Georgi Gerganov
511182eabb
android : use "ci-android" branch for CI (#7341)
* android : use "ci-android" branch for CI

* ggml : disable SIMD exp and silu for 32-bit ARM

ggml-ci

* android : do not fetch, use add_subdirectory instead

* cmake : provide binary dir
2024-05-18 20:40:39 +10:00
Johannes Gäßler
133d99c599
CUDA: deduplicate FlashAttention code (#7352) 2024-05-18 12:36:25 +02:00
Johannes Gäßler
cb42c29427
server: correct --threads documentation [no ci] (#7362) 2024-05-18 11:10:47 +02:00
Engininja2
d233b507cd
cuda : add half2 __shfl_xor() for ROCm 5.5 (#7263) 2024-05-18 10:05:17 +02:00
Steffen Röcker
0f98acfac6
llama : add support for larger Granite Code Models (20B, 34B) (#7324)
Tie the weights for ARCH_STARCODER to support the larger Granite code models.
Partially addresses ggerganov/issues/7116

There still remains to be a few things to fix.
Currently requires `--override-kv tokenizer.ggml.add_bos_token=bool:false`
2024-05-18 11:04:55 +03:00
strawberrymelonpanda
ca57e0f35e
perplexity : ndot progress and show stats with < 100 tasks (#7348)
Fix floating point error with ndot printing, allow end stats on lower task numbers if multiple-choice tasks.
2024-05-18 10:57:08 +03:00
0cc4m
c1b295eea5
Update and fix Vulkan soft_max and argsort implementations (#7237)
* Update and fix Vulkan softmax implementation

* Update and fix Vulkan argsort implementation
2024-05-18 08:10:58 +02:00
Brian
de73196344
github-actions-labeler: initial commit (#7330)
* github-actions-labeler: initial commit [no ci]

* github actions: remove priority auto labeling [no ci]
2024-05-18 16:04:23 +10:00
Georgi Gerganov
b49a13dd2f
convert : fix set_vocab_sentencepiece (#6866)
* convert : fix set_vocab_sentencepiece

* Update convert-hf-to-gguf.py
2024-05-18 08:46:20 +03:00
teleprint-me
b2ca23c746
feat: Add method for generating the checksums and writing the results to a json file 2024-05-18 01:46:13 -04:00
teleprint-me
302258721b
refactor: Apply model schema to tokenizer downloads
- Add imports for json and hashlib
- Add missing models: phi, stablelm, mistral, and mixtral
- Fix constructor logic
- Fix how models are accessed
- Apply model schema to download_model method
2024-05-18 01:26:39 -04:00
teleprint-me
f7515abf49
feat: Add tokenizer types, model types, and model repos 2024-05-18 00:37:19 -04:00
teleprint-me
3ba01c7a0e
chore: Fix spacing 2024-05-18 00:10:42 -04:00
teleprint-me
1a286c8e21
refactor: Clean up variable names and separate concerns when downloading tokenizers 2024-05-17 23:27:30 -04:00
teleprint-me
5c8144e645
feat: Add download_model method and fix references for clarity to mitigate confusion 2024-05-17 23:00:12 -04:00
teleprint-me
4790f76740
feat: Add prototype for requesting vocab related files 2024-05-17 21:08:39 -04:00
teleprint-me
98cf788990
patch: Apply minor fixes for handling headers and writing content 2024-05-17 21:07:51 -04:00
slaren
05834841dc
ggml : fix quants nans when all the group weights are very close to zero (#7313) 2024-05-18 02:39:54 +02:00
Engininja2
ef277de2ad
cmake : fix typo in AMDGPU_TARGETS (#7356) 2024-05-18 02:39:25 +02:00
teleprint-me
742abebb39
refactor: Add log for status and fix url path variable name 2024-05-17 20:37:59 -04:00
teleprint-me
ba13d64bb3
feat: Add utils for logging and writing when interacting with HuggingFaceHub 2024-05-17 20:26:21 -04:00
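
Note on commit b2ca23c746 ("Add method for generating the checksums and writing the results to a json file"): the log does not show the implementation, so the following is only a minimal sketch of the general idea. The function names `file_sha256` and `write_checksums`, the flat directory layout, and the choice of SHA-256 are all assumptions made for illustration and are not taken from the commit itself.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large tokenizer files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(model_dir: Path, output_file: Path) -> dict[str, str]:
    """Hash every regular file in model_dir and write {name: digest} as JSON.

    Hypothetical helper: the real script may use a different schema.
    """
    checksums = {
        entry.name: file_sha256(entry)
        for entry in sorted(model_dir.iterdir())
        if entry.is_file()
    }
    output_file.write_text(json.dumps(checksums, indent=2))
    return checksums
```

Writing the JSON file outside the hashed directory keeps it from being included in its own checksum map on a later run.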