teleprint-me
|
18bb36e496
|
chore: Allow the user to config the logger
|
2024-05-20 14:06:21 -04:00 |
|
Georgi Gerganov
|
fabf30b4c4
|
llama : remove Persimmon (#7408)
* llama : remove Persimmon
* requirements : remove
|
2024-05-21 02:35:28 +10:00 |
|
Johannes Gäßler
|
20385cebcc
|
perplexity: update README FP16 results [no ci] (#7413)
|
2024-05-20 18:15:38 +02:00 |
|
Radoslav Gerganov
|
db10f01310
|
rpc : track allocated buffers (#7411)
* rpc : track allocated buffers
ref: #7407
* rpc : pack rpc_tensor tightly
|
2024-05-20 16:36:55 +03:00 |
|
Georgi Gerganov
|
3bc10cb485
|
server : fix temperature + disable some tests (#7409)
* server : fix temperature
* server : disable tests relying on parallel determinism
* ci : change server Debug -> RelWithDebInfo
|
2024-05-20 22:10:03 +10:00 |
|
AidanBeltonS
|
6bf9b66fa3
|
[SYCL] Update SYCL upscale operation (#7321)
* Update SYCL upscale operation
* Formatting
* Remove messages
|
2024-05-20 16:38:23 +05:30 |
|
Bingan
|
26cd4237bc
|
Update README.md (#7410)
|
2024-05-20 11:55:34 +02:00 |
|
Herman Semenov
|
213e90ed73
|
ggml-opencl, llama: using reserve() if count already known (#7272)
|
2024-05-20 10:33:21 +03:00 |
|
junchao-loongson
|
65c58207ec
|
ggml : add loongarch lsx and lasx support (#6454)
* add loongarch lsx and lasx optimize code
* Add loongarch compilation support to makefile
* revert stb_image.h
* opt bytes_from_nibbles_32 and sum_i16_pairs_float
* fix undeclared
* format code
* update
* update 2
---------
Co-authored-by: Jinyang He <hejinyang@loongson.cn>
|
2024-05-20 10:19:21 +03:00 |
|
Georgi Gerganov
|
1cc0155d04
|
server : tuning tests (#7388)
* server : don't pass temperature as string
* server : increase timeout
* tests : fix the fix 0.8f -> 0.8
ggml-ci
* tests : set explicit temperature
|
2024-05-20 10:16:41 +03:00 |
|
Georgi Gerganov
|
e932094d58
|
server : return error on too large embedding input (#7389)
|
2024-05-20 08:56:05 +03:00 |
|
Georgi Gerganov
|
2789baf480
|
tests : fix --keep_split -> --keep-split (#7374)
|
2024-05-20 08:55:09 +03:00 |
|
teleprint-me
|
bdd0286bd0
|
refactor: Use proper names for referenced member variables
|
2024-05-20 01:39:09 -04:00 |
|
teleprint-me
|
a1951e27dc
|
refactor: Add proper names for remote model references
|
2024-05-20 01:36:44 -04:00 |
|
teleprint-me
|
6fc4492b3f
|
chore: Add english pangram to vocab tests
|
2024-05-20 00:51:35 -04:00 |
|
teleprint-me
|
381dad5eb3
|
fix: Add missing model architectures
|
2024-05-20 00:50:42 -04:00 |
|
teleprint-me
|
9a2834e24e
|
fix: Use __name__ as logger name
|
2024-05-19 22:39:30 -04:00 |
|
teleprint-me
|
a0362ea475
|
patch: Fix nested quotes for dict refs
|
2024-05-19 22:39:05 -04:00 |
|
teleprint-me
|
89a46fe818
|
feat: Attempt to mirror the llama.cpp API for compatibility
|
2024-05-19 22:31:05 -04:00 |
|
teleprint-me
|
c6f2a48af7
|
feat: Add prototype for identifying the vocab type
|
2024-05-19 22:30:37 -04:00 |
|
Srihari-mcw
|
33c8d50acc
|
Add provisions for windows support for BF16 code including CMake provision for enabling AVX512_BF16 (#7258)
|
2024-05-20 12:18:39 +10:00 |
|
slaren
|
d359f30921
|
llama : remove MPI backend (#7395)
|
2024-05-20 01:17:03 +02:00 |
|
Fred Douglas
|
1ea2a0036e
|
quantize : fix --keep-split check (#7374)
|
2024-05-19 19:37:04 +03:00 |
|
0cc4m
|
f030ec1f7a
|
Vulkan Embedding Fix (#7360)
* Fix empty Vulkan host buffers
Add fp32 fp16 matmul shader
Fix matmul shader alignment
* Remove deprecated tensor->backend uses
* Fix Vulkan validation errors on embedding models with no offloaded layers
* Fix Vulkan llava segfault when not offloading layers
|
2024-05-19 17:19:53 +02:00 |
|
slaren
|
e4e6f67be6
|
ggml : fix another case of quants nans (#7387)
|
2024-05-19 17:08:46 +02:00 |
|
Johannes Gäßler
|
5ca49cbecd
|
ggml: implement quantized KV cache for FA (#7372)
|
2024-05-19 16:46:13 +02:00 |
|
Johannes Gäßler
|
1b01f06db0
|
server: add test for token probs (#7347)
|
2024-05-19 16:26:02 +02:00 |
|
Johannes Gäßler
|
41858392e1
|
server: fix seed being reported back (#7382)
|
2024-05-19 17:06:33 +03:00 |
|
Anas Ahouzi
|
6aade19ee7
|
Add StableLM2 pre-tokenizer (#7349)
* Add StableLM pre-tokenizer
* Fix space
* Fix trailing whitespace
|
2024-05-19 22:46:46 +10:00 |
|
slaren
|
ab33f7a338
|
cuda : clear error after buffer allocation failure (#7376)
|
2024-05-19 14:19:37 +02:00 |
|
Brian
|
e23b974f4c
|
labeler.yml: Use settings from ggerganov/llama.cpp [no ci] (#7363)
https://github.com/actions/labeler#using-configuration-path-input-together-with-the-actionscheckout-action
Recommends the use of checkout action to use the correct repo context
when applying settings for PR labels
e.g.
    steps:
    - uses: actions/checkout@v4 # Uploads repository content to the runner
      with:
        repository: "owner/repositoryName" # The one of the available inputs, visit https://github.com/actions/checkout#readme to find more
    - uses: actions/labeler@v5
      with:
        configuration-path: 'path/to/the/uploaded/configuration/file'
|
2024-05-19 20:51:03 +10:00 |
|
Georgi Gerganov
|
854d365aba
|
cmake : update android comments (#7341)
|
2024-05-19 11:01:01 +03:00 |
|
teleprint-me
|
dcc5d4241d
|
fix: Remove dangling if statement
|
2024-05-19 00:06:30 -04:00 |
|
teleprint-me
|
5840b6f0b0
|
refactor: Simplify the get_vocab_base_pre method
|
2024-05-18 23:59:52 -04:00 |
|
teleprint-me
|
316b404d94
|
patch: Fix CLI option for generating vocab tests
|
2024-05-18 23:59:22 -04:00 |
|
teleprint-me
|
da5deebda1
|
fix: Apply fix to verbose help description and generating vocab tests option
|
2024-05-18 23:34:33 -04:00 |
|
teleprint-me
|
ce777c8910
|
Merge branch 'master' into auto-model-support
|
2024-05-18 22:46:00 -04:00 |
|
teleprint-me
|
d02a0f42f9
|
feat: Add vocab generation script
|
2024-05-18 22:15:12 -04:00 |
|
teleprint-me
|
bd32266c87
|
feat: Add function for generating vocab script and fix CLI opts
|
2024-05-18 22:14:58 -04:00 |
|
teleprint-me
|
0479e9695f
|
patch: Add exception handling for non-existent vocab related files
|
2024-05-18 22:14:19 -04:00 |
|
teleprint-me
|
4b3735ca50
|
chore: Remove cluttered vocab files
|
2024-05-18 22:13:21 -04:00 |
|
teleprint-me
|
1a82573126
|
feat: Add example script for automating generating tokenizer model checksums and tests
|
2024-05-18 20:49:22 -04:00 |
|
teleprint-me
|
006bb60d27
|
chore: Fix model path references
|
2024-05-18 19:20:19 -04:00 |
|
fraxy-v
|
f5bf761747
|
Capture CUDA logging output (#7298)
* logging: output capture in cuda module
* fix compile error
* fix: vsnprintf terminates with 0, string use not correct
* post review
* Update llama.cpp
Co-authored-by: slaren <slarengh@gmail.com>
* Update llama.cpp
Co-authored-by: slaren <slarengh@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com>
|
2024-05-19 00:44:42 +02:00 |
|
teleprint-me
|
b6f70b8a0e
|
chore: Fix line spacing
|
2024-05-18 16:59:20 -04:00 |
|
teleprint-me
|
832b449cbd
|
feat: Add pre-tokenizer CLI tooling
|
2024-05-18 14:33:56 -04:00 |
|
teleprint-me
|
04fb7886c5
|
chore: Apply isort to package gguf init
|
2024-05-18 14:33:22 -04:00 |
|
teleprint-me
|
2ef73ee6e4
|
refactor: Apply SoC for HF requests, vocab, and weights
|
2024-05-18 13:45:21 -04:00 |
|
teleprint-me
|
5eda2c9485
|
feat: Add pre-tokenizer logging
|
2024-05-18 13:21:22 -04:00 |
|
Georgi Gerganov
|
059031b8c4
|
ci : re-enable sanitizer runs (#7358)
* Revert "ci : temporary disable sanitizer builds (#6128)"
This reverts commit 4f6d1337ca.
* ci : trigger
|
2024-05-18 18:55:54 +03:00 |
|