| Author | Commit | Message | Date |
|---|---|---|---|
| Georgi Gerganov | 4ac186aece | llama : update doc [no ci] | 2024-09-07 15:14:37 +03:00 |
| Georgi Gerganov | 2387dbea7d | sampling : fix repeat penalty out-of-bounds access<br>ggml-ci | 2024-09-07 14:50:43 +03:00 |
| Georgi Gerganov | 8a82f388cd | sampling : fix state cloning<br>ggml-ci | 2024-09-07 14:38:00 +03:00 |
| Georgi Gerganov | 0e6d170a50 | sampling : avoid llama_model in few samplers<br>ggml-ci | 2024-09-07 14:16:21 +03:00 |
| Georgi Gerganov | 19c36962f7 | batched.swift : fix build | 2024-09-07 12:49:56 +03:00 |
| Georgi Gerganov | 4b27235624 | style : rearrange code + add comments and TODOs<br>ggml-ci | 2024-09-07 12:29:57 +03:00 |
| Georgi Gerganov | 4a4530b7ff | examples : add missing samplers | 2024-09-07 12:29:57 +03:00 |
| Georgi Gerganov | 9ce9210ef1 | batched.swift : fix build [no ci] | 2024-09-07 12:29:57 +03:00 |
| Georgi Gerganov | befcfe7a31 | common : simplify gpt_sampler<br>ggml-ci | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 757a9bf868 | llama : add new llama_perf API<br>ggml-ci | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 5ab52c1f64 | sampling : remove _context suffix [no ci] | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | b448c753b9 | sampling : remove redundant indirection calls<br>ggml-ci | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 809bdcf767 | sampling : allow passing m to mirostat sampler | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 8c972b69c1 | grammar : restore llama_grammar_accept signature<br>ggml-ci | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 5b01cc8c8e | swift : fix example | 2024-09-07 12:29:56 +03:00 |
| Georgi Gerganov | 82a89df960 | sampling : improve mirostat implementation<br>ggml-ci | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | bd88352834 | ios : try to fix build | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | 34f4bd02da | sampling : fix cloning of samplers with null ctx<br>ggml-ci | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | 0b6dfcebb2 | llama : remove llama_constraint<br>ggml-ci | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | a2d8b27a4b | llama : restore comments in llama.h<br>ggml-ci | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | 595711417a | sampling : add name API + option to disable timings | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | ebeb65194b | sampling : change _cp/copy to clone | 2024-09-07 12:29:55 +03:00 |
| Georgi Gerganov | 69551ffd60 | sampling : remove top-k min_keep, fix mirostat init and state | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | b2b36e9e95 | example : fix build + fix speculative<br>ggml-ci | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | 9b950671f4 | sampling : fix grammar apply | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | 8e80a1cf6b | sampling : simplify sample API<br>ggml-ci | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | e7a11cac0e | sampling : simplify new llama_sampler calls | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | 784a644040 | sampler : API to iterate constraints<br>ggml-ci | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | 0e1378c844 | sampling : convert mirostat samplers to constraints<br>ggml-ci | 2024-09-07 12:29:54 +03:00 |
| Georgi Gerganov | 1a0de0b781 | constraint : add name API<br>ggml-ci | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | c024fe45b0 | constraint : clean-up and simplify | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | ca5d21c17a | grammar : fix reset call<br>ggml-ci | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | fdb52aa657 | common : fix gpt_sampler_cp<br>ggml-ci | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | ad436e9284 | examples : fix build<br>ggml-ci | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | a0b91214b4 | cont : use new API in examples<br>ggml-ci | 2024-09-07 12:29:53 +03:00 |
| Georgi Gerganov | 437376e708 | cont : add n_prev to llama_sampler_params | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 91cbb40b29 | cont : common/sampling use the new API [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 1e8e26c155 | cont : leaner constraint initialization [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 09ceb68caa | cont : add comments [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | a2ce91cbef | cont : add penalties and logit-bias constraints [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 0daebc6b8d | cont : fix [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 71293a6456 | cont : add rest of the existing samplers [no ci] | 2024-09-07 12:29:52 +03:00 |
| Georgi Gerganov | 1b07dc51c6 | cont : fixes, naming [no ci] | 2024-09-07 12:29:51 +03:00 |
| Georgi Gerganov | cf4dd10ea5 | cont : initial implementation sketch [no ci] | 2024-09-07 12:29:51 +03:00 |
| Georgi Gerganov | 5116b3681c | cont : add llama_constraint_i [no ci] | 2024-09-07 12:29:51 +03:00 |
| Georgi Gerganov | 86b07ccbb3 | llama : sketching new sampling API | 2024-09-07 12:29:51 +03:00 |
| Georgi Gerganov | ab545c8380 | llama : add llama_sampling API + move grammar in libllama<br>ggml-ci | 2024-09-07 12:29:51 +03:00 |
| slaren | 6c89eb0b47 | ci : disable rocm image creation (#9340) | 2024-09-07 10:48:54 +03:00 |
| Xuan Son Nguyen | 9b2c24c099 | server : simplify state machine for slot (#9283)<br>* server : simplify state machine for slot<br>* add SLOT_STATE_DONE_PROMPT<br>* pop_deferred_task<br>* add missing notify_one<br>* fix passkey test<br>* metrics : add n_busy_slots_per_decode<br>* fix test step<br>* add test<br>* maybe fix AddressSanitizer?<br>* fix deque ?<br>* missing lock<br>* pop_deferred_task: also notify<br>* Update examples/server/server.cpp<br>Co-authored-by: Georgi Gerganov <ggerganov@gmail.com><br>---------<br>Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> | 2024-09-06 23:21:29 +02:00 |
| Aarni Koskela | 134bc38ecf | llama-bench : log benchmark progress (#9287)<br>* llama-bench : add optional progress messages | 2024-09-06 23:03:01 +02:00 |