Commit graph

3724 commits

Author SHA1 Message Date
Georgi Gerganov
8a82f388cd
sampling : fix state cloning
ggml-ci
2024-09-07 14:38:00 +03:00
Georgi Gerganov
0e6d170a50
sampling : avoid llama_model in few samplers
ggml-ci
2024-09-07 14:16:21 +03:00
Georgi Gerganov
19c36962f7
batched.swift : fix build
2024-09-07 12:49:56 +03:00
Georgi Gerganov
4b27235624
style : rearrange code + add comments and TODOs
ggml-ci
2024-09-07 12:29:57 +03:00
Georgi Gerganov
4a4530b7ff
examples : add missing samplers
2024-09-07 12:29:57 +03:00
Georgi Gerganov
9ce9210ef1
batched.swift : fix build [no ci]
2024-09-07 12:29:57 +03:00
Georgi Gerganov
befcfe7a31
common : simplify gpt_sampler
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
757a9bf868
llama : add new llama_perf API
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
5ab52c1f64
sampling : remove _context suffix [no ci]
2024-09-07 12:29:56 +03:00
Georgi Gerganov
b448c753b9
sampling : remove redundant indirection calls
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
809bdcf767
sampling : allow passing m to mirostat sampler
2024-09-07 12:29:56 +03:00
Georgi Gerganov
8c972b69c1
grammar : restore llama_grammar_accept signature
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
5b01cc8c8e
swift : fix example
2024-09-07 12:29:56 +03:00
Georgi Gerganov
82a89df960
sampling : improve mirostat implementation
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
bd88352834
ios : try to fix build
2024-09-07 12:29:55 +03:00
Georgi Gerganov
34f4bd02da
sampling : fix cloning of samplers with null ctx
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
0b6dfcebb2
llama : remove llama_constraint
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
a2d8b27a4b
llama : restore comments in llama.h
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
595711417a
sampling : add name API + option to disable timings
2024-09-07 12:29:55 +03:00
Georgi Gerganov
ebeb65194b
sampling : change _cp/copy to clone
2024-09-07 12:29:55 +03:00
Georgi Gerganov
69551ffd60
sampling : remove top-k min_keep, fix mirostat init and state
2024-09-07 12:29:54 +03:00
Georgi Gerganov
b2b36e9e95
example : fix build + fix speculative
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
9b950671f4
sampling : fix grammar apply
2024-09-07 12:29:54 +03:00
Georgi Gerganov
8e80a1cf6b
sampling : simplify sample API
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
e7a11cac0e
sampling : simplify new llama_sampler calls
2024-09-07 12:29:54 +03:00
Georgi Gerganov
784a644040
sampler : API to iterate constraints
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
0e1378c844
sampling : convert mirostat samplers to constraints
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
1a0de0b781
constraint : add name API
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
c024fe45b0
constraint : clean-up and simplify
2024-09-07 12:29:53 +03:00
Georgi Gerganov
ca5d21c17a
grammar : fix reset call
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
fdb52aa657
common : fix gpt_sampler_cp
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
ad436e9284
examples : fix build
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
a0b91214b4
cont : use new API in examples
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
437376e708
cont : add n_prev to llama_sampler_params
2024-09-07 12:29:52 +03:00
Georgi Gerganov
91cbb40b29
cont : common/sampling use the new API [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
1e8e26c155
cont : leaner constraint initialization [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
09ceb68caa
cont : add comments [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
a2ce91cbef
cont : add penalties and logit-bias constraints [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
0daebc6b8d
cont : fix [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
71293a6456
cont : add rest of the existing samplers [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
1b07dc51c6
cont : fixes, naming [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
cf4dd10ea5
cont : initial implementation sketch [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
5116b3681c
cont : add llama_constraint_i [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
86b07ccbb3
llama : sketching new sampling API
2024-09-07 12:29:51 +03:00
Georgi Gerganov
ab545c8380
llama : add llama_sampling API + move grammar in libllama
ggml-ci
2024-09-07 12:29:51 +03:00
slaren
6c89eb0b47
ci : disable rocm image creation (#9340)
2024-09-07 10:48:54 +03:00
Xuan Son Nguyen
9b2c24c099
server : simplify state machine for slot (#9283)
* server : simplify state machine for slot

* add SLOT_STATE_DONE_PROMPT

* pop_deferred_task

* add missing notify_one

* fix passkey test

* metrics : add n_busy_slots_per_decode

* fix test step

* add test

* maybe fix AddressSanitizer?

* fix deque ?

* missing lock

* pop_deferred_task: also notify

* Update examples/server/server.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-09-06 23:21:29 +02:00
Aarni Koskela
134bc38ecf
llama-bench : log benchmark progress (#9287)
* llama-bench : add optional progress messages
2024-09-06 23:03:01 +02:00
Aarni Koskela
815b1fb20a
batched-bench : add --output-format jsonl option (#9293)
`--output-format` is modeled after `llama-bench`'s options
2024-09-06 17:59:58 +02:00
Changyeon Kim
409dc4f8bb
ggml : fix build break for the vulkan-debug (#9265)
- windows build : Ok.
- linux build : Ok.

Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
2024-09-06 15:54:50 +03:00