Georgi Gerganov
19c36962f7
batched.swift : fix build
2024-09-07 12:49:56 +03:00
Georgi Gerganov
4b27235624
style : rearrange code + add comments and TODOs
ggml-ci
2024-09-07 12:29:57 +03:00
Georgi Gerganov
4a4530b7ff
examples : add missing samplers
2024-09-07 12:29:57 +03:00
Georgi Gerganov
9ce9210ef1
batched.swift : fix build [no ci]
2024-09-07 12:29:57 +03:00
Georgi Gerganov
befcfe7a31
common : simplify gpt_sampler
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
757a9bf868
llama : add new llama_perf API
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
5ab52c1f64
sampling : remove _context suffix [no ci]
2024-09-07 12:29:56 +03:00
Georgi Gerganov
b448c753b9
sampling : remove redundant indirection calls
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
809bdcf767
sampling : allow passing m to mirostat sampler
2024-09-07 12:29:56 +03:00
Georgi Gerganov
8c972b69c1
grammar : restore llama_grammar_accept signature
ggml-ci
2024-09-07 12:29:56 +03:00
Georgi Gerganov
5b01cc8c8e
swift : fix example
2024-09-07 12:29:56 +03:00
Georgi Gerganov
82a89df960
sampling : improve mirostat implementation
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
bd88352834
ios : try to fix build
2024-09-07 12:29:55 +03:00
Georgi Gerganov
34f4bd02da
sampling : fix cloning of samplers with null ctx
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
0b6dfcebb2
llama : remove llama_constraint
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
a2d8b27a4b
llama : restore comments in llama.h
ggml-ci
2024-09-07 12:29:55 +03:00
Georgi Gerganov
595711417a
sampling : add name API + option to disable timings
2024-09-07 12:29:55 +03:00
Georgi Gerganov
ebeb65194b
sampling : change _cp/copy to clone
2024-09-07 12:29:55 +03:00
Georgi Gerganov
69551ffd60
sampling : remove top-k min_keep, fix mirostat init and state
2024-09-07 12:29:54 +03:00
Georgi Gerganov
b2b36e9e95
example : fix build + fix speculative
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
9b950671f4
sampling : fix grammar apply
2024-09-07 12:29:54 +03:00
Georgi Gerganov
8e80a1cf6b
sampling : simplify sample API
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
e7a11cac0e
sampling : simplify new llama_sampler calls
2024-09-07 12:29:54 +03:00
Georgi Gerganov
784a644040
sampler : API to iterate constraints
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
0e1378c844
sampling : convert mirostat samplers to constraints
ggml-ci
2024-09-07 12:29:54 +03:00
Georgi Gerganov
1a0de0b781
constraint : add name API
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
c024fe45b0
constraint : clean-up and simplify
2024-09-07 12:29:53 +03:00
Georgi Gerganov
ca5d21c17a
grammar : fix reset call
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
fdb52aa657
common : fix gpt_sampler_cp
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
ad436e9284
examples : fix build
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
a0b91214b4
cont : use new API in examples
ggml-ci
2024-09-07 12:29:53 +03:00
Georgi Gerganov
437376e708
cont : add n_prev to llama_sampler_params
2024-09-07 12:29:52 +03:00
Georgi Gerganov
91cbb40b29
cont : common/sampling use the new API [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
1e8e26c155
cont : leaner constraint initialization [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
09ceb68caa
cont : add comments [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
a2ce91cbef
cont : add penalties and logit-bias constraints [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
0daebc6b8d
cont : fix [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
71293a6456
cont : add rest of the existing samplers [no ci]
2024-09-07 12:29:52 +03:00
Georgi Gerganov
1b07dc51c6
cont : fixes, naming [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
cf4dd10ea5
cont : initial implementation sketch [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
5116b3681c
cont : add llama_constraint_i [no ci]
2024-09-07 12:29:51 +03:00
Georgi Gerganov
86b07ccbb3
llama : sketching new sampling API
2024-09-07 12:29:51 +03:00
Georgi Gerganov
ab545c8380
llama : add llama_sampling API + move grammar in libllama
ggml-ci
2024-09-07 12:29:51 +03:00
slaren
6c89eb0b47
ci : disable rocm image creation ( #9340 )
2024-09-07 10:48:54 +03:00
Xuan Son Nguyen
9b2c24c099
server : simplify state machine for slot ( #9283 )
* server : simplify state machine for slot
* add SLOT_STATE_DONE_PROMPT
* pop_deferred_task
* add missing notify_one
* fix passkey test
* metrics : add n_busy_slots_per_decode
* fix test step
* add test
* maybe fix AddressSanitizer?
* fix deque ?
* missing lock
* pop_deferred_task: also notify
* Update examples/server/server.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-09-06 23:21:29 +02:00
Aarni Koskela
134bc38ecf
llama-bench : log benchmark progress ( #9287 )
* llama-bench : add optional progress messages
2024-09-06 23:03:01 +02:00
Aarni Koskela
815b1fb20a
batched-bench : add --output-format jsonl option ( #9293 )
`--output-format` is modeled after `llama-bench`'s options
2024-09-06 17:59:58 +02:00
Changyeon Kim
409dc4f8bb
ggml : fix build break for the vulkan-debug ( #9265 )
- windows build : Ok.
- linux build : Ok.
Signed-off-by: Changyeon Kim <cyzero.kim@samsung.com>
2024-09-06 15:54:50 +03:00
Xuan Son Nguyen
4a1411b4f1
server : fix missing lock ( #9334 )
2024-09-06 14:06:04 +02:00
Markus Tavenrath
8ebe8ddebd
Improve Vulkan shader build system ( #9239 )
* Improve Vulkan shader build system
- Add dependency to vulkan-shaders-gen to rebuild shaders when changing the shader compilation utility.
- Add option to generate debug info for Vulkan shaders to provide shader source to Vulkan shader profiling tools
* remove not required self dependency
2024-09-06 08:56:17 +02:00