speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example

* speculative : print encoding speed

* speculative : add --draft CLI arg
Georgi Gerganov 2023-09-03 15:12:08 +03:00 committed by GitHub
parent 8f429fa511
commit 47068e5170
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 440 additions and 115 deletions

@@ -23,6 +23,7 @@ else()
     add_subdirectory(train-text-from-scratch)
     add_subdirectory(convert-llama2c-to-ggml)
     add_subdirectory(simple)
+    add_subdirectory(speculative)
     add_subdirectory(embd-input)
     add_subdirectory(llama-bench)
     add_subdirectory(beam-search)