server : improve infill context reuse (#9894)

ggml-ci
This commit is contained in:
Georgi Gerganov 2024-10-15 16:28:55 +03:00 committed by GitHub
parent fbc98b748e
commit 223c25a72f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 33 additions and 50 deletions

View file

@ -524,10 +524,12 @@ Takes a prefix and a suffix and returns the predicted completion as stream.
- `input_prefix`: Set the prefix of the code to infill.
- `input_suffix`: Set the suffix of the code to infill.
- `prompt`: Added after the `FIM_MID` token
- `extra_context`: Additional context inserted before the FIM prefix. See https://github.com/ggerganov/llama.cpp/pull/9874
- `input_extra`: Additional context inserted before the FIM prefix.
- `prompt`: Added after the `FIM_MID` token
It also accepts all the options of `/completion`.
`input_extra` is array of `{"filename": string, "text": string}` objects.
The endpoint also accepts all the options of `/completion`.
If the model has `FIM_REPO` and `FIM_FILE_SEP` tokens, the [repo-level pattern](https://arxiv.org/pdf/2409.12186) is used:
@ -545,7 +547,7 @@ If the model has `FIM_REPO` and `FIM_FILE_SEP` tokens, the [repo-level pattern](
If the tokens are missing, then the extra context is simply prefixed at the start:
```txt
[extra_context]<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]
[input_extra]<FIM_PRE>[input_prefix]<FIM_SUF>[input_suffix]<FIM_MID>[prompt]
```
### **GET** `/props`: Get server global properties.