JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)
* json: rename python schema converter to make import easier * server: skip null json_schema / grammar fields * json: deps management for primitive rules (+ allow null values) * json: optimize repetitions for minItems/maxItems and regexps: `a{,3}` goes from `"a"? "a"? "a"?` (explosive combos) to `(a (a (a)?)?)?` * grammars: add troubleshooting section to readme * json: cap length of numbers to 15 digits before/after decimal point (avoids infinite gen, e.g. "one third" -> `0.333333333333...`) * json: unify all repetition code (w/ or w/o sep) * json: support string minLength/maxLength * server+json: update server/README w/ result_format * nits * json: fix type error w/ python 3.8 * json: fix server/README (json_schema in /completion vs. result_format in /v1/chat/completions) * json: simplify DOT `{"type": "string", "pattern": "^.$"}` * json: remove recursion in opt_repetitions (avoids Python stack overflow) * json: rm dead code * json: rm useless assert & ggml.h import
This commit is contained in:
parent
fbbc030ba9
commit
ab9a3240a9
10 changed files with 2348 additions and 1929 deletions
|
@ -89,3 +89,13 @@ This guide provides a brief overview. Check out the GBNF files in this directory
|
|||
```
|
||||
./main -m <model> --grammar-file grammars/some-grammar.gbnf -p 'Some prompt'
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
Grammars currently have performance gotchas (see https://github.com/ggerganov/llama.cpp/issues/4218).
|
||||
|
||||
### Efficient optional repetitions
|
||||
|
||||
A common pattern is to allow repetitions of a pattern `x` up to N times.
|
||||
|
||||
While semantically correct, the syntax `x? x? x?.... x?` (with N repetitions) will result in extremely slow inference. Instead, you can write `(x (x (x ... (x)?...)?)?)?` (w/ N-deep nesting)
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue