main : add --in-prefix-bos to prefix BOS to user inputs; keep EOS (#2304)

* add `--in-prefix-bos` to prefix BOS to user inputs; keep EOS The BOS precedes the string specified by `--in-prefix`. Model generated EOS is now kept in the context. It provides a way to strictly following the prompt format used in Llama-2-chat. The EOS handling also benefits some existing finetunes that uses EOS to mark the end of turn. * examples/common: move input_prefix_bos to other bools
2023-07-25 07:19:11 -05:00 · 2023-07-25 07:19:11 -05:00 · 0c06204fb3
commit 0c06204fb3
parent 1fed755b1f
3 changed files with 34 additions and 17 deletions
--- a/examples/common.h
+++ b/examples/common.h
@ -82,6 +82,7 @@ struct gpt_params {
    bool interactive_first = false; // wait for user input immediately
    bool multiline_input   = false; // reverse the usage of `\`

+    bool input_prefix_bos  = false; // prefix BOS to user inputs, preceding input_prefix
    bool instruct          = false; // instruction mode (used for Alpaca models)
    bool penalize_nl       = true;  // consider newlines as a repeatable token
    bool perplexity        = false; // compute perplexity over the prompt