From 42fb6707e85fe2cafe6f744a9110b7391e5b76a8 Mon Sep 17 00:00:00 2001
From: Mathijs Henquet
Date: Thu, 22 Aug 2024 00:41:44 +0200
Subject: [PATCH] Add example of token splitting

---
 examples/server/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/examples/server/README.md b/examples/server/README.md
index 82f9a373f..db3c7f6ff 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -529,6 +529,16 @@ If `with_pieces` is `true`:
 }
 ```
 
+With input 'á' (utf8 hex: C3 A1) on tinyllama/stories260k
+```json
+{
+  "tokens": [
+    {"id": 198, "piece": [195]}, // hex C3
+    {"id": 164, "piece": [161]} // hex A1
+  ]
+}
+```
+
 ### POST `/detokenize`: Convert tokens to text
 
 *Options:*