command-r : add BPE pre-tokenization (#7063)

* Add BPE pre-tokenization for Command-R/R+.

* Bump transformers convert requirement.

* command-r : add individual digits regex

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
DAN™ 2024-05-05 01:19:30 -04:00 committed by GitHub
parent 6fbd432211
commit 889bdd7686
GPG key ID: B5690EEEBB952194
9 changed files with 168 additions and 1 deletions


@@ -1,5 +1,5 @@
 numpy~=1.24.4
 sentencepiece~=0.1.98
-transformers>=4.35.2,<5.0.0
+transformers>=4.40.1,<5.0.0
 gguf>=0.1.0
 protobuf>=4.21.0,<5.0.0
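
The bumped requirement restricts transformers to versions at or above 4.40.1 and below 5.0.0. As a rough illustration of what that range admits, here is a minimal stdlib-only sketch. The helper names `parse_version` and `satisfies` are hypothetical and not part of this commit, and the parser assumes purely numeric dotted versions (no pre-release or local suffixes), which real specifier handling in pip does not assume.

```python
def parse_version(text):
    """Turn '4.40.1' into (4, 40, 1). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower="4.40.1", upper="5.0.0"):
    """Return True if lower <= version < upper (the range in the bumped pin)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    # The previous lower bound, the new lower bound, a later 4.x, and the excluded 5.0.0.
    for candidate in ("4.35.2", "4.40.1", "4.41.0", "5.0.0"):
        print(candidate, satisfies(candidate))
```

Comparing integer tuples rather than raw strings matters here: as strings, "4.9.0" would sort after "4.40.1", while as tuples (4, 9, 0) correctly sorts before (4, 40, 1).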