Georgi Gerganov
952d03dbea
convert : use utf8 encoding ( #7000 )
* convert : use utf8 encoding
* convert : update instructions and warning message
2024-04-30 11:05:25 +03:00
Olivier Chafik
8843a98c2b
Improve usability of --model-url & related flags ( #6930 )
* args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf)
* args: main & server now call gpt_params_handle_model_default
* args: define DEFAULT_MODEL_PATH + update cli docs
* curl: check url of previous download (.json metadata w/ url, etag & lastModified)
* args: fix update to quantize-stats.cpp
* curl: support legacy .etag / .lastModified companion files
* curl: rm legacy .etag file support
* curl: reuse regex across headers callback calls
* curl: unique_ptr to manage lifecycle of curl & outfile
* curl: nit: no need for multiline regex flag
* curl: update failed test (model file collision) + gitignore *.gguf.json
2024-04-30 00:52:50 +01:00
Clint Herron
b8c1476e44
Extending grammar integration tests ( #6644 )
* Cleaning up integration tests to share code between tests and make it simpler to add new tests.
* Add tests around quantifiers to ensure both matching and non-matching compliance.
* Add slightly more complex grammar with quantifiers to test references with quantifiers.
* Fixing build when C++17 is not present.
* Separating test calls to give more helpful stack traces on failure. Adding verbose messages to give visibility for what is being tested.
* Adding quotes around strings to explicitly show whitespace
* Removing trailing whitespace.
* Implementing suggestions from @ochafik -- grammars and test strings now print and flush before tests to aid in debugging segfaults and whatnot.
* Cleaning up forgotten symbols. Modifying simple test to use test harness. Added comments for more verbose descriptions of what each test is accomplishing.
* Unicode symbol modifications to hopefully make log easier to parse visually.
2024-04-29 14:40:14 -04:00
Daniel Bevenius
5539e6fdd1
main : fix typo in comment in main.cpp ( #6985 )
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-04-29 13:56:59 -04:00
Olivier Chafik
b8a7a5a90f
build(cmake): simplify instructions (cmake -B build && cmake --build build ...) ( #6964 )
* readme: cmake . -B build && cmake --build build
* build: fix typo
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
* build: drop implicit . from cmake config command
* build: remove another superfluous .
* build: update MinGW cmake commands
* Update README-sycl.md
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
* build: reinstate --config Release as not the default w/ some generators + document how to build Debug
* build: revert more --config Release
* build: nit / remove -H from cmake example
* build: reword debug instructions around single/multi config split
---------
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-04-29 17:02:45 +01:00
Georgi Gerganov
d2c898f746
ci : tmp disable gguf-split ( #6983 )
ggml-ci
2024-04-29 18:36:39 +03:00
Georgi Gerganov
544f1f10ad
ggml : fix __MSC_VER -> _MSC_VER ( #6977 )
ggml-ci
2024-04-29 17:55:02 +03:00
cpumaxx
ffe666572f
llava-cli : multiple images ( #6969 )
Co-authored-by: root <root@nenya.lothlorien.ca>
2024-04-29 17:34:24 +03:00
Georgi Gerganov
24affa7db3
readme : update hot topics
2024-04-29 17:06:19 +03:00
Georgi Gerganov
f4ab2a4147
llama : fix BPE pre-tokenization ( #6920 )
* merged the changes from deepseeker models to main branch
* Moved regex patterns to unicode.cpp and updated unicode.h
* Moved header files
* Resolved issues
* added and refactored unicode_regex_split and related functions
* Updated/merged the deepseek coder pr
* Refactored code
* Adding unicode regex mappings
* Adding unicode regex function
* Added needed functionality, testing remains
* Fixed issues
* Fixed issue with gpt2 regex custom preprocessor
* unicode : fix? unicode_wstring_to_utf8
* lint : fix whitespaces
* tests : add tokenizer tests for numbers
* unicode : remove redundant headers
* tests : remove and rename tokenizer test scripts
* tests : add sample usage
* gguf-py : reader prints warnings on duplicate keys
* llama : towards llama3 tokenization support (wip)
* unicode : shot in the dark to fix tests on Windows
* unicode : first try custom implementations
* convert : add "tokenizer.ggml.pre" GGUF KV (wip)
* llama : use new pre-tokenizer type
* convert : fix pre-tokenizer type writing
* lint : fix
* make : add test-tokenizer-0-llama-v3
* wip
* models : add llama v3 vocab file
* llama : adapt punctuation regex + add llama 3 regex
* minor
* unicode : set bomb
* unicode : set bomb
* unicode : always use std::wregex
* unicode : support \p{N}, \p{L} and \p{P} natively
* unicode : try fix windows
* unicode : category support via std::regex
* unicode : clean-up
* unicode : simplify
* convert : add convert-hf-to-gguf-update.py
ggml-ci
* lint : update
* convert : add falcon
ggml-ci
* unicode : normalize signatures
* lint : fix
* lint : fix
* convert : remove unused functions
* convert : add comments
* convert : exercise contractions
ggml-ci
* lint : fix
* cmake : refactor test targets
* tests : refactor vocab tests
ggml-ci
* tests : add more vocabs and tests
ggml-ci
* unicode : cleanup
* scripts : ignore new update script in check-requirements.sh
* models : add phi-3, mpt, gpt-2, starcoder
* tests : disable obsolete
ggml-ci
* tests : use faster bpe test
ggml-ci
* llama : more prominent warning for old BPE models
* tests : disable test-tokenizer-1-bpe due to slowness
ggml-ci
---------
Co-authored-by: Jaggzh <jaggz.h@gmail.com>
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>
2024-04-29 16:58:41 +03:00
David Renshaw
3f167476b1
sampling : use std::random_device{}() for default random seed ( #6962 )
2024-04-29 16:35:45 +03:00
Christian Zhou-Zheng
3055a41805
convert : fix conversion of some BERT embedding models ( #6937 )
2024-04-29 16:34:41 +03:00
Przemysław Pawełczyk
577277ffd2
make : change GNU make default CXX from g++ to c++ ( #6966 )
2024-04-29 16:08:20 +03:00
Przemysław Pawełczyk
ca7f29f568
ci : add building in MSYS2 environments (Windows) ( #6967 )
2024-04-29 15:59:47 +03:00
Johannes Gäßler
c4f708a93f
llama : fix typo LAMMAFILE -> LLAMAFILE ( #6974 )
2024-04-29 15:36:22 +03:00
DAN™
e00b4a8f81
Fix more int overflow during quant (PPL/CUDA). ( #6563 )
* Fix more int overflow during quant.
* Fix some more int overflow in softmax.
* Revert back to int64_t.
2024-04-29 00:38:44 +02:00
Xuan Son Nguyen
7bb36ccf91
gguf : enforce that tensor names are unique ( #6905 )
* not allow adding duplicated tensor name
* no duplicated tensor while reading gguf
* typo
* throw exception inside llama_model_loader
Co-authored-by: slaren <slarengh@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-04-28 17:36:18 +02:00
Neo Zhang
ce023f6f2f
add device version in device list ( #6959 )
Co-authored-by: arthw <>
2024-04-28 22:40:31 +08:00
github-actions[bot]
6e472f58e4
flake.lock: Update
Flake lock file updates:
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/5c24cf2f0a12ad855f444c30b2421d044120c66f?narHash=sha256-XtTSSIB2DA6tOv%2Bl0FhvfDMiyCmhoRbNB%2B0SeInZkbk%3D' (2024-04-19)
→ 'github:NixOS/nixpkgs/7bb2ccd8cdc44c91edba16c48d2c8f331fb3d856?narHash=sha256-Drmja/f5MRHZCskS6mvzFqxEaZMeciScCTFxWVLqWEY%3D' (2024-04-25)
2024-04-28 11:12:50 +00:00
ochafik
b4a00cec0f
Merge branch 'gguf-read' into agent-example
2024-04-27 23:17:27 +01:00
ochafik
8d503ef482
grammars: faster llama_grammar_copy
2024-04-27 23:17:00 +01:00
ochafik
00c709eb4a
grammars: cache decoded tokens
2024-04-27 23:17:00 +01:00
ochafik
09c256594d
grammars: early exit when no next_candidates to reject
2024-04-27 23:15:45 +01:00
Olivier Chafik
0120f7cc95
agent: fix wait --std-tools
2024-04-27 23:15:45 +01:00
Olivier Chafik
89dcc062a4
agent: mypy type fixes
mypy examples/agent/__main__.py
mypy examples/agent/fastify.py
mypy examples/openai/__main__.py
2024-04-27 23:15:45 +01:00
Olivier Chafik
ea0c31b10b
agent: ensure DATA_DIR exists
skip-checks:true
2024-04-27 23:15:45 +01:00
ochafik
a98f48315c
agent: python tool: return errors
2024-04-27 23:15:45 +01:00
ochafik
f9afb041e2
agent: python tool: test serializability of variables
2024-04-27 23:15:45 +01:00
ochafik
082d54db14
agent: rename fake weather tools
2024-04-27 23:15:45 +01:00
ochafik
6c00378630
agent: nits
2024-04-27 23:15:45 +01:00
ochafik
1475b1eefa
agent: fix killing of subprocesses
subprocesses again
2024-04-27 23:15:45 +01:00
ochafik
24e34f174b
agent: nit
2024-04-27 23:15:45 +01:00
ochafik
a61ebebaa0
agent: hint at math import in python tool
2024-04-27 23:15:45 +01:00
ochafik
9fe269e24a
openai: nit
2024-04-27 23:15:45 +01:00
ochafik
a634e03aba
agent: cache_prompt=True
2024-04-27 23:15:45 +01:00
Olivier Chafik
0532680f40
agent: nits
2024-04-27 23:15:45 +01:00
Olivier Chafik
6880f1d4c0
agent: support basic openapi tools (incl. from fastify sandbox)
2024-04-27 23:14:11 +01:00
Olivier Chafik
85820f4401
agent: fix sandbox dockerfile
2024-04-27 23:14:11 +01:00
ochafik
b447a743fb
agent: revert to json schemas (ts not ready for refs)
2024-04-27 23:14:11 +01:00
ochafik
701a66d80f
agent: fix response_format
2024-04-27 23:14:11 +01:00
ochafik
6e52a9ce48
Update test_chat_handlers.md
2024-04-27 23:14:11 +01:00
ochafik
22fe86d8b8
openai tools: TS signatures work well too at a fraction of the eval cost
2024-04-27 23:14:11 +01:00
ochafik
19811a4011
openai: tests didn't catch output format
2024-04-27 23:14:11 +01:00
ochafik
09de4eb9ed
openai: actually use thoughtful examples in tests
2024-04-27 23:14:11 +01:00
ochafik
da2067a0d6
openai: only special-format assistant in thoughtful mode
2024-04-27 23:14:11 +01:00
ochafik
d9f30f86c8
Update test_chat_handlers.md
2024-04-27 23:14:11 +01:00
ochafik
6935503b53
openai: refactor chat handler vs. template
2024-04-27 23:14:11 +01:00
ochafik
3c3eff52aa
openai: quiet + update prompt output
2024-04-27 23:14:11 +01:00
ochafik
ad2f4c119a
Update test_chat_handlers.py
2024-04-27 23:14:11 +01:00
ochafik
d8a53eadf2
openai: test features of templates at runtime, to make sure no bits of intel are lost
2024-04-27 23:14:11 +01:00