Commit graph

2804 commits

Author SHA1 Message Date
brian khuu
154ad1236e convert-hf-to-gguf-update.py: use triple quoted f-string instead 2024-05-02 01:47:41 +10:00
brian khuu
6d42f3d773 revert changes to convert-hf-to-gguf.py for get_name() 2024-05-02 01:35:33 +10:00
brian khuu
fcc5a5e0fe *.py: fix flake8 warnings 2024-04-30 19:05:32 +10:00
brian khuu
5e5e74e3b8 convert-hf-to-gguf.py: print() --> logger 2024-04-30 19:00:11 +10:00
brian khuu
2d2bc99385 convert-hf-to-gguf.py: add additional logging 2024-04-30 19:00:11 +10:00
brian khuu
58d5a5d2d5 constants.py: logger no longer required 2024-04-30 19:00:11 +10:00
brian khuu
ad53853a39 python-lint.yml: use .flake8 file instead 2024-04-30 19:00:11 +10:00
brian khuu
fe1d7f605d gguf-py/gguf/*.py: use __name__ as logger name
Since they will be imported and not run directly.
2024-04-30 19:00:11 +10:00
brian khuu
b0b51e7874 *.py: refactor logging.basicConfig() 2024-04-30 19:00:11 +10:00
brian khuu
1b7c80072b verify-checksum-models.py: use print() for printing table 2024-04-30 19:00:11 +10:00
brian khuu
aefd7492a3 convert-hf-to-gguf.py: print --> logger.debug or ValueError() 2024-04-30 19:00:11 +10:00
brian khuu
3a55ae4d72 gguf-dump.py: dump_metadata() should print to stdout 2024-04-30 19:00:11 +10:00
brian khuu
1b1c2ed80b convert.py: warning goes to stderr and won't hurt the dump output 2024-04-30 19:00:11 +10:00
brian khuu
62da83a4b8 reader.py: read_gguf_file() use print() over logging 2024-04-30 19:00:11 +10:00
brian khuu
510dea0d12 compare-llama-bench.py: add blank line for readability during missing repo response 2024-04-30 19:00:11 +10:00
brian khuu
e0372a1b5a verify-checksum-model.py: This is the result of the program, it should be printed to stdout. 2024-04-30 19:00:11 +10:00
brian khuu
ea449058b6 gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar 2024-04-30 19:00:11 +10:00
brian khuu
dc798d23d7 *.py: Convert logger error and sys.exit() into a raise exception (for atypical error) 2024-04-30 19:00:11 +10:00
brian khuu
cf38b4b831 constant.py: logger.error then exit should be a raise exception instead 2024-04-30 19:00:11 +10:00
brian khuu
dc2bff4059 fixup! *.py: logging basiconfig refactor to use conditional expression 2024-04-30 19:00:11 +10:00
brian khuu
c2e5abd33d *.py: removed commented out logging 2024-04-30 19:00:11 +10:00
brian khuu
1cc38d81af *.py: logging basiconfig refactor to use conditional expression 2024-04-30 19:00:11 +10:00
brian khuu
44b058d131 convert-hf-to-gguf.py: print() to logger conversion 2024-04-30 19:00:11 +10:00
brian khuu
dd8b9774eb pre-commit: add flake8-no-print to flake8 and also update pre-commit version 2024-04-30 19:00:11 +10:00
brian khuu
8d855b177c gh-actions: add flake8-no-print to flake8 lint step 2024-04-30 19:00:11 +10:00
brian khuu
c220e353f3 flake8: update flake8 ignore and exclude to match ci settings 2024-04-30 19:00:10 +10:00
brian khuu
9ad587a5ee requirements.txt: remove extra line 2024-04-30 19:00:10 +10:00
brian khuu
f00454fbd4 *.py: Convert all python scripts to use logging module 2024-04-30 19:00:10 +10:00
brian khuu
3670e16e9c convert.py: sys.stderr.write --> logger.error 2024-04-30 19:00:10 +10:00
brian khuu
e6b9d9179b convert.py: convert extra print() to named logger 2024-04-30 19:00:10 +10:00
brian khuu
8008082c2a convert.py: use explicit logger id string 2024-04-30 19:00:10 +10:00
brian khuu
e8be0c8f73 convert.py: named instance logging 2024-04-30 19:00:10 +10:00
brian khuu
88c1e2ff10 convert.py: verbose flag takes priority over dump flag log suppression 2024-04-30 19:00:10 +10:00
brian khuu
573dcecda1 convert.py: add python logging instead of print() 2024-04-30 19:00:10 +10:00
Georgi Gerganov
952d03dbea
convert : use utf8 encoding (#7000)
* convert : use utf8 encoding

* convert : update instructions and warning message
2024-04-30 11:05:25 +03:00
Olivier Chafik
8843a98c2b
Improve usability of --model-url & related flags (#6930)
* args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf)

* args: main & server now call gpt_params_handle_model_default

* args: define DEFAULT_MODEL_PATH + update cli docs

* curl: check url of previous download (.json metadata w/ url, etag & lastModified)

* args: fix update to quantize-stats.cpp

* curl: support legacy .etag / .lastModified companion files

* curl: rm legacy .etag file support

* curl: reuse regex across headers callback calls

* curl: unique_ptr to manage lifecycle of curl & outfile

* curl: nit: no need for multiline regex flag

* curl: update failed test (model file collision) + gitignore *.gguf.json
2024-04-30 00:52:50 +01:00
Clint Herron
b8c1476e44
Extending grammar integration tests (#6644)
* Cleaning up integration tests to share code between tests and make it simpler to add new tests.

* Add tests around quantifiers to ensure both matching and non-matching compliance.

* Add slightly more complex grammar with quantifiers to test references with quantifiers.

* Fixing build when C++17 is not present.

* Separating test calls to give more helpful stack traces on failure. Adding verbose messages to give visibility for what is being tested.

* Adding quotes around strings to explicitly show whitespace

* Removing trailing whitespace.

* Implementing suggestions from @ochafik -- grammars and test strings now print and flush before tests to aid in debugging segfaults and whatnot.

* Cleaning up forgotten symbols. Modifying simple test to use test harness. Added comments for more verbose descriptions of what each test is accomplishing.

* Unicode symbol modifications to hopefully make log easier to parse visually.
2024-04-29 14:40:14 -04:00
Daniel Bevenius
5539e6fdd1
main : fix typo in comment in main.cpp (#6985)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2024-04-29 13:56:59 -04:00
Olivier Chafik
b8a7a5a90f
build(cmake): simplify instructions (cmake -B build && cmake --build build ...) (#6964)
* readme: cmake . -B build && cmake --build build

* build: fix typo

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

* build: drop implicit . from cmake config command

* build: remove another superfluous .

* build: update MinGW cmake commands

* Update README-sycl.md

Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>

* build: reinstate --config Release as not the default w/ some generators + document how to build Debug

* build: revert more --config Release

* build: nit / remove -H from cmake example

* build: reword debug instructions around single/multi config split

---------

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2024-04-29 17:02:45 +01:00
Georgi Gerganov
d2c898f746
ci : tmp disable gguf-split (#6983)
ggml-ci
2024-04-29 18:36:39 +03:00
Georgi Gerganov
544f1f10ad
ggml : fix __MSC_VER -> _MSC_VER (#6977)
ggml-ci
2024-04-29 17:55:02 +03:00
cpumaxx
ffe666572f
llava-cli : multiple images (#6969)
Co-authored-by: root <root@nenya.lothlorien.ca>
2024-04-29 17:34:24 +03:00
Georgi Gerganov
24affa7db3
readme : update hot topics 2024-04-29 17:06:19 +03:00
Georgi Gerganov
f4ab2a4147
llama : fix BPE pre-tokenization (#6920)
* merged the changes from deepseeker models to main branch

* Moved regex patterns to unicode.cpp and updated unicode.h

* Moved header files

* Resolved issues

* added and refactored unicode_regex_split and related functions

* Updated/merged the deepseek coder pr

* Refactored code

* Adding unicode regex mappings

* Adding unicode regex function

* Added needed functionality, testing remains

* Fixed issues

* Fixed issue with gpt2 regex custom preprocessor

* unicode : fix? unicode_wstring_to_utf8

* lint : fix whitespaces

* tests : add tokenizer tests for numbers

* unicode : remove redundant headers

* tests : remove and rename tokenizer test scripts

* tests : add sample usage

* gguf-py : reader prints warnings on duplicate keys

* llama : towards llama3 tokenization support (wip)

* unicode : shot in the dark to fix tests on Windows

* unicode : first try custom implementations

* convert : add "tokenizer.ggml.pre" GGUF KV (wip)

* llama : use new pre-tokenizer type

* convert : fix pre-tokenizer type writing

* lint : fix

* make : add test-tokenizer-0-llama-v3

* wip

* models : add llama v3 vocab file

* llama : adapt punctuation regex + add llama 3 regex

* minor

* unicode : set bomb

* unicode : set bomb

* unicode : always use std::wregex

* unicode : support \p{N}, \p{L} and \p{P} natively

* unicode : try fix windows

* unicode : category support via std::regex

* unicode : clean-up

* unicode : simplify

* convert : add convert-hf-to-gguf-update.py

ggml-ci

* lint : update

* convert : add falcon

ggml-ci

* unicode : normalize signatures

* lint : fix

* lint : fix

* convert : remove unused functions

* convert : add comments

* convert : exercise contractions

ggml-ci

* lint : fix

* cmake : refactor test targets

* tests : refactor vocab tests

ggml-ci

* tests : add more vocabs and tests

ggml-ci

* unicode : cleanup

* scripts : ignore new update script in check-requirements.sh

* models : add phi-3, mpt, gpt-2, starcoder

* tests : disable obsolete

ggml-ci

* tests : use faster bpe test

ggml-ci

* llama : more prominent warning for old BPE models

* tests : disable test-tokenizer-1-bpe due to slowness

ggml-ci

---------

Co-authored-by: Jaggzh <jaggz.h@gmail.com>
Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>
2024-04-29 16:58:41 +03:00
David Renshaw
3f167476b1
sampling : use std::random_device{}() for default random seed (#6962) 2024-04-29 16:35:45 +03:00
Christian Zhou-Zheng
3055a41805
convert : fix conversion of some BERT embedding models (#6937) 2024-04-29 16:34:41 +03:00
Przemysław Pawełczyk
577277ffd2
make : change GNU make default CXX from g++ to c++ (#6966) 2024-04-29 16:08:20 +03:00
Przemysław Pawełczyk
ca7f29f568
ci : add building in MSYS2 environments (Windows) (#6967) 2024-04-29 15:59:47 +03:00
Johannes Gäßler
c4f708a93f
llama : fix typo LAMMAFILE -> LLAMAFILE (#6974) 2024-04-29 15:36:22 +03:00
DAN™
e00b4a8f81
Fix more int overflow during quant (PPL/CUDA). (#6563)
* Fix more int overflow during quant.

* Fix some more int overflow in softmax.

* Revert back to int64_t.
2024-04-29 00:38:44 +02:00