llama.cpp

Author	SHA1	Message	Date
brian khuu	154ad1236e	convert-hf-to-gguf-update.py: use triple quoted f-string instead	2024-05-02 01:47:41 +10:00
brian khuu	6d42f3d773	revert changes to convert-hf-to-gguf.py for get_name()	2024-05-02 01:35:33 +10:00
brian khuu	fcc5a5e0fe	*.py: fix flake8 warnings	2024-04-30 19:05:32 +10:00
brian khuu	5e5e74e3b8	convert-hf-to-gguf.py: print() --> logger	2024-04-30 19:00:11 +10:00
brian khuu	2d2bc99385	convert-hf-to-gguf.py: add additional logging	2024-04-30 19:00:11 +10:00
brian khuu	58d5a5d2d5	constants.py: logger no longer required	2024-04-30 19:00:11 +10:00
brian khuu	ad53853a39	python-lint.yml: use .flake8 file instead	2024-04-30 19:00:11 +10:00
brian khuu	fe1d7f605d	gguf-py/gguf/*.py: use __name__ as logger name Since they will be imported and not run directly.	2024-04-30 19:00:11 +10:00
brian khuu	b0b51e7874	*.py: refactor logging.basicConfig()	2024-04-30 19:00:11 +10:00
brian khuu	1b7c80072b	verify-checksum-models.py: use print() for printing table	2024-04-30 19:00:11 +10:00
brian khuu	aefd7492a3	convert-hf-to-gguf.py: print --> logger.debug or ValueError()	2024-04-30 19:00:11 +10:00
brian khuu	3a55ae4d72	gguf-dump.py: dump_metadata() should print to stdout	2024-04-30 19:00:11 +10:00
brian khuu	1b1c2ed80b	convert.py: warning goes to stderr and won't hurt the dump output	2024-04-30 19:00:11 +10:00
brian khuu	62da83a4b8	reader.py: read_gguf_file() use print() over logging	2024-04-30 19:00:11 +10:00
brian khuu	510dea0d12	compare-llama-bench.py: add blank line for readability during missing repo response	2024-04-30 19:00:11 +10:00
brian khuu	e0372a1b5a	verify-checksum-model.py: This is the result of the program, it should be printed to stdout.	2024-04-30 19:00:11 +10:00
brian khuu	ea449058b6	gguf-convert-endian.py: refactor convert_byteorder() to use tqdm progressbar	2024-04-30 19:00:11 +10:00
brian khuu	dc798d23d7	*.py: Convert logger error and sys.exit() into a raise exception (for atypical error)	2024-04-30 19:00:11 +10:00
brian khuu	cf38b4b831	constant.py: logger.error then exit should be a raise exception instead	2024-04-30 19:00:11 +10:00
brian khuu	dc2bff4059	fixup! *.py: logging basiconfig refactor to use conditional expression	2024-04-30 19:00:11 +10:00
brian khuu	c2e5abd33d	*.py: removed commented out logging	2024-04-30 19:00:11 +10:00
brian khuu	1cc38d81af	*.py: logging basiconfig refactor to use conditional expression	2024-04-30 19:00:11 +10:00
brian khuu	44b058d131	convert-hf-to-gguf.py: print() to logger conversion	2024-04-30 19:00:11 +10:00
brian khuu	dd8b9774eb	pre-commit: add flake8-no-print to flake8 and also update pre-commit version	2024-04-30 19:00:11 +10:00
brian khuu	8d855b177c	gh-actions: add flake8-no-print to flake8 lint step	2024-04-30 19:00:11 +10:00
brian khuu	c220e353f3	flake8: update flake8 ignore and exclude to match ci settings	2024-04-30 19:00:10 +10:00
brian khuu	9ad587a5ee	requirements.txt: remove extra line	2024-04-30 19:00:10 +10:00
brian khuu	f00454fbd4	*.py: Convert all python scripts to use logging module	2024-04-30 19:00:10 +10:00
brian khuu	3670e16e9c	convert.py: sys.stderr.write --> logger.error	2024-04-30 19:00:10 +10:00
brian khuu	e6b9d9179b	convert.py: convert extra print() to named logger	2024-04-30 19:00:10 +10:00
brian khuu	8008082c2a	convert.py: use explicit logger id string	2024-04-30 19:00:10 +10:00
brian khuu	e8be0c8f73	convert.py: named instance logging	2024-04-30 19:00:10 +10:00
brian khuu	88c1e2ff10	convert.py: verbose flag takes priority over dump flag log suppression	2024-04-30 19:00:10 +10:00
brian khuu	573dcecda1	convert.py: add python logging instead of print()	2024-04-30 19:00:10 +10:00
Georgi Gerganov	952d03dbea	convert : use utf8 encoding (#7000 ) * convert : use utf8 encoding * convert : update instructions and warning message	2024-04-30 11:05:25 +03:00
Olivier Chafik	8843a98c2b	Improve usability of --model-url & related flags (#6930 ) * args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf) * args: main & server now call gpt_params_handle_model_default * args: define DEFAULT_MODEL_PATH + update cli docs * curl: check url of previous download (.json metadata w/ url, etag & lastModified) * args: fix update to quantize-stats.cpp * curl: support legacy .etag / .lastModified companion files * curl: rm legacy .etag file support * curl: reuse regex across headers callback calls * curl: unique_ptr to manage lifecycle of curl & outfile * curl: nit: no need for multiline regex flag * curl: update failed test (model file collision) + gitignore *.gguf.json	2024-04-30 00:52:50 +01:00
Clint Herron	b8c1476e44	Extending grammar integration tests (#6644 ) * Cleaning up integration tests to share code between tests and make it simpler to add new tests. * Add tests around quantifiers to ensure both matching and non-matching compliance. * Add slightly more complex grammar with quantifiers to test references with quantifiers. * Fixing build when C++17 is not present. * Separating test calls to give more helpful stack traces on failure. Adding verbose messages to give visibility for what is being tested. * Adding quotes around strings to explicitly show whitespace * Removing trailing whitespace. * Implementing suggestions from @ochafik -- grammars and test strings now print and flush before tests to aid in debugging segfaults and whatnot. * Cleaning up forgotten symbols. Modifying simple test to use test harness. Added comments for more verbose descriptions of what each test is accomplishing. * Unicode symbol modifications to hopefully make log easier to parse visually.	2024-04-29 14:40:14 -04:00
Daniel Bevenius	5539e6fdd1	main : fix typo in comment in main.cpp (#6985 ) Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2024-04-29 13:56:59 -04:00
Olivier Chafik	b8a7a5a90f	build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964 ) * readme: cmake . -B build && cmake --build build * build: fix typo Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> * build: drop implicit . from cmake config command * build: remove another superfluous . * build: update MinGW cmake commands * Update README-sycl.md Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com> * build: reinstate --config Release as not the default w/ some generators + document how to build Debug * build: revert more --config Release * build: nit / remove -H from cmake example * build: reword debug instructions around single/multi config split --------- Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>	2024-04-29 17:02:45 +01:00
Georgi Gerganov	d2c898f746	ci : tmp disable gguf-split (#6983 ) ggml-ci	2024-04-29 18:36:39 +03:00
Georgi Gerganov	544f1f10ad	ggml : fix __MSC_VER -> _MSC_VER (#6977 ) ggml-ci	2024-04-29 17:55:02 +03:00
cpumaxx	ffe666572f	llava-cli : multiple images (#6969 ) Co-authored-by: root <root@nenya.lothlorien.ca>	2024-04-29 17:34:24 +03:00
Georgi Gerganov	24affa7db3	readme : update hot topics	2024-04-29 17:06:19 +03:00
Georgi Gerganov	f4ab2a4147	llama : fix BPE pre-tokenization (#6920 ) * merged the changes from deepseeker models to main branch * Moved regex patterns to unicode.cpp and updated unicode.h * Moved header files * Resolved issues * added and refactored unicode_regex_split and related functions * Updated/merged the deepseek coder pr * Refactored code * Adding unicode regex mappings * Adding unicode regex function * Added needed functionality, testing remains * Fixed issues * Fixed issue with gpt2 regex custom preprocessor * unicode : fix? unicode_wstring_to_utf8 * lint : fix whitespaces * tests : add tokenizer tests for numbers * unicode : remove redundant headers * tests : remove and rename tokenizer test scripts * tests : add sample usage * gguf-py : reader prints warnings on duplicate keys * llama : towards llama3 tokenization support (wip) * unicode : shot in the dark to fix tests on Windows * unicode : first try custom implementations * convert : add "tokenizer.ggml.pre" GGUF KV (wip) * llama : use new pre-tokenizer type * convert : fix pre-tokenizer type writing * lint : fix * make : add test-tokenizer-0-llama-v3 * wip * models : add llama v3 vocab file * llama : adapt punctuation regex + add llama 3 regex * minor * unicode : set bomb * unicode : set bomb * unicode : always use std::wregex * unicode : support \p{N}, \p{L} and \p{P} natively * unicode : try fix windows * unicode : category support via std::regex * unicode : clean-up * unicode : simplify * convert : add convert-hf-to-gguf-update.py ggml-ci * lint : update * convert : add falcon ggml-ci * unicode : normalize signatures * lint : fix * lint : fix * convert : remove unused functions * convert : add comments * convert : exercise contractions ggml-ci * lint : fix * cmake : refactor test targets * tests : refactor vocab tests ggml-ci * tests : add more vocabs and tests ggml-ci * unicode : cleanup * scripts : ignore new update script in check-requirements.sh * models : add phi-3, mpt, gpt-2, starcoder * tests : disable obsolete ggml-ci * tests : use faster bpe test ggml-ci * llama : more prominent warning for old BPE models * tests : disable test-tokenizer-1-bpe due to slowness ggml-ci --------- Co-authored-by: Jaggzh <jaggz.h@gmail.com> Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>	2024-04-29 16:58:41 +03:00
David Renshaw	3f167476b1	sampling : use std::random_device{}() for default random seed (#6962 )	2024-04-29 16:35:45 +03:00
Christian Zhou-Zheng	3055a41805	convert : fix conversion of some BERT embedding models (#6937 )	2024-04-29 16:34:41 +03:00
Przemysław Pawełczyk	577277ffd2	make : change GNU make default CXX from g++ to c++ (#6966 )	2024-04-29 16:08:20 +03:00
Przemysław Pawełczyk	ca7f29f568	ci : add building in MSYS2 environments (Windows) (#6967 )	2024-04-29 15:59:47 +03:00
Johannes Gäßler	c4f708a93f	llama : fix typo LAMMAFILE -> LLAMAFILE (#6974 )	2024-04-29 15:36:22 +03:00
DAN™	e00b4a8f81	Fix more int overflow during quant (PPL/CUDA). (#6563 ) * Fix more int overflow during quant. * Fix some more int overflow in softmax. * Revert back to int64_t.	2024-04-29 00:38:44 +02:00

2804 commits