llama.cpp

Author	SHA1	Message	Date
Georgi Gerganov	952d03dbea	convert : use utf8 encoding (#7000) * convert : use utf8 encoding * convert : update instructions and warning message	2024-04-30 11:05:25 +03:00
Olivier Chafik	8843a98c2b	Improve usability of --model-url & related flags (#6930) * args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf) * args: main & server now call gpt_params_handle_model_default * args: define DEFAULT_MODEL_PATH + update cli docs * curl: check url of previous download (.json metadata w/ url, etag & lastModified) * args: fix update to quantize-stats.cpp * curl: support legacy .etag / .lastModified companion files * curl: rm legacy .etag file support * curl: reuse regex across headers callback calls * curl: unique_ptr to manage lifecycle of curl & outfile * curl: nit: no need for multiline regex flag * curl: update failed test (model file collision) + gitignore *.gguf.json	2024-04-30 00:52:50 +01:00
Clint Herron	b8c1476e44	Extending grammar integration tests (#6644) * Cleaning up integration tests to share code between tests and make it simpler to add new tests. * Add tests around quantifiers to ensure both matching and non-matching compliance. * Add slightly more complex grammar with quantifiers to test references with quantifiers. * Fixing build when C++17 is not present. * Separating test calls to give more helpful stack traces on failure. Adding verbose messages to give visibility for what is being tested. * Adding quotes around strings to explicitly show whitespace * Removing trailing whitespace. * Implementing suggestions from @ochafik -- grammars and test strings now print and flush before tests to aid in debugging segfaults and whatnot. * Cleaning up forgotten symbols. Modifying simple test to use test harness. Added comments for more verbose descriptions of what each test is accomplishing. * Unicode symbol modifications to hopefully make log easier to parse visually.	2024-04-29 14:40:14 -04:00
Daniel Bevenius	5539e6fdd1	main : fix typo in comment in main.cpp (#6985) Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2024-04-29 13:56:59 -04:00
Olivier Chafik	b8a7a5a90f	build(cmake): simplify instructions (`cmake -B build && cmake --build build ...`) (#6964) * readme: cmake . -B build && cmake --build build * build: fix typo Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> * build: drop implicit . from cmake config command * build: remove another superfluous . * build: update MinGW cmake commands * Update README-sycl.md Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com> * build: reinstate --config Release as not the default w/ some generators + document how to build Debug * build: revert more --config Release * build: nit / remove -H from cmake example * build: reword debug instructions around single/multi config split --------- Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>	2024-04-29 17:02:45 +01:00
Georgi Gerganov	d2c898f746	ci : tmp disable gguf-split (#6983) ggml-ci	2024-04-29 18:36:39 +03:00
Georgi Gerganov	544f1f10ad	ggml : fix __MSC_VER -> _MSC_VER (#6977) ggml-ci	2024-04-29 17:55:02 +03:00
cpumaxx	ffe666572f	llava-cli : multiple images (#6969) Co-authored-by: root <root@nenya.lothlorien.ca>	2024-04-29 17:34:24 +03:00
Georgi Gerganov	24affa7db3	readme : update hot topics	2024-04-29 17:06:19 +03:00
Georgi Gerganov	f4ab2a4147	llama : fix BPE pre-tokenization (#6920) * merged the changes from deepseeker models to main branch * Moved regex patterns to unicode.cpp and updated unicode.h * Moved header files * Resolved issues * added and refactored unicode_regex_split and related functions * Updated/merged the deepseek coder pr * Refactored code * Adding unicode regex mappings * Adding unicode regex function * Added needed functionality, testing remains * Fixed issues * Fixed issue with gpt2 regex custom preprocessor * unicode : fix? unicode_wstring_to_utf8 * lint : fix whitespaces * tests : add tokenizer tests for numbers * unicode : remove redundant headers * tests : remove and rename tokenizer test scripts * tests : add sample usage * gguf-py : reader prints warnings on duplicate keys * llama : towards llama3 tokenization support (wip) * unicode : shot in the dark to fix tests on Windows * unicode : first try custom implementations * convert : add "tokenizer.ggml.pre" GGUF KV (wip) * llama : use new pre-tokenizer type * convert : fix pre-tokenizer type writing * lint : fix * make : add test-tokenizer-0-llama-v3 * wip * models : add llama v3 vocab file * llama : adapt punctuation regex + add llama 3 regex * minor * unicode : set bomb * unicode : set bomb * unicode : always use std::wregex * unicode : support \p{N}, \p{L} and \p{P} natively * unicode : try fix windows * unicode : category support via std::regex * unicode : clean-up * unicode : simplify * convert : add convert-hf-to-gguf-update.py ggml-ci * lint : update * convert : add falcon ggml-ci * unicode : normalize signatures * lint : fix * lint : fix * convert : remove unused functions * convert : add comments * convert : exercise contractions ggml-ci * lint : fix * cmake : refactor test targets * tests : refactor vocab tests ggml-ci * tests : add more vocabs and tests ggml-ci * unicode : cleanup * scripts : ignore new update script in check-requirements.sh * models : add phi-3, mpt, gpt-2, starcoder * tests : disable obsolete ggml-ci * tests : use faster bpe test ggml-ci * llama : more prominent warning for old BPE models * tests : disable test-tokenizer-1-bpe due to slowness ggml-ci --------- Co-authored-by: Jaggzh <jaggz.h@gmail.com> Co-authored-by: Kazim Abrar Mahi <kazimabrarmahi135@gmail.com>	2024-04-29 16:58:41 +03:00
David Renshaw	3f167476b1	sampling : use std::random_device{}() for default random seed (#6962)	2024-04-29 16:35:45 +03:00
Christian Zhou-Zheng	3055a41805	convert : fix conversion of some BERT embedding models (#6937)	2024-04-29 16:34:41 +03:00
Przemysław Pawełczyk	577277ffd2	make : change GNU make default CXX from g++ to c++ (#6966)	2024-04-29 16:08:20 +03:00
Przemysław Pawełczyk	ca7f29f568	ci : add building in MSYS2 environments (Windows) (#6967)	2024-04-29 15:59:47 +03:00
Johannes Gäßler	c4f708a93f	llama : fix typo LAMMAFILE -> LLAMAFILE (#6974)	2024-04-29 15:36:22 +03:00
DAN™	e00b4a8f81	Fix more int overflow during quant (PPL/CUDA). (#6563) * Fix more int overflow during quant. * Fix some more int overflow in softmax. * Revert back to int64_t.	2024-04-29 00:38:44 +02:00
Xuan Son Nguyen	7bb36ccf91	gguf : enforce that tensor names are unique (#6905) * not allow adding duplicated tensor name * no duplicated tensor while reading gguf * typo * throw exception inside llama_model_loader Co-authored-by: slaren <slarengh@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-04-28 17:36:18 +02:00
Neo Zhang	ce023f6f2f	add device version in device list (#6959) Co-authored-by: arthw <>	2024-04-28 22:40:31 +08:00
github-actions[bot]	6e472f58e4	flake.lock: Update Flake lock file updates: • Updated input 'nixpkgs': 'github:NixOS/nixpkgs/5c24cf2f0a12ad855f444c30b2421d044120c66f?narHash=sha256-XtTSSIB2DA6tOv%2Bl0FhvfDMiyCmhoRbNB%2B0SeInZkbk%3D' (2024-04-19) → 'github:NixOS/nixpkgs/7bb2ccd8cdc44c91edba16c48d2c8f331fb3d856?narHash=sha256-Drmja/f5MRHZCskS6mvzFqxEaZMeciScCTFxWVLqWEY%3D' (2024-04-25)	2024-04-28 11:12:50 +00:00
ochafik	b4a00cec0f	Merge branch 'gguf-read' into agent-example	2024-04-27 23:17:27 +01:00
ochafik	8d503ef482	grammars: faster llama_grammar_copy	2024-04-27 23:17:00 +01:00
ochafik	00c709eb4a	grammars: cache decoded tokens	2024-04-27 23:17:00 +01:00
ochafik	09c256594d	grammars: early exit when no next_candidates to reject	2024-04-27 23:15:45 +01:00
Olivier Chafik	0120f7cc95	agent: fix wait --std-tools	2024-04-27 23:15:45 +01:00
Olivier Chafik	89dcc062a4	agent: mypy type fixes mypy examples/agent/__main__.py mypy examples/agent/fastify.py mypy examples/openai/__main__.py	2024-04-27 23:15:45 +01:00
Olivier Chafik	ea0c31b10b	agent: ensure DATA_DIR exists skip-checks:true	2024-04-27 23:15:45 +01:00
ochafik	a98f48315c	agent: python tool: return errors	2024-04-27 23:15:45 +01:00
ochafik	f9afb041e2	agent: python tool: test serializability of variables	2024-04-27 23:15:45 +01:00
ochafik	082d54db14	agent: rename fake weather tools	2024-04-27 23:15:45 +01:00
ochafik	6c00378630	agent: nits	2024-04-27 23:15:45 +01:00
ochafik	1475b1eefa	agent: fix killing of subprocesses subprocesses again	2024-04-27 23:15:45 +01:00
ochafik	24e34f174b	agent: nit	2024-04-27 23:15:45 +01:00
ochafik	a61ebebaa0	agent: hint at math import in python tool	2024-04-27 23:15:45 +01:00
ochafik	9fe269e24a	openai: nit	2024-04-27 23:15:45 +01:00
ochafik	a634e03aba	agent: cache_prompt=True	2024-04-27 23:15:45 +01:00
Olivier Chafik	0532680f40	agent: nits	2024-04-27 23:15:45 +01:00
Olivier Chafik	6880f1d4c0	agent: support basic openapi tools (incl. from fastify sandbox)	2024-04-27 23:14:11 +01:00
Olivier Chafik	85820f4401	agent: fix sandbox dockerfile	2024-04-27 23:14:11 +01:00
ochafik	b447a743fb	agent: revert to json schemas (ts not ready for refs)	2024-04-27 23:14:11 +01:00
ochafik	701a66d80f	agent: fix response_format	2024-04-27 23:14:11 +01:00
ochafik	6e52a9ce48	Update test_chat_handlers.md	2024-04-27 23:14:11 +01:00
ochafik	22fe86d8b8	openai tools: TS signatures work well too at a fraction of the eval cost	2024-04-27 23:14:11 +01:00
ochafik	19811a4011	openai: tests didn't catch output format	2024-04-27 23:14:11 +01:00
ochafik	09de4eb9ed	openai: actually use thoughtful examples in tests	2024-04-27 23:14:11 +01:00
ochafik	da2067a0d6	openai: only special-format assistant in thoughtful mode	2024-04-27 23:14:11 +01:00
ochafik	d9f30f86c8	Update test_chat_handlers.md	2024-04-27 23:14:11 +01:00
ochafik	6935503b53	openai: refactor chat handler vs. template	2024-04-27 23:14:11 +01:00
ochafik	3c3eff52aa	openai: quiet + update prompt output	2024-04-27 23:14:11 +01:00
ochafik	ad2f4c119a	Update test_chat_handlers.py	2024-04-27 23:14:11 +01:00
ochafik	d8a53eadf2	openai: test features of templates at runtime, to make sure no bits of intel are lost	2024-04-27 23:14:11 +01:00

3029 commits