llama.cpp

Author	SHA1	Message	Date
Mason M	9bca872be0	code review changes	2024-07-01 11:43:58 -03:00
bandoti	4eab311ed0	Merge branch 'ggerganov:master' into vulkan-build-integration	2024-06-30 00:14:14 -03:00
Mason M	ac9a065d31	Remove Python dependency from Vulkan build	2024-06-30 00:02:41 -03:00
Xuan Son Nguyen	72272b83a3	fix code typo in llama-cli (#8198 )	2024-06-29 00:14:20 +02:00
Olivier Chafik	8748d8ac6f	json: attempt to skip slow tests when running under emulator (#8189 )	2024-06-28 18:02:05 +01:00
Xuan Son Nguyen	26a39bbd6b	Add MiniCPM, Deepseek V2 chat template + clean up `llama_chat_apply_template_internal` (#8172 ) * tmp_contains * minicpm chat template * add DeepSeek Lite template * change deepseek-lite to deepseek2 * correct code comment * correct code from master branch	2024-06-28 15:11:44 +02:00
Sigbjørn Skjæret	38373cfbab	Add SPM infill support (#8016 ) * add --spm-infill option * support --spm-infill * support --spm-infill	2024-06-28 12:53:43 +02:00
slaren	b851b3fba0	cmake : allow user to override default options (#8178 )	2024-06-28 12:37:45 +02:00
Olivier Chafik	139cc621e9	`json`: restore default additionalProperties to false, fix some pattern escapes (#8180 ) * json: expand ESCAPED_IN_REGEXPS_BUT_NOT_IN_LITERALS charset * json: revert default of additionalProperties to false * Update README.md	2024-06-28 09:26:45 +01:00
pculliton	e57dc62057	llama: Add support for Gemma2ForCausalLM (#8156 ) * Inference support for Gemma 2 model family * Update convert-hf-to-gguf.py, constants, and tensor mappings * cleanup * format fix * Fix special token vocab bug * Don't add space prefix * fix deleted lines * Update src/llama.cpp Co-authored-by: slaren <slarengh@gmail.com> * Add model type names * Add control vector * Fix model type identification --------- Co-authored-by: Andrei Betlen <abetlen@gmail.com> Co-authored-by: slaren <slarengh@gmail.com>	2024-06-27 21:00:43 -07:00
Xuan Son Nguyen	a27aa50ab7	Add missing items in makefile (#8177 )	2024-06-28 02:19:11 +02:00
Olivier Chafik	cb0b06a8a6	`json`: update grammars/README w/ examples & note about additionalProperties (#8132 ) * json: update grammars/README * mention broken prefixItems * add mention to llama-gbnf-validator * json: explicit type: object for nested items object in cli example	2024-06-27 22:08:42 +01:00
loonerin	558f44bf83	CI: fix release build (Ubuntu+Mac) (#8170 ) * CI: fix release build (Ubuntu) PR #8006 changes defaults to build shared libs. However, CI for releases expects static builds. * CI: fix release build (Mac) --------- Co-authored-by: loonerin <loonerin@users.noreply.github.com>	2024-06-27 21:01:23 +02:00
slaren	8172ee9da9	cmake : fix deprecated option names not working (#8171 ) * cmake : fix deprecated option names not working * remove LlAMA_OPENMP	2024-06-27 20:04:39 +02:00
Mason M	8590508d3d	Link against ggml in cmake pkg	2024-06-27 14:01:10 -03:00
Xuan Son Nguyen	16791b8f0b	Add chatml fallback for cpp `llama_chat_apply_template` (#8160 ) * add chatml fallback for cpp `llama_chat_apply_template` * remove redundant code	2024-06-27 18:14:19 +02:00
Georgi Gerganov	ab3679112d	flake.lock: Update (#8071 ) Flake lock file updates: • Updated input 'nixpkgs': 'github:NixOS/nixpkgs/e9ee548d90ff586a6471b4ae80ae9cfcbceb3420?narHash=sha256-4Zu0RYRcAY/VWuu6awwq4opuiD//ahpc2aFHg2CWqFY%3D' (2024-06-13) → 'github:NixOS/nixpkgs/d603719ec6e294f034936c0d0dc06f689d91b6c3?narHash=sha256-k3JqJrkdoYwE3fHE6xGDY676AYmyh4U2Zw%2B0Bwe5DLU%3D' (2024-06-20) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Philip Taron <philip.taron@gmail.com>	2024-06-27 08:37:29 -07:00
jukofyork	97877eb10b	Control vector loading fixes (#8137 ) * Fixed leak in llama_control_vector_load_one() and allow llama_control_vector_load() to grow * refactored `llama_control_vector_load_one()` * allow multiple directions for same layer in same file * llama_control_vector_load_one() and llama_control_vector_load() now break on error * removed unnecessary ggml_free() call	2024-06-27 16:48:07 +02:00
Raj Hammeer Singh Hada	387952651a	Delete examples/llama.android/llama/CMakeLists.txt (#8165 ) * Delete examples/llama.android/llama/CMakeLists.txt https://github.com/ggerganov/llama.cpp/pull/8145#issuecomment-2194534244 This file is not being used for building on Android. `llama.cpp/examples/llama.android/llama/src/main/cpp/CMakeLists.txt` is being used instead. * Update CMakeLists.txt Pick local llama.cpp files instead of fetching content from git	2024-06-27 16:39:29 +02:00
Sigbjørn Skjæret	6030c61281	Add Qwen2MoE 57B-A14B model identifier (#8158 ) * Add Qwen2MoE 57B-A14B * Add Qwen2MoE 57B-A14B	2024-06-27 16:27:41 +02:00
Johannes Gäßler	85a267daaa	CUDA: fix MMQ stream-k for --split-mode row (#8167 )	2024-06-27 16:26:05 +02:00
Mason M	eec17a6864	Add python3 to Vulkan nix build	2024-06-27 10:26:46 -03:00
Mason M	d0d825f102	Add shaderc to nix pkg	2024-06-27 09:55:48 -03:00
kustaaya	f675b20a3b	Added support for Viking pre-tokenizer (#8135 ) Co-authored-by: kustaaya <kustaaya@protonmail.com>	2024-06-27 10:58:54 +02:00
Sigbjørn Skjæret	911e35bb8b	llama : fix CodeLlama FIM token checks (#8144 ) * account for space prefix character * use find instead	2024-06-27 10:46:41 +03:00
Mason M	d053004046	Update vulkan obj file paths	2024-06-26 23:14:47 -03:00
Raj Hammeer Singh Hada	ac146628e4	Fix llama-android.cpp for error - "common/common.h not found" (#8145 ) - Path seems to be wrong for the common.h header file in llama-android.cpp file. Fixing the path so the Android Build doesn't fail with the error "There is no file common/common.h"	2024-06-27 03:57:57 +02:00
Daniel Bevenius	9b31a40c6d	clip : suppress unused variable warnings (#8105 ) * clip : suppress unused variable warnings This commit suppresses unused variable warnings for the variables e in the catch blocks. The motivation for this change is to suppress the warnings that are generated on Windows when using the MSVC compiler. The warnings are not displayed when using GCC because GCC will mark all catch parameters as used. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com> * squash! clip : suppress unused variable warnings Remove e (/e/) instead instead of using GGML_UNUSED. --------- Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2024-06-27 01:50:09 +02:00
Mason M	6571046097	Forward GGML_EXTRA_LIBS to CMake config pkg	2024-06-26 20:18:07 -03:00
Mason M	a1495e709c	Merge branch 'master' into vulkan-build-integration	2024-06-26 19:59:39 -03:00
Georgi Gerganov	c70d117c37	scripts : fix filename sync	2024-06-26 23:25:22 +03:00
slaren	ae5d0f4b89	ci : publish new docker images only when the files change (#8142 )	2024-06-26 21:59:28 +02:00
slaren	31ec3993f6	ggml : add GGML_CUDA_USE_GRAPHS option, restore GGML_CUDA_FORCE_CUBLAS (cmake) (#8140 )	2024-06-26 21:34:14 +02:00
slaren	c7ab7b612c	make : fix missing -O3 (#8143 )	2024-06-26 21:20:22 +03:00
Georgi Gerganov	f2d48fffde	sync : ggml	2024-06-26 19:39:19 +03:00
Georgi Gerganov	4713bf3093	authors : regen	2024-06-26 19:36:44 +03:00
Georgi Gerganov	0e814dfc42	devops : remove clblast + LLAMA_CUDA -> GGML_CUDA (#8139 ) ggml-ci	2024-06-26 19:32:07 +03:00
Georgi Gerganov	a95631ee97	readme : update API notes	2024-06-26 19:26:13 +03:00
Georgi Gerganov	f3f65429c4	llama : reorganize source code + improve CMake (#8006 ) * scripts : update sync [no ci] * files : relocate [no ci] * ci : disable kompute build [no ci] * cmake : fixes [no ci] * server : fix mingw build ggml-ci * cmake : minor [no ci] * cmake : link math library [no ci] * cmake : build normal ggml library (not object library) [no ci] * cmake : fix kompute build ggml-ci * make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE ggml-ci * move public backend headers to the public include directory (#8122) * move public backend headers to the public include directory * nix test * spm : fix metal header --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * scripts : fix sync paths [no ci] * scripts : sync ggml-blas.h [no ci] --------- Co-authored-by: slaren <slarengh@gmail.com>	2024-06-26 18:33:02 +03:00
Mason M	2318cadf0c	Move sudo to apt-key invocation	2024-06-26 08:58:42 -03:00
Mason M	c61cd05611	Clean up tabs	2024-06-26 08:49:16 -03:00
Isaac McFadyen	8854044561	Clarify default MMQ for CUDA and LLAMA_CUDA_FORCE_MMQ flag (#8115 ) * Add message about int8 support * Add suggestions from review Co-authored-by: Johannes Gäßler <johannesg@5d6.de> --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de>	2024-06-26 08:29:28 +02:00
Johannes Gäßler	c8771ab5f8	CUDA: fix misaligned shared memory read (#8123 )	2024-06-26 08:28:02 +02:00
Eddie-Wang	494165f3b6	llama : extend llm_build_ffn() to support _scale tensors (#8103 )	2024-06-26 09:27:46 +03:00
Mason M	885954646e	Add vulkan SDK dep to ubuntu-22-cmake-vulkan workflow	2024-06-25 23:01:28 -03:00
Mason M	99c3027298	Use pkg-config to locate vulkan library	2024-06-25 22:18:56 -03:00
Olivier Chafik	9b2f16f805	`json`: better support for "type" unions (e.g. nullable arrays w/ typed items) (#7863 ) * json: better suport for "type" arrays (e.g. `{"type": ["array", "null"], "items": {"type": "string"}}`) * json: add test for type: [array, null] fix * update tests	2024-06-26 01:46:35 +01:00
Olivier Chafik	6777c544bd	`json`: fix additionalProperties, allow space after enum/const (#7840 ) * json: default additionalProperty to true * json: don't force additional props after normal properties! * json: allow space after enum/const * json: update pydantic example to set additionalProperties: false * json: prevent additional props to redefine a typed prop * port not_strings to python, add trailing space * fix not_strings & port to js+py * Update json-schema-to-grammar.cpp * fix _not_strings for substring overlaps * json: fix additionalProperties default, uncomment tests * json: add integ. test case for additionalProperties * json: nit: simplify condition * reformat grammar integ tests w/ R"""()""" strings where there's escapes * update # tokens in server test: consts can now have trailing space	2024-06-26 01:45:58 +01:00
Mason M	491a967455	Add make target for Vulkan shaders	2024-06-25 19:41:32 -03:00
bandoti	dd198ceaaa	Merge branch 'ggerganov:master' into vulkan-build-integration	2024-06-25 19:40:15 -03:00

3288 commits