Commit graph

1534 commits

Author SHA1 Message Date
KerfuffleV2
4814b4bbcd Promote add_X_token to GGUF metadata for BOS and EOS 2023-11-10 14:12:55 -07:00
Jared Van Bortel
f22b2f2045 cleanup 2023-11-10 14:46:57 -05:00
KerfuffleV2
9ce51b69b0 gguf-py: SpecialVocab: Always try available sources for special token ids
gguf-py: SpecialVocab: Try to load merges from merges.txt if not in tokenizer.json

gguf-py: SpecialVocab: Add 'add_bos_token' type bools to GGUF metadata
2023-11-10 05:50:45 -07:00
KerfuffleV2
960f912a14 convert.py: We can't currently support Q8_0 on big endian. 2023-11-10 05:50:15 -07:00
KerfuffleV2
0b0e726b2d And include scripts/__init__.py, derp 2023-11-10 00:55:15 -07:00
KerfuffleV2
eff662d66e Set up gguf- scripts in pyproject.toml 2023-11-10 00:53:23 -07:00
Jared Van Bortel
a21e9e7126 fix python 3.8 compat 2023-11-09 21:23:42 -05:00
Jared Van Bortel
795dc0f048 constants : remove unneeded type annotations 2023-11-09 21:03:05 -05:00
Jared Van Bortel
5608cd8d89 cleanup 2023-11-09 20:59:59 -05:00
Kerfuffle
7d3580d5b1
Murder accidental tuple in gguf-py/scripts/gguf-dump.py
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-09 17:50:11 -07:00
KerfuffleV2
382f9751fd A few for gguf-dump.py cleanups 2023-11-09 17:08:44 -07:00
KerfuffleV2
bd241db879 Add JSON dumping support to gguf-dump.py
Which I kind of regret now
2023-11-09 16:56:27 -07:00
KerfuffleV2
a04f0487b0 Make GGUFReader endian detection less arbitrary 2023-11-09 16:55:58 -07:00
KerfuffleV2
52bdc7e946 Reorganize scripts 2023-11-09 14:52:44 -07:00
Jared Van Bortel
5738b2f3b6 gguf-py : bump minor version 2023-11-09 12:28:28 -05:00
Jared Van Bortel
233cb0741f cleanup 2023-11-09 12:11:41 -05:00
KerfuffleV2
bca0962575 Add convert-gguf-endian.py script 2023-11-09 08:35:35 -07:00
KerfuffleV2
cc58ad00b0 Merge branch 'master' into feat-gguf-py-read-refactor 2023-11-09 05:25:24 -07:00
Galunid
a75fa576ab
scripts: Generalize convert scripts (#3838)
* Replace convert-*-hf-to-gguf.py files with convert-hf-to-gguf.py
2023-11-09 11:09:29 +01:00
KerfuffleV2
0d0306e7df Include a gguf Python package version bump 2023-11-09 02:56:20 -07:00
KerfuffleV2
8e250fe527 Add more information to GGUFReader and examples comments 2023-11-09 02:52:42 -07:00
KerfuffleV2
2360aaadb4 Make examples executable, formatting changes 2023-11-09 00:25:20 -07:00
Kerfuffle
855486c912
Update gguf-py/gguf/gguf_reader.py type hint
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-09 00:22:00 -07:00
Kerfuffle
2af29ffeaa
Update gguf-py/examples/modify_gguf.py formatting
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-09 00:21:36 -07:00
Kerfuffle
4a5cd6924f
Clean up gguf-py/examples/modify_gguf.py whitespace
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-09 00:21:15 -07:00
Mihai
57ad015dc3
server : add min_p param (#3877)
* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841

* Use spaces instead of tabs

* Update index.html.hpp after running deps.sh

* Fix test - fix line ending
2023-11-08 20:00:34 -06:00
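For context on the commit above: min-p sampling (introduced in the PR referenced by the commit message) keeps only tokens whose probability is at least `min_p` times the probability of the most likely token. A minimal Python sketch of the idea — names and structure are illustrative, not the actual `server.cpp` implementation:

```python
def min_p_filter(probs, min_p=0.05):
    """Keep tokens whose probability is >= min_p * the top probability.

    probs: list of (token, probability) pairs.
    Illustrative sketch only, not the llama.cpp sampler code.
    """
    threshold = min_p * max(p for _, p in probs)
    return [(tok, p) for tok, p in probs if p >= threshold]
```

Unlike a fixed top-p cutoff, the threshold scales with the model's confidence: a sharply peaked distribution prunes more aggressively than a flat one.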
KerfuffleV2
b56ed66195 Damagage is not a word. 2023-11-08 09:11:22 -07:00
KerfuffleV2
fffdac32b5 Fix an issue with state init in GGUFReader
Move examples to an examples/ directory

Clean up examples

Add an example of modifying keys in a GGUF file

Update documentation with info on examples

Try to support people importing gguf/gguf.py directly
2023-11-08 09:03:47 -07:00
slaren
875fb42871
ggml-alloc : fix backend assignments of views (#3982) 2023-11-08 13:15:14 +01:00
Jared Van Bortel
f2292fcc19 fix NamedTuple and Enum usage 2023-11-07 21:12:26 -05:00
Jared Van Bortel
f364636b2e style cleanup with flake8 2023-11-07 21:06:41 -05:00
KerfuffleV2
ce865b3ce3 Fix missing return statement in add_tensor 2023-11-07 18:43:23 -07:00
Jared Van Bortel
a6f5742a53 sort imports with isort (again) 2023-11-07 20:28:53 -05:00
KerfuffleV2
d7688dc937 Various type annotation fixes. 2023-11-07 17:30:11 -07:00
KerfuffleV2
8047aa192f Replay changes from #3871
Credit to @cebtenzzre for that pull
2023-11-07 15:01:36 -07:00
KerfuffleV2
b8c80df741 gguf-py: Refactor and add file reading support 2023-11-07 14:41:58 -07:00
Jared Van Bortel
0a7c980b6f
gguf : track writer state, free unneeded tensors, cleanup (#3871) 2023-11-07 12:43:04 -05:00
Georgi Gerganov
413503d4b9
make : do not add linker flags when compiling static llava lib (#3977) 2023-11-07 20:25:32 +03:00
xaedes
e9c1cecb9d
ggml : fix backward rope after YaRN (#3974)
* fix backward process of rope

rope backward process was broken after YaRN RoPE (#2268) implementation, due to missing changes in backward functions.

the code for the backward process is nearly identically to the forward process:
the only difference is the sign of the sin-values.

to avoid future regressions remove the near-duplicate backward functions and reuse the forward code:

for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`.
the sin-values will be negated when forward is false.

* fix finetune rope call to use correct default attn_factor of 1.0f

* remove unused `ggml_rope_xpos_back`

it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants.

* fix comments explaining the sinus sign in ggml_forward_rope

* add missing function arguments in declaration

* fix function argument type in declaration
2023-11-07 10:04:51 +02:00
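The reuse-the-forward-code idea in the commit above can be sketched abstractly: RoPE applies a 2D rotation by an angle θ to each pair of values, and the backward pass is the rotation by −θ, so negating the sin term is the only change needed. A hedged Python illustration of that single shared code path (not the actual `ggml_compute_forward_rope_f32` code):

```python
import math

def rope_rotate(x0, x1, theta, forward=True):
    """Rotate the pair (x0, x1) by theta.

    The backward pass reuses this exact code with the sin term negated,
    mirroring the commit's `bool forward` argument. Illustrative sketch only.
    """
    sin_t = math.sin(theta) if forward else -math.sin(theta)
    cos_t = math.cos(theta)
    return (x0 * cos_t - x1 * sin_t, x0 * sin_t + x1 * cos_t)
```

Because rotation by −θ undoes rotation by θ, a forward call followed by a backward call with the same angle recovers the original pair, which is what keeps the two passes from drifting apart after changes like YaRN.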
Matthew Tejo
54b4df8886
Use params when loading models in llava-cli (#3976)
llava-cli was loading models with default params and ignoring settings
from the cli. This switches to a generic function to load the params
from the cli options.
2023-11-07 10:43:59 +03:00
Meng Zhang
46876d2a2c
cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)
* prototyping the idea that supports running on CPU for a GGML_USE_CUBLAS=on build

* doc: add comments to ggml_cublas_loaded()

* fix defined(...)
2023-11-07 08:49:08 +02:00
Damian Stewart
381efbf480
llava : expose as a shared library for downstream projects (#3613)
* wip llava python bindings compatibility

* add external llava API

* add base64 in-prompt image support

* wip refactor image loading

* refactor image load out of llava init

* cleanup

* further cleanup; move llava-cli into its own file and rename

* move base64.hpp into common/

* collapse clip and llava libraries

* move llava into its own subdir

* wip

* fix bug where base64 string was not removed from the prompt

* get libllava to output in the right place

* expose llava methods in libllama.dylib

* cleanup memory usage around clip_image_*

* cleanup and refactor *again*

* update headerdoc

* build with cmake, not tested (WIP)

* Editorconfig

* Editorconfig

* Build with make

* Build with make

* Fix cyclical depts on Windows

* attempt to fix build on Windows

* attempt to fix build on Windows

* Upd TODOs

* attempt to fix build on Windows+CUDA

* Revert changes in cmake

* Fix according to review comments

* Support building as a shared library

* address review comments

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2023-11-07 00:36:23 +03:00
slaren
2833a6f63c
ggml-cuda : fix f16 mul mat (#3961)
* ggml-cuda : fix f16 mul mat

ggml-ci

* silence common.cpp warning (bonus)
2023-11-05 18:45:16 +01:00
Kerfuffle
d9ccce2e33
Allow common process_escapes to handle \x sequences (#3928)
* Allow common process_escapes to handle \x sequences

* Fix edge case when second hex digit is NUL
2023-11-05 10:06:06 -07:00
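The edge case named in the second bullet — a `\x` escape whose second hex digit is the string terminator — comes down to not reading past end-of-string when only one hex digit is present. A small parser sketch in Python showing the safe bound check (hypothetical, not the `common.cpp` code):

```python
def process_escapes(s):
    r"""Expand \x hex escapes, accepting one or two hex digits.

    The inner loop bounds-checks j before indexing, so "\x4" at the
    very end of the string is handled without reading past the end.
    Illustrative sketch only.
    """
    out, i = [], 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] == "x":
            j, hex_digits = i + 2, ""
            # consume at most two hex digits, stopping safely at end of string
            while j < len(s) and len(hex_digits) < 2 and s[j] in "0123456789abcdefABCDEF":
                hex_digits += s[j]
                j += 1
            if hex_digits:
                out.append(chr(int(hex_digits, 16)))
                i = j
                continue
        out.append(s[i])
        i += 1
    return "".join(out)
```

A buggy version that unconditionally reads two characters after `\x` would index past the terminator on input like `"a\x4"`; the length check above is the fix the commit describes.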
Thái Hoàng Tâm
bb60fd0bf6
server : fix typo for --alias shortcut from -m to -a (#3958) 2023-11-05 18:15:27 +02:00
Jared Van Bortel
132d25b8a6
cuda : fix disabling device with --tensor-split 1,0 (#3951)
Co-authored-by: slaren <slarengh@gmail.com>
2023-11-05 10:08:57 -05:00
Meng Zhang
3d48f42efc
llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)
as done in https://github.com/ggerganov/llama.cpp/pull/3827
2023-11-05 14:40:08 +02:00
Eve
c41ea36eaa
cmake : MSVC instruction detection (fixed up #809) (#3923)
* Add detection code for avx

* Only check hardware when option is ON

* Modify per code review suggestions

* Build locally will detect CPU

* Fixes CMake style to use lowercase like everywhere else

* cleanup

* fix merge

* linux/gcc version for testing

* msvc combines avx2 and fma into /arch:AVX2 so check for both

* cleanup

* msvc only version

* style

* Update FindSIMD.cmake

---------

Co-authored-by: Howard Su <howard0su@gmail.com>
Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>
2023-11-05 10:03:09 +02:00
Eve
a7fac013cf
ci : use intel sde when ci cpu doesn't support avx512 (#3949) 2023-11-05 09:46:44 +02:00
slaren
48ade94538
cuda : revert CUDA pool stuff (#3944)
* Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)"

This reverts commit 629f917cd6.

* Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)"

This reverts commit d6069051de.

ggml-ci
2023-11-05 09:12:13 +02:00