Commit graph

2689 commits

Author SHA1 Message Date
Concedo
a6e6b8b96b Merge branch 'master' into concedo_experimental 2023-11-10 22:27:11 +08:00
Concedo
36e860e94d updated docs 2023-11-10 22:25:11 +08:00
Galunid
df9d1293de
Unbreak persimmon after #3837 (#4010) 2023-11-10 14:24:54 +01:00
Concedo
4a130ee11c added support for filecomments 2023-11-10 14:12:06 +08:00
Concedo
be92cfa125 added preloadstory 2023-11-10 13:05:22 +08:00
Concedo
6870c31933 updated docs 2023-11-09 21:33:19 +08:00
Galunid
a75fa576ab
scripts: Generalize convert scripts (#3838)
* Replace convert-*-hf-to-gguf.py files with convert-hf-to-gguf.py
2023-11-09 11:09:29 +01:00
Concedo
c938a1011d updated lite 2023-11-09 17:21:27 +08:00
Concedo
7ef4ec3b16 added trim_stop flag 2023-11-09 16:55:44 +08:00
Concedo
afa466807d nooby layer selector considers context size 2023-11-09 14:05:35 +08:00
Concedo
93e99179be colab updated 2023-11-09 13:49:06 +08:00
Mihai
57ad015dc3
server : add min_p param (#3877)
* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841

* Use spaces instead of tabs

* Update index.html.hpp after running deps.sh

* Fix test - fix line ending
2023-11-08 20:00:34 -06:00
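
For context on the min_p parameter above: min-p sampling keeps only the tokens whose probability is at least min_p times the probability of the single most likely token, then renormalizes. A minimal sketch of the idea (a hypothetical helper, not the actual server.cpp code):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Sketch of min-p filtering: keep tokens whose probability is at least
// min_p * p_max, where p_max is the top token's probability, then
// renormalize the survivors. Candidates are (token_id, probability) pairs,
// assumed already normalized.
std::vector<std::pair<int, float>> min_p_filter(
        std::vector<std::pair<int, float>> cand, float min_p) {
    if (cand.empty() || min_p <= 0.0f) {
        return cand;
    }
    const float p_max = std::max_element(
        cand.begin(), cand.end(),
        [](const auto & a, const auto & b) { return a.second < b.second; })->second;
    const float cutoff = min_p * p_max;
    cand.erase(std::remove_if(cand.begin(), cand.end(),
                   [cutoff](const auto & c) { return c.second < cutoff; }),
               cand.end());
    float sum = 0.0f;
    for (const auto & c : cand) sum += c.second;
    for (auto & c : cand) c.second /= sum;  // renormalize survivors
    return cand;
}
```
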
Concedo
0b13ebed6a Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
2023-11-08 20:54:09 +08:00
slaren
875fb42871
ggml-alloc : fix backend assignments of views (#3982) 2023-11-08 13:15:14 +01:00
Jared Van Bortel
0a7c980b6f
gguf : track writer state, free unneeded tensors, cleanup (#3871) 2023-11-07 12:43:04 -05:00
Georgi Gerganov
413503d4b9
make : do not add linker flags when compiling static llava lib (#3977) 2023-11-07 20:25:32 +03:00
Concedo
fb3bcac368 handle memory separately for kcpp 2023-11-07 17:15:14 +08:00
xaedes
e9c1cecb9d
ggml : fix backward rope after YaRN (#3974)
* fix backward process of rope

rope backward process was broken after the YaRN RoPE (#2268) implementation, due to missing changes in the backward functions.

the code for the backward process is nearly identical to the forward process:
the only difference is the sign of the sin-values.

to avoid future regressions, remove the near-duplicate backward functions and reuse the forward code:

for this a new function argument `bool forward` was added to `ggml_compute_forward_rope_f32` and `ggml_compute_forward_rope_f16`.
the sin-values will be negated when forward is false.

* fix finetune rope call to use correct default attn_factor of 1.0f

* remove unused `ggml_rope_xpos_back`

it is better to have only one `ggml_rope_back` function that accepts all rope parameters, so that `ggml_compute_backward` can propagate all parameters without having to switch between different rope_back variants.

* fix comments explaining the sine sign in ggml_forward_rope

* add missing function arguments in declaration

* fix function argument type in declaration
2023-11-07 10:04:51 +02:00
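
The key observation in the commit above is that RoPE's backward pass is the forward rotation with the sine negated (i.e. the inverse rotation), which is what lets a single kernel serve both directions. A minimal sketch of that idea, not the actual ggml kernels:

```cpp
#include <cmath>

// Rotate one (x0, x1) pair by theta. With forward == false the sine is
// negated, which applies the inverse rotation, so the same code serves
// the backward pass. Applying forward then backward with the same theta
// recovers the original values.
static void rope_pair(float & x0, float & x1, float theta, bool forward) {
    const float sin_sign = forward ? 1.0f : -1.0f;
    const float c = cosf(theta);
    const float s = sin_sign * sinf(theta);
    const float y0 = x0 * c - x1 * s;
    const float y1 = x0 * s + x1 * c;
    x0 = y0;
    x1 = y1;
}
```
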
Matthew Tejo
54b4df8886
Use params when loading models in llava-cli (#3976)
llava-cli was loading models with default params and ignoring settings
from the CLI. This switches to a generic function to load the params
from the CLI options.
2023-11-07 10:43:59 +03:00
Concedo
f277ed0e8c Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
2023-11-07 15:23:08 +08:00
Meng Zhang
46876d2a2c
cuda : supports running on CPU for GGML_USE_CUBLAS=ON build (#3946)
* prototyping the idea of running on CPU for a GGML_USE_CUBLAS=on build

* doc: add comments to ggml_cublas_loaded()

* fix defined(...)
2023-11-07 08:49:08 +02:00
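
The pattern above replaces a compile-time assumption with a runtime check: a cuBLAS-enabled binary only takes the GPU path if CUDA actually initialized. A rough sketch of the call-site pattern (ggml_cublas_loaded() is the function this PR documents; the wrapper below is illustrative, not code from the PR):

```cpp
#ifdef GGML_USE_CUBLAS
#include "ggml-cuda.h"  // declares ggml_cublas_loaded()
#endif

// Illustrative wrapper: even when the binary is built with cuBLAS support,
// only use the GPU path if a CUDA device was actually found at runtime.
static bool use_gpu_path() {
#ifdef GGML_USE_CUBLAS
    return ggml_cublas_loaded();
#else
    return false;  // CPU-only build
#endif
}
```
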
Damian Stewart
381efbf480
llava : expose as a shared library for downstream projects (#3613)
* wip llava python bindings compatibility

* add external llava API

* add base64 in-prompt image support

* wip refactor image loading

* refactor image load out of llava init

* cleanup

* further cleanup; move llava-cli into its own file and rename

* move base64.hpp into common/

* collapse clip and llava libraries

* move llava into its own subdir

* wip

* fix bug where base64 string was not removed from the prompt

* get libllava to output in the right place

* expose llava methods in libllama.dylib

* cleanup memory usage around clip_image_*

* cleanup and refactor *again*

* update headerdoc

* build with cmake, not tested (WIP)

* Editorconfig

* Editorconfig

* Build with make

* Build with make

* Fix cyclical deps on Windows

* attempt to fix build on Windows

* attempt to fix build on Windows

* Upd TODOs

* attempt to fix build on Windows+CUDA

* Revert changes in cmake

* Fix according to review comments

* Support building as a shared library

* address review comments

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2023-11-07 00:36:23 +03:00
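
For downstream projects, usage of the exported library looks roughly like the sketch below (assuming llava.h exposes the llava_image_embed_make_with_filename, llava_eval_image_embed, and llava_image_embed_free entry points added by this PR; llama_context setup and error handling omitted):

```cpp
#include "clip.h"   // clip_model_load / clip_free
#include "llava.h"  // the new shared-library API

// Rough downstream usage sketch: encode an image with the CLIP model,
// then feed the resulting embedding to the LLM at the current position.
static void eval_image(llama_context * ctx_llama, const char * clip_path,
                       const char * image_path, int n_threads, int n_batch) {
    clip_ctx * ctx_clip = clip_model_load(clip_path, /*verbosity=*/1);

    // build the image embedding with the CLIP encoder
    llava_image_embed * embed =
        llava_image_embed_make_with_filename(ctx_clip, n_threads, image_path);

    // feed the image embedding to the language model, advancing n_past
    int n_past = 0;
    llava_eval_image_embed(ctx_llama, embed, n_batch, &n_past);

    // ... evaluate the text prompt from n_past onward and sample tokens ...

    llava_image_embed_free(embed);
    clip_free(ctx_clip);
}
```
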
Concedo
feb60bc447 tokenizer tweaks (+2 squashed commits)
Squashed commit:

[18c70621] tokenizer tweaks

[8002f897] handle if localstorage is inaccessible
2023-11-06 23:51:26 +08:00
Concedo
372cfef2c3 Merge branch 'concedo' into concedo_experimental 2023-11-06 20:16:07 +08:00
Concedo
2102942121 testing LLAMA_PORTABLE flag for building 2023-11-06 20:15:15 +08:00
Concedo
78ca0667a4 Merge branch 'master' into concedo_experimental 2023-11-06 16:58:58 +08:00
Concedo
93c4b2a9c6 add force rebuild 2023-11-06 14:33:42 +08:00
Concedo
2f16eccb89 special colab build 2023-11-06 01:46:58 +08:00
slaren
2833a6f63c
ggml-cuda : fix f16 mul mat (#3961)
* ggml-cuda : fix f16 mul mat

ggml-ci

* silence common.cpp warning (bonus)
2023-11-05 18:45:16 +01:00
Kerfuffle
d9ccce2e33
Allow common process_escapes to handle \x sequences (#3928)
* Allow common process_escapes to handle \x sequences

* Fix edge case when second hex digit is NUL
2023-11-05 10:06:06 -07:00
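
A minimal sketch of the \x handling described above (illustrative, not the exact common.cpp code); the isxdigit() checks are what guard the edge case where the character after the first hex digit is a terminating NUL:

```cpp
#include <cctype>
#include <string>

// Sketch of \x escape handling: read up to two hex digits after "\x".
// Checking isxdigit() before consuming each digit means a NUL (or any
// non-hex character) after the first digit simply ends the escape early
// instead of being misread as part of it.
static std::string process_hex_escapes(const std::string & input) {
    std::string out;
    for (size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '\\' && i + 2 < input.size() && input[i + 1] == 'x'
                && isxdigit((unsigned char) input[i + 2])) {
            int value = 0;
            size_t j = i + 2;
            // consume at most two hex digits
            for (int d = 0; d < 2 && j < input.size()
                    && isxdigit((unsigned char) input[j]); ++d, ++j) {
                const char c = (char) tolower((unsigned char) input[j]);
                value = value * 16 + (c <= '9' ? c - '0' : c - 'a' + 10);
            }
            out += (char) value;
            i = j - 1;  // the loop increment moves past the escape
        } else {
            out += input[i];
        }
    }
    return out;
}
```
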
Thái Hoàng Tâm
bb60fd0bf6
server : fix typo for --alias shortcut from -m to -a (#3958) 2023-11-05 18:15:27 +02:00
Jared Van Bortel
132d25b8a6
cuda : fix disabling device with --tensor-split 1,0 (#3951)
Co-authored-by: slaren <slarengh@gmail.com>
2023-11-05 10:08:57 -05:00
Concedo
2b32b170a1 clang 15 check for macOS 2023-11-05 22:57:05 +08:00
Concedo
ea81eae189 cleanup, up ver (+1 squashed commit)
Squashed commits:

[1ea303d6] cleanup, up ver (+1 squashed commit)

Squashed commits:

[79f09b22] cleanup
2023-11-05 22:49:23 +08:00
YellowRoseCx
e2e5fe56a8
KCPP Fetches AMD ROCm Memory without a stick, CC_TURING Gets the Boot, koboldcpp_hipblas.dll Talks To The Hand, and hipBLAS Compiler Finds Its Independence! (#517)
* AMD ROCm memory fetching and max mem setting

* Update .gitignore with koboldcpp_hipblas.dll

* Update CMakeLists.txt remove CC_TURING for AMD

* separate hipBLAS compiler, update MMV_Y, move CXX/CC print

separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name
2023-11-05 22:23:18 +08:00
Concedo
a62468ec4c Merge branch 'master' into concedo_experimental
should fix multigpu
2023-11-05 22:14:40 +08:00
Concedo
bdf16d7a3c aria2 needs to show more info 2023-11-05 22:13:22 +08:00
Meng Zhang
3d48f42efc
llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)
as done in https://github.com/ggerganov/llama.cpp/pull/3827
2023-11-05 14:40:08 +02:00
Eve
c41ea36eaa
cmake : MSVC instruction detection (fixed up #809) (#3923)
* Add detection code for avx

* Only check hardware when option is ON

* Modify per code review suggestions

* Building locally will detect the CPU

* Fixes CMake style to use lowercase like everywhere else

* cleanup

* fix merge

* linux/gcc version for testing

* msvc combines avx2 and fma into /arch:AVX2 so check for both

* cleanup

* msvc only version

* style

* Update FindSIMD.cmake

---------

Co-authored-by: Howard Su <howard0su@gmail.com>
Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>
2023-11-05 10:03:09 +02:00
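
The MSVC point in the list above is that /arch:AVX2 implicitly enables FMA code generation, so the detection must confirm both CPUID feature bits before turning it on. A runtime sketch of the same check on x86 (illustrative; the PR itself does this at configure time in FindSIMD.cmake):

```cpp
#include <cstdio>
#ifdef _MSC_VER
#include <intrin.h>
#endif

// Only report AVX2 support when both the AVX2 and FMA CPUID feature
// bits are set, since MSVC's /arch:AVX2 assumes FMA as well.
static bool cpu_supports_avx2_and_fma() {
#ifdef _MSC_VER
    int info[4];
    __cpuid(info, 1);
    const bool fma = (info[2] & (1 << 12)) != 0;  // leaf 1, ECX bit 12: FMA
    __cpuidex(info, 7, 0);
    const bool avx2 = (info[1] & (1 << 5)) != 0;  // leaf 7, EBX bit 5: AVX2
    return fma && avx2;
#else
    return __builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma");
#endif
}

int main() {
    printf("AVX2+FMA: %s\n", cpu_supports_avx2_and_fma() ? "yes" : "no");
}
```
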
Eve
a7fac013cf
ci : use intel sde when ci cpu doesn't support avx512 (#3949) 2023-11-05 09:46:44 +02:00
slaren
48ade94538
cuda : revert CUDA pool stuff (#3944)
* Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)"

This reverts commit 629f917cd6.

* Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)"

This reverts commit d6069051de.

ggml-ci
2023-11-05 09:12:13 +02:00
Concedo
351dcabd3e lite fix 2023-11-05 14:47:02 +08:00
Concedo
faae84ee1d removed c flag in wget 2023-11-05 10:21:28 +08:00
henk717
02595f9d21
Colabcpp improvements (#512)
* Aria2

* Aria2 Typo fix

* Streamlined Wget

* Streamlining Fix

* Back to .so downloading

* Crash colab if no GPU is present

* Created using Colaboratory

* Restore proper link

Colab overwrote the link; manually changing it back so people don't land on my branch.

* Restore file juggle

* Fixing the colab link... again
2023-11-05 10:19:09 +08:00
Concedo
5e5be717c3 fix for removing inaccessible backends in gui 2023-11-05 10:12:12 +08:00
Kerfuffle
f28af0d81a
gguf-py: Support 01.AI Yi models (#3943) 2023-11-04 16:20:34 -06:00
Concedo
1e7088a80b autopick cublas in gui if possible, better layer picking logic 2023-11-05 01:35:27 +08:00
Concedo
7a8c0df2e5 Merge branch 'master' into concedo_experimental 2023-11-04 09:18:28 +08:00
Concedo
135001abc4 try to make the tunnel more reliable 2023-11-04 09:18:19 +08:00
Concedo
38471fbe06 tensor core info better printout (+1 squashed commit)
Squashed commits:

[be4ef93f] tensor core info better printout
2023-11-04 08:38:25 +08:00