caitianchi
1ec79f04ab
modify convert script and readme
2024-08-12 20:17:56 +08:00
caitianchi
89d378c76b
fix type-check
2024-08-12 16:47:59 +08:00
caitianchi
a945b3ca8b
fix type-check
2024-08-12 15:30:44 +08:00
caitianchi
662d4c1402
fix type-check
2024-08-12 15:06:22 +08:00
caitianchi
32b47f600f
fix type-check
2024-08-10 21:51:04 +08:00
caitianchi
4a87d1d93e
modify readme
2024-08-10 19:21:38 +08:00
caitianchi
28d6a0f43d
modify clip
2024-08-10 19:21:27 +08:00
caitianchi
bffbe1cf44
add resampler of v2.6
2024-08-10 18:19:35 +08:00
caitianchi
fe39ecc1ee
add readme
2024-08-10 18:18:58 +08:00
caitianchi
6cad864cbd
modify convert
2024-08-10 18:18:38 +08:00
tc-mb
ce0d1a6f29
Merge pull request #24 from OpenBMB/master
...
sync master
2024-08-10 16:36:27 +08:00
tc-mb
fc1c860bb8
Merge branch 'prepare-PR-of-minicpm-v2.6' into master
2024-08-10 16:36:04 +08:00
caitianchi
ea0c8283c8
modify convert
2024-08-10 14:20:59 +08:00
Matteo Mortari
911b437f22
gguf-py : fix double call to add_architecture() ( #8952 )
...
Signed-off-by: tarilabs <matteo.mortari@gmail.com>
2024-08-10 08:58:49 +03:00
Georgi Gerganov
b72942fac9
Merge commit from fork
2024-08-09 23:03:21 +03:00
fairydreaming
6afd1a99dc
llama : add support for lora adapters in T5 model ( #8938 )
...
Co-authored-by: Stanisław Szymczyk <sszymczy@gmail.com>
2024-08-09 18:53:09 +02:00
Georgi Gerganov
272e3bd95e
make : fix llava obj file race ( #8946 )
...
ggml-ci
2024-08-09 18:24:30 +03:00
Georgi Gerganov
45a55b91aa
llama : better replace_all (cont) ( #8926 )
...
* llama : better replace_all (cont)
ggml-ci
* code : deduplicate replace_all
ggml-ci
2024-08-09 18:23:52 +03:00
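Note: the commit message above does not describe the implementation. As a rough sketch only, a single shared `replace_all` helper of the kind that such a deduplication could converge on might look like the following. The signature and the buffer-building strategy are assumptions for illustration, not the code from #8926.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not copied from the repository): replace every
// occurrence of `search` in `s` with `replace`.
// The result is built in a separate buffer, so text that was just inserted
// is never rescanned and each input character is copied once.
static void replace_all(std::string & s, const std::string & search, const std::string & replace) {
    if (search.empty()) {
        return; // an empty pattern would match everywhere; treat it as a no-op
    }
    std::string result;
    result.reserve(s.size());
    std::size_t last = 0;
    std::size_t pos;
    while ((pos = s.find(search, last)) != std::string::npos) {
        result.append(s, last, pos - last); // text before the match
        result += replace;                  // the replacement itself
        last = pos + search.size();         // continue after the match
    }
    result.append(s, last, std::string::npos); // remaining tail
    s = std::move(result);
}
```

Keeping one definition in a shared place means every caller gets the same behavior for edge cases such as an empty pattern or a replacement that contains the pattern.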
tc-mb
3071c0a5f2
llava : support MiniCPM-V-2.5 ( #7599 )
...
* init
* rename
* add run android for termux in readme
* add android readme
* add instructions in readme
* change name in readme
* Update README.md
* fixed line
* add result in readme
* random pos_embed
* add positions index
* change for ollama
* change for ollama
* better pos_embed in clip
* support ollama
* update cmakelist
* update cmakelist
* rename wrapper
* clear code
* replace and organize code
* add link
* sync master
* fix warnings
* fix warnings
* fix bug in bicubic resize when the image needs to be resized smaller
* receive review comments and modify
* receive review comments and modify
* put all code into llava dir
* fix quality problem in pr code
* change n_layer
* add space in "-1"
* imitate reshape bug of python code
* fix bug in clip
* fix issues for merging
* fix llama-minicpmv-cli in cmake file
* change pr readme
* fix code review
* remove the directory on line 33 of /cmakelists.txt (not in example, in the main dir)
* fix cmakefile
* add warn
* fix KEY_HAS_MINICPMV_PROJ
* remove load_image_size into clip_ctx
* remove the extern "C", MINICPMV_API
* fix uhd code for review comment
* delete minicpmv-wrapper in pr
* remove uhd_image_embed
* Modify 2 notes
* clip : style changes
* del common.h in clip
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix Type-Check error
* fix makefile error
* fix ubuntu-make error
* try fix clip
* try fix 1
---------
Co-authored-by: Hongji Zhu <fireyoucan@gmail.com>
Co-authored-by: harvestingmoon <leewenyeong@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-08-09 13:33:53 +03:00
Georgi Gerganov
4305b57c80
sync : ggml
2024-08-09 10:03:48 +03:00
Matt Stephenson
70c0ea3560
whisper : use vulkan as gpu backend when available (whisper/2302)
...
* ggml: use vulkan as gpu backend when available
Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
* whisper: enable using vk as default buffer type
Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
---------
Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>
2024-08-09 10:03:44 +03:00
Daniel Bevenius
5b2c04f492
embedding : add --pooling option to README.md [no ci] ( #8934 )
...
This commit adds the `--pooling` option to the README.md file in the
`examples/embedding` directory.
The motivation for adding this option is that currently if the model
used does not specify a pooling type the embedding example will fail
with the following error message:
```console
main: error: pooling type NONE not supported
```
This commit also updates the name of the executable in the examples
section.
2024-08-09 09:33:30 +03:00
Daniel Bevenius
6f6496bb09
llama : fix typo in llama_tensor_get_type comment [no ci] ( #8937 )
2024-08-09 09:32:23 +03:00
Mathieu Geli
daef3ab233
server : add one level list nesting for embeddings ( #8936 )
2024-08-09 09:32:02 +03:00
compilade
345a686d82
llama : reduce useless copies when saving session ( #8916 )
...
* llama : avoid useless copies in dummy session writer
* llama : avoid double tensor copy when saving session to buffer
2024-08-08 23:54:00 -04:00
compilade
3a14e00366
gguf-py : simplify support for quant types ( #8838 )
...
* gguf-py : use classes for quants
* convert_hf : simplify internal quantization type selection
* gguf-py : fix flake8 lint
* gguf-py : fix BF16 numpy view type
* gguf-py : remove LlamaFileTypeMap
Too specific to 'llama.cpp', and would be a maintenance burden
to keep up to date.
* gguf-py : add generic quantize and dequantize functions
The quant classes no longer need to be known,
only the target or the source type,
for 'quantize' and 'dequantize', respectively.
2024-08-08 13:33:09 -04:00
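Note: the last bullet above describes generic `quantize` and `dequantize` entry points where the caller names only the target or source type. The sketch below illustrates that dispatch idea in C++ under stated assumptions: the `QuantType` enum, the `Packed` container, and the simplified 8-bit block format with a float scale are all hypothetical and are not gguf-py's API or the actual GGML formats.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical type tag; a real quantization enum has many more members.
enum class QuantType { F32, Q8_BLOCK };

// Hypothetical container for the packed data.
struct Packed {
    std::vector<float>       scales; // one scale per block (Q8_BLOCK only)
    std::vector<std::int8_t> q;      // 8-bit values (Q8_BLOCK only)
    std::vector<float>       f32;    // raw values (F32 only)
};

constexpr std::size_t BLOCK = 32; // assumed block size for this sketch

// The caller names only the target type; per-type logic stays internal.
Packed quantize(const std::vector<float> & data, QuantType target) {
    Packed out;
    switch (target) {
        case QuantType::F32:
            out.f32 = data;
            return out;
        case QuantType::Q8_BLOCK:
            for (std::size_t i = 0; i < data.size(); i += BLOCK) {
                const std::size_t end = std::min(i + BLOCK, data.size());
                float amax = 0.0f; // largest magnitude in this block
                for (std::size_t j = i; j < end; ++j) {
                    amax = std::max(amax, std::fabs(data[j]));
                }
                const float scale = amax / 127.0f;
                out.scales.push_back(scale);
                for (std::size_t j = i; j < end; ++j) {
                    const long v = scale == 0.0f ? 0 : std::lround(data[j] / scale);
                    out.q.push_back(static_cast<std::int8_t>(v));
                }
            }
            return out;
    }
    throw std::invalid_argument("unknown quant type");
}

// The caller names only the source type.
std::vector<float> dequantize(const Packed & in, QuantType source) {
    switch (source) {
        case QuantType::F32:
            return in.f32;
        case QuantType::Q8_BLOCK: {
            std::vector<float> out(in.q.size());
            for (std::size_t i = 0; i < in.q.size(); ++i) {
                out[i] = static_cast<float>(in.q[i]) * in.scales[i / BLOCK];
            }
            return out;
        }
    }
    throw std::invalid_argument("unknown quant type");
}
```

With this shape, adding a new format means extending the enum and the two switch statements, while call sites keep passing a type tag and never touch the per-format code.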
Georgi Gerganov
afd27f01fe
scripts : sync cann files ( #0 )
2024-08-08 14:56:52 +03:00
Georgi Gerganov
366d486c16
scripts : fix sync filenames ( #0 )
2024-08-08 14:40:12 +03:00
Georgi Gerganov
e44a561ab0
sync : ggml
2024-08-08 13:19:47 +03:00
Borislav Stanimirov
f93d49ab1e
ggml : ignore more msvc warnings (ggml/906)
2024-08-08 13:19:31 +03:00
Georgi Gerganov
5b33ea1ee7
metal : fix struct name (ggml/912)
...
ggml-ci
2024-08-08 13:19:31 +03:00
Conrad Kramer
85fca8deb6
metal : add abort callback (ggml/905)
2024-08-08 13:19:30 +03:00
Pablo Duboue
ebd541a570
make : clean llamafile objects ( #8923 )
...
`ggml/src/llamafile/sgemm.o` was not deleted on `make clean`
2024-08-08 11:44:51 +03:00
slaren
15fa07a5c5
make : use C compiler to build metal embed object ( #8899 )
...
* make : use C compiler to build metal embed object
* use rm + rmdir to avoid -r flag in rm
2024-08-07 18:24:05 +02:00
slaren
be55695eff
ggml-backend : fix async copy from CPU ( #8897 )
...
* ggml-backend : fix async copy from CPU
* cuda : more reliable async copy, fix stream used when the devices are the same
2024-08-07 13:29:02 +02:00
Ouadie EL FAROUKI
0478174d59
[SYCL] Updated SYCL device filtering ( #8901 )
...
* Updated device filter to depend on default_selector (fixes non-intel device issues)
* Small related update to example/sycl Readme
2024-08-07 11:25:36 +01:00
Johannes Gäßler
a8dbc6f753
CUDA/HIP: fix tests/test-backend-ops ( #8896 )
2024-08-07 09:07:52 +02:00
Zhenwei Jin
506122d854
llama-bench : add support for getting cpu info on Windows ( #8824 )
...
* Add support for getting cpu info on Windows for llama_bench
* refactor
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-08-07 03:01:06 +02:00
Daniel Bevenius
725e3d9437
quantize : update usage comment in quantize.cpp ( #8889 )
...
This commit updates the usage comment in quantize.cpp to reflect the
new name of the executable, which is llama-quantize.
2024-08-07 01:43:00 +02:00
Nexes the Old
31958546c3
typo correction ( #8891 )
2024-08-07 01:41:54 +02:00
Xuan Son Nguyen
1e6f6554aa
server : add lora hotswap endpoint (WIP) ( #8857 )
...
* server : add lora hotswap endpoint
* handle lora_no_apply
* fix build
* update docs
* clean up struct def
* fix build
* add LoRA test
* fix style
2024-08-06 17:33:39 +02:00
Johannes Gäßler
641f5dd2a6
CUDA: fix padding logic for FP16/FP32 ( #8884 )
2024-08-06 17:13:55 +02:00
Daniel Bevenius
5f4dcb1e60
simple : update name of executable to llama-simple ( #8885 )
...
This commit updates the name of the executable in README.md from
`simple` to `llama-simple`.
2024-08-06 16:44:35 +02:00
Jaeden Amero
db20f50cf4
cmake : Link vulkan-shaders-gen with pthreads ( #8835 )
...
When using CMake to build with Vulkan support, compiling
vulkan-shaders-gen fails due to missing a CMakeLists.txt specification
to link vulkan-shaders-gen with the threading library, resulting in the
following error.
[5/172] Linking CXX executable bin/vulkan-shaders-gen
FAILED: bin/vulkan-shaders-gen
: && /usr/bin/c++ ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o -o bin/vulkan-shaders-gen && :
ld: error: undefined symbol: pthread_create
>>> referenced by vulkan-shaders-gen.cpp
>>> ggml/src/vulkan-shaders/CMakeFiles/vulkan-shaders-gen.dir/vulkan-shaders-gen.cpp.o:(std::__1::__libcpp_thread_create[abi:se180100](pthread**,
>>> void* (*)(void*), void*))
c++: error: linker command failed with exit code 1 (use -v to see invocation)
[6/172] Generating build details from Git
-- Found Git: /usr/local/bin/git (found version "2.45.2")
ninja: build stopped: subcommand failed.
Add the CMakeLists.txt specification to link vulkan-shaders-gen with the
threading library and fix the above error.
Fixes #8834
2024-08-06 15:21:47 +02:00
MaggotHATE
efda90c93a
[Vulkan] Fix compilation of vulkan-shaders-gen on w64devkit after e31a4f6 ( #8880 )
...
* Fix compilation issue in `vulkan-shaders-gen`
e31a4f6797 broke compilation on w64devkit. Including `algorithm` seems to fix that.
* Guard it under `#ifdef _WIN32`
2024-08-06 13:32:03 +02:00
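Note: the two bullets above describe adding the `algorithm` header and guarding it under `#ifdef _WIN32`. A minimal sketch of that pattern follows. The `to_native_path` helper and its use of `std::replace` are assumptions made for illustration; the commit message does not say which call in `vulkan-shaders-gen` needed the header.

```cpp
#include <string>

#ifdef _WIN32
#include <algorithm> // std::replace, used only by the Windows branch below
#endif

// Hypothetical helper, not taken from vulkan-shaders-gen: convert forward
// slashes to backslashes on Windows and leave the path untouched elsewhere.
static std::string to_native_path(std::string path) {
#ifdef _WIN32
    std::replace(path.begin(), path.end(), '/', '\\');
#endif
    return path;
}
```

Guarding the include keeps non-Windows builds unchanged, but it is only safe while every use of the header's functions sits behind the same guard, as it does in this sketch.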
Georgi Gerganov
0bf16de07b
contributing : add note about write access
2024-08-06 11:48:01 +03:00
Molly Sophia
2d5dd7bb3f
ggml : add epsilon as a parameter for group_norm ( #8818 )
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-06 10:26:46 +03:00
Douglas Hanley
cdd1889de6
convert : add support for XLMRoberta embedding models ( #8658 )
...
* add conversion for bge-m3; small fix in unigram tokenizer
* clean up and simplify XLMRoberta conversion
2024-08-06 10:20:54 +03:00
Mengqing Cao
c21a896405
[CANN]: Fix ggml_backend_cann_buffer_get_tensor ( #8871 )
...
* cann: fix ggml_backend_cann_buffer_get_tensor
1. fix data ptr offset
2. enable the acquisition of incomplete tensors
* fix backend cann set_tensor
2024-08-06 12:42:42 +08:00
Neo Zhang
d4ff847153
[SYCL] correct cmd name ( #8877 )
2024-08-06 09:09:12 +08:00