Commit graph

3674 commits

Author SHA1 Message Date
Zack Li
d5df53658f
Merge pull request #14 from NexaAI/teliu/android/dev
Add submodule llava for android sample
2024-11-08 13:25:00 -08:00
Zack Li
8c417282d5
Merge pull request #15 from NexaAI/weili/master-release
support all omni-vlm models in one omni-vlm/ folder.
2024-11-08 13:23:46 -08:00
李为
eb6d54679e update README.md 2024-11-08 22:05:57 +08:00
李为
3d9c63a3ff remove omni-vlm-v2/ 2024-11-08 21:00:42 +08:00
李为
16c22471e8 remove redundant omni-vlm-v2/ folder, all omni-vlm examples will be added to omni-vlm/ folder. 2024-11-08 20:59:23 +08:00
liute110
b17684efb3 add include llava.h 2024-11-08 16:07:50 +08:00
liute110
400fc2a4b0 add one more model 2024-11-08 16:06:37 +08:00
liute110
86c2233a38 add submodule llava for android 2024-11-08 16:02:45 +08:00
Zack Li
df5841b6b8
Merge pull request #13 from NexaAI/weili/master-release
add omni-vlm-v2 implementations( C++ & python)
2024-11-07 00:48:21 -08:00
李为
3dfac7817f add returned string type (const char*) for nexa-omni-audio 2024-11-07 16:13:53 +08:00
Zack Li
20b9f02cee
Merge pull request #12 from NexaAI/weili/master-release
add returned string type (const char*) for nexa-omni-audio
2024-11-06 19:28:46 -08:00
李为
5edadffd88 add returned string type (const char*) for nexa-omni-audio 2024-11-07 11:19:50 +08:00
Zack Li
6a4cf0b983
Merge pull request #11 from NexaAI/weili/master-release
add returned string (const char*) for qwen2 audio
2024-11-05 23:27:47 -08:00
李为
b24a409e22 add returned string (const char*) for qwen2 audio 2024-11-06 15:24:26 +08:00
Zack Li
5574bda471
Merge pull request #10 from NexaAI/weili/master-release
add returned string (pure c const char* type) for omni-vlm inference api
2024-11-05 19:41:03 -08:00
李为
22da7bc379 add returned string (pure c const char* type) for omni-vlm inference api 2024-11-06 11:20:36 +08:00
Zack Li
983b4625ef
Merge pull request #8 from NexaAI/weili/master-release
add omni-vlm examples (C++ & python)
2024-11-04 22:39:36 -08:00
Zack Li
91b3cafbb5
Merge pull request #6 from NexaAI/master-release-audio-lm
Remove C++20 coding and suport Microsoft Visual Studio Compilation
2024-11-04 21:59:26 -08:00
Zack Zhiyuan Li
05853eb861 remove C++20 syntax 2024-11-04 23:03:49 +00:00
Zack Zhiyuan Li
d42e0371f8 remove C++20 style 2024-11-04 22:50:33 +00:00
Zack Zhiyuan Li
1419681089 disable <cxxabi.h> for MSC_VER 2024-11-04 05:45:52 +00:00
Zack Zhiyuan Li
6f1ed6e5cb Adding #include <io.h> & <fcntl.h> 2024-11-04 04:54:51 +00:00
Zack Zhiyuan Li
a4747b2edb fix error on windows qwen2-audio/whisper.cpp:9935:38: error: '_O_BINARY' was not declared in this scope 2024-11-04 04:40:41 +00:00
Zack Zhiyuan Li
995baefeed Disable cxxabi.h dependency on Windows 2024-11-04 03:48:20 +00:00
李为
d277c674ae add omni-vlm examples (C++ & python) 2024-11-04 09:56:33 +08:00
Zack Zhiyuan Li
4bdc70aaac update to C++17 for compilation 2024-11-03 22:07:07 +00:00
Zack Zhiyuan Li
9e67ef75b4 remove uneccesary build and rename shared lib 2024-11-03 21:29:09 +00:00
Zack Zhiyuan Li
f0d1c4fa1c enable qwen2-audio work E2E 2024-11-03 18:33:32 +00:00
Zack Zhiyuan Li
c7b912bdca support omni-audio 2024-11-03 17:58:08 +00:00
Zack Zhiyuan Li
4a29bca867 update vulkan target name 2024-10-23 20:54:39 +00:00
Zack Li
3a3552632a update README after renaming GGML 2024-09-10 20:53:14 +00:00
Zack Li
5f81588780 support ggml 2024-09-10 20:50:54 +00:00
Georgi Gerganov
1d1ccce676
flake.lock: Update (#9162)
Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/c3aa7b8938b17aebd2deecf7be0636000d62a2b9?narHash=sha256-med8%2B5DSWa2UnOqtdICndjDAEjxr5D7zaIiK4pn0Q7c%3D' (2024-08-14)
  → 'github:NixOS/nixpkgs/c374d94f1536013ca8e92341b540eba4c22f9c62?narHash=sha256-Z/ELQhrSd7bMzTO8r7NZgi9g5emh%2BaRKoCdaAv5fiO0%3D' (2024-08-21)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-08-28 21:28:14 -07:00
slaren
9fe94ccac9
docker : build images only once (#9225) 2024-08-28 17:28:00 +02:00
slaren
66b039a501
docker : update CUDA images (#9213) 2024-08-28 13:20:36 +02:00
Georgi Gerganov
20f1789dfb vulkan : fix build (#0)
ggml-ci
2024-08-27 22:41:27 +03:00
Georgi Gerganov
231cff5f6f sync : ggml 2024-08-27 22:41:27 +03:00
Xie Yanbo
3246fe84d7
Fix minicpm example directory (#9111) 2024-08-27 14:33:08 +02:00
compilade
78eb487bb0
llama : fix qs.n_attention_wv for DeepSeek-V2 (#9156) 2024-08-27 13:09:23 +03:00
Xuan Son Nguyen
a77feb5d71
server : add some missing env variables (#9116)
* server : add some missing env variables

* add LLAMA_ARG_HOST to server dockerfile

* also add LLAMA_ARG_CONT_BATCHING
2024-08-27 11:07:01 +02:00
CausalLM
2e59d61c1b
llama : fix ChatGLM4 wrong shape (#9194)
This should fix THUDM/glm-4-9b-chat-1m and CausalLM/miniG
2024-08-27 09:58:22 +03:00
Carsten Kragelund Jørgensen
75e1dbbaab
llama : fix llama3.1 rope_freqs not respecting custom head_dim (#9141)
* fix: llama3.1 rope_freqs not respecting custom head_dim

* fix: use potential head_dim for Exaone
2024-08-27 09:53:40 +03:00
arch-btw
ad76569f8e
common : Update stb_image.h to latest version (#9161)
* Update stb_image.h to latest version

Fixes https://github.com/ggerganov/llama.cpp/issues/7431

* Update .ecrc
2024-08-27 08:58:50 +03:00
slaren
7d787ed96c
ggml : do not crash when quantizing q4_x_x with an imatrix (#9192) 2024-08-26 19:44:43 +02:00
Georgi Gerganov
06658ad7c3
metal : separate scale and mask from QKT in FA kernel (#9189)
* metal : separate scale and mask from QKT in FA kernel

* metal : ne01 check no longer necessary

* metal : keep data in local memory
2024-08-26 18:31:02 +03:00
Georgi Gerganov
fc18425b6a
ggml : add SSM Metal kernels (#8546)
* ggml : add ggml_ssm_conv metal impl

* ggml : add ssm_scan metal impl

ggml-ci
2024-08-26 17:55:36 +03:00
Georgi Gerganov
879275ac98
tests : fix compile warnings for unreachable code (#9185)
ggml-ci
2024-08-26 16:30:25 +03:00
Georgi Gerganov
7a3df798fc
ci : add VULKAN support to ggml-ci (#9055) 2024-08-26 12:19:39 +03:00
Georgi Gerganov
e5edb210cd
server : update deps (#9183) 2024-08-26 12:16:57 +03:00
slaren
0c41e03ceb
metal : gemma2 flash attention support (#9159) 2024-08-26 11:08:59 +02:00