Commit graph

2556 commits

Author SHA1 Message Date
Concedo
ea81eae189 cleanup, up ver (+1 squashed commits)
Squashed commits:

[1ea303d6] cleanup , up ver (+1 squashed commits)

Squashed commits:

[79f09b22] cleanup
2023-11-05 22:49:23 +08:00
YellowRoseCx
e2e5fe56a8
KCPP Fetches AMD ROCm Memory without a stick, CC_TURING Gets the Boot, koboldcpp_hipblas.dll Talks To The Hand, and hipBLAS Compiler Finds Its Independence! (#517)
* AMD ROCm memory fetching and max mem setting

* Update .gitignore with koboldcpp_hipblas.dll

* Update CMakeLists.txt remove CC_TURING for AMD

* separate hipBLAS compiler, update MMV_Y, move CXX/CC print

separate hipBLAS compiler, update MMV_Y value, move the section that prints CXX and CC compiler name
2023-11-05 22:23:18 +08:00
Concedo
a62468ec4c Merge branch 'master' into concedo_experimental
should fix multigpu
2023-11-05 22:14:40 +08:00
Concedo
bdf16d7a3c aria2 needs to show more info 2023-11-05 22:13:22 +08:00
Meng Zhang
3d48f42efc
llama : mark LLM_ARCH_STARCODER as full offload supported (#3945)
as done in https://github.com/ggerganov/llama.cpp/pull/3827
2023-11-05 14:40:08 +02:00
Eve
c41ea36eaa
cmake : MSVC instruction detection (fixed up #809) (#3923)
* Add detection code for avx

* Only check hardware when option is ON

* Modify per code review suggestions

* Build locally will detect CPU

* Fixes CMake style to use lowercase like everywhere else

* cleanup

* fix merge

* linux/gcc version for testing

* msvc combines avx2 and fma into /arch:AVX2 so check for both

* cleanup

* msvc only version

* style

* Update FindSIMD.cmake

---------

Co-authored-by: Howard Su <howard0su@gmail.com>
Co-authored-by: Jeremy Dunn <jeremydunn123@gmail.com>
2023-11-05 10:03:09 +02:00
Eve
a7fac013cf
ci : use intel sde when ci cpu doesn't support avx512 (#3949) 2023-11-05 09:46:44 +02:00
slaren
48ade94538
cuda : revert CUDA pool stuff (#3944)
* Revert "cuda : add ROCM aliases for CUDA pool stuff (#3918)"

This reverts commit 629f917cd6.

* Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)"

This reverts commit d6069051de.

ggml-ci
2023-11-05 09:12:13 +02:00
Concedo
351dcabd3e lite fix 2023-11-05 14:47:02 +08:00
Concedo
faae84ee1d removed c flag in wget 2023-11-05 10:21:28 +08:00
henk717
02595f9d21
Colabcpp improvements (#512)
* Aria2

* Aria2 Typo fix

* Streamlined Wget

* Streamlining Fix

* Back to .so downloading

* Crash colab if no GPU is present

* Created using Colaboratory

* Restore proper link

Colab overwrites the link, manually changing it back so people don't land on my branch.

* Restore file juggle

* Fixing the colab link... again
2023-11-05 10:19:09 +08:00
Concedo
5e5be717c3 fix for removing inaccessible backends in gui 2023-11-05 10:12:12 +08:00
Kerfuffle
f28af0d81a
gguf-py: Support 01.AI Yi models (#3943) 2023-11-04 16:20:34 -06:00
Concedo
1e7088a80b autopick cublas in gui if possible, better layer picking logic 2023-11-05 01:35:27 +08:00
Concedo
7a8c0df2e5 Merge branch 'master' into concedo_experimental 2023-11-04 09:18:28 +08:00
Concedo
135001abc4 try to make the tunnel more reliable 2023-11-04 09:18:19 +08:00
Concedo
38471fbe06 tensor core info better printout (+1 squashed commits)
Squashed commits:

[be4ef93f] tensor core info better printout
2023-11-04 08:38:25 +08:00
Peter Sugihara
d9b33fe95b
metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (#3938) 2023-11-03 21:18:18 +02:00
Xiao-Yong Jin
5ba3746171
ggml-metal: fix yarn rope (#3937) 2023-11-03 14:00:31 -04:00
Concedo
36f43ae834 syntax correction 2023-11-04 00:03:45 +08:00
Concedo
9bc2e35b2e Merge branch 'master' into concedo_experimental 2023-11-03 23:51:32 +08:00
Concedo
373c20ad51 print error log if tunnel fails 2023-11-03 23:48:21 +08:00
slaren
abb77e7319
ggml-cuda : move row numbers to x grid dim in mmv kernels (#3921) 2023-11-03 12:13:09 +01:00
Concedo
c794fd5ceb sampler seed added (+1 squashed commits)
Squashed commits:

[8a1b0d3d] sampler seed added
2023-11-03 17:30:16 +08:00
Concedo
d7729ac3eb Merge branch 'master' into concedo_experimental 2023-11-03 16:00:05 +08:00
Georgi Gerganov
8f961abdc4
speculative : change default p_accept to 0.5 + CLI args (#3919)
ggml-ci
2023-11-03 09:41:56 +02:00
Georgi Gerganov
05816027d6
common : YAYF (yet another YARN fix) (#3925)
ggml-ci
2023-11-03 09:24:00 +02:00
cebtenzzre
3fdbe6b66b
llama : change yarn_ext_factor placeholder to -1 (#3922) 2023-11-03 08:31:58 +02:00
Concedo
8c14c81b33 hopefully this fixes the dotnet nonsense 2023-11-03 11:23:56 +08:00
Concedo
bc2027b008 Merge remote-tracking branch 'ceb/fix-fast-ext-factor' into concedo_experimental 2023-11-03 11:21:14 +08:00
Concedo
c07c9b857d Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-11-03 11:17:07 +08:00
cebtenzzre
25fef506cf llama : change yarn_ext_factor placeholder to -1 2023-11-02 21:53:59 -04:00
Kerfuffle
629f917cd6
cuda : add ROCM aliases for CUDA pool stuff (#3918) 2023-11-02 21:58:22 +02:00
Andrei
51b2fc11f7
cmake : fix relative path to git submodule index (#3915) 2023-11-02 21:40:31 +02:00
Georgi Gerganov
224e7d5b14
readme : add notice about #3912 2023-11-02 20:44:12 +02:00
Georgi Gerganov
c7743fe1c1
cuda : fix const ptrs warning causing ROCm build issues (#3913) 2023-11-02 20:32:11 +02:00
Oleksii Maryshchenko
d6069051de
cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)
* Using cuda memory pools for async alloc/dealloc.

* If cuda device doesn't support memory pool then use old implementation.

* Removed redundant cublasSetStream

---------

Co-authored-by: Oleksii Maryshchenko <omaryshchenko@dtis.com>
2023-11-02 19:10:39 +02:00
Concedo
879061c5d5 noavx2 clblast selector 2023-11-02 23:13:16 +08:00
Concedo
c7c3f3d9ab updated lite 2023-11-02 22:46:54 +08:00
Concedo
b0c7b88eac try fix cloudflare tunnel (+2 squashed commit)
Squashed commit:

[87d96bf2] update remote option

[c30bc909] updated fixed colab (+1 squashed commits)

Squashed commits:

[97b77563] updated fixed colab (+2 squashed commit)

Squashed commit:

[d851b04c] replaced cloudflare manual dl with remotetunnel in colab

[90ff1790] updated lite
2023-11-02 22:27:35 +08:00
Georgi Gerganov
4ff1046d75
gguf : print error for GGUFv1 files (#3908) 2023-11-02 16:22:30 +02:00
Concedo
6dbb8d82b0 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	models/ggml-vocab-llama.gguf
2023-11-02 20:51:45 +08:00
Concedo
42eabf2f2f rope fixes 2023-11-02 20:41:16 +08:00
slaren
21958bb393
cmake : disable LLAMA_NATIVE by default (#3906) 2023-11-02 14:10:33 +02:00
Concedo
bc4ff72317 not working merge 2023-11-02 17:52:40 +08:00
Georgi Gerganov
2756c4fbff
gguf : remove special-case code for GGUFv1 (#3901)
ggml-ci
2023-11-02 11:20:21 +02:00
Georgi Gerganov
1efae9b7dc
llm : prevent from 1-D tensors being GPU split (#3697) 2023-11-02 09:54:44 +02:00
Concedo
fca7a4c054 added noavx2 model for clblast (+1 squashed commits)
Squashed commits:

[291ecae6] added noavx2 mode for clblast (+1 squashed commits)

Squashed commits:

[562bc872] wip adding noavx2 cl
2023-11-02 15:22:34 +08:00
cebtenzzre
b12fa0d1c1
build : link against build info instead of compiling against it (#3879)
* cmake : fix build when .git does not exist

* cmake : simplify BUILD_INFO target

* cmake : add missing dependencies on BUILD_INFO

* build : link against build info instead of compiling against it

* zig : make build info a .cpp source instead of a header

Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>

* cmake : revert change to CMP0115

---------

Co-authored-by: Matheus C. França <matheus-catarino@hotmail.com>
2023-11-02 08:50:16 +02:00
Georgi Gerganov
4d719a6d4e
cuda : check if this fixes Pascal card regression (#3882) 2023-11-02 08:35:10 +02:00