regression of #4490
Adds defines for two new datatypes:
cublasComputeType_t and cudaDataType_t.
Currently uses the deprecated hipblasDatatype_t, since the newer ones are very recent.
* build : Check the ROCm installation location
* more generic approach
* fixup! It was returning the path instead of the command output
* fixup! Trailing whitespace
* Add API key authentication for enhanced server-client security
* server : to snake_case
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
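
A minimal sketch of the kind of check an API key gate performs, written in Python for illustration rather than taken from the server code. It assumes the client sends the key in an `Authorization: Bearer <key>` header. The function name `is_authorized`, the header layout, and the "empty key disables auth" rule are assumptions, not the server's actual interface.

```python
import hmac


def is_authorized(headers: dict, api_key: str) -> bool:
    """Return True when the request carries the expected API key."""
    # Assumption: an empty configured key means authentication is disabled.
    if not api_key:
        return True
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    auth = normalized.get("authorization", "")
    prefix = "Bearer "
    if not auth.startswith(prefix):
        return False
    presented = auth[len(prefix):]
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(presented.encode(), api_key.encode())
```

A request without the header, or with a different key, is rejected before any handler runs.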
* ggml : group mul_mat_id rows by matrix (cpu only)
* remove mmid parameters from mm forward
* store row groups in wdata and calculate only once in GGML_TASK_INIT
ggml-ci
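
The grouping idea in this commit can be sketched as follows. This is an illustrative Python version, not the ggml implementation: `group_rows_by_matrix` and `mul_mat_id_grouped` are hypothetical names, and plain lists stand in for tensors. Rows that select the same matrix are collected once up front, so each matrix is visited a single time.

```python
def group_rows_by_matrix(ids, n_matrices):
    """Collect the input row indices that select each matrix."""
    groups = [[] for _ in range(n_matrices)]
    for row, matrix_id in enumerate(ids):
        groups[matrix_id].append(row)
    return groups


def mul_mat_id_grouped(matrices, ids, rows):
    """For each input row r, compute matrices[ids[r]] applied to rows[r]."""
    out = [None] * len(rows)
    # The grouping is computed once, before any multiplication starts.
    groups = group_rows_by_matrix(ids, len(matrices))
    for matrix_id, row_indices in enumerate(groups):
        mat = matrices[matrix_id]
        for r in row_indices:
            x = rows[r]
            # Plain matrix-vector product for the selected matrix.
            out[r] = [sum(w * v for w, v in zip(mat_row, x)) for mat_row in mat]
    return out
```

Results are written back by original row index, so the output order matches the input order regardless of how rows were grouped.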
* Fixes the "Not enough space in the context's memory pool" error encountered on certain models, which seems to be caused by imprecision in the automatic casting of floating-point values
* do not cast to size_t, instead just use doubles
* ggml : add ggml_row_size(), deprecate ggml_type_sizef()
* ggml : fix row size compute to avoid overflows
* tests : fix sizey -> sizez
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
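
A short sketch of why an integer row-size helper is safer than a fractional per-element size. It is written in Python for illustration: `row_size` and `to_float32` are hypothetical helpers, and the example values (18 bytes per block of 32 elements) are assumed for a block-quantized type rather than read from this change. Whether this matches the exact failure fixed here is not stated in the log.

```python
import struct


def row_size(type_size: int, block_size: int, n_elements: int) -> int:
    """Bytes needed for n_elements, using integer arithmetic only."""
    if n_elements % block_size != 0:
        raise ValueError("n_elements must be a multiple of block_size")
    # Multiply before dividing so no fractional intermediate appears.
    return type_size * n_elements // block_size


def to_float32(x: float) -> float:
    """Round a Python float (double) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]
```

Here `row_size(18, 32, 4096)` gives 2304 bytes with no rounding step, while `to_float32(16777217.0)` comes back as 16777216.0, which shows how a single-precision intermediate can lose a byte once sizes grow large.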
* .sh script V1
* koboldcpp.sh polish
* koboldcpp.sh dist generator
* Include HTML files in dist
* RWKV in Linux Dist
* Lower dependency requirements
* Eliminate wget dependency
* More distinct binary name
I know it's technically amd64, but I don't want to cause confusion among NVIDIA users.
* Use System OpenCL
Unsure how this will behave in the pyinstaller build, but pocl ended up CPU-only. With a bit of luck, pyinstaller uses the library from the actual system when compiled on a system without OpenCL, while conda now includes it for that specific system.
* Add cblas dependency
Missing this causes compile failures on some systems
* ICD workaround
Ideally we find a better solution, but conda forces the ICD and needs it for a successful compile. However, pyinstaller then embeds the ICD, which limits the binary to the system it was compiled on. With the ICD temporarily removed, pyinstaller can't find it and everything remains functional. Ideally we would do this at the pyinstaller level, but I could not find any good options to do so yet.
* Fix & Nocuda
* Automatically build Linux Binary
* Auto build on v tag
* Better on release
* Fix missing jobs:
* More distinct name
* I am too retro...
* Fix release upload
* Another upload attempt
* Another upload attempt
* Also rebuild on release edit
* Placebo commit to maybe fix CI
---------
Co-authored-by: root <root@DESKTOP-DQ1QRAG>
* sync : ggml (SD ops, tests, kernels)
ggml-ci
* cuda : restore im2col
ggml-ci
* metal : fix accuracy of dequantization kernels
ggml-ci
* cuda : restore correct im2col
ggml-ci
* metal : try to fix moe test by reducing expert size
ggml-ci
* cuda : fix bin bcast when src1 and dst have different types
ggml-ci
---------
Co-authored-by: slaren <slarengh@gmail.com>