cuda : improve cuda pool efficiency using virtual memory (#4606)
* cuda : improve cuda pool efficiency using virtual memory
* fix mixtral
* fix cmake build
* check for vmm support, disable for hip

ggml-ci

* fix hip build
* clarify granularity
* move all caps to g_device_caps
* refactor error checking
* add cuda_pool_alloc, refactor most pool allocations

ggml-ci

* fix hip build
* CUBLAS_TF32_TENSOR_OP_MATH is not a macro
* more hip crap
* llama : fix msvc warnings
* ggml : fix msvc warnings
* minor
* minor
* cuda : fallback to CPU on host buffer alloc fail
* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* Update ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* ensure allocations are always aligned
* act_size -> actual_size

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
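The commit message mentions a pool built on virtual memory, a mapping granularity, always-aligned allocations, and a `cuda_pool_alloc` helper that tracks an `actual_size`. The following is a minimal host-only sketch of how such a stack-style pool might behave; it is not the code from this commit. It makes no CUDA calls, and every name in it (`pool_sim`, `pool_alloc_sim`, `POOL_ALIGNMENT`) is hypothetical, as are the 128-byte alignment and the granularity passed to the constructor. A real implementation would presumably reserve the address range and map physical memory through the CUDA driver's virtual memory management API (`cuMemAddressReserve`, `cuMemCreate`, `cuMemMap`, `cuMemSetAccess`).

```cpp
#include <cassert>
#include <cstddef>

// Assumed allocation alignment in bytes (illustrative, not taken from the commit).
constexpr std::size_t POOL_ALIGNMENT = 128;

// Hypothetical host-only model of a stack-style pool over a reserved range.
// It only tracks offsets and sizes; no memory is actually allocated.
struct pool_sim {
    std::size_t granularity;  // assumed size of one physical mapping step
    std::size_t reserved;     // size of the "reserved virtual address range"
    std::size_t mapped = 0;   // bytes currently "backed by physical memory"
    std::size_t used   = 0;   // bump offset of the next allocation

    pool_sim(std::size_t granularity_, std::size_t reserved_)
        : granularity(granularity_), reserved(reserved_) {}

    static std::size_t round_up(std::size_t n, std::size_t m) {
        return (n + m - 1) / m * m;
    }

    // Returns the offset of the new block; actual_size receives the aligned size.
    std::size_t alloc(std::size_t size, std::size_t * actual_size) {
        size = round_up(size, POOL_ALIGNMENT);
        const std::size_t avail = mapped - used;
        if (size > avail) {
            // Grow the mapped region in whole granularity steps.
            const std::size_t grow = round_up(size - avail, granularity);
            assert(mapped + grow <= reserved && "reserved range exhausted");
            mapped += grow;  // a real pool would map physical memory here
        }
        const std::size_t offset = used;
        used += size;
        *actual_size = size;
        return offset;
    }

    // Blocks must be released in reverse order of allocation.
    void release(std::size_t offset, std::size_t actual_size) {
        used -= actual_size;
        assert(offset == used && "blocks must be released in reverse order");
    }
};

// Hypothetical RAII wrapper: releases its block when it goes out of scope,
// which yields the reverse-order release the pool requires.
struct pool_alloc_sim {
    pool_sim &  pool;
    std::size_t offset      = 0;
    std::size_t actual_size = 0;

    pool_alloc_sim(pool_sim & p, std::size_t size) : pool(p) {
        offset = pool.alloc(size, &actual_size);
    }
    ~pool_alloc_sim() { pool.release(offset, actual_size); }

    pool_alloc_sim(const pool_alloc_sim &) = delete;
    pool_alloc_sim & operator=(const pool_alloc_sim &) = delete;
};
```

In this model the mapped region only grows, so a later allocation of the same or smaller size reuses already-mapped space, and scoped wrappers guarantee that blocks are released in the reverse order the pool expects.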
parent 708e179e85
commit 5bf3953d7e
8 changed files with 328 additions and 208 deletions
@@ -883,9 +883,6 @@ int main(int argc, const char ** argv) {
    srand(seed);
    const int nargs = 1;

    int64_t ne2[4];
    ne2[0] = 1;

    for (int ndims = 1; ndims <= 2; ++ndims) {
        x[0] = get_random_tensor_f32(ctx0, ndims, ne, -1.0f, 1.0f);