ROCm Port (#1087)

* use hipblas based on cublas
* Update Makefile for the CUDA kernels
* Expand arch list and make it overridable
* Fix multi-GPU on multiple AMD architectures with rocblas_initialize() (#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non-CUDA/HIP builds

---------

Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com>
Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com>
Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com>
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
Co-authored-by: jammm <2500920+jammm@users.noreply.github.com>
Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
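
For readers unfamiliar with the `__dp4a` intrinsic named in the list above: on CUDA it treats each 32-bit operand as four signed 8-bit lanes, multiplies the lanes pairwise, and adds the four products to an accumulator. The AMD-specific assembly from this commit is not shown on this page, so the following is only a portable scalar sketch of those semantics. The function name `dp4a_reference` is hypothetical and is not taken from the commit.

```c
#include <stdint.h>

/* Extract signed 8-bit lane `i` (0 = lowest byte) from a 32-bit word. */
static int lane_s8(uint32_t word, int i) {
    int v = (int)((word >> (8 * i)) & 0xFFu);
    return v >= 128 ? v - 256 : v; /* manual sign extension */
}

/* Hypothetical scalar reference: c + sum over the 4 lanes of a[i] * b[i]. */
static int dp4a_reference(int a, int b, int c) {
    for (int i = 0; i < 4; ++i) {
        c += lane_s8((uint32_t)a, i) * lane_s8((uint32_t)b, i);
    }
    return c;
}
```

A scalar version like this is mainly useful for checking an optimized path against known inputs, for example `dp4a_reference(0x01020304, 0x01010101, 10)`, which sums the lanes 1, 2, 3 and 4 on top of the accumulator.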

Author: Henri Vasserman
Date: 2023-08-25 12:09:42 +03:00
Committed by: GitHub
Parent: 3f460a2b72
Commit: 6bbc598a63
GPG key ID: 4AEE18F83AFDEB23 (no known key found for this signature in database)

12 changed files with 335 additions and 59 deletions

									
ggml-cuda.h (8 changes)

@@ -2,6 +2,14 @@
 
 #include "ggml.h"
 
+#ifdef GGML_USE_HIPBLAS
+#define GGML_CUDA_NAME "ROCm"
+#define GGML_CUBLAS_NAME "hipBLAS"
+#else
+#define GGML_CUDA_NAME "CUDA"
+#define GGML_CUBLAS_NAME "cuBLAS"
+#endif
+
 #ifdef  __cplusplus
 extern "C" {
 #endif
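
The hunk above only defines the two name macros. As a minimal sketch of how such macros might be consumed, the snippet below builds a user-facing banner from them through string formatting. The helper `format_backend_banner` and the banner wording are assumptions made for illustration; they do not appear in the commit. The macro block is repeated so the snippet compiles on its own.

```c
#include <stdio.h>
#include <string.h>

/* Same selection as the header hunk: define GGML_USE_HIPBLAS for ROCm names. */
#ifdef GGML_USE_HIPBLAS
#define GGML_CUDA_NAME "ROCm"
#define GGML_CUBLAS_NAME "hipBLAS"
#else
#define GGML_CUDA_NAME "CUDA"
#define GGML_CUBLAS_NAME "cuBLAS"
#endif

/* Hypothetical helper (not part of ggml): writes a backend banner into buf.
 * Returns the value of snprintf, i.e. the length the full banner needs. */
static int format_backend_banner(char *buf, size_t size, int device_count) {
    return snprintf(buf, size, "%s backend (%s): %d device(s)",
                    GGML_CUDA_NAME, GGML_CUBLAS_NAME, device_count);
}
```

Compiling with `-DGGML_USE_HIPBLAS` switches the banner text to the ROCm and hipBLAS names without touching the call site, which is the point of routing the names through macros.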
