mpi : add support for distributed inference via MPI (#2099)

* MPI support, first cut

* fix warnings, update README

* fixes

* wrap includes

* PR comments

* Update CMakeLists.txt

* Add GH workflow, fix test

* Add info to README

* mpi : trying to move more MPI stuff into ggml-mpi (WIP) (#2099)

* mpi : add names for layer inputs + prep ggml_mpi_graph_compute()

* mpi : move all MPI logic into ggml-mpi

Not tested yet

* mpi : various fixes - communication now works but results are wrong

* mpi : fix output tensor after MPI compute (still not working)

* mpi : fix inference

* mpi : minor

* Add OpenMPI to GH action

* [mpi] continue-on-error: true

* mpi : fix after master merge

* [mpi] Link MPI C++ libraries to fix OpenMPI

* tests : fix new llama_backend API

* [mpi] use MPI_INT32_T

* mpi : factor out recv / send in functions and reuse (see the sketch below)

* mpi : extend API to allow usage with outer backends (e.g. Metal)

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Evan Miller, 2023-07-10 11:49:56 -04:00 (committed by GitHub)
parent 1d16309969
commit 5656d10599
18 changed files with 460 additions and 35 deletions
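
Two of the steps above call out factoring the MPI send/recv calls into reusable helpers and switching the wire type to MPI_INT32_T. The sketch below is a hypothetical illustration of that pattern using only the standard MPI C API; it is not the actual ggml-mpi code, and the helper names and tag value are invented for the example.

#include <mpi.h>
#include <stdint.h>

// Hypothetical helpers: send/receive a buffer of int32_t tokens between ranks.
// MPI_INT32_T guarantees the wire type matches int32_t exactly, unlike MPI_INT,
// whose width is platform-defined.
static void mpi_send_int32(const int32_t * buf, int n, int dst) {
    MPI_Send(buf, n, MPI_INT32_T, dst, /*tag=*/0, MPI_COMM_WORLD);
}

static void mpi_recv_int32(int32_t * buf, int n, int src) {
    MPI_Status status;
    MPI_Recv(buf, n, MPI_INT32_T, src, /*tag=*/0, MPI_COMM_WORLD, &status);
}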


@@ -66,7 +66,7 @@ int main(int argc, char ** argv)
     // Init LLM :
     //---------------------------------
-    llama_init_backend(params.numa);
+    llama_backend_init(params.numa);
     llama_model * model;
     llama_context * ctx;
@@ -173,6 +173,8 @@ int main(int argc, char ** argv)
     llama_free( ctx );
     llama_free_model( model );
+    llama_backend_free();
+
     return 0;
 }
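
Taken together, the two hunks show the renamed init call (llama_init_backend becomes llama_backend_init) and the matching new teardown call llama_backend_free(), which releases backend-global state such as the MPI environment. Below is a minimal lifecycle sketch, assuming the llama.cpp API of this era (llama_load_model_from_file / llama_new_context_with_model); the model path is a placeholder and error handling is elided.

#include "llama.h"

int main() {
    llama_backend_init(false /* numa */);   // renamed from llama_init_backend

    llama_context_params params = llama_context_default_params();
    llama_model * model = llama_load_model_from_file("model.bin", params);
    llama_context * ctx  = llama_new_context_with_model(model, params);

    // ... tokenize, evaluate, sample ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();                   // new: global teardown (MPI, etc.)
    return 0;
}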