Stanford BDHG llama.cpp

Overview

This project is a Stanford BDHG-maintained fork of the well-regarded llama.cpp, tailored for deploying LLaMA models using C/C++. Our modifications package the library as an XCFramework so it can be distributed as a binary compatible with multiple platforms, and the included Package.swift file facilitates integration with the Swift Package Manager (SPM).
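
Conceptually, the distribution exposes the prebuilt llama.xcframework as an SPM binary target behind the llama product. The manifest below is an illustrative sketch only, not the exact Package.swift shipped in this repository; the target path and tools version are assumptions.

// swift-tools-version: 5.9
// Illustrative sketch of an XCFramework-backed SPM package; the actual Package.swift may differ.
import PackageDescription

let package = Package(
    name: "llama",
    products: [
        // The `llama` product that downstream packages consume (see Setup below).
        .library(name: "llama", targets: ["llama"])
    ],
    targets: [
        // Hypothetical binary target wrapping the prebuilt, multi-platform XCFramework.
        .binaryTarget(name: "llama", path: "llama.xcframework")
    ]
)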

Note

Should you have inquiries regarding the llama.cpp codebase this fork builds upon, please refer to the upstream llama.cpp README for comprehensive details and guidance.

Setup

Add Stanford BDHG llama.cpp as a Dependency

You need to add the Stanford BDHG llama.cpp Swift package to your app in Xcode or to your own Swift package.

Important

In order to use the library, you need to set build parameters in the consuming Xcode project or the consuming SPM package to enable the Swift / C++ interoperability introduced in Xcode 15 and Swift 5.9. Keep in mind that this also applies to nested dependencies: the configuration needs to be set recursively for the entire dependency tree down to the llama.cpp SPM package.

For Xcode projects:

  • Open your project settings in Xcode by selecting PROJECT_NAME > TARGET_NAME > Build Settings.
  • Within the Build Settings, search for the C++ and Objective-C Interoperability setting and set it to C++ / Objective-C++. This enables the project to use the C++ headers from llama.cpp. If you manage your configuration through xcconfig files, see the sketch after this list.
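
The equivalent xcconfig entry is sketched below. The setting name is assumed to be the standard Xcode build setting backing the Swift / C++ interoperability option; verify it against your Xcode version.

// Assumed xcconfig equivalent of "C++ and Objective-C Interoperability = C++ / Objective-C++"
SWIFT_OBJC_INTEROP_MODE = objcxx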

For SPM packages:

  • Open the Package.swift file of your SPM package.
  • Within the package target that consumes the llama.cpp package, add the interoperabilityMode(_:) Swift build setting as shown below:
/// Adds the dependency on the Stanford BDHG llama.cpp SPM package
dependencies: [
    .package(url: "https://github.com/StanfordBDHG/llama.cpp", .upToNextMinor(from: "0.1.0"))
],
targets: [
    .target(
        name: "ExampleConsumingTarget",
        /// Declare the target's dependency on llama.cpp
        dependencies: [
            .product(name: "llama", package: "llama.cpp")
        ],
        /// Important: Configure the `.interoperabilityMode(_:)` within the `swiftSettings`
        swiftSettings: [
            .interoperabilityMode(.Cxx)
        ]
    )
]
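
Once the dependency and the interoperability mode are configured, the C API from llama.h should be callable directly from Swift. The smoke test below is a minimal sketch; the exact symbols and signatures depend on the llama.cpp revision bundled in the release you pin, so treat the calls as assumptions and check them against the llama.h header shipped with the package.

import llama

// Hypothetical usage; verify the signatures against the bundled llama.h.
// Initialize the backend, print the detected system capabilities, then tear down.
llama_backend_init()
print(String(cString: llama_print_system_info()))
llama_backend_free()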

Contributing

Contributions to this project are welcome. Please make sure to read the contribution guidelines and the contributor covenant code of conduct first. You can find a list of contributors in the CONTRIBUTORS.md file.

License

This project is a fork of an existing project that is licensed under the MIT License, and all changes made in this fork continue to be under the MIT License. For more information about the license terms, see the Licenses folder.

Our Research

For more information, check out our website at biodesigndigitalhealth.stanford.edu.

Stanford Byers Center for Biodesign Logo