
Llama.cpp

The original llama.cpp implementation is available at https://github.com/ggerganov/llama.cpp.

To build this project:

make clean
cmake -B build
cmake --build build --config Release -j 24

Checking in Your Code Changes

To commit your local code changes and push them to your repository, follow these steps:

  1. Stage all changes:

    git add .
    
  2. Commit the changes with a descriptive message:

    git commit -m "Describe your changes here"
    
  3. Push the changes to your master branch:

    git push origin master
    
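The three steps above can also be collected into one shell function. This is just a sketch: the remote name (origin), branch (master), and placeholder commit message are the ones used in this README.

```shell
# Sketch of steps 1-3 as a single function. "origin" and "master"
# match the remote/branch names used in this README; the default
# message is only a placeholder.
push_changes() {
    msg="${1:-Describe your changes here}"
    git add .                    # 1. stage all changes
    git commit -m "$msg"         # 2. commit with a descriptive message
    git push origin master       # 3. push to your fork's master branch
}
```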

Syncing with Upstream Changes

To pull the latest changes from the upstream repository (ggerganov/llama.cpp), follow these steps:

  1. Add the upstream repository if you haven't done so already:

    git remote add upstream https://github.com/ggerganov/llama.cpp
    
  2. Fetch the latest changes from the upstream repository:

    git fetch upstream
    
  3. Merge the upstream changes into your local master branch:

    git merge upstream/master
    
  4. If there were conflicts, resolve them and commit the merge:

    git commit -m "Merge from upstream"
    
  5. Push the merged changes to your origin/master branch:

    git push origin master
    
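The five sync steps can likewise be sketched as a single function. The upstream URL defaults to ggerganov/llama.cpp as above, but is accepted as a parameter so the same function can point at a mirror or a local clone; the existence check on the remote is an addition that makes the function safe to re-run.

```shell
# Sketch of the sync steps as one function. Adding the upstream remote
# is skipped when it already exists, so the function is idempotent;
# the default URL is the one used in this README.
sync_upstream() {
    url="${1:-https://github.com/ggerganov/llama.cpp}"
    git remote get-url upstream >/dev/null 2>&1 \
        || git remote add upstream "$url"   # 1. add upstream once
    git fetch upstream                      # 2. fetch the latest changes
    git merge upstream/master               # 3-4. merge (commit if needed)
    git push origin master                  # 5. push the merged result
}
```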

Error Handling

If you get the following error for the kompute submodule:

fatal: cannot chdir to '../../../ggml/src/kompute': No such file or directory

You can fix it by running the command below (the submodule path changed when the ggml directory was renamed to ggml_llama):

git reset ggml_llama/src/kompute