Commit graph

1176 commits

Author SHA1 Message Date
Alon
457de66764
Merge branch 'master' into add-code-coverage 2023-09-03 11:38:16 +03:00
Alon Faraj
2c6ae862a4 wrap coverage output files in COV_TARGETS 2023-09-03 11:29:04 +03:00
Kawrakow
ca82cf7bac
metal : more optimizations (#2959)
* Very minor speedup via simd-group synchronization in f16 x f32

* Another very minor speedup on metal

* Quite significant PP speedup on metal

* Another attempt

* Minor

* Massive improvement for TG for fp16

* ~4-5% improvement for Q8_0 TG on metal

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-03 11:06:22 +03:00
kchro3
6a31a3bd98
swift : add support for k-quants (#2983) 2023-09-03 09:21:05 +03:00
Kerfuffle
cff7b0bf07
convert.py : BPE fixes (#2938)
* convert.py: BPE fixes?

* Remove unnecessary conditional in addl token error handling
2023-09-03 08:52:13 +03:00
Ido S
340af42f09
docs : add catai to README.md (#2967) 2023-09-03 08:50:51 +03:00
momonga
c42f0ec6b3
examples : fix gpt-neox (#2943)
Co-authored-by: mmnga <mmnga1mmnga@gmail.com>
2023-09-03 08:36:28 +03:00
kchro3
2753415afd
swift : add missing c file to Package.swift (#2978) 2023-09-03 08:27:25 +03:00
Cebtenzzre
bc054af97a
make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS (#2886)
* make : remove unused -DGGML_BIG_ENDIAN

* make : put preprocessor stuff in CPPFLAGS

* make : pass Raspberry Pi arch flags to g++ as well

* make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS

* make : fix inverted conditional
2023-09-03 08:26:59 +03:00
Kerfuffle
3358c381f6
logging: Fix creating empty file even when disabled (#2966)
* logging: Fix creating empty file even when disabled

* Minor formatting fix

Co-authored-by: staviq <staviq@gmail.com>

---------

Co-authored-by: staviq <staviq@gmail.com>
2023-09-02 11:53:55 -06:00
bandoti
52315a4216
readme : update clblast instructions (#2903)
* Update Windows CLBlast instructions

* Update Windows CLBlast instructions

* Remove trailing whitespace
2023-09-02 15:53:18 +03:00
Karsten Weiss
8b56b4f2c3
metal : show all Metal device instances in the system (#2952)
* ggml_metal_init: Show all Metal device instances in the system

Also show the default Metal device that was picked.

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-02 15:29:09 +03:00
Jhen-Jie Hong
21f3d1be86
k-quants : fix build on armv7 (android only) (#2920)
* k-quants : fix build on armv7

* ggml : cleanup unused arm32 specific impl

* k-quants : avoid some unused vzero / mzero define

* ggml-alloc : use 4g for MEASURE_MAX_SIZE in 32-bit arm
2023-09-02 15:23:45 +03:00
Jhen-Jie Hong
571083f508
server : avoid aniprompt in probabilities of final response (#2849) 2023-09-02 08:31:46 +08:00
Engininja2
f04d002844
cuda : vsubss4 for older versions of ROCm/clang (#2942) 2023-09-01 23:33:19 +02:00
ZHAOKAI WANG
69fdbb9abc
readme : quick start command fix (#2908)
* quick start command fix

* quick start win command fix
2023-09-01 17:06:44 +03:00
Kerfuffle
5d6f19f16b
Allow quantize to only copy tensors, some other improvements (#2931)
* Allow quantize tool to only copy tensors to allow repackaging models.

* Slightly better logic when requantizing.

* Change help message to go to `stdout`.
2023-09-01 08:02:48 -06:00
Georgi Gerganov
0d58936686
llama2c : rename function 2023-09-01 17:01:11 +03:00
Cebtenzzre
6c9c23429b
make : use unaligned vector moves on MinGW (#2945)
Fixes #2922
2023-09-01 16:53:14 +03:00
m3ndax
ee8654bcd0
minor : add const qualifiers (#2853)
* made the methods const

# Conflicts:
#	examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp

* made method const

* Update convert-llama2c-to-ggml.cpp

removed write_raw and write_u32

* llama2c : remove misleading const

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-01 16:47:27 +03:00
Konstantin Herud
49bb9cbe0f
docs : add java-llama.cpp to README.md (#2935) 2023-09-01 16:36:14 +03:00
Cebtenzzre
ef15649972
build : fix most gcc and clang warnings (#2861)
* fix most gcc and clang warnings

* baby-llama : remove commented opt_params_adam

* fix some MinGW warnings

* fix more MinGW warnings
2023-09-01 16:34:50 +03:00
Ben Siraphob
d8d6977f48
examples : add C grammar (#2357) 2023-09-01 16:32:14 +03:00
Tameem
5aec2cfaac
ggml : add RISC-V vector intrinsics support (#2929)
* added support for RISCV CFLAGS & native compile + cross compile options

* Add RISC-V Vector Intrinsics Support

Added RVV intrinsics for following
   ggml_vec_dot_q4_0_q8_0
   ggml_vec_dot_q4_1_q8_1
   ggml_vec_dot_q5_0_q8_0
   ggml_vec_dot_q5_1_q8_1
   ggml_vec_dot_q8_0_q8_0

Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai>
Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>

---------

Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>
Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai>
2023-09-01 16:27:40 +03:00
Georgi Gerganov
13268c5331
metal : slight speed-up for add and mul kernels (#2917) 2023-09-01 13:42:41 +03:00
staviq
4dcd47d71d
logs : fix mingw-like builds (fixes #2898) (#2911)
* fix mingw-like builds

* formatting

* make LOG_COMPAT easier to override and extend

* simplify win detection

* fix for #2940
2023-09-01 12:07:06 +03:00
Cebtenzzre
18705a30ef
llama2c : fix segfault and alloc-dealloc-mismatch (#2913)
* llama2c : fix segfault if vocab is not found

* llama2c : fix mismatch between new[] and delete

* llama2c : fix basename on Windows

* llama2c : use a destructor to prevent memory leaks
2023-09-01 12:03:49 +03:00
Kawrakow
e8d9158925
metal: somewhat faster f16 x f32 matrix multiply kernel (#2951)
* Somewhat faster f16 x f32 matrix multiply kernel

* Better use 32 thread groups for f16 x f32

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-09-01 11:15:57 +03:00
Cebtenzzre
bce1fef328
convert : fix another python 3.8 issue (#2949) 2023-08-31 22:13:51 -04:00
slaren
528134dd02
remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py (#2906) 2023-09-01 01:32:09 +02:00
Kerfuffle
aeefac4ff7
scripts: Use local gguf package when running from repo (#2927)
* scripts: Use local gguf when running from repo
2023-08-31 16:49:24 -06:00
DannyDaemonic
e8422de39e
@vxiiduu's fix for PrefetchVirtualMemory (#2930)
Reimplement fix for `PrefetchVirtualMemory`.
Co-authored-by: vxiiduu <73044267+vxiiduu@users.noreply.github.com>
2023-08-31 04:21:45 -07:00
Alon Faraj
027930c0b7 remove unused GGML_NITER from workflows 2023-08-31 13:59:46 +03:00
Alon Faraj
12b611ace0 update make file -pb flag 2023-08-31 12:53:56 +03:00
Alon Faraj
eb2b59c2f1 install lcov 2023-08-31 12:28:32 +03:00
Alon Faraj
a589615fd3 - add CODECOV_TOKEN
- add missing make lcov-report
2023-08-31 12:19:25 +03:00
Alon Faraj
a62df95d35 add LLAMA_CODE_COVERAGE=1 again 2023-08-31 12:16:23 +03:00
Alon Faraj
5a4426c633 add env vars 2023-08-31 12:14:27 +03:00
Alon Faraj
a5e2aa33f7 check why the job hang 2023-08-31 12:11:15 +03:00
Alon Faraj
c4ced68344 use gcc-8 2023-08-31 12:02:03 +03:00
Alon Faraj
89ff60e941 wun on ubuntu 20.04 2023-08-31 11:57:52 +03:00
Alon Faraj
09e0f58692 update code coverage workflow 2023-08-31 11:56:39 +03:00
Alon Faraj
997a88c213 add code-coverage workflow 2023-08-31 11:43:31 +03:00
Cebtenzzre
92d0b751a7
convert : fix python 3.8 support, modernize type annotations (#2916)
* convert : fix python 3.8 support

* convert : sort imports

* convert : fix required parameters in convert-llama-ggmlv3-to-gguf

* convert : fix mypy errors in convert-llama-ggmlv3-to-gguf

* convert : use PEP 585 generics and PEP 604 unions

Now that we have `from __future__ import annotations`, we can use this
modern syntax in Python 3.7 instead of restricting support to Python 3.9
or 3.10 respectively.

* gguf.py : a tuple is already a tuple

* add mypy.ini

* convert : add necessary `type: ignore` comments

* gguf-py: bump version
2023-08-31 08:02:23 +03:00
Johannes Gäßler
8afe228000
CUDA: mul_mat_q=true llama_context_params default (#2912) 2023-08-30 21:46:19 +02:00
Henri Vasserman
71d6975559
[Docker] fix tools.sh argument passing. (#2884)
* [Docker] fix tools.sh argument passing.

This should allow passing multiple arguments to containers with
the full image that are using the tools.sh frontend.

Fix from https://github.com/ggerganov/llama.cpp/issues/2535#issuecomment-1697091734
2023-08-30 19:14:53 +03:00
Alon Faraj
9129ba88f8 makefile: add coverage support (lcov, gcovr) 2023-08-30 18:52:59 +03:00
Alon Faraj
f0e1978d38 update .gitignore 2023-08-30 18:51:22 +03:00
Georgi Gerganov
b532a69b2f
convert.py : use dir name to name the llama 2023-08-30 13:29:40 +03:00
Georgi Gerganov
c90d135eb4
examples : fix underscore in beam-search + .gitignore (close #2900) 2023-08-30 12:53:24 +03:00