Przemysław Pawełczyk 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fec2fb19e4 
								
							 
						 
						
							
							
								
								ggml : posixify madvise and pagesize ( #3037 )  
							
							
							
							
							* llama : use posix_madvise() instead of madvise() derived from BSD
sed -i 's,\<madvise\>,posix_&,g;s,\<MADV_,POSIX_&,g' llama.cpp
* ggml : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD
sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml.c
* metal : use sysconf(_SC_PAGESIZE) instead of getpagesize() derived from BSD
sed -i 's,getpagesize(),sysconf(_SC_PAGESIZE),g' ggml-metal.m 
							
						 
						
							2023-09-07 11:15:06 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								178b1850eb 
								
							 
						 
						
							
							
								
								k-quants : fix zero-weight guard in Q6_K (ref  #3040 )  
							
							
							
						 
						
							2023-09-06 12:40:57 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kerfuffle 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ea2c85d5d2 
								
							 
						 
						
							
							
								
								convert-llama-ggml-to-gguf: Try to handle files older than GGJTv3 ( #3023 )  
							
							
							
							
							* convert-llama-ggmlv3-to-gguf: Try to handle files older than GGJTv3
* Better error messages for files that cannot be converted
* Add file type to GGUF output
* Rename to convert-llama-ggml-to-gguf.py
* Include original file type information in description
* Improve some informational output 
							
						 
						
							2023-09-06 02:49:11 -06:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9912b9efc8 
								
							 
						 
						
							
							
								
								build : add LLAMA_METAL_NDEBUG flag ( #3033 )  
							
							
							
						 
						
							2023-09-05 18:21:10 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e2023156e 
								
							 
						 
						
							
							
								
								make : use new flag variables for recent changes ( #3019 )  
							
							
							
						 
						
							2023-09-05 15:12:00 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								de2fe892af 
								
							 
						 
						
							
							
								
								examples : replace fprintf to stdout with printf ( #3017 )  
							
							
							
						 
						
							2023-09-05 15:10:27 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Erik Scholz 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c9c3220c48 
								
							 
						 
						
							
							
								
								convert: fix convert.py not working with int filename_stem ( #3028 )  
							
							
							
							
							* fix implicit int to string conversion
* convert : remove an obsolete pyright comment
---------
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com> 
							
						 
						
							2023-09-05 19:41:00 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kawrakow 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d59bd97065 
								
							 
						 
						
							
							
								
								Guard against all weights in a super-block being zero ( #3010 )  
							
							
							
							
							* Guard against all weights in a super-block being zero
* Also guard against extremely small weights
Closes  #2982  
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com> 
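
The kind of guard described in this commit can be illustrated with a small sketch. This is not the k-quants code: the function name `guarded_weighted_mean`, the fallback to an unweighted mean, and the threshold value are all assumptions chosen for illustration, and weights are assumed to be non-negative.

```c
#include <stddef.h>

/* Weighted mean of x[0..n) using weights w[0..n).
 * If the total weight is zero or extremely small, dividing by it would
 * produce NaN/Inf or a numerically meaningless result, so fall back to
 * the plain (unweighted) mean. The threshold is illustrative only. */
float guarded_weighted_mean(const float *x, const float *w, size_t n) {
    if (n == 0) {
        return 0.0f;
    }
    float sum_w = 0.0f, sum_wx = 0.0f, sum_x = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum_w  += w[i];
        sum_wx += w[i] * x[i];
        sum_x  += x[i];
    }
    const float min_total_weight = 1e-30f; /* assumed cutoff */
    if (sum_w < min_total_weight) {
        return sum_x / (float)n;
    }
    return sum_wx / sum_w;
}
```

With all-zero weights the function returns the plain mean rather than dividing by zero, the failure mode the commit title refers to.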
							
						 
						
							2023-09-05 09:55:33 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								35938ee3b0 
								
							 
						 
						
							
							
								
								llama : update logic for number of threads when using BLAS  
							
							
							
						 
						
							2023-09-05 10:46:39 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								921772104b 
								
							 
						 
						
							
							
								
								speculative : add grammar support ( #2991 )  
							
							
							
							
							* speculative : add grammar support
* grammars : add json_arr.gbnf
* grammar : add comments to new grammar file
* grammar : remove one nested level
* common : warm-up with 2 tokens - seems to work better
* speculative : print draft token pieces
* speculative : reuse grammar parser + better logs and comments
* speculative : avoid grammar_mem
* make : fix speculative build 
							
						 
						
							2023-09-05 08:46:17 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2ba85c8609 
								
							 
						 
						
							
							
								
								py : minor  
							
							
							
						 
						
							2023-09-04 22:50:50 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e36ecdccc8 
								
							 
						 
						
							
							
								
								build : on Mac OS enable Metal by default ( #2901 )  
							
							
							
							
							* build : on Mac OS enable Metal by default
* make : try to fix build on Linux
* make : move targets back to the top
* make : fix target clean
* llama : enable GPU inference by default with Metal
* llama : fix vocab_only logic when GPU is enabled
* common : better `n_gpu_layers` assignment
* readme : update Metal instructions
* make : fix merge conflict remnants
* gitignore : metal 
							
						 
						
							2023-09-04 22:26:24 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bd33e5ab92 
								
							 
						 
						
							
							
								
								ggml-opencl : store GPU buffer in ggml_tensor::extra ( #2994 )  
							
							
							
						 
						
							2023-09-04 14:59:52 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3103568144 
								
							 
						 
						
							
							
								
								llama-bench : make cpp file non-executable ( #2999 )  
							
							
							
						 
						
							2023-09-04 13:40:18 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Leng Yue 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5b8530d88c 
								
							 
						 
						
							
							
								
								make : add speculative example ( #3003 )  
							
							
							
						 
						
							2023-09-04 13:39:57 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Aarni Koskela 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e4386f417f 
								
							 
						 
						
							
							
								
								server : add a subtle loading animation to the edit box ( #2466 )  
							
							
							
							
							* editorconfig: add override for the server HTML (which already is 2-space indented)
* server: add a subtle loading animation to the edit box 
							
						 
						
							2023-09-04 16:28:55 +08:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Jiahao Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								35195689cd 
								
							 
						 
						
							
							
								
								2x faster (rms) norm cuda kernels (3.7% e2e improvement) ( #2985 )  
							
							
							
							
							* 2x faster (rms) norm cuda kernels
* Fix code style 
							
						 
						
							2023-09-04 08:53:30 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cf9b08485c 
								
							 
						 
						
							
							
								
								ggml-alloc : use virtual memory for measurement ( #2973 )  
							
							
							
							
							* ggml-alloc : use virtual memory for measurement
* compatibility fixes for MAP_ANONYMOUS
* fallback to fixed address for systems without virtual memory 
							
						 
						
							2023-09-03 20:34:09 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47068e5170 
								
							 
						 
						
							
							
								
								speculative : PoC for speeding-up inference via speculative sampling ( #2926 )  
							
							
							
							
							* speculative : initial example
* speculative : print encoding speed
* speculative : add --draft CLI arg 
							
						 
						
							2023-09-03 15:12:08 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8f429fa511 
								
							 
						 
						
							
							
								
								perplexity : fix ETA by warming up the model with an empty run  
							
							
							
						 
						
							2023-09-03 13:43:17 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kerfuffle 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6519e9c99c 
								
							 
						 
						
							
							
								
								gguf(python): Fix special vocab handling when id < 0 ( #2984 )  
							
							
							
						 
						
							2023-09-03 04:38:43 -06:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b7f2aa9e51 
								
							 
						 
						
							
							
								
								metal : restore  363f0bf and fix reduce in F16_F32 kernels ( #2986 )  
							
							
							
						 
						
							2023-09-03 13:23:33 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Alon 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73a12a6344 
								
							 
						 
						
							
							
								
								cov : disable comment in PRs ( #2989 )  
							
							
							
						 
						
							2023-09-03 13:19:01 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									opparco 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3730134776 
								
							 
						 
						
							
							
								
								llama : fix bpe tokenize from byte ( #2889 )  
							
							
							
						 
						
							2023-09-03 13:18:09 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d9151e6f57 
								
							 
						 
						
							
							
								
								metal : revert  6af0bab until we fix it  
							
							
							
							
							This restores the generated text to be the same as before #2959  
							
						 
						
							2023-09-03 12:40:56 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Alon 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								afc43d5f82 
								
							 
						 
						
							
							
								
								cov : add Code Coverage and codecov.io integration ( #2928 )  
							
							
							
							
							* update .gitignore
* makefile: add coverage support (lcov, gcovr)
* add code-coverage workflow
* update code coverage workflow
* run on ubuntu 20.04
* use gcc-8
* check why the job hang
* add env vars
* add LLAMA_CODE_COVERAGE=1 again
* - add CODECOV_TOKEN
- add missing make lcov-report
* install lcov
* update make file -pb flag
* remove unused  GGML_NITER from workflows
* wrap coverage output files in COV_TARGETS 
							
						 
						
							2023-09-03 11:48:49 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Wentai Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6460f758db 
								
							 
						 
						
							
							
								
								opencl : fix a bug in ggml_cl_pool_malloc() for ggml_cl_mul_mat_f32() ( #2955 )  
							
							
							
							
							Co-authored-by: Wentai Zhang <wentaizhang@tencent.com> 
							
						 
						
							2023-09-03 11:46:44 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kawrakow 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ca82cf7bac 
								
							 
						 
						
							
							
								
								metal : more optimizations ( #2959 )  
							
							
							
							
							* Very minor speedup via simd-group synchronization in f16 x f32
* Another very minor speedup on metal
* Quite significant PP speedup on metal
* Another attempt
* Minor
* Massive improvement for TG for fp16
* ~4-5% improvement for Q8_0 TG on metal
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-09-03 11:06:22 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									kchro3 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6a31a3bd98 
								
							 
						 
						
							
							
								
								swift : add support for k-quants ( #2983 )  
							
							
							
						 
						
							2023-09-03 09:21:05 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kerfuffle 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cff7b0bf07 
								
							 
						 
						
							
							
								
								convert.py : BPE fixes ( #2938 )  
							
							
							
							
							* convert.py: BPE fixes?
* Remove unnecessary conditional in addl token error handling 
							
						 
						
							2023-09-03 08:52:13 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ido S 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								340af42f09 
								
							 
						 
						
							
							
								
								docs : add catai to README.md ( #2967 )  
							
							
							
						 
						
							2023-09-03 08:50:51 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									momonga 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c42f0ec6b3 
								
							 
						 
						
							
							
								
								examples : fix gpt-neox ( #2943 )  
							
							
							
							
							Co-authored-by: mmnga <mmnga1mmnga@gmail.com> 
							
						 
						
							2023-09-03 08:36:28 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									kchro3 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2753415afd 
								
							 
						 
						
							
							
								
								swift : add missing c file to Package.swift ( #2978 )  
							
							
							
						 
						
							2023-09-03 08:27:25 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bc054af97a 
								
							 
						 
						
							
							
								
								make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS ( #2886 )  
							
							
							
							
							* make : remove unused -DGGML_BIG_ENDIAN
* make : put preprocessor stuff in CPPFLAGS
* make : pass Raspberry Pi arch flags to g++ as well
* make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS
* make : fix inverted conditional 
							
						 
						
							2023-09-03 08:26:59 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kerfuffle 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3358c381f6 
								
							 
						 
						
							
							
								
								logging: Fix creating empty file even when disabled ( #2966 )  
							
							
							
							
							* logging: Fix creating empty file even when disabled
* Minor formatting fix
Co-authored-by: staviq <staviq@gmail.com>
---------
Co-authored-by: staviq <staviq@gmail.com> 
							
						 
						
							2023-09-02 11:53:55 -06:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									bandoti 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								52315a4216 
								
							 
						 
						
							
							
								
								readme : update clblast instructions ( #2903 )  
							
							
							
							
							* Update Windows CLBlast instructions
* Update Windows CLBlast instructions
* Remove trailing whitespace 
							
						 
						
							2023-09-02 15:53:18 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Karsten Weiss 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8b56b4f2c3 
								
							 
						 
						
							
							
								
								metal : show all Metal device instances in the system ( #2952 )  
							
							
							
							
							* ggml_metal_init: Show all Metal device instances in the system
Also show the default Metal device that was picked.
* Update ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-09-02 15:29:09 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Jhen-Jie Hong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								21f3d1be86 
								
							 
						 
						
							
							
								
								k-quants : fix build on armv7 (android only) ( #2920 )  
							
							
							
							
							* k-quants : fix build on armv7
* ggml : cleanup unused arm32 specific impl
* k-quants : avoid some unused vzero / mzero define
* ggml-alloc : use 4g for MEASURE_MAX_SIZE in 32-bit arm 
							
						 
						
							2023-09-02 15:23:45 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Jhen-Jie Hong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								571083f508 
								
							 
						 
						
							
							
								
								server : avoid antiprompt in probabilities of final response ( #2849 )  
							
							
							
						 
						
							2023-09-02 08:31:46 +08:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Engininja2 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f04d002844 
								
							 
						 
						
							
							
								
								cuda : vsubss4 for older versions of ROCm/clang ( #2942 )  
							
							
							
						 
						
							2023-09-01 23:33:19 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ZHAOKAI WANG 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								69fdbb9abc 
								
							 
						 
						
							
							
								
								readme : quick start command fix ( #2908 )  
							
							
							
							
							* quick start command fix
* quick start win command fix 
							
						 
						
							2023-09-01 17:06:44 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Kerfuffle 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5d6f19f16b 
								
							 
						 
						
							
							
								
								Allow quantize to only copy tensors, some other improvements ( #2931 )  
							
							
							
							
							* Allow quantize tool to only copy tensors to allow repackaging models.
* Slightly better logic when requantizing.
* Change help message to go to `stdout`. 
							
						 
						
							2023-09-01 08:02:48 -06:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0d58936686 
								
							 
						 
						
							
							
								
								llama2c : rename function  
							
							
							
						 
						
							2023-09-01 17:01:11 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6c9c23429b 
								
							 
						 
						
							
							
								
								make : use unaligned vector moves on MinGW ( #2945 )  
							
							
							
							
							Fixes  #2922  
						
							2023-09-01 16:53:14 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									m3ndax 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ee8654bcd0 
								
							 
						 
						
							
							
								
								minor : add const qualifiers ( #2853 )  
							
							
							
							
							* made the methods const
# Conflicts:
#	examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
* made method const
* Update convert-llama2c-to-ggml.cpp
removed write_raw and write_u32
* llama2c : remove misleading const
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-09-01 16:47:27 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Konstantin Herud 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								49bb9cbe0f 
								
							 
						 
						
							
							
								
								docs : add java-llama.cpp to README.md ( #2935 )  
							
							
							
						 
						
							2023-09-01 16:36:14 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Cebtenzzre 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ef15649972 
								
							 
						 
						
							
							
								
								build : fix most gcc and clang warnings ( #2861 )  
							
							
							
							
							* fix most gcc and clang warnings
* baby-llama : remove commented opt_params_adam
* fix some MinGW warnings
* fix more MinGW warnings 
							
						 
						
							2023-09-01 16:34:50 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ben Siraphob 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d8d6977f48 
								
							 
						 
						
							
							
								
								examples : add C grammar ( #2357 )  
							
							
							
						 
						
							2023-09-01 16:32:14 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Tameem 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5aec2cfaac 
								
							 
						 
						
							
							
								
								ggml : add RISC-V vector intrinsics support ( #2929 )  
							
							
							
							
							* added support for RISCV CFLAGS & native compile + cross compile options
* Add RISC-V Vector Intrinsics Support
Added RVV intrinsics for following
   ggml_vec_dot_q4_0_q8_0
   ggml_vec_dot_q4_1_q8_1
   ggml_vec_dot_q5_0_q8_0
   ggml_vec_dot_q5_1_q8_1
   ggml_vec_dot_q8_0_q8_0
Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai>
Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>
---------
Signed-off-by: Ahmad Tameem <ahmad.tameem@10xengineers.ai>
Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
Co-authored-by: Sharafat <sharafat.hussain@10xengineers.ai> 
							
						 
						
							2023-09-01 16:27:40 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								13268c5331 
								
							 
						 
						
							
							
								
								metal : slight speed-up for add and mul kernels ( #2917 )  
							
							
							
						 
						
							2023-09-01 13:42:41 +03:00