Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c083718c89 
								
							 
						 
						
							
							
								
								readme : update coding guidelines  
							
							
							
						 
						
							2023-12-21 19:27:14 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b1306c4394 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-12-17 20:16:23 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									BarfingLemurs 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0353a18401 
								
							 
						 
						
							
							
								
								readme : update supported model list (#4457)
							
							
							
						 
						
							2023-12-14 09:38:49 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								113f9942fc 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-12-13 14:05:38 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bcc0eb4591 
								
							 
						 
						
							
							
								
								llama : per-layer KV cache + quantum K cache (#4309)
							
							
							
							
							* per-layer KV
* remove unnecessary copies
* less code duplication, offload k and v separately
* llama : offload KV cache per-layer
* llama : offload K shift tensors
* llama : offload for rest of the model arches
* llama : enable offload debug temporarily
* llama : keep the KV related layers on the device
* llama : remove mirrors, perform Device -> Host when partial offload
* common : add command-line arg to disable KV cache offloading
* llama : update session save/load
* llama : support quantum K cache (#4312)
* llama : support quantum K cache (wip)
* metal : add F32 -> Q8_0 copy kernel
* cuda : add F32 -> Q8_0 copy kernel
ggml-ci
* cuda : use mmv kernel for quantum cache ops
* llama : pass KV cache type through API
* llama : fix build
ggml-ci
* metal : add F32 -> Q4_0 copy kernel
* metal : add F32 -> Q4_1 copy kernel
* cuda : wip
* cuda : add F32 -> Q4_0 and F32 -> Q4_1 copy kernels
* llama-bench : support type_k/type_v
* metal : use mm kernel only for quantum KV cache
* cuda : add comment
* llama : remove memory_f16 and kv_f16 flags
---------
Co-authored-by: slaren <slarengh@gmail.com>
* readme : add API change notice
---------
Co-authored-by: slaren <slarengh@gmail.com> 
							
						 
						
							2023-12-07 13:03:17 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									vodkaslime 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								524907aa76 
								
							 
						 
						
							
							
								
								readme : fix (#4135)
							
							
							
							
							* fix: readme
* chore: resolve comments
* chore: resolve comments 
							
						 
						
							2023-11-30 23:49:21 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Dawid Wysocki 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								74daabae69 
								
							 
						 
						
							
							
								
								readme : fix typo (#4253)
							
							
							
							
							llama.cpp uses GitHub Actions, not Gitlab Actions. 
							
						 
						
							2023-11-30 23:43:32 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Peter Sugihara 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4fea3420ee 
								
							 
						 
						
							
							
								
								readme : add FreeChat (#4248)
							
							
							
						 
						
							2023-11-29 09:16:34 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Kasumi 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0dab8cd7cc 
								
							 
						 
						
							
							
								
								readme : add Amica to UI list (#4230)
							
							
							
						 
						
							2023-11-27 19:39:42 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9656026b53 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-11-26 20:42:51 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								04814e718e 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-11-25 12:02:13 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Aaryaman Vasishta 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b35f3d0def 
								
							 
						 
						
							
							
								
								readme : use PATH for Windows ROCm (#4195)
							
							
							
							
							* Update README.md to use PATH for Windows ROCm
* Update README.md
* Update README.md 
							
						 
						
							2023-11-24 09:52:39 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d103d935c0 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-11-23 13:51:22 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Aaryaman Vasishta 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dfc7cd48b1 
								
							 
						 
						
							
							
								
								readme : update ROCm Windows instructions (#4122)
							
							
							
							
							* Update README.md
* Update README.md
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
---------
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com> 
							
						 
						
							2023-11-20 17:02:46 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Galunid 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								36eed0c42c 
								
							 
						 
						
							
							
								
								stablelm : StableLM support (#3586)
							
							
							
							
							* Add support for stablelm-3b-4e1t
* Supports GPU offloading of (n-1) layers 
							
						 
						
							2023-11-14 11:17:12 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c049b37d7b 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-11-13 14:18:08 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Richard Kiss 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								532dd74e38 
								
							 
						 
						
							
							
								
								Fix some documentation typos/grammar mistakes (#4032)
							
							
							
							
							* typos
* Update examples/parallel/README.md
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
---------
Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> 
							
						 
						
							2023-11-11 23:04:58 -07:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								224e7d5b14 
								
							 
						 
						
							
							
								
								readme : add notice about #3912
							
							
							
						 
						
							2023-11-02 20:44:12 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ian Scrivener 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5a42a5f8e8 
								
							 
						 
						
							
							
								
								readme : remove unsupported node.js library (#3703)
							
							
							
							
							- https://github.com/Atome-FE/llama-node  is quite out of date
- doesn't support recent/current llama.cpp functionality 
							
						 
						
							2023-10-22 21:16:43 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d1031cf49c 
								
							 
						 
						
							
							
								
								sampling : refactor init to use llama_sampling_params (#3696)
							
							
							
							
							* sampling : refactor init to use llama_sampling_params
* llama : combine repetition, frequency and presence penalties in 1 call
* examples : remove embd-input and gptneox-wip
* sampling : rename penalty params + reduce size of "prev" vector
* sampling : add llama_sampling_print helper
* sampling : hide prev behind API and apply #3661 
ggml-ci 
							
						 
						
							2023-10-20 21:07:23 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								004797f6ac 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-10-18 21:44:43 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									BarfingLemurs 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8402566a7c 
								
							 
						 
						
							
							
								
								readme : update hot-topics & models, detail windows release in usage (#3615)
							
							
							
							
							* Update README.md
* Update README.md
* Update README.md
* move "Running on Windows" section below "Prepare data and run"
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-10-17 21:13:21 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ldwang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5fe268a4d9 
								
							 
						 
						
							
							
								
								readme : add Aquila2 links (#3610)
							
							
							
							
							Signed-off-by: ldwang <ftgreat@gmail.com>
Co-authored-by: ldwang <ftgreat@gmail.com> 
							
						 
						
							2023-10-17 18:52:33 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ian Scrivener 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f3040beaab 
								
							 
						 
						
							
							
								
								typo : it is --n-gpu-layers not --gpu-layers (#3592)
							
							
							
							
							Fixed a typo in the macOS Metal run documentation.
							
						 
						
							2023-10-12 14:10:50 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Galunid 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9f6ede19f3 
								
							 
						 
						
							
							
								
								Add MPT model to supported models in README.md (#3574)
							
							
							
						 
						
							2023-10-10 19:02:49 -04:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xingchen Song(宋星辰) 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c5b49360d0 
								
							 
						 
						
							
							
								
								readme : add bloom (#3570)
							
							
							
						 
						
							2023-10-10 19:28:50 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									BarfingLemurs 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1faaae8c2b 
								
							 
						 
						
							
							
								
								readme : update models, cuda + ppl instructions (#3510)
							
							
							
						 
						
							2023-10-06 22:13:36 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								beabc8cfb0 
								
							 
						 
						
							
							
								
								readme : add project status link  
							
							
							
						 
						
							2023-10-04 16:50:44 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								40e07a60f9 
								
							 
						 
						
							
							
								
								llama.cpp : add documentation about rope_freq_base and scale values (#3401)
							
							
							
							
							* llama.cpp : add documentation about rope_freq_base and scale values
* add notice to hot topics 
							
						 
						
							2023-09-29 18:42:32 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									BarfingLemurs 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0a4a4a0982 
								
							 
						 
						
							
							
								
								readme : update hot topics + model links (#3399)
							
							
							
						 
						
							2023-09-29 15:50:35 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Andrew Duffy 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								569550df20 
								
							 
						 
						
							
							
								
								readme : add link to grammars app (#3388)
							
							
							
							
							* Add link to grammars app per @ggerganov suggestion
Adding a sentence in the Grammars section of README to point to grammar app, per https://github.com/ggerganov/llama.cpp/discussions/2494#discussioncomment-7138211 
* Update README.md 
							
						 
						
							2023-09-29 14:15:57 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Pierre Alexandre SCHEMBRI 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4aea3b846e 
								
							 
						 
						
							
							
								
								readme : add Mistral AI release 0.1 (#3362)
							
							
							
						 
						
							2023-09-28 15:13:37 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									BarfingLemurs 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ffe88a36a9 
								
							 
						 
						
							
							
								
								readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340)
							
							
							
							
							* Update README.md
* Update README.md
* Update README.md with k-quants bpw measurements 
							
						 
						
							2023-09-27 18:30:36 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									2f38b454 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1726f9626f 
								
							 
						 
						
							
							
								
								docs: Fix typo CLBlast_DIR var. (#3330)
							
							
							
						 
						
							2023-09-25 20:24:52 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Lee Drake 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bc9d3e3971 
								
							 
						 
						
							
							
								
								Update README.md (#3289)
							
							
							
							
							* Update README.md
* Update README.md
Co-authored-by: slaren <slarengh@gmail.com>
---------
Co-authored-by: slaren <slarengh@gmail.com> 
							
						 
						
							2023-09-21 21:00:24 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7eb41179ed 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-09-20 20:48:22 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								111163e246 
								
							 
						 
						
							
							
								
								CUDA: enable peer access between devices (#2470)
							
							
							
						 
						
							2023-09-17 16:37:53 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									dylan 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								980ab41afb 
								
							 
						 
						
							
							
								
								docker : add gpu image CI builds (#3103)
							
							
							
							
							Enables the GPU enabled container images to be built and pushed
alongside the CPU containers.
Co-authored-by: canardleteer <eris.has.a.dad+github@gmail.com> 
							
						 
						
							2023-09-14 19:47:00 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ikko Eltociear Ashimine 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7d99aca759 
								
							 
						 
						
							
							
								
								readme : fix typo (#3043)
							
							
							
							
							* readme : fix typo
acceleation -> acceleration
* Update README.md
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-09-08 19:04:32 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								94f10b91ed 
								
							 
						 
						
							
							
								
								readme : update hot topics
							
							
							
						 
						
							2023-09-08 18:18:04 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Yui 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6ff712a6d1 
								
							 
						 
						
							
							
								
								Update deprecated GGML TheBloke links to GGUF (#3079)
							
							
							
						 
						
							2023-09-08 12:32:55 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e36ecdccc8 
								
							 
						 
						
							
							
								
								build : on Mac OS enable Metal by default (#2901)
							
							
							
							
							* build : on Mac OS enable Metal by default
* make : try to fix build on Linux
* make : move targets back to the top
* make : fix target clean
* llama : enable GPU inference by default with Metal
* llama : fix vocab_only logic when GPU is enabled
* common : better `n_gpu_layers` assignment
* readme : update Metal instructions
* make : fix merge conflict remnants
* gitignore : metal 
							
						 
						
							2023-09-04 22:26:24 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Ido S 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								340af42f09 
								
							 
						 
						
							
							
								
								docs : add catai to README.md (#2967)
							
							
							
						 
						
							2023-09-03 08:50:51 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									bandoti 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								52315a4216 
								
							 
						 
						
							
							
								
								readme : update clblast instructions (#2903)
							
							
							
							
							* Update Windows CLBlast instructions
* Update Windows CLBlast instructions
* Remove trailing whitespace 
							
						 
						
							2023-09-02 15:53:18 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Konstantin Herud 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								49bb9cbe0f 
								
							 
						 
						
							
							
								
								docs : add java-llama.cpp to README.md (#2935)
							
							
							
						 
						
							2023-09-01 16:36:14 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Gilad S 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								35092fb547 
								
							 
						 
						
							
							
								
								docs : add node-llama-cpp to README.md (#2885)
							
							
							
						 
						
							2023-08-30 11:40:12 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c03a243abf 
								
							 
						 
						
							
							
								
								remove outdated references to -eps and -gqa from README (#2881)
							
							
							
						 
						
							2023-08-29 23:17:34 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Jhen-Jie Hong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								74e0caeb82 
								
							 
						 
						
							
							
								
								readme : add react-native binding (#2869)
							
							
							
						 
						
							2023-08-29 12:30:10 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								da7455d046 
								
							 
						 
						
							
							
								
								readme : fix headings  
							
							
							
						 
						
							2023-08-27 15:52:34 +03:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c48c5bb0b0 
								
							 
						 
						
							
							
								
								readme : update hot topics  
							
							
							
						 
						
							2023-08-27 14:44:35 +03:00