Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f048af0230 
								
							 
						 
						
							
							
								
								ggml : sync alibi fix from ggml repo  
							
							
							
						 
						
							2023-05-13 11:54:33 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									3ooabkhxtn 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ac0cd259d5 
								
							 
						 
						
							
							
								
								Adding SSE instructions to ggml_vec_dot_q4_0_q8_0 ( #1413 )  
							
							
							
						 
						
							2023-05-13 08:43:33 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0cd22e190a 
								
							 
						 
						
							
							
								
								llama : fix various warnings  
							
							
							
						 
						
							2023-05-13 11:23:15 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Rinne 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6456a4eb9f 
								
							 
						 
						
							
							
								
								embedding : remove unused code ( #1426 )  
							
							
							
						 
						
							2023-05-13 10:24:20 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cdd5350892 
								
							 
						 
						
							
							
								
								readme : update Q4_0 perplexities  
							
							
							
							
							I think these were affected by the removal of the `round` during quantization 
							
						 
						
							2023-05-13 09:12:44 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								738ace394a 
								
							 
						 
						
							
							
								
								llama : free ggml context in set / copy state data ( close #1425 )
							
							
							
						 
						
							2023-05-13 09:08:52 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Henri Vasserman 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								699b1ad7fe 
								
							 
						 
						
							
							
								
								opencl : fix kernels for the new formats ( #1422 )  
							
							
							
							
							* Fix OpenCL kernels for the new formats
* Fix Q5_0 alignment issues. 
							
						 
						
							2023-05-13 09:01:15 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fb62f92433 
								
							 
						 
						
							
							
								
								llama : fix --mtest option ( close #1414 )
							
							
							
						 
						
							2023-05-12 21:44:20 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								773ee249fb 
								
							 
						 
						
							
							
								
								CLI args use - instead of _, backwards compatible ( #1416 )  
							
							
							
						 
						
							2023-05-12 14:34:55 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								553fd4d4b5 
								
							 
						 
						
							
							
								
								Add clang-tidy reviews to CI ( #1407 )  
							
							
							
						 
						
							2023-05-12 15:40:53 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Rinne 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								089b1c93ba 
								
							 
						 
						
							
							
								
								readme : add C#/.NET bindings repo ( #1409 )  
							
							
							
						 
						
							2023-05-12 08:39:40 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b9fd7eee57 
								
							 
						 
						
							
							
								
								ggml : remove bit shuffling ( #1405 )  
							
							
							
							
							* ggml : remove Q4_0 bit shuffling (ARM NEON)
* ggml : remove Q4_1 bit shuffling (ARM NEON + reference)
* ggml : nibbles_from_floats() + bytes_from_nibbles() (ARM NEON)
* ggml : remove Q4_2 bit shuffling (WIP, BROKEN)
* ggml : remove Q5_0 bit shuffling (ARM NEON)
* ggml : 2x faster scalar implementations
* ggml : remove Q5_1 bit shuffling (ARM NEON + scalar)
* ggml : simplify scalar dot
* ggml : remove WASM SIMD bit shuffling + remove vzip for ARM 32-bit
* ggml : fix Q4_1 quantization
* ggml : update cuBLAS + normalize variable names
* ggml : remove Q4_2 mode
* ggml : minor formatting
* ggml : fix Q5_0 quantization
* scripts : add script for measuring the time per token
* AVX implementations (#1370 )
* ggml : uniform 5th bit extraction
* llama : produce error upon loading old model files
* llama : fix model magic/version write
* ggml : speed-up Q5_0 + Q5_1 at 4 threads
* ggml : preserve old Q4 and Q5 formats
* ggml : simplify Q8_1 - no need for low / high sums anymore
* ggml : fix Q8_0 and Q8_1 rounding
* Revert "AVX implementations (#1370 )"
This reverts commit 948d124837 
							
						 
						
							2023-05-12 00:23:08 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									CRD716 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b608b55a3e 
								
							 
						 
						
							
							
								
								prompts : model agnostic DAN ( #1304 )  
							
							
							
							
							* add model-agnostic dan prompt
* quick readme update
* save a token
* Revert "quick readme update"
This reverts commit 8dc342c069 
							
						 
						
							2023-05-11 18:10:19 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Evan Jones 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cf348a60e0 
								
							 
						 
						
							
							
								
								main : add option to save full output to session ( #1338 )  
							
							
							
							
							* main : add option to save full output to session
* split behavior into --session and --prompt-cache
* restore original implementation with new names
* PR comments
* move the check for incompatible parameters to gpt_params_parse
* Fix whitespace
Co-authored-by: DannyDaemonic <DannyDaemonic@gmail.com> 
							
						 
						
							2023-05-10 11:37:14 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e6a46b0ed1 
								
							 
						 
						
							
							
								
								Locale fix for Windows ( #1379 )  
							
							
							
						 
						
							2023-05-09 19:53:28 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Sami Farin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9f8dbc4787 
								
							 
						 
						
							
							
								
								use pause asm insn in busyloop to run the CPU (13600K) 10 °C cooler ( #1314 )  
							
							
							
							
							* use pause asm insn in busyloop to run the CPU (13600K) 10 °C cooler
Tested with a 13B model.
* use _mm_pause() in busyloop
* use _mm_pause() in busyloop on x86_64 to reduce power consumption 
							
						 
						
							2023-05-09 14:29:20 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								41654efea8 
								
							 
						 
						
							
							
								
								Interface improvements and --multiline-input (previously --author-mode) ( #1040 )  
							
							
							
							
							* Interface improvements
* Multiline input
* Track character width
* Works with all characters and control codes + Windows console fixes 
							
						 
						
							2023-05-08 19:45:48 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								56551bc11f 
								
							 
						 
						
							
							
								
								readme : add notice about upcoming breaking change  
							
							
							
						 
						
							2023-05-08 22:52:18 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									AlpinDale 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fe60904eef 
								
							 
						 
						
							
							
								
								readme : add TOC and Pygmalion instructions ( #1359 )  
							
							
							
						 
						
							2023-05-08 19:33:30 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Pavol Rusnak 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								003ba2fb43 
								
							 
						 
						
							
							
								
								llama : fix hparams shadow ( #1367 )  
							
							
							
							
							fixes  #1363  
						
							2023-05-08 17:48:21 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f9a6364912 
								
							 
						 
						
							
							
								
								llama : require first token to be BOS ( #1303 )  
							
							
							
							
							* llama : require first token to be BOS
* scripts : add ppl-run-all.sh
* perplexity : add BOS for each chunk
* readme : update perplexity values after BOS fix
* perplexity : add clarifying comments 
							
						 
						
							2023-05-08 17:41:54 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ubik2 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								95078cc554 
								
							 
						 
						
							
							
								
								convert: add ability to convert safetensors files ( #1276 )  
							
							
							
							
							* when loading a safetensors file, ignore the metadata header
* check for safetensors files first, and only use PyTorch versions when safetensors aren't available 
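
A minimal sketch of the header handling described above, assuming the published safetensors layout: an 8-byte little-endian length, followed by a JSON header whose optional `__metadata__` key holds free-form strings instead of a tensor entry. The function name `read_tensor_index` is hypothetical and is not taken from convert.py.

```python
import json
import struct


def read_tensor_index(path):
    """Return the tensor entries of a safetensors file, without metadata."""
    with open(path, "rb") as f:
        # The first 8 bytes give the size of the JSON header that follows.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" describes the file, not a tensor, so drop it.
    header.pop("__metadata__", None)
    return header
```

Each remaining entry maps a tensor name to its `dtype`, `shape`, and `data_offsets`, which is what a converter needs to locate the raw data after the header.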
							
						 
						
							2023-05-08 13:54:26 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1f48b0abcf 
								
							 
						 
						
							
							
								
								Documented CUDA reproducibility, added warning ( #1346 )  
							
							
							
						 
						
							2023-05-08 02:42:01 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Henri Vasserman 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e1295513a4 
								
							 
						 
						
							
							
								
								CI: add Windows CLBlast and OpenBLAS builds ( #1277 )  
							
							
							
							
							* Add OpenCL and CLBlast support
* Add OpenBLAS support
* Remove testing from matrix
* change build name to 'clblast' 
							
						 
						
							2023-05-07 13:20:09 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									swittk 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1b0fd45465 
								
							 
						 
						
							
							
								
								ggml : Allow usage of CLBlast alongside Accelerate.framework ( #1336 )  
							
							
							
							
							Minor edit in ggml.c which originally would prevent OpenCL from loading completely if GGML_USE_ACCELERATE was defined.
Minor speedup in prompt eval time. 
							
						 
						
							2023-05-06 23:03:23 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Jed Fox 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3924088512 
								
							 
						 
						
							
							
								
								Remove default arguments from sampling functions ( #1343 )  
							
							
							
						 
						
							2023-05-06 17:01:47 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DaniAndTheWeb 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								173d0e6419 
								
							 
						 
						
							
							
								
								makefile: automatic Arch Linux detection ( #1332 )  
							
							
							
							
							This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux 
							
						 
						
							2023-05-05 23:57:14 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Erik Scholz 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a3b85b28da 
								
							 
						 
						
							
							
								
								ci : add cublas to windows release ( #1271 )  
							
							
							
						 
						
							2023-05-05 22:56:09 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Pavol Rusnak 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								921dcee00a 
								
							 
						 
						
							
							
								
								readme: add missing info ( #1324 )  
							
							
							
						 
						
							2023-05-05 16:43:36 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ionoclast Laboratories 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d13786e91 
								
							 
						 
						
							
							
								
								Fix for OpenCL / CLBlast builds on macOS. ( #1329 )
							
							
							
						 
						
							2023-05-05 14:18:21 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Benjamin Lecaillon 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a90e96b266 
								
							 
						 
						
							
							
								
								Convert.py @staticmethod ( #1327 )  
							
							
							
							
							* Line 698 has a stray @staticmethod decorator that should not be there;
otherwise unpickle.load() throws an error because the object is not callable
* Update convert.py
---------
Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com> 
							
						 
						
							2023-05-05 03:17:07 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									slaren 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								94c5652fc0 
								
							 
						 
						
							
							
								
								quantize: make output filename optional, default to ggml-model-<ftype>.bin ( #1301 )  
							
							
							
						 
						
							2023-05-05 00:58:56 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ivan Stepanov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								34d9f22f44 
								
							 
						 
						
							
							
								
								Wrap exceptions in std::exception for verbose output on exception. ( #1316 )
							
							
							
						 
						
							2023-05-04 18:56:27 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ivan Stepanov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d3e8093e9b 
								
							 
						 
						
							
							
								
								convert: support DT_BF16 tensors ( #1309 )  
							
							
							
							
							Co-authored-by: Pavol Rusnak <pavol@rusnak.io> 
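
As a rough illustration of what BF16 support involves, and not the code from convert.py: a bfloat16 value is the upper 16 bits of an IEEE-754 float32, so widening it is a 16-bit left shift. The helper name `bf16_to_float32` is hypothetical.

```python
import struct


def bf16_to_float32(raw):
    """Widen little-endian bfloat16 bytes to a list of Python floats."""
    count = len(raw) // 2
    halves = struct.unpack("<%dH" % count, raw[: count * 2])
    # Place each 16-bit pattern in the high half of a 32-bit word, then
    # reinterpret that word as a float32.
    words = struct.pack("<%dI" % count, *(h << 16 for h in halves))
    return list(struct.unpack("<%df" % count, words))
```

For example, the bit pattern `0x3F80` widens to `0x3F800000`, which is the float32 encoding of 1.0.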
							
						 
						
							2023-05-04 18:54:37 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									44670 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								360cfe5bec 
								
							 
						 
						
							
							
								
								readme : add OpenBuddy link ( #1321 )  
							
							
							
						 
						
							2023-05-04 19:33:31 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									44670 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2edbdb0f99 
								
							 
						 
						
							
							
								
								main : add --in-suffix option ( #1318 )  
							
							
							
							
							* adding --in-suffix option
* print input suffix before generation 
							
						 
						
							2023-05-04 18:41:12 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Ron Jailall 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								20fbf2a2a0 
								
							 
						 
						
							
							
								
								ggml : change immintrin.h to intrin.h for compatibility ( #1307 )  
							
							
							
							
							* change immintrin.h to intrin.h for compatibility
Building on Windows 11 ARM throws an error on this line. It seems that using intrin.h covers both x86 and ARM
* conditional def of intrin.h
* fix typo in ggml.c 
							
						 
						
							2023-05-04 18:05:59 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								db1080876a 
								
							 
						 
						
							
							
								
								Only escape prompts when used with -e ( #1311 )  
							
							
							
						 
						
							2023-05-04 05:08:25 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c65a7fbfa9 
								
							 
						 
						
							
							
								
								Update main's README.md with new features ( #1296 )  
							
							
							
						 
						
							2023-05-04 03:02:59 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Tomas 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f647ce040f 
								
							 
						 
						
							
							
								
								fix #1224 reverse prompt and multi line ( #1297 )
							
							
							
							
							* fix reverse prompt and multi line
* Code Formatting
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2023-05-04 03:02:30 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								799fdc1b5d 
								
							 
						 
						
							
							
								
								ggml : vectorize Q8_0 quantization  
							
							
							
							
							https://github.com/ggerganov/ggml/pull/127#issuecomment-1533648531  
						
							2023-05-03 23:24:20 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									khimaros 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6daa09d879 
								
							 
						 
						
							
							
								
								examples : read chat prompts from a template file ( #1196 )  
							
							
							
						 
						
							2023-05-03 20:58:11 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bca9ad938a 
								
							 
						 
						
							
							
								
								minor : fix whitespaces ( #1302 )  
							
							
							
						 
						
							2023-05-03 20:09:42 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e2a937ca6a 
								
							 
						 
						
							
							
								
								minor : fix trailing whitespaces  
							
							
							
						 
						
							2023-05-03 18:43:23 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									KASR 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b0c71c7b6d 
								
							 
						 
						
							
							
								
								scripts : platform independent script to verify sha256 checksums ( #1203 )  
							
							
							
							
							* python script to verify the checksum of the llama models
Added Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.
* Update README.md
update to the readme for improved readability and to explain the usage of the python checksum verification script
* update the verification script
I've extended the script based on suggestions by @prusnak
The script now checks the available RAM. If there is enough to check the file at once, it does so; if not, the file is read in chunks.
* minor improvement
small change so that the available RAM is checked and not the total RAM
* remove the part of the code that reads the file at once if enough RAM is available
Based on suggestions from @prusnak, I removed the part of the code that checks whether the user has enough RAM to read the entire model at once. The file is now always read in chunks.
* Update verify-checksum-models.py
quick fix to pass the git check 
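
The final approach, always reading the file in chunks, can be sketched as follows. This is an illustration using the standard `hashlib` module, not the contents of verify-checksum-models.py; the function name and the 1 MiB default chunk size are assumptions.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file incrementally so memory use stays at one chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which ends the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the digest is updated piece by piece, the result is identical to hashing the whole file at once, while peak memory no longer depends on the model size.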
							
						 
						
							2023-05-03 18:31:28 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									CRD716 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a8a2efdc81 
								
							 
						 
						
							
							
								
								examples : various prompt and example fixes ( #1298 )  
							
							
							
							
							* fix dan.txt
* miku prompt improvements
* use common characters 
							
						 
						
							2023-05-03 18:26:47 +03:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Evan Jones 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e216aa0463 
								
							 
						 
						
							
							
								
								llama : only copy used KV cache in get / set state ( #1272 )  
							
							
							
							
							* llama : only copy used KV cache in get / set state
* switch to ggml for copying k, v
* avoid designated initializers 
							
						 
						
							2023-05-02 22:26:13 -04:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2485d7a4d3 
								
							 
						 
						
							
							
								
								Process escape sequences given in prompts ( #1173 )  
							
							
							
						 
						
							2023-05-02 18:46:20 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								13b0c68ed7 
								
							 
						 
						
							
							
								
								Handle signals properly on Windows ( #1123 )  
							
							
							
						 
						
							2023-05-02 18:01:57 -07:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									DannyDaemonic 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								55bc5f0900 
								
							 
						 
						
							
							
								
								Call sh on build-info.sh ( #1294 )  
							
							
							
						 
						
							2023-05-02 17:52:35 -07:00