slaren
2a98bc18ea
ggml : add AVX2 implementation of quantize_row_q4_1 (#515)
* Add AVX2 implementation of quantize_row_q4_1
* Actually use AVX2
* Make quantize_row_q4_1 static
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 21:06:03 +03:00
								
thement
d0aaff571c
py : add temporary script to convert old ggml files to newer version (#539)
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
2023-03-28 20:55:42 +03:00
								
Tai Duc Nguyen
d0330fd783
py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403)
2023-03-28 20:51:29 +03:00
								
Stephan Walter
99c5b27654
ggml : refactor quantized processing functions (#509)
* Refactor quantized processing functions
* ggml : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 20:13:01 +03:00
								
DooWoong Lee (David)
692ce3164e
py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547)
2023-03-28 20:02:34 +03:00
								
Georgi Gerganov
96f9c0506f
ci : make ctest verbose, hopefully we see what is wrong with the sanitizer
2023-03-28 20:01:09 +03:00
								
Georgi Gerganov
d502bc7c9d
tests : free llama context at the end of the test
2023-03-28 19:51:55 +03:00
								
Stephan Walter
436e561931
all : be more strict about converting float to double (#458)
* Be more strict about converting float to double
* Test equivalence of round, SILU implementations
  Test module is commented out in CMakeLists.txt because the tests may take a long time, depending on how much the compiler optimizes.
* Fix softmax in perplexity.cpp
* all : prefer float over double where appropriate
* perplexity : add <cmath>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 19:48:20 +03:00
								
Jed Fox
20e1e84884
deploy : add a Package.swift for SwiftPM support (#393)
* Add a Package.swift for SwiftPM support
* Swap from exclusions to allowlist
2023-03-28 19:39:01 +03:00
								
Stephan Walter
c1f885067c
ggml : introduce structs for the q4 data blocks (#356)
* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 18:56:03 +03:00
								
Georgi Gerganov
e0670260fb
gitignore : add "embedding"
2023-03-28 18:34:35 +03:00
								
dotpy314
28ba975aea
Check the existence of f16_model_path_base in quantize.py (#574)
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
2023-03-28 18:06:28 +03:00
								
slaren
a6bdc47cba
Fix usage of F16C intrinsics in AVX code (#563)
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
2023-03-28 17:26:55 +03:00
								
anzz1
7b8dbcb78b
main.cpp fixes, refactoring (#571)
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
2023-03-28 17:09:55 +03:00
								
RJ Adriaansen
4b8efff0e3
Add embedding example to Makefile (#540)
2023-03-28 09:11:09 +03:00
								
Marco Matthies
7e5395575a
Fix missing ggml link in cmake for examples/* on w64-mingw32 (#542)
2023-03-27 07:55:26 +03:00
								
Erik Scholz
34c1072e49
ci: add debug build to sanitizer build matrix (#527)
2023-03-26 15:48:40 +00:00
								
Stephan Walter
939ad2d3a5
Fix undefined variables in debug build, remove unused variables (#531)
2023-03-26 15:34:02 +00:00
								
Juan Calderon-Perez
8c2ec5e21d
Add support for linux/arm64 platform during Docker Builds (#514)
* Add support for linux/arm64 platform
* Add platform to versioned builds
2023-03-26 14:48:42 +00:00
								
Stephan Walter
b391579db9
Update README and comments for standalone perplexity tool (#525)
2023-03-26 16:14:01 +03:00
								
anzz1
7a87d31f4f
[main] fix infinite generation (-n == -1) (#523)
2023-03-26 16:06:10 +03:00
								
Georgi Gerganov
348d6926ee
Add logo to README.md
2023-03-26 10:20:49 +03:00
								
Harald Fernengel
33e35b8fe8
Exit from interactive mode if input stream is bad (#491)
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z on Windows.
2023-03-26 08:25:46 +03:00
								
anzz1
19726169b3
CI: Run other sanitizer builds even if one fails (#511)
Applies only to sanitizer builds, so they won't be cancelled.
2023-03-26 00:13:28 +02:00
								
jp-x-g
f732695cd5
Clarify console output in convert-pth-to-ggml.py (#512)
"Processing part 1 of 3" instead of "Processing part 0"
2023-03-25 23:53:55 +02:00
								
anzz1
2f7bf7dd7c
CMake / CI additions (#497)
* CMake: Add AVX512 option
* CI: Add AVX/AVX512 builds (Windows)
  (AVX512 tests can only be run when the worker happens to support it, building works anyway)
* CMake: Fix sanitizer linkage (merged #468)
* CI: Add sanitizer builds (Ubuntu)
* CI: Fix release tagging
  (change @zendesk/action-create-release to @anzz1/action-create-release until upstream PR zendesk/action-create-release#32, "Added commitish as input", is merged)
2023-03-25 23:38:11 +02:00
								
anzz1
34ab526843
(Windows) Set console to UTF-8 on init (#420)
Sets console codepage to 65001 (CP_UTF8) on start for both input and output, should fix problems with UTF-8 characters.
2023-03-25 22:29:22 +02:00
								
Georgi Gerganov
c2b25b6912
Fix colors enabling on WIN32
2023-03-25 21:53:39 +02:00
								
Georgi Gerganov
79b2b266db
If n_predict == -1, generate forever
2023-03-25 21:51:41 +02:00
								
Georgi Gerganov
e2d490dafd
Infinite generation via context swapping (#71)
2023-03-25 21:36:22 +02:00
								
Georgi Gerganov
03f7e33560
Cleanup STL headers + fix embedding examples + minor stuff
2023-03-25 20:51:14 +02:00
								
Georgi Gerganov
55ad42af84
Move chat scripts into "./examples"
2023-03-25 20:37:09 +02:00
								
slaren
459e93cce0
Add AVX2 implementation of dequantize_row_q4_1 (#505)
2023-03-25 20:31:48 +02:00
								
Georgi Gerganov
a316a425d0
Overhaul the examples structure
- main -> examples
- utils -> examples (renamed to "common")
- quantize -> examples
- separate tools for "perplexity" and "embedding"
Hope I didn't break something!
2023-03-25 20:26:40 +02:00
								
Georgi Gerganov
ecbe466a36
Retire the ggml_mul_mat() branch for transposed src0 (#500)
* Retire the ggml_mul_mat() for transposed src0
- It can always be made contiguous with ggml_cpy()
- The code is now simplified
- The results are deterministic with respect to num threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (#502)
* Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
* Fix dequantization - forgot to interleave the quants
2023-03-25 19:47:21 +02:00
								
Georgi Gerganov
502a400192
Disable prompt verbosity by default and add option to enable (#480)
2023-03-25 17:17:16 +02:00
								
slaren
09aecbf628
Add AVX2 implementation of dequantize_row_q4_0 (#467)
2023-03-25 17:06:49 +02:00
								
Georgi Gerganov
4640eff23d
Don't interfere with BLAS for large prompts by running only 1 thread
2023-03-25 17:03:10 +02:00
								
Georgi Gerganov
ab77d76312
Add longer DAN prompt for testing big batch numbers
2023-03-25 16:49:09 +02:00
								
slaren
29b7baab67
Add timings for the prompt evaluation (#478)
2023-03-25 16:34:23 +02:00
								
Georgi Gerganov
4a7129acd2
Remove obsolete information from README
2023-03-25 16:30:32 +02:00
								
Georgi Gerganov
6b6dbc8910
Remove obsolete assert and fix compiler warning
2023-03-25 16:22:05 +02:00
								
Georgi Gerganov
2a2e63ce05
Fix nasty bug in ggml_compute_forward_mul_mat_f32() and reenable BLAS
2023-03-25 16:10:14 +02:00
								
anzz1
e899bf54b2
bounds checking for input prefix (#492)
2023-03-25 14:42:09 +02:00
								
anzz1
fbd4d38c64
feat: '--in-prefix STRING' option (#426)
Prefix user inputs with a string
2023-03-25 14:03:19 +02:00
								
Jed Fox
58e6c9f36f
Add support for file load progress reporting callbacks (#434)
* File load progress reporting
* Move llama_progress_handler into llama_context_params
* Renames
* Use seekg to find file size instead
* More correct load progress
* Call progress callback more frequently
* Fix typo
2023-03-25 07:26:28 +02:00
								
Doomsdayrs
36d07532ef
Add missing struct annotation (#483)
`llama_sample_top_p_top_k` was missing the struct annotation on line 126.
This causes a compiler issue when being parsed by the Kotlin C interop generator.
This commit fixes the above issue by adding the struct annotation.
2023-03-25 07:21:24 +02:00
								
Chris Kuehl
6f1ee4b640
Fix crash for 65B model with pre-allocated memory (#485)
2023-03-25 06:38:14 +02:00
								
Georgi Gerganov
8520fc310e
Disable BLAS altogether - the bug is not just for quantized mat mul
2023-03-24 23:47:06 +02:00
								
Georgi Gerganov
b3f460e941
Disable BLAS branch in mul_mat - seems there is a bug
2023-03-24 23:39:17 +02:00