This operator overload is not used anyway - explicitly deleting it seems
to have no effect on compilation.
train-text-from-scratch.cpp:174:16: warning: comparing object representation of type 'my_llama_hparams' which does not have a unique object representation; consider comparing the members of the object manually [bugprone-suspicious-memory-comparison]
return memcmp(this, &other, sizeof(my_llama_hparams));
^
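For reference, the member-wise comparison that the clang-tidy warning suggests could look like the sketch below. This commit takes a different route and removes the unused overload; the struct here is a hypothetical stand-in for my_llama_hparams, and its field names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for my_llama_hparams; field names are illustrative only.
struct example_hparams {
    uint32_t n_vocab = 32000;
    uint32_t n_embd  = 4096;
    bool     flag    = false;  // typically followed by padding bytes, which memcmp would also compare
    uint32_t n_layer = 32;
    float    eps     = 1e-5f;  // floats can compare equal with different bits (+0.0 vs -0.0)

    // Compare the members, not the raw object representation.
    bool operator==(const example_hparams & o) const {
        return std::tie(n_vocab, n_embd, flag, n_layer, eps) ==
               std::tie(o.n_vocab, o.n_embd, o.flag, o.n_layer, o.eps);
    }
    bool operator!=(const example_hparams & o) const { return !(*this == o); }
};

// Small usage check: two default-constructed objects compare equal,
// and changing one member makes them differ.
inline void example_hparams_usage() {
    example_hparams a, b;
    assert(a == b);
    b.n_layer = 16;
    assert(a != b);
}
```

Comparing through std::tie keeps the member list in one place per operand and ignores padding bytes entirely.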
There is a -Warray-bounds warning from g++ 13.2.1 in
test-llama-grammar.cpp that is a false positive: the std::vector code
has a ternary that special-cases zero.
/usr/include/c++/13.2.1/bits/stl_algobase.h:398:17: warning: array subscript 0 is outside array bounds of ‘const llama_grammar_element* [0]’ [-Warray-bounds=]
398 | { *__to = *__from; }
| ~~~~~~^~~~~~~~~
* Guard against all weights in a super-block being zero
* Also guard against extremely small weights
Closes #2982
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
* build : on Mac OS enable Metal by default
* make : try to fix build on Linux
* make : move targets back to the top
* make : fix target clean
* llama : enable GPU inference by default with Metal
* llama : fix vocab_only logic when GPU is enabled
* common : better `n_gpu_layers` assignment
* readme : update Metal instructions
* make : fix merge conflict remnants
* gitignore : metal
* ggml-alloc : use virtual memory for measurement
* compatibility fixes for MAP_ANONYMOUS
* fallback to fixed address for systems without virtual memory
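The idea behind using virtual memory for measurement can be sketched as follows for POSIX systems: reserve a large range of address space that is never read or written, so that a measuring allocator can hand out addresses without committing physical memory. The function names are hypothetical, and the MAP_ANON fallback is an assumption about what the MAP_ANONYMOUS compatibility fix involves; this is not the ggml-alloc code.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Some older systems only provide the MAP_ANON spelling.
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

// Hypothetical helper: reserve `size` bytes of address space without backing it.
// PROT_NONE means any access faults, which is fine because the range is only
// used to compute offsets and sizes. Returns nullptr on failure.
inline void * reserve_address_space(std::size_t size) {
    void * p = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: give the reserved range back to the system.
inline void release_address_space(void * p, std::size_t size) {
    if (p != nullptr) {
        munmap(p, size);
    }
}

// Small usage check: reserve 64 MiB of address space and release it again.
inline void reserve_address_space_usage() {
    const std::size_t size = std::size_t(64) << 20;
    void * base = reserve_address_space(size);
    assert(base != nullptr);
    release_address_space(base, size);
}
```

On a platform without mmap, a caller would need another source for the base address, which is what the fixed-address fallback in the last bullet refers to.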
* Very minor speedup via simd-group synchronization in f16 x f32
* Another very minor speedup on metal
* Quite significant PP speedup on metal
* Another attempt
* Minor
* Massive improvement for TG for fp16
* ~4-5% improvement for Q8_0 TG on metal
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* make : remove unused -DGGML_BIG_ENDIAN
* make : put preprocessor stuff in CPPFLAGS
* make : pass Raspberry Pi arch flags to g++ as well
* make : support overriding CFLAGS/CXXFLAGS/CPPFLAGS/LDFLAGS
* make : fix inverted conditional
* ggml_metal_init: Show all Metal device instances in the system
Also show the default Metal device that was picked.
* Update ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* k-quants : fix build on armv7
* ggml : cleanup unused arm32 specific impl
* k-quants : avoid some unused vzero / mzero define
* ggml-alloc : use 4g for MEASURE_MAX_SIZE in 32-bit arm
* Allow the quantize tool to copy tensors as-is, so that models can be repackaged.
* Slightly better logic when requantizing.
* Change help message to go to `stdout`.