This is not a LoRA fine-tune; instead, the whole model is changed to contain only low-rank "LoRA" matrices.
Training this instead of the normal model gave much worse results, though...
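As a rough illustration of the idea, here is a minimal numpy sketch (the dimensions and initialization are made up, not taken from the actual change): each dense weight W is replaced by a pair of low-rank factors, so the layer only ever stores and trains the factors.

```python
import numpy as np

# Illustrative sizes, not the real model's.
d_in, d_out, r = 512, 512, 8

rng = np.random.default_rng(0)
# The layer stores only the low-rank factors A and B, never a full W.
A = rng.standard_normal((d_in, r)).astype(np.float32) * 0.02   # d_in x r
B = rng.standard_normal((r, d_out)).astype(np.float32) * 0.02  # r x d_out

x = rng.standard_normal((1, d_in)).astype(np.float32)
y = (x @ A) @ B  # the effective weight W = A @ B has rank at most r
```

With r much smaller than d_in and d_out this saves a lot of parameters, which is presumably also why training it from scratch underperforms the full-rank model.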
Minor edit in ggml.c; the original code would completely prevent OpenCL from loading if GGML_USE_ACCELERATE was defined.
Minor speedup in prompt eval time.
When gradients of the model parameters are not kept, they are overwritten by tensors created by the optimizer, which may become invalid after the optimizer context is renewed.
So we need to keep the original gradients and hand duplicates to the optimizer.
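The lifetime problem can be sketched in plain Python/numpy (the context mechanics and names here are illustrative, not the actual ggml API): if the optimizer context is given the model's own gradient buffers, renewing the context invalidates them, so it gets copies instead.

```python
import numpy as np

class OptContext:
    """Illustrative stand-in for an optimizer context that owns its tensors."""
    def __init__(self, grads):
        self.grads = grads  # the context takes ownership of these buffers

    def renew(self):
        # Renewing releases everything the context owned; any outside
        # reference to those buffers is now invalid (simulated with NaNs).
        for g in self.grads.values():
            g[:] = np.nan
        self.grads = {}

model_grads = {"w": np.full(4, 0.5, dtype=np.float32)}

# Keep the original gradients; hand duplicates to the optimizer context.
opt = OptContext({k: v.copy() for k, v in model_grads.items()})
opt.renew()
assert not np.isnan(model_grads["w"]).any()  # the originals survive
```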
* Line 698 has a stray @staticmethod and should not;
otherwise the unpickler's load() throws an error because the object is not callable
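The underlying Python pitfall, as a minimal sketch (class and helper names are illustrative, not the actual convert.py code): a mapping built at class-definition time stores the raw staticmethod object, which is not callable on Python versions before 3.10.

```python
import pickle

class LazyUnpickler(pickle.Unpickler):
    @staticmethod
    def rebuild(x):  # hypothetical reconstruction helper
        return x

    # Built at class-definition time, so the dict stores the raw
    # staticmethod descriptor rather than a plain function.
    DISPATCH = {"rebuild": rebuild}

fn = LazyUnpickler.DISPATCH["rebuild"]
fn(42)  # TypeError on Python < 3.10: 'staticmethod' object is not callable
```

Dropping the decorator (or looking the function up as a class attribute, which triggers the descriptor protocol) avoids the error.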
* Update convert.py
---------
Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com>
* change immintrin.h to intrin.h for compatibility
Building on Windows 11 ARM throws an error on this line. It seems that using intrin.h covers both x86 and ARM.
* conditional include of intrin.h
* fix typo in ggml.c
* fix reverse prompt and multi-line input
* Code Formatting
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* python script to verify the checksums of the llama models
Added a Python script for verifying the SHA256 checksums of files in a directory; it can run on multiple platforms. Improved the formatting of the output for better readability.
* Update README.md
Updated the README for improved readability and to explain the usage of the Python checksum verification script.
* update the verification script
I've extended the script based on suggestions by @prusnak.
The script now checks the available RAM; if there is enough to check the file in one pass, it does so. If not, the file is read in chunks.
* minor improvement
Small change so that the available RAM is checked rather than the total RAM.
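For illustration only (the script may obtain these values differently), psutil exposes exactly the distinction this change is about:

```python
import psutil

mem = psutil.virtual_memory()
print(mem.total)      # all installed RAM
print(mem.available)  # memory that can actually be allocated right now
```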
* remove the part of the code that reads the file at once if enough RAM is available
Based on suggestions from @prusnak, I removed the part of the code that checks whether the user has enough RAM to read the entire model at once. The file is now always read in chunks.
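A minimal sketch of the always-chunked approach with hashlib (the chunk size and the checksum-list format are assumptions for illustration, not necessarily what verify-checksum-models.py does):

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file of any size in fixed-size chunks, never loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Assumed list format: "<sha256>  <relative path>" per line.
with open("SHA256SUMS") as f:
    for line in f:
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip()
        status = "OK" if sha256_of_file(name) == expected else "FAILED"
        print(f"{name}: {status}")
```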
* Update verify-checksum-models.py
Quick fix to pass the git check.