Brian
455c0e53ac
Apply suggestions from code review
...
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
ccff6c7fb2
convert-*.py: remove reference to uuid generation
2024-07-16 06:42:38 +10:00
Brian
8156835d4a
constants.py: Revert removal of backward compatibility KEY_GENERAL_SOURCE_URL
2024-07-16 06:42:38 +10:00
Brian
2c060303a6
Update constants.py: spacing correction
2024-07-16 06:42:38 +10:00
Brian
aa4e5892a0
Update convert_hf_to_gguf.py
...
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-07-16 06:42:38 +10:00
Brian
60278e4f4d
Update convert_hf_to_gguf.py
...
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-07-16 06:42:38 +10:00
brian khuu
ad217d7249
convert-*.py: remove autogenerated uuid
2024-07-16 06:42:38 +10:00
brian khuu
f2b425c59c
convert-*.py: import cast from typing and other refactor
2024-07-16 06:42:38 +10:00
brian khuu
04c4fffdcc
convert-*.py: prepare_tensors_for_writing() --> prepare_tensors()
...
> Especially since it can be used for other purposes than "for writing", like preparing the tensors to then count and sum all their sizes.
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
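The rationale quoted above can be sketched in a few lines of Python. The class and tensor names here are illustrative assumptions, not the actual convert script API:

```python
# Sketch: once the prepared tensors are reused for purposes other than
# writing (e.g. summing their total size), the name prepare_tensors()
# fits better than prepare_tensors_for_writing().
class ModelSketch:
    def __init__(self):
        self.tensors: dict[str, bytes] = {}

    def prepare_tensors(self):
        # Prepare the tensors once (stand-ins here; a real converter
        # would load and convert the actual model weights)...
        self.tensors = {
            "token_embd.weight": bytes(64),   # e.g. a 4x8 f16 tensor
            "output.weight": bytes(128),      # e.g. an 8x4 f32 tensor
        }

    def total_tensor_bytes(self) -> int:
        # ...then reuse them for a non-writing purpose:
        # counting and summing all their sizes.
        return sum(len(t) for t in self.tensors.values())

model = ModelSketch()
model.prepare_tensors()
print(model.total_tensor_bytes())
```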
brian khuu
64707b625c
convert-*.py: remove redundant gguf_writer.add_name() calls
2024-07-16 06:42:38 +10:00
brian khuu
f8b5931180
convert-*.py: parameter_class_attribute --> size_label
2024-07-16 06:42:38 +10:00
brian khuu
6eb08ac868
convert-*.py: remove the redundant `metadata is not None` check from all conditions, and indent them
...
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
4c91d077d2
convert-*.py: cast not required if Metadata.load_metadata_override returned a dict[str, Any] instead of a dict[str, object]
...
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
Brian
74383ba6d2
Apply suggestions from code review
...
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
dd14b8fdb1
convert-*.py: pyright type fixes
2024-07-16 06:42:38 +10:00
brian khuu
59a01df784
convert-*.py: refactor per model weight count estimation
2024-07-16 06:42:38 +10:00
brian khuu
2a976e1211
convert-*.py: write_tensors() --> prepare_tensors_for_writing()
2024-07-16 06:42:38 +10:00
brian khuu
fdc5a3fc80
convert-*.py: autogenerate general.uuid if missing
2024-07-16 06:42:35 +10:00
brian khuu
7ecb8f00a0
test: remove test_gguf.py and remove test_generate_any_missing_uuid()
2024-07-16 06:38:40 +10:00
brian khuu
007708e32d
gguf_writer.py: generate tensor uuid if missing
2024-07-16 06:38:40 +10:00
brian khuu
4dc8ddd35a
convert_hf_to_gguf.py: Remove code that is already in fill_templated_filename() and GGUFWriter()
2024-07-16 06:38:40 +10:00
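For context, `fill_templated_filename()` expands type placeholders in an output filename template. The sketch below is an illustrative reconstruction of that behaviour, not the verbatim gguf-py implementation:

```python
def fill_templated_filename(filename: str, output_type: str) -> str:
    # Sketch: replace {ftype}/{outtype} placeholders in an outfile
    # template with the chosen output type, e.g. "f16" or "q8_0".
    lower = output_type.lower()
    upper = output_type.upper()
    return filename.format(
        ftype=lower, outtype=lower,
        FTYPE=upper, OUTTYPE=upper,
    )

print(fill_templated_filename("Model-7B-{ftype}.gguf", "F16"))
```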
brian khuu
2f23927d37
convert_hf_to_gguf.py: rebase error correction
2024-07-16 06:38:40 +10:00
brian khuu
5011eefeaf
convert_hf_to_gguf.py: optional, dataclass removed from type as it was unused
2024-07-16 06:38:40 +10:00
brian khuu
e9734434bd
convert-*.py: Remove self.model_name that was left in since last rebase
2024-07-16 06:38:40 +10:00
brian khuu
eaa47f5546
convert-*.py: separated unit test, hf_repo to repo_url
2024-07-16 06:38:40 +10:00
brian khuu
d060fcdbe2
convert-*.py: adjusted authorship KV store
2024-07-16 06:38:40 +10:00
brian khuu
91e65d9485
convert-*.py: add unittest to metadata class
2024-07-16 06:38:38 +10:00
brian khuu
3625a42061
convert-*.py: add heuristic to directory name fallback
...
Also add source_url for huggingface url
2024-07-16 06:37:42 +10:00
brian khuu
39472a09da
convert-*.py: need to include self in per_model_weight_count_estimation()
2024-07-16 06:37:42 +10:00
brian khuu
54918ad14e
convert-*.py: refactor parameter weight class
2024-07-16 06:37:42 +10:00
brian khuu
32e80e094c
convert-*.py: base_model is actually in spec for model cards
2024-07-16 06:37:42 +10:00
brian khuu
4d5cd0670a
convert-*.py: use heuristics to parse _name_or_path
2024-07-16 06:37:42 +10:00
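A minimal sketch of the kind of `_name_or_path` heuristic these commits describe: split a Hugging Face-style model id into basename, size label, finetune, and version. The regex patterns and field names are assumptions for illustration, not the actual gguf-py metadata code:

```python
import re

def parse_name_or_path(name_or_path: str) -> dict:
    """Heuristically split an HF-style model id into metadata parts."""
    # Drop an "org/" prefix, e.g. "mistralai/Mistral-7B-Instruct-v0.2".
    name = name_or_path.split("/")[-1]
    parts = name.split("-")
    meta = {"basename": parts[0], "size_label": None,
            "finetune": None, "version": None}
    for part in parts[1:]:
        if re.fullmatch(r"\d+(\.\d+)?[BbMmKk]", part):
            meta["size_label"] = part.upper()   # e.g. "7B"
        elif re.fullmatch(r"v\d+(\.\d+)*", part):
            meta["version"] = part              # e.g. "v0.2"
        else:
            meta["finetune"] = part             # e.g. "Instruct"
    return meta

print(parse_name_or_path("mistralai/Mistral-7B-Instruct-v0.2"))
```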
brian khuu
b0553f42da
convert-*.py: adjust help message
2024-07-16 06:37:42 +10:00
brian khuu
dd1571211e
convert-*.py: add quantized_by and enhance heuristics
2024-07-16 06:37:38 +10:00
brian khuu
5a86dfaa1c
convert-*.py: add general.organization to kv store
2024-07-16 06:36:03 +10:00
brian khuu
f7c20793b9
convert-*.py: enable --model-name direct metadata override
2024-07-16 06:36:03 +10:00
brian khuu
b1927eed82
convert-*.py: move per model weight estimation away from util back to main script
...
plus some refactoring
2024-07-16 06:36:03 +10:00
brian khuu
684c604eca
convert-*.py: add datasets and language to KV store
2024-07-16 06:36:03 +10:00
brian khuu
0f1d50fab7
convert-*.py: add parameter size class
2024-07-16 06:36:03 +10:00
brian khuu
8f734083dd
convert-*.py: add base_version and add tags
2024-07-16 06:36:03 +10:00
brian khuu
b36e391b87
convert-*.py: parse model card in metadata util. Add license_link and license_name to kv store
2024-07-16 06:36:03 +10:00
brian khuu
5c263cb257
convert-*.py: encoding_scheme --> output_type
2024-07-16 06:36:03 +10:00
brian khuu
4d5f18a0e6
convert-*.py: metadata class moved to utility
2024-07-16 06:36:03 +10:00
brian khuu
916872f72f
convert-*.py: model card metadata
2024-07-16 06:36:03 +10:00
brian khuu
a42c2b7efc
convert-*.py: add basename and finetune metadata
2024-07-16 06:36:03 +10:00
brian khuu
dbb1b471e4
convert-*.py: add --get-outfile command and refactor
2024-07-16 06:36:03 +10:00
brian khuu
d3a936fd0e
convert-*.py: licence -> license
2024-07-16 06:36:03 +10:00
Xuan Son Nguyen
97bdd26eee
Refactor lora adapter support ( #8332 )
...
* lora: load to device buft
* add patch tensor function
* correct tensor patch
* llama_lora_adapter_apply
* correct ggml_backend_tensor_copy
* add llm_build_mm
* fix auto merge
* update based on review comments
* add convert script
* no more transpose A
* add f16 convert
* add metadata check
* add sanity check
* fix ftype
* add requirements
* fix requirements
* fix outfile
* conversion: only allow selected models
* fix types
* cuda : do not use dmmv if the tensor does not have enough cols
* llama : lora fixes
* do not disable mmap with lora
Co-authored-by: slaren <slarengh@gmail.com>
* llm_build_lora_mm_id
* convert_lora : MoE LoRA conversion support
* convert_lora : prefer safetensors, similarly to convert_hf
* convert_hf : simplify modify_tensors for InternLM2
* convert_lora : lazy conversion
* llama : load and use alpha from LoRA adapters
* llama : use llm_build_lora_mm in most model graphs
* auto scale
* Revert "auto scale"
This reverts commit 42415a4874.
* remove redundant params
* Apply suggestions from code review
Co-authored-by: slaren <slarengh@gmail.com>
* change kv metadata
* move add_type to __init__
* convert_hf : move add_type to main()
* convert_lora : use the GGUFWriter from Model instead of overwriting it
---------
Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-15 20:50:47 +02:00
Xuan Son Nguyen
4db8f60fe7
fix ci ( #8494 )
2024-07-15 19:23:10 +02:00
Daniel Bevenius
8fac431b06
ggml : suppress unknown pragma 'GCC' on windows ( #8460 )
...
This commit adds a macro guard to pragma GCC to avoid the following
warning on windows:
```console
C:\llama.cpp\ggml\src\ggml-aarch64.c(17,9): warning C4068:
unknown pragma 'GCC' [C:\llama.cpp\build\ggml\src\ggml.vcxproj]
```
2024-07-15 15:48:17 +03:00