Commit graph

3449 commits

Author SHA1 Message Date
brian khuu
5cdb03b2fc convert-*.py: update nix package to add python frontmatter 2024-07-16 06:42:38 +10:00
brian khuu
5ab1a84085 convert-*.py: dict_item --> Iterable 2024-07-16 06:42:38 +10:00
Brian
455c0e53ac Apply suggestions from code review
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
ccff6c7fb2 convert-*.py: remove reference to uuid generation 2024-07-16 06:42:38 +10:00
Brian
8156835d4a constants.py : Revert removal of backward compatibility KEY_GENERAL_SOURCE_URL 2024-07-16 06:42:38 +10:00
Brian
2c060303a6 Update constants.py : spacing correction 2024-07-16 06:42:38 +10:00
Brian
aa4e5892a0 Update convert_hf_to_gguf.py
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-07-16 06:42:38 +10:00
Brian
60278e4f4d Update convert_hf_to_gguf.py
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-07-16 06:42:38 +10:00
brian khuu
ad217d7249 convert-*.py: remove autogenerated uuid 2024-07-16 06:42:38 +10:00
brian khuu
f2b425c59c convert-*.py: import cast from typing and other refactor 2024-07-16 06:42:38 +10:00
brian khuu
04c4fffdcc convert-*.py: prepare_tensors_for_writing() --> prepare_tensors()
> Especially since it can be used for other purposes than "for writing", like preparing the tensors to then count and sum all their sizes.

Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
64707b625c convert-*.py: remove redundant gguf_writer.add_name() calls 2024-07-16 06:42:38 +10:00
brian khuu
f8b5931180 convert-*.py: parameter_class_attribute --> size_label 2024-07-16 06:42:38 +10:00
brian khuu
6eb08ac868 convert-*.py: Removing the redundant metadata is not None from all conditions, and indenting them.
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
4c91d077d2 convert-*.py: cast not required if Metadata.load_metadata_override returned a dict[str, Any] instead of a dict[str, object]
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
Brian
74383ba6d2 Apply suggestions from code review
Co-authored-by: compilade <git@compilade.net>
2024-07-16 06:42:38 +10:00
brian khuu
dd14b8fdb1 convert-*.py: pyright type fixes 2024-07-16 06:42:38 +10:00
brian khuu
59a01df784 convert-*.py: refactor per model weight count estimation 2024-07-16 06:42:38 +10:00
brian khuu
2a976e1211 convert-*.py: write_tensors() --> prepare_tensors_for_writing() 2024-07-16 06:42:38 +10:00
brian khuu
fdc5a3fc80 convert-*.py: autogenerate general.uuid if missing 2024-07-16 06:42:35 +10:00
brian khuu
7ecb8f00a0 test: remove test_gguf.py and remove test_generate_any_missing_uuid() 2024-07-16 06:38:40 +10:00
brian khuu
007708e32d gguf_writer.py: generate tensor uuid if missing 2024-07-16 06:38:40 +10:00
brian khuu
4dc8ddd35a convert_hf_to_gguf.py: Remove code that is already in fill_templated_filename() and GGUFWriter() 2024-07-16 06:38:40 +10:00
brian khuu
2f23927d37 convert_hf_to_gguf.py: rebase error correction 2024-07-16 06:38:40 +10:00
brian khuu
5011eefeaf convert_hf_to_gguf.py: optional, dataclass removed from type as it was unused 2024-07-16 06:38:40 +10:00
brian khuu
e9734434bd convert-*.py: Remove self.model_name that was left in since last rebase 2024-07-16 06:38:40 +10:00
brian khuu
eaa47f5546 convert-*.py: separated unit test, hf_repo to repo_url 2024-07-16 06:38:40 +10:00
brian khuu
d060fcdbe2 convert-*.py: adjusted authorship KV store 2024-07-16 06:38:40 +10:00
brian khuu
91e65d9485 convert-*.py: add unittest to metadata class 2024-07-16 06:38:38 +10:00
brian khuu
3625a42061 convert-*.py: add heuristic to directory name fallback
Also add source_url for huggingface url
2024-07-16 06:37:42 +10:00
brian khuu
39472a09da convert-*.py: need to include self in per_model_weight_count_estimation() 2024-07-16 06:37:42 +10:00
brian khuu
54918ad14e convert-*.py: refactor parameter weight class 2024-07-16 06:37:42 +10:00
brian khuu
32e80e094c convert-*.py: base_model is actually in spec for model cards 2024-07-16 06:37:42 +10:00
brian khuu
4d5cd0670a convert-*.py: use heuristics to parse _name_or_path 2024-07-16 06:37:42 +10:00
brian khuu
b0553f42da convert-*.py: adjust help message 2024-07-16 06:37:42 +10:00
brian khuu
dd1571211e convert-*.py: add quantized_by and enhance heuristics 2024-07-16 06:37:38 +10:00
brian khuu
5a86dfaa1c convert-*.py: add general.organization to kv store 2024-07-16 06:36:03 +10:00
brian khuu
f7c20793b9 convert-*.py: enable --model-name direct metadata override 2024-07-16 06:36:03 +10:00
brian khuu
b1927eed82 convert-*.py: move per model weight estimation away from util back to main script
plus some refactoring
2024-07-16 06:36:03 +10:00
brian khuu
684c604eca convert-*.py: add datasets and language to KV store 2024-07-16 06:36:03 +10:00
brian khuu
0f1d50fab7 convert-*.py: add parameter size class 2024-07-16 06:36:03 +10:00
brian khuu
8f734083dd convert-*.py: add base_version and add tags 2024-07-16 06:36:03 +10:00
brian khuu
b36e391b87 convert-*.py: parse model card in metadata util. Add license_link and license_name to kv store 2024-07-16 06:36:03 +10:00
brian khuu
5c263cb257 convert-*.py: encoding_scheme --> output_type 2024-07-16 06:36:03 +10:00
brian khuu
4d5f18a0e6 convert-*.py: metadata class moved to utility 2024-07-16 06:36:03 +10:00
brian khuu
916872f72f convert-*.py: model card metadata 2024-07-16 06:36:03 +10:00
brian khuu
a42c2b7efc convert-*.py: add basename and finetune metadata 2024-07-16 06:36:03 +10:00
brian khuu
dbb1b471e4 convert-*.py: add --get-outfile command and refactor 2024-07-16 06:36:03 +10:00
brian khuu
d3a936fd0e convert-*.py: licence -> license 2024-07-16 06:36:03 +10:00
Xuan Son Nguyen
97bdd26eee Refactor lora adapter support (#8332)
* lora: load to devide buft

* add patch tensor function

* correct tensor patch

* llama_lora_adapter_apply

* correct ggml_backend_tensor_copy

* add llm_build_mm

* fix auto merge

* update based on review comments

* add convert script

* no more transpose A

* add f16 convert

* add metadata check

* add sanity check

* fix ftype

* add requirements

* fix requirements

* fix outfile

* conversion: only allow selected models

* fix types

* cuda : do not use dmmv if the tensor does not have enough cols

* llama : lora fixes

* do not disable mmap with lora

Co-authored-by: slaren <slarengh@gmail.com>

* llm_build_lora_mm_id

* convert_lora : MoE LoRA conversion support

* convert_lora : prefer safetensors, similarly to convert_hf

* convert_hf : simplify modify_tensors for InternLM2

* convert_lora : lazy conversion

* llama : load and use alpha from LoRA adapters

* llama : use llm_build_lora_mm in most model graphs

* auto scale

* Revert "auto scale"

This reverts commit 42415a4874.

* remove redundant params

* Apply suggestions from code review

Co-authored-by: slaren <slarengh@gmail.com>

* change kv metadata

* move add_type to __init__

* convert_hf : move add_type to main()

* convert_lora : use the GGUFWriter from Model instead of overwriting it

---------

Co-authored-by: slaren <slarengh@gmail.com>
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
2024-07-15 20:50:47 +02:00