Commit graph

  • eb0bf6b92f convert-*.py: Add naming_convention_vocab_only() brian khuu 2024-07-17 01:04:14 +10:00
  • 5da16bb1d7 Merge branch 'master' into refactor-convert-py brian khuu 2024-07-16 23:58:56 +10:00
  • 37373aaa55 make/cmake: add missing force MMQ/cuBLAS for HIP Johannes Gäßler 2024-07-16 15:52:31 +02:00
  • 9b70728ea3 Added AI Studio to the list of UIs Thorsten Sommer 2024-07-16 11:05:35 +02:00
  • 7b575e70f5 fix func call tokens for internlm2 RunningLeon 2024-07-16 17:03:29 +08:00
  • 27396eac46 CUDA => MUSA Xiaodong Ye 2024-07-16 16:31:30 +08:00
  • 1115d2f836 Rename LLAMA_CANN to GGML_CANN huafengchun 2024-07-16 07:22:36 +00:00
  • f50f0905bc Modify the code based on review comment huafengchun 2024-07-16 07:15:26 +00:00
  • 1666f92dcd gguf-hash : update clib.json to point to original xxhash repo (#8491) Brian 2024-07-16 17:14:16 +10:00
  • 37b12f92ab export-lora : handle help argument (#8497) b3403 Steve Bonds 2024-07-16 00:04:45 -07:00
  • f6ea7a093c llama : change fallback type IQ4_NL -> Q4_0 gg/quantize-fallback Georgi Gerganov 2024-07-15 10:27:07 +03:00
  • 0efec57787 llama : valign + remove unused ftype (#8502) b3402 Georgi Gerganov 2024-07-16 10:00:30 +03:00
  • dc716d6198 llama : disable context-shift for DeepSeek v2 Georgi Gerganov 2024-07-16 09:46:00 +03:00
  • 0da1e1fc19 delete trailing whitespaces huafengchun 2024-07-16 06:26:45 +00:00
  • f8c345d59d [CANN] Add Ascend NPU backend huafengchun 2024-03-13 03:44:11 +00:00
  • 7acfd4e8d5 convert_hf : faster lazy safetensors (#8482) compilade 2024-07-15 23:13:10 -04:00
  • b971122eb1 convert_hf : fix memory leak in lazy MoE conversion compilade/faster-lazy-safetensors Francis Couture-Harpin 2024-07-15 21:09:04 -04:00
  • 58409cd56d Merge pull request #1 from zihaoccc/monitor Zihao Chen 2024-07-15 19:58:12 -05:00
  • 804b303ec6 handle export-lora help argument Steve Bonds 2024-07-15 17:20:40 -07:00
  • c7b3616449 Update convert_hf_to_gguf.py Brian 2024-07-16 07:02:07 +10:00
  • 7899cff811 add monitor registry for rpc instance endpoint Zihao Chen 2024-07-15 15:51:23 -05:00
  • 9a925b56a0 metadata.py: account for decimal point in size label within model id components brian khuu 2024-07-15 19:16:38 +10:00
  • 417d7a7c62 convert_hf : use GGUFWriter to count model parameters Francis Couture-Harpin 2024-07-14 20:38:26 -04:00
  • 78a42fbee5 gguf-py : use pyyaml instead of python-frontmatter Francis Couture-Harpin 2024-07-14 15:36:50 -04:00
  • 3b1766a992 convert-*.py: flake8 remove blank line brian khuu 2024-07-14 16:33:19 +10:00
  • f98f1098f9 convert-*.py: more rigorous regexp for get_model_id_components() brian khuu 2024-07-14 16:28:52 +10:00
  • 4e3761109d covert-*.py: flake8 newline missing brian khuu 2024-07-14 12:28:55 +10:00
  • 8629b7bdc2 covert-*.py: per_model_weight_count_estimation() tensor arg type is Iterable[tuple[str, LazyTensor]] brian khuu 2024-07-14 12:19:23 +10:00
  • 144a7ec3a4 convert-*.py: pathlib.Path exist() --> is_file() or is_dir() brian khuu 2024-07-14 12:12:23 +10:00
  • abc351c270 convert-*.py: quantized_by in model card is not relevant for converted gguf brian khuu 2024-07-14 12:00:59 +10:00
  • 9954b64862 convert-*.py: add logger and refactor load_model_card() brian khuu 2024-07-14 12:00:03 +10:00
  • 5cdb03b2fc convert-*.py: update nix package to add python frontmatter brian khuu 2024-07-14 11:24:53 +10:00
  • 5ab1a84085 convert-*.py: dict_item --> Iterable brian khuu 2024-07-14 11:24:25 +10:00
  • 455c0e53ac Apply suggestions from code review Brian 2024-07-14 10:29:03 +10:00
  • ccff6c7fb2 convert-*.py: remove reference to uuid generation brian khuu 2024-07-13 23:21:38 +10:00
  • 8156835d4a constants.py : Revert removal of backward compatibility KEY_GENERAL_SOURCE_URL Brian 2024-07-13 22:26:32 +10:00
  • 2c060303a6 Update constants.py : spacing correction Brian 2024-07-13 22:02:09 +10:00
  • aa4e5892a0 Update convert_hf_to_gguf.py Brian 2024-07-13 20:43:17 +10:00
  • 60278e4f4d Update convert_hf_to_gguf.py Brian 2024-07-13 20:42:55 +10:00
  • ad217d7249 convert-*.py: remove autogenerated uuid brian khuu 2024-07-13 19:18:11 +10:00
  • f2b425c59c convert-*.py: import cast from typing and other refactor brian khuu 2024-07-11 21:52:53 +10:00
  • 04c4fffdcc convert-*.py: prepare_tensors_for_writing() --> prepare_tensors() brian khuu 2024-07-11 21:14:04 +10:00
  • 64707b625c convert-*.py: remove redundant gguf_writer.add_name() calls brian khuu 2024-07-11 21:11:16 +10:00
  • f8b5931180 convert-*.py: parameter_class_attribute --> size_label brian khuu 2024-07-11 21:01:52 +10:00
  • 6eb08ac868 convert-*.py: Removing the redundant metadata is not None from all conditions, and indenting them. brian khuu 2024-07-11 20:42:11 +10:00
  • 4c91d077d2 convert-*.py: cast not required if Metadata.load_metadata_override returned a dict[str, Any] instead of a dict[str, object] brian khuu 2024-07-11 20:39:10 +10:00
  • 74383ba6d2 Apply suggestions from code review Brian 2024-07-11 21:10:51 +10:00
  • dd14b8fdb1 convert-*.py: pyright type fixes brian khuu 2024-07-10 23:39:09 +10:00
  • 59a01df784 convert-*.py: refactor per model weight count estimation brian khuu 2024-07-10 20:20:54 +10:00
  • 2a976e1211 convert-*.py: write_tensors() --> prepare_tensors_for_writing() brian khuu 2024-07-10 20:18:40 +10:00
  • fdc5a3fc80 convert-*.py: autogenerate general.uuid if missing brian khuu 2024-07-09 23:30:28 +10:00
  • 7ecb8f00a0 test: remove test_gguf.py and remove test_generate_any_missing_uuid() brian khuu 2024-07-09 23:24:19 +10:00
  • 007708e32d gguf_writer.py: generate tensor uuid if missing brian khuu 2024-07-09 06:52:44 +10:00
  • 4dc8ddd35a convert_hf_to_gguf.py: Remove code that is already in fill_templated_filename() and GGUFWriter() brian khuu 2024-07-07 20:00:26 +10:00
  • 2f23927d37 convert_hf_to_gguf.py: rebase error correction brian khuu 2024-07-07 18:52:52 +10:00
  • 5011eefeaf convert_hf_to_gguf.py: optional, dataclass removed from type as it was unused brian khuu 2024-07-07 18:06:14 +10:00
  • e9734434bd convert-*.py: Remove self.model_name that was left in since last rebase brian khuu 2024-06-09 16:57:39 +10:00
  • eaa47f5546 convert-*.py: separated unit test, hf_repo to repo_url brian khuu 2024-06-08 21:54:20 +10:00
  • d060fcdbe2 convert-*.py: adjusted authorship KV store brian khuu 2024-06-07 03:33:21 +10:00
  • 91e65d9485 convert-*.py: add unittest to metadata class brian khuu 2024-06-05 03:51:38 +10:00
  • 3625a42061 convert-*.py: add heuristic to directory name fallback brian khuu 2024-06-04 02:25:11 +10:00
  • 39472a09da convert-*.py: need to include self in per_model_weight_count_estimation() brian khuu 2024-06-04 02:18:53 +10:00
  • 54918ad14e convert-*.py: refactor parameter weight class brian khuu 2024-06-04 01:14:50 +10:00
  • 32e80e094c convert-*.py: base_model is actually in spec for model cards brian khuu 2024-06-04 00:28:16 +10:00
  • 4d5cd0670a convert-*.py: use heuristics to parse _name_or_path brian khuu 2024-06-04 00:22:52 +10:00
  • b0553f42da convert-*.py: adjust help message brian khuu 2024-06-03 23:56:14 +10:00
  • dd1571211e convert-*.py: add quantized_by and enhance heuristics brian khuu 2024-06-03 23:52:46 +10:00
  • 5a86dfaa1c convert-*.py: add general.organization to kv store brian khuu 2024-06-03 00:57:37 +10:00
  • f7c20793b9 convert-*.py: enable --model-name direct metadata override brian khuu 2024-06-02 23:56:04 +10:00
  • b1927eed82 convert-*.py: move per model weight estimation away from util back to main script brian khuu 2024-06-02 17:44:53 +10:00
  • 684c604eca convert-*.py: add datasets and language to KV store brian khuu 2024-06-02 17:17:56 +10:00
  • 0f1d50fab7 convert-*.py: add parameter size class brian khuu 2024-06-02 15:40:31 +10:00
  • 8f734083dd convert-*.py: add base_version and add tags brian khuu 2024-06-02 15:11:52 +10:00
  • b36e391b87 convert-*.py: parse model card in metadata util. Add license_link and license_name to kv store brian khuu 2024-06-02 12:27:28 +10:00
  • 5c263cb257 convert-*.py: encoding_scheme --> output_type brian khuu 2024-06-02 01:58:47 +10:00
  • 4d5f18a0e6 convert-*.py: metadata class moved to utility brian khuu 2024-06-02 01:49:58 +10:00
  • 916872f72f convert-*.py: model card metadata brian khuu 2024-05-31 14:19:53 +10:00
  • a42c2b7efc convert-*.py: add basename and finetune metadata brian khuu 2024-05-31 03:14:11 +10:00
  • dbb1b471e4 convert-*.py: add --get-outfile command and refactor brian khuu 2024-05-24 03:48:00 +10:00
  • d3a936fd0e convert-*.py: licence -> license brian khuu 2024-05-28 12:44:56 +10:00
  • 2a49a68d70 Merge branch 'master' into compilade/faster-lazy-safetensors Francis Couture-Harpin 2024-07-15 15:24:25 -04:00
  • 654b1b35b3 add swin norm param nopperl 2024-07-15 21:19:27 +02:00
  • 97bdd26eee Refactor lora adapter support (#8332) b3400 Xuan Son Nguyen 2024-07-15 20:50:47 +02:00
  • f455e82a42 Merge branch 'ggerganov:master' into gguf-model-template Austin 2024-07-15 14:18:24 -04:00
  • 0cb404cc13 feat : Add shebang and executable bit to enable script execution teleprint-me 2024-07-15 13:57:46 -04:00
  • b7528fdf89 chore : Add jinja2 as dev dependency in pyproject.toml and explicit dependency in requirements.txt teleprint-me 2024-07-15 13:56:54 -04:00
  • 383b6bcef8 Merge branch 'master' into xsn/fix_lora ngxson 2024-07-15 19:54:15 +02:00
  • 4db8f60fe7 fix ci (#8494) Xuan Son Nguyen 2024-07-15 19:23:10 +02:00
  • ff601abc1c add todo hongruichen 2024-07-16 00:05:40 +08:00
  • 0453f7d114 implement chameleon graph nopperl 2024-07-15 17:51:16 +02:00
  • d09382fac7 convert_hf : move add_type to main() Francis Couture-Harpin 2024-07-15 11:39:42 -04:00
  • 4d9ac0f375 Merge branch 'master' into xsn/fix_lora ngxson 2024-07-15 17:22:40 +02:00
  • b1c4069502 move add_type to __init__ ngxson 2024-07-15 17:22:38 +02:00
  • 9d7552cf85 fix ci ngxson 2024-07-15 17:19:42 +02:00
  • 0ba23bad6f change kv metadata ngxson 2024-07-15 15:35:19 +02:00
  • 78eff8bdca gguf-hash: readme update to point to Cyan4973 xxHash repo [no ci] Brian 2024-07-15 21:52:17 +10:00
  • 9175f4b77c Apply suggestions from code review Xuan Son Nguyen 2024-07-15 15:02:46 +02:00
  • 8fac431b06 ggml : suppress unknown pragma 'GCC' on windows (#8460) b3398 Daniel Bevenius 2024-07-15 14:48:17 +02:00
  • f17f39ff9c server: update README.md with llama-server --help output [no ci] (#8472) M-A 2024-07-15 08:04:56 -04:00
  • 9104bc20ed common : add --no-cont-batching arg (#6358) b3396 Georgi Gerganov 2024-07-15 14:54:58 +03:00