Commit graph

  • 1d6d9497a8 readme Oleksandr Kuvshynov 2024-05-27 12:36:57 -04:00
  • 95f84d5ce8 Fix q_xxs using mul_mat_q (#7459) b3010 AidanBeltonS 2024-05-27 17:34:51 +01:00
  • bde971a9ca convert-hf : fix flake8 Lint errors Stanisław Szymczyk 2024-05-27 18:25:47 +02:00
  • a572666e4e remove e Chris Elrod 2024-05-27 11:29:03 -04:00
  • dc18c34bca improve accuracy, handle special cases Chris Elrod 2024-05-27 11:12:37 -04:00
  • 450471454c clean the code Yazan Agha-Schrader 2024-05-27 17:00:03 +02:00
  • a2edaf48c3 Add API key CSS classes and update styling in style.css Yazan Agha-Schrader 2024-05-27 16:33:45 +02:00
  • b16e10bb69 some necessary fixes Yazan Agha-Schrader 2024-05-27 16:28:09 +02:00
  • ea6e19cd40 x->r Chris Elrod 2024-05-27 10:06:30 -04:00
  • d02130d549 llama : print DeekSeek-V2-specific parameters in llm_load_print_meta() Stanisław Szymczyk 2024-05-27 15:30:17 +02:00
  • 5a14ef1dca add api-key css classes Yazan Agha-Schrader 2024-05-27 15:15:00 +02:00
  • 21826514df Allow multiple copy function pointers for CUDA graph kernel param updates Alan Gray 2024-05-27 05:48:18 -07:00
  • 9d8f12ce4f metal : bugfix kernel Georgi Gerganov 2024-05-27 16:06:11 +03:00
  • b9a63636c0 ggml : fix op params handling Georgi Gerganov 2024-05-27 15:55:35 +03:00
  • 5cc7ec161c llama : rename query_states, key_states, value_states to q_states, k_states, v_states Stanisław Szymczyk 2024-05-27 14:42:27 +02:00
  • 5487593bc7 Add freq factors (#7495) AidanBeltonS 2024-05-27 13:34:09 +01:00
  • 82cec8b84b llama : use attn_factor in mscale calculation to match the rope_yarn() implementation Stanisław Szymczyk 2024-05-27 14:33:31 +02:00
  • ec96ee57f4 sycl : add warning and assert Georgi Gerganov 2024-05-27 15:13:23 +03:00
  • ed0891f2a4 cuda : generalize concat kernel Georgi Gerganov 2024-05-27 15:08:37 +03:00
  • 56f70112eb llama : rename n_leading_dense_layer to n_layer_dense_lead Stanisław Szymczyk 2024-05-27 13:39:06 +02:00
  • 222a71ccf6 tests : naming Georgi Gerganov 2024-05-27 14:35:49 +03:00
  • acdc075b60 metal : generalize concat kernel Georgi Gerganov 2024-05-27 14:33:30 +03:00
  • d8033d9c8c add support for Poro pre-tokenizer ezosa 2024-05-27 14:31:28 +03:00
  • 0347657a3b tests : add dim != 2 tests Georgi Gerganov 2024-05-27 14:19:23 +03:00
  • fac1e804a1 llama : rename moe_intermediate_size variable to n_ff_exp Stanisław Szymczyk 2024-05-27 13:17:49 +02:00
  • 0b52245bdb ggml : generalize GGML_OP_CONCAT (WIP) Georgi Gerganov 2024-05-27 14:14:56 +03:00
  • 20769c0f7f llama : remove trailing whitespaces Stanisław Szymczyk 2024-05-27 13:11:31 +02:00
  • f3534141c9 support for Poro chat pre-tokenizer ezosa 2024-05-27 14:09:41 +03:00
  • 5a3e6b6cd1 llama : rename qk_rope_head_dim, qk_nope_head_dim variables to n_embd_head_qk_rope, n_embd_head_qk_nope Stanisław Szymczyk 2024-05-27 13:09:06 +02:00
  • a654cd992b llama : rename n_expert_ff to n_ff_exp Stanisław Szymczyk 2024-05-27 12:54:47 +02:00
  • 8b8a38068d github: add refactor issue template [no ci] brian khuu 2024-05-27 20:46:49 +10:00
  • abef8b2634 llama : code style corrections Stanisław Szymczyk 2024-05-27 12:47:53 +02:00
  • 1d8fca72ae metal : add GGML_OP_REPEAT kernels (#7557) b3008 Georgi Gerganov 2024-05-27 12:10:19 +03:00
  • ddc59e8e0a wipwipwiwpip compilade/refactor-kv-cache-gg Georgi Gerganov 2024-05-27 12:04:09 +03:00
  • 4b1770109c Fix q_xxs using mul_mat_q fix_q_xxs_mul_mat Aidan 2024-05-22 11:46:22 +01:00
  • 9ac5bdf7ef Add freq factors Aidan 2024-05-23 13:50:56 +01:00
  • 6b8ccd805f metal : add GGML_OP_REPEAT kernels Georgi Gerganov 2024-05-27 11:23:32 +03:00
  • 62bfef5194 metal : disable FA kernel for HS=256 (#7556) b3007 Georgi Gerganov 2024-05-27 10:38:39 +03:00
  • 5d455f2789 chore: Update HTML meta tags in index.html file Yazan Agha-Schrader 2024-05-27 09:20:10 +02:00
  • 8d49b9906a de prompts Yazan Agha-Schrader 2024-05-27 09:18:19 +02:00
  • bd2c97c51a add the belonging stuff: css,favicon etc Yazan Agha-Schrader 2024-05-27 09:14:54 +02:00
  • 0a478c048a chore: Add pre tokenizers and include enum mappings teleprint-me 2024-05-27 03:11:40 -04:00
  • 1c6cde92bb metal : disable FA kernel for HS=256 gg/metal-disable-fa-256 Georgi Gerganov 2024-05-27 09:24:34 +03:00
  • 902862a505 migrate my eary work Yazan Agha-Schrader 2024-05-27 08:33:36 +02:00
  • eaf6e03174 llama : add comments about experimental flags (#7544) b3006 Georgi Gerganov 2024-05-27 09:24:13 +03:00
  • 215394947e feat: Add prototype for bootstrapping registry teleprint-me 2024-05-27 01:05:36 -04:00
  • 0a30b6e082 ic Yazan Agha-Schrader 2024-05-27 07:05:08 +02:00
  • 375736270c fix performance regression on woa Reinforce-II 2024-05-27 12:44:56 +08:00
  • 0732bd9051 feat: Ignore pre-existing model files teleprint-me 2024-05-27 00:06:53 -04:00
  • b1c922fec7 feat: Add a proto sketch for handling mode vocab metadata teleprint-me 2024-05-27 00:06:39 -04:00
  • 7f48eb97db feat: Add experimental model registry for known models and their related metadata teleprint-me 2024-05-26 23:22:03 -04:00
  • 5833323754 Ignore second mlp layer if weights are null Andrei Betlen 2024-05-26 22:40:28 -04:00
  • 0adedd712e move unsed variable Reinforce-II 2024-05-23 15:23:12 +08:00
  • c812542f86 better toolchain compability Reinforce-II 2024-05-23 11:58:19 +08:00
  • 9a166331e0 use larger block size Reinforce-II 2024-05-23 01:17:51 +08:00
  • 3047229758 basic implementation Reinforce-II 2024-05-23 01:17:36 +08:00
  • bf5261e04e faster avx512 exp implementation Chris Elrod 2024-05-26 21:07:13 -04:00
  • d6ef0e77dd github: add self sorted issue ticket forms (#7543) Brian 2024-05-27 10:54:30 +10:00
  • 36bea177cb Merge branch 'master' into auto-model-support teleprint-me 2024-05-26 18:07:18 -04:00
  • 8541e99629 better pos_embed in clip caitianchi 2024-05-27 04:27:54 +08:00
  • 2997a680d2 change for ollama caitianchi 2024-05-27 03:42:56 +08:00
  • 150111f419 Description fixed Paul Rock 2024-05-26 22:37:35 +03:00
  • 18fe620976 change for ollama caitianchi 2024-05-27 03:29:55 +08:00
  • d9fbc1d1c5 add positions index caitianchi 2024-05-27 03:18:35 +08:00
  • 9f6bedea7b Support of converting local models added to convert-hf-to-gguf-update.py Paul Rock 2024-05-26 22:14:32 +03:00
  • f3b5e7d436 llama : correct llm_build_moe_ffn() arguments in build_arctic() Stanisław Szymczyk 2024-05-26 20:36:28 +02:00
  • a72b75738b Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE Galunid 2024-05-26 19:57:02 +02:00
  • a57484ae5b Fix check-requirements.sh Galunid 2024-05-26 19:51:27 +02:00
  • b4abdbb881 Add Vulkan sum_rows and div ops 0cc4m 2024-05-26 19:00:24 +02:00
  • 2c463c0377 main: replace --no-special with --special (#7534) This also flips the default behavior of the output to not include control token by default. Bryan 2024-05-26 16:58:45 +00:00
  • a6a1abd98e simplify code, more consistent style slaren 2024-05-26 18:47:42 +02:00
  • 173ab69d9f Better variable names jaime-m-p 2024-05-26 17:57:05 +02:00
  • dff451cfa1 flake.lock: Update (#7540) b3004 Georgi Gerganov 2024-05-26 18:54:56 +03:00
  • ae741af66d github: remove bios from os dropdown in bug report [no ci] brian khuu 2024-05-27 01:39:09 +10:00
  • c521960c88 github: remove contact from bug ticket template [no ci] brian khuu 2024-05-27 01:33:19 +10:00
  • bce882d94a github: consolidate BSD in bug issue ticket brian khuu 2024-05-27 01:31:32 +10:00
  • d298382ad9 main: replace --no-special with --special (#7534) b3003 Brian 2024-05-27 00:10:17 +10:00
  • 32a28217f4 Fix aya-23 conversion scripts (#7539) Galunid 2024-05-26 16:02:34 +02:00
  • c429b33beb llama : add Smaug 70B support (#7402) b3001 Bartowski 2024-05-26 08:28:35 -04:00
  • f6d56a474f github: add self sorted issue ticket forms [no ci] brian khuu 2024-05-26 22:13:52 +10:00
  • aa0de27ea7 llama : add comments about experimental flags Georgi Gerganov 2024-05-26 14:51:25 +03:00
  • 9146d36fe7 Readme: add akx/ggify to tools (#1484) Aarni Koskela 2024-05-26 15:09:42 +03:00
  • e0b2a40e0c make: add --device-debug to NVCC debug flags Johannes Gäßler 2024-05-26 11:56:37 +02:00
  • b48708af22 random pos_embed caitianchi 2024-05-26 19:40:37 +08:00
  • 579f059a15 Finish Vulkan mul_mat_id implementation 0cc4m 2024-05-26 12:58:03 +02:00
  • de26d49fbe duo: v5 Oleksandr Kuvshynov 2024-05-25 22:19:23 -04:00
  • 7c8699add6 pass user data Oleksandr Kuvshynov 2024-05-25 22:10:19 -04:00
  • b9adcbbf92 SimpleChat Completion Mode flexibility and cleanup, Settings gMe, Optional sliding window (#7480) HanishKVC 2024-05-26 06:26:34 +05:30
  • b3a54291cb Merge branch 'huggingface-hub-api' into auto-model-support teleprint-me 2024-05-25 20:28:40 -04:00
  • 47d985a169 flake.lock: Update github-actions[bot] 2024-05-26 00:18:39 +00:00
  • e4275bcef4 feat: Add example script for downloading models teleprint-me 2024-05-25 19:12:34 -04:00
  • fcd20ab9e9 chore: Add comments for each file extension type teleprint-me 2024-05-25 19:12:16 -04:00
  • da72554f58 feat: Add static methods for resolving model types and model extensions teleprint-me 2024-05-25 19:11:56 -04:00
  • 7a5578f211 Fix default value for WPM special_add_eos jaime-m-p 2024-05-26 01:11:53 +02:00
  • 865862627c Fix aya-23 conversion scripts Galunid 2024-05-26 00:07:53 +02:00
  • f84b04f1be Default values for special_add_bos/eos jaime-m-p 2024-05-25 23:17:09 +02:00
  • 615f425aab Allow lstrip for 'added_tokens' jaime-m-p 2024-05-25 21:45:32 +02:00
  • c83ea1a1f8 Move tokenizer flags to vocab structure. jaime-m-p 2024-05-25 21:39:50 +02:00
  • 1d2f3ad471 Better name functions to append token/bos/eos jaime-m-p 2024-05-25 21:30:26 +02:00
  • 534093878b duo: v3 Oleksandr Kuvshynov 2024-05-25 14:41:30 -04:00