Commit graph

  • 6c9ac0fc52 refactor: Add a custom tokenizer component and fix vocab request class teleprint-me 2024-05-24 01:30:29 -04:00
  • 6b5c3753c8 refactor SplitStrategy to be a deque Christian Zhou-Zheng 2024-05-24 00:28:48 -04:00
  • 629420ee39 add result in readme caitianchi 2024-05-24 12:06:48 +08:00
  • e62e09bbb1 refactor: Apply fix for file path references teleprint-me 2024-05-23 22:59:16 -04:00
  • c91dcdf2a4 refactor: Add fixes for logging teleprint-me 2024-05-23 22:58:03 -04:00
  • dd14d818e0 Update main-intel.Dockerfile base image to 2024.1.0 7507-main-intel-dockerfile Brian 2024-05-24 12:47:58 +10:00
  • b31f51f597 Merge pull request #1 from harvestingmoon/minicpm-v2.5 tc-mb 2024-05-24 10:35:09 +08:00
  • 0df0aa8e43 add build shared lib in win release package (#7438) Neo Zhang 2024-05-24 10:06:56 +08:00
  • 77bc7394c8 refactor: Add tokenizer path, add methods for extracting vocab metadata, fix checksum method name teleprint-me 2024-05-23 21:40:05 -04:00
  • b4b553fe6c chore: Apply ruff formatting for readability teleprint-me 2024-05-23 21:36:51 -04:00
  • ea4fc1095e refactor: Apply fixes to required arguments and fixes to options teleprint-me 2024-05-23 21:36:31 -04:00
  • f62080adfa refactor: Simplify huggingface hub vocab request teleprint-me 2024-05-23 20:50:58 -04:00
  • 1749209406 refactor: Simplify huggingface hub api implementation teleprint-me 2024-05-23 20:50:15 -04:00
  • c92c6ad480 feat: Add CLI tool for fetching vocab files teleprint-me 2024-05-23 20:33:12 -04:00
  • 11d2d31622 SimpleChat: placeholder based usage hint for user-in textarea HanishKVC 2024-05-24 04:18:04 +05:30
  • b57aad79a8 SimpleChat:SlidingWindow: iRecentUserMsgCnt to limit context load HanishKVC 2024-05-24 01:05:05 +05:30
  • f0dd91d550 SimpleChat: Consolidate global vars into gMe, Display to user HanishKVC 2024-05-23 21:18:48 +05:30
  • 4b29736da5 SimpleChat: Reduce max_tokens to be small but still sufficient HanishKVC 2024-05-23 20:55:11 +05:30
  • cbd853eda9 SimpleChat:ChatRequestOptions: max_tokens HanishKVC 2024-05-23 18:51:36 +05:30
  • 59f74c7de9 SimpleChat: Update title, usage and readme a bit HanishKVC 2024-05-23 15:43:41 +05:30
  • 073eae6778 SimpleChat: Common chat request options from a global object HanishKVC 2024-05-23 15:32:54 +05:30
  • 5d84a92d62 SimpleChat: Rename the half asleep mis-spelled global var HanishKVC 2024-05-23 15:23:47 +05:30
  • 40fbbeb2f6 SimpleChat:Try read json early, if available HanishKVC 2024-05-23 14:32:01 +05:30
  • e2164d66e6 SimpleChat:Completion: clear any prev chat history at begining HanishKVC 2024-05-23 05:26:49 +05:30
  • 01594daeb4 SimpleChat: Update usage note and readme a bit HanishKVC 2024-05-23 05:07:32 +05:30
  • fe60655e1f SimpleChat:SC: Ensure proper clearing/reseting HanishKVC 2024-05-23 03:42:02 +05:30
  • 7a0a42367f SimpleChat:CompletionMode: Update readme/usage, trim textarea newline HanishKVC 2024-05-22 22:50:44 +05:30
  • 0dba8f8857 SimpleChat:Completion: Avoid Role: prefix; Newline only in between HanishKVC 2024-05-22 22:40:44 +05:30
  • 3c11098d1e SimpleChat:CompletionMode: Allow control of Role: prefix HanishKVC 2024-05-22 22:10:52 +05:30
  • 8042cb950d SimpleChat: A placeholder system prompt, Use usage msg in code HanishKVC 2024-05-22 18:56:51 +05:30
  • 3ff27efa89 Fix eager tensor memory leak and remove convert.py changes Christian Zhou-Zheng 2024-05-23 18:50:21 -04:00
  • 43321db396 better code style ngxson 2024-05-24 00:27:50 +02:00
  • e0a2d830ca use LOCALAPPDATA for fs_get_cache_directory() ngxson 2024-05-24 00:25:07 +02:00
  • c11fa8c8ed fix missing slash in fs_get_cache_directory() ngxson 2024-05-24 00:23:23 +02:00
  • 8afc0f3784 --hf-repo without --hf-file ngxson 2024-05-24 00:15:00 +02:00
  • d6fe374f9b add gradle file Elton Kola 2024-05-23 17:59:27 -04:00
  • 94dcaba646 fixed line harvestingmoon 2024-05-24 05:27:04 +08:00
  • 0ccf579242 refactor: Apply consistent naming conventions teleprint-me 2024-05-23 17:17:22 -04:00
  • 2712886325 Merge branch 'master' of https://github.com/eltonkola/llama.cpp Elton Kola 2024-05-23 17:01:03 -04:00
  • 9ba6b92c2d chore: Add required vocabulary constants teleprint-me 2024-05-23 16:57:14 -04:00
  • 9814b7f9ab feat: Add custom huggingface hub api teleprint-me 2024-05-23 13:48:20 -04:00
  • ef2c668cae Made Clean andrew ferruolo 2024-05-23 10:58:43 -04:00
  • a1119c6d00 Merge branch 'ggerganov:master' into master Andrew Ferruolo 2024-05-23 10:56:41 -04:00
  • e9a60f648a Remove whitespace Colin 2024-05-23 10:45:16 -04:00
  • 74f33adf5f readme : remove trailing space (#7469) b2986 Georgi Gerganov 2024-05-23 17:43:18 +03:00
  • 1debe72737 ggml : silence UB sanitizer error during iq2_xxs quantization (#0) b2985 Georgi Gerganov 2024-05-23 17:17:43 +03:00
  • 27f788a868 Add checking for mixtrals new tensor naming to convert-hf-to-gguf.py Colin 2024-05-23 10:20:24 -04:00
  • 007489e895 Fix phi3 chat template confusion with zephyr (#7449) b2984 Tristan Druyen 2024-05-23 16:15:15 +02:00
  • 7573b634a7 Update README.md Hongji Zhu 2024-05-23 22:09:41 +08:00
  • a491f45cbc change name in readme caitianchi 2024-05-23 21:44:37 +08:00
  • ec1cea7182 add instructions in readme caitianchi 2024-05-23 21:41:11 +08:00
  • 0480d5faa2 add android readme caitianchi 2024-05-23 21:24:03 +08:00
  • 3574d63690 Fix tests to not expect trimmed messages tristandruyen 2024-05-23 14:55:19 +02:00
  • 85ed87eb14 Remove unneeded message trimming Tristan Druyen 2024-05-23 14:49:46 +02:00
  • a9bbb119f0 Add all phi3 template variants in tests tristandruyen 2024-05-23 14:38:31 +02:00
  • 8b94e799df readme : add Bunny in supported models [no ci] (#7469) Raj Hammeer Singh Hada 2024-05-23 18:00:13 +05:30
  • 3015851c5a llama : add getters for n_threads/n_threads_batch (#7464) b2982 Daniel Bevenius 2024-05-23 14:29:26 +02:00
  • 55ac3b7aea ci : use Pythia models instead of OpenLlama (#7470) b2981 Georgi Gerganov 2024-05-23 15:28:14 +03:00
  • 05e6bc6c55 Apply suggestion Tristan Druyen 2024-05-23 14:20:48 +02:00
  • dacfcebd60 readme : add GPT-NeoX + Pythia to the list of supported models (#7491) Victor Nogueira 2024-05-23 15:12:43 +03:00
  • 2b9190344e add run android for termux in readme caitianchi 2024-05-23 20:11:44 +08:00
  • 21a7e7213d Add GPT-NeoX + Pythia to the list of supported models Victor Nogueira 2024-05-23 15:09:25 +03:00
  • d2bae45546 llama : gptneox arch use F32 attn prec Georgi Gerganov 2024-05-23 15:01:40 +03:00
  • c536fa6ef9 rename caitianchi 2024-05-23 20:00:45 +08:00
  • 037af53bfa Fix phi3 jinja test templates & match by <|end|> tristandruyen 2024-05-23 13:40:53 +02:00
  • 7a49a6f6dc init caitianchi 2024-05-23 19:28:47 +08:00
  • 907c135a94 ci : fix convert outfile name Georgi Gerganov 2024-05-23 14:20:37 +03:00
  • 1fbe80be47 ci : update gg_get_model Georgi Gerganov 2024-05-23 13:57:49 +03:00
  • 8def9633e2 ci : use convert-hf-to-gguf.py Georgi Gerganov 2024-05-23 13:57:34 +03:00
  • d2182dac9d ci : disable q2_k ppl tests Georgi Gerganov 2024-05-23 13:56:49 +03:00
  • d28bfd5ef7 remove ifdef msy-kato 2024-05-21 15:35:48 +09:00
  • 19531ac40a Add SVE support for q4_0_q8_0 q8_0_q8_0 msy-kato 2024-03-06 19:16:05 +09:00
  • 70a23863dc Merge branch 'master' into feat-minicpmv Brian 2024-05-23 20:34:37 +10:00
  • ebd5efeedf Add basic bf16 support to ggml-cuda Justine Tunney 2024-05-23 02:09:25 -07:00
  • 57496b2e17 ci : start using Pythia models over OpenLlama Georgi Gerganov 2024-05-22 19:11:52 +03:00
  • 9b82476ee9 Add missing inference support for GPTNeoXForCausalLM (Pythia and GPT-NeoX base models) (#7461) b2979 fairydreaming 2024-05-23 11:49:53 +02:00
  • a72f59e9da Merge 476d319fde into a61a94e543 Xuan Son Nguyen 2024-05-23 11:45:32 +02:00
  • a61a94e543 llama : rename n_ctx -> cache.size, less confusing (#0) b2978 Georgi Gerganov 2024-05-23 12:38:18 +03:00
  • 7e171de882 Merge branch 'ggerganov:master' into gpt-neox fairydreaming 2024-05-23 10:19:31 +02:00
  • 152da28ae5 labeler.yml: add embedding label detector [no ci] (#7482) Brian 2024-05-23 17:40:43 +10:00
  • fcc8f820e1 llama : whitespace formatting fixes Stanisław Szymczyk 2024-05-23 09:23:49 +02:00
  • d48c88cbd5 ggml : remove ggml_flash_attn and ggml_flash_ff (#7463) b2976 Georgi Gerganov 2024-05-23 10:00:44 +03:00
  • e84b71c2c6 ggml : drop support for QK_K=64 (#7473) Georgi Gerganov 2024-05-23 10:00:21 +03:00
  • 1b1e27cb49 Update vulkan rope implementation to support frequency factors (#7475) b2974 0cc4m 2024-05-23 08:59:59 +02:00
  • fbf777d2b9 main : minor (#7462) b2973 Georgi Gerganov 2024-05-23 09:43:24 +03:00
  • 0f3bf7c96b llama : add comments for clarity, change confusing variable name Stanisław Szymczyk 2024-05-23 08:34:11 +02:00
  • c5fe1d6cdc gguf-py : remove unused import compilade/gguf-py-fix-q-shape Francis Couture-Harpin 2024-05-23 00:09:49 -04:00
  • 60fe62e6eb some renaming Oleksandr Kuvshynov 2024-05-22 23:52:36 -04:00
  • 2ff601fc32 gguf-py : fix and simplify quantized shape round-trip Francis Couture-Harpin 2024-05-22 23:40:41 -04:00
  • 479c80a0db duo: cleanup v2 Oleksandr Kuvshynov 2024-05-22 23:31:23 -04:00
  • eecdd3b0ce duo: first ~working option Oleksandr Kuvshynov 2024-05-22 23:02:31 -04:00
  • 3cff0f2a66 labeler.yml: add embedding label detector [no ci] brian khuu 2024-05-23 12:20:32 +10:00
  • 578a38234c fix json generation, use " not ' Yann Follet 2024-05-23 01:42:10 +00:00
  • 518b75260b cuda uma test sl/cuda-uma slaren 2024-05-23 03:13:48 +02:00
  • 2dd784108b Merge remote-tracking branch 'origin' into convert-split Christian Zhou-Zheng 2024-05-22 20:23:13 -04:00
  • 78d7828cc4 chore: Add prototyped CLI options teleprint-me 2024-05-22 19:59:33 -04:00
  • cd00be886f chore: Add model metadata teleprint-me 2024-05-22 19:59:13 -04:00
  • cd93a28cb1 CUDA: fix FA out-of-bounds reads (#7479) b2972 Johannes Gäßler 2024-05-23 00:31:20 +02:00
  • d76d1465e9 CUDA: fix FA out-of-bounds reads Johannes Gäßler 2024-05-23 00:00:33 +02:00
  • 1957ca41f2 refactor: Simplify BPE pre-tokenizer mapping teleprint-me 2024-05-22 16:57:29 -04:00