Commit graph

  • 93f8d471c1 opencl alignment size should be converted from bits to bytes Albert Jin 2024-05-05 23:15:49 +08:00
  • 628b299106
    Adding support for the --numa argument for llama-bench. (#7080) b2794 kunnis 2024-05-05 07:17:47 -05:00
  • acb26fd05e
    Update examples/llama-bench/llama-bench.cpp slaren 2024-05-05 14:17:16 +02:00
  • 8f8acc8683
    Disable benchmark on forked repo (#7034) b2793 Sigbjørn Skjæret 2024-05-05 13:38:55 +02:00
  • 589c74d7bf
    update ref checks Sigbjørn Skjæret 2024-05-05 08:42:10 +02:00
  • d4abd657ad Fix flake8 0cc4m 2024-05-05 08:23:24 +02:00
  • 947df47591 Disable mul_mat_id shaders for now 0cc4m 2024-05-05 07:52:44 +02:00
  • 6697303c49 Add softmax with f16 mask and pos buffer support 0cc4m 2024-05-05 07:48:45 +02:00
  • f3dc70402c Disable MoE code (not ready yet), fix a number of bugs in shaders and Vulkan code 0cc4m 2024-05-05 07:41:30 +02:00
  • ca36326020
    readme : add note that LLaMA 3 is not supported with convert.py (#7065) Lyle Dean 2024-05-05 06:21:46 +01:00
  • 889bdd7686
    command-r : add BPE pre-tokenization (#7063) b2791 DAN™ 2024-05-05 01:19:30 -04:00
  • 6fbd432211
    py : logging and flake8 suppression refactoring (#7081) Brian 2024-05-05 15:07:48 +10:00
  • f5806b2d09
    command-r : add individual digits regex Georgi Gerganov 2024-05-05 07:46:55 +03:00
  • b964097b6f *.py: Flake8 refactoring and logging cleanup. Set one as executable and add basicConfig() to another. Also added noqa tag to test scripts. brian khuu 2024-05-04 15:15:45 +10:00
  • 98db4347e8 convert-hf : remove einops requirement for InternLM2 Francis Couture-Harpin 2024-05-04 16:52:06 -04:00
  • 0c3833286e convert-hf : flake8 doesn't like lowercase L as a variable name Francis Couture-Harpin 2024-05-04 10:48:18 -04:00
  • f09674fbbd convert-hf : save memory with lazy evaluation Francis Couture-Harpin 2024-05-03 22:00:05 -04:00
  • 215a0d38c8 convert-hf : fix Refact conversion Francis Couture-Harpin 2024-05-04 23:55:42 -04:00
  • 1b09493378 Adding support for the --numa argument for benchmarking. Kunnis 2024-05-04 22:12:17 -05:00
  • e3cd5527cc flake.lock: Update github-actions[bot] 2024-05-05 00:17:54 +00:00
  • edf375d26f Restore BOM jaime-m-p 2024-05-05 01:58:34 +02:00
  • 67832e5554 llama3 custom regex split: fix \s jaime-m-p 2024-05-05 01:20:23 +02:00
  • 8fd849eb90 Unicode tables: separator, lowercase, uppercase and whitespace jaime-m-p 2024-05-05 01:19:20 +02:00
  • 78214ac56b
    fix: use vm_allocate only on macOS Gilad S 2024-05-05 02:13:54 +03:00
  • a92efecb86
    fix: don't call newBufferWithBytesNoCopy with NULL when ggml_metal_host_malloc returns NULL Gilad S 2024-05-05 01:56:36 +03:00
  • bfa4daea4e
    fix: use vm_allocate instead of posix_memalign Gilad S 2024-05-05 01:50:02 +03:00
  • 69a49ac3a1 Fix merge jaime-m-p 2024-05-05 00:42:44 +02:00
  • 4bcd1d8a74
    test-- Sigbjørn Skjæret 2024-05-05 00:05:35 +02:00
  • 94ebaae022
    test++ Sigbjørn Skjæret 2024-05-05 00:04:40 +02:00
  • 840986bd4a
    event_name is pull_request_target Sigbjørn Skjæret 2024-05-05 00:01:59 +02:00
  • a53e51796e
    fix: typo Gilad S 2024-05-05 00:49:24 +03:00
  • 06e9307ad6
    test-- Sigbjørn Skjæret 2024-05-04 23:41:25 +02:00
  • ab57af872d
    test++ Sigbjørn Skjæret 2024-05-04 23:39:00 +02:00
  • c3d9d7040a
    correct github.event usage Sigbjørn Skjæret 2024-05-04 23:35:32 +02:00
  • 5e8ca26f8b
    remove test condition Sigbjørn Skjæret 2024-05-04 23:34:16 +02:00
  • dfb33e8f7a
    correct github.event usage Sigbjørn Skjæret 2024-05-04 23:30:47 +02:00
  • 571dca5715
    fix: use malloc instead of posix_memalign in ggml-metal.m to make it not crash Electron processes Gilad S 2024-05-05 00:06:50 +03:00
  • acd1d89124
    this is driving me crazy Sigbjørn Skjæret 2024-05-04 23:00:01 +02:00
  • 3a394659f3
    test-- Sigbjørn Skjæret 2024-05-04 22:48:39 +02:00
  • fb7304b275
    do debug where we can get logs Sigbjørn Skjæret 2024-05-04 22:45:11 +02:00
  • 8122c9484c
    test++ Sigbjørn Skjæret 2024-05-04 22:30:41 +02:00
  • 29c30faffa
    remove debug Sigbjørn Skjæret 2024-05-04 22:29:30 +02:00
  • f15f9d9ee4
    test-- Sigbjørn Skjæret 2024-05-04 22:24:59 +02:00
  • 86394bf8e9
    enable actions debug Sigbjørn Skjæret 2024-05-04 22:24:23 +02:00
  • bde2fc7dc2
    test++ Sigbjørn Skjæret 2024-05-04 21:44:44 +02:00
  • 158215c828
    add progress bar Sigbjørn Skjæret 2024-05-04 20:29:40 +02:00
  • 1c533b99d5
    Merge ca0409fae4 into 842500144e ManniX-ITA 2024-05-05 02:25:38 +08:00
  • d39f20359e
    typing++ Sigbjørn Skjæret 2024-05-04 20:19:50 +02:00
  • 2d4be61694
    style++ Sigbjørn Skjæret 2024-05-04 20:05:50 +02:00
  • c4ae6c1cc9
    ternary won't work Sigbjørn Skjæret 2024-05-04 20:02:17 +02:00
  • ffdf22430f
    Further tidy on Android instructions README.md Jeximo 2024-05-04 15:00:10 -03:00
  • f006b5ca5e
    more readable as multi-line Sigbjørn Skjæret 2024-05-04 19:28:45 +02:00
  • 22fddbb625 add_special option for server tokenize endpoint Johan Aires Rastén 2024-05-03 14:45:53 +02:00
  • 842500144e
    gguf-split: add --no-tensor-first-split (#7072) b2789 Xuan Son Nguyen 2024-05-04 18:56:22 +02:00
  • cf768b7e71
    Tidy Android Instructions README.md (#7016) Jeximo 2024-05-04 13:10:15 -03:00
  • c3fa382071
    Update README.md Jeximo 2024-05-04 12:34:44 -03:00
  • 68e8732f04
    remove Fdroid reference, link directly to Termux Jeximo 2024-05-04 12:26:50 -03:00
  • 798b576c06 Merge remote-tracking branch 'upstream/master' into gg/bpe-preprocess jaime-m-p 2024-05-04 16:59:24 +02:00
  • aa814faf61 Merge branch 'master' into easier-vocab-conversion 20kdc 2024-05-04 15:47:32 +01:00
  • fcd84a0f5a
    Fix Linux /sys cpu path to guess number of cores (#7064) b2787 viric 2024-05-04 15:26:53 +02:00
  • f2099c50ab convert-hf : align the message logged for converted tensors Francis Couture-Harpin 2024-05-04 09:09:47 -04:00
  • 52d2cac4f8 gguf-split: add --no-tensor-first-split ngxson 2024-05-04 14:57:30 +02:00
  • b259634be6
    check owner on push also Sigbjørn Skjæret 2024-05-04 14:26:17 +02:00
  • 20157cfd80 Bump transformers convert requirement. DAN™ 2024-05-04 06:43:09 -04:00
  • d5d67316e6 Add BPE pre-tokenization for Command-R/R+. DAN™ 2024-05-03 07:49:43 -04:00
  • d686299124
    only check owner on schedule event Sigbjørn Skjæret 2024-05-04 12:08:38 +02:00
  • 03fb8a002d
    If first token generated from the server is the stop word the server will crash (#7038) b2786 maor-ps 2024-05-04 12:06:40 +03:00
  • 3098206b00 Further work towards MoE, disabled for now 0cc4m 2024-05-04 09:27:32 +02:00
  • 92139b90af
    tests : add test-tokenizer-0.sh + fix some tokenizers (#7036) b2785 Georgi Gerganov 2024-05-04 08:32:32 +03:00
  • 7e11d409fa
    phi-3 : update Georgi Gerganov 2024-05-04 08:29:22 +03:00
  • 3f96d538c0 two converter py files needs basicConfig() added brian khuu 2024-05-04 15:15:45 +10:00
  • f19b45cbfd
    unicode : digit -> number Georgi Gerganov 2024-05-04 08:21:04 +03:00
  • 5f30e30a59
    lint : fix Georgi Gerganov 2024-05-04 08:10:44 +03:00
  • d974aed567
    convert : print -> logging Georgi Gerganov 2024-05-04 07:58:39 +03:00
  • 26f606efed
    Merge branch 'master' into gg/add-tokenizer-test-script Georgi Gerganov 2024-05-04 07:51:42 +03:00
  • 98f2d0e0d7 convert-hf : more consistent formatting of cmdline args Francis Couture-Harpin 2024-05-03 22:04:31 -04:00
  • 4d441e4acf wip: fixing unicode codepoint ranges jaime-m-p 2024-05-04 01:36:13 +02:00
  • 3e3e2838a1 Add bruteforce random tests for token encoding jaime-m-p 2024-05-04 01:34:36 +02:00
  • 7038442a41
    Update README.md Lyle Dean 2024-05-03 21:36:14 +01:00
  • cedea33f11 Fix Linux /sys cpu path to guess number of cores Lluís Batlle i Rossell 2024-05-03 22:28:52 +02:00
  • 3e5e0dced5 Merge branch 'master' into compilade/convert-hf-refactor Francis Couture-Harpin 2024-05-03 16:20:54 -04:00
  • d4cbccb103 main : skip printing token healing prefix twice mare5x 2024-05-03 21:56:11 +02:00
  • a2ac89d6ef
    convert.py : add python logging instead of print() (#6511) b2784 Brian 2024-05-04 05:36:41 +10:00
  • 9745cf885f
    refact : add tests files Georgi Gerganov 2024-05-03 21:42:10 +03:00
  • bc26eb75f0
    tests : disable failing tests Georgi Gerganov 2024-05-03 21:38:33 +03:00
  • c30056a700
    lint : fix Georgi Gerganov 2024-05-03 21:34:18 +03:00
  • 7d0cc78bc3 main : better token healing support for interactive mode mare5x 2024-05-03 19:50:00 +02:00
  • 624a689b68
    delete reference to Android API Jeximo 2024-05-03 11:27:45 -03:00
  • d53240ccc2
    refact : add tokenizer model Georgi Gerganov 2024-05-03 17:27:12 +03:00
  • cd7c728a66
    unicode : regenerate unicode tables Georgi Gerganov 2024-05-03 17:08:18 +03:00
  • 433def286e
    llama : rename ctx to user_data in progress_callback (#7045) b2783 Daniel Bevenius 2024-05-03 15:24:30 +02:00
  • bcda3403da
    squash! llama : rename ctx to user_data in progress_callback Daniel Bevenius 2024-05-03 15:12:14 +02:00
  • 784d08e9f2
    correct grammar Jeximo 2024-05-03 09:54:14 -03:00
  • 868bb32970
    correct style Jeximo 2024-05-03 09:53:01 -03:00
  • 951b6593b2 main : first attempt at token healing in main mare5x 2024-05-03 13:50:31 +02:00
  • 52d0567402 *.py: add compilade warning suggestions and style fixes brian khuu 2024-05-03 12:35:14 +10:00
  • 36bff51a7a fix tokenizer.json tokenizer_config.json cpu() Achazwl 2024-05-03 10:06:36 +08:00
  • e65ab647cd
    Update README.md l3utterfly 2024-05-03 10:56:57 +09:00
  • 93af09a030
    Merge branch 'master' into master Paulo de Castro 2024-05-02 22:51:02 -03:00
  • a772cde9dc add test Paulo 2024-05-02 21:30:53 -03:00