Commit graph

  • dd30219332 buffer incomplete multi-byte characters anon 2023-05-31 10:40:42 -03:00
  • 27911d6d68 fix default model alias anon 2023-05-31 10:37:52 -03:00
  • aa2bbb2d35 fix parameter type anon 2023-05-31 10:36:51 -03:00
  • f1710b90dc add infinite generation when n_predict is -1 anon 2023-05-31 10:35:25 -03:00
  • 284bc293b1 reserve memory for generated_text anon 2023-05-31 10:46:06 -03:00
  • 446e42a8c6 change dmmv block size Concedo 2023-05-31 21:40:12 +08:00
  • 83a34444af remove trailing whitespace xaedes 2023-05-31 15:02:38 +02:00
  • 01fc3faf71 add explicit cast to fix compile error xaedes 2023-05-31 15:00:54 +02:00
  • 2c08f29691 make api server use only a single thread anon 2023-05-31 09:02:32 -03:00
  • c1cbde82a1 print error when server can't bind to the interface anon 2023-05-31 00:00:56 -03:00
  • f88fb2bdc5 add #include <climits> xaedes 2023-05-31 12:38:26 +02:00
  • 077ee4e989 Revert "Revert "opencl : no need to allocate cl_mem on heap (#1612)"" Concedo 2023-05-31 18:00:52 +08:00
  • 50c85bea4c Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-31 17:53:14 +08:00
  • 32dada5e5f updated lite Concedo 2023-05-31 17:52:09 +08:00
  • 5e1eecfe12 Adapt to #1612 cl_mem malloc changes 0cc4m 2023-05-31 07:07:47 +02:00
  • 49aaf08387 Merge remote-tracking branch 'origin/master' into opencl-dev 0cc4m 2023-05-31 06:58:51 +02:00
  • a5a85d68c6 Merge branch 'master' into concedo_experimental Concedo 2023-05-31 10:51:54 +08:00
  • 85c9f7df41 Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-31 10:20:32 +08:00
  • 4afa38e744 Revert "opencl : no need to allocate cl_mem on heap (#1612)" Concedo 2023-05-31 10:20:23 +08:00
  • 9f2424ac47 Merge pull request #5 from anon998/stop-stream Randall Fitzgerald 2023-05-30 22:16:32 -04:00
  • 3a079d5cc8 stop generating when the stream is closed anon 2023-05-30 23:12:00 -03:00
  • 7a8104fbd2 add missing quote when printing stopping strings anon 2023-05-30 23:11:32 -03:00
  • b6f536dfb3 Cull to end of generated_text when encountering a stopping string in case it's a partial token. digiwombat 2023-05-30 21:14:24 -04:00
  • 9197674a6b Merge pull request #4 from anon998/logging Randall Fitzgerald 2023-05-30 20:58:18 -04:00
  • aa0788b650 add --verbose flag and request logging anon 2023-05-30 21:41:55 -03:00
  • 7a853dc56d prevent the server from swallowing exceptions in debug mode anon 2023-05-30 21:39:30 -03:00
  • e6de69abfb Merge pull request #3 from anon998/sse Randall Fitzgerald 2023-05-30 20:36:52 -04:00
  • 2533878b79 Merge branch 'master' into sse Randall Fitzgerald 2023-05-30 20:34:48 -04:00
  • a25f830fe1 Default streaming to false if it's not set in the request body. digiwombat 2023-05-30 20:17:18 -04:00
  • 38eaf2b7f7 Removed testing fprintf calls. digiwombat 2023-05-30 19:48:43 -04:00
  • 3292f057dc Changed to single API endpoint for streaming and non. digiwombat 2023-05-30 19:44:16 -04:00
  • d6fff56e22 add streaming via server-sent events anon 2023-05-30 19:33:33 -03:00
  • 7f172c1070 replace auto parameters in lambda function xaedes 2023-05-31 00:25:37 +02:00
  • b2fd06c6aa mtl : working mul_mat q4 Georgi Gerganov 2023-05-30 23:06:49 +03:00
  • 03ea8f013a Fix for the regen issue. digiwombat 2023-05-30 15:48:55 -04:00
  • 29bec00ba0 mtl : another mul_mat Q4 (still does not work) Georgi Gerganov 2023-05-30 22:31:07 +03:00
  • 96d005225f mtl : mul_mat fixes (still wrong) Georgi Gerganov 2023-05-30 22:13:43 +03:00
  • 2a24994bad mtl : initial mul_mat Q4 kernel (wrong results) Georgi Gerganov 2023-05-30 22:02:54 +03:00
  • ffb06a345e OpenLLaMA 3B support (#1588) master-ffb06a3 Henri Vasserman 2023-05-30 21:24:22 +03:00
  • ac6b49ed45 Reduce queueing overhead for contiguous tensors by using single mul kernel call 0cc4m 2023-05-30 18:49:53 +02:00
  • 64afc0b53a mtl : add mul kernel + confirm working Georgi Gerganov 2023-05-30 19:15:38 +03:00
  • 72256ebd2b mtl : add rms_norm kernel + confirm working Georgi Gerganov 2023-05-30 19:03:04 +03:00
  • 794704e409 mtl : confirmed get_rows_q4_0 is working correctly Georgi Gerganov 2023-05-30 18:41:21 +03:00
  • 8fd8599f61 rename baby-llama-text to train-text-from-scratch xaedes 2023-05-30 17:07:03 +02:00
  • 21b11b55d4 remove python bindings xaedes 2023-05-30 17:03:09 +02:00
  • a5317498c2 Merge branch 'master' into text-from-scratch xaedes 2023-05-30 16:57:17 +02:00
  • 56456797f4 Merge branch 'master' into concedo_experimental Concedo 2023-05-30 22:15:58 +08:00
  • 1074a81e81 add train params to specify memory size xaedes 2023-05-30 16:06:20 +02:00
  • ad966da955 remove unnecessary comments xaedes 2023-05-30 15:58:22 +02:00
  • ec8e262d1d add train_params and command line option parser xaedes 2023-05-30 15:53:55 +02:00
  • 2e84ad53ca remove convert.py Henri Vasserman 2023-05-30 16:42:11 +03:00
  • fcbc4457d6 add option to train with flash attention and move options to the top of the main function xaedes 2023-05-30 13:17:58 +02:00
  • 70c08318af test flash attention backward pass xaedes 2023-05-29 23:51:40 +02:00
  • 38560b6d51 bugfixes for backward pass of flash attention xaedes 2023-05-29 23:45:58 +02:00
  • a8fd9dc128 mtl : initial get_rows_q4_0 kernel Georgi Gerganov 2023-05-29 23:12:19 +03:00
  • 22a7279ffb implement backward pass of flash attention xaedes 2023-05-29 22:00:40 +02:00
  • 248a8c3379 mtl : move MSL code into separate file for easy editing Georgi Gerganov 2023-05-29 22:26:40 +03:00
  • 897d6d8e8f mtl : export just a small part of the graph for now to make it easier Georgi Gerganov 2023-05-29 21:40:05 +03:00
  • a792cbd0fc mtl : no need for mtl-export tool, add cli arg for main instead Georgi Gerganov 2023-05-29 21:28:59 +03:00
  • b23fe8c9c7 mtl : adapt the MNIST example as starter Georgi Gerganov 2023-05-29 21:09:47 +03:00
  • 98c267fc77 ci : disable temporary Georgi Gerganov 2023-05-29 20:57:24 +03:00
  • f85020b19a mtl : export the LLaMA computation graph Georgi Gerganov 2023-05-29 20:49:24 +03:00
  • 062dc6c747 Replacing call to convert-pth-to-ggml.py with convert.py Jiri Podivin 2023-05-29 19:02:45 +02:00
  • 7552ac5863 ggml : sync cgraph import / export API master-7552ac5 Georgi Gerganov 2023-05-29 19:31:44 +03:00
  • 5d1830b99d ggml : fix bug in ggml_alibi Georgi Gerganov 2023-05-29 19:30:49 +03:00
  • 7a55593af4 main: add the possibility to open the prompt cache read-only Willy Tarreau 2023-05-29 14:53:00 +02:00
  • ea336bfa33 rwkv eos Concedo 2023-05-29 22:40:27 +08:00
  • 6b3373cb81 revert bad fix Concedo 2023-05-29 22:06:12 +08:00
  • 248367605e Work around for recalculating logits in cached prompts (Fixes #1585) (#1609) master-2483676 DannyDaemonic 2023-05-29 05:13:40 -07:00
  • ef16d09a51 fix for older gcc, updated lite Concedo 2023-05-29 18:54:15 +08:00
  • 44c83c6eba Merge remote-tracking branch 'upstream/master' into cached-logits-bandaid Danny Daemonic 2023-05-29 02:57:57 -07:00
  • 3a73ebe8d2 Merge branch 'master' into concedo_experimental Concedo 2023-05-29 16:47:32 +08:00
  • 254a9ff12c Merge commit 'ebc5d0651a' into concedo_experimental Concedo 2023-05-29 16:26:24 +08:00
  • 30ff1133f5 allow users to rename models for use in horde Concedo 2023-05-29 16:01:05 +08:00
  • 97b39f875c fixed fstat64 build error on mac Concedo 2023-05-29 15:50:07 +08:00
  • 0773028d52 1) make gpt_params_parse able to skip over some predefined unknown args so the gpt_params_parse function can be reused 2) fixed the grpc server error Liu Ming 2023-05-29 14:07:13 +08:00
  • 0e730dd23b Adding git in container package dependencies (#1621) Jiří Podivín 2023-05-29 06:45:50 +02:00
  • 96165b1201 pick from master changhz 2023-05-28 23:47:42 -04:00
  • 530eb57fe4 fix the error of no ending Liu Ming 2023-05-29 08:37:34 +08:00
  • 56895e28f6
    get vocabulary for exporting training checkpoint to llama compatible model file xaedes 2023-05-29 02:25:18 +02:00
  • 4b81c32d5b
    add export of training checkpoint to llama compatible model file xaedes 2023-05-29 01:27:09 +02:00
  • 2da5c8cf24
    set default model.type for unknown models with few layers xaedes 2023-05-29 01:20:55 +02:00
  • bf4d9b3b81
    add llama_get_vocab to get the vocabulary as output parameters xaedes 2023-05-29 01:20:26 +02:00
  • 42cf4d8433
    Merge branch 'master' into master Henri Vasserman 2023-05-29 01:05:19 +03:00
  • 33b6957177 Fixed failing to return result on stopping token. digiwombat 2023-05-28 16:45:05 -04:00
  • 89475fb320 slightly improve how cross entropy loss is computed xaedes 2023-05-28 22:40:58 +02:00
  • 5f5aa20078 remove trailing whitespace xaedes 2023-05-28 22:00:56 +02:00
  • 1fbd19abe1 use ggml_cross_entropy_loss in text training example xaedes 2023-05-28 22:00:26 +02:00
  • f056a04a80 add tests for cross_entropy_loss backward pass xaedes 2023-05-28 21:59:17 +02:00
  • 71aaf8dedf add ggml_cross_entropy_loss with backward pass for faster training xaedes 2023-05-28 21:57:38 +02:00
  • 3b126f654f LLAMA_DEBUG adds debug symbols (#1617) master-3b126f6 Johannes Gäßler 2023-05-28 21:01:02 +02:00
  • 6c58f64a3b --ctx_size flag to --ctx-size to match common.cpp digiwombat 2023-05-28 14:17:36 -04:00
  • b38d41ef52 --memory_f32 flag to --memory-f32 to match common.cpp digiwombat 2023-05-28 13:58:25 -04:00
  • 655899db89 Add ignore_eos option to generation settings. digiwombat 2023-05-28 13:49:45 -04:00
  • 1b78ed2081 Only show -ngl option when relevant + other doc/arg handling updates (#1625) master-1b78ed2 Kerfuffle 2023-05-28 11:48:57 -06:00
  • 337aea1139 examples : add --alias option to gpt_params to set use friendly model name (#1614) master-337aea1 Vladimir Zorin 2023-05-28 20:14:24 +03:00
  • bb051d9723 opencl : no need to allocate cl_mem on heap (#1612) master-bb051d9 Howard Su 2023-05-29 01:13:36 +08:00
  • ca74884f66 opencl : use strstr to check if fp16 supported (#1611) master-ca74884 Howard Su 2023-05-29 01:09:56 +08:00
  • 2c9ee7a052 Apply suggestions from code review Randall Fitzgerald 2023-05-28 09:34:11 -07:00
  • 74c6f36bf1 Editorconfig suggested fixes Henri Vasserman 2023-05-28 19:19:34 +03:00