Commit graph

  • d8b06c2148 CUBLAS_TF32_TENSOR_OP_MATH is not a macro slaren 2023-12-23 17:57:25 +01:00
  • b7da1ba00e fix hip build slaren 2023-12-23 17:08:29 +01:00
  • 110b5055da add cuda_pool_alloc, refactor most pool allocations slaren 2023-12-23 16:38:43 +01:00
  • 708e179e85 fallback to CPU buffer if host buffer alloc fails (#4610) b1696 slaren 2023-12-23 16:10:51 +01:00
  • 545f23d07b refactor error checking slaren 2023-12-23 15:48:46 +01:00
  • 4c0f300a2c move all caps to g_device_caps slaren 2023-12-23 15:27:49 +01:00
  • 20860daee2 clarify granularity slaren 2023-12-23 13:42:18 +01:00
  • 9452d0d54b fix hip build slaren 2023-12-23 13:40:06 +01:00
  • 1f3ecd8f6c fallback to CPU buffer if host buffer alloc fails slaren 2023-12-23 13:14:35 +01:00
  • 925e5584a0 ci(docker): fix tags in "Build and push docker image (tagged)" (#4603) b1695 Samuel Maynard 2023-12-23 11:35:55 +02:00
  • 6123979952 server : allow to specify custom prompt for penalty calculation (#3727) b1694 Alexey Parfenov 2023-12-23 09:31:49 +00:00
  • b9ec82d262 grammar : check the full vocab only if necessary (opt) (#4306) b1693 kalomaze 2023-12-23 03:27:07 -06:00
  • 88fd22c3fc common : fix final newline Georgi Gerganov 2023-12-23 11:25:34 +02:00
  • b4377eedfd Merge branch 'master' into conditional-grammar-check Georgi Gerganov 2023-12-23 11:23:41 +02:00
  • 380cd9ae47 server : allow to specify custom prompt for penalty calculation ZXED 2023-10-21 11:16:05 +03:00
  • e0a4002273 CUDA: fixed row rounding for 0 tensor splits (#4594) b1692 Johannes Gäßler 2023-12-23 09:16:33 +01:00
  • 71a5afaab5 fixed incorrect localflag Concedo 2023-12-23 11:00:58 +08:00
  • 4a8308b1c8 Merge branch 'master' into concedo_experimental Concedo 2023-12-23 10:40:29 +08:00
  • 8823e8b06d added presence penalty into lite ui Concedo 2023-12-23 10:39:40 +08:00
  • 71f4c96331 added oshmem backend to llama.cpp ct-clmsn 2023-12-22 21:07:26 -05:00
  • e9ad5fe040 update: cicd namtranase 2023-12-23 08:57:39 +07:00
  • 872408cfb7 check for vmm support, disable for hip slaren 2023-12-23 02:29:21 +01:00
  • bd78dc9aee fix cmake build slaren 2023-12-23 02:06:49 +01:00
  • eb223dcddd fix mixtral slaren 2023-12-23 00:34:20 +01:00
  • 0d77fbd774 cuda : improve cuda pool efficiency using virtual memory slaren 2023-12-22 23:46:45 +01:00
  • 52b7385578 Update comment for AdamW implementation reference. Will Findley 2023-12-22 16:14:06 -06:00
  • d0948c9d2f ci(docker): fix tags in "Build and push docker image (tagged)" samm81 2023-12-22 16:29:36 -05:00
  • 2187a8debe update: cicd namtranase 2023-12-23 00:35:47 +07:00
  • 7082d24cec lookup : add prompt lookup decoding example (#4484) b1691 LeonEricsson 2023-12-22 17:05:56 +01:00
  • 50ea1ef7c8 lookup : final touches Georgi Gerganov 2023-12-22 18:04:30 +02:00
  • b814bb217d Merge branch 'master' into concedo_experimental Concedo 2023-12-23 00:01:21 +08:00
  • a600c61da2 fix: remove ggml_repeat namtranase 2023-12-22 22:45:35 +07:00
  • 5627e0b1f7 ggml-alloc : fix ggml_tallocr_is_own Georgi Gerganov 2023-12-22 13:19:55 +02:00
  • f0b2ba2089 cuda : fix im2col_f32_f16 (ggml/#658) leejet 2023-12-19 00:46:10 +08:00
  • 3bca03d26b Merge branch 'master' into concedo_experimental Concedo 2023-12-22 21:39:23 +08:00
  • 4fe9e71206 Fix potential infinite for-loop Bernhard Gstrein 2023-12-22 14:37:56 +01:00
  • 852ca780c9 cherry-picked the Hipblas fixes from PR #571 Concedo 2023-12-22 21:29:20 +08:00
  • a7a4205340 Fix CudaMemcpy direction Henrik Forstén 2023-12-22 14:44:04 +02:00
  • 0b2f601747 Allow compiling on Cygwin g++ divinity76 2023-12-22 13:08:20 +01:00
  • a5f91a335b missing header for strcasecmp divinity76 2023-12-22 12:56:55 +01:00
  • fdf599d7cf Merge branch 'add_gpt2_support' of https://github.com/manikbhandari/llama.cpp into add_gpt2_support manikbhandari 2023-12-22 06:47:48 -05:00
  • 87f35e59be Merge branch 'ggerganov:master' into add_gpt2_support manikbhandari 2023-12-22 11:05:45 -05:00
  • ba66175132 sync : ggml (fix im2col) (#4591) b1690 Georgi Gerganov 2023-12-22 17:53:43 +02:00
  • a55876955b cuda : fix jetson compile error (#4560) b1689 FantasyGmm 2023-12-22 23:11:12 +08:00
  • 6724ef1657 Fix CudaMemcpy direction (#4599) b1688 Henrik Forstén 2023-12-22 15:34:05 +02:00
  • ab614e5061 add gpt2 vocab file manikbhandari 2023-12-22 06:47:37 -05:00
  • 39515c50cc formatting comments and vocab test for gpt2 manikbhandari 2023-12-22 06:43:17 -05:00
  • 48b7ff193e llama : fix platforms without mmap (#4578) b1687 slaren 2023-12-22 12:12:53 +01:00
  • aefae917fd fix win32 error clobber, unnecessary std::string in std::runtime_error slaren 2023-12-22 12:08:12 +01:00
  • 3e58dab79f Update README.md Georgi Gerganov 2023-12-22 13:07:32 +02:00
  • 2640a7434a Update Makefile FantasyGmm 2023-12-22 18:23:33 +08:00
  • 96f2634cf6 Update README.md FantasyGmm 2023-12-22 18:19:50 +08:00
  • fb2821c8c7 Fixed issues per code review Alexander Krivutsenko 2023-12-22 10:39:14 +01:00
  • 1b56724ae3 CUDA: fixed row rounding for 0 tensor splits JohannesGaessler 2023-12-22 10:32:39 +01:00
  • e8fae2db7a Merge branch 'master' of https://github.com/ggerganov/llama.cpp into feature/awq_pr Trần Đức Nam 2023-12-22 16:30:10 +07:00
  • 48b24b170e ggml : add comment about backward GGML_OP_DIAG_MASK_INF (#4203) b1686 Herman Semenov 2023-12-22 09:26:49 +00:00
  • 6fddb574df ggml: add comment with link to discussion of suspicious function param Herman Semenov 2023-12-22 12:16:50 +03:00
  • 9b742c5a52 fix: readme Trần Đức Nam 2023-12-22 15:57:25 +07:00
  • 00f48ade6a fix: common.cpp Trần Đức Nam 2023-12-22 15:55:21 +07:00
  • 66ae732f44 docs: add more details about using oneMKL and oneAPI for intel processors tikikun 2023-12-22 15:54:33 +07:00
  • 4caa9e2cd7 docs: add more details about using oneMKL and oneAPI for intel processors tikikun 2023-12-22 15:54:02 +07:00
  • 08d94752aa docs: add more details about using oneMKL and oneAPI for intel processors tikikun 2023-12-22 15:52:47 +07:00
  • 47f0c7bf85 docs: add more details about using oneMKL and oneAPI for intel processors tikikun 2023-12-22 15:49:33 +07:00
  • 0c35ed7dcf docs: add more details about using oneMKL and oneAPI for intel processors tikikun 2023-12-22 15:48:36 +07:00
  • 440cc2f46f update: change folder architecture Trần Đức Nam 2023-12-22 15:39:24 +07:00
  • b00e2d90f9 fix: readme Trần Đức Nam 2023-12-22 15:18:52 +07:00
  • 6fcdb07773 fix: readme Trần Đức Nam 2023-12-22 15:10:37 +07:00
  • 28cb35a0ec make : add LLAMA_HIP_UMA option (#4587) b1685 Michael Kesper 2023-12-22 09:03:25 +01:00
  • 48cd819e64 update: more detail for mpt Trần Đức Nam 2023-12-22 14:56:09 +07:00
  • 621af1b7d2 llama: add avx vnni information display tikikun 2023-12-22 14:52:01 +07:00
  • 32ca5698f1 ggml: add avx vnni based on intel document tikikun 2023-12-22 14:37:18 +07:00
  • 55370263a0 feat: add avx_vnni based on intel documents tikikun 2023-12-22 14:32:25 +07:00
  • 77463e0e9c batch size improvements Concedo 2023-12-22 15:27:40 +08:00
  • e04b8f0e44 fix: remove code Trần Đức Nam 2023-12-22 14:26:00 +07:00
  • c8facb4fb1 Add LLAMA_HIP_UMA option for Makefile Michael Kesper 2023-12-20 13:08:34 +01:00
  • f31b984898 ci : tag docker image with build number (#4584) b1684 rhuddleston 2023-12-21 23:56:34 -07:00
  • 2bb98279c5 readme : add zig bindings (#4581) Deins 2023-12-22 08:49:54 +02:00
  • 0137ef88ea ggml : extend enum ggml_log_level with GGML_LOG_LEVEL_DEBUG (#4579) b1682 bobqianic 2023-12-22 06:47:01 +00:00
  • 230a638512 Merge branch 'master' into concedo_experimental Concedo 2023-12-22 14:40:13 +08:00
  • ae8a920187 Merge branch 'master' into check-requirements-txt crasm 2023-12-22 01:39:01 -05:00
  • 9986c91684 python: add check-requirements.sh and GitHub workflow crasm 2023-12-22 01:23:46 -05:00
  • c7e9701f86 llama : add ability to cancel model loading (#4462) b1681 crasm 2023-12-22 01:19:36 -05:00
  • 5f2ee1c938 Redo changes for cancelling model load crasm 2023-12-22 01:00:11 -05:00
  • f607e53252 reset to upstream/master crasm 2023-12-22 00:58:32 -05:00
  • bf4d4b501a tag docker image with build number Ryan Huddleston 2023-12-21 20:42:47 -07:00
  • 525ec7e475 update makefile and cuda,fix some issue FantasyGmm 2023-12-22 11:28:19 +08:00
  • e6d10852d5 update cuda marco define FantasyGmm 2023-12-22 11:24:42 +08:00
  • ec5af9e236 Update README.md with zig bindings Deins 2023-12-22 05:23:28 +02:00
  • 3194769235 update jetson detect and cuda version detect FantasyGmm 2023-12-22 11:20:43 +08:00
  • 375003b458 always show reported arch Concedo 2023-12-22 11:15:07 +08:00
  • eb0f775950 cleaned up pointer arithmetic ct-clmsn 2023-12-21 21:20:46 -05:00
  • d05fcad5d1 cleaned up pointer arithmetic; rm'd a member variable of the oshmem context struct ct-clmsn 2023-12-21 19:35:33 -05:00
  • c8d67705fe reduced the number of shmem_calloc calls ct-clmsn 2023-12-21 18:58:46 -05:00
  • e4382571ca Fix merge crasm 2023-12-21 18:54:27 -05:00
  • 6bc7411002 Merge remote-tracking branch 'upstream' into cancel-model-load crasm 2023-12-21 18:35:16 -05:00
  • 0c80635bbe Update ggml.h bobqianic 2023-12-21 23:29:53 +00:00
  • 3f2769bf26 added correct use of shmem_free ct-clmsn 2023-12-21 18:25:22 -05:00
  • ab42a33018 win32 : limit prefetch size to the file size slaren 2023-12-22 00:09:29 +01:00
  • bffb9a7847 llama : fix platforms without mmap slaren 2023-12-21 23:59:54 +01:00
  • 0f8418079c formatting and missed int in llama_token_to_piece marcus 2023-12-21 13:51:19 -08:00