Commit graph

1391 commits

Author SHA1 Message Date
M. Yusuf Sarıgöz
0409ae00b6 are you happy editorconfig? 2023-10-11 08:21:29 +03:00
M. Yusuf Sarıgöz
ab2158796f Check if apples are compared to apples 2023-10-11 08:15:51 +03:00
M. Yusuf Sarıgöz
f1564bb2eb Merge branch 'master' into llava 2023-10-11 06:59:37 +03:00
M. Yusuf Sarıgöz
587bde8e0c Maybe seed is unlucky? 2023-10-11 06:40:52 +03:00
Galunid
9f6ede19f3
Add MPT model to supported models in README.md (#3574) 2023-10-10 19:02:49 -04:00
goerch
233fc1c69f
Minor improvements in GPT2 tokenizer (#3567)
* Fixing minor bugs in bpe_gpt2_preprocess

* Don't add bos token in test
2023-10-10 18:59:52 +02:00
Xingchen Song(宋星辰)
c5b49360d0
readme : add bloom (#3570) 2023-10-10 19:28:50 +03:00
Xingchen Song(宋星辰)
02d2875def
llm : add bloom models (#3553)
* feat: Support bloom models

* fix(bloom): fix model size

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-10 17:48:21 +03:00
Jhen-Jie Hong
0aa6595ae0
swift : improvements and fixes (#3564)
* swift : use macOS 12 as minimum requirement

* swift : add missing ggml-backend.c source

* swift : add -O3 -DNDEBUG unsafe flags
2023-10-10 14:31:13 +03:00
M. Yusuf Sarıgöz
d640aae755 add support for 13b model variant 2023-10-10 13:02:24 +03:00
Jan Ploski
f5f9121de1
llm : add MPT support (#3417)
* CUDA: added support for ggml_clamp (see also: https://github.com/ggerganov/ggml/issues/545)

* mpt : added an implementation based (mostly) on falcon integration, modified with deltas from ggml/examples/mpt

* mpt : protect against "clip_qkv": null in mpt-7b

* mpt : quick fix to avoid "Strange model" warning when quantizing MPT models

* mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out from metadata rather than use 0.0 to indicate "no clamping" (more compliant with the current GGUF spec?)

* mpt : standardized all tensor names to follow GGUF spec

* mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GET_KEY macro instead of duplicate code

* mpt : fixed comment s/gptneox/mpt/

* mpt : remove tabs, trailing whitespace

* mpt : removed ne01 + n_past == ne00 assertion from alibi (cuda/f32) and rope_shift from build_mpt

* mpt : updated convert-mpt-hf-to-gguf.py to reflect changes made to convert-gptneox-hf-to-gguf.py in pr:3252

* comment out n_past instead of marking it unused

* mpt : removed hardcoded +178 from convert script in favor of utilizing hparams["vocab_size"]

* mpt : remove unused tokenizer_json in convert script

* ggml : remove obsolete n_past assert in ggml_alibi

* llama : print clamp_kqv and max_alibi_bias hparams

---------

Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-10 10:50:23 +03:00
vvhg1
11ea5c7d96
infill : fix tokenization (#3508)
* infill tokens correction

* server infill tokens correction

* removing any leading whitespace from infill suffix and removing leading space token from suffix when params.escape

* removing any leading whitespace from infill suffix and removing leading space token from suffix when params.escape

* only rm when params.escape, rm space if possible which is added back or rm added space token

* only rm when params.escape, rm space if possible which is added back or rm added space token

* Revert "only rm when params.escape, rm space if possible which is added back or rm added space token"

This reverts commit 63ba0b621f.

* fix interactive prompt escaping and fix server infill leading space handling

* rm unnecessary bool check
2023-10-10 10:31:21 +03:00
M. Yusuf Sarıgöz
96171de5ef add llava target to Makefile 2023-10-10 01:50:02 +03:00
M. Yusuf Sarıgöz
5009ae90ef Handle cases where image file does not exist 2023-10-10 01:49:35 +03:00
M. Yusuf Sarıgöz
ae01c859e5 gitignore /llava 2023-10-10 01:13:12 +03:00
M. Yusuf Sarıgöz
d75a0315f0 are you happy editorconfig? 2023-10-09 23:56:07 +03:00
M. Yusuf Sarıgöz
325d240061 introduce pad-to-square mode for non-square images 2023-10-09 23:53:29 +03:00
M. Yusuf Sarıgöz
4759bfd64c fix: rm designated initializers 2023-10-09 15:54:55 +03:00
slaren
95bd60a0a6
ggml-alloc : fix assert in debug builds (#3555) 2023-10-09 15:44:58 +03:00
M. Yusuf Sarıgöz
d78e816365 rm unused import 2023-10-09 14:44:35 +03:00
Georgi Gerganov
fcca0a7004
refact : fix convert script + zero out KV cache to avoid nans (#3523)
* refact : fix convert script + zero out KV cache to avoid nans

* ggml : silu(-inf) should never happen

* metal : assert various kernel requirements
2023-10-09 14:32:17 +03:00
Georgi Gerganov
dcc09d2596
metal : do not use mul_mm kernels when ne00 < 64 (#3542) 2023-10-09 14:28:27 +03:00
M. Yusuf Sarıgöz
8278a7364a rm unused batch image preprocessing 2023-10-09 14:22:18 +03:00
M. Yusuf Sarıgöz
9b0ec4d2cc Are you happy editorconfig? 2023-10-09 13:42:04 +03:00
M. Yusuf Sarıgöz
54495c9474 Some cleanup 2023-10-09 13:38:48 +03:00
M. Yusuf Sarıgöz
8af7e2103c Update readme 2023-10-09 11:10:09 +03:00
M. Yusuf Sarıgöz
444dbce888 Add readme 2023-10-09 09:47:56 +03:00
Georgi Gerganov
db3abcc114
sync : ggml (ggml-backend) (#3548)
* sync : ggml (ggml-backend)

ggml-ci

* zig : add ggml-backend to the build
2023-10-08 20:19:14 +03:00
Matheus C. França
eee42c670e
ci : add Zig CI/CD and fix build (#2996)
* zig CI/CD and fix build

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>

* fix build_compiler

* ci : remove trailing whitespace

---------

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 16:59:20 +03:00
M. Yusuf Sarıgöz
2a04d0b5a1 Merge branch 'master' into llava 2023-10-08 15:40:39 +03:00
M. Yusuf Sarıgöz
95da79e740 fix: trailing whitespace 2023-10-08 15:38:47 +03:00
M. Yusuf Sarıgöz
204d08be3d fix: new line at EoF 2023-10-08 15:24:13 +03:00
M. Yusuf Sarıgöz
0c2bd79781 fix: crlf -> lf 2023-10-08 15:20:39 +03:00
M. Yusuf Sarıgöz
94eeac358a Use ggml_allocr + rm unnecessary code 2023-10-08 14:58:47 +03:00
Ryder Wishart
8e6716a102
api_like_OAI.py : compat with Microsoft Guidance (#2746)
Check for None in addition to empty string check in all request params

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 13:55:58 +03:00
arcrank
9c38d181d4
api_like_OAI.py : simplify function (#2796)
Simplify function
2023-10-08 13:52:57 +03:00
Johannes Rudolph
a1202a31ed
k-quants : fix comments about block sizing (#3499) 2023-10-08 13:21:19 +03:00
Georgi Gerganov
94e502dfb7
ci : enable on obj-c changes + fix metal build (#3540) 2023-10-08 11:24:50 +03:00
Luo Tian
7d8b24932f
zig : fix build by introducing train.cpp (#3539) 2023-10-08 11:24:01 +03:00
Georgi Gerganov
b0ec5218c3
metal : support MTLGPUFamily < Apple7, formatting, style (#3524)
* metal : improve decoding speed for batches of 2-16

* metal : rename kernels mul_mat_ to mul_mv_

* metal : indentations

* minor

* metal : print more GPU info + disable mul_mm for MTLGPUFamily < Apple7
2023-10-08 10:01:53 +03:00
Kerfuffle
63d3b06a43
llama : fix missing break in Persimmon arch case statements (#3535) 2023-10-08 08:22:17 +03:00
M. Yusuf Sarıgöz
8690f425ec LLaVA is working e2e, needs to optimize memory allocation + cleanup 2023-10-08 01:15:13 +03:00
Kerfuffle
a16e89cec8
Fix trying to strip newline from empty prompt and cfg prompt file content (#3534) 2023-10-07 15:31:41 -06:00
M. Yusuf Sarıgöz
4d03833211
gguf.py : fix CI for publishing GGUF package (#3532)
* Fix CI for publishing GGUF package

* Bump version

* fix

* bump version

* bump version

* bump version
2023-10-07 22:14:10 +03:00
Tom C
c47066d833
py : change version of numpy requirement to 1.24.4 (#3515)
Co-authored-by: Lyjia <me@lyjia.us>
2023-10-07 12:56:15 +03:00
cebtenzzre
f1782c68de
quantize : fail fast on write errors (#3521) 2023-10-07 11:41:52 +03:00
Jhen-Jie Hong
c26765a0a1
metal : support default.metallib load & reuse code for swift package (#3522)
* metal : support load default.metallib & reuse code for swift package

* metal : use SWIFT_PACKAGE def instead of define GGML_SWIFT
2023-10-07 11:40:27 +03:00
Phillip Kravtsov
0e797c2fc5
llm : support Adept Persimmon 8B (#3410)
* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .safetensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci
2023-10-07 10:12:43 +03:00
goerch
3a716b4dae
Fix for #3454 (#3455)
Fix: `sentencepiece` tokenizers with added tokens failed with an incorrect assertion
2023-10-07 06:57:01 +02:00
BarfingLemurs
1faaae8c2b
readme : update models, cuda + ppl instructions (#3510) 2023-10-06 22:13:36 +03:00