Commit graph

1573 commits

Author SHA1 Message Date
mike dupont
70f836e267 demonstrate crash 2023-11-24 09:16:51 -05:00
mike dupont
ebea708561 adding new header for llama internal 2023-11-24 08:31:12 -05:00
mike dupont
e34fffc77b now has a model 2023-11-23 16:32:46 -05:00
mike dupont
8a8859ced4 update 2023-11-23 15:16:41 -05:00
mike dupont
90d6f11f66 refl now working, not on pointers but on the types 2023-11-23 14:08:17 -05:00
mike dupont
df647db611 bindings 2023-11-23 09:55:19 -05:00
mike dupont
a08640c00d adding binding generator 2023-11-23 09:55:07 -05:00
mike dupont
507c0674c1 adding the print module with type information
the first type names are being printed
2023-11-23 07:09:34 -05:00
mike dupont
c683f2c76a debugging 2023-11-22 17:43:16 -05:00
mike dupont
b598cf84fa compiling and running 2023-11-22 16:46:32 -05:00
mike dupont
09a1f053e7 now working 2023-11-22 12:01:26 -05:00
mike dupont
ef4c0f572b moving to using refl-cpp for llama as well 2023-11-22 11:40:25 -05:00
mike dupont
6fd690fae7 running 2023-11-22 09:04:00 -05:00
mike dupont
6f8adf99d5 simplify 2023-11-21 21:12:56 -05:00
mike dupont
9c33f53e9a working 2023-11-21 20:52:29 -05:00
mike dupont
9b7a296452 update 2023-11-21 20:38:10 -05:00
mike dupont
0b63b52ece update 2023-11-21 20:11:05 -05:00
mike dupont
f3f829a0d7 improvement 2023-11-21 20:03:25 -05:00
mike dupont
f1558ab38f diff 2023-11-21 19:57:47 -05:00
mike dupont
3a916678e3 update 2023-11-21 19:19:49 -05:00
mike dupont
5432cc18da adding debug notes 2023-11-21 15:11:18 -05:00
mike dupont
22359f7afe now it might even run 2023-11-21 14:48:47 -05:00
mike dupont
ee9b0bceeb rebased and trimmed down
now compiling again,
2023-11-21 13:04:31 -05:00
Galunid
8e672efe63
stablelm : simplify + speedup generation (#4153) 2023-11-21 16:22:30 +01:00
Galunid
0b871f1a04
finetune - update readme to mention llama support only (#4148) 2023-11-20 19:30:00 +01:00
Aaryaman Vasishta
dfc7cd48b1
readme : update ROCm Windows instructions (#4122)
* Update README.md

* Update README.md

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

---------

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-20 17:02:46 +02:00
Seb C
881800d1f0
main : Add ChatML functionality to main example (#4046)
Co-authored-by: Sebastian Cramond <sebby37@users.noreply.github.com>
2023-11-20 14:56:59 +01:00
Galunid
f23c0359a3
ci : add flake8 to github actions (python linting) (#4129)
Disabled rules:

* E203 Whitespace before ':' - disabled because we often use 'C' Style where values are aligned

* E211 Whitespace before '(' (E211) - disabled because we often use 'C' Style where values are aligned

* E221 Multiple spaces before operator - disabled because we often use 'C' Style where values are aligned

* E225 Missing whitespace around operator - disabled because it's broken so often it seems like a standard

* E231 Missing whitespace after ',', ';', or ':' - disabled because we often use 'C' Style where values are aligned

* E241 Multiple spaces after ',' - disabled because we often use 'C' Style where values are aligned

* E251 Unexpected spaces around keyword / parameter equals - disabled because it's broken so often it seems like a standard

* E261 At least two spaces before inline comment - disabled because it's broken so often it seems like a standard

* E266 Too many leading '#' for block comment - sometimes used as "section" separator

* E501 Line too long - disabled because it's broken so often it seems like a standard

* E701 Multiple statements on one line (colon) - broken only in convert.py when defining abstract methods (we can use# noqa instead)

* E704 Multiple statements on one line - broken only in convert.py when defining abstract methods (we can use# noqa instead)
2023-11-20 11:35:47 +01:00
Branden Butler
40a34fe8d0
speculative : fix prompt tokenization in speculative example (#4025)
* Support special tokens and not adding BOS to prompt in speculative

* Adapt to new should_add_bos function

* Ensure tgt and dft have same add_bos setting
2023-11-20 11:50:04 +02:00
Georgi Gerganov
dae06c06e5
Revert "finetune : add --n-gpu-layers flag info to --help (#4128)"
This reverts commit 05e8301e45.
2023-11-19 19:16:07 +02:00
Clark Saben
05e8301e45
finetune : add --n-gpu-layers flag info to --help (#4128) 2023-11-19 18:56:38 +02:00
SoftwareRenderer
936c79b227
server : relay error messages (#4131) 2023-11-19 18:54:10 +02:00
kchro3
262005ad9d
common : comma should be semicolon (#4137) 2023-11-19 18:52:57 +02:00
Georgi Gerganov
35985acffa
gitignore : tokenize 2023-11-19 18:50:49 +02:00
slaren
e937066420
gguf-py : export chat templates (#4125)
* gguf-py : export chat templates

* llama.cpp : escape new lines in gguf kv info prints

* gguf-py : bump version

* gguf-py : check chat_template type

* gguf-py : initialize chat_template
2023-11-19 11:10:52 +01:00
Kerfuffle
28a2e6e7d4
tokenize example: Respect normal add BOS token behavior (#4126)
Allow building with Makefile
2023-11-18 14:48:17 -07:00
Galunid
0b5c3b0457
scripts : Remove missed baichuan convert script (#4127) 2023-11-18 21:08:33 +01:00
Kerfuffle
2923f17f6f
Clean up ggml-cuda.cu warnings when compiling with clang (for ROCM) (#4124)
* ggml-cuda.cu: Clean up warnings when compiling with clang

* ggml-cuda.cu: Move static items into anonymous namespace

* ggml-cuda.cu: Fix use of namespace start macro

* Revert "ggml-cuda.cu: Fix use of namespace start macro"

This reverts commit 26c1149026.

* Revert "ggml-cuda.cu: Move static items into anonymous namespace"

This reverts commit e29757e0f7.
2023-11-18 08:11:18 -07:00
slaren
bbecf3f415
llama : increase max nodes (#4115) 2023-11-17 21:39:11 +02:00
Roger Meier
8e9361089d
build : support ppc64le build for make and CMake (#3963)
* build: support ppc64le build for make and CMake

* build: keep __POWER9_VECTOR__ ifdef and extend with __powerpc64__

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-17 18:11:23 +02:00
Georgi Gerganov
5ad387e994
tokenize : fix trailing whitespace 2023-11-17 18:01:38 +02:00
zakkor
2fa02b4b3d
examples : add tokenize (#4039) 2023-11-17 17:36:44 +02:00
Don Mahurin
2ab0707acb
convert : use 'model' value if it exists. This allows karpathy/tinyllamas to load (#4089)
Co-authored-by: Don Mahurin <@>
2023-11-17 17:32:34 +02:00
John
11173c92d6
py : Falcon HF compatibility (#4104)
Falcon HF compatibility
2023-11-17 17:24:30 +02:00
Jannis Schönleber
9e87ef60e1
common : improve yaml log escaping (#4080)
* logging: improve escaping in yaml output

* logging: include review feedback
2023-11-17 17:24:07 +02:00
Huawei Lin
c7cce1246e
llava : fix compilation warning that fread return value is not used (#4069) 2023-11-17 17:22:56 +02:00
Jiří Podivín
f7d5e97542
py : remove superfluous import statements (#4076)
Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
Co-authored-by: Jiri Podivin <jpodivin@redhat.com>
2023-11-17 17:20:53 +02:00
Jiří Podivín
ba4cf5c0bf
train : move number of gpu layers argument parsing to common/train.cpp (#4074)
- introduces help entry for the argument
 - cuts '--gpu-layers' form in order to simplify usage and documentation.

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
Co-authored-by: Jiri Podivin <jpodivin@redhat.com>
2023-11-17 17:19:16 +02:00
slaren
e85bb1a8e7
llama : add functions to get the model's metadata (#4013)
* llama : add functions to get the model's metadata

* format -> std::to_string

* better documentation
2023-11-17 17:17:37 +02:00
gwjr
3e916a07ac
finetune : speed-up ggml_compute_forward_out_prod_f32 via BLAS (#4079)
* Remove logically superfluous assertions and order by dimension

* Use cblas_sgemm() to implement ggml_compute_forward_out_prod()

* Remove ggml_compute_forward_out_prod_use_blas(), fix compiling errors on cmake/zig, remove trailing whitespace

* Add openBLAS support for sgemm() in compute_forward_out_prod()
2023-11-17 16:48:19 +02:00