Commit graph

1597 commits

Author SHA1 Message Date
Yazan Agha-Schrader
9dcb514b1d update start server scripts 2023-11-28 06:57:29 +01:00
Yazan Agha-Schrader
4fa32ad0e3 update 2023-11-27 21:45:12 +01:00
Yazan Agha-Schrader
1b6d4226b8 add start scripts to root path 2023-11-27 21:35:31 +01:00
Yazan Agha-Schrader
ae096d0a92 Merge branch 'ggerganov:master' into master 2023-11-27 20:10:11 +01:00
Kasumi
0dab8cd7cc readme : add Amica to UI list (#4230) 2023-11-27 19:39:42 +02:00
Yazan Agha-Schrader
6c318b54c8 Update README.md 2023-11-27 18:28:32 +01:00
Yazan Agha-Schrader
ecb39732e6 add min-p image 2023-11-27 18:25:51 +01:00
Yazan Agha-Schrader
082b33550f Update README.md 2023-11-27 18:19:26 +01:00
Yazan Agha-Schrader
c48f3f2042 Merge pull request #3 from mounta11n/server-ui-improvements
add min-p
2023-11-27 17:58:23 +01:00
Yazan Agha-Schrader
464f073307 add min-p 2023-11-27 17:56:30 +01:00
Yazan Agha-Schrader
d55b482361 Merge pull request #2 from mounta11n/server-ui-improvements
Server UI improvements
2023-11-27 17:26:43 +01:00
Yazan Agha-Schrader
809b2697fe Merge branch 'ggerganov:master' into master 2023-11-27 17:24:35 +01:00
Yazan Agha-Schrader
c161ad20db add mmproj function 2023-11-27 17:17:38 +01:00
Yazan Agha-Schrader
d5683279b1 fix wrong translation 2023-11-27 16:19:08 +01:00
Bailey Chittle
bb03290c17 examples : iOS example with swift ui (#4159)
* copy to llama.cpp as subdir

* attempt enabling metal, fails

* ggml metal compiles!

* Update README.md

* initial conversion to new format, utf8 errors?

* bug fixes, but now has an invalid memory access :(

* added O3, now has insufficient memory access

* begin sync with master

* update to match latest code, new errors

* fixed it!

* fix for loop conditionals, increase result size

* fix current workflow errors

* attempt a llama.swiftui workflow

* Update .github/workflows/build.yml

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 16:56:52 +02:00
Yazan Agha-Schrader
09e3b50f62 fix wrong formattings 2023-11-27 15:54:21 +01:00
Yazan Agha-Schrader
cf8cb0d303 fix multi-modal-selection 2023-11-27 15:05:23 +01:00
Yazan Agha-Schrader
49d7c07210 Update README.md
add description
2023-11-27 14:23:51 +01:00
Yazan Agha-Schrader
1bb2df7367 Update README.md
add pictures of the ui
2023-11-27 14:22:31 +01:00
Yazan Agha-Schrader
25ed0c4f6b add ui and tui pics 2023-11-27 14:18:58 +01:00
Yazan Agha-Schrader
1bc9ca6a9c add ui and tui pics 2023-11-27 14:17:04 +01:00
Yazan Agha-Schrader
a28935febe Update README.md 2023-11-27 14:14:46 +01:00
Yazan Agha-Schrader
ca22eb6cc7 Merge pull request #1 from mounta11n/server-ui-improvements
Server UI improvements
2023-11-27 14:11:48 +01:00
Yazan Agha-Schrader
e7cfe1f5d9 add favicon 2023-11-27 13:58:54 +01:00
Yazan Agha-Schrader
9abb31011b Update index.html
add atlas
2023-11-27 13:47:08 +01:00
Yazan Agha-Schrader
4d15130fda add start script 2023-11-27 13:06:27 +01:00
Yazan Agha-Schrader
2566e53945 ic 2023-11-27 11:33:06 +01:00
Jared Van Bortel
f3b269813f ggml : fix -Warray-bounds warning with gcc (#4231) 2023-11-26 22:58:43 -05:00
Georgi Gerganov
3e73d31d9c lookahead : support -n -1 infinite generation 2023-11-26 21:52:23 +02:00
Georgi Gerganov
9656026b53 readme : update hot topics 2023-11-26 20:42:51 +02:00
Georgi Gerganov
922754a8d6 lookahead : add example for lookahead decoding (#4207)
* lookahead : init

* lookahead : generate and store n-grams

* lookahead : use loop instead recursion to generate n-grams

* lookahead : initial working implementation

* lookahead : filter repeating n-grams

* lookahead : use deterministic init

* lookahead : add to Makefile

* lookahead : fix a bug in the seq_id of the lookahead tokens

* lookahead : add comments

---------

Co-authored-by: slaren <slarengh@gmail.com>
2023-11-26 20:33:07 +02:00
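The lookahead-decoding example above generates, stores, and deduplicates n-grams from the token stream to use as speculative continuations. A toy sketch of that n-gram pooling idea (illustrative only, assuming integer token IDs; this is not the example's actual code or data structures):

```python
def collect_ngrams(tokens, n=3):
    # Pool of n-grams keyed by first token: for each n-gram seen in the
    # stream, store its continuation (the remaining n-1 tokens).
    # Using a set drops duplicates, mirroring the "filter repeating
    # n-grams" step listed in the commit.
    pool = {}
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        pool.setdefault(gram[0], set()).add(gram[1:])
    return pool

pool = collect_ngrams([1, 2, 3, 1, 2, 3, 4], n=3)
```

During decoding, such a pool lets the model look up candidate continuations for the current token and verify them in parallel.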
Xiao-Yong Jin
22da05536f metal : fix yarn (#4220)
get the correct n_orig_ctx in metal
2023-11-26 10:30:02 +02:00
Galunid
1ddb52ec38 scripts : Use mmap in torch load (#4202)
* Use mmap in torch load, prefer .bin files when loading

* Revert .bin > .safetensors preference
2023-11-25 22:45:02 +01:00
Marcus Dunn
f837c3a992 llama : grammar reserve space in decode_utf8 (#4210)
* reserve space for codepoints

* improvement for the appended 0
2023-11-25 18:58:23 +02:00
crasm
3014b5415d Update docs for yarn_ext_factor <0.0 as unspecified instead of NaN (#4189) 2023-11-25 10:47:07 -05:00
Georgi Gerganov
04814e718e readme : update hot topics 2023-11-25 12:02:13 +02:00
Georgi Gerganov
af19d35734 server : OAI API compatibility (#4198)
* Add openai-compatible POST /v1/chat/completions API endpoint to server example

* fix code style

* Update server README.md

* Improve server README.md

* Fix server.cpp code style according to review

* server : some style changes

* server : indentation

* server : enable special tokens during tokenization by default

* server : minor code style

* server : change random string generator

* straightforward /v1/models endpoint

---------

Co-authored-by: kir-gadjello <111190790+kir-gadjello@users.noreply.github.com>
Co-authored-by: Tobi Lütke <tobi@Tobis-MacBook-Pro.local>
2023-11-25 11:29:06 +02:00
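The server commit above adds an OpenAI-compatible POST /v1/chat/completions endpoint (plus a /v1/models endpoint) to the server example. A minimal sketch of the request payload such a client would send — the model name, temperature default, and helper function here are placeholder assumptions, not values from the commit:

```python
import json

def build_chat_request(messages, model="local-model", temperature=0.7):
    # Shape follows the OpenAI chat-completions schema the endpoint
    # mimics; "local-model" and the defaults are placeholders.
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

payload = build_chat_request([{"role": "user", "content": "Hello"}])
body = json.dumps(payload)  # send as the POST body to /v1/chat/completions
```

The serialized body can then be posted to the running server with any HTTP client pointed at its /v1/chat/completions route.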
slaren
e9c13ff781 llama : set metal log callback correctly (#4204) 2023-11-24 18:10:01 +01:00
slaren
8a052c131e ggml-cuda : support stablelm rope (#4156)
* ggml-cuda : support stablelm rope

* remove unused freq_base kernel parameter

* add n_dims parameter to llm_build_k_shift, default to n_rot via overload

* llama : fix llm_build_k_shift args

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-24 18:04:31 +01:00
Galunid
189d68446e convert : fix tensors using grad in some models (#4173) 2023-11-24 15:02:49 +01:00
eastriver
2568a4bf54 main.swift : fix eos checking (#4197)
llama_token_eos(const struct llama_model *) is currently getting struct llama_context type variable context as a parameter.
2023-11-24 11:25:10 +02:00
Aaryaman Vasishta
b35f3d0def readme : use PATH for Windows ROCm (#4195)
* Update README.md to use PATH for Windows ROCm

* Update README.md

* Update README.md
2023-11-24 09:52:39 +02:00
Haohui Mai
55978ce09b Fix incorrect format strings and uninitialized variables. (#4133)
* Fix incorrect format strings and uninitialized variables.

* Address comments

* Add the missing include statement
2023-11-23 22:56:53 +01:00
Georgi Gerganov
6b0a7420d0 llama : KV cache view API + better KV cache management (#4170)
* llama : keep track of used KV cells + better KV cache management

* llama : zero KV cache used upon clear

ggml-ci

* llama : allow exporting a view of the KV cache (#4180)

* Allow exporting a view of the KV cache

* Allow dumping the sequences per cell in common

* Track max contiguous cells value and position as well

* Fix max contiguous empty cells index calculation

Make dump functions deal with lengths or sequences counts > 10 better

* Fix off by one error in dump_kv_cache_view

* Add doc comments for KV cache view functions

Eliminate cell sequence struct; use llama_seq_id directly

Minor cleanups

* common : add -dkvc arg for enabling kv cache dumps

---------

Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com>
2023-11-23 19:07:56 +02:00
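The KV-cache-view commit above tracks used cells and the maximum contiguous run of empty cells. A toy version of that bookkeeping, assuming the cache is represented as a per-cell count of occupying sequences (illustrative only, not the llama.cpp API itself):

```python
def max_contiguous_free(cells):
    # cells[i] = number of sequences occupying KV cell i (0 = free).
    # Returns (length, start index) of the longest contiguous free run,
    # mirroring the "max contiguous cells value and position" tracking.
    best_len, best_start = 0, 0
    run_len, run_start = 0, 0
    for i, used in enumerate(cells):
        if used == 0:
            if run_len == 0:
                run_start = i  # a new free run begins here
            run_len += 1
            if run_len > best_len:
                best_len, best_start = run_len, run_start
        else:
            run_len = 0  # run broken by an occupied cell
    return best_len, best_start

result = max_contiguous_free([1, 0, 0, 0, 2, 0])
```

Knowing the largest free run lets the cache allocator place a new sequence without fragmenting the remaining cells.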
Georgi Gerganov
d103d935c0 readme : update hot topics 2023-11-23 13:51:22 +02:00
Daniel Bevenius
9d5949f04b examples : fix typo in parallel example doc comment (#4181)
Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2023-11-23 13:34:20 +02:00
Georgi Gerganov
ff8238f71d docs : add llama-star arch idea 2023-11-23 11:35:04 +02:00
Galunid
8e672efe63 stablelm : simplify + speedup generation (#4153) 2023-11-21 16:22:30 +01:00
Galunid
0b871f1a04 finetune - update readme to mention llama support only (#4148) 2023-11-20 19:30:00 +01:00
Aaryaman Vasishta
dfc7cd48b1 readme : update ROCm Windows instructions (#4122)
* Update README.md

* Update README.md

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>

---------

Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2023-11-20 17:02:46 +02:00