Concedo
e4c9aea840
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
2023-06-26 10:35:47 +08:00
Georgi Gerganov
447ccbe8c3
readme : add new roadmap + manifesto
2023-06-25 16:08:12 +03:00
Georgi Gerganov
bd34cdde38
ggml : sync latest ggml (custom operators)
2023-06-25 14:25:08 +03:00
Concedo
d2034ced7b
Merge branch 'master' into concedo_experimental
# Conflicts:
# README.md
# build.zig
# flake.nix
# tests/test-grad0.c
# tests/test-sampling.cpp
# tests/test-tokenizer-0.cpp
2023-06-25 17:01:15 +08:00
anon998
c2a08f87b8
fix server sampling: top k sampler first (#1977)
Co-authored-by: anon <anon@example.org>
2023-06-25 10:48:36 +02:00
Georgi Gerganov
66a2555ba6
readme : add Azure CI discussion link
2023-06-25 09:07:03 +03:00
sjinzh
e65ca7e14a
zig : upgrade build system support (#1981)
* upgrade zig build system support
* zig : add new line at the end of the file
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-25 08:45:44 +03:00
Robyn
5ec8dd5a3c
#1869 Fix null reference errors when training from scratch with CUDA (#1907)
* #1869 Fix null reference errors when training from scratch with CUDA build
Calling ggml_compute_forward when node->src0 was null was causing train-text-from-scratch.exe to terminate unexpectedly.
* ggml : do not dereference src0 if NULL
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-24 20:10:29 +02:00
Georgi Gerganov
65bdd52a86
tests : sync test-grad0 from ggml
2023-06-24 19:40:18 +03:00
Rowan Hart
fdd1860911
flake : fix ggml-metal.metal path and run nixfmt (#1974)
2023-06-24 14:07:08 +03:00
AN Long
c943d823c1
convert : fix invalid params in write_vocab_only (#1975)
2023-06-24 14:02:06 +03:00
slaren
f2c754e1c3
ggml : improve ggml_graph_dump_dot, add ggml_format_name (#1978)
* Improve ggml_graph_dump_dot, add ggml_format_name
* add more automatic names to view ops
* fix name of copies
2023-06-24 13:57:18 +03:00
Georgi Gerganov
11da1a85cd
readme : fix whitespaces
2023-06-24 13:38:18 +03:00
Alberto
235b610d65
readme : fixed termux instructions (#1973)
2023-06-24 13:32:13 +03:00
Alex Renda
b061ba9e2a
llama : fix top-p sampling to match the canonical definition (#1953)
* Fix top-p sampling to match the standard definition (smallest set that has probability mass at least p, not largest set with probability mass less than p)
* top-p: correct gt to gte
* add test for correct top-p behavior
2023-06-24 13:15:01 +03:00
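The top-p definition named in the commit above (smallest set with probability mass at least p) can be sketched as follows. This is a minimal illustration in Python, not the code from the commit; the function name `top_p_filter` and its list-based interface are assumptions made for the example.

```python
def top_p_filter(probs, p):
    """Return the indices of the smallest set of tokens whose
    cumulative probability is at least p (hypothetical helper)."""
    # Visit tokens from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # ">=" rather than ">": stop as soon as the mass reaches p.
        if cumulative >= p:
            break
    return kept


if __name__ == "__main__":
    probs = [0.5, 0.25, 0.125, 0.125]
    print(top_p_filter(probs, 0.75))
```

With these probabilities and p = 0.75, the first two tokens are kept because their combined mass reaches p exactly. A "largest set with mass less than p" rule would keep only the first token.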
Didzis Gosko
527b6fba1d
llama : make model stateless and context stateful (llama_state) (#1797)
* llama : make model stateless and context stateful
* llama : minor cleanup
* llama : update internal API declaration
* Apply suggestions from code review
fix style
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Missing model memory release
* Fix style
* Add deprecated warning for public API function llama_init_from_file
* Update public API use cases: move away from deprecated llama_init_from_file
* Deprecate public API function llama_apply_lora_from_file
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-24 11:47:58 +03:00
Concedo
8342fe81b1
revert the wstring tokenization. coherency was affected
2023-06-24 12:58:49 +08:00
Concedo
6da38b0d40
up ver
2023-06-24 12:30:38 +08:00
Concedo
0485fa65a2
wstring convert for mpt
2023-06-24 11:43:42 +08:00
Concedo
6d718525c4
Merge branch 'optimize_quants_upstream' into concedo_experimental
2023-06-23 23:56:31 +08:00
Concedo
f7b096374d
fixed string too long CI issue
2023-06-23 23:56:22 +08:00
Concedo
490cf395f8
better alloc error
2023-06-23 22:51:51 +08:00
Concedo
ece453ed09
Merge branch 'master' into concedo_experimental
# Conflicts:
# CMakeLists.txt
# README.md
2023-06-23 22:46:54 +08:00
Concedo
f39a746089
bug fixes for openblas
2023-06-23 22:45:22 +08:00
Concedo
43c2891afa
option to not use scratch
2023-06-23 19:01:36 +08:00
Concedo
d5e4cf7ffe
handle ctx manip
2023-06-23 19:01:15 +08:00
Concedo
df9135e3a9
fixing memory bugs
2023-06-23 18:41:23 +08:00
eiery
d7b7484f74
Add OpenLLaMA instructions to the README (#1954)
* add openllama to readme
2023-06-23 10:38:01 +02:00
Erik Scholz
7487137227
rework convert.py to read hyper-parameters from config.json (#1958)
* Read hyper-parameters from the HuggingFace transformers config.json if they exist; otherwise fall back to guessing them, as before.
This allows converting open_llama 3B and other non-standard model designs.
2023-06-22 14:20:47 +02:00
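The "read config.json, else fall back to guessing" behaviour described in the commit above might look roughly like this. It is a sketch under stated assumptions, not the actual convert.py code: `load_hparams`, the `guess` callback, and the output key names are hypothetical, and the config keys assume a LLaMA-style HuggingFace config.json.

```python
import json
from pathlib import Path


def load_hparams(model_dir, guess):
    """Prefer hyper-parameters from config.json; otherwise call guess().

    `guess` is a hypothetical zero-argument callback standing in for
    whatever shape-based guessing the converter did before.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        # No config file: fall back to guessing.
        return guess()
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Key names assume a LLaMA-style HuggingFace config.
    return {
        "n_embd": config["hidden_size"],
        "n_head": config["num_attention_heads"],
        "n_layer": config["num_hidden_layers"],
    }
```

Passing the fallback in as a callback keeps the guessing logic untouched, so models without a config.json are handled the same way they were before.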
Concedo
0eedccaf06
Merge branch 'master' into optimize_quants_upstream
2023-06-22 17:59:58 +08:00
Concedo
e6ddb15c3a
cleanup
2023-06-22 10:38:27 +08:00
Johannes Gäßler
bbca06e269
cmake: revert CUDA arch default to 52, 61 if f16 (#1959)
2023-06-21 23:49:25 +02:00
Rahul Vivek Nair
fb98254f99
Fix typo in README.md (#1961)
2023-06-21 23:48:43 +02:00
Concedo
1b71752a9f
Implemented basic GPU offloading for MPT, GPT-2, GPT-J and GPT-NeoX
2023-06-22 00:43:25 +08:00
Ycros
b1f00fa9cc
Fix hordeconfig max context setting, and add Makefile flags for cuda F16/KQuants per iter. (#252)
* Fix hordeconfig maxcontext setting.
* cuda: Bring DMMV_F16 and KQUANTS_ITER Makefile flags over from llama.
2023-06-21 23:01:46 +08:00
Concedo
dfdd20240c
gpt j use scratch buffers
2023-06-21 16:10:31 +08:00
Georgi Gerganov
049aa16b8c
readme : add link to p1
2023-06-20 19:05:54 +03:00
Concedo
266d47a4b9
Merge branch 'optimize_quants_upstream' into concedo_experimental
2023-06-20 22:46:35 +08:00
Concedo
da668e685f
fixing address spaces
2023-06-20 22:46:11 +08:00
Concedo
cce6e67f44
fixing address spaces
2023-06-20 22:45:16 +08:00
Concedo
1f1735f5ad
Merge branch 'optimize_quants_upstream' into concedo_experimental
2023-06-20 21:39:35 +08:00
Concedo
6b75fc48b9
fixed global const struct types
2023-06-20 21:38:48 +08:00
Xiake Sun
2322ec223a
Fix typo (#1949)
2023-06-20 15:42:40 +03:00
Concedo
537ff22ec9
fixed a bug with token timings, updated lite
2023-06-20 20:41:42 +08:00
Concedo
c5ae3f50a7
Merge branch 'optimize_quants_upstream' into concedo_experimental
2023-06-20 18:41:13 +08:00
Concedo
a6e8b0216d
remove old dot kernels and template
2023-06-20 18:37:48 +08:00
Concedo
93247a11cd
ported q2k and q5k speedups
2023-06-20 18:37:41 +08:00
Concedo
029bed6446
ported q3k speedup successfully
2023-06-20 18:37:26 +08:00
Concedo
d754915269
Merge branch 'optimize_quants_upstream' into concedo_experimental
2023-06-20 17:26:39 +08:00
Concedo
b4c532e862
Merge branch 'master' into concedo_experimental
2023-06-20 17:26:27 +08:00