
516 commits

Author SHA1 Message Date
Howard Su
c5d70f5c9e
ggml : optimize rope function to avoid calling powf in the tight loop (#807) 2023-04-14 09:24:52 +03:00
Gary Linscott
be87b6ed20
perplexity : add support for batch size to --perplexity (#407)
* Add support to batch size for perplexity

* Revert "Fix memory allocation issues and seg faults"

This reverts commit 4870e455b3.

* update from merge

* Remove perplexity from main

* updates

* Update batch size for efficiency
2023-04-14 00:50:42 +03:00
CRD716
0e07e6a839
common : remove unnecessary includes (#947) 2023-04-13 18:39:25 +03:00
Georgi Gerganov
a3a2a0eda8
ggml : add GGML_DEFAULT_N_THREADS 2023-04-13 18:36:48 +03:00
Georgi Gerganov
d990e3fffc
ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support (#900)
* ggml : speed-up q4_1 ARM_NEON by ~5%

* ggml : implement vaddvq when missing

* ggml : implement vminvq and vmaxvq when missing

* ggml : implement vzip when missing

* ggml : fix comment

* ggml : try to use correct ifdef
2023-04-13 18:32:36 +03:00
Georgi Gerganov
9190e8eac8
llama : merge llama_internal.h into llama.h
Hide it behind an #ifdef
2023-04-13 18:04:45 +03:00
Georgi Gerganov
c85980acd0
gitignore : benchmark 2023-04-13 18:01:33 +03:00
Stephan Walter
6232f2d7fd
ggml : optimize non-SIMD Q4_0 vector dot product (#703) 2023-04-13 17:59:50 +03:00
Pavol Rusnak
6c248707f5
ggml : introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros (#884)
which allows us to use aligned_alloc or _aligned_malloc functions
2023-04-13 17:08:32 +03:00
CRD716
8cda5c981d
fix whitespace (#944) 2023-04-13 16:03:57 +02:00
CRD716
ec29272175
readme : remove python 3.10 warning (#929) 2023-04-13 16:59:53 +03:00
Genkagaku.GPT
7e941b95eb
readme : llama node binding (#911)
* chore: add nodejs binding

* chore: add nodejs binding
2023-04-13 16:54:27 +03:00
Pavol Rusnak
c729ff730a
flake.nix: add all binaries from bin (#848) 2023-04-13 15:49:05 +02:00
Judd
4579af95e8
zig : update build.zig (#872)
* update

* update readme

* minimize the changes.

---------

Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
2023-04-13 16:43:22 +03:00
Vladimir
8c3ffc2f04
ggml : update cblas_sgemm columns var to be more reasonable (#838) 2023-04-13 16:24:30 +03:00
niansa/tuxifan
107980d970
examples : add -n to alpaca and gpt4all scripts (#706) 2023-04-13 16:03:39 +03:00
anzz1
585d91a156
cmake : add explicit F16C option (x86) (#576)
Fixes building for x86 processors missing F16C featureset
MSVC not included, as in MSVC F16C is implied with AVX2/AVX512
2023-04-13 15:48:21 +03:00
SebastianApel
95ea26f6e9
benchmark : add tool for timing q4_0 matrix multiplication (#653)
* Initial version of q4_0 matrix multiplication benchmark

* Bugfix: Added dependency to ggml.o to benchmark

* Reviewer requests: added parameter for threads, switched to ggml_time_us()

* Reviewer input: removed rtsc, use epsilon for check

* Review comment: Removed set_locale

* Feature: Param for number of iterations, Bugfix for use of parameter threads

* Reviewer suggestion: Moved to examples

* Reviewer feedback: Updated clean: and benchmark: sections

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-13 15:46:23 +03:00
Pavol Rusnak
82d146df9b
do not force the prompt file to end with a new line (#908) 2023-04-13 11:33:16 +02:00
Concedo
ca297c190f up version 2023-04-13 14:38:38 +08:00
Concedo
c1b75f38d0 try to fix noavx2 for really old devices by 2023-04-13 14:36:00 +08:00
Concedo
2ff91b5570 Merge remote-tracking branch 'occam/clblast-1' into concedo 2023-04-13 11:39:35 +08:00
Concedo
5c22f7e4c4 reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode. 2023-04-13 11:32:05 +08:00
0cc4m
67d220210f Revert buffer changes, no improvements in benchmarks 2023-04-12 23:10:35 +02:00
0cc4m
c7e5c4f7b2 Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers 2023-04-12 23:10:33 +02:00
Concedo
f4257a8eef Merge branch 'master' into concedo 2023-04-12 23:25:45 +08:00
Concedo
1bd5992da4 clean and refactor handling of flags 2023-04-12 23:25:31 +08:00
Stephan Walter
e7f6997f89
Don't crash on ftype (formerly f16) == 4 (#917) 2023-04-12 15:06:16 +00:00
Concedo
636f8e5a8e updated the quantize files and makefile 2023-04-12 21:40:25 +08:00
Georgi Gerganov
f76cb3a34d
readme : change "GPU support" link to discussion 2023-04-12 14:48:57 +03:00
Georgi Gerganov
782438070f
readme : update hot topics with link to "GPU support" issue 2023-04-12 14:31:12 +03:00
Concedo
4faae0afa9 Merged upstream, fixed OSX compile errors, integrated noavx2 build into main 2023-04-12 18:08:55 +08:00
rabidcopy
2444a99db5
Fix make compile error in expose.cpp(?) (#44)
* fix compile error?

* Update expose.cpp
2023-04-12 16:19:38 +08:00
Nicolai Weitkemper
4dbbd40750
readme: link to sha256sums file (#902)
This is to emphasize that these do not need to be obtained from elsewhere.
2023-04-12 08:46:20 +02:00
Pavol Rusnak
8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow (#883) 2023-04-11 19:45:44 +00:00
Concedo
ca69e05d1f update readme and fixed typos 2023-04-11 23:53:21 +08:00
Concedo
9245c7d7d0 Merge branch 'master' into concedo 2023-04-11 23:38:15 +08:00
Concedo
23c675b2e6 integrated optional (experimental) CLBlast support 2023-04-11 23:33:44 +08:00
Stephan Walter
3e6e70d8e8
Add enum llama_ftype, sync ggml_type to model files (#709) 2023-04-11 15:03:51 +00:00
comex
2663d2c678
Windows fixes (#890)
Mostly for msys2 and mingw64 builds, which are different from each other
and different from standard Visual Studio builds.  Isn't Windows fun?

- Define _GNU_SOURCE in more files (it's already used in ggml.c for
  Linux's sake).

- Don't use PrefetchVirtualMemory if not building for Windows 8 or later
  (mingw64 doesn't by default).  But warn the user about this situation
  since it's probably not intended.

- Check for NOMINMAX already being defined, which it is on mingw64.

- Actually use the `increment` variable (bug in my `pizza` PR).

- Suppress unused variable warnings in the fake pthread_create and
  pthread_join implementations for Windows.

- (not Windows-related) Remove mention of `asprintf` from comment;
  `asprintf` is no longer used.

Fixes #871.
2023-04-11 15:19:54 +02:00
Concedo
c9f18082fd Merge remote-tracking branch 'occam/clblast' into concedo 2023-04-11 17:01:31 +08:00
Concedo
1f6aa47b6e Merge branch 'master' into concedo
# Conflicts:
#	README.md
2023-04-11 16:53:41 +08:00
qouoq
a0caa34b16
Add BAIR's Koala to supported models (#877) 2023-04-10 22:41:53 +02:00
Georgi Gerganov
461ba9e66e
ggml : fix WASM build 2023-04-10 23:20:01 +03:00
Georgi Gerganov
c3ac702e5e
ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst 2023-04-10 22:42:28 +03:00
Georgi Gerganov
9d634ef452
ggml : remove trailing whitespaces 2023-04-10 22:42:28 +03:00
Marco Matthies
d9a239c410
Simplify to include lower-case windows.h always, fix compile on mingw32 (#747) 2023-04-10 19:57:59 +02:00
Georgi Gerganov
684da25926
ggml : fix quantize_row_q4_1() ARM_NEON (close #876) 2023-04-10 19:29:48 +03:00
0cc4m
c3db99ea32 Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing 2023-04-10 18:20:40 +02:00
Concedo
69b85f5b61 fixed a few OOM errors with larger contexts - I cannot figure out why they happen, so I am forced to increase the buffer size. 2023-04-11 00:14:57 +08:00