Georgi Gerganov
0f07cacb05
ggml : fix q4_1 dot product types
2023-04-14 09:45:42 +03:00
Howard Su
c5d70f5c9e
ggml : optimize rope function to avoid call powf in the tight loop ( #807 )
2023-04-14 09:24:52 +03:00
Gary Linscott
be87b6ed20
perplexity : add support for batch size to --perplexity ( #407 )
...
* Add support to batch size for perplexity
* Revert "Fix memory allocation issues and seg faults"
This reverts commit 4870e455b3.
* update from merge
* Remove perplexity from main
* updates
* Update batch size for efficiency
2023-04-14 00:50:42 +03:00
CRD716
0e07e6a839
common : remove unnecessary includes ( #947 )
2023-04-13 18:39:25 +03:00
Georgi Gerganov
a3a2a0eda8
ggml : add GGML_DEFAULT_N_THREADS
2023-04-13 18:36:48 +03:00
Georgi Gerganov
d990e3fffc
ggml : speed-up ggml_vec_dot_q4_1() ARM_NEON + 32-bit ARM support ( #900 )
...
* ggml : speed-up q4_1 ARM_NEON by ~5%
* ggml : implement vaddvq when missing
* ggml : implement vminvq and vmaxvq when missing
* ggml : implement vzip when missing
* ggml : fix comment
* ggml : try to use correct ifdef
2023-04-13 18:32:36 +03:00
Georgi Gerganov
9190e8eac8
llama : merge llama_internal.h into llama.h
...
Hide it behind an #ifdef
2023-04-13 18:04:45 +03:00
Georgi Gerganov
c85980acd0
gitignore : benchmark
2023-04-13 18:01:33 +03:00
Stephan Walter
6232f2d7fd
ggml : optimize non-SIMD Q4_0 vector dot product ( #703 )
2023-04-13 17:59:50 +03:00
Pavol Rusnak
6c248707f5
ggml : introduce GGML_ALIGNED_MALLOC/GGML_ALIGNED_FREE macros ( #884 )
...
which allows us to use aligned_alloc or _aligned_malloc functions
2023-04-13 17:08:32 +03:00
CRD716
8cda5c981d
fix whitespace ( #944 )
2023-04-13 16:03:57 +02:00
CRD716
ec29272175
readme : remove python 3.10 warning ( #929 )
2023-04-13 16:59:53 +03:00
Genkagaku.GPT
7e941b95eb
readme : llama node binding ( #911 )
...
* chore: add nodejs binding
* chore: add nodejs binding
2023-04-13 16:54:27 +03:00
Pavol Rusnak
c729ff730a
flake.nix: add all binaries from bin ( #848 )
2023-04-13 15:49:05 +02:00
Judd
4579af95e8
zig : update build.zig ( #872 )
...
* update
* update readme
* minimize the changes.
---------
Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
2023-04-13 16:43:22 +03:00
Vladimir
8c3ffc2f04
ggml : update cblas_sgemm columns var to be more reasonable ( #838 )
2023-04-13 16:24:30 +03:00
niansa/tuxifan
107980d970
examples : add -n to alpaca and gpt4all scripts ( #706 )
2023-04-13 16:03:39 +03:00
anzz1
585d91a156
cmake : add explicit F16C option (x86) ( #576 )
...
Fixes building for x86 processors missing F16C featureset
MSVC not included, as in MSVC F16C is implied with AVX2/AVX512
2023-04-13 15:48:21 +03:00
SebastianApel
95ea26f6e9
benchmark : add tool for timing q4_0 matrix multiplication ( #653 )
...
* Initial version of q4_0 matrix multiplication benchmark
* Bugfix: Added dependency to ggml.o to benchmark
* Reviewer requests: added parameter for threads, switched to ggml_time_us()
* Reviewer input: removed rtsc, use epsilon for check
* Review comment: Removed set_locale
* Feature: Param for number of iterations, Bugfix for use of parameter threads
* Reviewer suggestion: Moved to examples
* Reviewer feedback: Updated clean: and benchmark: sections
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-13 15:46:23 +03:00
Pavol Rusnak
82d146df9b
do not force the prompt file to end with a new line ( #908 )
2023-04-13 11:33:16 +02:00
Concedo
ca297c190f
up version
2023-04-13 14:38:38 +08:00
Concedo
c1b75f38d0
try to fix noavx2 for really old devices by
2023-04-13 14:36:00 +08:00
Concedo
2ff91b5570
Merge remote-tracking branch 'occam/clblast-1' into concedo
2023-04-13 11:39:35 +08:00
Concedo
5c22f7e4c4
reduce batch sizes and skip all intrinsic flags except AVX when building in compatibility mode.
2023-04-13 11:32:05 +08:00
0cc4m
67d220210f
Revert buffer changes, no improvements in benchmarks
2023-04-12 23:10:35 +02:00
0cc4m
c7e5c4f7b2
Improve ClBlast implementation, avoid recreating buffers, remove redundant transfers
2023-04-12 23:10:33 +02:00
Concedo
f4257a8eef
Merge branch 'master' into concedo
2023-04-12 23:25:45 +08:00
Concedo
1bd5992da4
clean and refactor handling of flags
2023-04-12 23:25:31 +08:00
Stephan Walter
e7f6997f89
Don't crash on ftype (formerly f16) == 4 ( #917 )
2023-04-12 15:06:16 +00:00
Concedo
636f8e5a8e
updated the quantize files and makefile
2023-04-12 21:40:25 +08:00
Georgi Gerganov
f76cb3a34d
readme : change "GPU support" link to discussion
2023-04-12 14:48:57 +03:00
Georgi Gerganov
782438070f
readme : update hot topics with link to "GPU support" issue
2023-04-12 14:31:12 +03:00
Concedo
4faae0afa9
Merged upstream, fixed OSX compile errors, integrated noavx2 build into main
2023-04-12 18:08:55 +08:00
rabidcopy
2444a99db5
Fix make compile error in expose.cpp(?) ( #44 )
...
* fix compile error?
* Update expose.cpp
2023-04-12 16:19:38 +08:00
Nicolai Weitkemper
4dbbd40750
readme: link to sha256sums file ( #902 )
...
This is to emphasize that these do not need to be obtained from elsewhere.
2023-04-12 08:46:20 +02:00
Pavol Rusnak
8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow ( #883 )
2023-04-11 19:45:44 +00:00
Concedo
ca69e05d1f
update readme and fixed typos
2023-04-11 23:53:21 +08:00
Concedo
9245c7d7d0
Merge branch 'master' into concedo
2023-04-11 23:38:15 +08:00
Concedo
23c675b2e6
integrated optional (experimental) CLBlast support
2023-04-11 23:33:44 +08:00
Stephan Walter
3e6e70d8e8
Add enum llama_ftype, sync ggml_type to model files ( #709 )
2023-04-11 15:03:51 +00:00
comex
2663d2c678
Windows fixes ( #890 )
...
Mostly for msys2 and mingw64 builds, which are different from each other and different from standard Visual Studio builds. Isn't Windows fun?
- Define _GNU_SOURCE in more files (it's already used in ggml.c for Linux's sake).
- Don't use PrefetchVirtualMemory if not building for Windows 8 or later (mingw64 doesn't by default). But warn the user about this situation since it's probably not intended.
- Check for NOMINMAX already being defined, which it is on mingw64.
- Actually use the `increment` variable (bug in my `pizza` PR).
- Suppress unused variable warnings in the fake pthread_create and pthread_join implementations for Windows.
- (not Windows-related) Remove mention of `asprintf` from comment; `asprintf` is no longer used.
Fixes #871.
2023-04-11 15:19:54 +02:00
Concedo
c9f18082fd
Merge remote-tracking branch 'occam/clblast' into concedo
2023-04-11 17:01:31 +08:00
Concedo
1f6aa47b6e
Merge branch 'master' into concedo
...
# Conflicts:
# README.md
2023-04-11 16:53:41 +08:00
qouoq
a0caa34b16
Add BAIR's Koala to supported models ( #877 )
2023-04-10 22:41:53 +02:00
Georgi Gerganov
461ba9e66e
ggml : fix WASM build
2023-04-10 23:20:01 +03:00
Georgi Gerganov
c3ac702e5e
ggml : add ggml_cont() + optimize ggml_cpy() for contiguous dst
2023-04-10 22:42:28 +03:00
Georgi Gerganov
9d634ef452
ggml : remove trailing whitespaces
2023-04-10 22:42:28 +03:00
Marco Matthies
d9a239c410
Simplify to include lower-case windows.h always, fix compile on mingw32 ( #747 )
2023-04-10 19:57:59 +02:00
Georgi Gerganov
684da25926
ggml : fix quantize_row_q4_1() ARM_NEON ( close #876 )
2023-04-10 19:29:48 +03:00
0cc4m
c3db99ea32
Allow use of OpenCL GPU-based BLAS using ClBlast instead of OpenBLAS for context processing
2023-04-10 18:20:40 +02:00