Concedo
346cd68903
make linux and OSX build process equal to windows. It now builds all applicable libraries; for a full build, run make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1
2023-04-20 15:53:55 +08:00
Concedo
93761e7baf
slightly clarified the library replacement steps - replacing the dll is necessary in addition to replacing the library imports
2023-04-20 12:23:54 +08:00
Gustavo Rocha Dias
5ca2d774cc
doc - explanation of how to use a custom version of the Windows libraries in the lib folder. ( #92 )
the dynamic libraries also need to be updated if you replace the import libraries
2023-04-20 12:20:11 +08:00
Concedo
be1222c36e
Merged the upstream cuBLAS feature
2023-04-19 20:45:37 +08:00
Concedo
cc407f283a
messing around with memory allocation to band-aid the random OOMs with various gpt2 and gptj models
2023-04-19 20:18:55 +08:00
slaren
8944a13296
Add NVIDIA cuBLAS support ( #1044 )
2023-04-19 11:22:45 +02:00
Concedo
f662a9a230
Merge branch 'master' into concedo
# Conflicts:
# .github/workflows/build.yml
# .github/workflows/docker.yml
# CMakeLists.txt
# Makefile
# README.md
2023-04-19 16:34:51 +08:00
Concedo
65bfcdb1cc
Merge branch 'concedo_experimental' into concedo
2023-04-19 15:35:48 +08:00
Concedo
45ec09d31b
fast forwarding for rwkv for unmodified contexts
2023-04-19 15:09:35 +08:00
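One plausible reading of "fast forwarding for unmodified contexts" is sketched below. It is not the code from this commit. The assumption is that a recurrent model such as RWKV cannot rewind its state, so the cached state is reused only when the new prompt begins with the entire previously processed token sequence. The function names `common_prefix_len` and `tokens_to_process` are hypothetical.

```python
from typing import List, Sequence


def common_prefix_len(cached: Sequence[int], new: Sequence[int]) -> int:
    """Count how many leading tokens the two sequences share."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def tokens_to_process(cached: Sequence[int], new: Sequence[int]) -> List[int]:
    """Return the tokens that still have to be fed to the model.

    Assumption: the recurrent state cannot be rolled back, so it is only
    reusable when the old context is an unmodified prefix of the new one.
    """
    shared = common_prefix_len(cached, new)
    if shared == len(cached):
        # Old context is untouched: only the newly appended tail is needed.
        return list(new[shared:])
    # Context was edited somewhere: start over and reprocess everything.
    return list(new)
```

With a cached context of `[1, 2, 3]` and a new prompt of `[1, 2, 3, 4, 5]`, only `[4, 5]` would be processed. Any edit inside the old context falls back to the full prompt.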
AlpinDale
116488af66
Create make_pyinstaller.sh ( #89 )
2023-04-19 10:57:07 +08:00
slaren
6667401238
Multi-threaded ggml_cpy ( #1035 )
* Multi-threaded ggml_cpy
* Update ggml.c
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Also fix wdata offset in ggml_compute_forward_add_q_f32
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-04-19 00:53:24 +02:00
Georgi Gerganov
77a73403ca
ggml : add new Q4_2 quantization (ARM only) ( #1046 )
* ggml : Q4_2 ARM
* ggml : add ggml_is_quantized()
* llama : update llama_type_name() with Q4_2 entry
* ggml : speed-up q4_2
- 4 threads: ~100ms -> ~90ms
- 8 threads: ~55ms -> ~50ms
* ggml : optimize q4_2 using vmlaq_n_f32 + vmulq_n_f32
2023-04-18 23:54:57 +03:00
Georgi Gerganov
50a8a2af97
ggml : scratch that - vmlaq_n_f32 is always better
Had a background process that was messing with the timings
2023-04-18 23:11:23 +03:00
Georgi Gerganov
4caebf6d40
gitignore : vdot
2023-04-18 23:00:08 +03:00
Georgi Gerganov
dcdd65e296
ggml : optimize ggml_vec_dot_q4_0_q8_0() using vectorized accumulators
2023-04-18 22:59:17 +03:00
Kawrakow
5ecff35151
Adding a simple program to measure speed of dot products ( #1041 )
On my Mac, the direct Q4_1 product is marginally slower
(~69 vs ~55 us for Q4_0). The SIMD-ified ggml version
is now almost 2X slower (~121 us).
On a Ryzen 7950X CPU, the direct product for Q4_1 quantization
is faster than the AVX2 implementation (~60 vs ~62 us).
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-04-18 19:00:14 +00:00
Georgi Gerganov
7faa7460f0
readme : update hot topics about new LoRA functionality
2023-04-18 20:10:26 +03:00
Georgi Gerganov
5af8e32238
ci : do not run on drafts
2023-04-18 19:57:06 +03:00
Concedo
f39def81d4
Update readme with more info
2023-04-18 21:44:26 +08:00
Concedo
3614956bc7
update readme
2023-04-18 21:39:05 +08:00
Concedo
ea01771dd5
rwkv is done
2023-04-18 20:55:01 +08:00
Concedo
a76b15b581
Merge branch 'concedo' into concedo_experimental
# Conflicts:
# make_pyinstaller.bat
2023-04-18 17:42:43 +08:00
Gustavo Rocha Dias
ed5b5c45a9
doc - enhanced readme explaining how to compile on Windows. ( #80 )
2023-04-18 17:40:04 +08:00
Gustavo Rocha Dias
a9253cdfba
fix - on some OSes the PyInstaller command is case-sensitive; in lowercase it doesn't work. ( #81 )
2023-04-18 17:39:06 +08:00
Concedo
ac61e34d5f
Merge branch 'master' into concedo_experimental
# Conflicts:
# CMakeLists.txt
# README.md
2023-04-18 17:38:10 +08:00
Concedo
c200b674f4
updated kobold lite, work on rwkv, added exe path to model load params, added launch parameter
2023-04-18 17:36:44 +08:00
Ivan Komarov
42747220b4
Do not close file after mmap (Windows version) ( #1034 )
2023-04-18 03:15:50 +02:00
Atsushi Tatsuma
e9298af389
readme : add Ruby bindings ( #1029 )
2023-04-17 22:34:35 +03:00
Cameron
4ad73137a1
add 4_0 to default outfile namestr dict ( #1031 )
this came up when trying to convert the gpt4all-lora-unfiltered-quantized.bin file
2023-04-17 20:26:23 +02:00
slaren
315a95a4d3
Add LoRA support ( #820 )
2023-04-17 17:28:55 +02:00
Arik Poznanski
efd05648c8
llama : well-defined static initialization of complex objects ( #927 )
* Replaced static initialization of complex objects with initialization on first use. This prevents undefined behavior at program run time (for example, a crash in a Release build while the Debug build works)
* Replaced use of auto with the exact type to avoid requiring -std=c++14
* Made the accessor functions for static maps static const
2023-04-17 17:41:53 +03:00
Georgi Gerganov
eb17a026fd
quantize-stats : fix bug in --type argument
2023-04-17 17:31:06 +03:00
Concedo
8e923dc6e9
updated kobold lite
2023-04-17 21:33:57 +08:00
Georgi Gerganov
69b740289f
ggml : avoid using ggml_fp16_to_fp32() and ggml_fp32_to_fp16() in ggml.c
2023-04-17 16:16:23 +03:00
Ivan Komarov
f266259ad9
Speedup the AVX-512 implementation of ggml_vec_dot_q4_0() ( #933 )
2023-04-17 15:10:57 +02:00
Concedo
1f4a69c051
version number api
2023-04-17 19:31:15 +08:00
Concedo
364e2736c9
Merge branch 'master' into concedo
2023-04-17 17:34:50 +08:00
Concedo
763ad172c0
arranged files, updated kobold lite, modified makefile for extra link args on linux, started RWKV implementation
2023-04-17 17:31:45 +08:00
slaren
47f61aaa5f
Fix: do not close file on mmap ( #1017 )
2023-04-16 21:27:38 +02:00
Concedo
9581171a9f
updated embedded lite again
2023-04-16 22:42:51 +08:00
Concedo
bee6a401fd
slight clarity fix
2023-04-16 22:04:19 +08:00
Concedo
96fb12cfa2
Merge branch 'master' into concedo
2023-04-16 21:59:05 +08:00
Concedo
c757fbee1d
fixes to stopper tokens, fixed BLAS mode for GPT2 and GPTJ, updated kobold lite
2023-04-16 21:54:18 +08:00
Concedo
6548d3b3fb
Added prints for stopping sequences, made makefile 1% friendlier to arch linux users
2023-04-16 20:43:17 +08:00
Georgi Gerganov
3173a62eb9
stdout : vertical align outputs for better readability
2023-04-16 13:59:27 +03:00
Concedo
525184930d
added a kobold API compatible implementation of stopping sequences
2023-04-16 18:37:49 +08:00
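As a rough illustration of what a stopping-sequence check does, the sketch below cuts generated text at the earliest occurrence of any stop string. It is not the implementation from this commit. The function name `truncate_at_stop` is hypothetical, and dropping the matched sequence from the output (rather than keeping it) is an assumption.

```python
from typing import Iterable, Tuple


def truncate_at_stop(text: str, stop_sequences: Iterable[str]) -> Tuple[str, bool]:
    """Cut text at the earliest occurrence of any stop sequence.

    Returns the (possibly shortened) text and a flag telling the caller
    whether generation should stop. Assumption: the matched stop sequence
    itself is dropped from the output.
    """
    cut = None
    for seq in stop_sequences:
        if not seq:
            continue  # an empty string would match at position 0
        idx = text.find(seq)
        if idx != -1 and (cut is None or idx < cut):
            cut = idx
    if cut is None:
        return text, False
    return text[:cut], True
```

A generation loop would call this on the accumulated output after each new token and stop sampling once the flag is true.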
Pavol Rusnak
489537e6cf
examples: add missing <ctime> include for time() ( #1011 )
2023-04-16 10:13:00 +00:00
nanahi
2d3481c721
Fix msys2 build error and warnings ( #1009 )
2023-04-16 11:13:42 +02:00
Concedo
8bf2e50a11
converted the cl file to be a string literal instead
2023-04-16 15:57:30 +08:00
Concedo
5a4d1b5d15
Merge branch 'master' into concedo
# Conflicts:
# CMakeLists.txt
# Makefile
2023-04-16 14:08:23 +08:00