Commit graph

1386 commits

Author SHA1 Message Date
pudepiedj
6189a9ef3a One more trailing ws 2023-10-09 16:55:31 +01:00
pudepiedj
9abc92545c Remove trailing ws 2023-10-09 16:53:27 +01:00
pudepiedj
7636c34891 Merge branch 'context-sensitive-help' of https://github.com/pudepiedj/llama.cpp into context-sensitive-help 2023-10-09 16:10:15 +01:00
pudepiedj
094d6d6e09 Add help list 2023-10-09 16:10:10 +01:00
pudepiedj
49244be7e9
Merge branch 'ggerganov:master' into context-sensitive-help 2023-10-09 13:51:08 +01:00
pudepiedj
3f07ed90a4 Added prompt-file to hep 2023-10-09 13:45:12 +01:00
slaren
95bd60a0a6
ggml-alloc : fix assert in debug builds (#3555) 2023-10-09 15:44:58 +03:00
pudepiedj
51446bf921 Naming convention 2023-10-09 13:35:52 +01:00
Georgi Gerganov
fcca0a7004
refact : fix convert script + zero out KV cache to avoid nans (#3523)
* refact : fix convert script + zero out KV cache to avoid nans

* ggml : silu(-inf) should never happen

* metal : assert various kernel requirements
2023-10-09 14:32:17 +03:00
Georgi Gerganov
dcc09d2596
metal : do not use mul_mm kernels when ne00 < 64 (#3542) 2023-10-09 14:28:27 +03:00
pudepiedj
f6e92a84a2 Merge branch 'master' into context-sensitive-help 2023-10-09 12:21:31 +01:00
pudepiedj
990e8cb329 New comment 2023-10-09 10:36:11 +01:00
pudepiedj
3e4de67fdd Update find_implemented_args.py 2023-10-09 10:01:21 +01:00
pudepiedj
2e17fcfdba Comment in common.cpp 2023-10-09 09:46:26 +01:00
pudepiedj
32bdf0ee4b Final reconciliation 2023-10-09 09:10:07 +01:00
pudepiedj
982c908984 Update contextual help 2023-10-08 22:26:13 +01:00
Georgi Gerganov
db3abcc114
sync : ggml (ggml-backend) (#3548)
* sync : ggml (ggml-backend)

ggml-ci

* zig : add ggml-backend to the build
2023-10-08 20:19:14 +03:00
Matheus C. França
eee42c670e
ci : add Zig CI/CD and fix build (#2996)
* zig CI/CD and fix build

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>

* fix build_compiler

* ci : remove trailing whitespace

---------

Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 16:59:20 +03:00
Ryder Wishart
8e6716a102
api_like_OAI.py : compat with Microsoft Guidance (#2746)
Check for None in addition to empty string check in all request params

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-08 13:55:58 +03:00
arcrank
9c38d181d4
api_like_OAI.py : simplify function (#2796)
Simplify function
2023-10-08 13:52:57 +03:00
Johannes Rudolph
a1202a31ed
k-quants : fix comments about block sizing (#3499) 2023-10-08 13:21:19 +03:00
Georgi Gerganov
94e502dfb7
ci : enable on obj-c changes + fix metal build (#3540) 2023-10-08 11:24:50 +03:00
Luo Tian
7d8b24932f
zig : fix build by introducing train.cpp (#3539) 2023-10-08 11:24:01 +03:00
Georgi Gerganov
b0ec5218c3
metal : support MTLGPUFamily < Apple7, formatting, style (#3524)
* metal : improve decoding speed for batches of 2-16

* metal : rename kernels mul_mat_ to mul_mv_

* metal : indentations

* minor

* metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7
2023-10-08 10:01:53 +03:00
Kerfuffle
63d3b06a43
llama : fix missing break in Persimmon arch case statements (#3535) 2023-10-08 08:22:17 +03:00
Kerfuffle
a16e89cec8
Fix trying to strip newline from empty prompt and cfg prompt file content (#3534) 2023-10-07 15:31:41 -06:00
pudepiedj
9c5d6f0ef6 Update helper dev 2023-10-07 21:40:45 +01:00
M. Yusuf Sarıgöz
4d03833211
gguf.py : fix CI for publishing GGUF package (#3532)
* Fix CI for publishing GGUF package

* Bump version

* fix

* bump version

* bump version

* bump version
2023-10-07 22:14:10 +03:00
Tom C
c47066d833
py : change version of numpy requirement to 1.24.4 (#3515)
Co-authored-by: Lyjia <me@lyjia.us>
2023-10-07 12:56:15 +03:00
cebtenzzre
f1782c68de
quantize : fail fast on write errors (#3521) 2023-10-07 11:41:52 +03:00
Jhen-Jie Hong
c26765a0a1
metal : support default.metallib load & reuse code for swift package (#3522)
* metal : support load default.metallib & reuse code for swift package

* metal : use SWIFT_PACKAGE def instead of define GGML_SWIFT
2023-10-07 11:40:27 +03:00
Phillip Kravtsov
0e797c2fc5
llm : support Adept Persimmon 8B (#3410)
* Produces garbage output

* wip: correct tensors up to RoPE

* correct tensors thru RoPE

* Correct outputs through masked & softmax'd KQ

* fp32 works

* Rename adept->persimmon

* Produces correct outputs

* clean up convert scripts

* remove printing logic from ggml.c

* remove prints from llama.cpp & fix merge

* trivial cleanups

* Add offload funcs

* update conversion script to directly take adept artifacts rather than .saftensors file

* Fix norm eps bug

* Support sqr and concat on metal, persimmon-8b-q4 runs correctly

* Small changes from review

* Formatting changes

* Minor changes to conversion script

* Remove old script

* Fix editorconfig formatting

* Fix build

* add overlooked offload code ggml-ci
2023-10-07 10:12:43 +03:00
goerch
3a716b4dae
Fix for #3454 (#3455)
Fix: `sentencepiece` tokenizers with added tokens failed with an incorrect assertion
2023-10-07 06:57:01 +02:00
pudepiedj
0d70518220 Update contextual help 2023-10-06 22:19:29 +01:00
BarfingLemurs
1faaae8c2b
readme : update models, cuda + ppl instructions (#3510) 2023-10-06 22:13:36 +03:00
Mihai
cb13d73a72
server : docs fix default values and add n_probs (#3506) 2023-10-06 21:39:33 +03:00
Kerfuffle
9ca79d5cbb
kv cache slot search improvements (#3493)
* kv cache slot search improvements

* Use n_ctx in kv find slot for consistency

* Ensure kv cache head points to a valid slot in llama_decode internal

* Add some comments to prevent dumb people (like me) from getting confused.
2023-10-06 10:10:13 -06:00
pudepiedj
7a4dcff667 Update contextual help dev 2023-10-06 14:50:17 +01:00
Georgi Gerganov
0c731ca403
prompts : fix editorconfig checks after #3416 2023-10-06 16:36:32 +03:00
pudepiedj
a8777ad84e
parallel : add option to load external prompt file (#3416)
* Enable external file and add datestamp

* Add name of external file at end

* Upload ToK2024

* Delete ToK2024.txt

* Experiments with jeopardy

* Move ParallelQuestions to /proimpts and rename

* Interim commit

* Interim commit

* Final revision

* Remove trailing whitespace

* remove cmake_all.sh

* Remove cmake_all.sh

* Changed .gitignore

* Improved reporting and new question files.

* Corrected typo

* More LLM questions

* Update LLM-questions.txt

* Yet more LLM-questions

* Remove jeopardy results file

* Reinstate original jeopardy.sh

* Update examples/parallel/parallel.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-06 16:16:38 +03:00
Jhen-Jie Hong
97af49fa39
server : reuse llama_sample_token common util (#3494)
* server : reuse llama_sample_token common function

* common : use n_probs for temperature sampling
2023-10-06 15:44:24 +03:00
l3utterfly
16820a5a0d
llama : correct hparams comparison (#3446)
* fixed floating point comparison issues

* updated implementation for hparam comparison to handle inf and NaN

* fixed code review comments

* minor simplification

* rename is_float_eq -> is_float_close

---------

Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
2023-10-06 13:47:59 +03:00
Jhen-Jie Hong
04b2f4386e
ci : fix xcodebuild destinations (#3491)
* ci : fix xcodebuild destinations

* ci : add .swift to paths
2023-10-06 13:36:43 +03:00
pudepiedj
739d6d3022 Automatic helper dev 2023-10-06 09:52:33 +01:00
cebtenzzre
48edda30ee
convert : update Falcon script for new HF config (#3448)
Also adds Falcon-180B support.
Closes #3049

Co-authored-by: jb <jonathan.t.barnard@gmail.com>
2023-10-05 15:00:34 -04:00
Kenvix ⭐
45eba9369f
build : use std::make_tuple() for compatibility with older GCC versions (#3488) 2023-10-05 20:16:39 +03:00
pudepiedj
297b7b6301 Automation 2023-10-05 17:30:48 +01:00
staviq
acec9eaaa9
common : process escape sequences in reverse prompts (#3461) 2023-10-05 19:17:29 +03:00
pudepiedj
275d56e99e Update cmap-example 2023-10-05 15:38:21 +01:00
shibe2
e2583cbc29 CLBlast: Fix handling of on-device tensor data
Fix uploading tensor data to device, including 3D, 4D, and non-contiguous tensors.
Use correct offsets into data that is already in VRAM.
Correct handling of OpenCL events when multiple commands are queued.
2023-10-05 18:25:23 +04:00