HanishKVC
452813f235
SimpleChat:UI:Settings make boolean button text show meaning
2024-06-01 18:18:14 +05:30
HanishKVC
0dae12ba6b
SimpleChat:UI:Add settings button and bring in settings ui
2024-06-01 18:18:14 +05:30
HanishKVC
e17f5e0204
SimpleChat:UI: Add Div wrapped label+element helpers
...
Move settings related elements to use the new div wrapped ones.
2024-06-01 18:18:14 +05:30
HanishKVC
94bc0b08d8
SimpleChat:UI:Select: dict-name-value, value wrt default, change
...
Take a dict/object of name-value pairs instead of just names.
In turn, specify the actual value to use as the default, rather than
the string representing that value.
Trap the needed change event rather than click wrt select.
2024-06-01 18:18:14 +05:30
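A minimal sketch of the name-value idea from the Select commit above, kept free of DOM calls so it can run anywhere. The function name buildSelectOptions, the returned field names and the example values are all hypothetical; this is not the helper from the actual commit.

```javascript
// Hypothetical helper: turn a dict of name -> value pairs into option
// descriptors, marking the one whose VALUE (not display name) matches
// the default.
function buildSelectOptions(options, defaultValue) {
  return Object.entries(options).map(([name, value]) => ({
    name,                              // text shown to the user
    value,                             // actual value used by the program
    selected: value === defaultValue,  // compare values, not strings
  }));
}

// Example dict; the names and numbers here are assumptions for illustration.
const historyOptions = { "Full": -1, "None": 0, "Last 2": 2 };
const built = buildSelectOptions(historyOptions, 0);
console.log(built.filter((o) => o.selected).map((o) => o.name));
```

In a browser, each descriptor would typically become an option element, and the handler would be attached with addEventListener("change", ...) on the select element, since a change event fires when the chosen option actually changes.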
HanishKVC
1e47a48b30
SimpleChat:UI: Add Select helper and use it wrt ChatHistoryInCtxt
2024-06-01 18:18:14 +05:30
HanishKVC
e42249d82d
SimpleChat:UI: Helper to create bool button and use it wrt settings
2024-06-01 18:18:14 +05:30
HanishKVC
ae7e66d27a
SimpleChat:UI: Add and use a para-create-append helper
...
Also update the config params dump to indicate that one now needs
to use document to get hold of the gMe global object; this is because
of moving to module type js.
Also add ui.mjs to importmap.
2024-06-01 18:18:14 +05:30
HanishKVC
ed345abac8
SimpleChat:DU:Avoid setting frequence/Presence penalty
...
Some models like llama3 were found to still repeat garbage when
these penalties are set, tweaking the garbage slightly so that it
is not exactly the same. So avoid setting these penalties and let
the model's default behaviour apply as is.
Also, the simple-minded histogram-based garbage trimming from the end
works to an extent when the garbage is more predictable and
repetitive.
2024-06-01 18:18:14 +05:30
HanishKVC
a41f701159
SimpleChat:UI: Move html ui base helpers into its own module
2024-06-01 18:18:14 +05:30
HanishKVC
15152af94f
SimpleChat:DU: Cleanup debug log messages
2024-06-01 18:18:14 +05:30
HanishKVC
ae9f610663
SimpleChat:DU: Bring in maxType to the mix along with maxUniq
...
Allow for more uniq chars, but then ensure that a given type of
char, i.e. numerals or alphabets or other types, doesn't cross the
specified maxType limit. This allows intermixed text garbage
to be identified and trimmed.
2024-06-01 18:18:14 +05:30
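The maxUniq and maxType checks described in the commit above can be sketched roughly as below. This is an illustrative reconstruction under assumptions, not the code from the commit: the function names, the three char classes (ASCII only) and the return shape are hypothetical.

```javascript
// Classify a char into one of three coarse types (an assumed grouping).
function charType(ch) {
  if (/[0-9]/.test(ch)) return "num";
  if (/[a-zA-Z]/.test(ch)) return "alpha";
  return "other";
}

// Walk backward from the end of text, learning the set of chars seen.
// Stop at the first new char that would exceed maxUniq distinct chars
// overall, or maxType distinct chars of its own type. If the tail walked
// over is at least minGarbageLen long, treat it as garbage and drop it.
function trimRepeatGarbageAtEnd(text, minGarbageLen, maxUniq, maxType) {
  const seen = new Set();
  const perType = { num: 0, alpha: 0, other: 0 };
  let i = text.length - 1;
  for (; i >= 0; i--) {
    const ch = text[i];
    if (seen.has(ch)) continue; // already learnt, still within the tail
    const t = charType(ch);
    if (seen.size + 1 > maxUniq || perType[t] + 1 > maxType) break;
    seen.add(ch);
    perType[t] += 1;
  }
  const garbageLen = text.length - 1 - i;
  if (garbageLen < minGarbageLen) return { trimmed: false, data: text };
  return { trimmed: true, data: text.slice(0, i + 1) };
}

const noisy = "Hello world." + "ab1".repeat(10);
console.log(trimRepeatGarbageAtEnd(noisy, 8, 3, 2).data);
```

With these hypothetical limits, ordinary prose is left alone because it introduces a fourth distinct char well before the tail reaches the minimum garbage length.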
HanishKVC
d1e73d8777
SimpleChat:DU: Switch trim garbage hist based to maxUniq simple
...
Instead of blindly building a histogram for the specified substring
length and then checking for any new char within the specified min
garbage length limit, now exit the learn state once the specified
maxUniq chars are found. In turn, there should be no new chars within
the specified min garbage length limit.
TODO: Need to track char classes like alphabets, numerals and
special/other chars.
2024-06-01 18:18:14 +05:30
HanishKVC
f33aa28149
SimpleChat:DU: Try trim using histogram based info
...
TODO: May have to add max number of uniq chars in histogram at
end of learning phase.
2024-06-01 18:18:14 +05:30
HanishKVC
6390f3489a
SimpleChat:DU:TrimGarbage if unable try skip char and retry
2024-06-01 18:18:13 +05:30
HanishKVC
54802dc184
SimpleChat:DU: Add trim garbage at end in loop helper
2024-06-01 18:18:13 +05:30
HanishKVC
c83c19ad4c
SimpleChat:DU:BringIn local helper js modules using importmap
...
Use it to bring in a simple trim-garbage-at-end logic, which is
used to trim the received response.
Also, importmap assumes esm / standard js modules, so global
variables aren't implicitly available outside the modules. So add
the needed global as a member of document for now.
2024-06-01 18:18:13 +05:30
Johannes Gäßler
9b596417af
CUDA: quantized KV support for FA vec ( #7527 )
...
* CUDA: quantized KV support for FA vec
* try CI fix
* fix commented-out kernel variants
* add q8_0 q4_0 tests
* fix nwarps > batch size
* split fattn compile via extern templates
* fix flake8
* fix metal tests
* fix cmake
* make generate_cu_files.py executable
* add autogenerated .cu files
* fix AMD
* error if type_v != FP16 and not flash_attn
* remove obsolete code
2024-06-01 08:44:14 +02:00
Georgi Gerganov
a323ec60af
server : update js ( #7670 )
2024-05-31 22:23:04 +03:00
Galunid
0515ad93f4
convert-hf : Handle NotImplementedError in convert-hf-to-gguf ( #7660 )
2024-05-31 17:42:33 +02:00
Johannes Gäßler
c8047d538f
scripts: update compare_llama_bench.py [no ci] ( #7673 )
2024-05-31 16:26:21 +02:00
Daniele
30e238b246
Improve HIP compatibility ( #7672 )
2024-05-31 16:00:29 +02:00
Georgi Gerganov
16926dff92
readme : link homebrew discussion
2024-05-31 15:04:58 +03:00
Georgi Gerganov
0c27e6f62e
ggml : fix loongson compile warnings ( #7537 )
...
* ggml : fix loongson compile warnings
ggml-ci
* Fix loongarch quantize test fail.
Fix unexpected error introduced during rebase code.
* tests : disable json test due to lack of python on the CI node
ggml-ci
---------
Co-authored-by: junchao-loongson <zhaojunchao@loongson.cn>
2024-05-31 14:17:10 +03:00
Galunid
2e32f874e6
Somehow '**' got lost ( #7663 )
2024-05-31 18:24:41 +10:00
Galunid
1af511fc22
Add convert.py removal to hot topics ( #7662 )
2024-05-31 10:09:20 +02:00
Sertaç Özercan
0541f06296
[no ci] docs: add aikit to readme ( #7650 )
...
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-05-31 09:57:16 +10:00
JohnnyB
9022c33646
Fixed painfully slow single process builds. ( #7326 )
...
* Fixed painfully slow single process builds.
* Added nproc for systems that don't default to nproc
2024-05-30 22:32:38 +02:00
Georgi Gerganov
5921b8f089
llama : cache llama_token_to_piece ( #7587 )
...
* llama : cache llama_token_to_piece
ggml-ci
* llama : use vectors and avoid has_cache
ggml-ci
* llama : throw on unknown tokenizer types
ggml-ci
* llama : print a log of the total cache size
2024-05-31 02:01:41 +10:00
Martin Delille
5dcdf94676
Fix conan badge display [no ci] ( #7645 )
2024-05-31 01:07:39 +10:00
Manuel
2e2340de17
Add brew installation instruction to README [no ci] ( #7616 )
2024-05-31 00:58:15 +10:00
Martin Delille
7846540bd2
readme : add Conan badge ( #7638 )
2024-05-30 15:52:50 +03:00
Brian
e6157f94c8
github: add contact links to issues and convert question into research [no ci] ( #7612 )
2024-05-30 21:55:36 +10:00
Galunid
9c4c9cc83f
Move convert.py to examples/convert-legacy-llama.py ( #7430 )
...
* Move convert.py to examples/convert-no-torch.py
* Fix CI, scripts, readme files
* convert-no-torch -> convert-legacy-llama
* Move vocab thing to vocab.py
* Fix convert-no-torch -> convert-legacy-llama
* Fix lost convert.py in ci/run.sh
* Fix imports
* Fix gguf not imported correctly
* Fix flake8 complaints
* Fix check-requirements.sh
* Get rid of ADDED_TOKENS_FILE, FAST_TOKENIZER_FILE
* Review fixes
2024-05-30 21:40:00 +10:00
Chris Elrod
59b0d07766
faster avx512 exp implementation ( #7551 )
...
* faster avx512 exp implementation
* x->r
* improve accuracy, handle special cases
* remove `e`
2024-05-30 21:32:55 +10:00
junchao-loongson
d5c05821f3
ggml : fix loongarch build (O2 issue) ( #7636 )
2024-05-30 12:30:10 +03:00
Johannes Gäßler
972b555ab9
README: explain parallel build [no ci] ( #7618 )
2024-05-30 09:52:39 +02:00
Meng, Hengyu
3854c9d07f
[SYCL] fix intel docker ( #7630 )
...
* Update main-intel.Dockerfile
* workaround for https://github.com/intel/oneapi-containers/issues/70
* reset intel docker in CI
* add missed in server
2024-05-30 16:19:08 +10:00
Galunid
eb57fee51f
gguf-py : Add tokenizer.ggml.pre to gguf-new-metadata.py ( #7627 )
2024-05-30 02:10:40 +02:00
Georgi Gerganov
55d62262a9
metal : remove invalid asserts ( #7617 )
2024-05-29 22:21:20 +03:00
Georgi Gerganov
975ec63ff2
metal : add missing asserts ( #7617 )
2024-05-29 20:45:25 +03:00
Georgi Gerganov
fb76ec31a9
ggml : fix YARN + add tests + add asserts ( #7617 )
...
* tests : add rope tests
ggml-ci
* ggml : fixes (hopefully)
ggml-ci
* tests : add non-cont tests
ggml-ci
* cuda : add asserts for rope/norm + fix DS2
ggml-ci
* ggml : assert contiguousness
* tests : reduce RoPE tests
ggml-ci
2024-05-29 20:17:31 +03:00
Georgi Gerganov
cce3dcffc5
cuda : non-cont concat support ( #7610 )
...
* tests : add non-cont concat tests
* cuda : non-cont concat support
ggml-ci
2024-05-29 15:38:26 +03:00
Radoslav Gerganov
210d99173d
llama-bench : add support for the RPC backend ( #7435 )
2024-05-29 14:45:44 +03:00
slaren
87bdf2a199
ggml : use atomic_flag for critical section ( #7598 )
...
* ggml : use atomic_flag for critical section
* add windows shims
2024-05-29 13:36:39 +02:00
Georgi Gerganov
00281b7be3
scripts : remove mpi remnants
2024-05-29 14:31:18 +03:00
Georgi Gerganov
2ab977282b
sync : ggml
2024-05-29 14:29:52 +03:00
Georgi Gerganov
72de268bec
ggml : restore ggml_rope_xpos_inplace (ggml/0)
...
ggml-ci
2024-05-29 14:29:33 +03:00
Akarshan Biswas
0e8d8bfd6c
Add Arc A750 and Arch linux to readme-sycl.md as verified GPU model and Linux distro ( #7605 )
2024-05-29 16:53:47 +10:00
zhouwg
504f0c340f
ggml : fix typo in ggml.c ( #7603 )
2024-05-29 04:09:31 +02:00
Meng, Hengyu
b864b50ce5
[SYCL] Align GEMM dispatch ( #7566 )
...
* align GEMM dispatch
2024-05-29 07:00:24 +08:00