llama : add AWQ for llama, llama2, mpt, and mistral models (#4593)

* update: awq support llama-7b model

* update: change order

* update: benchmark results for llama2-7b

* update: mistral 7b v1 benchmark

* update: support 4 models

* fix: Readme

* update: ready for PR

* update: readme

* fix: readme

* update: change order import

* black

* format code

* update: work for both mpt and awqmpt

* update: readme

* Rename to llm_build_ffn_mpt_awq

* Formatted other files

* Fixed params count

* fix: remove code

* update: more detail for mpt

* fix: readme

* fix: readme

* update: change folder architecture

* fix: common.cpp

* fix: readme

* fix: remove ggml_repeat

* update: cicd

* update: cicd

* update: remove use_awq arg

* update: readme

* llama : adapt plamo to new ffn

ggml-ci

---------

Co-authored-by: Trần Đức Nam <v.namtd12@vinai.io>
Co-authored-by: Le Hoang Anh <v.anhlh33@vinai.io>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Nam D. Tran authored 2023-12-27 22:39:45 +07:00 • committed by GitHub

parent 879b690a9e

commit f6793491b5

No known key found for this signature in database

GPG key ID: 4AEE18F83AFDEB23

8 changed files with 443 additions and 5 deletions

awq-py/requirements.txt (normal file, 2 additions)

@@ -0,0 +1,2 @@
+torch>=2.0.0
+transformers>=4.32.0
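
The two minimum versions above can be checked against a local environment before running the AWQ scripts. The following is a minimal sketch, not part of this commit: the helper names `release_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the comparison looks only at the leading numeric release, so pre-release and local suffixes such as `rc1` or `+cu118` are ignored.

```python
import re
from importlib import metadata

# Minimum versions copied from awq-py/requirements.txt.
MINIMUMS = {"torch": "2.0.0", "transformers": "4.32.0"}


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric releases, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS) -> dict:
    """Map each package to True (ok), False (too old), or None (not installed)."""
    status = {}
    for name, minimum in minimums.items():
        try:
            status[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            status[name] = None
    return status


if __name__ == "__main__":
    labels = {True: "ok", False: "too old", None: "not installed"}
    for name, ok in check_environment().items():
        print(f"{name}: {labels[ok]}")
```

For strict PEP 440 semantics (pre-releases, epochs), a dedicated version-parsing library would be the safer choice; this sketch only covers the simple `>=` constraints listed in the file.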
