diff --git a/examples/llava/README.md b/examples/llava/README.md
index fbb30b860..a2b296331 100644
--- a/examples/llava/README.md
+++ b/examples/llava/README.md
@@ -14,12 +14,41 @@ After building, run: `./bin/llava` to see the usage. For example:
 ./bin/llava path/to/llava-v1.5-7b/ggml-model-q5_k.gguf path/to/llava-v1.5-7b/mmproj-model-f16.gguf path/to/an/image.jpg
 ```

+## Model conversion
+
+1. Clone `llava-v1.5-7b` and `clip-vit-large-patch14-336` locally:
+
+```sh
+git clone https://huggingface.co/liuhaotian/llava-v1.5-7b
+
+git clone https://huggingface.co/openai/clip-vit-large-patch14-336
+```
+
+2. Use `llava_surgery.py` to split the LLaVA model into its LLaMA and multimodal projector constituents:
+
+```sh
+python ./examples/llava/llava_surgery.py -m ../llava-v1.5-7b
+```
+
+3. Use `convert_image_encoder_to_gguf.py` to convert the LLaVA image encoder to GGUF:
+
+```sh
+python ./examples/llava/convert_image_encoder_to_gguf.py -m ../clip-vit-large-patch14-336 --llava-projector ../llava-v1.5-7b/llava.projector --output-dir ../llava-v1.5-7b
+```
+
+4. Use `convert.py` to convert the LLaMA part of LLaVA to GGUF:
+
+```sh
+python ./convert.py ../llava-v1.5-7b
+```
+
+Now both the LLaMA part and the image encoder are in the `llava-v1.5-7b` directory.
+
 ## TODO

 These will be included in this PR:

 - [ ] Better command line interface.
-- [ ] Document model conversion.

 These will be another PR:
diff --git a/examples/llava/convert_hf_to_gguf.py b/examples/llava/convert_image_encoder_to_gguf.py
similarity index 100%
rename from examples/llava/convert_hf_to_gguf.py
rename to examples/llava/convert_image_encoder_to_gguf.py
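
Once the conversion steps above have run, the resulting pair can be smoke-tested with the `llava` binary, mirroring the usage example at the top of the README. This is a minimal sketch: the filenames `ggml-model-f16.gguf` (from `convert.py`) and `mmproj-model-f16.gguf` (from the image encoder conversion) are assumptions about the converters' default output names, so verify what was actually written to `../llava-v1.5-7b` first.

```sh
# Assumed output filenames; substitute whatever the converters actually produced.
./bin/llava ../llava-v1.5-7b/ggml-model-f16.gguf \
  ../llava-v1.5-7b/mmproj-model-f16.gguf \
  path/to/an/image.jpg
```

The `ggml-model-q5_k.gguf` shown in the usage example presumably comes from quantizing the f16 LLaMA GGUF afterwards; the f16 pair is the direct output of the conversion steps.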