From 0c74b82f2c5188aaa39c744b61972dfc61fa43d3 Mon Sep 17 00:00:00 2001
From: niansa/tuxifan
Date: Fri, 28 Jul 2023 01:59:55 +0200
Subject: [PATCH] Added LLaMA 2 usage instructions

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4538d9123..6c49526a3 100644
--- a/README.md
+++ b/README.md
@@ -634,7 +634,7 @@ python3 convert.py pygmalion-7b/ --outtype q4_1
 - The LLaMA models are officially distributed by Facebook and will **never** be provided through this repository.
 - Refer to [Facebook's LLaMA repository](https://github.com/facebookresearch/llama/pull/73/files) if you need to request access to the model data.
 
-### Obtaining the Facebook LLaMA 2 model
+### Obtaining and using the Facebook LLaMA 2 model
 
 - Refer to [Facebook's LLaMA download page](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) if you want to access the model data.
 - Alternatively, if you want to save time and space, you can download already converted and quantized models from [TheBloke](https://huggingface.co/TheBloke), including:
@@ -642,6 +642,8 @@ python3 convert.py pygmalion-7b/ --outtype q4_1
 - [LLaMA 2 13B base](https://huggingface.co/TheBloke/Llama-2-13B-GGML)
 - [LLaMA 2 7B chat](https://huggingface.co/TheBloke/Llama-2-7B-chat-GGML)
 - [LLaMA 2 13B chat](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML)
+- Specify `-eps 1e-5` for best generation quality
+- Specify `-gqa 8` for the 70B models to work
 
 ### Verifying the model files
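As a sketch of how the two flags added by this patch might be combined in a single invocation: the model filename below is illustrative (the weights are not distributed through the repository), and the command is only assembled and echoed rather than executed.

```shell
# Illustrative command line using the flags from the patch above.
# The model path is a placeholder for a locally converted/quantized
# LLaMA 2 70B file, not a file provided by this repository.
CMD="./main -m ./models/llama-2-70b-chat.ggmlv3.q4_1.bin"
CMD="$CMD -eps 1e-5"   # RMS-norm epsilon recommended for generation quality
CMD="$CMD -gqa 8"      # grouped-query attention factor needed by the 70B models
CMD="$CMD -p 'Hello'"

# Print the assembled command instead of running it here.
echo "$CMD"
```

For the 7B and 13B models the `-gqa` flag is not needed; only `-eps 1e-5` applies.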