From d4b9423560c508d7583f1442843a7434ce57ba43 Mon Sep 17 00:00:00 2001
From: John <78893154+cmp-nct@users.noreply.github.com>
Date: Sat, 17 Jun 2023 16:23:01 +0200
Subject: [PATCH] Update README.md

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index edbad27ea..b1aa85f74 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,12 @@
 llama.cpp modification to run Falcon (work in progress)
 Status:
-Quantization works except for Q_K_ types
-CUDA not yet functional
+* Quantization works except for Q_K_ types
+* CUDA not yet functional
+* It appears the Q5 Falcon 40B inference time on CPU is as fast as the A100 fp16 inference time at 2 tk/second
+CPU inference examples:
 ```
 Q:\ggllm.cpp> .\build\bin\Release\falcon_main.exe -t 31 -m Q:\models\falcon-40b\q5_1 -p "Love relates to hate like" -n 50 -ngl 0
 main: build = 677 (dd3d346)