From 2797754843481e6772bc4afe9fc2af83dd3ff457 Mon Sep 17 00:00:00 2001
From: John <78893154+cmp-nct@users.noreply.github.com>
Date: Sat, 17 Jun 2023 16:51:34 +0200
Subject: [PATCH] Update README.md

---
 README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 52b7ed300..be2eefeba 100644
--- a/README.md
+++ b/README.md
@@ -3,8 +3,7 @@ llama.cpp modification to run Falcon (work in progress)
 Status:
 * Quantization works except for Q_K_ types
 * CUDA not yet functional
-* context size calculation not proper (cuda as well as cpu)
-
+* python conversion script is very basic (produces ggml v0)
 
 It appears the Q5 Falcon 40B inference time on CPU is as fast as the A100 fp16 inference time at 2 tk/second
 CPU inference examples: