SimpleChat: Try make user experience better, if possible
Reduce the chat history context sent to the server/ai-model to just the system-prompt, the previous user-request-and-ai-response, and the current user-request, instead of the full previous chat history. This way, if any response contains garbage/repetition, it doesn't mess with things beyond the next question, in some ways.

Increase max_tokens to 1024, so that a relatively large previous response doesn't eat up the space available for the next query-response. However, don't forget that the server should also be started with a model context size of 1k or more, to be on the safe side.

Add frequency and presence penalty fields, set to 1.2, to the set of fields sent to the server along with the user query, so that the model is partly nudged to try and avoid repeating text in its response.
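As a rough, hypothetical sketch of the idea (not code from this commit): a request body along these lines could be sent to an OpenAI-style chat completions endpoint. The buildRequestBody helper and its argument names are invented for illustration; only the max_tokens and penalty values come from the changes described above.

```js
// Hypothetical sketch: build a chat-completions request body that carries only
// the system prompt, the previous user+assistant exchange and the current query.
function buildRequestBody(systemPrompt, history, currentUserMsg) {
    // history is assumed to be an array of {role, content} objects;
    // keep just the last user-request-and-ai-response pair from it.
    const lastExchange = history.slice(-2);
    const messages = [
        { role: "system", content: systemPrompt },
        ...lastExchange,
        { role: "user", content: currentUserMsg },
    ];
    return {
        messages: messages,
        max_tokens: 1024,        // leave room for a longish reply
        frequency_penalty: 1.2,  // nudge the model away from repeating tokens
        presence_penalty: 1.2,   // nudge it away from dwelling on the same topics
    };
}
```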
parent 11d2d31622
commit 8f172b9070
2 changed files with 29 additions and 6 deletions
@@ -97,6 +97,8 @@ Once inside
## Devel note

### General

Me/gMe consolidates the settings which control the behaviour into one object.
One can see the current settings, as well as change/update them using the browser's devel-tool/console.
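As a small, hedged example of the above (not part of this commit), inspecting that settings object from the browser's devel-tool/console could look like this; gMe, iRecentUserMsgCnt and chatRequestOptions are the names visible in the diff below, the rest is plain console usage.

```js
// In the browser devel-tool/console, with the SimpleChat page loaded.
console.log(gMe);                      // dump the consolidated settings object
console.log(gMe.iRecentUserMsgCnt);    // how much recent chat history gets sent
console.log(gMe.chatRequestOptions);   // extra fields sent with each request
```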
@@ -158,6 +160,27 @@ at the end when full data is available.

for the overall data to be available.

### Default setup

By default things are set up to try and make the user experience a bit better, if possible.
However a developer, when testing the server or ai-model, may want to change these values.

Using iRecentUserMsgCnt, the chat history context sent to the server/ai-model is reduced to
just the system-prompt, the previous user-request-and-ai-response and the current
user-request, instead of the full chat history. This way, if any response contains
garbage/repetition, it doesn't mess with things beyond the next question/request/query,
in some ways.

max_tokens is set to 1024, so that a relatively large previous response doesn't eat up the
space available for the next query-response. However don't forget that the server should
also be started with a model context size of 1k or more, to be on the safe side.

The frequency and presence penalty fields are set to 1.2 in the set of fields sent to the
server along with the user query, so that the model is partly nudged to avoid repeating
text in its response.

An end-user can change this behaviour by editing gMe from the browser's devel-tool/console.

## At the end

Also a thank you to all open source and open model developers, who strive for the common good.
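For a developer who wants to test the server or ai-model with the earlier behaviour, a hedged sketch of undoing these defaults from the browser's devel-tool/console might look like the following; the values come from the lines this commit replaces (visible in the diff below), and setting the penalties to 0.0 only approximates the old behaviour of not sending those fields at all.

```js
// Browser devel-tool/console: roughly revert to the pre-commit defaults.
gMe.iRecentUserMsgCnt = -1;                      // previous default: old msgs not culled
gMe.chatRequestOptions.max_tokens = 512;         // previous default
gMe.chatRequestOptions.frequency_penalty = 0.0;  // effectively neutralise the penalty
gMe.chatRequestOptions.presence_penalty = 0.0;
```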
@@ -25,11 +25,9 @@ let gUsageMsg = `
     <li> Completion mode doesnt insert user/role: prefix implicitly.</li>
     <li> Use shift+enter for inserting enter/newline.</li>
-    </ul>
     <li> If strange responses, Refresh page to start over fresh.</li>
+    <li> Default ContextWindow = [System, Last Query+Resp, Cur Query].</li>
     <ul class="ul2">
-    <li> [default] old msgs from chat not culled, when sending to server.</li>
-    <li> either use New CHAT, or refresh if chat getting long, or</li>
-    <li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window.</li>
+    <li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window to expand</li>
     </ul>
     </ul>
 `;
@@ -573,11 +571,13 @@ class Me {
         this.multiChat = new MultiChatUI();
         this.bCompletionFreshChatAlways = true;
         this.bCompletionInsertStandardRolePrefix = false;
-        this.iRecentUserMsgCnt = -1;
+        this.iRecentUserMsgCnt = 2;
         // Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
         this.chatRequestOptions = {
             "temperature": 0.7,
-            "max_tokens": 512
+            "max_tokens": 1024,
+            "frequency_penalty": 1.2,
+            "presence_penalty": 1.2,
         };
     }