SimpleChat Completion Mode flexibility and cleanup, Settings gMe, Optional sliding window (#7480)
* SimpleChat: A placeholder system prompt, Use usage msg in code

  Just have an alert msg wrt needing javascript enabled in html, and have the usage message come from the js file. Update the usage message a bit. Also enable switch session wrt the setup_ui call. Add a possible system prompt as a placeholder for the system-input.

* SimpleChat:CompletionMode: Allow control of Role: prefix

* SimpleChat:Completion: Avoid Role: prefix; Newline only in between

  In completion mode
  * avoid inserting Role: prefix before each role's message
  * avoid inserting newline at the beginning and end of the prompt message. However if there are multiple role messages, then insert a newline when going from one role's message to the next role's message.

* SimpleChat:CompletionMode: Update readme/usage, trim textarea newline

  Readme update wrt completion mode behavior. Usage help updated wrt completion mode behavior.

  When changing from input to textarea element wrt user input, filtering the last newline at the end of the user input was forgotten; this is fixed now. However if the user wants an explicit newline, they can use shift+enter to insert one, and that won't be removed. The extra newline removal logic uses substring and keyup to keep things simple and avoid some previously noted bugs wrt other events in the key path as well as IME composition et al.

* SimpleChat:SC: Ensure proper clearing/resetting

  Previous logic would have cleared/reset the xchat without doing the same wrt iLastSys, thus leaving it pointing to a now non-existent role-content entry.

  So if a user set a system prompt and used completion mode, it would have done the partial clear after the model response was received. In turn, when the user tries to send a new completion query, handle_user_submit would try to add/update the system prompt if any, which would fail because iLastSys would still be pointing to a non-existent entry.

  This is fixed now by having a proper clear helper wrt the SC class.
* SimpleChat: Update usage note and readme a bit

* SimpleChat:Completion: clear any prev chat history at beginning

  Previously any chat history, including the model response to a completion query, would have been cleared after showing it to the user, at the end of handle_user_submit rather than at the beginning. This gave the flexibility that the user could switch from chat mode to completion mode and have the chat history till then sent to the ai model as part of the completion query. However this flow also had the issue that, if the user switches between different chat sessions after getting a completion response, they can no longer see the completion query and its response that they had just got.

  The new flow moves the clearing of chat history wrt completion mode to the beginning of handle_user_submit, so that the user doesn't lose the last completion mode query and response until a new completion mode query is sent to the model, even if they were to switch between the chat sessions. At the same time, the loss of flexibility wrt implicitly converting previous chat history into part of the completion query doesn't matter, because the end user can now enter multiline queries.

* SimpleChat:Try read json early, if available

  This is for later; the server flow doesn't seem to be sending back data early, at least for the request (inc options) that is currently sent.

  If able to read json data early on in future, as and when the ai model is generating data, then this helper needs to indirectly update the chat div with the received data, without waiting for the overall data to be available.

* SimpleChat: Rename the half asleep mis-spelled global var

* SimpleChat: Common chat request options from a global object

* SimpleChat: Update title, usage and readme a bit

  Keep the title simple so that the print file name doesn't have chars that need to be removed. Update readme wrt some of the new helpers and options. Change Usage list to a list of lists, add a few items and style it to reduce the margin wrt lists.

* SimpleChat:ChatRequestOptions: max_tokens

  Sometimes, based on the query from the user, the ai model may get into a runaway kind of generation with repetitions et al, so add max_tokens to try and limit this runaway behaviour, if possible.

* SimpleChat: Reduce max_tokens to be small but still sufficient

* SimpleChat: Consolidate global vars into gMe, Display to user

  This allows the end user to see the settings used by the logic, and to change/update the settings if they want by using devel-tools/console.

* SimpleChat:SlidingWindow: iRecentUserMsgCnt to limit context load

  This is disabled by default. However if enabled, then in addition to the latest system message, only the last N user messages after the latest system message, and their responses from the ai model, will be sent to the ai-model when querying for a new response.

  This specified N also includes the latest user query.

* SimpleChat: placeholder based usage hint for user-in textarea

* SimpleChat: Try make user experience better, if possible

  Reduce chat history context sent to the server/ai-model to be just the system-prompt, prev-user-request-and-ai-response and cur-user-request, instead of the previous full chat history. This way if there is any response with garbage/repetition, it doesn't mess with things beyond the next question, in some ways.

  Increase max_tokens to 1024, so that a relatively large previous response doesn't eat up the space available wrt next query-response. However don't forget that the server, when started, should also be started with a model context size of 1k or more, to be on the safe side.

  Add frequency and presence penalty fields set to 1.2 to the set of fields sent to the server along with the user query, so that the model is partly set to try to avoid repeating text in its response.
* SimpleChat:Add n_predict (equiv max_tokens) for llamacpp server

  The /completions endpoint of examples/server doesn't take max_tokens; instead it takes the internal n_predict. For now add the same on the client side; maybe later add max_tokens to /completions endpoint handling.

* SimpleChat: Note about trying to keep things simple yet flexible
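
The SlidingWindow entry in the message above describes keeping the latest system prompt plus only the last N user messages and their responses. A minimal standalone sketch of that idea follows. The function name `recentChat`, its parameters, and the sample data are hypothetical and chosen for illustration; the role strings "system" and "user" are also assumptions, and this is not the project's actual method.

```javascript
// Hypothetical standalone sketch of the sliding-window selection.
// messages: array of {role, content}; iLastSys: index of the latest
// system message, or -1 if there is none (both names are illustrative).
function recentChat(messages, iLastSys, iRecentUserMsgCnt) {
    // A negative count disables the window: the full history is used.
    if (iRecentUserMsgCnt < 0) {
        return messages;
    }
    const out = [];
    // Always keep the latest system prompt, if one exists.
    if (iLastSys >= 0) {
        out.push(messages[iLastSys]);
    }
    // Walk backwards from the newest message, stopping at the latest
    // system prompt or once N user messages have been counted.
    let iStart = messages.length;
    let iUserCnt = 0;
    for (let i = messages.length - 1; i > iLastSys && iUserCnt < iRecentUserMsgCnt; i--) {
        if (messages[i].role === "user") {
            iStart = i;
            iUserCnt += 1;
        }
    }
    // Copy everything from the oldest kept user message onwards,
    // skipping system entries since the latest one is already included.
    for (let i = iStart; i < messages.length; i++) {
        if (messages[i].role !== "system") {
            out.push(messages[i]);
        }
    }
    return out;
}

// Sample data, made up for this example.
const chatLog = [
    { role: "system", content: "Be brief." },
    { role: "user", content: "q1" },
    { role: "assistant", content: "a1" },
    { role: "user", content: "q2" },
    { role: "assistant", content: "a2" },
    { role: "user", content: "q3" },
];
const windowed = recentChat(chatLog, 0, 2);
console.log(windowed.map((m) => m.content).join(" | "));
```

With N = 2 the window holds the system prompt, the previous query with its response, and the current query, which corresponds to the "[System, Last Query+Resp, Cur Query]" default mentioned in the usage text.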
parent 9588f196b1
commit b9adcbbf92
4 changed files with 314 additions and 39 deletions
		|  | @ -14,23 +14,86 @@ class ApiEP { | |||
| } | ||||
| 
 | ||||
| let gUsageMsg = ` | ||||
|     <p> Enter the system prompt above, before entering/submitting any user query.</p> | ||||
|     <p> Enter your text to the ai assistant below.</p> | ||||
|     <p> Use shift+enter for inserting enter.</p> | ||||
|     <p> Refresh the page to start over fresh.</p> | ||||
|     <p class="role-system">Usage</p> | ||||
|     <ul class="ul1"> | ||||
|     <li> Set system prompt above, to try control ai response characteristic, if model supports same.</li> | ||||
|         <ul class="ul2"> | ||||
|         <li> Completion mode normally wont have a system prompt.</li> | ||||
|         </ul> | ||||
|     <li> Enter your query to ai assistant below.</li> | ||||
|         <ul class="ul2"> | ||||
|         <li> Completion mode doesnt insert user/role: prefix implicitly.</li> | ||||
|         <li> Use shift+enter for inserting enter/newline.</li> | ||||
|         </ul> | ||||
|     <li> Default ContextWindow = [System, Last Query+Resp, Cur Query].</li> | ||||
|         <ul class="ul2"> | ||||
|         <li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window to expand</li> | ||||
|         </ul> | ||||
|     </ul> | ||||
| `;
 | ||||
| 
 | ||||
| /** @typedef {{role: string, content: string}[]} ChatMessages */ | ||||
| 
 | ||||
| class SimpleChat { | ||||
| 
 | ||||
|     constructor() { | ||||
|         /** | ||||
|          * Maintain in a form suitable for common LLM web service chat/completions' messages entry | ||||
|          * @type {{role: string, content: string}[]} | ||||
|          * @type {ChatMessages} | ||||
|          */ | ||||
|         this.xchat = []; | ||||
|         this.iLastSys = -1; | ||||
|     } | ||||
| 
 | ||||
|     clear() { | ||||
|         this.xchat = []; | ||||
|         this.iLastSys = -1; | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Recent chat messages. | ||||
|      * If iRecentUserMsgCnt < 0 | ||||
|      *   Then return the full chat history | ||||
|      * Else | ||||
|      *   Return chat messages from latest going back till the last/latest system prompt. | ||||
|      *   While keeping track that the number of user queries/messages doesnt exceed iRecentUserMsgCnt. | ||||
|      * @param {number} iRecentUserMsgCnt | ||||
|      */ | ||||
|     recent_chat(iRecentUserMsgCnt) { | ||||
|         if (iRecentUserMsgCnt < 0) { | ||||
|             return this.xchat; | ||||
|         } | ||||
|         if (iRecentUserMsgCnt == 0) { | ||||
|             console.warn("WARN:SimpleChat:SC:RecentChat:iRecentUsermsgCnt of 0 means no user message/query sent"); | ||||
|         } | ||||
|         /** @type{ChatMessages} */ | ||||
|         let rchat = []; | ||||
|         let sysMsg = this.get_system_latest(); | ||||
|         if (sysMsg.length != 0) { | ||||
|             rchat.push({role: Roles.System, content: sysMsg}); | ||||
|         } | ||||
|         let iUserCnt = 0; | ||||
|         let iStart = this.xchat.length; | ||||
|         for(let i=this.xchat.length-1; i > this.iLastSys; i--) { | ||||
|             if (iUserCnt >= iRecentUserMsgCnt) { | ||||
|                 break; | ||||
|             } | ||||
|             let msg = this.xchat[i]; | ||||
|             if (msg.role == Roles.User) { | ||||
|                 iStart = i; | ||||
|                 iUserCnt += 1; | ||||
|             } | ||||
|         } | ||||
|         for(let i = iStart; i < this.xchat.length; i++) { | ||||
|             let msg = this.xchat[i]; | ||||
|             if (msg.role == Roles.System) { | ||||
|                 continue; | ||||
|             } | ||||
|             rchat.push({role: msg.role, content: msg.content}); | ||||
|         } | ||||
|         return rchat; | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Add an entry into xchat | ||||
|      * @param {string} role | ||||
|  | @ -57,7 +120,7 @@ class SimpleChat { | |||
|             div.replaceChildren(); | ||||
|         } | ||||
|         let last = undefined; | ||||
|         for(const x of this.xchat) { | ||||
|         for(const x of this.recent_chat(gMe.iRecentUserMsgCnt)) { | ||||
|             let entry = document.createElement("p"); | ||||
|             entry.className = `role-${x.role}`; | ||||
|             entry.innerText = `${x.role}: ${x.content}`; | ||||
|  | @ -69,17 +132,21 @@ class SimpleChat { | |||
|         } else { | ||||
|             if (bClear) { | ||||
|                 div.innerHTML = gUsageMsg; | ||||
|                 gMe.show_info(div); | ||||
|             } | ||||
|         } | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Add needed fields wrt json object to be sent wrt LLM web services completions endpoint | ||||
|      * Add needed fields wrt json object to be sent wrt LLM web services completions endpoint. | ||||
|      * The needed fields/options are picked from a global object. | ||||
|      * Convert the json into string. | ||||
|      * @param {Object} obj | ||||
|      */ | ||||
|     request_jsonstr(obj) { | ||||
|         obj["temperature"] = 0.7; | ||||
|         for(let k in gMe.chatRequestOptions) { | ||||
|             obj[k] = gMe.chatRequestOptions[k]; | ||||
|         } | ||||
|         return JSON.stringify(obj); | ||||
|     } | ||||
| 
 | ||||
|  | @ -88,18 +155,27 @@ class SimpleChat { | |||
|      */ | ||||
|     request_messages_jsonstr() { | ||||
|         let req = { | ||||
|             messages: this.xchat, | ||||
|             messages: this.recent_chat(gMe.iRecentUserMsgCnt), | ||||
|         } | ||||
|         return this.request_jsonstr(req); | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Return a string form of json object suitable for /completions | ||||
|      * @param {boolean} bInsertStandardRolePrefix Insert "<THE_ROLE>: " as prefix wrt each role's message | ||||
|      */ | ||||
|     request_prompt_jsonstr() { | ||||
|     request_prompt_jsonstr(bInsertStandardRolePrefix) { | ||||
|         let prompt = ""; | ||||
|         for(const chat of this.xchat) { | ||||
|             prompt += `${chat.role}: ${chat.content}\n`; | ||||
|         let iCnt = 0; | ||||
|         for(const chat of this.recent_chat(gMe.iRecentUserMsgCnt)) { | ||||
|             iCnt += 1; | ||||
|             if (iCnt > 1) { | ||||
|                 prompt += "\n"; | ||||
|             } | ||||
|             if (bInsertStandardRolePrefix) { | ||||
|                 prompt += `${chat.role}: `; | ||||
|             } | ||||
|             prompt += `${chat.content}`; | ||||
|         } | ||||
|         let req = { | ||||
|             prompt: prompt, | ||||
|  | @ -171,7 +247,6 @@ let gChatURL = { | |||
|     'chat': `${gBaseURL}/chat/completions`, | ||||
|     'completion': `${gBaseURL}/completions`, | ||||
| } | ||||
| const gbCompletionFreshChatAlways = true; | ||||
| 
 | ||||
| 
 | ||||
| /** | ||||
|  | @ -291,6 +366,8 @@ class MultiChatUI { | |||
|             // allow user to insert enter into their message using shift+enter.
 | ||||
|             // while just pressing enter key will lead to submitting.
 | ||||
|             if ((ev.key === "Enter") && (!ev.shiftKey)) { | ||||
|                 let value = this.elInUser.value; | ||||
|                 this.elInUser.value = value.substring(0,value.length-1); | ||||
|                 this.elBtnUser.click(); | ||||
|                 ev.preventDefault(); | ||||
|             } | ||||
|  | @ -321,6 +398,29 @@ class MultiChatUI { | |||
|         } | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Try read json response early, if available. | ||||
|      * @param {Response} resp | ||||
|      */ | ||||
|     async read_json_early(resp) { | ||||
|         if (!resp.body) { | ||||
|             throw Error("ERRR:SimpleChat:MCUI:ReadJsonEarly:No body..."); | ||||
|         } | ||||
|         let tdUtf8 = new TextDecoder("utf-8"); | ||||
|         let rr = resp.body.getReader(); | ||||
|         let gotBody = ""; | ||||
|         while(true) { | ||||
|             let { value: cur,  done: done} = await rr.read(); | ||||
|             let curBody = tdUtf8.decode(cur, {stream: !done}); // stream mode keeps multi-byte chars split across chunks intact | ||||
|             console.debug("DBUG:SC:PART:", curBody); | ||||
|             gotBody += curBody; | ||||
|             if (done) { | ||||
|                 break; | ||||
|             } | ||||
|         } | ||||
|         return JSON.parse(gotBody); | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * Handle user query submit request, wrt specified chat session. | ||||
|      * @param {string} chatId | ||||
|  | @ -330,6 +430,14 @@ class MultiChatUI { | |||
| 
 | ||||
|         let chat = this.simpleChats[chatId]; | ||||
| 
 | ||||
|         // In completion mode, if configured, clear any previous chat history.
 | ||||
|         // So if user wants to simulate a multi-chat based completion query,
 | ||||
|         // they will have to enter the full thing, as a suitable multiline
 | ||||
|         // user input/query.
 | ||||
|         if ((apiEP == ApiEP.Completion) && (gMe.bCompletionFreshChatAlways)) { | ||||
|             chat.clear(); | ||||
|         } | ||||
| 
 | ||||
|         chat.add_system_anytime(this.elInSystem.value, chatId); | ||||
| 
 | ||||
|         let content = this.elInUser.value; | ||||
|  | @ -344,7 +452,7 @@ class MultiChatUI { | |||
|         if (apiEP == ApiEP.Chat) { | ||||
|             theBody = chat.request_messages_jsonstr(); | ||||
|         } else { | ||||
|             theBody = chat.request_prompt_jsonstr(); | ||||
|             theBody = chat.request_prompt_jsonstr(gMe.bCompletionInsertStandardRolePrefix); | ||||
|         } | ||||
| 
 | ||||
|         this.elInUser.value = "working..."; | ||||
|  | @ -359,6 +467,7 @@ class MultiChatUI { | |||
|         }); | ||||
| 
 | ||||
|         let respBody = await resp.json(); | ||||
|         //let respBody = await this.read_json_early(resp);
 | ||||
|         console.debug(`DBUG:SimpleChat:MCUI:${chatId}:HandleUserSubmit:RespBody:${JSON.stringify(respBody)}`); | ||||
|         let assistantMsg; | ||||
|         if (apiEP == ApiEP.Chat) { | ||||
|  | @ -376,13 +485,6 @@ class MultiChatUI { | |||
|         } else { | ||||
|             console.debug(`DBUG:SimpleChat:MCUI:HandleUserSubmit:ChatId has changed:[${chatId}] [${this.curChatId}]`); | ||||
|         } | ||||
|         // Purposefully clear at end rather than begin of this function
 | ||||
|         // so that one can switch from chat to completion mode and sequece
 | ||||
|         // in a completion mode with multiple user-assistant chat data
 | ||||
|         // from before to be sent/occur once.
 | ||||
|         if ((apiEP == ApiEP.Completion) && (gbCompletionFreshChatAlways)) { | ||||
|             chat.xchat.length = 0; | ||||
|         } | ||||
|         this.ui_reset_userinput(); | ||||
|     } | ||||
| 
 | ||||
|  | @ -462,17 +564,66 @@ class MultiChatUI { | |||
| } | ||||
| 
 | ||||
| 
 | ||||
| let gMuitChat; | ||||
| const gChatIds = [ "Default", "Other" ]; | ||||
| class Me { | ||||
| 
 | ||||
|     constructor() { | ||||
|         this.defaultChatIds = [ "Default", "Other" ]; | ||||
|         this.multiChat = new MultiChatUI(); | ||||
|         this.bCompletionFreshChatAlways = true; | ||||
|         this.bCompletionInsertStandardRolePrefix = false; | ||||
|         this.iRecentUserMsgCnt = 2; | ||||
|         // Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
 | ||||
|         this.chatRequestOptions = { | ||||
|             "temperature": 0.7, | ||||
|             "max_tokens": 1024, | ||||
|             "frequency_penalty": 1.2, | ||||
|             "presence_penalty": 1.2, | ||||
|             "n_predict": 1024 | ||||
|         }; | ||||
|     } | ||||
| 
 | ||||
|     /** | ||||
|      * @param {HTMLDivElement} elDiv | ||||
|      */ | ||||
|     show_info(elDiv) { | ||||
| 
 | ||||
|         var p = document.createElement("p"); | ||||
|         p.innerText = "Settings (devel-tools-console gMe)"; | ||||
|         p.className = "role-system"; | ||||
|         elDiv.appendChild(p); | ||||
| 
 | ||||
|         var p = document.createElement("p"); | ||||
|         p.innerText = `bCompletionFreshChatAlways:${this.bCompletionFreshChatAlways}`; | ||||
|         elDiv.appendChild(p); | ||||
| 
 | ||||
|         p = document.createElement("p"); | ||||
|         p.innerText = `bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`; | ||||
|         elDiv.appendChild(p); | ||||
| 
 | ||||
|         p = document.createElement("p"); | ||||
|         p.innerText = `iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`; | ||||
|         elDiv.appendChild(p); | ||||
| 
 | ||||
|         p = document.createElement("p"); | ||||
|         p.innerText = `chatRequestOptions:${JSON.stringify(this.chatRequestOptions)}`; | ||||
|         elDiv.appendChild(p); | ||||
| 
 | ||||
|     } | ||||
| 
 | ||||
| } | ||||
| 
 | ||||
| 
 | ||||
| /** @type {Me} */ | ||||
| let gMe; | ||||
| 
 | ||||
| function startme() { | ||||
|     console.log("INFO:SimpleChat:StartMe:Starting..."); | ||||
|     gMuitChat = new MultiChatUI(); | ||||
|     for (let cid of gChatIds) { | ||||
|         gMuitChat.new_chat_session(cid); | ||||
|     gMe = new Me(); | ||||
|     for (let cid of gMe.defaultChatIds) { | ||||
|         gMe.multiChat.new_chat_session(cid); | ||||
|     } | ||||
|     gMuitChat.setup_ui(gChatIds[0]); | ||||
|     gMuitChat.show_sessions(); | ||||
|     gMe.multiChat.setup_ui(gMe.defaultChatIds[0], true); | ||||
|     gMe.multiChat.show_sessions(); | ||||
| } | ||||
| 
 | ||||
| document.addEventListener("DOMContentLoaded", startme); | ||||
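
The completion-mode changes in this diff (optional "Role: " prefix, newline only between messages, request options merged in from a shared object) can be condensed into one standalone sketch. The helper name `buildCompletionBody` and the sample values are hypothetical; it only approximates what `request_prompt_jsonstr` and `request_jsonstr` do together, under the assumption that the options object is a flat set of key/value pairs.

```javascript
// Hypothetical helper; not part of the project's code.
// messages: array of {role, content}; options: flat object of request fields.
function buildCompletionBody(messages, options, bInsertRolePrefix) {
    // Optionally prefix each message with "<role>: ".
    const parts = messages.map((m) =>
        bInsertRolePrefix ? `${m.role}: ${m.content}` : m.content);
    // join() puts a newline only between messages, never at the start or end.
    const body = { prompt: parts.join("\n") };
    // Copy the shared request options (temperature, max_tokens, ...) into the body.
    Object.assign(body, options);
    return JSON.stringify(body);
}

// Sample values, made up for this example.
const sampleOptions = { temperature: 0.7, max_tokens: 1024, n_predict: 1024 };
const sampleMessages = [
    { role: "system", content: "Be brief." },
    { role: "user", content: "Hello" },
];
console.log(buildCompletionBody(sampleMessages, sampleOptions, false));
console.log(buildCompletionBody(sampleMessages, sampleOptions, true));
```

Sending both `max_tokens` and `n_predict` follows the commit message's note that the llama.cpp example server's /completions endpoint takes `n_predict` rather than `max_tokens`.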
|  |  | |||