SimpleChat:SlidingWindow: iRecentUserMsgCnt to limit context load

This is disabled by default. However, if enabled, then in addition
to the latest system message, only the last N user messages (those
after the latest system message) and their responses from the ai model
will be sent to the ai-model when querying for a new response.

The specified N also includes the latest user query.
HanishKVC 2024-05-24 01:05:05 +05:30
parent f0dd91d550
commit b57aad79a8
2 changed files with 96 additions and 16 deletions
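
To try the feature from the browser's devel-tools console (gMe and its members are
described in the readme diff below; the values here are only examples):

// Devel-tools console sketch; values are examples, not recommendations.
gMe.iRecentUserMsgCnt = -1;  // default: send the entire chat history
gMe.iRecentUserMsgCnt = 0;   // send only the latest system message, if any
gMe.iRecentUserMsgCnt = 2;   // send latest system message + last 2 user msgs and the ai responses between them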


@@ -14,11 +14,15 @@ own system prompts.
 The UI follows a responsive web design so that the layout can adapt to available display space in a usable
 enough manner, in general.
-NOTE: Given that the idea is for basic minimal testing, it doesnt bother with any model context length and
-culling of old messages from the chat.
-NOTE: It doesnt set any parameters other than temperature for now. However if someone wants they can update
-the js file as needed.
+Allows developer/end-user to control some of the behaviour by updating gMe members from browser's devel-tool
+console.
+NOTE: Given that the idea is for basic minimal testing, it doesnt bother with any model context length and
+culling of old messages from the chat by default. However by enabling the sliding window chat logic, a crude
+form of old messages culling can be achieved.
+NOTE: It doesnt set any parameters other than temperature and max_tokens for now. However if someone wants
+they can update the js file or equivalent member in gMe as needed.

 ## usage
@@ -96,8 +100,8 @@ Once inside
 Me/gMe consolidates the settings which control the behaviour into one object.
 One can see the current settings, as well as change/update them using browsers devel-tool/console.

-bCompletionFreshChatAlways - whether Completion mode collates completion history when communicating
-with the server.
+bCompletionFreshChatAlways - whether Completion mode collates complete/sliding-window history when
+communicating with the server or only sends the latest user query/message.

 bCompletionInsertStandardRolePrefix - whether Completion mode inserts role related prefix wrt the
 messages that get inserted into prompt field wrt /Completion endpoint.
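
For example, assuming gMe is reachable from the devel-tools console as described above,
these flags can be flipped at runtime (a sketch; the shown values invert the defaults set
in the js diff below):

gMe.bCompletionFreshChatAlways = false;          // collate complete/sliding-window history in Completion mode
gMe.bCompletionInsertStandardRolePrefix = true;  // insert role: prefixes into the prompt field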
@@ -106,22 +110,42 @@ One can see the current settings, as well as change/update them using browsers devel-tool/console.
 irrespective of whether /chat/completions or /completions endpoint.
 If you want to add additional options/fields to send to the server/ai-model, and or
-modify the existing options value, for now you can update this global var using
-browser's development-tools/console.
+modify the existing options value or remove them, for now you can update this global var
+using browser's development-tools/console.
+
+iRecentUserMsgCnt - a simple minded SlidingWindow to limit context window load at Ai Model end.
+This is disabled by default. However if enabled, then in addition to latest system message, only
+the last/latest iRecentUserMsgCnt user messages after the latest system prompt and its responses
+from the ai model will be sent to the ai-model, when querying for a new response. IE if enabled,
+only user messages after the latest system message/prompt will be considered.
+
+This specified sliding window user message count also includes the latest user query.
+<0 : Send entire chat history to server
+0 : Send only the system message if any to the server
+>0 : Send the latest chat history from the latest system prompt, limited to specified cnt.
+
+By using gMe's iRecentUserMsgCnt and chatRequestOptions.max_tokens one can try to control the
+implications of loading of the ai-model's context window by chat history, wrt chat response to
+some extent in a simple crude way.

 Sometimes the browser may be stuborn with caching of the file, so your updates to html/css/js
 may not be visible. Also remember that just refreshing/reloading page in browser or for that
 matter clearing site data, dont directly override site caching in all cases. Worst case you may
 have to change port. Or in dev tools of browser, you may be able to disable caching fully.

 Concept of multiple chat sessions with different servers, as well as saving and restoring of
 those across browser usage sessions, can be woven around the SimpleChat/MultiChatUI class and
 its instances relatively easily, however given the current goal of keeping this simple, it has
 not been added, for now.

 By switching between chat.add_system_begin/anytime, one can control whether one can change
 the system prompt, anytime during the conversation or only at the beginning.

 read_json_early, is to experiment with reading json response data early on, if available,
 so that user can be shown generated data, as and when it is being generated, rather than
 at the end when full data is available.
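
A worked illustration of the three iRecentUserMsgCnt cases documented above (hypothetical
message contents; the selection itself is implemented by recent_chat() in the js diff below):

// Given xchat = [system:"be terse", user:q1, assistant:a1, user:q2, assistant:a2, user:q3]
// iRecentUserMsgCnt = -1 -> all six messages are sent
// iRecentUserMsgCnt =  0 -> only the system message is sent
// iRecentUserMsgCnt =  2 -> system, q2, a2, q3 are sent (the count includes the latest query q3)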
@@ -132,3 +156,8 @@ at the end when full data is available.
 if able to read json data early on in future, as and when ai model is generating data, then
 this helper needs to indirectly update the chat div with the recieved data, without waiting
 for the overall data to be available.
+
+## At the end
+
+Also a thank you to all open source and open model developers, who strive for the common good.


@@ -25,21 +25,23 @@ let gUsageMsg = `
 <li> Completion mode doesnt insert user/role: prefix implicitly.</li>
 <li> Use shift+enter for inserting enter/newline.</li>
 </ul>
-<li> Refresh the page to start over fresh.</li>
+<li> If strange responses, Refresh page to start over fresh.</li>
 <ul class="ul2">
-<li> old chat is not culled when sending to server/ai-model.</li>
-<li> either use New CHAT, or refresh if chat getting long.</li>
+<li> [default] old msgs from chat not culled, when sending to server.</li>
+<li> either use New CHAT, or refresh if chat getting long, or</li>
+<li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window.</li>
 </ul>
 </ul>
 `;

+/** @typedef {{role: string, content: string}[]} ChatMessages */

 class SimpleChat {

     constructor() {
         /**
          * Maintain in a form suitable for common LLM web service chat/completions' messages entry
-         * @type {{role: string, content: string}[]}
+         * @type {ChatMessages}
          */
         this.xchat = [];
         this.iLastSys = -1;
@@ -50,6 +52,50 @@ class SimpleChat {
         this.iLastSys = -1;
     }

+    /**
+     * Recent chat messages.
+     * If iRecentUserMsgCnt < 0
+     *   Then return the full chat history.
+     * Else
+     *   Return chat messages from latest going back till the last/latest system prompt,
+     *   while keeping track that the number of user queries/messages doesnt exceed iRecentUserMsgCnt.
+     * @param {number} iRecentUserMsgCnt
+     */
+    recent_chat(iRecentUserMsgCnt) {
+        if (iRecentUserMsgCnt < 0) {
+            return this.xchat;
+        }
+        if (iRecentUserMsgCnt == 0) {
+            console.warn("WARN:SimpleChat:SC:RecentChat:iRecentUsermsgCnt of 0 means no user message/query sent");
+        }
+        /** @type{ChatMessages} */
+        let rchat = [];
+        let sysMsg = this.get_system_latest();
+        if (sysMsg.length != 0) {
+            rchat.push({role: Roles.System, content: sysMsg});
+        }
+        let iUserCnt = 0;
+        let iStart = this.xchat.length;
+        for(let i=this.xchat.length-1; i > this.iLastSys; i--) {
+            if (iUserCnt >= iRecentUserMsgCnt) {
+                break;
+            }
+            let msg = this.xchat[i];
+            if (msg.role == Roles.User) {
+                iStart = i;
+                iUserCnt += 1;
+            }
+        }
+        for(let i = iStart; i < this.xchat.length; i++) {
+            let msg = this.xchat[i];
+            if (msg.role == Roles.System) {
+                continue;
+            }
+            rchat.push({role: msg.role, content: msg.content});
+        }
+        return rchat;
+    }
+
     /**
      * Add an entry into xchat
      * @param {string} role
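
A usage sketch for the recent_chat() added above. Roles.Assistant and the exact
add(role, content) signature are assumed from the surrounding script, and the
message contents are made up:

let chat = new SimpleChat();
chat.add(Roles.System, "be terse");
chat.add(Roles.User, "q1");
chat.add(Roles.Assistant, "a1");
chat.add(Roles.User, "q2");
chat.recent_chat(-1); // [system, q1, a1, q2] - the full history
chat.recent_chat(0);  // [system] only, plus a console warning
chat.recent_chat(1);  // [system, q2] - just the latest user query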
@@ -76,7 +122,7 @@ class SimpleChat {
             div.replaceChildren();
         }
         let last = undefined;
-        for(const x of this.xchat) {
+        for(const x of this.recent_chat(gMe.iRecentUserMsgCnt)) {
             let entry = document.createElement("p");
             entry.className = `role-${x.role}`;
             entry.innerText = `${x.role}: ${x.content}`;
@@ -111,7 +157,7 @@ class SimpleChat {
      */
     request_messages_jsonstr() {
         let req = {
-            messages: this.xchat,
+            messages: this.recent_chat(gMe.iRecentUserMsgCnt),
         }
         return this.request_jsonstr(req);
     }
@@ -123,7 +169,7 @@ class SimpleChat {
     request_prompt_jsonstr(bInsertStandardRolePrefix) {
         let prompt = "";
         let iCnt = 0;
-        for(const chat of this.xchat) {
+        for(const chat of this.recent_chat(gMe.iRecentUserMsgCnt)) {
             iCnt += 1;
             if (iCnt > 1) {
                 prompt += "\n";
@@ -527,6 +573,7 @@ class Me {
         this.multiChat = new MultiChatUI();
         this.bCompletionFreshChatAlways = true;
         this.bCompletionInsertStandardRolePrefix = false;
+        this.iRecentUserMsgCnt = -1;
         // Add needed fields wrt json object to be sent wrt LLM web services completions endpoint.
         this.chatRequestOptions = {
             "temperature": 0.7,
@@ -540,7 +587,7 @@ class Me {
     show_info(elDiv) {

         var p = document.createElement("p");
-        p.innerText = "Settings (gMe)";
+        p.innerText = "Settings (devel-tools-console gMe)";
         p.className = "role-system";
         elDiv.appendChild(p);
@@ -552,6 +599,10 @@ class Me {
         p.innerText = `bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`;
         elDiv.appendChild(p);

+        p = document.createElement("p");
+        p.innerText = `iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`;
+        elDiv.appendChild(p);
+
         p = document.createElement("p");
         p.innerText = `chatRequestOptions:${JSON.stringify(this.chatRequestOptions)}`;
         elDiv.appendChild(p);
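
Finally, combining the two knobs the readme points at (console sketch; 1024 is an
arbitrary example value, and max_tokens handling depends on the server):

gMe.iRecentUserMsgCnt = 3;                // cap how much chat history is sent
gMe.chatRequestOptions.max_tokens = 1024; // cap how much the model generates per response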