SimpleChat:SlidingWindow: iRecentUserMsgCnt to limit context load
This is disabled by default. However, if enabled, then in addition to the latest system message, only the last N user messages after the latest system message, and their responses from the ai model, will be sent to the ai-model when querying for a new response. The specified N also includes the latest user query.
This commit is contained in:
parent f0dd91d550
commit b57aad79a8

2 changed files with 96 additions and 16 deletions
@@ -14,11 +14,15 @@ own system prompts.

The UI follows a responsive web design so that the layout can adapt to the available display space in a
usable enough manner, in general.

Allows the developer/end-user to control some of the behaviour by updating gMe members from the browser's
devel-tool console.

NOTE: Given that the idea is basic minimal testing, it doesn't bother with model context length and
culling of old messages from the chat by default. However, by enabling the sliding-window chat logic, a
crude form of old-message culling can be achieved.

NOTE: It doesn't set any parameters other than temperature and max_tokens for now. However, if someone
wants, they can update the js file or the equivalent member in gMe as needed.

## usage
@@ -96,8 +100,8 @@ Once inside

Me/gMe consolidates the settings which control the behaviour into one object.
One can see the current settings, as well as change/update them, using the browser's devel-tool/console.

bCompletionFreshChatAlways - whether Completion mode collates the complete/sliding-window history when
communicating with the server, or only sends the latest user query/message.

bCompletionInsertStandardRolePrefix - whether Completion mode inserts a role-related prefix wrt the
messages that get inserted into the prompt field wrt the /Completion endpoint.
@@ -106,22 +110,42 @@ One can see the current settings, as well as change/update them using browsers d

irrespective of whether the /chat/completions or /completions endpoint is used.

If you want to add additional options/fields to send to the server/ai-model, and/or
modify or remove the existing options' values, for now you can update this global var
using the browser's development-tools/console.
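For instance, one might type something like the below into the devel-tools console (a hedged sketch: gMe and its chatRequestOptions member exist per the text above, but the stand-in object, the n_predict field, and the values are only illustrative):

```javascript
// Stand-in for the page's global gMe object, so this sketch runs standalone;
// in the browser one would operate on the real gMe from the devel-tools console.
let gMe = {
    chatRequestOptions: {
        "temperature": 0.7,
        "max_tokens": 1024,
    },
};

// Tweak an existing option's value.
gMe.chatRequestOptions["temperature"] = 0.3;
// Add a new field to send to the server (illustrative name/value).
gMe.chatRequestOptions["n_predict"] = 512;
// Remove a field so it is no longer sent.
delete gMe.chatRequestOptions["max_tokens"];

console.log(JSON.stringify(gMe.chatRequestOptions));
```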
iRecentUserMsgCnt - a simple-minded sliding window to limit the context-window load at the ai-model end.
This is disabled by default. However, if enabled, then in addition to the latest system message, only
the last/latest iRecentUserMsgCnt user messages after the latest system prompt, and their responses
from the ai model, will be sent to the ai-model when querying for a new response. IE if enabled,
only user messages after the latest system message/prompt will be considered.

This specified sliding-window user message count also includes the latest user query.
<0 : Send the entire chat history to the server
 0 : Send only the system message, if any, to the server
>0 : Send the latest chat history from the latest system prompt, limited to the specified count.

By using gMe's iRecentUserMsgCnt and chatRequestOptions.max_tokens, one can try to control, to some
extent and in a simple crude way, how much the chat history loads the ai-model's context window.
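The three regimes above can be sketched as a small standalone helper (a simplified illustration of the sliding-window logic, not the actual SimpleChat code; the role strings and sample messages are assumptions):

```javascript
// Simplified sliding-window selection over a chat transcript.
// iRecentUserMsgCnt < 0 : entire history; 0 : only the latest system msg;
// > 0 : latest N user msgs (and their replies) after the latest system msg.
function slidingWindow(xchat, iRecentUserMsgCnt) {
    if (iRecentUserMsgCnt < 0) {
        return xchat;
    }
    // Index of the latest system message, if any.
    let iLastSys = -1;
    for (let i = 0; i < xchat.length; i++) {
        if (xchat[i].role == "system") iLastSys = i;
    }
    let rchat = (iLastSys >= 0) ? [xchat[iLastSys]] : [];
    // Walk backwards from the end, counting user messages, to find the window start.
    let iUserCnt = 0;
    let iStart = xchat.length;
    for (let i = xchat.length - 1; i > iLastSys; i--) {
        if (iUserCnt >= iRecentUserMsgCnt) break;
        if (xchat[i].role == "user") {
            iStart = i;
            iUserCnt += 1;
        }
    }
    // Emit the window, skipping any stray system entries (already handled above).
    for (let i = iStart; i < xchat.length; i++) {
        if (xchat[i].role != "system") rchat.push(xchat[i]);
    }
    return rchat;
}

let chat = [
    {role: "system", content: "be brief"},
    {role: "user", content: "q1"}, {role: "assistant", content: "a1"},
    {role: "user", content: "q2"}, {role: "assistant", content: "a2"},
    {role: "user", content: "q3"},
];
console.log(slidingWindow(chat, 2).map(m => m.content));
// prints [ 'be brief', 'q2', 'a2', 'q3' ]
```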
Sometimes the browser may be stubborn with caching of the file, so your updates to html/css/js
may not be visible. Also remember that just refreshing/reloading the page in the browser, or for that
matter clearing site data, doesn't directly override site caching in all cases. Worst case, you may
have to change the port. Or, in the dev tools of the browser, you may be able to disable caching fully.

The concept of multiple chat sessions with different servers, as well as saving and restoring of
those across browser usage sessions, can be woven around the SimpleChat/MultiChatUI class and
its instances relatively easily; however, given the current goal of keeping this simple, it has
not been added, for now.

By switching between chat.add_system_begin/anytime, one can control whether one can change
the system prompt anytime during the conversation, or only at the beginning.

read_json_early is to experiment with reading json response data early on, if available,
so that the user can be shown generated data as and when it is being generated, rather than
at the end when the full data is available.
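The early-read idea can be sketched as follows: accumulate the response text chunk by chunk and attempt a parse on each update, surfacing whatever is already usable. This is a standalone simulation under assumed chunk contents; the real helper would be reading chunks off the network response instead:

```javascript
// Try to extract usable data from a partially received JSON body.
// Returns the parsed object once the accumulated text is valid JSON, else null.
function tryParseEarly(accumulated) {
    try {
        return JSON.parse(accumulated);
    } catch (e) {
        return null;  // not enough data yet
    }
}

// Simulated network chunks of a chat/completions style response body.
let chunks = ['{"content": "hel', 'lo wor', 'ld"}'];
let got = "";
let parsed = null;
for (const chunk of chunks) {
    got += chunk;
    parsed = tryParseEarly(got);
    if (parsed != null) {
        // In the real helper, this is where the chat div would be updated.
        console.log("early data:", parsed.content);
    }
}
```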
@@ -132,3 +156,8 @@ at the end when full data is available.

If able to read json data early on in future, as and when the ai model is generating data, then
this helper needs to indirectly update the chat div with the received data, without waiting
for the overall data to be available.

## At the end

Also a thank you to all open source and open model developers, who strive for the common good.
@@ -25,21 +25,23 @@ let gUsageMsg = `
            <li> Completion mode doesnt insert user/role: prefix implicitly.</li>
            <li> Use shift+enter for inserting enter/newline.</li>
        </ul>
        <li> If strange responses, Refresh page to start over fresh.</li>
        <ul class="ul2">
            <li> [default] old msgs from chat not culled, when sending to server.</li>
            <li> either use New CHAT, or refresh if chat getting long, or</li>
            <li> experiment iRecentUserMsgCnt, max_tokens, model ctxt window.</li>
        </ul>
    </ul>
`;

/** @typedef {{role: string, content: string}[]} ChatMessages */

class SimpleChat {

    constructor() {
        /**
         * Maintain in a form suitable for common LLM web service chat/completions' messages entry
         * @type {ChatMessages}
         */
        this.xchat = [];
        this.iLastSys = -1;
@@ -50,6 +52,50 @@ class SimpleChat {
        this.iLastSys = -1;
    }

    /**
     * Recent chat messages.
     * If iRecentUserMsgCnt < 0
     *   Then return the full chat history.
     * Else
     *   Return chat messages from the latest, going back till the last/latest system prompt,
     *   while ensuring the number of user queries/messages doesn't exceed iRecentUserMsgCnt.
     * @param {number} iRecentUserMsgCnt
     */
    recent_chat(iRecentUserMsgCnt) {
        if (iRecentUserMsgCnt < 0) {
            return this.xchat;
        }
        if (iRecentUserMsgCnt == 0) {
            console.warn("WARN:SimpleChat:SC:RecentChat:iRecentUserMsgCnt of 0 means no user message/query sent");
        }
        /** @type {ChatMessages} */
        let rchat = [];
        let sysMsg = this.get_system_latest();
        if (sysMsg.length != 0) {
            rchat.push({role: Roles.System, content: sysMsg});
        }
        let iUserCnt = 0;
        let iStart = this.xchat.length;
        for(let i=this.xchat.length-1; i > this.iLastSys; i--) {
            if (iUserCnt >= iRecentUserMsgCnt) {
                break;
            }
            let msg = this.xchat[i];
            if (msg.role == Roles.User) {
                iStart = i;
                iUserCnt += 1;
            }
        }
        for(let i = iStart; i < this.xchat.length; i++) {
            let msg = this.xchat[i];
            if (msg.role == Roles.System) {
                continue;
            }
            rchat.push({role: msg.role, content: msg.content});
        }
        return rchat;
    }

    /**
     * Add an entry into xchat
     * @param {string} role
|
@ -76,7 +122,7 @@ class SimpleChat {
|
||||||
div.replaceChildren();
|
div.replaceChildren();
|
||||||
}
|
}
|
||||||
let last = undefined;
|
let last = undefined;
|
||||||
for(const x of this.xchat) {
|
for(const x of this.recent_chat(gMe.iRecentUserMsgCnt)) {
|
||||||
let entry = document.createElement("p");
|
let entry = document.createElement("p");
|
||||||
entry.className = `role-${x.role}`;
|
entry.className = `role-${x.role}`;
|
||||||
entry.innerText = `${x.role}: ${x.content}`;
|
entry.innerText = `${x.role}: ${x.content}`;
|
||||||
|
@@ -111,7 +157,7 @@ class SimpleChat {
     */
    request_messages_jsonstr() {
        let req = {
            messages: this.recent_chat(gMe.iRecentUserMsgCnt),
        }
        return this.request_jsonstr(req);
    }
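The resulting request body then looks roughly like the below. This is a sketch under the assumption that request_jsonstr merges gMe.chatRequestOptions into the request object; the stand-in messages and option values are only illustrative:

```javascript
// Stand-ins for the per-request options and the sliding-window message selection.
let chatRequestOptions = { "temperature": 0.7, "max_tokens": 1024 };
let messages = [
    { role: "system", content: "be brief" },
    { role: "user", content: "hello" },
];

// Merge the options with the selected chat messages into one request object.
let req = Object.assign({ messages: messages }, chatRequestOptions);
let body = JSON.stringify(req);
console.log(body);
```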
@@ -123,7 +169,7 @@ class SimpleChat {
    request_prompt_jsonstr(bInsertStandardRolePrefix) {
        let prompt = "";
        let iCnt = 0;
        for(const chat of this.recent_chat(gMe.iRecentUserMsgCnt)) {
            iCnt += 1;
            if (iCnt > 1) {
                prompt += "\n";
            }
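The loop above amounts to joining the selected messages with newlines, optionally prefixing each with its role. A simplified standalone sketch (not the actual method; the sample messages are illustrative):

```javascript
// Build a /completions prompt string from chat messages.
// bInsertStandardRolePrefix: prefix each message with "role: ",
// as the bCompletionInsertStandardRolePrefix flag described earlier controls.
function buildPrompt(messages, bInsertStandardRolePrefix) {
    let prompt = "";
    let iCnt = 0;
    for (const chat of messages) {
        iCnt += 1;
        if (iCnt > 1) {
            prompt += "\n";
        }
        if (bInsertStandardRolePrefix) {
            prompt += `${chat.role}: `;
        }
        prompt += chat.content;
    }
    return prompt;
}

let msgs = [{role: "user", content: "hi"}, {role: "assistant", content: "hello"}];
console.log(buildPrompt(msgs, true));
// prints:
// user: hi
// assistant: hello
```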
@@ -527,6 +573,7 @@ class Me {
        this.multiChat = new MultiChatUI();
        this.bCompletionFreshChatAlways = true;
        this.bCompletionInsertStandardRolePrefix = false;
        this.iRecentUserMsgCnt = -1;
        // Add needed fields wrt the json object to be sent wrt the LLM web service's completions endpoint.
        this.chatRequestOptions = {
            "temperature": 0.7,
@@ -540,7 +587,7 @@ class Me {
    show_info(elDiv) {

        var p = document.createElement("p");
        p.innerText = "Settings (devel-tools-console gMe)";
        p.className = "role-system";
        elDiv.appendChild(p);
@@ -552,6 +599,10 @@ class Me {
        p.innerText = `bCompletionInsertStandardRolePrefix:${this.bCompletionInsertStandardRolePrefix}`;
        elDiv.appendChild(p);

        p = document.createElement("p");
        p.innerText = `iRecentUserMsgCnt:${this.iRecentUserMsgCnt}`;
        elDiv.appendChild(p);

        p = document.createElement("p");
        p.innerText = `chatRequestOptions:${JSON.stringify(this.chatRequestOptions)}`;
        elDiv.appendChild(p);