SimpleChat: Update a bit wrt readme and notes in du

HanishKVC 2024-05-27 03:00:39 +05:30
parent 452813f235
commit 1db965d00d
2 changed files with 50 additions and 11 deletions


@@ -3,9 +3,30 @@
// by Humans for All
//
/**
* Given the limited context size of local LLMs, many a time when the context gets filled
* between the prompt and the response, it can lead to repeating text garbage generation.
* And many a time setting a penalty wrt repetition leads to over-intelligent garbage
* repetition with slight variations. This garbage in turn can lead to overloading of the
* available model context, leading to less valuable responses for subsequent prompts/queries,
* if chat history is sent to the ai model.
*
* So two simple minded garbage trimming logics are experimented with below.
* * one based on progressively-larger-substring-based-repeat-matching-with-partial-skip and
* * another based on char-histogram-driven garbage trimming.
* * in future, characteristics of the histogram over varying lengths could be used to allow
*   for a more aggressive and adaptive trimming logic.
*/
/**
* Simple minded logic to help remove repeating garbage at the end of the string.
* The repetition needs to match perfectly.
*
* The logic progressively goes on probing for longer and longer substring based
* repetition, till there is no longer any repetition. In turn it picks the one with
* the longest chain.
*
* @param {string} sIn
* @param {number} maxSubL
* @param {number} maxMatchLenThreshold
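To make the intent concrete, a minimal usage sketch follows. The module path, the sample string and the tuning values are illustrative assumptions, and the exact shape of the returned value should be checked against the implementation, so the sketch just logs it.

```js
// A minimal sketch exercising the substring-repeat based trimming.
// Assumption: this module lives in datautils.mjs next to the caller.
import { trim_repeat_garbage_at_end } from './datautils.mjs';

// Hypothetical generation whose tail degenerated into a perfect "ab" repeat.
let sTest = "The capital of France is Paris. abababababababababab";
let got = trim_repeat_garbage_at_end(sTest, 10, 40);
console.log(got); // exact return shape depends on the implementation
```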
@@ -44,6 +65,9 @@ export function trim_repeat_garbage_at_end(sIn, maxSubL=10, maxMatchLenThreshold
/**
* Simple minded logic to help remove repeating garbage at the end of the string, till it can't.
* If it's not able to trim, then it will try to skip a char at the end and then trim, a few times.
* This ensures that even if there are multiple runs of garbage with different patterns, the
* logic still tries to munch through them.
*
* @param {string} sIn
* @param {number} maxSubL
* @param {number | undefined} [maxMatchLenThreshold]
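As before, a small usage sketch, with the sample string and arguments being illustrative assumptions; here the tail carries two different repeating runs, which the skip-a-char retry is meant to munch through.

```js
import { trim_repeat_garbage_at_end_loop } from './datautils.mjs';

// Two garbage runs back to back: an "xy" repeat, then a '.' and a '$' run.
let sTest = "Result: 42. xyxyxyxyxyxy.$$$$$$$$$$";
let got = trim_repeat_garbage_at_end_loop(sTest, 10, 40);
console.log(got);
```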
@@ -72,7 +96,14 @@ export function trim_repeat_garbage_at_end_loop(sIn, maxSubL, maxMatchLenThresho
/**
* A simple minded try to trim garbage at the end using histogram driven characteristics.
* There can be variation in the repetitions, as long as no new char crops up.
*
* This tracks the chars and their frequency in a substring of specified length at the end,
* and in turn checks whether moving further into the generated text from the end remains
* within the same char subset or goes beyond it, and based on that either trims the string
* at the end or not. This allows filtering of garbage at the end, even if there are certain
* kinds of small variations in the repeated text wrt the position of seen chars.
*
* Allow the garbage to contain up to maxUniq chars, but at the same time ensure that
* a given type of char ie numerals or alphabets or other types doesn't cross the specified
@@ -135,7 +166,10 @@ export function trim_hist_garbage_at_end(sIn, maxType, maxUniq, maxMatchLenThres
}
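A hypothetical invocation of the histogram driven trimming follows; the sample string and the maxType/maxUniq/maxMatchLenThreshold values are assumptions, picked only to illustrate a tail that varies positionally while staying within a small char subset.

```js
import { trim_hist_garbage_at_end } from './datautils.mjs';

// The tail repeats '.' and '-' with positional variation, but no new char crops up.
let sTest = "Done. .-.-.--.-.-.---.-.--.-.-";
let got = trim_hist_garbage_at_end(sTest, 3, 3, 32);
console.log(got);
```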
/**
* Keep trimming repeatedly using hist_garbage logic, till you no longer can.
* This ensures that even if there are multiple runs of garbage with different patterns,
* the logic still tries to munch through them.
*
* @param {any} sIn
* @param {number} maxType
* @param {number} maxUniq
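And correspondingly, a sketch for the looping variant; the argument list is assumed to mirror trim_hist_garbage_at_end, so verify against the actual signature.

```js
import { trim_hist_garbage_at_end_loop } from './datautils.mjs';

// Multiple differing garbage runs at the end, to be munched through one by one.
let sTest = "Ok. ....----....----##__##__##__";
let got = trim_hist_garbage_at_end_loop(sTest, 3, 3, 32);
console.log(got);
```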


@@ -15,11 +15,13 @@ The UI follows a responsive web design so that the layout can adapt to available
enough manner, in general.
Allows developer/end-user to control some of the behaviour by updating gMe members from browser's devel-tool
console. In parallel, some of the settings directly useful to the end-user can also be changed using the
provided settings ui.
NOTE: The current web service api doesn't expose the model context length directly, so the client logic
doesn't provide any adaptive culling of old messages, nor replacing them with a summary of their content,
etc. However there is an optional sliding window based chat logic, which provides a simple minded culling
of old messages from the chat history before sending to the ai model.
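For instance, from the browser's devel-tool console (a minimal sketch; it assumes the sliding window is controlled by a gMe member named iRecentUserMsgCnt, which should be verified against the current code):

```js
// Hypothetical: send only roughly the last 2 user messages (and related
// responses) to the model; a negative value presumably means no culling.
gMe.iRecentUserMsgCnt = 2;
```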
NOTE: It doesn't set any parameters other than temperature and max_tokens for now. However if someone wants,
they can update the js file or equivalent member in gMe as needed.
@@ -54,6 +56,8 @@ Once inside
* Select between chat and completion mode. By default it is set to chat mode.
* Change the default global settings, if one wants to.
* In completion mode
  * logic by default doesn't insert any role specific "ROLE: " prefix wrt each role's message.
    If the model requires any prefix wrt user role messages, then the end user has to
@@ -104,14 +108,15 @@ by developers who may not be from web frontend background (so inturn may not be
end-use-specific-language-extensions driven flows) so that they can use it to explore/experiment things.
And given that the idea is also to help explore/experiment for developers, some flexibility is provided
to change behaviour easily using the devel-tools/console or the provided minimal settings ui (wrt a few
aspects). Skeletal logic has been implemented to explore some of the end points and ideas/implications
around them.
### General
Me/gMe consolidates the settings which control the behaviour into one object.
One can see the current settings, as well as change/update them using the browser's devel-tool/console.
It is attached to the document object.

bCompletionFreshChatAlways - whether Completion mode collates complete/sliding-window history when
communicating with the server or only sends the latest user query/message.
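A minimal sketch of the same from the devel-tool console; gMe and bCompletionFreshChatAlways are from this readme, while the concrete value is an illustrative assumption:

```js
// Dump the consolidated settings object to see the current behaviour.
console.log(gMe);
// Toggle how completion mode collates history vs only the latest query.
gMe.bCompletionFreshChatAlways = false;
```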
@@ -189,9 +194,9 @@ also be started with a model context size of 1k or more, to be on safe side.
internal n_predict, for now add the same here on the client side, maybe later add max_tokens
to /completions endpoint handling code on server side.
NOTE: One may want to experiment with the frequency/presence penalty fields in chatRequestOptions
wrt the set of fields sent to the server along with the user query, to check how the model behaves
wrt repetitions in general in the generated text response.

An end-user can change this behaviour by editing gMe from the browser's devel-tool/console.
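For example, a minimal sketch from the devel-tool console; the 1.2 values are arbitrary and the exact field names (frequency_penalty, presence_penalty) are assumptions to be checked against the server's completion api:

```js
// Nudge the model away from repeating itself; tune the values as needed.
gMe.chatRequestOptions.frequency_penalty = 1.2;
gMe.chatRequestOptions.presence_penalty = 1.2;
```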