common: llama_load_model_from_url split support (#6192)

* llama: llama_split_prefix: fix strncpy not including the string terminator
common: llama_load_model_from_url:
 - fix case-sensitive header name matching
 - support downloading additional splits in parallel
 - hide password in url

* common: EOL EOF

* common: remove redundant LLAMA_CURL_MAX_PATH_LENGTH definition

* common: change max url length

* common: minor comment

* server: support HF URL options

* llama: llama_model_loader fix log

* common: use a constant for max url length

* common: clean up curl if file cannot be loaded in gguf

* server: tests: add split tests, and HF options params

* common: move llama_download_hide_password_in_url inside llama_download_file as a lambda

* server: tests: enable back Release test on PR

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* spacing

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Pierrick Hymbert 2024-03-23 18:07:00 +01:00 committed by GitHub
parent 1997577d5e
commit f482bb2e49
10 changed files with 200 additions and 63 deletions

@@ -2959,7 +2959,7 @@ struct llama_model_loader {
 }
 }
-LLAMA_LOG_INFO("%s: additional %d GGUFs metadata loaded.\n", __func__, n_split);
+LLAMA_LOG_INFO("%s: additional %d GGUFs metadata loaded.\n", __func__, n_split - 1);
 }
 n_kv = gguf_get_n_kv(meta);
@@ -15140,7 +15140,7 @@ int llama_split_prefix(char * dest, size_t maxlen, const char * split_path, int
 // check if dest ends with postfix
 int size_prefix = str_split_path.size() - str_postfix.size();
 if (size_prefix > 0 && str_split_path.find(str_postfix, size_prefix) != std::string::npos) {
-snprintf(dest, std::min((size_t) size_prefix, maxlen), "%s", split_path);
+snprintf(dest, std::min((size_t) size_prefix + 1, maxlen), "%s", split_path);
 return size_prefix;
 }