Download many PDFs by URL when the file names are supplied in response headers and may collide, while preserving the metadata for each download.
- Go 100%
Prompt claude.ai using 3.7 Sonnet:

```
one additional issue that the metadata json file is also named as the hash of the PDF, but if there are two different URLs that download a matching PDF, then while the hash will be the same the metadata is actually different for the two, so it will collide there. Lets also get a sha1 hash of the URL string itself, and the metadata json file is named with hash of the PDF, then a hyphen, then hash of the URL.
```

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
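The naming scheme described in that prompt can be sketched as follows. This is a minimal illustration, not the code in `main.go`: the helper names `sha1Hex` and `metadataName` are hypothetical, and using SHA-1 for the PDF contents (not only for the URL) is an assumption.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// sha1Hex returns the lowercase hex SHA-1 digest of data.
// (Hypothetical helper, named here for illustration only.)
func sha1Hex(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

// metadataName builds "<hash of PDF>-<hash of URL>.json".
// Assumption: the PDF hash is also SHA-1; the source only states
// SHA-1 explicitly for the URL string.
func metadataName(pdf []byte, url string) string {
	return fmt.Sprintf("%s-%s.json", sha1Hex(pdf), sha1Hex([]byte(url)))
}

func main() {
	// The same PDF bytes fetched from two different URLs.
	pdf := []byte("%PDF-1.4 example bytes")
	fmt.Println(metadataName(pdf, "https://example.com/a.pdf"))
	fmt.Println(metadataName(pdf, "https://example.com/b.pdf"))
}
```

Both names share the same PDF-hash prefix but differ after the hyphen, so identical PDFs fetched from different URLs each keep their own metadata file.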
| File |
|---|
| .gitignore |
| go.mod |
| LICENSE |
| main.go |
| README.md |
fetch-content
Download many PDFs by URL when the file names are supplied in response headers and may collide, while preserving the metadata for each download.
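As a rough sketch of the "file names are in headers" part, the snippet below extracts a file name from a `Content-Disposition` header value using the standard library's `mime.ParseMediaType`. It is an illustration under assumptions, not necessarily how `main.go` does it: the function name `headerFilename` is hypothetical, and in a real download the value would presumably come from `resp.Header.Get("Content-Disposition")`.

```go
package main

import (
	"fmt"
	"mime"
	"path"
)

// headerFilename returns the file name advertised in a
// Content-Disposition header value, or "" if there is none.
// (Hypothetical helper, named here for illustration only.)
func headerFilename(contentDisposition string) string {
	_, params, err := mime.ParseMediaType(contentDisposition)
	if err != nil {
		// Missing or malformed header: no usable name.
		return ""
	}
	name := params["filename"]
	if name == "" {
		return ""
	}
	// Keep only the last path element so a server-supplied name
	// cannot point outside the download directory.
	return path.Base(name)
}

func main() {
	fmt.Println(headerFilename(`attachment; filename="report.pdf"`))
}
```

Because two servers can advertise the same name, the header file name is better recorded as metadata than used as the on-disk name; storing the content under its hash avoids the collision.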