vbatts/fetch-content

Download PDFs from a large list of URLs when the file names arrive in response headers and may collide, preserving the metadata for each download.

Vincent Batts 67e835eea9	2025-04-03 22:26:40 -04:00

main.go: ruling out a colliding metadata for single PDF

Prompt claude.ai using 3.7 Sonnet:
```
one additional issue that the metadata json file is also named as the hash of the PDF, but if there are two different URLs that download a matching PDF, then while the hash will be the same the metadata is actually different for the two, so it will collide there. Lets also get a sha1 hash of the URL string itself, and the metadata json file is named with hash of the PDF, then a hyphen, then hash of the URL.
```

Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>

.gitignore	Initial commit	2025-04-03 15:19:48 -04:00
go.mod	Initial commit	2025-04-03 15:19:48 -04:00
LICENSE	Initial commit	2025-04-03 15:19:48 -04:00
main.go	main.go: ruling out a colliding metadata for single PDF	2025-04-03 22:26:40 -04:00
README.md	Initial commit	2025-04-03 15:19:48 -04:00
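
The naming scheme described in the latest commit message (hash of the PDF, a hyphen, then a SHA-1 hash of the URL string) could look roughly like the sketch below. This is not taken from `main.go`: the helper names `sha1Hex` and `metadataName` are hypothetical, the `.json` suffix is inferred from "metadata json file", and using SHA-1 for the PDF content is an assumption, since the commit message only specifies SHA-1 for the URL.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// sha1Hex returns the hex-encoded SHA-1 digest of data.
// (Hypothetical helper, not from the repository.)
func sha1Hex(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])
}

// metadataName builds "<hash of PDF>-<hash of URL>.json".
// Assumption: the PDF content is hashed with SHA-1 as well.
func metadataName(pdf []byte, url string) string {
	return sha1Hex(pdf) + "-" + sha1Hex([]byte(url)) + ".json"
}

func main() {
	// The same PDF bytes fetched from two different URLs.
	pdf := []byte("%PDF-1.7 example bytes")
	fmt.Println(metadataName(pdf, "https://example.com/a"))
	fmt.Println(metadataName(pdf, "https://example.org/b"))
}
```

Both names share the same prefix, because the PDF bytes match, but differ after the hyphen, so the two metadata files no longer overwrite each other.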

README.md

fetch-content

Download PDFs from a large list of URLs when the file names arrive in response headers and may collide, preserving the metadata for each download.