A Dwarf has created a tool that should (I haven't tested it myself yet) capture an entire thread in one go.
You take a file called url_list.txt.
You put each thread you want to download in the file, save it, then run the script. It will back up everything there. URL conversions is a list for converting the web URLs to local file locations (I didn't implement changing them on the fly; we can do that after everything is backed up).
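For example, url_list.txt is just one thread URL per line, so something like this (the exact URL format the script expects is spelled out in the author's post below):

http://www.bay12forums.com/smf/index.php?topic=168375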
If anyone wants to give this a try on the larger threads, let us know: https://www.dropbox.com/s/belwmr34crezlru/topic.zip?dl=0
Hey everyone, I'm the guy who programmed this. It looks like his Dropbox didn't include the actual script, so here is the link to the GitHub repository:
https://github.com/Celebrinborn/Bay12_Imgur_Archive

If you run into any bugs, please let me know. I haven't gotten the dockerfile working yet (so no container swarms, sadly), but the code itself works just fine.
It only works on bay12forums itself; it's custom-made for the forum, so it can't scrape other websites. It takes a thread, copies every post, then copies every image on the page, then moves to the next page in the thread until it's done.
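Roughly, the loop looks like this; a simplified sketch rather than the actual code, and the rel="next" pagination it relies on is an assumption about the SMF markup:

import os
import requests
from bs4 import BeautifulSoup

def archive_thread(thread_url, out_dir="data"):
    # Simplified page-by-page loop (not the exact code from the repo):
    # fetch a page, dump its HTML, download every image on it, then follow
    # the next-page link until the thread runs out of pages. The real
    # script sorts everything into the data/topic/... layout described below.
    os.makedirs(out_dir, exist_ok=True)
    url = thread_url
    page_num = 1
    while url:
        resp = requests.get(url)
        soup = BeautifulSoup(resp.text, "html.parser")

        # raw dump of the page text
        with open(os.path.join(out_dir, f"page_{page_num}.html"), "w", encoding="utf-8") as f:
            f.write(resp.text)

        # copy every image on the page
        for img in soup.find_all("img"):
            src = img.get("src")
            if not src or not src.startswith("http"):
                continue
            name = os.path.basename(src.split("?")[0]) or f"image_{page_num}"
            with open(os.path.join(out_dir, name), "wb") as f:
                f.write(requests.get(src).content)

        # assumption: the page advertises the next page with a rel="next" link;
        # stepping through SMF's topic=ID.OFFSET page URLs would be the alternative
        next_link = soup.find("a", rel="next")
        url = next_link.get("href") if next_link else None
        page_num += 1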
Each image is stored in the data folder under data/topic/topicid/post#
A dump of the text is stored at data/topic/topicid/thread.html
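So for the example topic below (168375), the layout works out to something like this; the topic id and post number here are just for illustration:

from pathlib import Path

topic_id = "168375"     # the value of the topic= parameter in the thread URL
post_number = 3         # illustrative post number
image_dir = Path("data") / "topic" / topic_id / str(post_number)  # images from post #3
text_dump = Path("data") / "topic" / topic_id / "thread.html"     # dump of the thread text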
The URLs in url_list.txt MUST be bay12forums pages that look like
http://www.bay12forums.com/smf/index.php?topic=168375

If it's a different URL, it won't work. I also haven't tested running it on links that have parameters other than topic (I haven't been on this forum since 2012ish, and the guy on Reddit who asked for this didn't give me any example links that included other parameters, so I didn't plan for it).
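If you want to sanity-check your url_list.txt before running the script, a quick filter like this (just a sketch, not something in the repo) will flag the links it can't handle:

from urllib.parse import urlparse, parse_qs

def is_supported(url):
    # only bay12forums topic links with a lone topic= parameter are known to work,
    # e.g. http://www.bay12forums.com/smf/index.php?topic=168375
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    return parsed.netloc.endswith("bay12forums.com") and list(params) == ["topic"]

# print any lines in url_list.txt that the script would choke on
with open("url_list.txt") as f:
    for line in f:
        line = line.strip()
        if line and not is_supported(line):
            print("unsupported:", line)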