Bay 12 Games Forum

Author Topic: Bay 12 Thread Downloader and Filter: New version: 1/4/2014  (Read 5883 times)

Parisbre56

  • Bay Watcher
  • I can haz skullz?
    • View Profile
    • parisbre56 Discord
Bay 12 Thread Downloader and Filter: New version: 1/4/2014
« on: October 10, 2013, 11:45:43 am »

Here's a Java program that transcribes any thread of this forum into a single file for offline reading! It processes about 260 pages a minute. Time might vary depending on connection speed and site traffic, as well as your preferences (check the menu).

The program can also be ordered to only keep the posts of certain users, great for reading only the GM's posts or finding the Toad's posts in the Future of the Fortress thread.
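The per-user filter described above can be sketched roughly as follows. This is only an illustration, not the program's actual code: the `Post` class, the `keepAuthors` method and the sample user names are all hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

final class AuthorFilter {
    // Hypothetical post model; the real program's classes are not shown in this thread.
    static final class Post {
        final String author;
        final String body;

        Post(String author, String body) {
            this.author = author;
            this.body = body;
        }
    }

    // Keep only the posts written by one of the wanted users (case-insensitive match).
    static List<Post> keepAuthors(List<Post> posts, Set<String> wanted) {
        Set<String> lowered = wanted.stream()
                .map(name -> name.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
        return posts.stream()
                .filter(p -> lowered.contains(p.author.toLowerCase(Locale.ROOT)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Post> thread = List.of(
                new Post("TheGM", "Turn 1 begins."),
                new Post("SomePlayer", "PTW"),
                new Post("TheGM", "Turn 2 begins."));
        for (Post p : keepAuthors(thread, Set.of("thegm"))) {
            System.out.println(p.author + ": " + p.body);
        }
    }
}
```

Matching names case-insensitively is a design choice of this sketch; the real program may well compare them exactly.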

The program can combine multiple threads into one, arranging the posts chronologically.
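A minimal sketch of how such a chronological merge could work, assuming post times arrive in the `%a %d-%m-%Y, %H:%M:%S` format mentioned in the changelog. The `ThreadMerger` and `Post` names are made up for illustration, and the Java pattern `EEE dd-MM-yyyy, HH:mm:ss` is my translation of that format, not something taken from the program's source.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

final class ThreadMerger {
    // Assumed Java equivalent of "%a %d-%m-%Y, %H:%M:%S", e.g. "Thu 10-10-2013, 11:45:43".
    static final DateTimeFormatter FORUM_TIME =
            DateTimeFormatter.ofPattern("EEE dd-MM-yyyy, HH:mm:ss", Locale.ENGLISH);

    // Hypothetical post model holding the parsed timestamp and the post text.
    static final class Post {
        final LocalDateTime time;
        final String text;

        Post(String rawTime, String text) {
            this.time = LocalDateTime.parse(rawTime, FORUM_TIME);
            this.text = text;
        }
    }

    // Concatenate all threads, then sort by time. List.sort is stable,
    // so posts with identical timestamps keep their original relative order.
    static List<Post> merge(List<List<Post>> threads) {
        List<Post> all = new ArrayList<>();
        for (List<Post> thread : threads) {
            all.addAll(thread);
        }
        all.sort(Comparator.comparing((Post p) -> p.time));
        return all;
    }

    public static void main(String[] args) {
        List<Post> first = List.of(
                new Post("Thu 10-10-2013, 11:45:43", "thread A, post 1"),
                new Post("Mon 14-10-2013, 12:16:39", "thread A, post 2"));
        List<Post> second = List.of(
                new Post("Fri 11-10-2013, 07:19:39", "thread B, post 1"));
        for (Post p : merge(List.of(first, second))) {
            System.out.println(p.time + "  " + p.text);
        }
    }
}
```

Any other date format makes `LocalDateTime.parse` throw a `DateTimeParseException`, which is one plausible reason a tool like this would insist on a single format.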

If the output file ends in .html, the program will split the output into multiple pages based on user preferences (check the menu).
If the output file ends in .txt, the program will keep the output as a single plain text file, great for reading in an e-book or for processing with regular expressions or programs like grep.
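To illustrate the .html/.txt distinction, here is a rough sketch of choosing the output mode from the file extension and splitting posts into pages. The class, the method names and the `thread_1.html` naming scheme are assumptions made for this example; the program's real behaviour may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class OutputSplitter {
    // Only .html output is split into pages; anything else stays a single file.
    static boolean splitsIntoPages(String outputFile) {
        return outputFile.toLowerCase(Locale.ROOT).endsWith(".html");
    }

    // Cut the post list into consecutive pages of at most postsPerPage items.
    static <T> List<List<T>> paginate(List<T> posts, int postsPerPage) {
        if (postsPerPage <= 0) {
            throw new IllegalArgumentException("postsPerPage must be positive");
        }
        List<List<T>> pages = new ArrayList<>();
        for (int start = 0; start < posts.size(); start += postsPerPage) {
            int end = Math.min(start + postsPerPage, posts.size());
            pages.add(new ArrayList<>(posts.subList(start, end)));
        }
        return pages;
    }

    // Hypothetical naming scheme: "thread.html" becomes "thread_1.html", "thread_2.html", ...
    static String pageFileName(String outputFile, int pageNumber) {
        int dot = outputFile.lastIndexOf('.');
        if (dot < 0) {
            return outputFile + "_" + pageNumber;
        }
        return outputFile.substring(0, dot) + "_" + pageNumber + outputFile.substring(dot);
    }

    public static void main(String[] args) {
        List<String> posts = List.of("post 1", "post 2", "post 3", "post 4", "post 5");
        String output = "thread.html";
        if (splitsIntoPages(output)) {
            List<List<String>> pages = paginate(posts, 2);
            for (int i = 0; i < pages.size(); i++) {
                System.out.println(pageFileName(output, i + 1) + " <- " + pages.get(i));
            }
        }
    }
}
```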

Here's the program: B12_PostProcessor.jar (10 MB)
Should work on most common OSs (Windows, Linux, Mac) and architectures (x86, x86_64) with an up-to-date Java VM installed.

Here's the source code if anyone wants to mess with it. (30 MB, Eclipse project)

Changelog:
1/4/2014: New version can combine multiple files into one based on time. Unfortunately, it can only process times in the %a %d-%m-%Y, %H:%M:%S format, so make sure to log in and change your date format to that if you want to use this feature until I get around to fixing it.
28/3/2014: Ability to split output to multiple files. New options menu. Various bugfixes.
2/11/2013: New version can save output as a "lightweight" .txt file.
1/11/2013: Fixed an out-of-memory error that occurred in 32-bit Windows JVMs when processing more than 1069 pages.
23/10/2013: New version can login and can download images and forum theme images. It also utilizes multiple downloader threads to reduce download time. Finally, it has a better GUI.
11/10/2013: New version should work on most common OSs (Windows, Linux, Mac) and architectures (x86, x86_64) with an up-to-date Java VM installed.
10/10/2013: Made the program create a window that acts as a terminal. This means that you can now just double click the file instead of having to launch it from the terminal. Bad thing is, it only works on 64-bit Linux now. I'll fix that tomorrow. Should be an easy fix (famous last words).
« Last Edit: June 03, 2014, 08:15:11 pm by Parisbre56 »
Logged

Xantalos

  • Bay Watcher
  • Your Friendly Salvation
    • View Profile
Re: Bay 12 Thread Filter Thingy
« Reply #1 on: October 10, 2013, 12:17:10 pm »

Hells yeahs PTW.
Logged
Sig! Onol
Quote from: BFEL
XANTALOS, THE KARATEBOMINATION
Quote from: Toaster
((The Xantalos Die: [1, 1, 1, 6, 6, 6]))

miauw62

  • Bay Watcher
  • Every time you get ahead / it's just another hit
    • View Profile
Re: Bay 12 Thread Filter Thingy
« Reply #2 on: October 10, 2013, 12:46:04 pm »

Does this work for all Simple Machines-powered forums? (or could it be edited to do so?)

Either way, this is fucking awesome.
Logged

Quote from: NW_Kohaku
they wouldn't be able to tell the difference between the raving confessions of a mass murdering cannibal from a recipe to bake a pie.
Knowing Belgium, everyone will vote for themselves out of mistrust for anyone else, and some kind of weird direct democracy coalition will need to be formed from 11 million or so individuals.

Lectorog

  • Bay Watcher
    • View Profile
Re: Bay 12 Thread Filter Thingy
« Reply #3 on: October 10, 2013, 12:51:26 pm »

Man oh man. This has so much impractical use. PTW. I'll actually be using it if optional theme formatting and image downloading are implemented.
Logged

Parisbre56

  • Bay Watcher
  • I can haz skullz?
    • View Profile
    • parisbre56 Discord
Re: Bay 12 Thread Filter Thingy
« Reply #4 on: October 10, 2013, 04:06:53 pm »

Quote from: miauw62
Does this work for all Simple Machines-powered forums? (or could it be edited to do so?)

Theoretically, with a bit of modification, yeah, but I haven't tried it yet.

Changed the first post.

EDIT: A new version is up!

EDIT2: Updated with some speed ups and bug fixing.
« Last Edit: October 10, 2013, 06:37:07 pm by Parisbre56 »
Logged

kisame12794

  • Bay Watcher
  • !!Arc Welder!!
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #5 on: October 10, 2013, 07:55:06 pm »

You glorious bastard. PTW.
Logged
The non-assholes vastly outnumber the assholes but the assholes can fart with greater volume.
((You're an arm and a torso in low orbit. This was the best possible resolution of things.))

Parisbre56

  • Bay Watcher
  • I can haz skullz?
    • View Profile
    • parisbre56 Discord
Re: Bay 12 Thread Downloader and Filter
« Reply #6 on: October 11, 2013, 07:19:39 am »

New version is up. It should now work on most common OSs (Windows, Linux, Mac) and architectures (x86, x86_64) with an up-to-date Java VM installed.
Tested it on 64-bit Windows and Linux. Tell me if it doesn't work on your system and I'll see what I can do.

It took some messing around with Ant and the help of these two pages:
http://mchr3k.github.io/swtjar/
http://timeme.eclipselabs.org.codespot.com/git.wiki/SWT.wiki
Check them out if you're planning on creating a cross platform Java app that uses SWT to generate its interface. Or just copy my source code and rewrite the Main class to suit your needs.

Now to start rewriting the program to make it "smarter".

Person

  • Bay Watcher
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #7 on: October 11, 2013, 04:37:18 pm »

I think I'll be having some forum games to read through once this thing does images.
Logged
Please don't let textbooks invade Bay12.
The Conquistadors only have the faintest idea of what the modern world is like when they are greeted by two hostile WWI Veterans riding on a giant potato; Welcome to 2016.

Elephant Parade

  • Bay Watcher
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #8 on: October 13, 2013, 07:17:38 pm »

This does look interesting.

Edit: Now to use this to read the Homestuck thread. Except not, because that would take far too long. I like how it preserves the formatting.
« Last Edit: October 13, 2013, 07:21:29 pm by Elephant Parade »
Logged

Parisbre56

  • Bay Watcher
  • I can haz skullz?
    • View Profile
    • parisbre56 Discord
Re: Bay 12 Thread Downloader and Filter
« Reply #9 on: October 14, 2013, 12:16:39 pm »

Version 2 is coming along nicely. At this rate, I should be ready for a release near the end of the week or sometime in the next week. Most of my work so far has gone to designing a simple GUI and a login system. Next up is upgrading the downloader itself to download images, downloading through multiple connections to increase download speed and making it smarter so that it can filter messages according to their content.

Speaking of filters, as of right now, I'm working on two filters. One that removes out of character content by removing any text enclosed in (( )) and another that checks for types of text like Italics, Bold, Underlined, Coloured, etc. Any ideas for other filters you would like to have?
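For illustration, the out-of-character filter could be as simple as a regular expression that deletes anything between `((` and `))`. This is only a sketch of the idea; the `OocFilter` class and `strip` method are hypothetical names, not the program's actual code.

```java
import java.util.regex.Pattern;

final class OocFilter {
    // Non-greedy match from "((" to the nearest "))"; DOTALL lets a remark span several lines.
    private static final Pattern OOC = Pattern.compile("\\(\\(.*?\\)\\)", Pattern.DOTALL);

    // Remove every (( ... )) block and trim leftover whitespace at the ends.
    static String strip(String post) {
        return OOC.matcher(post).replaceAll("").trim();
    }

    public static void main(String[] args) {
        System.out.println(strip("I attack the goblin. ((brb, dinner))"));
    }
}
```

A regex like this cannot handle an OOC remark that itself contains `))`, and it leaves a doubled space where a remark was cut from the middle of a sentence, so a real filter would probably need some extra cleanup.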
« Last Edit: October 14, 2013, 12:38:44 pm by Parisbre56 »
Logged

sjm9876

  • Bay Watcher
  • Did not so much Fall as Saunter Vaguely Downwards
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #10 on: October 14, 2013, 12:19:44 pm »

This looks very interesting. PTW
Logged
My dreams are not unlike yours - they long for the safety, and break like a glass chandelier.
But there's laughter and oh there is love, just past the edge of our fears.
And there's chaos when push comes to shove, but it's music to my ears.

Sigtext

miauw62

  • Bay Watcher
  • Every time you get ahead / it's just another hit
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #11 on: October 14, 2013, 12:20:32 pm »

Removal of all the shit in between posts. Just a name and a link to the profile, maybe an avatar instead of a name, an image, contact details, the topic reply name, the date of the reply, the signature, the little report to moderator and IP logged thingie in the bottom left, etc.
Logged

Quote from: NW_Kohaku
they wouldn't be able to tell the difference between the raving confessions of a mass murdering cannibal from a recipe to bake a pie.
Knowing Belgium, everyone will vote for themselves out of mistrust for anyone else, and some kind of weird direct democracy coalition will need to be formed from 11 million or so individuals.

Armok

  • Bay Watcher
  • God of Blood
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #12 on: October 15, 2013, 01:04:29 am »

This is great

If we're making wishlists, what I could really use is if it stripped away all the formatting and signatures and junk and output a plain text file viewable on a Kindle. Just name, post date, main text of post, repeat.
Logged
So says Armok, God of blood.
Sszsszssoo...
Sszsszssaaayysss...
III...

LordSlowpoke

  • Bay Watcher
    • View Profile
Re: Bay 12 Thread Downloader and Filter
« Reply #13 on: October 15, 2013, 05:39:30 am »

PTW
Logged

Parisbre56

  • Bay Watcher
  • I can haz skullz?
    • View Profile
    • parisbre56 Discord
Re: Bay 12 Thread Downloader and Filter
« Reply #14 on: October 21, 2013, 04:36:01 pm »

Bug squashing took more time than expected (mostly due to personal issues and my lack of experience with working with SWT), but lo and behold:
A speed of about 260 pages per minute (which translates to about 3800 posts per minute at 15 posts per page; speed should increase with higher settings) with 6 threads downloading and processing data (I'll probably add an option to increase the number of threads).
Now all that's left is the ability to download images and it'll be ready for the next release.
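The multi-threaded download described above could look roughly like this, using a fixed pool of 6 worker threads. It is a sketch under assumptions: `ParallelDownloader` and `downloadAll` are invented names, and the `fetchPage` function stands in for whatever actually retrieves a page, so nothing here touches the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

final class ParallelDownloader {
    // Fetch pages 0..pageCount-1 on a fixed-size pool and return them in page order.
    static List<String> downloadAll(int pageCount, int threads, IntFunction<String> fetchPage) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int page = 0; page < pageCount; page++) {
                final int pageIndex = page;
                Callable<String> task = () -> fetchPage.apply(pageIndex);
                pending.add(pool.submit(task));
            }
            // Collecting the futures in submission order keeps the pages in order,
            // even if the workers finish out of order.
            List<String> pages = new ArrayList<>();
            for (Future<String> future : pending) {
                pages.add(future.get());
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("download interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a page failed to download", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in fetcher; a real one would request the page over HTTP.
        List<String> pages = downloadAll(10, 6, i -> "contents of page " + i);
        pages.forEach(System.out::println);
    }
}
```

Making the thread count a parameter matches the idea of adding an option for it later.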