Bay 12 Games Forum

Author Topic: Bay12_SS: The Shitpost Simulator.  (Read 21300 times)

misko27

  • Bay Watcher
  • Lawful Neutral; Prophet of Pestilence
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #75 on: January 30, 2016, 09:33:40 pm »

I've got 9600, can I get one? I may not be too prolific nowadays, but I was on FG&RP enough to vastly inflate my numbers.
Logged
The Age of Man is over. It is the Fire's turn now

My Name is Immaterial

  • Bay Watcher
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #76 on: January 30, 2016, 09:37:58 pm »

Is it really? Alright, so that textfile is being called "The One Percent", and the rest of the forum is going in a file called "The Other 99 Percent." And, of course, there's still going to be one that's the entirety of the forum.
Yep. There are 454 people with over 3k posts, out of 46315 total members, so it's actually .98024%, to be precise.
If you raise that minimum to 4k, only 345 meet that requirement, or about three-fourths of a percent.
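For anyone who wants to check the arithmetic, here is a quick sketch using only the member counts quoted above (the variable names are just for illustration):

```python
# Member counts as quoted in the post above.
total_members = 46315
over_3k = 454   # members with more than 3,000 posts
over_4k = 345   # members with more than 4,000 posts

# Share of the forum in each group, as a percentage.
pct_3k = over_3k / total_members * 100
pct_4k = over_4k / total_members * 100

print(f"Over 3k posts: {pct_3k:.5f}% of members")
print(f"Over 4k posts: {pct_4k:.3f}% of members")
```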

Hmm. Maybe after midterms, graphing individual post counts will be my next data project.

Detros

  • Bay Watcher
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #77 on: January 30, 2016, 09:38:52 pm »

I put DF Wikipedia page into that online version (http://www.yisongyue.com/shaney/) and got things like (in italics):

Adams has two favourite bugs: The other involves a dwarven executioner, with broken arms unable to use up his $15,000 savings.
Everything comes from Armok: Rivers are created by tracing their paths from the original Armok to Dwarf Fortress
New feature - smeltable traps: Players can use traps and engineering in addition to training an army.Traps can be smelted to produce their corresponding metal bars.
We need to get a job to help Toady with work on Armok: Adams said that the text-based graphics forces players to get a job in order to finish his previous work, Armok.
Slaves to Armok: God of Blood Chapter II: Dwarf Fortress: It was named after a long gap.
How to make steel: For steel production, flux stones are used to make mushroom wine.
On the game's community, Tarn Adams said: I'm lucky to be very powerful.
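
For the curious, output like this comes from a Markov chain: the generator records which word follows each short run of words in the source text, then walks those records at random. Below is a minimal sketch of that idea. It is not the code behind the shaney page or Bay12_SS; the function names, the two-word window, and the sample text are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map every run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting key, up to `length` words."""
    rng = random.Random(seed)
    key = rng.choice(sorted(chain))
    output = list(key)
    while len(output) < length:
        followers = chain.get(tuple(output[-len(key):]))
        if not followers:  # dead end: this run of words only ends the text
            break
        output.append(rng.choice(followers))
    return ' '.join(output)

# Hypothetical sample text, only here to make the sketch runnable.
sample = ("dwarves make steel from iron and flux stones and dwarves make wine "
          "from plump helmets and dwarves make traps from steel bars")
chain = build_chain(sample)
sentence = generate(chain, length=12, seed=0)
print(sentence)
```

Every three-word run in the output also appears somewhere in the input, which is why the results read as almost sensible.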
« Last Edit: January 30, 2016, 09:42:44 pm by Detros »
Logged
Beside other things, bay12forums is also the leader website in calculations of saguaro wood density.
(noted by jwoodward48df)

Aklyon

  • Bay Watcher
  • Fate~
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #78 on: January 30, 2016, 09:44:51 pm »

That could be an interesting project, Immaterial.
Logged
Crystalline (SG)
Sigtext
Quote from: RedKing
It's known as the Oppai-Kaiju effect. The islands of Japan generate a sort anti-gravity field, which allows breasts to behave as if in microgravity. It's also what allows Godzilla and friends to become 50 stories tall, and lets ninjas run up the side of a skyscraper.

chaotic skies

  • Bay Watcher
  • Vibing in anti-space
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #79 on: January 30, 2016, 09:55:10 pm »

Quick question: Where do I put the py file?

EDIT: I don't need hard drive space. I'll buy an external 1TB hard drive for every file if I have to.
« Last Edit: January 30, 2016, 09:57:01 pm by chaotic skies »
Logged
Don't let me start a forum game, smack me with a paper towel roll if needed

Professional Thread Necromancer

Sensei

  • Bay Watcher
  • Haven't tried coffee crisps.
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #80 on: January 30, 2016, 11:41:14 pm »

Interesting, this. I've got over 9000 posts, so I think I qualify.

I'd be curious to see one of just popular forum game OPs. We could run a bunch of them through and then play the first one that makes enough sense to be possible.
Logged
Let's Play: Automation! Bay 12 Motor Company Buy the 1950 Urist Wagon for just $4500! Safety features optional.
The Bay 12 & Mates Discord Join now! Voice/text chat and play games with other Bay12'ers!
Add me on Steam: [DFC] Sensei

Amperzand

  • Bay Watcher
  • Knight of Cerebus
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #81 on: January 30, 2016, 11:43:29 pm »

I approve of this.
Logged
Muh FG--OOC Thread
Quote from: smirk
Quote from: Shadowlord
Is there a word that combines comedy with tragedy and farce?
Heiterverzweiflung. Not a legit German word so much as something a friend and I made up in German class once. "Carefree despair". When life is so fucked that you can't stop laughing.
http://www.collinsdictionary.com

O.Wilde

  • Bay Watcher
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #82 on: January 30, 2016, 11:49:41 pm »

So for all of you wonderful people out there wanting a Loud Whisper Official Posts Authentically Datamined Textfile of Words, here ya go. It took friggin forever even with the scraper (It's 15 fucking megabytes holy shit), but here's 32,000 posts worth of distilled LW. 200 proof.

The weirdest textfile you ever did see. (It's on tinyupload cause it's too big for pastebin)

Side Note: I am actually ridiculously proud of the scraper, having never programmed a working thing in my life beyond a simple calculator. It's a work of art. Even if it is likely horrible optimization wise.

If you want to use the scraper, I'll post the code below. It requires the following:
requests (Can be downloaded through pip)
BeautifulSoup 4 (Also pip)
lxml (Ditto)

I take no responsibility for anything this does to your computer, blah blah blah, use at your own risk, try to not crash any websites. It's commented, but only as far as I understand it. It's likely that some comments are entirely wrong and show a fundamental misunderstanding of everything.

Code:
############ O.Wilde's Bay 12 Scraper ############
################## V.1 1/30/16 ##################

import requests
import bs4
import lxml #Not called directly, but BeautifulSoup needs it installed for the "lxml" parser used below
import re

###########################################################################

def findprofile(profilepage):
    print('Finding pages to scrape...')
    #Removes everything but the user ID number from the profile link. A plain string replace is used because '.' and '?' have special meanings in a regex pattern.
    user = profilepage.strip().replace('http://www.bay12forums.com/smf/index.php?action=profile;u=', '')
    url = 'http://www.bay12forums.com/smf/index.php?action=profile;area=showposts;sa=messages;u=' + user #Adds the User ID number obtained in the last step to navigate to their messages page.
    return url

def scrapeposts(url):
    page = requests.get(url)
    soup = bs4.BeautifulSoup(page.text, "lxml") #Parses the HTML of the messages page so that we can search it
    data = [a.attrs.get('href') for a in soup.select('div.pagesection a.navPages')] #Collects the links to the other pages of posts from the page navigation
    if not data: #No page links, so assume all of the posts fit on this one page
        return str(soup.select('div.list_posts'))
    ppg = int(re.sub('http(.+?)start=', '', data[0])) #The first link points at page 2, so its start value is the number of posts per page
    tp = int(re.sub('http(.+?)start=', '', data[-1])) #Finds the post number that the final page of posts starts on
    pagenumber = tp // ppg + 1 #Finds the total number of pages of posts (integer division keeps this a whole number on Python 3)
    scrapedata = ''
    for counter in range(1, pagenumber + 1): #Works through every page of posts in order
        print('Now scraping page ' + str(counter) + ' out of ' + str(pagenumber) + '!')
        scrapeurl = url + ';start=' + str((counter - 1) * ppg) #Builds the url of this page of posts; the forum separates url parameters with ';'
        scrapepage = requests.get(scrapeurl)
        scrapesoup = bs4.BeautifulSoup(scrapepage.text, "lxml") #Parses the HTML of this page of posts
        scrapetext = scrapesoup.select('div.list_posts') #Selects the posts on this page, which is the data we want to mine
        scrapedata = scrapedata + ' ' + str(scrapetext) #Adds our freshly scraped data to the string of scraped data mined so far.
    print('Done!')
    return scrapedata

def cleardata(data):
    print('Removing HTML...')
    cleareddata = re.sub('<[^<]+?>', ' ', str(data)) #Removes all strings contained within <...>, this is to remove HTML tags. Replaces with a space.
    print('Removing Quote Tags...')
    cleareddata1 = re.sub(r'Quote(.+?)\d\s*[ap]m', ' ', cleareddata) #Removes quote tags ending in a timestamp and replaces them with a space. [ap]m covers am and pm in one step, and the non-greedy (.+?) stops at the first timestamp instead of the last one on the line.
    print('Removing Tabs...')
    cleareddata2 = re.sub(r'\s', ' ', cleareddata1) #Turns tabs, newlines and any other whitespace characters into plain spaces
    print('Removing Non-ASCII Data...')
    cleareddata3 = re.sub(r'[^\x00-\x7F]', ' ', cleareddata2) #Removes any non-ascii characters so textfile can be created.
    print('Removing Spaces...')
    cleareddata4 = re.sub(r'\s+', ' ', cleareddata3) #Removes the clutter of spaces created from the previous few steps, replacing them with a single space each.
    return cleareddata4

def writefile(data, name):
    print('Writing File...')
    with open(name + '.txt', 'w') as file: #Creates a new file in which to save our data, and closes it automatically when the block ends
        file.write(data) #Writes our data to the file
    input('Posts have been scraped, and file created. Thank you for using the Bay 12 Scraper by O.Wilde!')

###########################################################################
   
url = findprofile(input("Please input the profile of the member whose posts you would like to scrape: ")) #Asks for a profile link to scrape, and calls findprofile using that link. Sets url equal to the returned value
messydata = scrapeposts(url)
cleandata = cleardata(messydata)
writefile(cleandata, input('Please input the name of the text file you want to be generated. WARNING: Any file with the same name will be overwritten!!!: '))
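
On the "one step" question in the quote-removal comments: a character class such as [ap] matches either letter, so a single pattern can cover both am and pm. The sketch below shows one way it might look. The sample text is made up, and it assumes a quote header looks like "Quote from: NAME on DATE, TIME am/pm" once the HTML is stripped; the real page text may differ.

```python
import re

def strip_quote_headers(text):
    """Remove 'Quote ... <time> am/pm' headers in a single substitution."""
    # (.+?) is non-greedy, so each match ends at the first timestamp it finds.
    # \d before [ap]m keeps names such as "Sam" from ending the match early.
    without_headers = re.sub(r'Quote(.+?)\d\s*[ap]m', ' ', text)
    return re.sub(r'\s+', ' ', without_headers).strip()

# Hypothetical scraped text with one pm header and one am header.
sample = ("Quote from: Urist on January 30, 2016, 09:33:40 pm I like plump helmets. "
          "Quote from: Sam on January 31, 2016, 12:03:57 am Me too.")
cleaned = strip_quote_headers(sample)
print(cleaned)
```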
Logged
What could pre-industrial societies do, run a bunch of cattle off a cliff? Boo fucking hoo I'll be crying for them while I just dump these litres of acidic chemicals into this river. Scrubs.

chaotic skies

  • Bay Watcher
  • Vibing in anti-space
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #83 on: January 31, 2016, 12:00:46 am »

AND SO THE APOCALYPSE WAS BROUGHT ABOUT BY A BUNCH OF BORED BAY 12 PROGRAMMERS.

This is going to end badly, once I figure out where to put the py file C:
Logged
Don't let me start a forum game, smack me with a paper towel roll if needed

Professional Thread Necromancer

mainiac

  • Bay Watcher
  • Na vazeal kwah-kai
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #84 on: January 31, 2016, 12:02:27 am »

So can you make the bots talk to each other?
Logged
Ancient Babylonian god of RAEG
--------------
[CAN_INTERNET]
[PREFSTRING:google]
"Don't tell me what you value. Show me your budget and I will tell you what you value"
« Last Edit: February 10, 1988, 03:27:23 pm by UR MOM »
mainiac is always a little sarcastic, at least.

O.Wilde

  • Bay Watcher
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #85 on: January 31, 2016, 12:03:57 am »

Oh, sorry! You put the markov file in the Lib folder contained in the python installation folder.

For me, it's <USER>\AppData\Programs\Python\Python35-32\Lib\shitpost.py
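
If your install lives somewhere else, Python can report its own Lib folder. This is a small sketch using the standard sysconfig module; the exact path it prints depends on your machine. Keeping the Markov file in the same folder as the script that imports it should also work, since Python searches the running script's folder for modules.

```python
import os
import sysconfig

# "stdlib" is the standard-library folder, i.e. the Lib folder on Windows.
lib_dir = sysconfig.get_paths()["stdlib"]
print("Standard library folder:", lib_dir)
print("Folder exists:", os.path.isdir(lib_dir))
```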
Logged
What could pre-industrial societies do, run a bunch of cattle off a cliff? Boo fucking hoo I'll be crying for them while I just dump these litres of acidic chemicals into this river. Scrubs.

chaotic skies

  • Bay Watcher
  • Vibing in anti-space
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #86 on: January 31, 2016, 12:10:50 am »

Thanks for the info! I probably should have figured that one out...

Quote from: mainiac
So can you make the bots talk to each other?

With some work, you could probably make the Scraper open the SS after scraping the posts and feed it the file, but that removes the fun of having several thousand posts from hundreds of people mixed.
Logged
Don't let me start a forum game, smack me with a paper towel roll if needed

Professional Thread Necromancer

O.Wilde

  • Bay Watcher
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #87 on: January 31, 2016, 12:16:50 am »

Just a note: If you're using python 3.5, you may need to edit the Markov file so that 'xrange' is replaced by 'range'.
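
An alternative to editing every call by hand is a small shim near the top of the file. This is only a sketch: the Markov file itself isn't shown in this thread, so where exactly it would go is an assumption.

```python
# Python 3 removed xrange; its range already behaves the way xrange did.
try:
    xrange  # exists on Python 2, raises NameError on Python 3
except NameError:
    xrange = range

# Code written for Python 2 can now keep calling xrange unchanged.
squares = [n * n for n in xrange(5)]
print(squares)
```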
Logged
What could pre-industrial societies do, run a bunch of cattle off a cliff? Boo fucking hoo I'll be crying for them while I just dump these litres of acidic chemicals into this river. Scrubs.

chaotic skies

  • Bay Watcher
  • Vibing in anti-space
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #88 on: January 31, 2016, 12:18:07 am »

This is why I have notepad++. "Find xrange, replace with range." So useful for small updates like this :P
Logged
Don't let me start a forum game, smack me with a paper towel roll if needed

Professional Thread Necromancer

Aklyon

  • Bay Watcher
  • Fate~
    • View Profile
Re: Bay12_SS: The Shitpost Simulator.
« Reply #89 on: January 31, 2016, 12:19:42 am »

Anything over about 3 or 4 thousand. If there are other people below that, which I find particularly funny, I'll add those too. Basically, yes. You qualify.
I'd mostly been asking since I expected I'd be fairly below the post counts of the rps. But apparently not; I'm actually still on the first page.
Logged
Crystalline (SG)
Sigtext
Quote from: RedKing
It's known as the Oppai-Kaiju effect. The islands of Japan generate a sort anti-gravity field, which allows breasts to behave as if in microgravity. It's also what allows Godzilla and friends to become 50 stories tall, and lets ninjas run up the side of a skyscraper.