Why the hell don't we already have a dedicated thread for this?
Someone needs to code Bay12 Simulator. Or at least tell me how to do it.
Markov chainsssss.
I actually have one of those somewhere. It's loaded with LW's posts right now, in response to someone saying we needed an LW post generator after a particularly spectacular shitpost.
Jeepers, okay. I was gonna work out some way of bundling a thread-sucker with it so you could gen posts based on one particular thread, but I can just post the script here too.
Note: this is not my code. It is in fact sourced from this. I wrote some code surrounding it, but that's kinda buggy and not really important.
import random


class Markov(object):
    def __init__(self, open_file, chain_size=3):
        self.chain_size = chain_size
        self.cache = {}
        self.open_file = open_file
        self.words = self.file_to_words()
        self.word_size = len(self.words)
        self.database()

    def file_to_words(self):
        # Read the whole file and split it on whitespace
        self.open_file.seek(0)
        data = self.open_file.read()
        words = data.split()
        return words

    def words_at_position(self, i):
        """Uses the chain size to find a list of the words at an index."""
        chain = []
        for chain_index in range(0, self.chain_size):
            chain.append(self.words[i + chain_index])
        return chain

    def chains(self):
        """Generates chains from the given data string based on passed chain size.
        So if our string were:
        "What a lovely day"
        With a chain size of 3, we'd generate:
        (What, a, lovely)
        and
        (a, lovely, day)
        """
        if len(self.words) < self.chain_size:
            return
        # "+ 1" so the final chain (the one ending on the last word) is included
        for i in range(len(self.words) - self.chain_size + 1):
            yield tuple(self.words_at_position(i))

    def database(self):
        # Map each (chain_size - 1)-word key to the list of words seen after it
        for chain_set in self.chains():
            key = chain_set[:self.chain_size - 1]
            next_word = chain_set[-1]
            if key in self.cache:
                self.cache[key].append(next_word)
            else:
                self.cache[key] = [next_word]

    def generate_markov_text(self, size=25):
        # Seed from any position that still has a full chain after it
        seed = random.randint(0, self.word_size - self.chain_size)
        gen_words = []
        seed_words = self.words_at_position(seed)[:-1]
        gen_words.extend(seed_words)
        for i in range(size):  # range works on both Python 2 and 3
            last_word_len = self.chain_size - 1
            last_words = gen_words[-1 * last_word_len:]
            key = tuple(last_words)
            if key not in self.cache:
                # No known follower (e.g. we hit the very end of the text): stop early
                break
            next_word = random.choice(self.cache[key])
            gen_words.append(next_word)
        return ' '.join(gen_words)
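For anyone wondering what the script actually builds: here's a minimal standalone sketch of the kind of lookup table that database() fills in, using a slightly longer version of the docstring example. This is not part of the sourced code, and build_table is a name I made up for illustration. With a chain size of 3, every pair of consecutive words maps to the list of words seen right after that pair.

```python
def build_table(text, chain_size=3):
    """Map each run of (chain_size - 1) words to the words that follow it.

    Hypothetical helper for illustration only; it mirrors the idea behind
    Markov.database() but is not taken from the original script.
    """
    words = text.split()
    table = {}
    for i in range(len(words) - chain_size + 1):
        chain = tuple(words[i:i + chain_size])
        key, next_word = chain[:-1], chain[-1]
        table.setdefault(key, []).append(next_word)
    return table


table = build_table("What a lovely day What a lovely mess")
for key, followers in sorted(table.items()):
    print(key, followers)
```

Generating text is then just a matter of repeatedly looking up the last two words and picking one of their followers at random. A follower that shows up more often in the source appears more often in the list, so it gets picked more often.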
Is this an acceptable tribute?
It is.
Here's how to use the script if you're unfamiliar with Python:
(1) Install Python. The Windows version usually comes bundled with IDLE, which gives you easy access to the Python shell.
(2) Save the code posted by Arx and name the file "markovgen.py" or whatever you'd like (if you pick another name, change the import in step 4 to match). Put the file somewhere Python can find it.
(3) Copy a buttload of posts by your favourite B12 shitposter into a "textfile.txt" or whatever. Put the textfile in the same folder as the .py file.
(4) Type the following lines one at a time into the Python shell:
>>> file_ = open('/your_folder_of_choice/textfile.txt')
>>> import markovgen
>>> markov = markovgen.Markov(file_)
>>> markov.generate_markov_text()
The argument to "generate_markov_text" sets how many words get generated on top of the seed words (the default is 25). Thus, when I type
>>> markov.generate_markov_text(100)
...the glorious result is:
done their best to do that in 2 years. Our society is degenerating, whilst the leftists with far less support retain disproportionately vast amounts of power, or more strikingly where Front Nationale won 1/3rd of the nightclub by the end of that term, you can see things where UKIP become the 3rd largest party yet win only 1 seat whilst the continental partners are decomposing. If we pretended all of the modern geopolitics and identity politics of the poisoned lamb - he dies almost immediately. Muhammed ate less of them top canuck REMOVE PICKELHAUBE remove pickelhaube you are not an seeker.
EDIT: I'll add instructions for using Dissociated Press because it's slightly easier and the results are actually just as good, if not better:
(1) Install Emacs.
(2) Open your shitpostfile.txt in Emacs.
(3) To make a default letter-by-letter dissociation, press 'Alt' and 'x' at the same time, then type 'dissociated-press' into the prompt.
(4) To make a word-by-word dissociation, hold 'Alt' and press '-', then some number (eg. '1'), which passes a negative argument; then do as in step 3.
(5) Select the text you want and copy it to the clipboard by pressing 'Alt' and 'w'.
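If you'd rather stay in Python, here's a rough sketch of the word-by-word idea as I understand it: copy a short run of words, then jump to some other spot in the text where the last few words also occur, and carry on from there. This is only an approximation for illustration, not Emacs's actual code; the function name dissociate and the parameters overlap, run_length and total_words are all made up.

```python
import random


def dissociate(text, overlap=1, run_length=6, total_words=40, rng=None):
    """Scramble text word by word, jumping between spots that share
    the same `overlap` words. Hypothetical sketch, not Emacs's algorithm."""
    rng = rng or random.Random()
    words = text.split()
    if len(words) <= overlap:
        return " ".join(words)
    pos = rng.randrange(len(words))
    out = []
    while len(out) < total_words:
        # Copy a run of consecutive words from the current position.
        run = words[pos:pos + run_length]
        out.extend(run)
        pos += len(run)
        # The words we just finished on; the next run must follow the same ones.
        tail = words[max(0, pos - overlap):pos]
        # Every spot where those words occur and at least one word follows.
        spots = [i + overlap
                 for i in range(len(words) - overlap)
                 if words[i:i + overlap] == tail]
        # Jump to one of them, or anywhere at all if there is no match.
        pos = rng.choice(spots) if spots else rng.randrange(len(words))
    return " ".join(out[:total_words])


sample = ("the cat sat on the mat and the dog sat on the cat "
          "while the mat sat on the dog and the cat ran off")
print(dissociate(sample, rng=random.Random(0)))
```

A bigger overlap keeps the output closer to the source text; a smaller one makes the jumps wilder.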