Links for further documentation:
Data Sheet and code referenceControl LogicProgram ListingsUsage advice for the savefileMapSavefile (0.34.11)Excessively large write-up on much of the theory and praxis behind the thingI'd messed around with minecart logic for quite a while and when i worked out ways to make an adressable memory cell and mass data storage with them, i decided to try building a computer. In the early conceptual stages i decided i was not going to untangle program and data coming from a shared bus and completely separated program and data storage. Since my mass storage concept looked like a good model for expanded program storage, the instruction length worked off the three bits encoded per cart in the mass storage. Thus, the computer uses six-bit instructions, a three-bit operator and a three-bit operand. It operates on ten-bit data: since program and data storage are separate, there's no need whatsoever to use the same or even related register sizes.
In spite of the tiny instruction set, the computer can access output (the line above is the result of a "print to screen" program) and do basic mathematical operations, albeit impressively slowly: addition is trivial, since it's one of the actual instructions, but the computer can also subtract (by constructing the binary complement first), multiply, divide, calculate a Fibonacci sequence (limited by the ten bits, of course) or...
calculate square roots:
The bottom line contains the actual number, the top line the square root. (sqrt of 840 is 28).
I'm more than a little proud of this one. The program grabs a number from a "tape" loaded with an arbitrary sequence of ten-bit numbers (specifically intended for such "pick any number" programs, not requiring direct user input) and the only assumption it takes is that the number it's working with is different from 1. After i the "operate" switch was pulled, all operation took place automatically, without any intervention by me. The square root is calculated in what i guess is the usual way: by dividing the "to-root" number by the current guess and deriving an approximation from the result.
code:
#Registers A-C are main data registers, D is the first counter; Bf is buffer, used for data transfer between registers, from/to output, from the data tape
Module 5 (first of square root program)
111 010 read datum from tape to Bf
001 011 write Bf to register C
111 111 toggle I/O status bit (normally "keyboard/display", toggles to "machinery/latch array")
111 001 write Bf to output (latch array controlling linked machines)
111 000 read input to Bf (latch array); this returns data shifted one position to the right: this pair of commands performs "shift Bf right", while the MSB is _not_ used as "sign" but always set to zero.
001 001 write Bf to register A
000 000 read zero to Bf #the operands 000 and 111, when used with the "read" operation, either clear or set all bits in the buffer. Makes some operations a lot easier, since you don't have to dedicate one of your precious registers just to store a zero.
110 101 load module 6
Module 6
001 010 write Bf to Reg. B
001 100 write Bf to Reg. D #"primes" the machine. B stores the running sum during division, D stores the Quotient. Both must be re-set to zero after every pass. Zeroing both before running the first division means there's no need to prepare the registers before starting the program.
#Beginning of division loop
010 000 add A+B -> write result to B
100 000 test if B>C; if no, skip next instruction
110 110 load Module 7
011 100 increment reg.D (test for zero)
101 010 jump to pos. 3 (if not zero)
111 110 halt #safety catch, for zero, one, and for excessively large numbers: above 681 as original number, dividing by 1/2 itself will wrap around past 1024 on the first try; past 1005 or so, no running sum larger than the original number will occur before 64 iterations. In the case of zero and one, the first guess for the divisor is zero, and no matter how many zeroes you take, they won't ever be absolutely bigger than any number. If the counter wraps around, the program will simply stop there, displaying the original guess as "result", which is only correct in the case of zero.
Module 7
#we reach this module after we've divided the source number by our current guess. Now we need to check if our result is sufficient, approximate if not
000 100 - read D to Bf #fetch the division result
001 010 - write Bf to B
100 011 - test if A>B, skip next if not # since we start out with a guess assuredly bigger than the result if source >4, we have the correct guess if A is _not_ bigger than B (unless i'm overlooking some cases).
101 101 - jump to position 6 #not the result yet, we need to refine and retry
111 110 - Halt # reached if A>B fails. Square root stands in register A. I could have made it cleaner by also copying A to B to remove the unsightly division result from display.
010 000 - add A+B, result in reg. B
000 010 - read reg. B to Bf
110 111 - load Module 8
Module 8
# we have the sum of our guess and division result of source/guess in the buffer. Now we need to divide it by two. Good thing i wired the "control" output latches to work as right-shift for Bf!
111 001 - write Bf to output
111 000 - read input to Bf #Bf's contents got divided by two. Easy!
001 001 - write Bf to reg. A #that's our new guess
000 000 - clear all bits in Bf #load a fresh zero so we can reset registers B and D.
110 101 - load Module 6 #next round of division
unused
unused
unused
I made notes of the program's progress, writing down dates when a division and approximation was finished. I had deliberately put a pretty large number at the tape position for this run, to check if the initial overflow was handled properly and led to the correct sequence once this initial trouble was over. It worked splendidly:
first division: 840:420 = 6 (overflow, stopped after the third wraparound)
first approximation: 420+6=426, /2 = 213
all further divisions correct (giving the nearest lower integer), approximations, as predicted:
108
57
35
29
28
840/29=28, thus A>B; 840/28=30, A!>B, program finished, result is 28.
The program was started on 24 Moonstone 268 and finished 12 Malachite 269 - six months and 17 days, 185 days total. Barring accounting errors, those were 567 program steps, at a rate just over three operations per day, 392 steps per operation, 35,47 microurist clock rate by dwarven time, ~ 50 millihertz realtime.
A map has been uploaded:
http://mkv25.net/dfma/map-12378-faintedpostsBits and piecesUser interfaceThe place to interact with the machine. From here, you can enter program code, set the data registers, set the status bits, connect/cut main power, reset the computer, directly load the first program module from the ROM into active program storage, keep track of the main registers' contents and of the four status bits, shut down the main registers...
and of course start the computer.
It's riddled with more or less helpful notes; i wouldn't find my own way around it otherwise.
Memory cellsI use three different types of read-/writeable memory cells. I could have done with only two types, but kept the powered-cell register in the end, as an hommage to the pioneers of minecart computing. While it takes a lot of mechanisms and is power-dependent, it's fast, reliable and probably quite intuitive for people more into pure mechanical logic.
1. Linear read/write latch
Data register A, containing 0000 011 100 (28). These are ten memory cells, built horizontally, with "double-action" read, write-adressable.
Diagram:
#══▼▼═▼▼═▼▼═▼▼══#
z+0, paths
#▲^¢▼^▼¢D¢▼B▼¢^▲#
1r d ces h r0
z+0, constructions
#══#══#══#══#
z-1, paths
#▲▲#▲▲#▲▲#▲▲#
z-1, ramps :P
legend:
1 - pressure plate active when cell is "on" and is currently read
0 - "off" pressure plate active when cell is read
/* both "on" and "off" signals are sent, which make data exchange between registers easier - we can extract a direct "clear" signal from a cell in zero state instead of trying to draw conclusions from absence of an "on" */
r - hatch covers that open when a read of the cell is requested
d - pressure plate sending a signal to the UI display when the cell is on /* we want the status of registers constantly displayed */
c - write hatch, clears the cell if enabled
s - write hatch, sets cell if enabled
e - door, enables writes, makes the cell adressable
h - raising bridge, lowered. This is a special contraption, combined with a (currently and normally) retracted bridge over the adjacent pit to the left, allowing to "h"alt the whole register. It can stop all minecarts in the register, to allow refurbishing, exchanging carts, demolishing the register or the like. Bridges are latched into "neutral" or "shutdown" position by a single lever in the UI area (has to go through a logic circuit since we need one bridge in "active" and one in "inactive" state).
Not-so-obvious issue: carts passing over a hatch cover don't come from track and will not automatically enter downward ramps they encounter. They will instead jump, staying on the upper z-level until they accumulated enough "downward distance". Ramp-propelled carts in this circuit are slow enough that they'll only jump three tiles, i.e. they fall into the appropriate pit just by themselves. I carefully tested the functionality of the design before installing the full 88 of them.
Most memory cells i built here used this design.
2. Settable flipflop with optional carry generation
Three cells of the first counting register (register D).
Interlinks between individual cells allow the register to count up or down and cells can easily be written to. Reading its state is more complicated, it has to be done through a separate evaluation circuit.
I'm afraid the construction is a bit too complicated to convey it here. The core is a three-ramp pit with a single hatch cover over the "corner" tile:
Trying to explain that core mechanism:
.# #
▲ ║
#▲▲# #╣═#
# #
z-1
║
▼
═¢▼═
║
z+0
When the cart moves E->W or N->S, it passes through the pit in a straight line, emerges to the west/south and continues on its path there. When the hatch cover is shut, the cart bumps into it, accelerates to the north and emerges to the north. This holds true both for carts coming from the north and from the east. (Carts coming from the west or south will jump off the hatch and misfunction.)
The important point is that by putting such a pit into a circuit which "re-cycles" straight E->W passages (and N->N returns) and lets N->S passes leave, it's possible to build an edge detector using a single signal-operated building: a cart enters the circuit on the E->W line, keeps circulating through, always passing E->W until the hatch cover closes, keeps bouncing off the hatch (and returning) while it's shut, then passes through to the south once the hatch opens again.
The practical application here consists of leashing two of these "edge detectors" together, making a very simple toggle cell. Further regulation of the switchover paths between the halves of the cell allows extracting "carry" signals that can be passed on to the next cell. In addition, if only one of the main hatches in a cell is opened, the cell gets set - the cart can only switch position if it's currently in the cell half that had its hatch opened.
3. Powered latch
Six bits of memory using Tinypirate's variant of the powered minecart memory cell (first published in Bloodbeard's minecart computation thread).
The function is explained on the wiki. I keep getting confused linking up mechanisms, so didn't intend to use this model extensively. I had first speculated about including some extra functionality, that's why each cell has six tiles of floor instead of the entirely sufficient four. My idea was making all cells toggleable, but i couldn't see a good use for just inverting the value so left it out in the end. Input is sent by _disabling_ the gear assemblies of undesired states and connecting main power to the whole contraption. Since the detailed input arrives only after main power turns on, carts bounce around in the circuit quite a bit on every write, but reliably settle into the correct state. Output must be generated through an external circuit.
The ten-bit register draws 265 power before input has established, not at all trivial when the entire computer draws 1380 in standby and rarely over 1650 in operation.
Signal bundlingEvery main register bank has ten enabling doors and twenty read hatches. There are several operations which write to or read from a register. To avoid having to link all those buildings up to each function, i collected them to a bank of "control" circuits. This way, "write to register A" can be done by establishing a single link to that control circuit. As a simple case of expedience, i also collected the "write to program memory" commands like this: while i could have gotten away with a single "clear all program memory" function, that would have meant linking 48 doors and hatch covers to a single plate. That could easily have taken over half a year to do. In the interest of my own sanity, i collected the functions through eight "clear" and eight "write" controllers.
The obvious downside to such indirect operation is that it introduces reaction delay: a signal sent directly to the targetted building switches the building instantly, a signal sent via control circuit only arrives once the control circuit has responded. It will also only turn off after the control circuit's plate has recovered.
The simplest "signal relayer" based on powerless minecarts:
#▲▲# #══#
z-1 track
##¢^▲# ##▼══#
z+0 track
has two problems:
- reaction time is irregular, because in the "off" state (hatch closed), the cart can end up either sitting on top of the hatch or captured in the pit under the hatch. Falling off the hatch, bouncing into the backward wall and climbing out forward gives a delay of 26 steps until the plate gets activated, if the cart was in the pit, the delay can be any number of steps from 2-19 (equal chances of each).
- the output signal is lengthened: the cart will bounce into and out of the pit, passing over the plate regularly, as long as the hatch is open. That means the plate's active state keeps getting refreshed for ~100 steps, and will only turn off around about 100 steps after the hatch closes - the exact number is naturally variable, ranging from 85 to 115 steps. An ordinary on/off pressure plate cycle, which has a spacing of 101-105 steps, will thus result in a relayed "on" that lasts roughly from (2-25) to (190-220), much longer and less distinct. Not a good premise when looking for fast operation.
The remote circuit i settled for introduces a delay, but that delay is absolutely constant and signal length does not expand.
Write control.
All writing to active memory, clearing the buffer or program memory and reading from the buffer and registers A-C is done through such indirect devices.
Program instructions have their "read" commands triggered directly: each only gets activated in one case (when the instruction is needed for execution), affording no benefit from collected operation, _and_ since each possible operation has to pass this step, unnecessary delays here hurt the overall performance of the computer.
## ##
#▲ #║ ## ##
#▲ #║ ╔¢# ╔▼
## ## ║^ ║║
▲# ║# ▼╗ ▼╗
▲# ╚# ¢▲# ▼║#
z-1 track # #
z+0 track
The hatch cover to the north is linked to the input, both hatch covers and the output are linked to the pressure plate in the circuit. When the hatch opens, the cart falls into the pit, climbs out of it after 25 steps (activates the plate after 26) and heads south. It bumps into the impulse ramp to the south and gets sent into the southern pit by the corner. Since the hatch is opened by the passage over the pressure plate, the cart now cycles through the pit/impulse ramp/corner circle to the south, until the hatch cover closes again. At this point, the cart will emerge straight to the north, travels north along the track and settles on the northern hatch; since it's been cycled by the internal plate, it's guaranteed to be reset and closed again. Since the cart only passes over the plate once, the signal sent consists of a single "on" pulse after 26 steps, followed by an "off" after 126.
NB: for operation of bridges, floodgates and grates, the circuit must be one tile longer and the pressure plate put another tile away from the ramp. The circuit shown here will toggle such buildings, because the "off" signal comes before the building has reacted and is ready to take new input.
Caveat: the circuit is not very tolerant for architecture variants - if the incoming cart has the wrong momentum, the stable cycle in the southern end cannot establish itself.
Mass StorageBy far most of the program code in this computer is stored in a type of ROM: 128 words of code in the bank, while only eight at a time are actually loaded. New code can be loaded into active memory (overwriting the previous contents) by the "load Module" instruction, and an additional status bit can switch between two banks of eight modules each. The actual program store is spacious but fairly unassuming (spoilered for size):
It's still fairly compact considering that it can store 768 bits on 75x36x2 tiles. just over 7 tiles per bit. Even more impressively, it does this with less than 500 mechanisms (including the entire adressing and output management). The secret is that information in the mass storage is encoded in minecarts of specified weights that are simply held in readiness in small ramped pits until their data is requested. Then they get released by opening an "output" bridge, weighed by four pressure plates which directly send set signals to the pre-cleared program latches and cycled around and back into their pit through an "input" bridge that's held open until all carts are safely returned; the output bridge closes again ~ 100 steps after it opened, quite a long time before carts have finished their round.
A pair of holding pits looks like this:
##### ##### .B.B.
#═╚═# #▲▲▲# ═B.B═
#═╔═# #▲▲▲# ═B.B═
##### ##### .I.O.
z-1, track z-1 z+0
B - bridge
I - incoming
O - outgoing
enough to hold two minecarts, worth six bits or one program word. The impulse ramps in the pits linked to the walls prevent carts from bouncing back out of the pit after entering (from the west) - they can only leave the pit when the eastern "output" bridge opens.
Minecarts of seven weights (and no minecart as "zero" datum) are used to encode three bits of information per cart; sixteen carts (potentially fewer depending on presence of "000" code segments or unused words) make up a module of eight words, sixteen modules can be stored.
The principle of this mass storage is remarkably simple. It allowed me to build a computer with an extremely small active program memory without crippling computational capacity: longer programs are simply spread over multiple modules and loaded in succession.
Higher information density per cart is possible, and a denser packing of the module, too, but either would increase complication:
four bits per cart already require using loaded carts to represent some of the states, and it gets exponentially worse when trying to encode more than that. Six bits per cart are theoretically possible, but just accurately reading those values would be an insane effort.
denser packing could be done by positioning the stored carts on top of rollers instead of in ramp-pits: it'd require a lot of power and mechanisms and would be prone to misfunctions when carts jostle each other. Alternatively, carts could be "stacked" and pulled out sequentially, e.g. by a roller. Adressing individual data chunks could only reasonably be done when storing only one segment per stack, sacrificing much of the compactness, and cycling read data back into the stack could be slowed down tremendously (or completely broken) by "stack cluttering".
For added convenience, the module loaded in the first position can be transferred to the program memory by operating a dedicated lever in the control centre. The two-wide bridges over the return tracks can be raised to stop currently circling minecarts - this way, a program module can be unloaded.
"Programming" the ROM is a pretty slow and inelegant process: a cart must be deposited on a pressure plate operating the input bridges of the desired module and the correct-material minecarts assigned to sixteen routes (consisting of a single stop on the western end of the long straight path, with a "push east immediately" order). Entering one module of program code in this way takes two to three weeks.
Mechanical logic devices1. Adder (not the animal)
The main tool for calculations, this machine adds two ten-bit numbers together. The correct sum rollers are instantly connected to power the moment the inputs are fully established. Reading the result and transferring it to the target register is done by roller-directed minecarts; since i insisted on the compactest possible layout for the adder itself (two lines per bit), logistics are a bit tricky.
The read-out procedure must be started with a notable delay, because inputs come from linear latches and have a "startup" latency of up to 35 steps.
I cannot take credit for the design, it was developed by Jong, apparently inspired by the "Pre-Triggered Mechanical Logic" discussion of yore and shown in
this post. Unfortunately, imageshack since decided the design was too awesome for posterity and swallowed the pictures. The optimised circuit can still be found on the very last page of the
design manuscript of Jong's dwarven computer.
I think i started with the not-fully optimised two-stage adder and the concept of the "inverse carry" calculation chain. Putting it together and optimising from there got the exact same final circuit:
PABoQSqobaP
X X
P = power supply
a, A = gear linked to engage when input A1 is false (a) or true (A)
b, B = gear linked to engage when input B1 is false (b) or true (B)
X = gears linked to both inputs A1 and B1, pre-toggled; engaged when both inputs are different, disengaged when they're equal
o = unlinked gears or rollers. They receive and pass power when their inputs are switched correctly.
Q, q = gears linked to both inputs A2 and B2, calculating the sum based on whether or not the last bit (bit 1) produced a carry. Q is engaged when A2 and B2 are equal (carry-in active and inputs equal => sum is true), q is engaged when they're inequal (carry-in off and inputs different => sum is true).
What it does is instantly calculate all carries on the left side of the machinery
and all "non-carries" on the right side. Each bit either has a carry or not, so exactly one of each summation (q) gears is offered power on every bit. Since these gears are also in opposite configuration, no erroneous power connections can be established if everything's wired correctly. Over 100 correct additions during the square root calculation provided the practical proof of the adder.
Handing the instantly-accurate addition result over to the target register is a bit trickier: with the register design in use, the fastest read process requires not only "set" signals for every bit that has a sum, but also "clear" signals for every bit that doesn't. Achieving that with minecarts means an already-moving cart must be re-pathed by a roller activated when sum is on or stays on the given course when sum is off and the roller unpowered. It must then pass over one of two pressure plates that mustn't be touched during the process of setting the cart in motion towards the sum roller. I insisted on building the adder as compact as possible - two lines per bit - thus pathing of the carts had to change z-level so it could be dropped back onto the startup roller. The impulse ramp used to provide speed for getting the carts to the upper level proved problematic: the further acceleration could sometimes increase cart speed into derail territory. I eventually replaced the initial standard (highest-speed) result rollers with high-speed ones.
Another issue was correct timing of the sum-read process: input signals are sent from linear minecart latches, which have a read-signal latency of 2-32 steps; in addition, they are not read directly but rather through the remote triggers above, adding another 26 steps delay, for a total of 28-58. Instantly sending the reading carts would usually give zero as result. Since the inputs last for a fairly long time (about 200 steps), i could simply implement a 100-step delay before reading the result, which also allowed choosing between the two target registers by a simple bridge switch: the lowest-bit cart goes over such a fork on its read cycle.
2. The comparator
Compares two inputs, passes power (switching the path of a "measuring" minecart on the NE loop) when the tested-for condition is true.
The inputs compared are ten bits wide. The logic is quite simple:
Testing for A>B
2QAb║ 1
3QAb║ 2
4QAb║ 3
.PAb║ 4
A = gear engaged when input A is true at the bit number to the right
b = gear sengaged when input B is false at the bit number to the right
Q = gear engaged when A=B at the bit number to the left
║ = NS roller (accepts power from the left/right)
The circuit keeps checking if A=B; as soon as that condition fails, it checks whether A>B at that position. Only if this is the case, power is passed to the output. Lower bits are not checked, power progression invariably breaks at the first inequality. Putting the gears of successive bits in direct contact with each other doesn't hurt the functionality.
Reading the result is done with a hundred-step delay again (inputs from indirectly read linear latches, thus wait time until correct value of up to 58 steps). A variety of comparisons can be made, and many require inverting all gears in the assembly (in the schematic above, B>A can be tested if all A- and b-gears are inverted); consequently, a single "simple relayer" was built - it inverts the necessary gears and holds them inverted for about 200 steps, enough for the evaluation to be always correct.