I suspect that DF also suffers from a side effect that comes with Von Neumann architecture computers (which is practically all general-purpose computers since the 70s): code is data. Generally only the L1 cache is split into dedicated instruction and data parts; everything beyond that is shared. And being shared, it is also contested.
I assume that DF is programmed in a traditional style, with large functions working on large data structures: updating a dwarf splits into updating their brain, their kidneys, their liver, their temperature, their soul and so on, and each of these steps works on a small part of one large "dwarf" data structure. This is generally called the "array of structures" approach. It leads to the cache constantly throwing out bits of code or bits of data to bring in new stuff.
In the opposite approach, "structure of arrays", you have arrays that each contain one part of a dwarf (or any other creature), and each array is then updated separately in one fell swoop. Since the CPU is only running a short piece of code over a large amount of data, the code stays in cache, and the data can be processed in a streaming fashion rather than being cherry-picked with random accesses.
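To make the contrast concrete, here is a minimal C++ sketch of the two layouts. The fields (body temperature, hunger, thirst) and the update rule are made up for illustration; I have no idea how DF actually lays out its creatures, so treat this as a sketch of the general technique, not of DF's code.

```cpp
#include <cstddef>
#include <vector>

// "Array of structures": each dwarf is one big record. An update pass that
// only cares about one field still drags whole records through the cache.
struct DwarfAoS {
    float body_temperature;
    int   hunger;
    int   thirst;
    // ... imagine dozens more fields: wounds, thoughts, inventory, soul, ...
};

void update_temperature_aos(std::vector<DwarfAoS>& dwarves, float ambient) {
    for (auto& d : dwarves) {
        // Each iteration loads a full DwarfAoS just to adjust one float.
        d.body_temperature += 0.1f * (ambient - d.body_temperature);
    }
}

// "Structure of arrays": one contiguous array per field, each updated in
// its own short pass.
struct DwarvesSoA {
    std::vector<float> body_temperature;
    std::vector<int>   hunger;
    std::vector<int>   thirst;
};

void update_temperature_soa(DwarvesSoA& dwarves, float ambient) {
    // A tight loop streaming over one array: the code stays in the
    // instruction cache and the data is read sequentially.
    for (std::size_t i = 0; i < dwarves.body_temperature.size(); ++i) {
        dwarves.body_temperature[i] +=
            0.1f * (ambient - dwarves.body_temperature[i]);
    }
}
```

The second loop touches only the bytes it actually needs, in order, which is exactly the streaming access pattern described above.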
As a concrete example, the first approach would be a single bottling machine that fills bottles, caps them, glues on labels and prints last-use dates; the latter is a production line where the first machine fills bottles, the second puts on caps, the third glues on labels, and so on. And you won't see the first approach in any larger-scale production facility.