Yeah, you guys are saying what I was thinking. IIRC loop unrolling helps because it decreases the number of branches executed, but loops are generally easy branches to predict, so branch prediction nowadays removes most of the benefit of unrolling.
The history is mixed up here. CPUs didn't even have branch prediction back when loop unrolling was the main thing you did. So it's basically wrong to say the main rationale behind loop unrolling was decreasing branches. There was no branch prediction to make those branches costly in the first place.
What loop unrolling is for is reducing the per-iteration instruction overhead. By unrolling a loop, you amortize or eliminate the counter increment, the compare, and the jump instruction. You do also save the one hit on the branch predictor, but only on the final iteration, and by the time that mattered, instruction cache size was becoming important too, so heavy unrolling was already becoming counter-productive.
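For a tiny fixed-count loop, the difference looks roughly like this (a minimal sketch in C; the function names are made up for illustration):

    /* Illustrative sketch: the loop form pays an increment, a compare,
       and a jump on every iteration; the fully unrolled form pays none. */
    int sum4_loop(const int *a) {
        int sum = 0;
        for (int i = 0; i < 4; i++)   /* i++, compare, jump: overhead per element */
            sum += a[i];
        return sum;
    }

    int sum4_unrolled(const int *a) {
        /* No counter, no compare, no jump: just the four adds. */
        return a[0] + a[1] + a[2] + a[3];
    }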
The branch penalty is more of a tiebreaker: if it makes sense to unroll the loop for other reasons, or all else is roughly equal, it might tip things over the edge to being worthwhile, rather than being the main show.

For example, say you're looping over something 1024 times and you decide to do a little loop unrolling. You might unroll by 8: process 8 elements, bump the index by 8, process the next 8, and so on (see the sketch below). You get the benefit of unrolling, and you can tune the unroll factor so it doesn't overload the instruction cache and find an optimal number of elements per iteration. However, the branch predictor will only fail on the last iteration, so once per 1024 elements, whether you unrolled the loop or not. So given even the most rudimentary branch prediction on the CPU, unrolling in this case wasn't about branches. Partially unrolling loops is far more common than removing them entirely, which is what you'd have to do to get any benefit from avoiding branch misses.
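A sketch of that partial unrolling, assuming a simple integer sum (the names, and picking 8 as the unroll factor, are just for illustration):

    /* Partial unroll by 8 over 1024 elements: the loop overhead
       (increment, compare, branch) is paid once per 8 elements instead
       of once per element. The loop branch still exists, and it still
       mispredicts exactly once, at loop exit -- same as the plain loop. */
    #define N 1024

    int sum_unrolled_by_8(const int *a) {
        int sum = 0;
        for (int i = 0; i < N; i += 8) {   /* one branch per 8 elements */
            sum += a[i]     + a[i + 1] + a[i + 2] + a[i + 3];
            sum += a[i + 4] + a[i + 5] + a[i + 6] + a[i + 7];
        }
        return sum;                        /* one miss total, at the exit branch */
    }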