Bay 12 Games Forum

Author Topic: Use SSE  (Read 1348 times)

copx

  • Bay Watcher
Use SSE
« on: June 25, 2008, 05:54:54 pm »

There might be an easy way to improve frame rate which does not require any code changes: make your compiler emit SSE instructions!

It seems that the current binaries don't use SSE - you don't mention any CPU requirements.

If your code uses floating point calculations in speed-critical parts, using SSE could result in greatly increased performance.

The only downside is that binaries which contain SSE instructions won't run on CPUs without SSE support. But DF is such a CPU hog that it's not really playable on older CPUs anyway.

SSE1 only requires a Pentium III / Athlon XP or better.
However, if you use double precision reals (type "double" in C/C++) you want SSE2. That needs a Pentium 4 or Athlon 64/Opteron or better.

Seriously, try it (should only require setting the appropriate compiler flag and rebuilding). You could keep support for older CPUs with an alternative binary if you want.
Logged

Mikademus

  • Bay Watcher
  • Pirate ninja dwarves for great justice
Re: Use SSE
« Reply #1 on: June 25, 2008, 06:30:27 pm »

What you say about speed increases from using Intel's SSE/SSE2 (Streaming SIMD Extensions) or AMD's 3DNow! was very true for Pentium platforms about 10 years ago. Today, all reasonably modern CPUs have these and have 1-cycle float operations, and all compilers automatically optimise for them.
« Last Edit: June 25, 2008, 06:32:08 pm by Mikademus »
Logged
You are a pirate!

Quote from: Silverionmox
Quote from: bjlong
If I wanted to recreate the world of one of my favorite stories, I should be able to specify that there is a civilization called Groan, ruled by Earls from a castle called Gormenghast.
You won't have trouble supplying the Countess with cats, or producing the annual idols to be offered to the castle. Every fortress is a pale reflection of Gormenghast.

copx

  • Bay Watcher
Re: Use SSE
« Reply #2 on: June 25, 2008, 08:18:27 pm »

Quote from: Mikademus
What you say about speed increases from using Intel's SSE/SSE2 (Streaming SIMD Extensions) or AMD's 3DNow! was very true for Pentium platforms about 10 years ago. Today, all reasonably modern CPUs have these and have 1-cycle float operations, and all compilers automatically optimise for them.

You are simply wrong. For starters, SSE2 did not even exist 10 years ago.
I don't know which compiler Toady uses, but GCC (and most other x86 compilers) by default emits only instructions which can be executed on a 386, because that's the only way to ensure that the resulting code will actually run on all CPUs. If you don't specify SSE use, or do it implicitly by targeting a CPU which supports SSE, the optimizer won't use any of the advanced floating point instructions (and the related registers) available on modern CPUs.

All current, high-end professional apps which are floating point heavy use SSE, and Intel certainly didn't create those extensions for no reason.

I repeat my suggestion to Toady to try it (if DF actually uses floating point math).
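
A quick way to see what a given build is actually allowed to emit is to check the compiler's predefined macros. The sketch below assumes GCC or Clang (which predefine `__SSE__` and `__SSE2__`) or MSVC (`_M_IX86_FP`, `_M_X64`); the function names and the loop are made up for illustration and have nothing to do with DF's real code. With 32-bit GCC, flags such as `-msse2 -mfpmath=sse` enable SSE2 for floating point math; on MSVC the rough equivalent is `/arch:SSE2`. On x86-64 targets SSE2 is part of the baseline.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reports the SSE level this translation unit was compiled for.
// __SSE__ / __SSE2__ are predefined by GCC and Clang when the matching
// -msse / -msse2 (or a suitable -march) is in effect; MSVC uses
// _M_IX86_FP on 32-bit x86 and _M_X64 on 64-bit builds.
std::string sse_target_level() {
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return "SSE";
#else
    return "none";
#endif
}

// Hypothetical speed-critical loop: y[i] += a * x[i].
// Element-wise float loops like this are the kind of code a compiler can
// map onto SSE instructions once it is permitted to use them.
void scale_and_add(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}
```

Whether the loop really gets vectorized still depends on the compiler version and optimization level, so comparing the generated assembly (or just timing both builds) is the only way to know for sure.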
Logged

bartavelle

  • Bay Watcher
  • Coin coin!
Re: Use SSE
« Reply #3 on: June 26, 2008, 04:39:38 am »

SSE is good for vector operations, not scalar floating point operations. The problem is that compilers usually aren't smart enough to decide whether code can be converted efficiently, so it has to be done by hand.

Moreover, most "professional applications" using these instructions are trivial to vectorize (image and sound processing, for example), whereas the biggest time sinks in DF are likely to benefit a lot more from an algorithmic optimization than from a local optimization.
Logged

Zruty

  • Bay Watcher
Re: Use SSE
« Reply #4 on: June 26, 2008, 11:01:49 am »

BTW, at one point I was thinking about pathfinding optimization.

I assumed that pathfinding and fluid dynamics consume most of the processing power and tried to think of a better PF algorithm than a brute-force 'wave' Dijkstra.

I don't know whether Toady has already done something about it though, so my thoughts about the current implementation were mere guesses.

Suppose we somehow set a number of 'Flags' (remember Settlers II?) at different popular locations, then pre-compute the paths between them. So any desired path from A to B can be split into
A -> nearest flag F1
B -> nearest flag F2
precomputed F1 -> F2.

Well, then setting the flags themselves could be done manually or by some cunning algorithm...
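
The flag idea above can be sketched roughly as follows. This is only an illustration under loud assumptions: a 2D map with 4-way movement and unit step cost, '#' as wall, and a full precomputed distance field per flag (memory-hungry, but simple). `Grid`, `wave_distances` and `FlagNetwork` are invented names, not anything from DF.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Map rows; '#' is a wall, anything else is walkable.
// A cell is identified by the index row * width + col.
using Grid = std::vector<std::string>;

// Plain breadth-first "wave" search: steps from `start` to every reachable
// cell (-1 if unreachable), moving in the four cardinal directions.
std::vector<int> wave_distances(const Grid& grid, int start) {
    const int h = static_cast<int>(grid.size());
    const int w = static_cast<int>(grid[0].size());
    std::vector<int> dist(static_cast<std::size_t>(h * w), -1);
    std::queue<int> frontier;
    dist[start] = 0;
    frontier.push(start);
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const int cur = frontier.front();
        frontier.pop();
        for (int k = 0; k < 4; ++k) {
            const int r = cur / w + dr[k];
            const int c = cur % w + dc[k];
            if (r < 0 || r >= h || c < 0 || c >= w || grid[r][c] == '#') continue;
            const int next = r * w + c;
            if (dist[next] != -1) continue;  // already reached by the wave
            dist[next] = dist[cur] + 1;
            frontier.push(next);
        }
    }
    return dist;
}

// Hypothetical flag network: one distance field is precomputed per flag,
// so later queries are table lookups instead of fresh searches.
struct FlagNetwork {
    std::vector<int> flags;               // cell index of each flag
    std::vector<std::vector<int>> field;  // field[i][cell] = steps between flag i and cell

    FlagNetwork(const Grid& grid, const std::vector<int>& flag_cells) : flags(flag_cells) {
        for (int f : flags) field.push_back(wave_distances(grid, f));
    }

    // Index of the flag closest to `cell`, or -1 if no flag can reach it.
    int nearest_flag(int cell) const {
        int best = -1;
        for (std::size_t i = 0; i < flags.size(); ++i) {
            if (field[i][cell] < 0) continue;
            if (best < 0 || field[i][cell] < field[best][cell]) best = static_cast<int>(i);
        }
        return best;
    }

    // Length of the route A -> F1 -> F2 -> B, or -1 if it cannot be built.
    int route_length(int a, int b) const {
        const int f1 = nearest_flag(a);
        const int f2 = nearest_flag(b);
        if (f1 < 0 || f2 < 0 || field[f1][flags[f2]] < 0) return -1;
        return field[f1][a] + field[f1][flags[f2]] + field[f2][b];
    }
};
```

Comparing `route_length(a, b)` with `wave_distances(grid, a)[b]` on a test map shows the trade-off: queries become cheap, but the flag route is never shorter than the direct one and can be noticeably longer.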
Logged

winner

  • Bay Watcher
Re: Use SSE
« Reply #5 on: June 26, 2008, 12:08:59 pm »

Does anyone know what method the "Desktop Tower Defense" guy uses?
Logged
The great game of Warlocks!

Dame de la Licorne

  • Bay Watcher
  • Cats? Check. FPS? Uh-oh...
Re: Use SSE
« Reply #6 on: June 26, 2008, 12:13:27 pm »

Quote from: Zruty
Suppose we somehow set a number of 'Flags' (remember Settlers II?) at different popular locations, then pre-compute the paths between them. So any desired path from A to B can be split into
A -> nearest flag F1
B -> nearest flag F2
precomputed F1 -> F2.

Well, then setting the flags themselves could be done manually or by some cunning algorithm...

I think this solution would be the best for the pathfinding issues, as long as the player could set the flags. Each of our fortresses is laid out differently (or at least, no two of mine are the same), so we should be able to set the flags for the "most frequently traveled" routes. It should cut down on the pathfinding FPS lag, at the very least.
Logged
If software was real world, then it'd be something equivalent of hitting a nail with a hammer and having a building collapse on the other side of town.

Don't worry people, sometimes -moments occur

irmo

  • Bay Watcher
Re: Use SSE
« Reply #7 on: June 27, 2008, 03:44:27 am »

Quote from: Dame de la Licorne
I think this solution would be the best for the pathfinding issues, as long as the player could set the flags. Each of our fortresses is laid out differently (or at least, no two of mine are the same), so we should be able to set the flags for the "most frequently traveled" routes. It should cut down on the pathfinding FPS lag, at the very least.

On the other hand, it (1) increases congestion (flag-to-flag routes have to carry all of the traffic) and (2) increases travel distance (since, by necessity, nobody is taking the shortest route to anywhere).

On the other other hand, if the player has to set these flags manually, those of us who don't want them can just not set any flags. So it's not a bad idea, just less than optimal in some situations.
Logged

Zruty

  • Bay Watcher
Re: Use SSE
« Reply #8 on: June 27, 2008, 07:01:20 am »

1) I think that at the moment dwarves have some kind of micro-PF algorithm so that they avoid each other when travelling through, say, a 3-tile wide tunnel in opposite directions. This would remain and reduce the number of collisions.

Also, the system may track the 'load' on each flagged route (e.g. by the number of dwarf collisions per step) and try not to overload the routes.

2a) If we are searching for a flag F from point A and suddenly find point B (this can happen via Dijkstra), we can abort the search and take a direct route.

2b) I was thinking about large number of flags that are set (semi) automatically - not several flags set manually.
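
Point 2a could look something like the sketch below: a wave search from A that stops at whichever comes first, a flag or the destination B. Same assumptions as before (4-way movement, unit step cost, '#' as wall); `SearchResult` and `search_flag_or_goal` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Map rows; '#' is a wall. A cell is identified by row * width + col.
using Grid = std::vector<std::string>;

struct SearchResult {
    bool found_goal_first;  // true: B was reached before any flag, so go direct
    int cell;               // cell where the search stopped (-1 if nothing was found)
    int steps;              // steps from A to that cell (-1 if nothing was found)
};

// Breadth-first wave from `a` that stops at the first flag or at `b`.
// Cells are dequeued in order of increasing distance, so the first hit is a
// closest one; a tie between B and a flag at equal distance is decided by
// expansion order, except that a flag standing on B itself counts as B.
SearchResult search_flag_or_goal(const Grid& grid, const std::vector<bool>& is_flag,
                                 int a, int b) {
    const int h = static_cast<int>(grid.size());
    const int w = static_cast<int>(grid[0].size());
    assert(is_flag.size() == static_cast<std::size_t>(h * w));
    std::vector<int> dist(static_cast<std::size_t>(h * w), -1);
    std::queue<int> frontier;
    dist[a] = 0;
    frontier.push(a);
    const int dr[4] = {1, -1, 0, 0};
    const int dc[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const int cur = frontier.front();
        frontier.pop();
        if (cur == b) return {true, cur, dist[cur]};        // abort: take the direct route
        if (is_flag[cur]) return {false, cur, dist[cur]};   // hand over to the flag network
        for (int k = 0; k < 4; ++k) {
            const int r = cur / w + dr[k];
            const int c = cur % w + dc[k];
            if (r < 0 || r >= h || c < 0 || c >= w || grid[r][c] == '#') continue;
            const int next = r * w + c;
            if (dist[next] != -1) continue;
            dist[next] = dist[cur] + 1;
            frontier.push(next);
        }
    }
    return {false, -1, -1};  // neither a flag nor B is reachable from A
}
```

With many flags placed (semi) automatically, as in 2b, this search should usually end after a few steps, and short trips never touch the flag routes at all.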
Logged