Hey, I have a little bit of experience with using OpenGL/SDL for 2D graphics. I wrote a screensaver in SDL, but couldn't get it to work as a screensaver using SDL, so I ported it to use OpenGL. Anyway, I was kind of interested to see what the graphics code looks like for DF, so I visited the github.com link. I couldn't find any of the SDL code in there, so I guess that must be basically the initial upload.
Anyway, I just thought I'd share what I learned from optimizing my screensaver, in hopes that maybe I can help a little. Or maybe not. A situation where it's redrawing each pixel every single refresh probably has different characteristics than one where it's only redrawing sprites when they change.
I figured the compiler would optimize away some of this stuff, but even with gcc -O3, I got a speedup of several times from doing a few simple things:
1. Don't mix floating point and integer math. OpenGL screen coordinates are floating point, but SDL's are integers. I thought that staying in integer calculations as long as possible and then casting to float at the end would be more efficient, but it turned out that even a single cast from int to float in the inner loop was a real performance killer.
2. Calculate loop bounds early. If you say something like "for (i = 0; i < Min(x, width); i++)", it has to calculate "Min(x, width)" on every single iteration, even if the result never changes.
3. Similarly, one of my biggest speedups was simply not recalculating values that have already been calculated. I noticed that graphics.cpp had lots of code that looks like this:
if(screen[x*dimy*4 + y*4 + 1]==screen[x*dimy*4 + y*4 + 2])screen[x*dimy*4 + y*4 + 1]=0;
screen[x*dimy*4 + y*4 + 3]=0;
if(screen[x*dimy*4 + y*4 + 1]==0&&screen[x*dimy*4 + y*4 + 2]==0&&screen[x*dimy*4 + y*4 + 3]==0)screen[x*dimy*4 + y*4 + 3]=1;
Notice how "x * dimy * 4 + y * 4" is used over and over. My initial code was full of stuff like that as well. But probably half my optimization gains came from simply calculating values like that into a local variable ahead of time:
int screenindex = x*dimy*4 + y*4 + 1;
if(screen[screenindex]==screen[screenindex + 1])screen[screenindex]=0;
screen[screenindex + 2]=0;
if(screen[screenindex]==0 && screen[screenindex + 1]==0 && screen[screenindex + 2]==0) screen[screenindex + 2]=1;
And it has the side benefit of making the code a bit easier to read.
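For what it's worth, here is a rough sketch of how that could look inside a full loop over the grid. The function name normalize_cells, the unsigned char element type, and the std::vector container are my assumptions for illustration, not something taken from graphics.cpp; only the x*dimy*4 + y*4 layout and the three statements come from the code quoted above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from graphics.cpp): applies the three statements
// quoted above to every cell of a dimx-by-dimy grid with 4 values per cell.
// Assumes screen holds at least dimx * dimy * 4 elements.
void normalize_cells(std::vector<unsigned char>& screen, int dimx, int dimy)
{
    for (int x = 0; x < dimx; ++x) {
        for (int y = 0; y < dimy; ++y) {
            // Compute the cell's base offset (x*dimy*4 + y*4) once and reuse it.
            const std::size_t base =
                (static_cast<std::size_t>(x) * dimy + y) * 4;
            unsigned char* cell = &screen[base];

            if (cell[1] == cell[2]) cell[1] = 0;
            cell[3] = 0;
            if (cell[1] == 0 && cell[2] == 0 && cell[3] == 0) cell[3] = 1;
        }
    }
}
```

Taking a pointer to the start of the cell is just one way to do it; a plain integer index like screenindex above works equally well.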
Obviously, these kinds of optimizations are only really important for stuff that happens in loops on every refresh.
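To make point 2 concrete, here is a minimal sketch of moving the loop bound out of the loop condition. The function clear_prefix and its parameters are made up for the example, and std::min stands in for the "Min" in my pseudocode above. Whether the compiler would have hoisted it anyway depends on the code, so measure before and after.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical example: zero the first min(x, width) entries of a row.
// Assumes the caller guarantees row.size() >= min(x, width).
void clear_prefix(std::vector<int>& row, int x, int width)
{
    // Evaluate the bound once, before the loop, instead of in the condition.
    const int limit = std::min(x, width);
    for (int i = 0; i < limit; ++i) {
        row[i] = 0;
    }
}
```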