A straight line along an embark tile boundary is a sign of either an error in the script or odd data in the world. World generation apparently "bleeds" biomes to generate undulating "blob" boundaries. At higher elevations it seems the "sky" biome (whatever that is) takes over, except for an area around where the ground pokes up into it. Why the ground biome isn't used all the way up to the top of the sky is an unanswered question, though. It might have to do with cloud and wind behavior.
Neither rivers nor pools are actually handled by ragundo's code, because these features do not appear on the pre-embark level, so I've hacked those by looking for appropriate tile types (murky pool, riverX feature, etc.) and then invoking ragundo's logic for lake tropicality determination for those tiles. It is very possible this doesn't hold up to closer examination, so it's a point that should be investigated. Similarly, ragundo's code for lake detection doesn't work on the fortress level, because it checks an "is lake" flag on the corresponding embark tile, which isn't what we want (every fortress tile within that embark tile would be marked as lake). It might be that the logic used to determine the tropicality "home" tile could be used here as well, but for the time being any water that is not ocean, pool, or river/brook/stream is labeled as lake, including any artificial pools on the surface.
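The fallback rule described above can be sketched roughly as follows. This is only an illustration in Python, not the script's actual code: the tile-type labels, the `Water` enum, and the `is_ocean_biome` parameter are all made-up names for the example.

```python
from enum import Enum


class Water(Enum):
    OCEAN = "ocean"
    POOL = "pool"
    RIVER = "river"
    LAKE = "lake"


# Hypothetical tile-type labels, invented for this sketch.
POOL_TYPES = {"murky_pool"}
RIVER_TYPES = {"river", "brook", "stream"}


def classify_water(tile_type: str, is_ocean_biome: bool) -> Water:
    """Classify a water tile; anything not otherwise recognised becomes a lake."""
    if is_ocean_biome:
        return Water.OCEAN
    if tile_type in POOL_TYPES:
        return Water.POOL
    if tile_type in RIVER_TYPES:
        return Water.RIVER
    # Fallback: this also catches artificial surface pools, as noted above.
    return Water.LAKE
```

The point of the sketch is just the ordering: lake is the catch-all at the end, which is why player-made surface water ends up labeled as lake.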
Anyway, my test embarks identify biomes correctly on the surface, and I've looked at a couple of others as well without detecting any issues.
I've started looking into shears, in particular the ocean ones, and those I've looked at have a region id that points to an ocean region, as well as an elevation below 100. I've added two new views, one for the elevation and one for the broad biome, which labels each region as one of Desert, Glacier, Hills (whatever that is), Jungle, Mountain, Ocean, Swamp, Steppe, or Tundra (I've shifted the order to the alphabetic designation order I use in my view). Taiga would thus fall under Jungle (there is another enumeration with the same elements in the same order but with slightly different names, where Jungle matches the more logical Forest).
Edit:
Pushed out version 1.7. Two new views and updated help text, but no bugs fixed.
Edit 2:
- Lake detection was bugged, and now seems to be fixed. Also switched the characters around so the "smaller" one is temperate ('<'). This should also affect brooks and pools.
- Changed all polarities to use a definitive tropicality test vs 200 rather than odd code that doesn't seem to work.
- Looked at shears: the biomes they claim to be match their parameters along all dimensions I've looked at, but what you can grow on them does not. It does not necessarily match what's on the ground below either (although in the cases I'm looking at it does match one of the biomes on the ground). I looked at vegetation as well, but it matched rainfall as usual.
@ragundo: The script relies heavily on the Probe code, which I've used as the basis for the logic. However, Probe doesn't actually determine the biome type, it just prints the lower level delta value. Unless some major new thing crops up, it looks like the current script logic is working and what remains would be bugs and boundary cases. However, reality has a tendency to strike you from behind...
Someone might want to update Probe to provide more info (such as the biome) at some time, of course.
Edit 3: Interesting find. I've dug through the C code to figure out why the provided operation fails to get data while Probe does find it, and the answer is actually that Probe just doesn't perform the same check as the other path does, so it happily uses a reference that's way out of bounds (legal range is 0 - 8, while values can be things like 100). Thus, it may provide data for something, but not the tile probed...
Edit 4: Using what looks like the recovery mechanism in the original operation's C code implementation, which returns the embark tile that was passed in on offset overflow, I end up with biomes that look like the ones present at the surface level, but on closer examination they sometimes display some parameters that differ. Thus, it may be a case of shearing in from a neighboring tile, but not with the silly data previously displayed.
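For anyone following along, the bounds check and recovery can be sketched like this. It is a Python illustration, not the real C code: the function name is made up, and the 3x3 row-by-row layout of offsets (with 4 as the tile itself) is my assumption for the example. The only things taken from the findings above are the legal range 0 - 8 and the fallback to the tile passed in.

```python
from typing import Tuple


def resolve_biome_tile(x: int, y: int, offset: int) -> Tuple[int, int]:
    """Return the tile that supplies the biome for embark tile (x, y).

    Assumed layout (not confirmed): offsets 0..8 index a 3x3 neighbourhood
    row by row, so 4 is the tile itself.
    """
    if not 0 <= offset <= 8:
        # Out of range (values like 100 do occur): fall back to the tile
        # passed in instead of following a bogus reference.
        return (x, y)
    dx = offset % 3 - 1   # -1, 0, or +1
    dy = offset // 3 - 1  # -1, 0, or +1
    return (x + dx, y + dy)
```

Skipping the range check is what let Probe return data for some other tile entirely; with the check in place, an overflow yields the probed tile's own data.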
Edit 5: Uploaded 1.8. Further testing on an embark with a silly checkerboard pattern above the embarks showed a match; tropical swamp above the embark (with none at it) showed papyrus, temperate grasslands above showed artichokes (this biome existed on ground level as well) and an extra dry broadleaf showed buckwheat.