I had a fort where I made a covered road from the map edge to my depot, with raised bridges along the sides and constructed floors above it at the edge before it went underground. After some forgotten beasts made a mess of things I reclaimed, and that section of covered edge was no longer a valid spawning point for caravans (and probably others), even after deconstructing the floor above. If I knew where to look I could try to see if there is a data value involved. I suppose another reclaim after uncovering those edge tiles might fix it too, but the other reclaim bugs I had made that fort rather disappointing.
That would be consistent with it being re-calculated on each embark, then stored with the site data for the duration of that embark.
Oh hey! I found a likely data sequence in world.sav that matches this theory!
Bastardizing the bt hex edit format from the world parsing thread, the layout is something like this, where all the values are 16-bit integers:
short count_x;
short x_coords[count_x];
short count_y;
short y_coords[count_y];
short count_z;
short z_coords[count_z];
All three count_* values are the same, and the resulting coordinates appear to match the edge tiles. I can see gaps in the sequence where trees are growing on the map edge. I'll try editing this section and see if my caravan moves around.
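For anyone who wants to poke at the same data, here is a rough Python sketch of reading that layout. It assumes the values are little-endian signed shorts and that you already have the uncompressed bytes and the starting offset of the sequence; neither is confirmed here. The names read_short_array and read_edge_tiles are made up for illustration.

```python
import struct

def read_short_array(buf, offset):
    """Read one 16-bit count followed by that many 16-bit values."""
    # Assumption: little-endian signed shorts ("<h").
    (count,) = struct.unpack_from("<h", buf, offset)
    if count < 0:
        raise ValueError("negative count; probably the wrong offset")
    offset += 2
    values = list(struct.unpack_from(f"<{count}h", buf, offset))
    return values, offset + 2 * count

def read_edge_tiles(buf, offset):
    """Read the x, y and z arrays in turn and zip them into (x, y, z) tuples."""
    xs, offset = read_short_array(buf, offset)
    ys, offset = read_short_array(buf, offset)
    zs, offset = read_short_array(buf, offset)
    # The three counts should agree; if not, the offset is likely wrong.
    if not (len(xs) == len(ys) == len(zs)):
        raise ValueError("counts differ; probably the wrong offset")
    return list(zip(xs, ys, zs)), offset
```

The function also returns the offset just past the sequence, which makes it easy to check how many bytes the block occupies.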
Um, this might not relate to dfhack anymore, except to maybe help identify how the migrant/caravan/diplomat-eligible tiles might appear in memory. Please tell me if it should move to another thread.
Edit: Success!
The visitor-eligible tiles are definitely determined at embark, and stored in a static set of coordinates that can be edited to change where they may appear. That doesn't completely rule out other sets of coordinates for other civilizations.
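If you want to try the same kind of edit, something like the following Python sketch could write a modified tile list back out as three count-prefixed arrays of 16-bit integers. Little-endian byte order is an assumption, pack_edge_tiles is a made-up name, and I have not checked whether the game tolerates a changed count (which changes the length of the block), so keeping the count the same and only changing coordinates is the safer experiment.

```python
import struct

def pack_edge_tiles(tiles):
    """Serialize (x, y, z) tuples as x, y and z arrays, each prefixed by its count."""
    out = bytearray()
    for axis in range(3):
        values = [tile[axis] for tile in tiles]
        # Assumption: little-endian signed shorts, count first, then the values.
        out += struct.pack(f"<h{len(values)}h", len(values), *values)
    return bytes(out)

# Example: move every eligible tile to the west edge (x = 0),
# keeping the number of tiles unchanged.
tiles = [(0, 7, 10), (5, 0, 10)]
moved = [(0, y, z) for (_x, y, z) in tiles]
blob = pack_edge_tiles(moved)
```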