First off, I would like to say that this is an awesome idea. Second, for all the folks who think this is going to be CPU-intensive, I would like to suggest that it won't be, at least if done correctly.
As I understand it, all fluids constantly have to check the surrounding squares for ones to fill, potentially over large distances, and it is large masses of water and magma that cause noticeable CPU consumption because so many squares have to be checked.
Sand usually won't flow around on its own, at least as long as no weather effects like wind are implemented. Wind could reasonably be implemented without much CPU consumption as well, similar to rain: on every sand tile it hits, 1/7 of sand is displaced in the wind's direction, with an element of randomness and with the likelihood of displacement depending on the slope.
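A rough sketch of how such a wind effect might look, in Python. Everything named here is my own assumption for illustration: the function wind_gust, the parameters hits and base_chance, the 0.1 slope factor, and storing sand as whole numbers of sevenths from 0 to 7. It only shows the idea of random, rain-like hits with a slope-dependent chance, not how the game would actually do it.

```python
import random

MAX_SAND = 7  # assumption: sand is stored in sevenths of a tile, 0..7


def wind_gust(sand, wind, hits, rng, base_chance=0.5):
    """Move one unit (1/7) of sand downwind from randomly hit tiles.

    sand is a list of rows of integers, wind is a (row, col) step such as
    (0, 1), hits is how many random tiles the gust touches this frame.
    """
    rows, cols = len(sand), len(sand[0])
    dr, dc = wind
    for _ in range(hits):
        # Pick a random tile, like a raindrop landing somewhere.
        r, c = rng.randrange(rows), rng.randrange(cols)
        tr, tc = r + dr, c + dc
        if not (0 <= tr < rows and 0 <= tc < cols):
            continue  # the downwind tile is off the map
        if sand[r][c] == 0 or sand[tr][tc] >= MAX_SAND:
            continue  # nothing to move, or no room downwind
        # Assumption: blowing sand downhill is more likely than uphill.
        slope = sand[r][c] - sand[tr][tc]
        chance = min(1.0, max(0.0, base_chance + 0.1 * slope))
        if rng.random() < chance:
            sand[r][c] -= 1
            sand[tr][tc] += 1
```

The cost per frame depends only on hits, not on how much sand is on the map, which is why it should stay cheap.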
My suggested algorithm would work as follows:
When any tile x is mined, has sand collected from it, or has its sand displaced by a cave-in, do PULL(x).
If a sand tile x is walked on and a random number is < p, pick a random adjacent tile y, displace 1/7 of sand from x onto y, and do PUSH(y) and PULL(x).
PULL(x): if any adjacent tile y has sand(y) > sand(x)+2, add the coordinate x (e.g. the dug-out tile) to an event vector.
PUSH(x): if an adjacent tile y has sand(y)+2 < sand(x), add the coordinate y to the event vector.
Per frame do:
For each coordinate x in the event vector,
  for each adjacent tile y,
    if sand(y) > sand(x)+2, displace sand(y)-(sand(x)+2) from y onto x and do PULL(y)
  if any sand has been displaced, do PUSH(x); otherwise remove the event (move the last coordinate into the gap)
The vector could be grown 1000 coordinates at a time to avoid frequent reallocation.
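To make the steps above concrete, here is a minimal sketch in Python. It follows PULL, PUSH and the per-frame loop as described, but several things are my own assumptions: the class name SandGrid, the dig helper, only the four orthogonal neighbours counting as adjacent, sand stored as whole sevenths, and an extra set (queued) to keep the same coordinate from being added twice.

```python
class SandGrid:
    # Assumption: only the four orthogonal neighbours count as adjacent.
    NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))
    MAX_SLOPE = 2  # the "+2" threshold from the description

    def __init__(self, rows):
        self.sand = [list(row) for row in rows]  # heights in sevenths
        self.events = []     # the event vector of coordinates to revisit
        self.queued = set()  # assumption: avoids duplicate entries

    def adjacent(self, pos):
        r, c = pos
        for dr, dc in self.NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(self.sand) and 0 <= nc < len(self.sand[nr]):
                yield (nr, nc)

    def height(self, pos):
        return self.sand[pos[0]][pos[1]]

    def enqueue(self, pos):
        if pos not in self.queued:
            self.queued.add(pos)
            self.events.append(pos)

    def pull(self, x):
        # x lost sand: revisit x if any neighbour now towers over it.
        limit = self.height(x) + self.MAX_SLOPE
        if any(self.height(y) > limit for y in self.adjacent(x)):
            self.enqueue(x)

    def push(self, x):
        # x gained sand: revisit every neighbour that x now towers over.
        for y in self.adjacent(x):
            if self.height(y) + self.MAX_SLOPE < self.height(x):
                self.enqueue(y)

    def dig(self, x):
        # Hypothetical helper: mining removes all sand from the tile.
        self.sand[x[0]][x[1]] = 0
        self.pull(x)

    def step(self):
        """One frame: let sand slide onto every tile in the event vector."""
        i = 0
        while i < len(self.events):
            x = self.events[i]
            moved = False
            for y in self.adjacent(x):
                excess = self.height(y) - (self.height(x) + self.MAX_SLOPE)
                if excess > 0:
                    self.sand[y[0]][y[1]] -= excess
                    self.sand[x[0]][x[1]] += excess
                    moved = True
                    self.pull(y)
            if moved:
                self.push(x)
                i += 1
            else:
                # Remove the event by moving the last coordinate into the gap.
                last = self.events.pop()
                if i < len(self.events):
                    self.events[i] = last
                self.queued.discard(x)
```

Calling dig((row, col)) on a tile and then step() once per frame is all that is needed; while the event vector is empty, step() returns immediately without looking at the map.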
While no sand is moving, it should consume almost no CPU cycles, and when sand does need to move, the movement will be very localized (unless you are digging out a huge inverted pyramid) and thus should consume a minimal amount of CPU cycles.