Just keep a running tally of quotes: have a counter that increments on every {quote} and decrements on every {/quote} (with square brackets instead of braces, obviously). If the counter isn't 0, ignore the red text.
It doesn't work like that: the script reads the HTML, not the BBCode. That means it's not nicely ordered {quote}{/quote} pairs, but deeply nested <div>s and <span>s with endless crap in between. There are different "bbc_quote" styles for nested quotes (they have a different background colour, which is reflected in the HTML), and for things like quotes with links to profiles, links to posts, or no links, and so on, all in a single line (the HTML for the contents of a post is, internally, stored in a single, very long line).
Also, I don't walk the text, Perl walks the text; for me to keep a counter, I'd have to walk the text myself, which defeats the purpose of using Perl for it... Perl is really good and smart about reading text the same way you and I do: top-down and left-right. Matching quote pairs requires it to read inside-out instead, for which I think the answer is recursion. I have a couple of ideas to try that, but won't have time to tinker until the weekend or thereabouts; we'll see how it pans out.
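For what it's worth, here is a rough sketch of the "running tally" idea applied to the HTML rather than the BBCode. It is in Python, not Perl, and it lets the standard-library HTML parser do the walking. The class names in the example and the "any class containing bbc_ and quote" test are guesses at what the forum emits, not something checked against the real pages, and `RedTextFinder` and `red_text_outside_quotes` are made-up names.

```python
import re
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
# Matches "color: red" but not "background-color: red".
RED_STYLE = re.compile(r"(?<![-\w])color\s*:\s*red\b", re.IGNORECASE)


class RedTextFinder(HTMLParser):
    """Collects red text that is not inside any quote element."""

    def __init__(self):
        super().__init__()
        self.stack = []       # one (tag, is_quote, is_red) entry per open element
        self.quote_depth = 0  # how many quote elements we are currently inside
        self.red_depth = 0    # how many red-styled elements we are currently inside
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        css_class = attrs.get("class") or ""
        # ASSUMPTION: quote containers carry a class like "bbc_..._quote".
        is_quote = "bbc_" in css_class and "quote" in css_class
        is_red = bool(RED_STYLE.search(attrs.get("style") or ""))
        self.stack.append((tag, is_quote, is_red))
        self.quote_depth += is_quote
        self.red_depth += is_red

    def handle_endtag(self, tag):
        # Ignore stray closers; otherwise pop back to the matching opener.
        if not any(open_tag == tag for open_tag, _, _ in self.stack):
            return
        while self.stack:
            open_tag, is_quote, is_red = self.stack.pop()
            self.quote_depth -= is_quote
            self.red_depth -= is_red
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Keep text only when inside red styling and outside every quote.
        if self.red_depth > 0 and self.quote_depth == 0 and data.strip():
            self.found.append(data.strip())


def red_text_outside_quotes(post_html):
    finder = RedTextFinder()
    finder.feed(post_html)
    finder.close()
    return finder.found


# Hypothetical post body: a nested quote containing red text, then a fresh red line.
sample = (
    '<div class="bbc_quote_a"><div class="bbc_quote_b">'
    '<span style="color: red;">old vote</span></div>text<br></div>'
    'reply <span style="color: red;">new vote</span>'
)
print(red_text_outside_quotes(sample))
```

The stack is what copes with the nesting: each closing tag undoes whatever its opener added to the two counters, so it doesn't matter how many layers of <div> and <span> sit in between or that everything is on one line.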
I do know that my current approach for the redtext is not the right one, just because of how convoluted it is. For comparison, it takes one line of code to find the post author, process replacements, and store it; it takes one line to find the reply number and date, and one more line for the URL. Finding the redtext takes five lines at the moment, and it doesn't quite work. No, the answer is simpler and more elegant, I just haven't found it yet. Or maybe I have, and just need time to give the latest hare-brained ideas a whirl.
On top of that, there are also other inexplicable bits in how the forum creates the HTML. For example, in Salad, this post was missed by the finder. I noticed it and put it in manually, but it was missed because the colour is not "red", as all other red text is, but "#ff0000" instead. I don't know why that is, but maybe the edit had something to do with it. I'm OK with missing edited posts (they are illegal, after all), or with putting in an exception for "#ff0000", but it's just an example of how nutty the unexpected bits of the problem sometimes are.
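One way the "#ff0000" exception could look, again as a Python sketch rather than anything from the actual Perl script: pull the colour value out of the style attribute, normalise it, and compare it against a small set of spellings that all mean red. The set below is a guess; only "red" and "#ff0000" have actually turned up so far, and `is_red_style` is a made-up name.

```python
import re

# Captures the value of a "color:" declaration, skipping "background-color:".
COLOR_DECL = re.compile(r"(?<![-\w])color\s*:\s*([^;]+)", re.IGNORECASE)

# ASSUMPTION: these are the spellings worth accepting; only the first two
# have been seen in the forum's HTML.
RED_SPELLINGS = {"red", "#ff0000", "#f00", "rgb(255,0,0)"}


def is_red_style(style):
    """Return True if a style attribute sets the text colour to red."""
    match = COLOR_DECL.search(style)
    if not match:
        return False
    # Lower-case and drop spaces so "#FF0000" and "rgb(255, 0, 0)" compare equal.
    value = match.group(1).lower().replace(" ", "")
    return value in RED_SPELLINGS


for style in ("color: red;", "color: #FF0000;", "color: blue;", "background-color: red;"):
    print(style, is_red_style(style))
```

Keeping the spellings in one set means the next oddity is a one-word addition rather than another special case in the matching code.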
Oh well, it's not like it's the launch codes for the nuclear warheads or anything; if it works, it works. I'll have something new, and hopefully better, some time next week. But thank you for the discussion, it helps.