With compiler optimization I usually just assume it's going to simplify or reorder operations, mostly to hide latency in the CPU's pipeline. Thinking about it, though, that probably affects memory access patterns fairly rarely. It could if some arithmetic got simplified away, perhaps, but the overall access pattern is going to be dominated by the data layout and the loops in the program.
I probably just need to watch the videos, since I'm having a hard time imagining ways to really improve that for most programs, at least not without completely restructuring them. Kind of like the "getting rid of objects" idea you mentioned. Most of the time you'd have an array of complex objects that you iterate over, accessing member variables; but depending on what your code does, you could instead maintain a separate array for each member variable and iterate those. That could be a lot faster, but would probably be a much bigger headache to manage...
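To make that concrete, here's a minimal sketch of the two layouts (the Particle type and its fields are made up for illustration). The point is that the array-of-structs loop drags every field through the cache just to read x, while the struct-of-arrays loop only touches contiguous x values:

```cpp
#include <vector>

// Array-of-structs (AoS): iterating this and reading only `x` still
// pulls the whole Particle through the cache, so most of each cache
// line is dead weight. (Hypothetical type, just for illustration.)
struct Particle {
    float x, y, z;
    float velocity[3];
    int   id;
};

float sum_x_aos(const std::vector<Particle>& ps) {
    float sum = 0.0f;
    for (const Particle& p : ps) sum += p.x;
    return sum;
}

// Struct-of-arrays (SoA): each member gets its own array, so a loop
// over x touches only contiguous floats -- every fetched byte is used.
struct Particles {
    std::vector<float> x, y, z;
};

float sum_x_soa(const Particles& ps) {
    float sum = 0.0f;
    for (float v : ps.x) sum += v;
    return sum;
}
```

The headache part is real, though: with SoA you have to keep all the parallel arrays in sync yourself whenever you add or remove an element.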
I guess that's kind of the eternal tradeoff with programming languages. Do you want easy to use, or fast? Rarely do you have both.
A compiler such as MSVC can't actually optimize version 1 into version 2. It can't do the optimization you can see immediately, of moving the bool check outside the loop, because you are reasoning non-locally that the member bool isn't changing as the loop progresses. But what if it did change inside "do_the_thing"? Is the compiler meant to read your entire program, reason about such things, and make sure there aren't any concurrent processes that might change the m_bool value during the for loop? That would be non-trivial, AI-level stuff that doesn't actually exist.
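For reference, here's my guess at what the two versions look like. Only m_bool and do_the_thing come from the example; the surrounding class and the counter are made up so the sketch runs:

```cpp
// Hypothetical class around the m_bool / do_the_thing example.
struct Widget {
    bool m_bool  = true;
    int  m_count = 0;

    // In this sketch the work doesn't touch m_bool, but the compiler
    // would have to prove that before it could hoist the check.
    void do_the_thing(int /*i*/) { ++m_count; }

    // Version 1: the member bool is re-tested on every iteration.
    void run_v1(int n) {
        for (int i = 0; i < n; ++i)
            if (m_bool) do_the_thing(i);
    }

    // Version 2: the check hoisted by hand -- only equivalent because
    // we, the programmers, know do_the_thing never flips m_bool.
    void run_v2(int n) {
        if (m_bool)
            for (int i = 0; i < n; ++i) do_the_thing(i);
    }
};
```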
This is an important statement and something to think about, but it's worth noting that depending on how your code is structured, the compiler might indeed be able to do that optimization. C and C++ compilers are usually conservative about it, or gate it behind a separate option for whole-program optimization (MSVC's /GL, or -flto on GCC and Clang); with that turned on, they could possibly inline the do_the_thing function and determine through code analysis that the if check can be hoisted out of the loop. I wouldn't rely on it, though.
This is related to something I found surprising at first, which is that Fortran programs are often considered faster than equivalent C programs, because the compiler can usually optimize them better thanks to the language's stricter rules on memory usage. In particular, pointers in C and C++ often force the compiler to forgo optimizations, because it can never be sure a variable isn't aliased by several pointers that might change its value in unexpected ways. The "restrict" keyword (C99; most C++ compilers spell it __restrict as an extension) can be a useful hand optimization in such cases, since it's effectively a promise to the compiler that you're not doing that.
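Here's about the smallest example of that promise. Without the qualifier, the compiler has to assume dst might overlap src and reload src[i] after every store; with it, it's free to keep values in registers and vectorize. (__restrict is the MSVC/GCC/Clang spelling in C++; behavior is undefined if you lie and the pointers do overlap.)

```cpp
#include <cstddef>

// Promise the compiler that dst and src never alias. Equivalent to
// C99 `restrict`; __restrict is a widely supported C++ extension.
void scale(float* __restrict dst, const float* __restrict src,
           float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}
```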
Now I'm curious if compilers even think about the possibility of nonlocal thread access to variables when performing optimizations. I'm guessing not, unless you reference a variable through a pointer somewhere.
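As far as I know they mostly get to ignore it by fiat: in C and C++, an unsynchronized write from another thread is a data race, which is undefined behavior, so the optimizer is allowed to assume plain variables don't change out from under it. std::atomic is how you opt out. A small sketch of the classic case (the function names are mine):

```cpp
#include <atomic>

// With a plain bool, the compiler may load `stop` once and loop on
// the cached value -- it assumes no other thread writes it, since
// that would be a data race (undefined behavior). Returns how many
// iterations were spent waiting.
int spin_plain(const bool& stop) {
    int iters = 0;
    while (!stop) ++iters;  // may be "optimized" into an infinite loop
    return iters;
}

// std::atomic tells the compiler the value can change underneath the
// loop, so every iteration performs a fresh load.
int spin_atomic(const std::atomic<bool>& stop) {
    int iters = 0;
    while (!stop.load(std::memory_order_acquire)) ++iters;
    return iters;
}
```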