With multithreading you are effectively saying "see this function? run it separately." The OS takes it, throws it in with everything else that needs run time (background processes and so on), and it is then scheduled automatically and sent to whichever core is available. In fact, your single-core program gets the same treatment if the OS decides to move it to a different CPU core. Watching Ubuntu's task-manager-equivalent program shows my code hopping to a different core every 5 seconds or so; this behavior is OS dependent and is managed almost entirely by the OS, though in some cases you can adjust your code's scheduling priority from within the code itself.
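For example, on POSIX systems a process can lower its own scheduling priority with nice(). A minimal sketch, assuming a POSIX environment (the exact effect depends on the scheduler):

#include <cerrno>
#include <cstdio>
#include <unistd.h>  // nice() is POSIX, not standard C++

int main()
{
    // Raise our "niceness" by 10, i.e. lower our scheduling priority
    // so the OS favors other runnable work over this process.
    errno = 0;
    if (nice(10) == -1 && errno != 0)  // -1 can be a valid result, so check errno too
        perror("nice");
    // ... the rest of the program now runs at the lower priority ...
}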
The trick with multithreading is to ensure things aren't being accessed by multiple threads at the same time, and there are special atomic operations to help prevent that. Say for example you have a variable 'int x = 0;' that is accessed by both Thread A and Thread B. Thread A needs to run 'x = x + 2;' while Thread B needs to run 'x = x - 1;'. Even if the order in which they run doesn't matter, there's a subtle problem that can produce strange results. Run in normal sequential order, x would equal 1 at the end. Run with multithreading and no proper locks, you can't know what it will be.
'x = x + 2' would compile down to assembly saying something like:
'tmp1 = loadWord(xAddress); tmp1 = tmp1 + 2; storeWord(xAddress, tmp1)'
'x = x - 1' would similarly compile down to something like:
'tmp2 = loadWord(xAddress); tmp2 = tmp2 - 1; storeWord(xAddress, tmp2)'
Now, this means each of these single lines of code needs 3 instructions to run. So, if Thread A and Thread B both run, you may get an interleaving like this:
'x = 0;' <--- value set previously in RAM/cache or wherever it's stored
'tmp1 = loadWord(xAddress);' <--- value loaded into register for Thread A
'tmp2 = loadWord(xAddress);' <--- value loaded into register for Thread B
'tmp1 = tmp1 + 2;' <--- add 2 to tmp1 for Thread A; tmp1 now equals 2
'storeWord(xAddress, tmp1);' <--- tmp1 stored back at RAM/cache/wherever location by Thread A; Thread A finished with its code now
'tmp2 = tmp2 - 1;' <--- subtract 1 from tmp2 for Thread B; tmp2 now equals -1
'storeWord(xAddress, tmp2);' <--- tmp2 stored back at RAM/cache/wherever location by Thread B; Thread B finished with its code now
So, if we were to check the value stored in 'x' at this point, it would be -1. Similarly, if Thread A took longer to complete than Thread B, it would be 2. Most of the time, when Thread A and Thread B execute, one happens to finish before the other starts, since 3 instructions take a tiny amount of time. However, if you were to run them both 100 times, you would quite likely not end up with 100 as the value of x. These sorts of things are called race conditions.
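To make the race visible, here's a minimal runnable C++ sketch of the scenario above. The 100,000 iteration count is my own choice to make collisions likely; any C++11 compiler with std::thread should do:

#include <iostream>
#include <thread>

int x = 0;  // shared and unprotected; this is the bug

void threadA()
{
    for (int i = 0; i < 100000; ++i)
        x = x + 2;  // load, add, store: not atomic
}

void threadB()
{
    for (int i = 0; i < 100000; ++i)
        x = x - 1;  // load, subtract, store: not atomic
}

int main()
{
    std::thread a(threadA);
    std::thread b(threadB);
    a.join();
    b.join();
    // Expected: 100000 * 2 - 100000 * 1 = 100000. Lost updates from
    // the race usually leave x well short of that. (Formally this is
    // a data race, which C++ calls undefined behavior.)
    std::cout << "x = " << x << '\n';
}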
So, we need a way of telling Thread A and Thread B "don't execute this specific part of your code while the other thread is executing that specific part of its code." You could try a boolean flag shared between the two:
if (flag)
    wait             // the other thread is busy; hold off
else
{
    flag = true      // claim the shared section
    execute code
    flag = false     // release it
}
However, this runs into the same problem as above: there is a chance one thread could see that the flag is false, branch into the else statement, and before it has a chance to set the flag, the other thread tests the flag and enters its own else statement. The flag then gets set to true, but both threads are already inside their else statements.
The special multithreading tools are built on atomic (that is, indivisible, single-instruction) operations. These combine the 'if' functionality with the 'set' functionality into a single 'check and set' step that no other thread can sneak into the middle of. The higher-level primitives built on top of them go by names like semaphores and mutexes, among others.
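In C++, for example, the 'check and set' idea is exposed directly as std::atomic_flag::test_and_set(), which sets the flag and hands back its previous value in one indivisible step. Here's a sketch of the broken flag idea above repaired with it; a crude spinlock, not production code (real programs should usually reach for a mutex):

#include <atomic>

std::atomic_flag flag = ATOMIC_FLAG_INIT;  // starts cleared

void criticalSection()
{
    // test_and_set() sets the flag and returns what it was before,
    // all in one indivisible step, so two threads can never both
    // see 'false' and slip inside together.
    while (flag.test_and_set())
        ;  // spin (busy-wait) until the other thread clears it

    // ... execute code ...

    flag.clear();  // release so the other thread can get in
}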
While these deal with the fundamental issues of multithreading, the tricky part is using them correctly. It requires careful consideration of the interdependencies between code run as separate threads; miss one, and the resulting race conditions will slowly corrupt the values of the variables involved.
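For completeness, here's the earlier x example fixed the usual high-level way with std::mutex; a sketch assuming C++11:

#include <iostream>
#include <mutex>
#include <thread>

int x = 0;
std::mutex xMutex;  // guards every access to x

void threadA()
{
    for (int i = 0; i < 100000; ++i)
    {
        std::lock_guard<std::mutex> lock(xMutex);  // acquire
        x = x + 2;
    }  // lock_guard's destructor releases the mutex here
}

void threadB()
{
    for (int i = 0; i < 100000; ++i)
    {
        std::lock_guard<std::mutex> lock(xMutex);
        x = x - 1;
    }
}

int main()
{
    std::thread a(threadA);
    std::thread b(threadB);
    a.join();
    b.join();
    std::cout << "x = " << x << '\n';  // now reliably 100000
}

std::lock_guard releases the mutex automatically when it goes out of scope, so the lock can't accidentally be left held on an early return or an exception.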