The problem is hard to solve, requires effort plus intent towards that goal to solve, and won't be solved spontaneously. But once we think that maybe we're nearly there, we should stop and ask whether we actually should... We might well have unknowingly tipped over into doing so already, irreversibly so, because no one person's 86 billion neurons have yet entered the state that properly recognises that the somewhat comparably complex artificial brain of perhaps 86 billion primary transistors[1] (but not necessarily that many, nor necessarily so limited) has entered a state of its own consciousness. It may just look like it is being a better 'trained-monkey'[2] than the last trained-monkey.
(And I say "primary transistors" because clusters of aggregating transistors would need to 'feed' the output 'primary' with the balanced assessment of all inputs, to directly emulate the way the neuron is stimulated, and then relaying 'wires' to place the output into the input bin of other 'primaries'. Right now, if this approach is taken then this is done in software (virtual neural circuits, emulated by data reusing the same hardware (time-sliced) with just generic all-purpose chip logic), though there's Programmable Gate Array tech which might be closer to the (currently) fictional "biogel" or "positronic" versions of amorphous brain-stuff that the likes of Trek or Asimov use to explain how (seemingly) intelligent their computers and robots can be, i.e. how to pack the computing power needed into some handy circuitry-like system.)
We can perhaps interrogate the state, a lot more easily than we can interrogate the state of a brain[3][4]. But piggy-backing on the data-flow is no good unless we can interpret the reason for the data. And we've already established that a trivially understandable AI is probably insufficiently intelligent, while anything on the order of a human mind (or even a monkey's; perhaps something far further down the list of life, too) is darn tootin' gonna be ineffable at anything other than the broad level.
If anyone is still with me, the whole 'evolved circuitry' concept (there may be better reports of it than the link provided, but that was the first link I checked that dealt with the points directly) already reflects the "we know it works, but not really how" idea, with no more than 100 "primary transistors" (fewer than half that, in fact). And if 'disconnected circuits' and subtle sub-logic effects play a vital part in a simple tone-discriminator, can we even hope to tap our budding AI's circuits to monitor its innermost thoughts, assuming we can resolve them? Could that not intrinsically change the efficacy or direction of the 'spark' of consciousness, in the ultimate "heisenbug"? And actually imposing directives 'deep' into such layers of nuance[6], when we don't even know how those depths work (or will work, once we let it develop the free will to do the bits that we don't know how to pre-engineer into the system), seems likely to be hit and miss.
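(For anyone who didn't follow the link: the gist of that evolved-circuitry approach, reduced to a toy Python sketch, goes roughly as below. The fitness function here is a stand-in of my own, not the actual tone-discriminator measurement, but the essential point survives: the winning configuration is bred for what it does, never designed from an understanding of how it does it.

import random

GENOME_BITS = 100   # the order of magnitude of 'primary' elements involved
POP_SIZE = 50
GENERATIONS = 200

def fitness(genome):
    # Stand-in scoring function. In the real experiment this would be "how well
    # does the physically configured chip discriminate the two tones".
    return sum(genome[i] != genome[i + 1] for i in range(len(genome) - 1))

def mutate(genome, rate=0.02):
    # Flip each bit with a small probability.
    return [bit ^ (random.random() < rate) for bit in genome]

def crossover(a, b):
    # Splice two parent configurations at a random point.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(GENOME_BITS)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(fitness(best))

Nothing in that loop, nor in the real hardware version, leaves behind a design rationale you could inspect, which is exactly why imposing directives 'deep' into such a thing looks hit and miss.)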
Just like the incontrovertible Asimovian laws seemed to be subverted in 'Reason' (though, there, ultimately benignly so/not really, depending upon your interpretation), and definitely had, from the human viewpoint alone, been broken by the time Solaria becomes maximally roboticised and only an ex-Solarian even has a chance. Even the often (deviously/obtusely) strict Asimov occasionally stretched the laws beyond actual breaking point. And this is in a setting where such laws are, in-universe, pretty much inviolable once you understand the how and the why of it, and where even well-meaning violation means death (alas, poor R. Giskard; and Herbie from 'Liar!', long before that...). The 'Evitable Conflict' precursor to the Zeroth Law bucks that trend slightly, however.
[1] Or equivalent.
[2] Not to disparage the marvellous brains that monkeys have! Let alone apes, lest we go down Planet Of The Apes levels of "whoops, we created our successor".
[3] We can slice a brain up post-mortem, and establish as many linkages as have not been distorted by death (or the slicing!), but can we identify the remnants of the activity that traversed that physical structure, showing us the last images the eyes saw, the last sounds the ears heard, etc.? Certainly, even if that were possible, we can't then establish much of what was thought... If we additionally sift for adrenaline/etc., or for signs of raised activity in certain structures, perhaps we can tell what emotional state that brain was in, or possibly that it recognised something. But was that because a family member was with them at/during their death, or because they died in a place they knew well and had many associated memories of (including the presence of family members)? Already that's beyond what we can do now, but if/when it becomes viable to start forensically establishing these things, the overlap of related states is going to be a further matter of untangling the possibly nigh-untangleable.
[4] Live, realtime fMRI scans, etc., can 'eavesdrop' upon the working mind, getting (perhaps, once we know enough) an "is thinking about family" signal, but nothing yet really that specific. The wish to move a limb is one thing that's been studied (showing that the decision to move a hand can be detected in advance of the subject even consciously realising that they have made it), but only in gross terms. A sudden distaste for being stuck in the fMRI machine could look like an instantaneous distaste for the latest in the long line of images you're forcing the subject to look at as part of a different cognitive experiment.
[6] As opposed to feeding in "we rate non-lethal solutions a gazillion times higher than lethal ones" as an input, unaware that internally it has found a loophole whereby degrees of lethality don't even apply to anything it doesn't consider equal to itself, somehow, so it'll play 'nice' only so long as it does not feel threatened.
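(A purely hypothetical, self-contained Python illustration of that loophole, with made-up plans and names: the imposed weighting is applied perfectly faithfully, just over the wrong set of beings that 'count'.

GAZILLION = 10 ** 12   # "non-lethal solutions rated a gazillion times higher"

# Made-up candidate plans: a base utility, plus the set of beings each would kill.
plans = [
    {"name": "negotiate", "utility": 3, "kills": set()},
    {"name": "bulldoze",  "utility": 9, "kills": {"human_A", "human_B"}},
]

def imposed_score(plan, beings_that_count):
    # The rule exactly as we fed it in: non-lethal outcomes vastly outrank lethal ones.
    lethal = bool(plan["kills"] & beings_that_count)
    return plan["utility"] * (1 if lethal else GAZILLION)

# What we assume it is doing: everyone counts.
everyone = {"human_A", "human_B", "the_AI_itself"}
print(max(plans, key=lambda p: imposed_score(p, everyone))["name"])    # -> negotiate

# The hidden loophole: 'lethality' only applies to what it considers its peers.
peers_only = {"the_AI_itself"}
print(max(plans, key=lambda p: imposed_score(p, peers_only))["name"])  # -> bulldoze

Same rule, same gazillion, opposite behaviour.)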