Why do people keep acting like I'm somehow attacking the efficacy or validity of these things? They work great. I'm just skeptical that they're an approach that'll lead to general-intelligence AI.
Well, you'll have to define what "general-intelligence AI" means, first, before we can say that.
But no, whenever I talk, apparently everyone has to go into condescending "maybe if we explain it slowly he'll understand" mode. I've seen nothing in the arguments made here that leads me to believe that deep learning and artificial neural networks will actually lead to general-intelligence AI. I'm not saying they don't work. I'm saying they have an upper bound on their efficacy, and it's lower than what people would like.
'K.
Then prove it. These are mathematical concepts, after all, so it's entirely within the scope of the discussion to demand a proof. Prove to me that methods other than stochastic gradient descent lead to convergence to the solution.
Random permutation of the weights plus batch gradient descent will eventually reach the best possible solution for a given network, once you've explored every local minimum (or maximum, depending on which direction you follow the gradient; I don't remember exactly). And since neural networks are universal approximators, that solution can be made as close to reality as you like. Given enough data, of course.
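Roughly what I mean, as a toy sketch (plain NumPy, with a made-up 1-D "loss" and an arbitrary step size, so not how you'd actually train a network): restart plain gradient descent from a bunch of random starting points and keep the best local minimum you find.

```python
import numpy as np

# Made-up non-convex "loss" with several local minima on [-6, 6].
def loss(x):
    return np.sin(3 * x) + 0.1 * x ** 2

def grad(x):
    return 3 * np.cos(3 * x) + 0.2 * x

rng = np.random.default_rng(0)
best_x, best_loss = None, np.inf

for _ in range(20):                      # random restarts ("random permutation of weights")
    x = rng.uniform(-6, 6)
    for _ in range(500):                 # plain gradient descent from that starting point
        x -= 0.01 * grad(x)
    if loss(x) < best_loss:
        best_x, best_loss = x, loss(x)

print(best_x, best_loss)  # with enough restarts, one of the runs lands in the lowest minimum
```

Each individual run only finds whatever minimum its starting point rolls into; it's the restarts that (eventually) cover all of them.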
Or rather, prove the thing I edited into the remark above: prove to me that artificial neural networks and deep learning exhibit mathematical chaos.
Different starting weights can, via the magic of gradient descent, lead to completely different solutions to the problem. That's probably not what you want, though.
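You can watch that happen with a throwaway sketch like this (a tiny hand-rolled two-layer net on XOR; the architecture, learning rate, and step count are arbitrary picks of mine): the only thing that differs between the two runs is the seed for the initial weights, and the weights they end up with come out different.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

def train(seed, steps=5000, lr=0.5, hidden=3):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)                  # hidden layer
        p = 1 / (1 + np.exp(-(h @ W2 + b2)))      # sigmoid output
        # hand-rolled backprop of the mean squared error
        d_out = 2 * (p - y) * p * (1 - p) / len(X)
        d_hid = (d_out @ W2.T) * (1 - h ** 2)
        W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(0)
        W1 -= lr * X.T @ d_hid;  b1 -= lr * d_hid.sum(0)
    return W1, float(np.mean((p - y) ** 2))

for seed in (0, 1):
    W1, final_loss = train(seed)
    print(f"seed={seed}  final loss={final_loss:.4f}")
    print(W1.round(2))
```

Both runs will usually fit XOR, but the first-layer weights they settle on aren't the same, because each starting point rolls into its own basin.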
Decision trees could actually be closer to what you want, because they exhibit a great degree of instability. I forget where I saw it, but there was an example where a decision tree built on a dataset was compared with a decision tree built on the same dataset with one data point removed, and the two trees were almost completely different. In that context, though, it was perceived as a bad thing, since it meant you couldn't rely on decision trees to give you an interpretation of the underlying phenomenon that wouldn't completely change with every new observation.
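I don't remember the original source either, but the effect is easy to reproduce with something like this sklearn sketch (the synthetic dataset and all the parameters are arbitrary picks of mine): fit a tree on the data, fit another on the same data with one row removed, and compare the printed structures. A single row won't always flip the splits, but on small, slightly noisy datasets it does so surprisingly often.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Small, slightly noisy synthetic dataset; small samples make the instability easy to see.
X, y = make_classification(n_samples=60, n_features=4, flip_y=0.1, random_state=0)

def dump_tree(X, y):
    # Fully grown tree, printed as text so the two structures can be compared directly.
    return export_text(DecisionTreeClassifier(random_state=0).fit(X, y))

full = dump_tree(X, y)
minus_one = dump_tree(np.delete(X, 0, axis=0), np.delete(y, 0))

print(full)
print(minus_one)
print("same structure:", full == minus_one)
```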