Introduction
There is a widespread idea amongst many people who work in technology adjacent to Computer/Data Science (and even some within it) that machine “learning” systems, neural networks in particular, will continue to become more and more advanced over time, with ever greater access to better and better hardware, and thus will inevitably reach the point where they can explain and emulate any facet of the world, including consciousness.
Such notions, however, show a great deal of ignorance on the part of those who hold them: they have forgotten exactly what it is that neural networks are.
What is a(n Artificial) Neural Network Anyway?
Fundamentally, an artificial neural network is a combination of one or more nodes, each of which takes some inputs, scales them by internal, learned parameters called weights, and sums them (i.e. forms a linear combination), with the scale or amplitude of the resulting output controlled by an activation function, most commonly the Rectified Linear Unit (ReLU) or the sigmoid function.
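To make that concrete, here is a minimal sketch of a single node in Python with NumPy. The particular weights, bias and inputs are made-up values purely for illustration:

```python
import numpy as np

def relu(z):
    # Rectified Linear Unit: passes positive values through, zeroes out the rest
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real-valued input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias, activation=relu):
    # Weighted sum of the inputs (a linear combination), then an activation
    return activation(np.dot(weights, inputs) + bias)

# Example: a single node with arbitrary, illustrative parameters
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w, bias=0.2))
```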
While an obvious property of this networking of linear combinations is that it can perfectly reproduce any linear function, what is not obvious, but very useful, is that this networking, when combined with the restriction of having no cycles (being acyclic, or feedforward) and the optimisation method of backpropagation, allows virtually all types of functions to be approximated.
This is to say that artificial neural networks are universal function approximators. Notice, however, the caveat of “virtually”.
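To see this approximation in action, here is a rough sketch of a one-hidden-layer feedforward network trained by backpropagation to approximate sin(x). It again uses NumPy, and the layer size, learning rate and target function are arbitrary choices for illustration, not anything canonical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: approximate sin(x) on [-pi, pi]
X = rng.uniform(-np.pi, np.pi, size=(256, 1))
y = np.sin(X)

# One hidden layer of tanh ("squashing") units, linear output layer
hidden = 32
W1 = rng.normal(0, 0.5, size=(1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.5, size=(hidden, 1))
b2 = np.zeros(1)

lr = 0.01
for step in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # linear output layer
    err = pred - y

    # Backpropagation: gradients of mean squared error w.r.t. each parameter
    grad_pred = 2 * err / len(X)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_z1 = grad_h * (1 - h ** 2)   # derivative of tanh
    grad_W1 = X.T @ grad_z1
    grad_b1 = grad_z1.sum(axis=0)

    # Gradient descent step
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print("final mean squared error:", float(np.mean(err ** 2)))
```

With enough hidden units and training steps the error on this toy problem can be driven as small as you like, which is exactly the informal claim above; the formal statement follows.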
Fortunately, a much more precise and formal statement of this power of approximation is available to us in the Universal Approximation Theorem1:
A feedforward network with a linear output layer and at least one hidden layer with any “squashing” activation function (such as the sigmoid activation function) can approximate any Borel measurable function from one finite-dimensional space to another with any desired nonzero amount of error, provided that the network is given enough hidden units. The derivatives of the feedforward network can also approximate the derivatives of the function arbitrarily well.
Kurt Hornik, Maxwell Stinchcombe, Halbert White, Multilayer feedforward networks are universal approximators, Neural Networks, Volume 2, Issue 5, 1989, Pages 359-366, ISSN 0893-6080, https://doi.org/10.1016/0893-6080(89)90020-8. ↩︎