Lucy Chen
Artificial intelligence has existed since the 1950s and now underpins much of the technology in modern society. From monitoring Amazon transactions to driving autonomous cars, intelligent machines sit behind many of the products used today, and learning about AI is fast becoming essential.
One of the most recognisable systems of artificial intelligence is the artificial neural network (ANN), which is loosely modelled after the animal neocortex, a structure responsible for higher-level processes such as decision-making, episodic memory, and consciousness.[1] The neocortex is the outermost layer of the brain and has a thickness of approximately 2.5 mm (Figure 1). Its highly folded geometry maximises its surface area; as a result, it accounts for around 80% of the total weight of a human brain.[2]
Figure 1: Anatomical representation of the neocortex and other structures related to memory and perception.[3]
This outermost layer is further divided into a series of six horizontal sublayers, numbered I to VI. These sublayers are characterised by neuron morphology (mainly pyramidal, fusiform, and granular) and by their neuronal connections; for example, layers V and VI project to the thalamus, brainstem, and spinal cord, whereas layer IV receives the majority of synaptic connections from the thalamus and projects short-range axons to other cortical layers (Figure 2). Layers V and VI can therefore be considered responsible for output, whereas layer IV receives input.[4,5] The neocortex has a generally uniform structure across regions of the brain, with exceptions associated with the magnitude of input information received from those respective regions.[4] The architecture of the neocortex thus reflects its ability to recognise and distribute incoming sensory information, and subsequently to modulate responses with a high degree of sensitivity.
Figure 2: Graphical representation of the cortical sublayers with their inputs and outputs.
It is remarkable that a structure with such a complex neurological function has conserved uniformity in its fundamental organisation; it is arranged in vertical columns that have a diameter of roughly 0.5 mm and span all six sublayers.
This was first observed in 1957 by Vernon Mountcastle, a neurophysiologist at Johns Hopkins University, who hypothesised that the neocortex is composed of a basic repeating unit operating through a common functional principle.[6] This has since been heavily debated, with a lack of consensus on both anatomy and function.[7] However, for the purposes of discussing the links between the biological neocortex and the computational neural network, the cortical column theory is a useful way to visualise the similarities between them. Mountcastle later elaborated on this idea by suggesting the existence of sub-columns within columns, a proposal that initially received backlash owing to the lack of visible partitioning as evidence, but was later shown to be correct.[8]
The neocortex, then, contains repeating columns, which in turn consist of another, smaller basic unit. This idea would later become the central focus of futurist Ray Kurzweil’s postulation of the brain as a computer. Kurzweil contended that these sub-columns are pattern recognisers, laying the foundations for language and memory as sequences of patterns arranged in a specific hierarchy and used to perform basic processes, such as daily routines or writing letters. This would become known as the “pattern recognition theory of mind”.[8]
What is pattern recognition, and how is it performed in the neocortex?
The term “pattern recognition” is most commonly associated with statistical processing and machine learning, a subset of artificial intelligence that builds mathematical models from training data in order to make predictions about test data without being specifically programmed for that prediction.[8] Pattern recognition is the automatic identification of commonalities in data through the use of computer algorithms; these common features are then used to categorise the data and determine outputs. This is primarily done by locating “most likely” matches, or non-exact similarities, in the data, whilst also accounting for statistical variation.
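As a concrete illustration (not taken from the article itself), the following minimal Python sketch performs this kind of “most likely” matching: each class of training data is summarised by its mean, and a new sample is assigned to the nearest mean. The feature values and class labels are invented for the example.

```python
# Minimal "most likely match" recognition: assign a sample to the class whose
# centroid (mean of training examples) it is closest to. Toy data only.
import numpy as np

rng = np.random.default_rng(0)
# Two classes of 2-D feature vectors with some statistical variation.
class_a = rng.normal(loc=[1.0, 1.0], scale=0.2, size=(20, 2))
class_b = rng.normal(loc=[3.0, 0.5], scale=0.2, size=(20, 2))

centroids = {"A": class_a.mean(axis=0), "B": class_b.mean(axis=0)}

def recognise(sample):
    """Return the class whose centroid lies nearest to the sample."""
    return min(centroids, key=lambda label: np.linalg.norm(sample - centroids[label]))

print(recognise(np.array([1.1, 0.9])))  # expected: "A"
print(recognise(np.array([2.9, 0.6])))  # expected: "B"
```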
But how does this concept relate to the human brain? In his 2012 non-fiction book How to Create a Mind, Kurzweil discusses this connection by modelling the hundreds of neurons that make up a neocortical sub-column as a single module of inputs, weights, and outputs.[8] To illustrate how pattern recognisers may work in the neocortex, consider the example of the brain recognising the letter “A” in a series of different words.
Each pattern is composed of three parts. The first is the input: the main pattern broken down into a group of lower-level patterns. These are represented by signals carried by dendrites; as can be seen in Figure 3, signals sent into the module indicate the confirmed presence of a lower-level pattern, and signals sent out of the module via dendrites indicate the expected presence of a lower-level pattern.
The second part is the “name” of the pattern: the ability of one’s brain to conclude, “I just saw the letter A.” Biologically, this name is determined by the firing of the axon that emerges from the recognition module. Because action potentials are all-or-nothing events, occurring with constant amplitude once a specific threshold is exceeded, it is reasonable to compare the firing of an axon to a switch, or a Boolean variable taking the values “True” or “False”.
Figure 3: Representation of a pattern recognition module, one of the sub-columns contained within the cortical columns of the neocortex.
The final part of the pattern is the set of higher-level patterns it can feed into. For example, the set of higher-level patterns for the letter “A” consists of all words containing the letter “A” at least once. This can be taken further; all words are themselves patterns, which can then be fed into higher-level patterns of sentences, paragraphs, and so on. These levels of patterns are linked by dendrites. The organisation of the neurons that make up each module can be compared to a function with many-to-one mapping: one neuron receives many dendrites but outputs through only one axon, which can then transmit to multiple dendrites at higher levels, as shown in Figure 4.
This hierarchy is not established by physically stacking neurons on top of each other; instead, the concepts defined in the previous paragraph are biologically represented by the connections between modules – the synaptic interactions between dendrites and axons. There are several distinctive properties of these modules that affect how information is processed between them. They are redundant, which ensures that there is more than one pattern recogniser available to process the letter “A”, and that any of these multiple modules can fire the same output signal from their axons to higher levels.
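The three parts of a pattern described above, and the many-to-one links between levels, can be sketched as a simple data structure. The following Python fragment is a hypothetical illustration of that idea, not Kurzweil’s actual formalism; the stroke-level and word-level patterns are invented.

```python
# Hypothetical sketch of a pattern's three parts: its lower-level inputs,
# its "name", and the higher-level patterns it can feed into.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Pattern:
    name: str                               # the "name" signalled by the module's output axon
    inputs: list[str]                       # lower-level patterns expected as input (via dendrites)
    feeds_into: list[Pattern] = field(default_factory=list)  # higher-level patterns it can trigger

# Invented stroke-level patterns combine into the letter "A"; the letter in turn
# feeds into word-level patterns, giving a many-to-one mapping at each level.
letter_a = Pattern(name="A", inputs=["left stroke", "right stroke", "crossbar"])
apple = Pattern(name="APPLE", inputs=["A", "P", "P", "L", "E"])
cat = Pattern(name="CAT", inputs=["C", "A", "T"])
letter_a.feeds_into = [apple, cat]

print([higher.name for higher in letter_a.feeds_into])  # ['APPLE', 'CAT']
```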
This system works not only for processing language, but also for images and sound. Pattern recognisers are differentiated by the type of pattern they recognise and by their location in the visual and auditory cortices of the brain. The brain can therefore use this redundancy to its advantage by increasing the likelihood of correct recognition and enabling identification of a pattern’s variations; reading the letter “A”, saying it aloud, or hearing it spoken by someone else all constitute part of the total information on the letter, and are all encoded by a series of different patterns.[8]
Figure 4: Example of three redundant patterns for the letter “A”. Symbols in the first layer of rectangles represent the lower-level patterns that are combined to form the letter. The second layer of rectangles represents higher-level patterns that contain the letter “A” (words).
Once the lower-level pattern inputs have been received, how does the module recognise them? For example, the letters “R” and “P” share some lower-level patterns (Figure 5). How does the brain differentiate between them?
This is done by weighting each individual input; the higher the weight, the more important that input is to the recognition of the overall pattern. As mentioned earlier, the output axon has a firing threshold, to which the weights of all the inputs contribute. The module may still fire even if one of the inputs is missing; for example, if only half a letter is seen, some of the corresponding pattern may be absent but the overall letter may still be correctly recognised, a process known as auto-association.[7] However, if the missing input has a higher weight, the axon is less likely to fire.
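A minimal sketch of this weighted, threshold-based firing is shown below. The weights and threshold for the letter “A” are invented for illustration; the point is simply that a missing low-weight input can still allow the module to fire, whereas a missing high-weight input prevents it.

```python
# The module "fires" when the weighted evidence for the pattern reaches a threshold.
def module_fires(observed, weights, threshold):
    """Return True if the weighted sum of observed lower-level patterns reaches the threshold."""
    total = sum(weight for pattern, weight in weights.items() if pattern in observed)
    return total >= threshold

# Hypothetical lower-level patterns for "A", with unequal importance.
weights_a = {"left stroke": 0.4, "right stroke": 0.4, "crossbar": 0.2}
threshold_a = 0.7

print(module_fires({"left stroke", "right stroke", "crossbar"}, weights_a, threshold_a))  # True
print(module_fires({"left stroke", "right stroke"}, weights_a, threshold_a))              # True: low-weight input missing
print(module_fires({"left stroke", "crossbar"}, weights_a, threshold_a))                  # False: high-weight input missing
```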
Figure 5: Depiction of lower-level patterns in the letters “R” (left hand side) and “P” (right hand side) with each lower-level pattern highlighted in red. These patterns are only examples.
In addition, the size of each input matters. A good example is the difference between vowel sounds when spoken aloud: the “e” sound is longer in “meet” than in “met”, and the two vowel sounds also arise from different resonant frequencies. Although the letter “e” is the same, the two words are partly distinguished by the duration of that sound. This duration is what is meant by the size of the input, although the nature of the size parameter depends on the type of pattern being recognised; with sound there is a time dimension, whereas with printed letters there is a spatial dimension. Each input therefore encodes the pattern together with its observed size, which is then compared with the size parameters stored in the module.
The function of the pattern recognition module, therefore, is to calculate the probability that the overall pattern is present, based on differentially weighted input patterns and size parameters determined by prior experience. This biological process can be modelled using Hidden Markov Models (HMMs), in which patterns are recognised in a hierarchical manner. HMMs are now applied in various fields, such as reinforcement learning and bioinformatics, and form part of the speech recognition software behind Siri.[8]
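To make the probabilistic machinery concrete, the following sketch implements the forward algorithm for a small HMM in Python, computing the probability of an observation sequence under the model. The two-state model, its transition and emission probabilities, and the observation sequence are toy values chosen purely for illustration.

```python
# Forward algorithm for a toy two-state HMM: the probability of an observation
# sequence is accumulated one observation at a time.
import numpy as np

start = np.array([0.6, 0.4])          # initial state probabilities
trans = np.array([[0.7, 0.3],         # P(next state | current state)
                  [0.4, 0.6]])
emit = np.array([[0.9, 0.1],          # P(observation | state)
                 [0.2, 0.8]])

def sequence_probability(observations):
    """Return P(observations | model) via the forward recursion."""
    alpha = start * emit[:, observations[0]]
    for obs in observations[1:]:
        alpha = (alpha @ trans) * emit[:, obs]
    return alpha.sum()

print(sequence_probability([0, 0, 1]))  # likelihood of observing the sequence 0, 0, 1
```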
Single-layer perceptrons and mimicking the neocortex
In 1958, Frank Rosenblatt created the Mark 1 Perceptron, a machine modelled after the electrical behaviour of neurons and the first major predecessor of the ANNs that are widely used today.
The inputs, much as in the pattern recognition modules described above, were individual data values; for images, each input would be a single pixel, defined by a pair of coordinates. These inputs enter the first layer, known as the input layer, which is made up of a series of nodes (represented in Figure 6 by the coloured circles), where each node represents a single neuron. A node receives all inputs via weighted connections, represented by the arrows in Figure 6. To calculate the output of the node, the weighted sum of all the inputs is computed and a bias term is added to account for a possible mean shift. The node encapsulates an activation function, which converts the weighted sum into the output.
Figure 6: Layers of single-layer perceptron shown in different colours. The orange arrows represent one possible pathway through the feed-forward network.
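The computation performed by a single node can be sketched as follows. The step activation mirrors the all-or-nothing firing analogy used earlier; the weights, bias, and input values are arbitrary illustrative choices rather than parameters of the Mark 1 Perceptron.

```python
# One node: weighted sum of inputs, plus a bias, passed through a step activation.
import numpy as np

def node_output(inputs, weights, bias):
    """Weighted sum plus bias, converted to an output by a step activation."""
    weighted_sum = np.dot(weights, inputs) + bias
    return 1 if weighted_sum > 0 else 0   # step activation: fires or does not fire

inputs = np.array([0.5, 1.0, 0.0])
weights = np.array([0.8, -0.3, 0.5])
bias = -0.05
print(node_output(inputs, weights, bias))  # 1, since 0.4 - 0.3 + 0.0 - 0.05 = 0.05 > 0
```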
The outputs from each node of the input layer are then passed on to the next layer, where they act as the inputs to the next set of nodes; these inputs are combined by the propagation function, which essentially calculates another weighted sum. In the Mark 1 Perceptron there was only one hidden layer; later refinements of this design increased the number of hidden layers to better represent variations in the input data and thus generate better predictions.[10]
The final layer is the output layer. This overall architecture is known as a feedforward network, as the nodes and connections form a directed acyclic graph: information flows in one direction only. As can be seen, the parallels between networks of neurons and ANNs are abundant.
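Putting the layers together, the following sketch propagates an input vector through one hidden layer and an output layer. The layer sizes, random weights, and sigmoid activation are assumptions made for illustration, not a reconstruction of Rosenblatt’s machine.

```python
# A tiny feedforward pass: input layer -> hidden layer -> output layer, no cycles.
import numpy as np

rng = np.random.default_rng(1)
w_hidden = rng.normal(size=(4, 3))   # weights from 3 inputs to 4 hidden nodes
b_hidden = np.zeros(4)               # bias terms for the hidden layer
w_out = rng.normal(size=(2, 4))      # weights from 4 hidden nodes to 2 output nodes
b_out = np.zeros(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feedforward(x):
    """Propagate an input vector through the hidden and output layers."""
    hidden = sigmoid(w_hidden @ x + b_hidden)   # propagation: another weighted sum plus bias
    return sigmoid(w_out @ hidden + b_out)

print(feedforward(np.array([0.2, 0.7, 0.1])))   # two output activations
```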
Developing further: can computers simulate brains?
Mathematician John von Neumann described the brain as performing low-precision analog computations, which can be simulated through digital computation. The brain also exhibits plasticity, with changes at both the cellular and cortical levels corresponding to specific experiences, such as learning. An even greater degree of this flexibility exists in computers, whose software can simply be reprogrammed. Thus, in theory, a computer may be able to successfully emulate certain aspects of neocortical function.[7]
Brain-machine interfaces and new engineering innovations in healthcare further emphasise how closely linked the understanding of the brain and programmed software have become. Scientists are now able to transfer signals from the central nervous system to prosthetic limbs, and Elon Musk’s Neuralink is developing a chip intended to help treat brain injury and depression via electrode threads running from the skull to specific brain regions, as well as to allow patients to stream music directly to the brain.[11,12]
What is consciousness?
Can machines ever be conscious? Over the last few decades, many scientists have attempted to better understand the biological basis of human consciousness. Yet although the concept has been the focus of philosophical discussion for centuries, it has never been rigidly defined, owing to the intrinsic subjectiveness of the topic. How can a scientific representation of the brain be applied to the elusive internal experience of being oneself?
One notable example is Max Tegmark’s theory, wherein consciousness results from a system that stores and retrieves information, and involves underlying quantum mechanics. Tegmark hypothesises that consciousness can be conceptually defined as a separate state of matter from solids, liquids, or gases; a state that he refers to as “perceptronium”, an indivisible continuous substance that, in his words, feels “subjectively self aware”.[13]
However, Tegmark does not define what may cause consciousness. A general, but rather unsatisfying, answer could be that outputs from different regions of the brain are integrated into a single experience. A study by researchers at George Washington University suggested that the claustrum, a thin structure underneath the neocortex, may be responsible. Upon stimulation of this region with electrodes, consciousness was lost: the patient stopped whatever they were doing and stared blankly into space whilst remaining fully awake. When stimulation ended, the patient resumed activity, but could not recall the period spent under stimulation.[14] However, the study had several limitations; the sample size was restricted to a single patient, who had a pre-existing epileptic condition, and so it provides a poor representation of healthy brain function.
When discussing the implications of artificial intelligence gaining consciousness, however, and considering the similarities between the biological and digital computation of decisions, the parameters surrounding consciousness become even more elusive. Artificial consciousness has been notoriously difficult to define, with some scientists linking it to self-awareness and others postulating that it is a function of system-level adaptation.[15,16]
Scientists have proposed different criteria for defining intelligence, the most famous of which is the Turing test, created by mathematician Alan Turing in 1950. The aim of the test is to see whether the hypothetical intelligence exhibited by a machine can be distinguished from that of a human. In it, a human has a text conversation with a machine designed to respond in a human-like way. This conversation is monitored by a human evaluator, who sees the typed exchange but does not know which participant is the machine and which is the human. If the evaluator cannot reliably tell the difference, the machine passes the test and is considered to exhibit human-level intelligence.[7]
The discussion of machine consciousness is often traced to the American philosopher John Searle, who built upon the principles of the Turing test to create a thought experiment known as the “Chinese room” (Figure 7).
Figure 7: Visual representation of the “Chinese Room” thought experiment.[20]
Suppose a man is sitting in a room. He is given questions in Chinese and uses a rulebook to answer them. Even though he is able to answer them correctly, one could say that he has no true understanding of the language: he does not really understand the questions or the answers, because he is passively following the guidelines of the rulebook.
Now imagine this again with a computer: if the machine answers the questions accurately enough to pass the Turing test with a Chinese speaker, does it actually understand Chinese, or is it merely simulating the ability to understand it? A machine that satisfies the first description is called “strong AI”; one that satisfies the second is “weak AI”.[17] Therefore, even though the Turing test is a benchmark for machine intelligence, passing it is not proof that a machine has consciousness.
So which definition of consciousness is most applicable to the hypothesised artificial consciousness? Philosopher and cognitive scientist Ned Block provided a distinction between phenomenal and access consciousness, where phenomenal consciousness relates to the experience of being in a conscious mental state, whereas access consciousness refers to the availability of a mental state to be used by an organism to perform tasks that involve reasoning.[18] Therefore, access consciousness appears to be the best conceptual representative of artificial consciousness.
Technology is not currently advanced enough to build the strong AI that Searle described. However, as biology has informed AI in the past, a better neurological understanding of human consciousness may pave the way towards creating intelligent systems that surpass the biological limitations of the brain. Ray Kurzweil says it best: “Our technology, our machines, are part of our humanity. We created them to extend ourselves and that is what is unique about human beings.”[19]
Bibliography
[1] Strange, B. A., Menno P. Witter, Edward S. Lein, and Edvard I. Moser. “Functional organisation of the hippocampal longitudinal axis.” Nature Reviews Neuroscience 15 (2014): 655-669. doi:10.1038/nrn3785
[2] Mack, S., Eric R. Kandel, Thomas M. Jessell. Principles of Neural Science. 5th ed. New York: McGraw-Hill Professional, 2013. ISBN: 9780071390118
[3] Efe, L. “Where are memories stored in the brain?” Accessed July 20, 2020. https://qbi.uq.edu.au/brain-basics/memory/where-are-memories-stored
[4] Noback, C. R., David A. Ruggiero, Robert J. Demarest. The Human Nervous System: Structure and Function. 6th ed. New Jersey: Humana Press, 2007. ISBN: 1592597300
[5] Moerel, M., Federico De Martino, Elia Formisano. “An anatomical and functional topography of human auditory cortical areas.” Frontiers in Neuroscience 8, 225. (2014) doi:10.3389/fnins.2014.00225
[6] Mountcastle, V. B. “Modality and topographic properties of single neurons of cat’s somatic sensory cortex.” Journal of Neurophysiology 20, 4. (1957): 408-434. doi:10.1152/jn.1957.20.4.408
[7] Horton, J. C. and Daniel L. Adams. “The cortical column: a structure without a function.” Philosophical Transactions of the Royal Society B: Biological Sciences 360, 1456. (2005): 837-862. doi:10.1098/rstb.2005.1623
[8] Kurzweil, R. How to Create a Mind: The Secret of Human Thought Revealed. New York: Viking Penguin, 2012. ISBN: 978-0670025299
[9] Koza, J. R., Forrest H. Bennett III, David Andre, Martin A. Keane. Automated Design of both the Topology and Sizing of Analog Electrical Circuits using Genetic Programming. Berlin: Springer. https://link.springer.com/chapter/10.1007%2F978-94-009-0279-4_9
[10] Schmidhuber, J. “Deep learning in neural networks: an overview.” Neural Networks 61. (2015): 85-117. doi: 10.1016/j.neunet.2014.09.003
[11] Ramirez, V. B. “What would it mean for AI to become conscious?” March 26, 2019. https://singularityhub.com/2019/03/26/what-would-it-mean-for-ai-to-become-conscious/
[12] Eadicicco, L. “Elon Musk says there’s a chance his AI-brain-chip company will be putting implants in humans within a year.” May 7, 2020. https://www.businessinsider.com/elon-musk-neuralink-brain-chip-put-in-human-within-year-2020-5?r=US&IR=T
[13] Tegmark, M. “Consciousness as a state of matter.” Chaos, Solitons & Fractals 76. (2015): 238-270. doi:10.1016/j.chaos.2015.03.014
[14] Koubeissi, M. Z., Fabrice Bartolomei, Abdelrahman Beltagy, Fabienne Picard. “Electrical stimulation of a small brain area reversibly disrupts consciousness.” Epilepsy & Behavior 37. (2014): 32-35. doi:10.1016/j.yebeh.2014.05.027
[15] Chatila, R., Erwan Renaudo, Mihai Andries, Ricardo-Omar Chavez-Garcia, Pierre Luce-Vayrac, Raphael Gottstein, Rachid Alami, Aurelie Clodic, Sandra Devin, Benoit Girard, Mehdi Khamassi. “Toward Self-Aware Robots.” Frontiers in Robotics and AI 1. (2018). doi:10.3389/frobt.2018.00088
[16] Kinouchi, Y. and Kenneth James Mackin. “A Basic Architecture of an Autonomous Adaptive System with Conscious-Like Function for a Humanoid Robot.” Frontiers in Robotics and AI 1. (2018). doi:10.3389/frobt.2018.00030
[17] Hildt, E. “Artificial Intelligence: Does Consciousness Matter?” Frontiers in Psychology 10. (2019). doi:10.3389/fpsyg.2019.01535
[18] Block, N. “On a confusion about a function of consciousness.” Behavioral and Brain Sciences 18, 2. (1995): 227-247. doi:10.1017/S0140525X00038188
[19] Sahota, N. “Human 2.0 is coming faster than you think. Will you evolve with the times?” October 1, 2018. https://www.forbes.com/sites/cognitiveworld/2018/10/01/human-2-0-is-coming-faster-than-you-think-will-you-evolve-with-the-times/#5cadb27a4284
[20] Novella, S. “AI and the Chinese Room Argument.” October 23, 2015. https://theness.com/neurologicablog/index.php/ai-and-the-chinese-room-argument/