So, Linear Regression was pretty much it, as far as learning from data was concerned. Then in the late 1950s, a researcher named Frank Rosenblatt comes up with the perceptron as a computational model of a neuron in the human brain and shows how it can learn simple functions. It is what we would call today a binary linear classifier, where we are trying to find a single line that splits the data into two classes. A single layer of perceptrons would be the simplest possible feed-forward neural network. Inputs feed into the single-layer perceptron and a weighted sum is performed. This sum then passes through what we call today an activation function, which is just a mathematical function applied to the value now residing within that neuron. Remember though, at this point this is still just a linear classifier, so the activation function, which is linear in this case, just returns its input. Comparing the output to a threshold then determines which class each point belongs to. The errors are aggregated and used to adjust the weights in the sum, and the process repeats again and again until convergence.

If you are trying to come up with a simple model of something that learns a desired output from a given input distribution, you needn't look far, since our brains do this all day long, making sense of the world around us and all the signals our bodies receive. One of the fundamental units of the brain is the neuron, and neural networks are just groups of neurons connected together in different patterns, or architectures. A biological neuron has several components specialized in passing along electrical signals, which allow you and me to have thoughts, perform actions, and study the fascinating world of machine learning. Electrical signals from other neurons, such as sensory neurons in the retina of your eye, are propagated from neuron to neuron. The input signal is received at one end of the neuron, at the dendrites. These dendrites might collect electrical signal not just from one other neuron but possibly from several, and those signals all get summed together over windows in time, changing the electrical potential of the cell. A typical neuron has a resting electric potential of about negative 70 millivolts. As the input stimuli received at the dendrites increase, the potential eventually reaches a threshold around negative 55 millivolts, at which point a rapid depolarization of the axon occurs, with a bunch of voltage-gated channels opening and allowing a sudden flow of ions. This causes the neuron to fire an action potential of electric current along the axon, aided by the myelin sheath for better transmission, toward the axon terminals. Here, neurotransmitters are released at synapses and travel across the synaptic cleft, usually to the dendrites of other neurons. Some of the neurotransmitters are excitatory, raising the potential of the next cell, while some are inhibitory and lower it. The neuron repolarizes to an even lower potential than resting for a refractory period, and then the process continues in the next neuron, until maybe it eventually reaches a motor neuron and moves your hand to shield your eyes from the sun.

So, what does all this biology and neuroscience have to do with machine learning? Look familiar? This is a single-layer perceptron. It too, just like the neuron, has inputs, which it then multiplies by weights and sums all together.
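As a side note, here is a minimal sketch of the learning loop just described: a weighted sum of the inputs, a step activation standing in for the threshold comparison covered next, and error-driven weight updates repeated until convergence. This is my own illustration, not code from the lecture; names like `train_perceptron` and `step`, the learning rate, and the OR-function training data are all assumptions for the example.

```python
# A minimal perceptron sketch: weighted sum, step activation, and the
# classic error-driven weight update, repeated until every training
# point is classified correctly.
import random

def step(z):
    # All-or-none activation: fire (1) if the weighted sum reaches the
    # threshold of zero, otherwise stay silent (0).
    return 1 if z >= 0 else 0

def train_perceptron(data, lr=0.1, max_epochs=100):
    # data: list of (inputs, target) pairs; a bias weight is folded in
    # by appending a constant 1 to every input vector.
    n = len(data[0][0])
    weights = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]
    for _ in range(max_epochs):
        errors = 0
        for x, target in data:
            x = list(x) + [1.0]                             # bias input
            z = sum(w * xi for w, xi in zip(weights, x))    # weighted sum
            error = target - step(z)                        # +1, 0, or -1
            if error != 0:
                errors += 1
                # Nudge each weight in the direction that reduces the error.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
        if errors == 0:                                     # converged
            break
    return weights

# The OR function is linearly separable, so the perceptron finds a
# separating line after a few passes over the data.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(or_data)
print([step(sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))) for x, _ in or_data])
```

On linearly separable data like OR, this update rule is guaranteed to converge; that guarantee is exactly what breaks down for XOR later in this section.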
The value here is now compared with the threshold and then transformed by an activation function. For instance, if the sum is greater than or equal to zero, then activate and pass a value of one; otherwise, don't activate and pass a value of zero. The inputs and weights act like the neurotransmitters in a neuron, where some can be positive and add to the sum, and some can be negative and subtract from the sum. The unit step function acts as an all-or-none threshold: if the threshold is met, pass the signal; otherwise, don't pass anything. Finally, there is an output, and just like with biological neurons, this can actually pass as input to other neurons in a multi-layer perceptron, which we'll talk about next. This is all very cool; however, it turns out that there are very simple functions that it can't learn. For example, the XOR function, where no single line can separate the two classes (a short sketch at the end of this section shows why). Marvin Minsky, a famous computer scientist at MIT, pointed this out, and then no one wanted to fund AI for about 15 years. This was not the last time neural networks hit a brick wall and were essentially forgotten for a while.

What component of a biological neuron is analogous to the input portion of a perceptron? The correct answer is the dendrites. They receive stimuli from other neurons, just like the inputs of a perceptron do. The axon is incorrect, since that is more analogous to the output of a perceptron. The nucleus is incorrect, since that is where the cell's genetic material is stored and where the cell's activities are controlled. The myelin sheath is incorrect, since it helps transmission along the axon, which is, once again, on the output side of the perceptron.
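To back up the XOR claim, here is a second small sketch, again my own and not from the lecture: it brute-forces a grid of candidate weights for a single step-activated unit and shows that every one of them misclassifies at least one of the four XOR points. The helper names `step` and `errors_on_xor`, and the grid resolution, are assumptions for the example.

```python
# Why XOR breaks a single perceptron: with a step activation there is no
# line w1*x1 + w2*x2 + b that puts (0,1) and (1,0) on one side and
# (0,0) and (1,1) on the other, so the error count never reaches zero.
def step(z):
    return 1 if z >= 0 else 0

def errors_on_xor(w1, w2, b):
    xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    return sum(step(w1 * x1 + w2 * x2 + b) != t for (x1, x2), t in xor)

# Brute-force a grid of candidate weights: every single-line classifier
# gets at least one of the four XOR points wrong.
grid = [i / 4 for i in range(-8, 9)]
best = min(errors_on_xor(w1, w2, b) for w1 in grid for w2 in grid for b in grid)
print(best)  # 1 -- never 0, because XOR is not linearly separable
```

A coarse grid is only there to illustrate the point; the underlying reason is that XOR's positive and negative examples sit on opposite diagonals of the unit square, so no single line can separate them, which is why a second layer of neurons is needed.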