Artificial Intelligence for Breast Cancer Detection: History of Artificial Intelligence.

Here are the topics we are going to cover in this lesson. We will introduce the definitions of artificial intelligence used in the context of this course, such as the neural network, the perceptron, and the deep neural network. We will discuss the applications of artificial intelligence and the products available commercially. The deep neural network has received a great deal of attention in recent years because of its successful applications; we will also discuss its shortcomings and the proposed solutions.

Let's define the terms artificial intelligence, machine learning, and deep learning. Artificial intelligence refers to machines that mimic human cognitive functions, such as learning and problem solving. Machine learning is the study of computer algorithms with a specific processing paradigm. Deep learning involves deep neural networks and is a specific approach to machine learning, while machine learning itself is a specific approach to AI. However, in this course, machine learning and AI are used interchangeably.

The field of artificial intelligence was established more than 60 years ago. It includes researchers from various disciplines, such as biology, mathematics, psychology, economics, and political science. Of course, computer science and hardware engineering are the core of AI technology. Humans are very good at tasks such as decision making, image exploitation, and playing games. The challenge, or the question, is: can we model the human brain or human intelligence?

The brain is a mass of interconnected neurons: many neurons connect into each neuron, and each neuron connects to many neurons. The processing capability of the brain is a function of these connections. A connectionist model emulates this brain structure, and artificial neural networks are the most commonly used connectionist model. This slide compares neural networks with the von Neumann machine. A von Neumann machine is composed of two elements, the CPU and the memory: processing is performed in the CPU, while the data and the programs reside in memory. Neural networks have a completely different architecture: the program is the set of connections between the neural units, and memory is created by modifying the strength of the connections between these units.

Let's review the unit of the brain, the neuron, as shown in the figure on the right. Upon the reception of chemicals at the dendrites, electrical signals are generated. The signals travel across the neuron until they reach the axon terminals. If the strength of the signal is larger than a threshold, the neuron fires, and chemicals are released via the axon to other neurons. Note that there is only one axon per neuron.

In 1958, Frank Rosenblatt built the perceptron to emulate the neuron. Its representation is shown in the diagram on the upper right-hand side. A number of inputs are combined linearly to represent the strength of the incoming signal. If the combined input exceeds a threshold, the perceptron fires, like a neuron in the brain. It can also be represented as a function, as shown in the lower right-hand side of this slide. As you may have noticed, the equation is the representation of a line in the feature space: the x's are the features and the w's are the weights of the connections. The perceptron can be used for supervised learning of binary classifiers.
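To make the perceptron concrete, here is a minimal sketch in Python of the thresholded weighted sum just described, together with the classic perceptron weight-update rule for supervised learning of a binary classifier; the feature values, weights, and learning rate are illustrative assumptions, not values taken from the slides.

import numpy as np

def perceptron_predict(x, w, b):
    # Fire (output 1) if the weighted sum of the inputs exceeds the threshold, else output 0.
    return 1 if np.dot(w, x) + b > 0 else 0

def perceptron_update(x, y_true, w, b, lr=0.1):
    # Classic perceptron learning rule: adjust the weights only when a sample is misclassified.
    y_pred = perceptron_predict(x, w, b)
    error = y_true - y_pred               # 0 when correct, +1 or -1 when wrong
    w = w + lr * error * np.asarray(x)    # move the decision boundary toward the sample
    b = b + lr * error
    return w, b

# Hypothetical two-feature training samples labeled 1 or 0
samples = [([1.0, 2.0], 1), ([2.0, 0.5], 0), ([0.5, 3.0], 1), ([3.0, 1.0], 0)]
w, b = np.zeros(2), 0.0
for _ in range(20):                       # a few passes over the training data
    for x, y in samples:
        w, b = perceptron_update(x, y, w, b)
print(w, b, perceptron_predict([1.0, 2.5], w, b))

Each misclassified sample nudges the weights, which corresponds to the decision-boundary update described next.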
Predictions are made by combining a set of weights with the feature vector. The figure on the left shows the feature distribution of two output classes, class 1 and class 2. Class 1 is shown as red dots and class 2 as blue dots. The green line is the decision boundary between these two classes. However, as more training samples are introduced, the decision boundary is updated; that is, the weights are updated to achieve the optimal classification result, shown as the black line.

When the feature distribution becomes more complicated, a single linear boundary is not enough to delineate the distribution of the features. In this case, we need to employ a multilayer perceptron. This is analogous to the human brain, which is composed of many neurons in order to meet its task requirements. In a multilayer perceptron, more nodes mean more decision boundaries. In theory, a single additional layer of perceptrons is sufficient to act as a universal approximator, able to model any function to arbitrary precision.

Let's define the depth of a network. It is the length of the longest path from a source node to a sink node, as shown in the figure on the right. For a feed-forward neural network, in which processing flows from the input layer to the output layer without any backward loops, the depth of the network is the number of hidden layers plus one. A deep neural network refers to a network whose depth is greater than two. Of course, a deeper network can model the data better, because the extra layers help the network learn. On the other hand, the deeper the network, the more weights need to be determined, and the more training data and training time are needed.

Recent advances that overcome these challenges are due to the following three facts: (1) more data are available, made possible by Internet technology and big-data storage and retrieval; (2) various deep neural network architectures have been developed, such as the convolutional neural network and long short-term memory networks; and (3) computer hardware such as GPUs makes large-scale deep neural network training affordable in terms of both time and cost.

Here is a list of successful applications of deep neural networks: health care, speech recognition, natural language processing, image recognition, video object tracking, and even financial services. We now use the deep neural network commercial products listed here in our daily lives.

Even with the successful applications mentioned on the previous slide, the deep neural network needs to be improved in several areas. As we mentioned before, a deep neural network requires lots of labeled data for training, which is challenging for many applications. Because of its complexity, it is often viewed as a black box by its users: a deep neural network can produce confidently correct classifications, but it may also generate unexplainable misclassifications. Other undesirable attributes are that a deep neural network is vulnerable to adversarial attacks and data poisoning, and it is not robust against input data with minor random noise added to it.

To alleviate the problems mentioned above, it is proposed to take a hybrid approach to object classification: combining various deep neural network architectures with multiple modeling and Bayesian inferencing techniques to provide explainable AI, and establishing robust data modeling to avoid problematic behaviors, hacks, and deceptions. In addition, instead of providing a simple binary decision, a classifier should generate outputs with confidence scores.
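As a simple sketch of that last point, the following Python snippet shows one common way to turn a classifier's raw class scores into confidence scores, using a softmax; the class names and score values are hypothetical and are not taken from the lesson.

import numpy as np

def softmax(logits):
    # Convert raw class scores into confidence scores that are positive and sum to 1.
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                 # subtract the maximum for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

# Hypothetical raw outputs from a two-class classifier (e.g., class 1 vs. class 2)
logits = [2.1, 0.3]
confidences = softmax(logits)
print({"class 1": round(float(confidences[0]), 3),
       "class 2": round(float(confidences[1]), 3)})
# Prints roughly {'class 1': 0.858, 'class 2': 0.142} instead of a hard 0/1 decision.

Reporting scores like these lets a downstream user weigh how much trust to place in each prediction.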
To sum up, we introduced artificial intelligence and clarified the definitions of artificial intelligence, machine learning, and the deep neural network. The concept and architecture of the perceptron led to today's deep neural networks. Furthermore, we have discussed the challenges of the deep neural network and potential solutions. The references relevant to the content of this lesson are listed here. This concludes the lesson on the history of artificial intelligence.