5.5 Deep Learning
TensorFlow 2 (Keras):
- Deep Learning with Python by François Chollet, Manning Publications, 2nd Edition, 2022. Code for Chollet's book can be found at: https://github.com/fchollet/deep-learning-with-python-notebooks
- Keras site: https://keras.io/guides/ has up-to-date guides which cover the latest version (3) of Keras.
PyTorch:
- Machine Learning with PyTorch and Scikit-Learn by Sebastian Raschka, Yuxi (Hayden) Liu, and Vahid Mirjalili, Packt Publishing Ltd, 2022.
- A PyTorch version of the book will appear soon.
General References for this course:
- Deep Learning by Ian Goodfellow, Yoshua Bengio & Aaron Courville, MIT Press, 2016; available online and on the course site.
- Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 3rd Edition, by Aurélien Géron, O'Reilly, 2022; one of the best books on Machine Learning.
- Hands-On Machine Learning with Scikit-Learn and PyTorch, by Aurélien Géron, O'Reilly, 2025, is expected by the end of this calendar year.
- Machine Learning by Andrew Ng of Stanford University, available on Coursera.
5.5.1 9/5 Introduction to Neural Networks and Deep Learning
We briefly review several key examples from physiology and neurology that motivate basic concepts and patterns in Neural Networks (NNs), and we introduce a basic model of neural network loss function optimization.
5.5.2 9/12 Gradient Descent and Backpropagation
We introduce gradient descent, automatic differentiation, and backpropagation, the algorithms that make large-scale NNs feasible.
Derivatives: for example, the derivative of w^2 with respect to w is 2 * w.
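As a minimal sketch of automatic differentiation (assuming TensorFlow 2; the starting value and learning rate are illustrative), TensorFlow's GradientTape recovers this derivative numerically and applies one gradient-descent step:

    import tensorflow as tf

    w = tf.Variable(3.0)                 # trainable parameter
    with tf.GradientTape() as tape:
        loss = w ** 2                    # simple quadratic loss
    grad = tape.gradient(loss, w)        # d(w^2)/dw = 2w, so 6.0 here
    w.assign_sub(0.1 * grad)             # one gradient-descent step: w -> 2.4
    print(grad.numpy(), w.numpy())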
5.5.3 9/19 Keras
We introduce Keras, the development framework for Deep Learning networks, with its key elements: loss functions, regularization, Sequential and Functional models, and others. We present detailed architectures and patterns for building models in Keras. Initially we discuss fully connected, or "dense", networks.
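As a hedged illustration (the layer sizes and the flattened 784-dimensional input are assumptions, not course material), a minimal fully connected classifier in the Keras Sequential style looks like this:

    from tensorflow import keras
    from tensorflow.keras import layers

    # A small fully connected ("dense") 10-class classifier
    model = keras.Sequential([
        layers.Input(shape=(784,)),            # flattened 28x28 input
        layers.Dense(256, activation="relu"),
        layers.Dense(10, activation="softmax")
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()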
5.5.4 9/26 Convolutional Neural Networks (CNNs)
Starting from physiological models and an analysis of the computational efficiency of fully connected networks, we introduce CNNs. We also learn how to determine the number of trainable parameters of various layers in CNN models.
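For a Conv2D layer, the trainable parameter count is (kernel_height * kernel_width * input_channels + 1) * filters, where the +1 accounts for the bias. A small sketch with illustrative shapes (assuming TensorFlow 2/Keras):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),  # (3*3*1 + 1)*32 = 320 params
        layers.MaxPooling2D((2, 2)),                   # pooling has no parameters
        layers.Conv2D(64, (3, 3), activation="relu"),  # (3*3*32 + 1)*64 = 18,496 params
        layers.Flatten(),
        layers.Dense(10, activation="softmax")
    ])
    model.summary()  # prints per-layer trainable parameter counts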
5.5.5 10/3 Visualizing Feature Maps of CNN Layers, Locating Objects in Images
We learn how to monitor the evolution of a layer's (and the whole network's) parameters as the NN evolves through optimization, and we observe the distributions of values of those parameters in optimized NNs. Those observations provide deep insight into the nature of NN data processing.
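One common pattern for inspecting feature maps (a sketch; `model` stands for some trained Keras CNN, and the input shape is an assumption) is to build an auxiliary model that exposes the intermediate activations:

    import numpy as np
    from tensorflow import keras

    # Assumes `model` is a trained Keras CNN built from an Input layer.
    conv_outputs = [layer.output for layer in model.layers
                    if "conv" in layer.name]
    activation_model = keras.Model(inputs=model.inputs, outputs=conv_outputs)

    x = np.random.rand(1, 28, 28, 1).astype("float32")  # placeholder input
    feature_maps = activation_model.predict(x)
    for fmap in feature_maps:
        print(fmap.shape)  # e.g. (1, height, width, n_filters)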
5.5.6 10/10 Transfer Learning, Fine Tuning, Augmentation
Transfer Learning, Fine Tuning, and Data Augmentation are the three most frequently used techniques for training networks on restricted data sets and for increasing the precision of Deep Learning analysis.
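A minimal transfer-learning sketch (assuming the Keras Applications VGG16 model and a two-class target task; the input size and classification head are illustrative): freeze the pretrained convolutional base and train only a new head, then optionally unfreeze for fine tuning.

    from tensorflow import keras
    from tensorflow.keras import layers

    base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                    input_shape=(180, 180, 3))
    base.trainable = False                      # freeze the pretrained base

    model = keras.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(2, activation="softmax")   # new head for the target task
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # Fine tuning: after the head converges, set base.trainable = True and
    # re-compile with a much lower learning rate before continuing training.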
5.5.7 10/17 Autoencoders, Variational Autoencoders and Manifold Hypothesis
We learn that NNs behave as if they are searching for a minimal representation of any object. We "discover" embedded vector representations of words in texts. We also study VAEs, an extension of autoencoders which allows the generation of "higher quality" images and other objects.
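A compact autoencoder sketch (the 784-dimensional input and 32-dimensional bottleneck are assumptions): the network learns to reconstruct its input through a low-dimensional bottleneck, which serves as the minimal representation.

    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(784,))
    encoded = layers.Dense(32, activation="relu")(inputs)     # bottleneck code
    decoded = layers.Dense(784, activation="sigmoid")(encoded)
    autoencoder = keras.Model(inputs, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    # Train with inputs as their own targets: autoencoder.fit(x, x, epochs=...)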
5.5.8 10/24 Natural Language Processing (NLP), Doc2Vec-like APIs, and Large Language Models (LLMs)
Deep Learning (DL) and LLMs provide sophisticated tools for the analysis and representation of text. We introduce and demonstrate the most important applications of DL techniques in NLP.
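As one illustration of a Doc2Vec-style API (a sketch using the gensim library; the toy corpus is invented), documents are mapped to fixed-length embedded vectors:

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    # Toy corpus; real applications use large document collections.
    corpus = [
        TaggedDocument(words="deep learning processes text".split(), tags=[0]),
        TaggedDocument(words="neural networks learn representations".split(), tags=[1]),
    ]
    model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)
    vector = model.infer_vector("networks represent documents".split())
    print(vector.shape)  # (50,) -- the embedded vector of an unseen document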
5.5.9 10/31 Analysis and Transcription of Speech
We learn how to encode human speech and represent it as a time series of codes. Subsequently, we learn how to use such representations to train DL networks to transcribe speech into text.
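One standard encoding of this kind is the MFCC representation (a sketch using the librosa library; "speech.wav" is a placeholder path):

    import librosa

    y, sr = librosa.load("speech.wav", sr=16000)        # waveform and sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)
    print(mfcc.shape)  # a time series of 13-dimensional codes, one per frame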
5.5.10 11/7 Sequence Analysis, Seq2Seq Models and Machine Translation
Using LSTMs, we demonstrate the ability to predict the next value in a sequence, to perform machine (automated) translation of texts from one language to another, and to carry out other sequence-related tasks.
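A minimal next-value prediction sketch (the sine-wave data and window length are invented for illustration; assuming TensorFlow 2/Keras):

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    # Predict the next value of a sine wave from the previous 20 values.
    series = np.sin(np.arange(0, 100, 0.1))
    window = 20
    X = np.array([series[i:i + window] for i in range(len(series) - window)])[..., None]
    y = series[window:]

    model = keras.Sequential([
        layers.Input(shape=(window, 1)),
        layers.LSTM(32),
        layers.Dense(1)                      # the predicted next value
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=5, verbose=0)
    print(model.predict(X[:1], verbose=0))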
5.5.11 11/14 Transformers
Transformer technology is at the core of LLMs. We will demonstrate the capacity of transformer-based systems (LLMs) to generate sensible text and perform many other linguistic and graphical tasks.
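At the heart of every transformer is scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. A plain NumPy sketch for a single head (toy sizes; batching and masking omitted):

    import numpy as np

    def attention(Q, K, V):
        """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)             # query-key similarities
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w = w / w.sum(axis=-1, keepdims=True)       # row-wise softmax
        return w @ V                                # weighted mix of the values

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, dim 8
    print(attention(Q, K, V).shape)  # (4, 8)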
5.5.12 11/21 Large Language Models (LLMs)
Large Language Models (LLMs) are redefining artificial intelligence. We will review the basic architecture and capabilities of several flavors of LLMs.
5.5.13 12/5 Generative Adversarial Networks (GANs)
GANs are special networks that act as generators of objects such as speech, text, and images.
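A minimal sketch of the two competing networks (all sizes illustrative; the adversarial training loop is omitted; assuming TensorFlow 2/Keras):

    from tensorflow import keras
    from tensorflow.keras import layers

    latent_dim = 64  # size of the random noise vector (illustrative)

    # Generator: maps noise to a fake object (here, a flattened 28x28 image).
    generator = keras.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(784, activation="sigmoid")
    ])

    # Discriminator: classifies inputs as real (1) or generated (0).
    discriminator = keras.Sequential([
        layers.Input(shape=(784,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(1, activation="sigmoid")
    ])
    # Training alternates: the discriminator learns to tell real from fake,
    # while the generator learns to fool the discriminator.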