This course is a hands-on introduction to modern neural network ("deep learning") tools and methods. The course will cover the fundamentals of neural networks and introduce standard and new architectures, from simple feedforward networks to recurrent neural networks. We will cover stochastic gradient descent and backpropagation, along with related fitting techniques.
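As a small taste of the fitting techniques mentioned above, here is a minimal sketch of stochastic gradient descent for a one-variable linear model. The synthetic data, learning rate, and epoch count are illustrative choices, not taken from the course materials:

```python
import numpy as np

# Synthetic data for y = 3*x + 1 plus a little noise (illustrative values).
rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + 1.0 + 0.1 * rng.normal(size=100)

w, b, lr = 0.0, 0.0, 0.05
for epoch in range(20):
    for i in rng.permutation(100):        # visit examples in random order
        err = (w * X[i] + b) - y[i]       # gradient of 0.5*err^2 w.r.t. the prediction
        w -= lr * err * X[i]              # chain rule: d(loss)/dw = err * x
        b -= lr * err                     # d(loss)/db = err

print(f"w is roughly {w:.2f}, b is roughly {b:.2f}")
```

Each update nudges the parameters opposite the gradient of the squared error on a single example; after a few passes the estimates settle near the true slope and intercept.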
The course will have a particular emphasis on using these technologies in practice, via modern toolkits. Specifically, we will introduce (1) Keras (together with TensorFlow) and (2) PyTorch, which are illustrative of static and dynamic network implementations, respectively. Applications of these models to various types of data will be reviewed, including images and text. This iteration will lean a bit toward the latter, reflecting the instructor's interests.
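To give a flavor of the "dynamic" side of that contrast: PyTorch records the computation graph as the forward pass actually executes, so ordinary Python control flow can shape the graph, and gradients are then obtained by backpropagating through whatever was recorded. A minimal sketch (the particular function and input value are illustrative):

```python
import torch

# The graph for y = x^2 + 3x is built on the fly as these ops run.
x = torch.tensor(2.0, requires_grad=True)
y = x * x + 3 * x

y.backward()     # backpropagate through the recorded graph
print(x.grad)    # dy/dx = 2x + 3, which is 7 at x = 2
```

In the static style (classic TensorFlow underneath Keras), the graph is instead defined up front and then executed, a distinction we will return to when comparing the two toolkits.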
We will use a collection of (mostly online) resources for references and readings. Many of the course readings are from the very nice (and quite recent) "Dive into Deep Learning" online book. Note that this book presents code in the mxnet framework, which we will not "officially" be covering. However, a key objective in this class is to provide sufficient familiarity with the methods and programming paradigms such that switching to new frameworks is no great obstacle. This is particularly important given the rapid pace of development in the deep learning toolkit space.
|5%||In-class exercises|
Prior exposure to machine learning is expected, and is enforced through a co-requisite with ML. Working knowledge of Python is required (or you must be willing to pick it up rapidly as we go). Familiarity with linear algebra, (basic) calculus, and probability will be assumed throughout.
Homeworks will consist of both written and programming components. The latter will be completed in Python, using the Keras/TensorFlow and PyTorch frameworks.
Late Policy. Homeworks that are one day late will be subject to a 20% penalty; two days incurs a 50% penalty. Homeworks more than two days late will not be accepted.
The mid-term will be given in class, and will test understanding of the core material presented in the course regarding the fundamentals of neural networks, including backpropagation, as well as the differences between, and applicability of, the architectures introduced.
A big component of this course will be your project, which will involve picking a particular dataset on which to implement, train, and evaluate neural models. Collaboration is allowed (in teams of up to 3). The project will be broken down into several graded deliverables, culminating in a report and a final in-class presentation to your peers.
Here is an outline of the project expectations, etc.
A commitment to the principles of academic integrity is essential to the mission of Northeastern University. The promotion of independent and original scholarship ensures that students derive the most from their educational experience and their pursuit of knowledge. Academic dishonesty violates the most fundamental values of an intellectual community and undermines the achievements of the entire University. For more information, please refer to the Academic Integrity Web page.
More specific to this class: It is fine to consult online resources for programming assignments (of course), but lifting a solution/implementation in its entirety is completely inappropriate. Moreover, you must list all sources (websites/URLs) consulted for every homework; failing to do so will constitute a violation of academic integrity.
|Meeting||Topic(s)||Readings||Things due||Lecture notes|
|1/7||Course aims, expectations, logistics; review of supervised learning / perceptron / intro to Colab||d2l: Introduction / Tensor arithmetic basics||Join the Piazza site!||Intro slide deck; Notes; Notebook|
|1/10||Linear Regression and Estimation via SGD||d2l: Linear regression||-||Notes; Notebook|
|1/14||Beyond Linear Models: The Multi-Layer Perceptron||d2l: MLPs||-||Notes; An aside on metrics; Notebook: metrics and evaluation; Notebook: MLPs/non-linear models|
|1/17||Abstractions: Layers and Computation Graphs||d2l: Layers and blocks||HW 1 Due!||Notes; Notebook: Computation graphs|
|1/21||No class (MLK Day)||-||-||-|
|1/24||Backpropagation I||d2l: Backprop; and a take from Colah's blog||-||Notes; Notebook|
|1/28||Backpropagation II||-||-||Notes; Notebook|
|1/31||Optimization wrap-up / Embeddings||d2l: Word embeddings||-||Notes; Notebook on optimizers; Notebook on regularizers (from TF docs)|
|2/4||Embeddings (wrap-up) / Convolutional Neural Networks (CNNs) I||d2l: CNNs||-||Notes on embeddings; Intro notes on ConvNets; Notebook: embeddings|
|2/7||Convolutional Neural Networks (CNNs) II||d2l: CNNs||HW 2 Due!||Notes; Notebook|
|2/11||Recurrent Neural Networks (RNNs) I||RNNs / The Unreasonable Effectiveness of RNNs||-||Notes; Notebook (modified from TF official docs)|
|2/14||Recurrent Neural Networks (RNNs) II||RNNs / Colah: Understanding LSTMs||-||Notes (from Jay!)|
|2/18||No class (Presidents' Day)||-||HW 3 Due! Now by midnight, 2/26!!!||-|
|2/21||Neural Sequence Tagging||-||-||-|
|3/11||Sequence-to-Sequence Models I||d2l: Encoder-Decoder (seq2seq)||-||-|
|3/14||Sequence-to-Sequence Models II / Attention||d2l: Attention||-||-|
|3/18||Summarization Models||-||HW 4 Due!||-|
|3/21||Auto-Encoders / Variational Auto-Encoders (VAEs)||Auto-encoders; Intuitive VAEs||-||-|
|3/25||Generative Adversarial Networks (GANs)||Original GAN paper (Goodfellow et al.)||-||-|
|3/28||Deep Reinforcement Learning I||Intro to deep RL||HW 5 Due!||-|
|4/1||Deep Reinforcement Learning II||Intro to deep RL||-||-|
|4/4||Advanced Topics (TBD) / built-in slack||-||-||-|
|4/8||Final project presentations/discussion I||-||-||-|
|4/11||Final project presentations/discussion II||-||-||-|