1.[PAPER: EVOLVED POLICY GRADIENTS]03:57
2.Evolution Strategies + Stochastic Gradient Descent04:31
3.Generalization Beyond Training Task Distribution07:05
4.More Information08:41
5.[PAPER: ADAPTED DEEP EMBEDDINGS]09:01
6.Inductive Transfer Learning09:11
7.MNIST11:13
8.Conclusion12:25
9.[PAPER: BAYESIAN MODEL-AGNOSTIC META-LEARNING]13:17
10.Model-Agnostic Meta-Learning (MAML)13:41
11.Gradient-based meta-learning + "Bayesian"14:15
12.Lightweight Laplace Approximation for Meta-Adaptation (LLAMA)14:47
13.Bayesian Model-Agnostic Meta-Learning (BMAML)15:34
14.Stein Variational Gradient Descent (SVGD)16:15
15.Bayesian Fast Adaptation (BFA)16:37
16.Bayesian Meta-Update with Chaser Loss17:03
17.Experiments17:55
18.[PAPER: PROBABILISTIC NEURAL PROGRAMMED NETWORKS FOR SCENE GENERATION]18:29
19.Scene generation problem18:53
20.Our Proposed Model20:34
21.Reusable neural operators21:02
22.Applying modules in the generation process22:28
23.Learning process22:42
24.Experiments: Unseen Object-Attribute Combinations22:59
25.Our poster23:17
26.[PAPER: NEURAL ORDINARY DIFFERENTIAL EQUATIONS]23:37
27.Background: ODE Solvers24:36
28.How to train an ODE net?27:54
29.Continuous-time Backpropagation28:23
30.Drop-in replacement for ResNets29:28
31.How deep are ODE-nets?30:05
32.Explicit Error Control31:24
33.Poisson Process Likelihoods33:01
34.Change of Variables33:27
35.Continuous Normalizing Flows34:24
36.A new generative density model35:02
37.Recap35:56
38.[PAPER: BIAS AND GENERALIZATION IN DEEP GENERATIVE MODELS]39:08
39.Success in Generative Modeling of Images39:24
40.Goal: Understanding Generalization39:47
41.Empirical Study of Generalization: Method40:28
42.Multiple Features42:24
43.Memorization vs. Generalization42:57
44.Different Setups, Similar Results43:24
45.Conclusion43:45
46.[PAPER: ROBUSTNESS OF CONDITIONAL GANS TO NOISY LABELS]44:06
47.Conditional GAN is sensitive to noise in labels44:56
48.Robust Conditional GAN (RCGAN) Architecture46:12
49.Minimizing the noisy divergence minimizes the true divergence46:44
50.RCGAN generates correct class (MNIST)47:53
51.RCGAN improves quality of samples (CIFAR-10)48:32
52.[PAPER: GENERATIVE NETWORKS WITH METRIC EMBEDDINGS]49:00
53.Mode Collapse in GAN49:15
54.Why would mode collapse happen?49:46
55.Our Approach50:51
56.How to construct the Gaussian mixture?51:02
57.Mode: A Geometric Structure51:07
58.Missing Modes51:58
59.Our Solution: Encourage Distance Preservation52:09
60.Theoretical Results52:32
61.Experimental Results52:56
62.[PAPER: LOSS SURFACES, MODE CONNECTIVITY, AND FAST ENSEMBLING OF DNNS]53:34
63.Loss Surfaces55:56
64.Fast Geometric Ensembles (FGE)56:26
65.Summary57:19
66.Sponsor slide (NETFLIX)57:42
67.[PAPER: HOW DOES BATCH NORMALIZATION HELP OPTIMIZATION?]57:53
68.How does BatchNorm help?59:56
69.A Closer Look at Internal Covariate Shift1:01:07
70.Playing with Internal Covariate Shift1:01:53
71.Different View of Internal Covariate Shift1:03:17
72.A Landscape View1:04:40
73.A Simple Experiment1:05:26
74.Delving Deeper1:06:35
75.Conclusions1:07:55
76.Q&A1:09:43
77.Neural Information Processing Systems Thanks All Our Sponsors1:09:49
78.[PAPER: TRAINING NEURAL NETWORKS USING FEATURES REPLAY]1:13:32
79.Motivation1:14:02
80.Problem Reformulation1:15:21
81.Features Replay1:16:25
82.Convergence Guarantee1:17:04
83.Experimental Results1:17:33
84.Thanks!1:18:18
85.[PAPER: STEP SIZE MATTERS IN DEEP LEARNING]1:18:30
86.Gradient Descent: Effect of Step Size Example1:18:55
87.Deep Linear Networks1:19:50
88.Nonlinear Networks1:22:59
89.[PAPER: NEURAL TANGENT KERNEL]1:23:38
90.What happens during training?1:24:35
91.In the infinite width limit1:25:26
92.Neural Networks - Kernel Methods1:26:30
93.What happens inside a very wide network? 1:27:50
94.Summary1:28:20
95.[PAPER: HIERARCHICAL GRAPH REPRESENTATION LEARNING VIA DIFFERENTIABLE POOLING]1:28:57
96.Motivation: ML for Graphs1:29:20
97.Graph Pooling1:29:52
98.Pooling for GNNs1:31:21
99.DiffPool Architecture1:31:58
100.Experimental results1:33:26
101.Thank you!1:33:52