1.[PAPER: ROBUST SUBSPACE ESTIMATION IN A STREAM]09:30
2.Problem Statement10:06
3.Previous Work11:43
4.Our Results12:33
5.High Level Approach13:02
6.Experiments: Synthetic Data14:30
7.Thanks!14:50
8.[PAPER: EFFICIENT NONMYOPIC BATCH ACTIVE SEARCH]15:23
9.What is active search15:56
10.The most straightforward policy: greedy16:50
11.How can we do better17:09
12.Sequential simulation18:00
13.Empirical results18:46
14.Batching is more efficient, but what are we compromising?18:59
15.Adaptivity gap19:27
16.Thanks for your attention!19:46
17.[PAPER: INTERACTIVE STRUCTURE LEARNING WITH STRUCTURAL QUERY-BY-COMMITTEE]20:02
18.The problem with traditional learning frameworks20:18
19.Learning with interaction21:02
20.Interactive structure learning21:54
21.Other results24:24
22.[PAPER: CONTOUR LOCATION VIA ENTROPY REDUCTION LEVERAGING MULTIPLE INFORMATION SOURCES]24:56
23.Contributions28:58
24.Poster AB#9930:00
25.[PAPER: SAMPLE-EFFICIENT REINFORCEMENT LEARNING WITH STOCHASTIC ENSEMBLE VALUE EXPANSION]30:23
26.Key Points30:54
27.Preliminaries: DQN & DDPG31:24
28.Preliminaries: Model Value Expansion32:40
29.Preliminaries: Model-Based RL33:32
30.Preliminaries: Takeaways34:25
31.STEVE: Motivation35:29
32.STEVE: Candidate Targets36:08
33.STEVE: Estimating Uncertainty36:31
34.STEVE: Computing Target37:50
35.Results39:44
36.Conclusion41:38
37.Q/A42:03
38.[PAPER: POLICY-CONDITIONED UNCERTAINTY SETS FOR ROBUST MARKOV DECISION PROCESSES]45:09
39.Why Robust MDPs?45:31
40.Problem46:15
41.Non-Rectangular Uncertainty Sets via Marginal Features47:21
42.Marginally-Constrained Robust MDPs48:08
43.Please visit49:16
44.[PAPER: LEARNING CONVEX BOUNDS FOR LINEAR QUADRATIC CONTROL POLICY SYNTHESIS]49:42
45.Learning to control49:59
46.Problem set-up50:18
47.Learning from data50:46
48.Convex upper bounds51:13
49.Convexification51:38
50.Performance52:25
51.Poster presentation53:07
52.[PAPER: MULTIPLE-STEP GREEDY POLICIES IN ONLINE AND APPROXIMATE REINFORCEMENT LEARNING]53:27
53.Motivation: Impressive Empirical Success53:47
54.Motivation: Despite the Impressive Empirical Success...54:06
55.Multiple-Step Greedy Policies54:39
56.1-Step Greedy Policies and Soft Updates55:50
57.Negative Result on Multiple-Step Greedy Policies56:25
58.How to Circumvent the Problem?56:55
59.Take Home Messages57:34
60.[PAPER: DEEP REINFORCEMENT LEARNING IN A HANDFUL OF TRIALS USING PROBABILISTIC DYNAMICS MODELS]58:10
61.How Long Does Learning Take?58:32
62.Can we speed this up?58:54
63.Model-Based Reinforcement Learning58:57
64.Deterministic Neural Nets as Models59:56
65.Probabilistic Neural Nets as Models1:00:47
66.Probabilistic Ensembles as Models1:00:58
67.Trajectory Sampling for State Propagation1:01:24
68.Experimental Results1:02:18
69.[PAPER: NON-DELUSIONAL Q-LEARNING AND VALUE ITERATION]1:03:26
70.Q-learning: The Promise and the Peril1:04:17
71.Delusional Bias1:04:48
72.What is Delusional Bias?1:05:55
73.Greedy Policy Limitations1:06:03
74.A Simple MDP1:06:17
75.Optimal Greedy Policy with Approximator1:07:09
76.The Problem: Delusion1:07:28
77.What Happened?1:08:23
78.Resolving Delusion1:08:52
79.Policy Commitments1:08:55
80.Algorithms (see paper for details)1:12:23
81.Results (Theoretical & Empirical)1:12:35
82.Theoretical Guarantees1:12:44
83.Experiments with PCVI & PCQL1:13:36
84.Summary1:14:33
85.Next Steps1:15:06
86.Questions?1:15:53
87.[PAPER: BILEVEL LEARNING OF THE GROUP LASSO STRUCTURE]1:21:21
88.Linear Regression and Group Sparsity1:21:39
89.Group Lasso1:22:05
90.Setting1:22:38
91.A Bilevel Programming Approach1:23:18
92.Approximate Bilevel Problem1:23:48
93.Contributions1:24:35
94.Numerical Experiment1:25:12
95.[PAPER: BINARY CLASSIFICATION FROM POSITIVE-CONFIDENCE DATA]1:26:04
96.Introduction1:26:22
97.Related Work1:27:03
98.Main Idea1:27:51
99.Summary of the Paper1:30:38
100.[PAPER: FULLY UNDERSTANDING THE HASHING TRICK]1:31:10
101.Recommendation and Classification1:31:31
102.Dimensionality Reduction1:32:13
103.The Hashing Trick - With High Prob.1:33:53
104.Tight Bounds - Formal Problem1:34:43
105.Tight Bounds - Our Result1:35:15
106.Empirical Analysis1:35:46
107.Questions1:36:06
108.[PAPER: SUPPORT RECOVERY FOR ORTHOGONAL MATCHING PURSUIT]1:36:19
109.Sparse Linear Regression (SLR)1:36:55
110.Setup and Goals1:37:50
111.Orthogonal Matching Pursuit1:38:39
112.Known results and our contribution1:38:51
113.A key idea1:39:50
114.Thank You!1:40:20
115.Thank you for watching our live stream of NeurIPS 20181:40:30