ICLR 2020 Submissions

No.	Title
1	Empirical Bayes Transductive Meta-Learning with Synthetic Gradients
2	Contextualized Sparse Representation with Rectified N-Gram Attention for Open-Domain Question Answering
3	Generalized Domain Adaptation with Covariate and Label Shift CO-ALignment
4	Quaternion Equivariant Capsule Networks for 3D Point Clouds
5	Pay Attention to Features, Transfer Learn faster CNNs
6	Differentiable Hebbian Consolidation for Continual Learning
7	Generative Hierarchical Models for Parts, Objects, and Scenes
8	Mixture Distributions for Scalable Bayesian Inference
9	Best feature performance in codeswitched hate speech texts
10	Geom-GCN: Geometric Graph Convolutional Networks
11	Smart Ternary Quantization
12	HIPPOCAMPAL NEURONAL REPRESENTATIONS IN CONTINUAL LEARNING
13	A GOODNESS OF FIT MEASURE FOR GENERATIVE NETWORKS
14	Gradients as Features for Deep Representation Learning
15	Deceptive Opponent Modeling with Proactive Network Interdiction for Stochastic Goal Recognition Control
16	Monotonic Multihead Attention
17	Massively Multilingual Sparse Word Representations
18	Attention over Phrases
19	Query-efficient Meta Attack to Deep Neural Networks
20	BREAKING CERTIFIED DEFENSES: SEMANTIC ADVERSARIAL EXAMPLES WITH SPOOFED ROBUSTNESS CERTIFICATES
21	Meta-Learning Initializations for Image Segmentation
22	Privacy-preserving Representation Learning by Disentanglement
23	Building Hierarchical Interpretations in Natural Language via Feature Interaction Detection
24	AN EXPONENTIAL LEARNING RATE SCHEDULE FOR BATCH NORMALIZED NETWORKS
25	End-to-end learning of energy-based representations for irregularly-sampled signals and images
26	Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation
27	How to 0wn the NAS in Your Spare Time
28	Generalized Zero-shot ICD Coding
29	EXACT ANALYSIS OF CURVATURE CORRECTED LEARNING DYNAMICS IN DEEP LINEAR NETWORKS
30	WEEGNET: a wavelet-based Convnet for Brain-computer interfaces
31	Meta Label Correction for Learning with Weak Supervision
32	Toward Controllable Text Content Manipulation
33	NAMSG: An Efficient Method for Training Neural Networks
34	Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning
35	The Shape of Data: Intrinsic Distance for Data Distributions
36	Measuring Numerical Common Sense: Is A Word Embedding Approach Effective?
37	Learning DNA folding patterns with Recurrent Neural Networks
38	Generative Adversarial Nets for Multiple Text Corpora
39	Understanding Generalization in Recurrent Neural Networks
40	Measure by Measure: Automatic Music Composition with Traditional Western Music Notation
41	Weakly-Supervised Trajectory Segmentation for Learning Reusable Skills
42	Learn Interpretable Word Embeddings Efficiently with von Mises-Fisher Distribution
43	Goten: GPU-Outsourcing Trusted Execution of Neural Network Training and Prediction
44	Limitations for Learning from Point Clouds
45	DOUBLE-HARD DEBIASING: TAILORING WORD EMBEDDINGS FOR GENDER BIAS MITIGATION
46	Conservative Uncertainty Estimation By Fitting Prior Networks
47	Re-Examining Linear Embeddings for High-dimensional Bayesian Optimization
48	ASYNCHRONOUS MULTI-AGENT GENERATIVE ADVERSARIAL IMITATION LEARNING
49	Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards
50	NORML: Nodal Optimization for Recurrent Meta-Learning
51	Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data
52	NAS-BENCH-1SHOT1: BENCHMARKING AND DISSECTING ONE-SHOT NEURAL ARCHITECTURE SEARCH
53	Defense against Adversarial Examples by Encoder-Assisted Search in the Latent Coding Space
54	Fuzzing-Based Hard-Label Black-Box Attacks Against Machine Learning Models
55	Conditional generation of molecules from disentangled representations
56	Dataset Distillation
57	Learning RNNs with Commutative State Transitions
58	XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings
59	LAVAE: Disentangling Location and Appearance
60	Sparse Skill Coding: Learning Behavioral Hierarchies with Sparse Codes
61	REFINING MONTE CARLO TREE SEARCH AGENTS BY MONTE CARLO TREE SEARCH
62	WHAT DATA IS USEFUL FOR MY DATA: TRANSFER LEARNING WITH A MIXTURE OF SELF-SUPERVISED EXPERTS
63	A Bilingual Generative Transformer for Semantic Sentence Embedding
64	Learning to Coordinate Manipulation Skills via Skill Behavior Diversification
65	DeepPCM: Predicting Protein-Ligand Binding using Unsupervised Learned Representations
66	Ternary MobileNets via Per-Layer Hybrid Filter Banks
67	Constant Curvature Graph Convolutional Networks
68	Variational Information Bottleneck for Unsupervised Clustering: Deep Gaussian Mixture Embedding
69	Combining graph and sequence information to learn protein representations
70	FINBERT: FINANCIAL SENTIMENT ANALYSIS WITH PRE-TRAINED LANGUAGE MODELS
71	Cancer homogeneity in single cell revealed by Bi-state model and Binary matrix factorization
72	Robust Subspace Recovery Layer for Unsupervised Anomaly Detection
73	Learning Nearly Decomposable Value Functions Via Communication Minimization
74	Batch Normalization is a Cause of Adversarial Vulnerability
75	Undersensitivity in Neural Reading Comprehension
76	Extreme Classification via Adversarial Softmax Approximation
77	IS THE LABEL TRUSTFUL: TRAINING BETTER DEEP LEARNING MODEL VIA UNCERTAINTY MINING NET
78	Information Geometry of Orthogonal Initializations and Training
79	Multi-Step Decentralized Domain Adaptation
80	Mixed Precision DNNs: All you need is a good parametrization
81	PROGRESSIVE LEARNING AND DISENTANGLEMENT OF HIERARCHICAL REPRESENTATIONS
82	Co-Attentive Equivariant Neural Networks: Focusing Equivariance On Transformations Co-Occurring in Data
83	Improving the Gating Mechanism of Recurrent Neural Networks
84	Learning to Transfer via Modelling Multi-level Task Dependency
85	Latent Variables on Spheres for Sampling and Inference
86	Deep Orientation Uncertainty Learning based on a Bingham Loss
87	Analyzing Privacy Loss in Updates of Natural Language Models
88	Learning from Positive and Unlabeled Data with Adversarial Training
89	Deep exploration by novelty-pursuit with maximum state entropy
90	Reconstructing continuous distributions of 3D protein structure from cryo-EM images
91	Deep Evidential Uncertainty
92	Tree-structured Attention Module for Image Classification
93	Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint
94	Better Knowledge Retention through Metric Learning
95	Winning the Lottery with Continuous Sparsification
96	Critical initialisation in continuous approximations of binary neural networks
97	Learning to Learn via Gradient Component Corrections
98	LEARNING DIFFICULT PERCEPTUAL TASKS WITH HODGKIN-HUXLEY NETWORKS
99	Filter redistribution templates for iteration-less convolutional model reduction
100	Universal Safeguarded Learned Convex Optimization with Guaranteed Convergence
101	A Gradient-Based Approach to Neural Networks Structure Learning
102	Sub-policy Adaptation for Hierarchical Reinforcement Learning
103	AdvCodec: Towards A Unified Framework for Adversarial Text Generation
104	PROVABLE BENEFITS OF DEEP HIERARCHICAL RL
105	Learning Latent State Spaces for Planning through Reward Prediction
106	Variational lower bounds on mutual information based on nonextensive statistical mechanics
107	Hope For The Best But Prepare For The Worst: Cautious Adaptation In RL Agents
108	Semi-Supervised Boosting via Self Labelling
109	Fractional Graph Convolutional Networks (FGCN) for Semi-Supervised Learning
110	Antifragile and Robust Heteroscedastic Bayesian Optimisation
111	Generalizing Reinforcement Learning to Unseen Actions
112	Provable Representation Learning for Imitation Learning via Bi-level Optimization
113	Episodic Reinforcement Learning with Associative Memory
114	Flexible and Efficient Long-Range Planning Through Curious Exploration
115	Learning to Prove Theorems by Learning to Generate Theorems
116	Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem
117	Common sense and Semantic-Guided Navigation via Language in Embodied Environments
118	Gradient-based training of Gaussian Mixture Models in High-Dimensional Spaces
119	Neural Phrase-to-Phrase Machine Translation
120	At Your Fingertips: Automatic Piano Fingering Detection
121	Energy-based models for atomic-resolution protein conformations
122	Federated Learning with Matched Averaging
123	Clustered Reinforcement Learning
124	Understanding the (Un)interpretability of Natural Image Distributions Using Generative Models
125	Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning
126	Efficient and Robust Asynchronous Federated Learning with Stragglers
127	Handwritten Amharic Character Recognition System Using Convolutional Neural Networks
128	Effects of Linguistic Labels on Learned Visual Representations in Convolutional Neural Networks: Labels matter!
129	Differentiable Programming for Physical Simulation
130	Fooling Pre-trained Language Models: An Evolutionary Approach to Generate Wrong Sentences with High Acceptability Score
131	Implicit Rugosity Regularization via Data Augmentation
132	A Mutual Information Maximization Perspective of Language Representation Learning
133	Goal-Conditioned Video Prediction
134	Accelerate DNN Inference By Inter-Operator Parallelization
135	Compression without Quantization
136	Geometry-Aware Visual Predictive Models of Intuitive Physics
137	Growing Up Together: Structured Exploration for Large Action Spaces
138	Adversarial Training with Voronoi Constraints
139	A Non-asymptotic comparison of SVRG and SGD: tradeoffs between compute and speed
140	RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
141	Towards Understanding the Spectral Bias of Deep Learning
142	Domain Adaptive Multiflow Networks
143	Unbiased Contrastive Divergence Algorithm for Training Energy-Based Latent Variable Models
144	Unsupervised Distillation of Syntactic Information from Contextualized Word Representations
145	Optimal Unsupervised Domain Translation
146	Multi-task Network Embedding with Adaptive Loss Weighting
147	Biologically Plausible Neural Networks via Evolutionary Dynamics and Dopaminergic Plasticity
148	ON SOLVING COOPERATIVE DECENTRALIZED MARL PROBLEMS WITH SPARSE REINFORCEMENTS
149	Continual Learning using the SHDL Framework with Skewed Replay Distributions
150	Semi-supervised Autoencoding Projective Dependency Parsing
151	Differentiable Reasoning over a Virtual Knowledge Base
152	Making Sense of Reinforcement Learning and Probabilistic Inference
153	Negative Sampling in Variational Autoencoders
154	Improved Training of Certifiably Robust Models
155	Unsupervised Generative 3D Shape Learning from Natural Images
156	Diagnosing the Environment Bias in Vision-and-Language Navigation
157	Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
158	Learning Mahalanobis Metric Spaces via Geometric Approximation Algorithms
159	Laconic Image Classification: Human vs. Machine Performance
160	Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks
161	Reinforcement Learning with Structured Hierarchical Grammar Representations of Actions
162	The Usual Suspects? Reassessing Blame for VAE Posterior Collapse
163	Dynamical System Embedding for Efficient Intrinsically Motivated Artificial Agents
164	BERT for Sequence-to-Sequence Multi-Label Text Classification
165	SCALABLE OBJECT-ORIENTED SEQUENTIAL GENERATIVE MODELS
166	Evaluations and Methods for Explanation through Robustness Analysis
167	Attributed Graph Learning with 2-D Graph Convolution
168	Stochastic Neural Physics Predictor
169	Neural tangent kernels, transportation mappings, and universal approximation
170	Pragmatic Evaluation of Adversarial Examples in Natural Language
171	Learning to Move with Affordance Maps
172	Towards Interpreting Deep Neural Networks via Understanding Layer Behaviors
173	Deep Learning For Symbolic Mathematics
174	Deep Interaction Processes for Time-Evolving Graphs
175	Differentiable learning of numerical rules in knowledge graphs
176	Consistency Regularization for Generative Adversarial Networks
177	On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
178	Lyceum: An efficient and scalable ecosystem for robot learning
179	SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
180	In-training Matrix Factorization for Parameter-frugal Neural Machine Translation
181	Benefits of Overparameterization in Single-Layer Latent Variable Generative Models
182	Implicit competitive regularization in GANs
183	Scale-Equivariant Steerable Networks
184	Extreme Language Model Compression with Optimal Subwords and Shared Projections
185	DeepSphere: a graph-based spherical CNN
186	Improved Training Techniques for Online Neural Machine Translation
187	GRASPEL: GRAPH SPECTRAL LEARNING AT SCALE
188	Overcoming Catastrophic Forgetting via Hessian-free Curvature Estimates
189	Score and Lyrics-Free Singing Voice Generation
190	Neural Video Encoding
191	Interactive Classification by Asking Informative Questions
192	Classification-Based Anomaly Detection for General Data
193	Mixture Density Networks Find Viewpoint the Dominant Factor for Accurate Spatial Offset Regression
194	Distributed Training Across the World
195	Unrestricted Adversarial Examples via Semantic Manipulation
196	Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
197	Closed loop deep Bayesian inversion: Uncertainty driven acquisition for fast MRI
198	OBJECT-ORIENTED REPRESENTATION OF 3D SCENES
199	Discriminative Particle Filter Reinforcement Learning for Complex Partial observations
200	Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories
201	State Alignment-based Imitation Learning
202	Reweighted Proximal Pruning for Large-Scale Language Representation
203	Neural Arithmetic Units
204	Lipschitz constant estimation for Neural Networks via sparse polynomial optimization
205	Random Bias Initialization Improving Binary Neural Network Training
206	Meta-RCNN: Meta Learning for Few-Shot Object Detection
207	Adversarially learned anomaly detection for time series data
208	HOW THE CHOICE OF ACTIVATION AFFECTS TRAINING OF OVERPARAMETRIZED NEURAL NETS
209	Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs
210	Deep Spike Decoder (DSD)
211	Isolating Latent Structure with Cross-population Variational Autoencoders
212	Learning Compact Embedding Layers via Differentiable Product Quantization
213	Accelerating First-Order Optimization Algorithms
214	Physics-Aware Flow Data Completion Using Neural Inpainting
215	Imagine That! Leveraging Emergent Affordances for Tool Synthesis in Reaching Tasks
216	Provable Filter Pruning for Efficient Neural Networks
217	ADAPTIVE GENERATION OF PROGRAMMING PUZZLES
218	Learning transitional skills with intrinsic motivation
219	Quantifying uncertainty with GAN-based priors
220	End to End Trainable Active Contours via Differentiable Rendering
221	Plan2Vec: Unsupervised Representation Learning by Latent Plans
222	Uncertainty-aware Variational-Recurrent Imputation Network for Clinical Time Series
223	Compositional Continual Language Learning
224	Out-of-Distribution Image Detection Using the Normalized Compression Distance
225	Discriminative Variational Autoencoder for Continual Learning with Generative Replay
226	Connectivity-constrained interactive annotations for panoptic segmentation
227	On learning visual odometry errors
228	Regularization Matters in Policy Optimization
229	Adaptive Online Planning for Continual Lifelong Learning
230	Measuring causal influence with back-to-back regression: the linear case
231	Regularizing Predictions via Class-wise Self-knowledge Distillation
232	Multi-source Multi-view Transfer Learning in Neural Topic Modeling with Pretrained Topic and Word Embeddings
233	Adversarial Lipschitz Regularization
234	Reasoning-Aware Graph Convolutional Network for Visual Question Answering
235	SGD Learns One-Layer Networks in WGANs
236	Localized Meta-Learning: A PAC-Bayes Analysis for Meta-Learning Beyond Global Prior
237	FNNP: Fast Neural Network Pruning Using Adaptive Batch Normalization
238	Adversarial Training and Provable Defenses: Bridging the Gap
239	Finding Deep Local Optima Using Network Pruning
240	Adversarial Training Generalizes Data-dependent Spectral Norm Regularization
241	Knowledge Transfer via Student-Teacher Collaboration
242	A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case
243	Weight-space symmetry in neural network loss landscapes revisited
244	Differentiable Bayesian Neural Network Inference for Data Streams
245	Efficient Transformer for Mobile Applications
246	Learning by shaking: Computing policy gradients by physical forward-propagation
247	Occlusion resistant learning of intuitive physics from videos
248	Quantum Graph Neural Networks
249	Statistical Verification of General Perturbations by Gaussian Smoothing
250	Localised Generative Flows
251	TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces
252	Robust Few-Shot Learning with Adversarially Queried Meta-Learners
253	Certifying Neural Network Audio Classifiers
254	Collaborative Training of Balanced Random Forests for Open Set Domain Adaptation
255	PAC-Bayesian Neural Network Bounds
256	Semi-Implicit Back Propagation
257	Mutual Information Gradient Estimation for Representation Learning
258	Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning
259	Iterative Deep Graph Learning for Graph Neural Networks
260	Mint: Matrix-Interleaving for Multi-Task Learning
261	Learning Cluster Structured Sparsity by Reweighting
262	Selfish Emergent Communication
263	Decoupling Adaptation from Modeling with Meta-Optimizers for Meta Learning
264	Imitation Learning of Robot Policies using Language, Vision and Motion
265	Improving Visual Relation Detection using Depth Maps
266	Semi-supervised Pose Estimation with Geometric Latent Representations
267	Identifying Weights and Architectures of Unknown ReLU Networks
268	Unsupervised Domain Adaptation through Self-Supervision
269	Improving Gradient Estimation in Evolutionary Strategies With Past Descent Directions
270	$\alpha^{\alpha}$-Rank: Scalable Multi-agent Evaluation through Evolution
271	Variable Complexity in the Univariate and Multivariate Structural Causal Model
272	Regularizing activations in neural networks via distribution matching with the Wasserstein metric
273	RefNet: Automatic Essay Scoring by Pairwise Comparison
274	Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
275	Mixed Precision Training With 8-bit Floating Point
276	An Empirical and Comparative Analysis of Data Valuation with Scalable Algorithms
277	Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling
278	Transferring Optimality Across Data Distributions via Homotopy Methods
279	Latent Normalizing Flows for Many-to-Many Cross Domain Mappings
280	Learning Multi-Agent Communication Through Structured Attentive Reasoning
281	Dynamic Model Pruning with Feedback
282	$\ell_1$ Adversarial Robustness Certificates: a Randomized Smoothing Approach
283	On the interaction between supervision and self-play in emergent communication
284	CNAS: Channel-Level Neural Architecture Search
285	FLAT MANIFOLD VAES
286	Slow Thinking Enables Task-Uncertain Lifelong and Sequential Few-Shot Learning
287	A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
288	Expected Information Maximization: Using the I-Projection for Mixture Density Estimation
289	Through the Lens of Neural Network: Analyzing Neural QA Models via Quantized Latent Representation
290	All Simulations Are Not Equal: Simulation Reweighing for Imperfect Information Games
291	Truth or backpropaganda? An empirical investigation of deep learning theory
292	Learning to Rank Learning Curves
293	Set Functions for Time Series
294	I love your chain mail! Making knights smile in a fantasy game world
295	Masked Translation Model
296	MissDeepCausal: causal inference from incomplete data using deep latent variable models
297	Variational Constrained Reinforcement Learning with Application to Planning at Roundabout
298	Efficient Deep Representation Learning by Adaptive Latent Space Sampling
299	Learning Functionally Decomposed Hierarchies for Continuous Navigation Tasks
300	Deep Audio Priors Emerge From Harmonic Convolutional Networks
301	Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks
302	On Understanding Knowledge Graph Representation
303	Encoding Musical Style with Transformer Autoencoders
304	Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning
305	Gauge Equivariant Spherical CNNs
306	INTERPRETING CNN PREDICTION THROUGH LAYER-WISE SELECTED DISCERNIBLE NEURONS
307	Preventing Imitation Learning with Adversarial Policy Ensembles
308	On the Anomalous Generalization of GANs
309	Improving Generalization in Meta Reinforcement Learning using Neural Objectives
310	A closer look at the approximation capabilities of neural networks
311	VIMPNN: A physics informed neural network for estimating potential energies of out-of-equilibrium systems
312	SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
313	Resolving Lexical Ambiguity in English–Japanese Neural Machine Translation
314	Data-Efficient Image Recognition with Contrastive Predictive Coding
315	Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
316	wMAN: WEAKLY-SUPERVISED MOMENT ALIGNMENT NETWORK FOR TEXT-BASED VIDEO SEGMENT RETRIEVAL
317	Residual Energy-Based Models for Text Generation
318	AtomNAS: Fine-Grained End-to-End Neural Architecture Search
319	The Power of Semantic Similarity based Soft-Labeling for Generalized Zero-Shot Learning
320	AugMix: A Simple Method to Improve Robustness and Uncertainty under Data Shift
321	Learning Latent Dynamics for Partially-Observed Chaotic Systems
322	Exploration via Flow-Based Intrinsic Rewards
323	Learning Underlying Physical Properties From Observations For Trajectory Prediction
324	SPREAD DIVERGENCE
325	GraphQA: Protein Model Quality Assessment using Graph Convolutional Network
326	Disentanglement through Nonlinear ICA with General Incompressible-flow Networks (GIN)
327	DEEP GRAPH SPECTRAL EVOLUTION NETWORKS FOR GRAPH TOPOLOGICAL TRANSFORMATION
328	Angular Visual Hardness
329	Deep Relational Factorization Machines
330	Towards Scalable Imitation Learning for Multi-Agent Systems with Graph Neural Networks
331	On the Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
332	MEMORY-BASED GRAPH NETWORKS
333	Mem2Mem: Learning to Summarize Long Texts with Memory-to-Memory Transfer
334	GQ-Net: Training Quantization-Friendly Deep Networks
335	An Empirical Study of Encoders and Decoders in Graph-Based Dependency Parsing
336	ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks
337	Variational Template Machine for Data-to-Text Generation
338	Phase Transitions for the Information Bottleneck in Representation Learning
339	PopSGD: Decentralized Stochastic Gradient Descent in the Population Model
340	Symmetric-APL Activations: Training Insights and Robustness to Adversarial Attacks
341	Faster and Just As Accurate: A Simple Decomposition for Transformer Models
342	Hidden incentives for self-induced distributional shift
343	The divergences minimized by non-saturating GAN training
344	The Differentiable Cross-Entropy Method
345	Atomic Compression Networks
346	Continual learning with hypernetworks
347	Few-Shot Regression via Learning Sparsifying Basis Functions
348	Understanding and Training Deep Diagonal Circulant Neural Networks
349	Removing input features via a generative model to explain their attributions to classifier's decisions
350	Top-down training for neural networks
351	Demystifying Graph Neural Network Via Graph Filter Assessment
352	Towards Certified Defense for Unrestricted Adversarial Attacks
353	Permutation Equivariant Models for Compositional Generalization in Language
354	Training binary neural networks with real-to-binary convolutions
355	DO-AutoEncoder: Learning and Intervening Bivariate Causal Mechanisms in Images
356	StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
357	Multichannel Generative Language Models
358	Smooth markets: A basic mechanism for organizing gradient-based learners
359	Enhancing the Transformer with explicit relational encoding for math problem solving
360	Ergodic Inference: Accelerate Convergence by Optimisation
361	SemanticAdv: Generating Adversarial Examples via Attribute-Conditional Image Editing
362	Uncertainty-sensitive learning and planning with ensembles
363	Fair Resource Allocation in Federated Learning
364	Continual Learning via Principal Components Projection
365	Task-Mediated Representation Learning
366	Convolutional Conditional Neural Processes
367	Self-Induced Curriculum Learning in Neural Machine Translation
368	CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem
369	A Quality-Diversity Controllable GAN for Text Generation
370	Newton Residual Learning
371	Hydra: Preserving Ensemble Diversity for Model Distillation
372	Few-Shot Few-Shot Learning and the role of Spatial Attention
373	BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
374	Lossless Data Compression with Transformer
375	Meta-Learning with Warped Gradient Descent
376	Never Give Up: Learning Directed Exploration Strategies
377	AdvectiveNet: An Eulerian-Lagrangian Fluidic Reservoir for Point Cloud Processing
378	Unsupervised Spatiotemporal Data Inpainting
379	Transferable Recognition-Aware Image Processing
380	GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modelling
381	Transfer Active Learning For Graph Neural Networks
382	Trajectory growth through random deep ReLU networks
383	Frequency Pooling: Shift-Equivalent and Anti-Aliasing Down Sampling
384	Improving Sequential Latent Variable Models with Autoregressive Flows
385	SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
386	Sparse Transformer: Concentrated Attention Through Explicit Selection
387	Minimizing Change in Classifier Likelihood to Mitigate Catastrophic Forgetting
388	Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration
389	You CAN Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings
390	Unsupervised Learning of Graph Hierarchical Abstractions with Differentiable Coarsening and Optimal Transport
391	Defensive Tensorization: Randomized Tensor Parametrization for Robust Neural Networks
392	Question Generation from Paragraphs: A Tale of Two Hierarchical Models
393	Robust Reinforcement Learning via Adversarial Training with Langevin Dynamics
394	Embodied Multimodal Multitask Learning
395	High Fidelity Speech Synthesis with Adversarial Networks
396	Autoencoder-based Initialization for Recurrent Neural Networks with a Linear Memory
397	Test-Time Training for Out-of-Distribution Generalization
398	Distance-based Composable Representations with Neural Networks
399	At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
400	GPU Memory Management for Deep Neural Networks Using Deep Q-Network
401	FRICATIVE PHONEME DETECTION WITH ZERO DELAY
402	Walking on the Edge: Fast, Low-Distortion Adversarial Examples
403	Disentangling Trainability and Generalization in Deep Learning
404	Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
405	Functional Regularisation for Continual Learning with Gaussian Processes
406	Verification of Generative-Model-Based Visual Transformations
407	A Graph Neural Network Assisted Monte Carlo Tree Search Approach to Traveling Salesman Problem
408	Residual EBMs: Does Real vs. Fake Text Discrimination Generalize?
409	Learning Likelihoods with Conditional Normalizing Flows
410	Informed Temporal Modeling via Logical Specification of Factorial LSTMs
411	Auto Network Compression with Cross-Validation Gradient
412	Regularly varying representation for sentence embedding
413	A Simple and Scalable Shape Representation for 3D Reconstruction
414	Learning Through Limited Self-Supervision: Improving Time-Series Classification Without Additional Data via Auxiliary Tasks
415	EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs
416	Few-Shot One-Class Classification via Meta-Learning
417	Training a Constrained Natural Media Painting Agent using Reinforcement Learning
418	Fix-Net: pure fixed-point representation of deep neural networks
419	Learning Semantic Correspondences from Noisy Data-text Pairs by Local-to-Global Alignments
420	The Role of Embedding Complexity in Domain-invariant Representations
421	Learning Curves for Deep Neural Networks: A field theory perspective
422	Zero-Shot Policy Transfer with Disentangled Attention
423	Disentangled Cumulants Help Successor Representations Transfer to New Tasks
424	Learning vector representation of local content and matrix representation of local motion, with implications for V1
425	Online Learned Continual Compression with Stacked Quantization Modules
426	Gumbel-Matrix Routing for Flexible Multi-task Learning
427	The Frechet Distance of training and test distribution predicts the generalization gap
428	Mixed Setting Training Methods for Incremental Slot-Filling Tasks
429	Selective sampling for accelerating training of deep neural networks
430	Representing Unordered Data Using Multiset Automata and Complex Numbers
431	Robust Natural Language Representation Learning for Natural Language Inference by Projecting Superficial Words out
432	Deep Nonlinear Stochastic Optimal Control for Systems with Multiplicative Uncertainties
433	Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
434	Sentence embedding with contrastive multi-views learning
435	Dynamics-Aware Embeddings
436	Learning Multi-facet Embeddings of Phrases and Sentences using Sparse Coding for Unsupervised Semantic Applications
437	AN ATTENTION-BASED DEEP NET FOR LEARNING TO RANK
438	RaPP: Novelty Detection with Reconstruction along Projection Pathway
439	SAFE-DNN: A Deep Neural Network with Spike Assisted Feature Extraction for Noise Robust Inference
440	Putting Machine Translation in Context with the Noisy Channel Model
441	Deep geometric matrix completion: Are we doing it right?
442	Progressive Compressed Records: Taking a Byte Out of Deep Learning Data
443	Robustness and/or Redundancy Emerge in Overparametrized Deep Neural Networks
444	The Intriguing Effects of Focal Loss on the Calibration of Deep Neural Networks
445	Hypermodels for Exploration
446	Denoising Improves Latent Space Geometry in Text Autoencoders
447	Provable Convergence and Global Optimality of Generative Adversarial Network
448	On Symmetry and Initialization for Neural Networks
449	Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies
450	Policy path programming
451	Meta-Learning with Network Pruning for Overfitting Reduction
452	Kernel and Rich Regimes in Overparametrized Models
453	A Boolean Task Algebra for Reinforcement Learning
454	Explanation by Progressive Exaggeration
455	Quantum Optical Experiments Modeled by Long Short-Term Memory
456	Why do These Match? Explaining the Behavior of Image Similarity Models
457	Mode Connectivity and Sparse Neural Networks
458	Monte Carlo Deep Neural Network Arithmetic
459	Shape Features Improve General Model Robustness
460	Random Partition Relaxation for Training Binary and Ternary Weight Neural Network
461	How can we generalise learning distributed representations of graphs?
462	Relation-based Generalized Zero-shot Classification with the Domain Discriminator on the shared representation
463	Self-supervised Training of Proposal-based Segmentation via Background Prediction
464	Influence-aware Memory for Deep Reinforcement Learning
465	Gating Revisited: Deep Multi-layer RNNs That Can Be Trained
466	Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses
467	A Simple Geometric Proof for the Benefit of Depth in ReLU Networks
468	Avoiding Negative Side-Effects and Promoting Safe Exploration with Imaginative Planning
469	BayesOpt Adversarial Attack
470	CrossNorm: On Normalization for Off-Policy Reinforcement Learning
471	A Simple Technique to Enable Saliency Methods to Pass the Sanity Checks
472	Directional Message Passing for Molecular Graphs
473	Unsupervised Learning of Efficient and Robust Speech Representations
474	Compositional Embeddings: Joint Perception and Comparison of Class Label Sets
475	Model-based reinforcement learning for biological sequence design
476	Learning to Optimize via Dual space Preconditioning
477	Self-Attentional Credit Assignment for Transfer in Reinforcement Learning
478	AdaGAN: Adaptive GAN for Many-to-Many Non-Parallel Voice Conversion
479	City Metro Network Expansion with Reinforcement Learning
480	BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations
481	ShardNet: One Filter Set to Rule Them All
482	Towards Interpretable Evaluations: A Case Study of Named Entity Recognition
483	Mixed-curvature Variational Autoencoders
484	Rethinking deep active learning: Using unlabeled data at model training
485	Blurring Structure and Learning to Optimize and Adapt Receptive Fields
486	Layerwise Learning Rates for Object Features in Unsupervised and Supervised Neural Networks And Consequent Predictions for the Infant Visual System
487	Continual Deep Learning by Functional Regularisation of Memorable Past
488	Demystifying Inter-Class Disentanglement
489	On the implicit minimization of alternative loss functions when training deep networks
490	Dynamic Graph Message Passing Networks
491	A Deep Recurrent Neural Network via Unfolding Reweighted l1-l1 Minimization
492	Differentially Private Mixed-Type Data Generation For Unsupervised Learning
493	Learning from Rules Generalizing Labeled Exemplars
494	Group-Transformer: Towards A Lightweight Character-level Language Model
495	Language-independent Cross-lingual Contextual Representations
496	Understanding the Limitations of Conditional Generative Models
497	Skew-Explore: Learn faster in continuous spaces with sparse rewards
498	Diversely Stale Parameters for Efficient Training of Deep Convolutional Networks
499	Exploring the Correlation between Likelihood of Flow-based Generative Models and Image Semantics
500	Anomaly Detection Based on Unsupervised Disentangled Representation Learning in Combination with Manifold Learning
501	Neural Arithmetic Unit by reusing many small pre-trained networks
502	On Stochastic Sign Descent Methods
503	GENN: Predicting Correlated Drug-drug Interactions with Graph Energy Neural Networks
504	Event Discovery for History Representation in Reinforcement Learning
505	Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning
506	Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification
507	Domain-Invariant Representations: A Look on Compression and Weights
508	Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
509	Spike-based causal inference for weight alignment
510	Symmetry and Systematicity
511	Efficacy of Pixel-Level OOD Detection for Semantic Segmentation
512	PatchFormer: A neural architecture for self-supervised representation learning on images
513	Address2vec: Generating vector embeddings for blockchain analytics
514	Attack-Resistant Federated Learning with Residual-based Reweighting
515	Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding
516	Learning a Spatio-Temporal Embedding for Video Instance Segmentation
517	Efficient Exploration via State Marginal Matching
518	Side-Tuning: Network Adaptation via Additive Side Networks
519	Lookahead: A Far-sighted Alternative of Magnitude-based Pruning
520	SCELMo: Source Code Embeddings from Language Models
521	Detecting Change in Seasonal Pattern via Autoencoder and Temporal Regularization
522	CopyCAT: Taking Control of Neural Policies with Constant Attacks
523	VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
524	A Generalized Training Approach for Multiagent Learning
525	Quantum Semi-Supervised Kernel Learning
526	Unsupervised Meta-Learning for Reinforcement Learning
527	Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
528	Training individually fair ML models with sensitive subspace robustness
529	Meta-learning curiosity algorithms
530	vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
531	The Secret Revealer: Generative Model Inversion Attacks Against Deep Neural Networks
532	Leveraging Entanglement Entropy for Deep Understanding of Attention Matrix in Text Matching
533	Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
534	Under what circumstances do local codes emerge in feed-forward neural networks
535	MMA Training: Direct Input Space Margin Maximization through Adversarial Training
536	Forecasting Deep Learning Dynamics with Applications to Hyperparameter Tuning
537	Batch Normalization has Multiple Benefits: An Empirical Study on Residual Networks
538	Building Deep Equivariant Capsule Networks
539	Learning to Infer User Interface Attributes from Images
540	Attacking Graph Convolutional Networks via Rewiring
541	Incorporating BERT into Neural Machine Translation
542	Unsupervised Hierarchical Graph Representation Learning with Variational Bayes
543	Copy That! Editing Sequences by Copying Spans
544	DeepXML: Scalable & Accurate Deep Extreme Classification for Matching User Queries to Advertiser Bid Phrases
545	What Can Neural Networks Reason About?
546	Structured Object-Aware Physics Prediction for Video Modeling and Planning
547	A multi-task U-net for segmentation with lazy labels
548	Neural Design of Contests and All-Pay Auctions using Multi-Agent Simulation
549	CaptainGAN: Navigate Through Embedding Space For Better Text Generation
550	Learning-Augmented Data Stream Algorithms
551	word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement
552	On Weight-Sharing and Bilevel Optimization in Architecture Search
553	Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
554	Imbalanced Classification via Adversarial Minority Over-sampling
555	Compositional Transfer in Hierarchical Reinforcement Learning
556	On the Relationship between Self-Attention and Convolutional Layers
557	PolyGAN: High-Order Polynomial Generators
558	Dynamic Scale Inference by Entropy Minimization
559	SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes
560	Rethinking Data Augmentation: Self-Supervision and Self-Distillation
561	GENERALIZATION GUARANTEES FOR NEURAL NETS VIA HARNESSING THE LOW-RANKNESS OF JACOBIAN
562	Learning to Remember from a Multi-Task Teacher
563	Gradient $\ell_1$ Regularization for Quantization Robustness
564	Coloring graph neural networks for node disambiguation
565	Spectral Embedding of Regularized Block Models
566	On Federated Learning of Deep Networks from Non-IID Data: Parameter Divergence and the Effects of Hyperparametric Methods
567	Improved Detection of Adversarial Attacks via Penetration Distortion Maximization
568	Barcodes as summary of objective functions' topology
569	Unsupervised Video-to-Video Translation via Self-Supervised Learning
570	Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control
571	STYLE EXAMPLE-GUIDED TEXT GENERATION USING GENERATIVE ADVERSARIAL TRANSFORMERS
572	LEARNING TO IMPUTE: A GENERAL FRAMEWORK FOR SEMI-SUPERVISED LEARNING
573	Geometry-aware Generation of Adversarial and Cooperative Point Clouds
574	Crafting Data-free Universal Adversaries with Dilate Loss
575	Efficient Bi-Directional Verification of ReLU Networks via Quadratic Programming
576	Improving Sample Efficiency in Model-Free Reinforcement Learning from Images
577	Improving Exploration of Deep Reinforcement Learning using Planning for Policy Search
578	Spatial Information is Overrated for Image Classification
579	A Theoretical Analysis of Deep Q-Learning
580	Decentralized Deep Learning with Arbitrary Communication Compression
581	Can I Trust the Explainer? Verifying Post-Hoc Explanatory Methods
582	D3PG: Deep Differentiable Deterministic Policy Gradients
583	Deep Ensembles: A Loss Landscape Perspective
584	A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
585	MULTI-STAGE INFLUENCE FUNCTION
586	Impact of the latent space on the ability of GANs to fit the distribution
587	Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators
588	Combining Q-Learning and Search with Amortized Value Estimates
589	Hyperbolic Image Embeddings
590	Infinite-Horizon Differentiable Model Predictive Control
591	Neural Reverse Engineering of Stripped Binaries
592	Anchor & Transform: Learning Sparse Representations of Discrete Objects
593	Emergence of Collective Policies Inside Simulations with Biased Representations
594	Projection Based Constrained Policy Optimization
595	GraphFlow: Exploiting Conversation Flow with Graph Neural Networks for Conversational Machine Comprehension
596	Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning
597	Recurrent Layer Attention Network
598	Towards Effective 2-bit Quantization: Pareto-optimal Bit Allocation for Deep CNNs Compression
599	You Only Train Once: Loss-Conditional Training of Deep Networks
600	Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization
601	Using Explainability to Detect Adversarial Attacks
602	Feature Selection using Stochastic Gates
603	SpectroBank: A filter-bank convolutional layer for CNN-based audio applications
604	Testing For Typicality with Respect to an Ensemble of Learned Distributions
605	Emergent Communication in Networked Multi-Agent Reinforcement Learning
606	GraphSAINT: Graph Sampling Based Inductive Learning Method
607	Adversarial Filters of Dataset Biases
608	Value-Driven Hindsight Modelling
609	Incorporating Perceptual Prior to Improve Model's Adversarial Robustness
610	Learning Neural Causal Models from Unknown Interventions
611	Adaptive Generation of Unrestricted Adversarial Inputs
612	P-BN: Towards Effective Batch Normalization in the Path Space
613	Efficient Probabilistic Logic Reasoning with Graph Neural Networks
614	On the geometry and learning low-dimensional embeddings for directed graphs
615	GATO: Gates Are Not the Only Option
616	Probabilistic View of Multi-agent Reinforcement Learning: A Unified Approach
617	Neural Subgraph Isomorphism Counting
618	RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments
619	Continual Learning with Delayed Feedback
620	Neural Non-additive Utility Aggregation
621	Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection
622	"Best-of-Many-Samples" Distribution Matching
623	Dynamically Balanced Value Estimates for Actor-Critic Methods
624	Spatially Parallel Attention and Component Extraction for Scene Decomposition
625	Efficient generation of structured objects with Constrained Adversarial Networks
626	Deep Variational Semi-Supervised Novelty Detection
627	Cross-Lingual Ability of Multilingual BERT: An Empirical Study
628	Towards Understanding Generalization in Gradient-Based Meta-Learning
629	Towards Finding Longer Proofs
630	Probing Emergent Semantics in Predictive Agents via Question Answering
631	Revisiting the Information Plane
632	Deep 3D-Zoom Net: Unsupervised Learning of Photo-Realistic 3D-Zoom
633	Hierarchical Graph Matching Networks for Deep Graph Similarity Learning
634	A Simple Approach to the Noisy Label Problem Through the Gambler's Loss
635	On the Reflection of Sensitivity in the Generalization Error
636	Redundancy-Free Computation Graphs for Graph Neural Networks
637	Toward Understanding The Effect of Loss Function on The Performance of Knowledge Graph Embedding
638	Reducing Transformer Depth on Demand with Structured Dropout
639	Semi-Supervised Learning with Normalizing Flows
640	Neural Communication Systems with Bandwidth-limited Channel
641	Reducing Computation in Recurrent Networks by Selectively Updating State Neurons
642	A Novel Analysis Framework of Lower Complexity Bounds for Finite-Sum Optimization
643	Neural Outlier Rejection for Self-Supervised Keypoint Learning
644	Exploring the Pareto-Optimality between Quality and Diversity in Text Generation
645	B-Spline CNNs on Lie groups
646	EMS: End-to-End Model Search for Network Architecture, Pruning and Quantization
647	Feature-based Augmentation for Semi-Supervised Learning
648	Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel
649	Progressive Knowledge Distillation For Generative Modeling
650	EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness Against Adversarial Attacks
651	Learning To Explore Using Active Neural Mapping
652	Adversarial Robustness Against the Union of Multiple Perturbation Models
653	Understanding and Improving Information Transfer in Multi-Task Learning
654	Hyperparameter Tuning and Implicit Regularization in Minibatch SGD
655	Searching for Stage-wise Neural Graphs In the Limit
656	Restricting the Flow: Information Bottlenecks for Attribution
657	Stein Bridging: Enabling Mutual Reinforcement between Explicit and Implicit Generative Models
658	Step Size Optimization
659	Equilibrium Propagation with Continual Weight Updates
660	Global Adversarial Robustness Guarantees for Neural Networks
661	A Stochastic Derivative Free Optimization Method with Momentum
662	Coresets for Accelerating Incremental Gradient Methods
663	A Greedy Approach to Max-Sliced Wasserstein GANs
664	Off-Policy Actor-Critic with Shared Experience Replay
665	Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems
666	The Ingredients of Real World Robotic Reinforcement Learning
667	Causal Discovery with Reinforcement Learning
668	Modelling the influence of data structure on learning in neural networks
669	Task-agnostic Continual Learning via Growing Long-Term Memory Networks
670	Scaling Autoregressive Video Models
671	TOWARDS FEATURE SPACE ADVERSARIAL ATTACK
672	Generative Integration Networks
673	Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Nonconvex Optimization
674	Compressive Transformers for Long-Range Sequence Modelling
675	Global Momentum Compression for Sparse Communication in Distributed SGD
676	State2vec: Off-Policy Successor Feature Approximators
677	Differentiation of Blackbox Combinatorial Solvers
678	Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs
679	Lagrangian Fluid Simulation with Continuous Convolutions
680	Graph-based motion planning networks
681	Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
682	Semi-supervised semantic segmentation needs strong, high-dimensional perturbations
683	Learning to Guide Random Search
684	Attentive Sequential Neural Processes
685	The intriguing role of module criticality in the generalization of deep networks
686	Yet another but more efficient black-box adversarial attack: tiling and evolution strategies
687	TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing
688	Learning with Social Influence through Interior Policy Differentiation
689	SPROUT: Self-Progressing Robust Training
690	Alleviating Privacy Attacks via Causal Learning
691	Hybrid Weight Representation: A Quantization Method Represented with Ternary and Sparse-Large Weights
692	Self-labelling via simultaneous clustering and representation learning
693	Meta Decision Trees for Explainable Recommendation Systems
694	Continual Learning with Gated Incremental Memories for Sequential Data Processing
695	Policy Optimization by Local Improvement through Search
696	Improving Model Compatibility of Generative Adversarial Networks by Boundary Calibration
697	Data Annealing Transfer learning Procedure for Informal Language Understanding Tasks
698	Robust anomaly detection and backdoor attack detection via differential privacy
699	CAT: Compression-Aware Training for bandwidth reduction
700	Scheduling the Learning Rate Via Hypergradients: New Insights and a New Algorithm
701	Learning Entailment-Based Sentence Embeddings from Natural Language Inference
702	Invariance vs Robustness of Neural Networks
703	Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm
704	LARGE SCALE REPRESENTATION LEARNING FROM TRIPLET COMPARISONS
705	Irrationality can help reward inference
706	Learning to Reach Goals Without Reinforcement Learning
707	Pruning Depthwise Separable Convolutions for Extra Efficiency Gain of Lightweight Models
708	Subjective Reinforcement Learning for Open Complex Environments
709	Deep probabilistic subsampling for task-adaptive compressed sensing
710	Text Embedding Bank Module for Detailed Image Paragraph Caption
711	Semi-supervised 3D Face Reconstruction with Nonlinear Disentangled Representations
712	Representing Model Uncertainty of Neural Networks in Sparse Information Form
713	GroSS Decomposition: Group-Size Series Decomposition for Whole Search-Space Training
714	Neural Tangents: Fast and Easy Infinite Neural Networks in Python
715	Sparse Weight Activation Training
716	Learning Robust Representations via Multi-View Information Bottleneck
717	Batch-shaping for learning conditional channel gated networks
718	Making the Shoe Fit: Architectures, Initializations, and Tuning for Learning with Privacy
719	Universal Adversarial Attack Using Very Few Test Examples
720	Rotation-invariant clustering of functional cell types in primary visual cortex
721	Solving single-objective tasks by preference multi-objective reinforcement learning
722	Deep automodulators
723	Enhanced Convolutional Neural Tangent Kernels
724	Revisiting Gradient Episodic Memory for Continual Learning
725	Inductive and Unsupervised Representation Learning on Graph Structured Objects
726	A new perspective in understanding of Adam-Type algorithms and beyond
727	Causally Correct Partial Models for Reinforcement Learning
728	Spectral Nonlocal Block for Neural Network
729	U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
730	Masked Based Unsupervised Content Transfer
731	Efficient meta reinforcement learning via meta goal generation
732	Learning robust visual representations using data augmentation invariance
733	A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs
734	DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
735	Simple but effective techniques to reduce dataset biases
736	Projected Canonical Decomposition for Knowledge Base Completion
737	Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
738	AMUSED: A Multi-Stream Vector Representation Method for Use In Natural Dialogue
739	Measuring the Reliability of Reinforcement Learning Algorithms
740	Semi-Supervised Named Entity Recognition with CRF-VAEs
741	Stable Rank Normalization for Improved Generalization in Neural Networks and GANs
742	Graph Neural Networks for Soft Semi-Supervised Learning on Hypergraphs
743	Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks
744	Deep Neural Forests: An Architecture for Tabular Data
745	Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks
746	ICNN: INPUT-CONDITIONED FEATURE REPRESENTATION LEARNING FOR TRANSFORMATION-INVARIANT NEURAL NETWORK
747	Data Augmentation in Training CNNs: Injecting Noise to Images
748	VAENAS: Sampling Matters in Neural Architecture Search
749	Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following
750	Model-Agnostic Feature Selection with Additional Mutual Information
751	Do Deep Neural Networks for Segmentation Understand Insideness?
752	Adversarial Robustness as a Prior for Learned Representations
753	Explaining Time Series by Counterfactuals
754	Variational Diffusion Autoencoders with Random Walk Sampling
755	Probability Calibration for Knowledge Graph Embedding Models
756	Contrastive Multiview Coding
757	Fast Sparse ConvNets
758	Reformer: The Efficient Transformer
759	BasisVAE: Orthogonal Latent Space for Deep Disentangled Representation
760	Target-Embedding Autoencoders for Supervised Representation Learning
761	Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
762	Conditional Flow Variational Autoencoders for Structured Sequence Prediction
763	High-Frequency guided Curriculum Learning for Class-specific Object Boundary Detection
764	On the Equivalence between Node Embeddings and Structural Graph Representations
765	Disagreement-Regularized Imitation Learning
766	Shifted Randomized Singular Value Decomposition
767	PassNet: Learning pass probability surfaces from single-location labels. An architecture for visually-interpretable soccer analytics
768	On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints
769	Are Few-shot Learning Benchmarks Too Simple?
770	UNIVERSAL MODAL EMBEDDING OF DYNAMICS IN VIDEOS AND ITS APPLICATIONS
771	Universality Theorems for Generative Models
772	Function Feature Learning of Neural Networks
773	Manifold Learning and Alignment with Generative Adversarial Networks
774	Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders
775	Scalable Deep Neural Networks via Low-Rank Matrix Factorization
776	NoiGAN: NOISE AWARE KNOWLEDGE GRAPH EMBEDDING WITH GAN
777	Fast Task Adaptation for Few-Shot Learning
778	Weighted Empirical Risk Minimization: Transfer Learning based on Importance Sampling
779	Neural Program Synthesis By Self-Learning
780	Neural Epitome Search for Architecture-Agnostic Network Compression
781	Learning from Label Proportions with Consistency Regularization
782	Do recent advancements in model-based deep reinforcement learning really improve data efficiency?
783	Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search
784	Mixing Up Real Samples and Adversarial Samples for Semi-Supervised Learning
785	Task-Agnostic Robust Encodings for Combating Adversarial Typos
786	When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It
787	Accelerated Variance Reduced Stochastic Extragradient Method for Sparse Machine Learning Problems
788	AdamT: A Stochastic Optimization with Trend Correction Scheme
789	The Variational InfoMax AutoEncoder
790	Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
791	LOGAN: Latent Optimisation for Generative Adversarial Networks
792	Hyper-SAGNN: a self-attention based graph neural network for hypergraphs
793	A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning
794	Global-Local Network for Learning Depth with Very Sparse Supervision
795	CEB Improves Model Robustness
796	Music Source Separation in the Waveform Domain
797	Information lies in the eye of the beholder: The effect of representations on observed mutual information
798	On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach
799	Distributionally Robust Neural Networks
800	Distilling the Knowledge of BERT for Text Generation
801	Kernel of CycleGAN as a principal homogeneous space
802	Cross-Lingual Vision-Language Navigation
803	Molecule Property Prediction and Classification with Graph Hypernetworks
804	A Syntax-Aware Approach for Unsupervised Text Style Transfer
805	Relevant-features based Auxiliary Cells for Robust and Energy Efficient Deep Learning
806	Don't Use Large Mini-batches, Use Local SGD
807	Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$
808	Model Based Reinforcement Learning for Atari
809	Generating Multi-Sentence Abstractive Summaries of Interleaved Texts
810	On Universal Equivariant Set Networks
811	Compressive Hyperspherical Energy Minimization
812	OPTIMAL BINARY QUANTIZATION FOR DEEP NEURAL NETWORKS
813	Deep End-to-end Unsupervised Anomaly Detection
814	Tensor Decompositions for Temporal Knowledge Base Completion
815	CloudLSTM: A Recurrent Neural Model for Spatiotemporal Point-cloud Stream Forecasting
816	Neural Approximation of an Auto-Regressive Process through Confidence Guided Sampling
817	A Simple Randomization Technique for Generalization in Deep Reinforcement Learning
818	Stochastic Latent Residual Video Prediction
819	AlignNet: Self-supervised Alignment Module
820	Learning with Protection: Rejection of Suspicious Samples under Adversarial Environment
821	QXplore: Q-Learning Exploration by Maximizing Temporal Difference Error
822	Walking the Tightrope: An Investigation of the Convolutional Autoencoder Bottleneck
823	Partial Simulation for Imitation Learning
824	Few-shot Learning by Focusing on Differences
825	Robustness Verification for Transformers
826	EnsembleNet: A novel architecture for Incremental Learning
827	Anomalous Pattern Detection in Activations and Reconstruction Error of Autoencoders
828	Fantastic Generalization Measures and Where to Find Them
829	Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks
830	Learning De-biased Representations with Biased Representations
831	Weakly Supervised Disentanglement with Guarantees
832	Imagining the Latent Space of a Variational Auto-Encoders
833	A Copula approach for hyperparameter transfer learning
834	THE EFFECT OF ADVERSARIAL TRAINING: A THEORETICAL CHARACTERIZATION
835	Provenance detection through learning transformation-resilient watermarking
836	Regulatory Focus: Promotion and Prevention Inclinations in Policy Search
837	Fairness with Wasserstein Adversarial Networks
838	Diagonal Graph Convolutional Networks with Adaptive Neighborhood Aggregation
839	Discrepancy Ratio: Evaluating Model Performance When Even Experts Disagree on the Truth
840	The Dual Information Bottleneck
841	Deep Auto-Deferring Policy for Combinatorial Optimization
842	Towards trustworthy predictions from deep neural networks with fast adversarial calibration
843	Abductive Commonsense Reasoning
844	Variance Reduction With Sparse Gradients
845	BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget
846	RNA Secondary Structure Prediction By Learning Unrolled Algorithms
847	Learning transport cost from subset correspondence
848	Attentive Weights Generation for Few Shot Learning via Information Maximization
849	Semi-Supervised Few-Shot Learning with a Controlled Degree of Task-Adaptive Conditioning
850	Detecting Noisy Training Data with Loss Curves
851	Reducing Sentiment Bias in Language Models via Counterfactual Evaluation
852	Near-Zero-Cost Differentially Private Deep Learning with Teacher Ensembles
853	Neural Network Out-of-Distribution Detection for Regression Tasks
854	Rényi Fair Inference
855	Reject Illegal Inputs: Scaling Generative Classifiers with Supervised Deep Infomax
856	Lean Images for Geo-Localization
857	WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
858	Deep Lifetime Clustering
859	Towards Understanding the Transferability of Deep Representations
860	Meta Dropout: Learning to Perturb Latent Features for Generalization
861	Adversarial AutoAugment
862	When Robustness Doesn’t Promote Robustness: Synthetic vs. Natural Distribution Shifts on ImageNet
863	Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
864	State-only Imitation with Transition Dynamics Mismatch
865	Measuring and Improving the Use of Graph Information in Graph Neural Networks
866	Meta-Learning by Hallucinating Useful Examples
867	Pixel Co-Occurrence Based Loss Metrics for Super Resolution Texture Recovery
868	A Latent Morphology Model for Open-Vocabulary Neural Machine Translation
869	Sample-Based Point Cloud Decoder Networks
870	AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING
871	BETANAS: Balanced Training and selective drop for Neural Architecture Search
872	Connecting the Dots Between MLE and RL for Sequence Prediction
873	Universal Approximation with Certified Networks
874	Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency
875	SEERL: Sample Efficient Ensemble Reinforcement Learning
876	Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks
877	DyNet: Dynamic Convolution for Accelerating Convolution Neural Networks
878	Deep Symbolic Superoptimization Without Human Knowledge
879	Unsupervised domain adaptation with imputation
880	Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
881	A Generative Model for Molecular Distance Geometry
882	Generating Biased Datasets for Neural Natural Language Processing
883	Robustified Importance Sampling for Covariate Shift
884	Fast Task Inference with Variational Intrinsic Successor Features
885	Certified Defenses for Adversarial Patches
886	Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework
887	Contrastive Representation Distillation
888	Generating valid Euclidean distance matrices
889	Perturbations are not Enough: Generating Adversarial Examples with Spatial Distortions
890	Information Theoretic Model Predictive Q-Learning
891	On Predictive Information Sub-optimality of RNNs
892	Model Inversion Networks for Model-Based Optimization
893	Learning to Recognize the Unseen Visual Predicates
894	Continuous Control with Contexts, Provably
895	Stabilizing Transformers for Reinforcement Learning
896	A FRAMEWORK FOR ROBUSTNESS CERTIFICATION OF SMOOTHED CLASSIFIERS USING F-DIVERGENCES
897	The Detection of Distributional Discrepancy for Text Generation
898	Relative Pixel Prediction For Autoregressive Image Generation
899	FACE SUPER-RESOLUTION GUIDED BY 3D FACIAL PRIORS
900	Natural- to formal-language generation using Tensor Product Representations
901	Three-Head Neural Network Architecture for AlphaZero Learning
902	Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Budget
903	Interpretable Network Structure for Modeling Contextual Dependency
904	Policy Tree Network
905	Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
906	Characterize and Transfer Attention in Graph Neural Networks
907	Adversarial Neural Pruning
908	Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
909	A Baseline for Few-Shot Image Classification
910	Abstract Diagrammatic Reasoning with Multiplex Graph Networks
911	Emergent Systematic Generalization In a Situated Agent
912	SoftAdam: Unifying SGD and Adam for better stochastic gradient descent
913	ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
914	Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
915	Amharic Text Normalization with Sequence-to-Sequence Models
916	Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
917	RATE-DISTORTION OPTIMIZATION GUIDED AUTOENCODER FOR GENERATIVE APPROACH
918	On the expected running time of nonconvex optimization with early stopping
919	Knossos: Compiling AI with AI
920	Multiagent Reinforcement Learning in Games with an Iterated Dominance Solution
921	CP-GAN: Towards a Better Global Landscape of GANs
922	Jacobian Adversarially Regularized Networks for Robustness
923	Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
924	Improving Federated Learning Personalization via Model Agnostic Meta Learning
925	Towards Verified Robustness under Text Deletion Interventions
926	Discovering Topics With Neural Topic Models Built From PLSA Loss
927	And the Bit Goes Down: Revisiting the Quantization of Neural Networks
928	Meta-Learning Runge-Kutta
929	RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis
930	Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
931	Instant Quantization of Neural Networks using Monte Carlo Methods
932	Hallucinative Topological Memory for Zero-Shot Visual Planning
933	Learning Good Policies By Learning Good Perceptual Models
934	Implementation Matters in Deep RL: A Case Study on PPO and TRPO
935	A Closer Look at Deep Policy Gradients
936	Plug and Play Language Model: A simple baseline for controlled language generation
937	Efficient High-Dimensional Data Representation Learning via Semi-Stochastic Block Coordinate Descent Methods
938	Understanding and Robustifying Differentiable Architecture Search
939	Rethinking the Hyperparameters for Fine-tuning
940	UNITER: Learning UNiversal Image-TExt Representations
941	Self-Supervised GAN Compression
942	Retrieving Signals in the Frequency Domain with Deep Complex Extractors
943	Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings
944	Implementing Inductive bias for different navigation tasks through diverse RNN attractors
945	Disentangling Style and Content in Anime Illustrations
946	Dynamic Instance Hardness
947	Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning
948	A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions
949	Is my Deep Learning Model Learning more than I want it to?
950	LIA: Latently Invertible Autoencoder with Adversarial Learning
951	PCMC-Net: Feature-based Pairwise Choice Markov Chains
952	Multi-Agent Interactions Modeling with Correlated Policies
953	Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
954	Once for All: Train One Network and Specialize it for Efficient Deployment
955	Generalized Convolutional Forest Networks for Domain Generalization and Visual Recognition
956	Acutum: When Generalization Meets Adaptability
957	FR-GAN: Fair and Robust Training
958	SNODE: Spectral Discretization of Neural ODEs for System Identification
959	Guiding Program Synthesis by Learning to Generate Examples
960	Fast Neural Network Adaptation via Parameters Remapping
961	Measuring Calibration in Deep Learning
962	R2D2: Reuse & Reduce via Dynamic Weight Diffusion for Training Efficient NLP Models
963	Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep RL
964	On the Distribution of Penultimate Activations of Classification Networks
965	Divide-and-Conquer Adversarial Learning for High-Resolution Image Enhancement
966	Meta-Learning Deep Energy-Based Memory Models
967	Mutual Information Maximization for Robust Plannable Representations
968	Depth creates no more spurious local minima in linear networks
969	WORD SEQUENCE PREDICTION FOR AMHARIC LANGUAGE
970	YaoGAN: Learning Worst-case Competitive Algorithms from Self-generated Inputs
971	Annealed Denoising score matching: learning Energy based model in high-dimensional spaces
972	Finding Winning Tickets with Limited (or No) Supervision
973	Graph Convolutional Reinforcement Learning
974	Open-Set Domain Adaptation with Category-Agnostic Clusters
975	Deep Generative Classifier for Out-of-distribution Sample Detection
976	Reparameterized Variational Divergence Minimization for Stable Imitation
977	Learning Function-Specific Word Representations
978	Swoosh! Rattle! Thump! - Actions that Sound
979	Improving and Stabilizing Deep Energy-Based Learning
980	Perception-Driven Curiosity with Bayesian Surprise
981	Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning
982	Towards Effective and Efficient Zero-shot Learning by Fine-tuning with Task Descriptions
983	TWIN GRAPH CONVOLUTIONAL NETWORKS: GCN WITH DUAL GRAPH SUPPORT FOR SEMI-SUPERVISED LEARNING
984	Continual Density Ratio Estimation (CDRE): A new method for evaluating generative models in continual learning
985	CONTRIBUTION OF INTERNAL REFLECTION IN LANGUAGE EMERGENCE WITH AN UNDER-RESTRICTED SITUATION
986	Kernelized Wasserstein Natural Gradient
987	The Curious Case of Neural Text Degeneration
988	Universal approximations of permutation invariant/equivariant functions by deep neural networks
989	Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation
990	What Can Learned Intrinsic Rewards Capture?
991	On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
992	Implicit Generative Modeling for Efficient Exploration
993	Continuous Meta-Learning without Tasks
994	Counterfactual Regularization for Model-Based Reinforcement Learning
995	Multilingual Alignment of Contextual Word Representations
996	A bi-diffusion based layer-wise sampling method for deep learning in large graphs
997	Learning Video Representations using Contrastive Bidirectional Transformer
998	Unrestricted Adversarial Attacks For Semantic Segmentation
999	Randomness in Deconvolutional Networks for Visual Representation
1000	HUBERT Untangles BERT to Improve Transfer across NLP Tasks
1001	The Gambler's Problem and Beyond
1002	CRAP: Semi-supervised Learning via Conditional Rotation Angle Prediction
1003	Noisy $\ell^{0}$-Sparse Subspace Clustering on Dimensionality Reduced Data
1004	GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation
1005	Off-policy Multi-step Q-learning
1006	Axial Attention in Multidimensional Transformers
1007	Joint text classification on multiple levels with multiple labels
1008	Fully Quantized Transformer for Improved Translation
1009	The Surprising Behavior Of Graph Neural Networks
1010	Double Neural Counterfactual Regret Minimization
1011	Resizable Neural Networks
1012	Multitask Soft Option Learning
1013	Adaptive Adversarial Imitation Learning
1014	Representation Learning with Multisets
1015	Improving Confident-Classifiers For Out-of-distribution Detection
1016	Cyclic Graph Dynamic Multilayer Perceptron for Periodic Signals
1017	Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over the Simplex
1018	Capsule Networks without Routing Procedures
1019	Certifiably Robust Interpretation in Deep Learning
1020	Continuous Convolutional Neural Network for Nonuniform Time Series
1021	DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL
1022	Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
1023	Multi-objective Neural Architecture Search via Predictive Network Performance Optimization
1024	Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference
1025	Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
1026	A Mean-Field Theory for Kernel Alignment with Random Features in Generative Adversarial Networks
1027	Learning Key Steps to Attack Deep Reinforcement Learning Agents
1028	Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
1029	On PAC-Bayes Bounds for Deep Neural Networks using the Loss Curvature
1030	Deep Graph Matching Consensus
1031	Self-Supervised Learning of Appliance Usage
1032	Gaussian Conditional Random Fields for Classification
1033	Fourier networks for uncertainty estimates and out-of-distribution detection
1034	Semantic Hierarchy Emerges in the Deep Generative Representations for Scene Synthesis
1035	Quantum Algorithms for Deep Convolutional Neural Networks
1036	TWO-STEP UNCERTAINTY NETWORK FOR TASK-DRIVEN SENSOR PLACEMENT
1037	EXPLOITING SEMANTIC COHERENCE TO IMPROVE PREDICTION IN SATELLITE SCENE IMAGE ANALYSIS: APPLICATION TO DISEASE DENSITY ESTIMATION
1038	Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
1039	Abstractive Dialog Summarization with Semantic Scaffolds
1040	Evaluating Semantic Representations of Source Code
1041	Searching to Exploit Memorization Effect in Learning from Corrupted Labels
1042	Study of a Simple, Expressive and Consistent Graph Feature Representation
1043	Understanding l4-based Dictionary Learning: Interpretation, Stability, and Robustness
1044	Balancing Cost and Benefit with Tied-Multi Transformers
1045	End-to-End Multi-Domain Task-Oriented Dialogue Systems with Multi-level Neural Belief Tracker
1046	All Neural Networks are Created Equal
1047	Construction of Macro Actions for Deep Reinforcement Learning
1048	BOSH: An Efficient Meta Algorithm for Decision-based Attacks
1049	MGP-AttTCN: An Interpretable Machine Learning Model for the Prediction of Sepsis
1050	Unsupervised Representation Learning by Predicting Random Distances
1051	ConQUR: Mitigating Delusional Bias in Deep Q-Learning
1052	Where is the Information in a Deep Network?
1053	Extreme Values are Accurate and Robust in Deep Networks
1054	Statistically Consistent Saliency Estimation
1055	Domain-Independent Dominance of Adaptive Methods
1056	Neural Networks for Principal Component Analysis: A New Loss Function Provably Yields Ordered Exact Eigenvectors
1057	Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control
1058	PNEN: Pyramid Non-Local Enhanced Networks
1059	Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
1060	FreeLB: Enhanced Adversarial Training for Language Understanding
1061	Behaviour Suite for Reinforcement Learning
1062	Strategies for Pre-training Graph Neural Networks
1063	GRAPHS, ENTITIES, AND STEP MIXTURE
1064	Refining the variational posterior through iterative optimization
1065	Aggregating explanation methods for neural networks stabilizes explanations
1066	Recurrent Hierarchical Topic-Guided Neural Language Models
1067	Invertible generative models for inverse problems: mitigating representation error and dataset bias
1068	An Algorithm-Agnostic NAS Benchmark
1069	Learning World Graph Decompositions To Accelerate Reinforcement Learning
1070	Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
1071	Controlling generative models with continuous factors of variations
1072	Emergent Tool Use From Multi-Agent Autocurricula
1073	The fairness-accuracy landscape of neural classifiers
1074	Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
1075	Unsupervised Clustering using Pseudo-semi-supervised Learning
1076	Geometric Analysis of Nonconvex Optimization Landscapes for Overcomplete Learning
1077	POLYNOMIAL ACTIVATION FUNCTIONS
1078	PairNorm: Tackling Oversmoothing in GNNs
1079	Training-Free Uncertainty Estimation for Neural Networks
1080	Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
1081	Empirical Studies on the Properties of Linear Regions in Deep Neural Networks
1082	SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning
1083	Smoothness and Stability in GANs
1084	Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
1085	On Bonus Based Exploration Methods In The Arcade Learning Environment
1086	Power up! Robust Graph Convolutional Network based on Graph Powering
1087	Global graph curvature
1088	Deep k-NN for Noisy Labels
1089	Filling the Soap Bubbles: Efficient Black-Box Adversarial Certification with Non-Gaussian Smoothing
1090	Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization
1091	A Theory of Usable Information under Computational Constraints
1092	On the Invertibility of Invertible Neural Networks
1093	Shallow VAEs with RealNVP Prior Can Perform as Well as Deep Hierarchical VAEs
1094	GAN-based Gaussian Mixture Model Responsibility Learning
1095	Information-Theoretic Local Minima Characterization and Regularization
1096	Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
1097	IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
1098	UWGAN: UNDERWATER GAN FOR REAL-WORLD UNDERWATER COLOR RESTORATION AND DEHAZING
1099	HiLLoC: lossless image compression with hierarchical latent variable models
1100	Learning to Learn Kernels with Variational Random Features
1101	Efficient Wrapper Feature Selection using Autoencoder and Model Based Elimination
1102	Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics
1103	Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
1104	Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs
1105	Enhancing Language Emergence through Empathy
1106	The Generalization-Stability Tradeoff in Neural Network Pruning
1107	Word embedding re-examined: is the symmetrical factorization optimal?
1108	Empowering Graph Representation Learning with Paired Training and Graph Co-Attention
1109	Learning representations for binary-classification without backpropagation
1110	Deep unsupervised feature selection
1111	WaveFlow: A Compact Flow-based Model for Raw Audio
1112	Mathematical Reasoning in Latent Space
1113	Black Box Recursive Translations for Molecular Optimization
1114	Improved Generalization Bound of Permutation Invariant Deep Neural Networks
1115	Frequency-based Search-control in Dyna
1116	Off-policy Bandits with Deficient Support
1117	Implicit λ-Jeffreys Autoencoders: Taking the Best of Both Worlds
1118	Super-AND: A Holistic Approach to Unsupervised Embedding Learning
1119	FLUID FLOW MASS TRANSPORT FOR GENERATIVE NETWORKS
1120	Recognizing Plans by Learning Embeddings from Observed Action Distributions
1121	LEX-GAN: Layered Explainable Rumor Detector Based on Generative Adversarial Networks
1122	Towards Stable and Efficient Training of Verifiably Robust Neural Networks
1123	Multi-hop Question Answering via Reasoning Chains
1124	Factorized Multimodal Transformer for Multimodal Sequential Learning
1125	Learning in Confusion: Batch Active Learning with Noisy Oracle
1126	Iterative energy-based projection on a normal data manifold for anomaly localization
1127	Counting the Paths in Deep Neural Networks as a Performance Predictor
1128	Chart Auto-Encoders for Manifold Structured Data
1129	Optimizing Loss Landscape Connectivity via Neuron Alignment
1130	CROSS-DOMAIN CASCADED DEEP TRANSLATION
1131	V1Net: A computational model of cortical horizontal connections
1132	Distribution Matching Prototypical Network for Unsupervised Domain Adaptation
1133	Deep amortized clustering
1134	Using Objective Bayesian Methods to Determine the Optimal Degree of Curvature within the Loss Landscape
1135	Towards neural networks that provably know when they don't know
1136	BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning
1137	Fully Convolutional Graph Neural Networks using Bipartite Graph Convolutions
1138	Inductive representation learning on temporal graphs
1139	Attention on Abstract Visual Reasoning
1140	Starfire: Regularization-Free Adversarially-Robust Structured Sparse Training
1141	Convolutional Tensor-Train LSTM for Long-Term Video Prediction
1142	An Information Theoretic Approach to Distributed Representation Learning
1143	PatchVAE: Learning Local Latent Codes for Recognition
1144	A Probabilistic Formulation of Unsupervised Text Style Transfer
1145	ROBUST GENERATIVE ADVERSARIAL NETWORK
1146	Feature Map Transform Coding for Energy-Efficient CNN Inference
1147	Generative Models for Effective ML on Private, Decentralized Datasets
1148	Learning from Partially-Observed Multimodal Data with Variational Autoencoders
1149	A SIMPLE AND EFFECTIVE FRAMEWORK FOR PAIRWISE DEEP METRIC LEARNING
1150	A Group-Theoretic Framework for Knowledge Graph Embedding
1151	A⋆MCTS: SEARCH WITH THEORETICAL GUARANTEE USING POLICY AND VALUE FUNCTIONS
1152	Picking Winning Tickets Before Training by Preserving Gradient Flow
1153	Exploring Cellular Protein Localization Through Semantic Image Synthesis
1154	Learning Calibratable Policies using Programmatic Style-Consistency
1155	Contextual Temperature for Language Modeling
1156	Retrospection: Leveraging the Past for Efficient Training of Deep Neural Networks
1157	Curriculum Loss: Robust Learning and Generalization against Label Corruption
1158	Discrete Transformer
1159	Adversarially Robust Generalization Just Requires More Unlabeled Data
1160	Why Does the VQA Model Answer No?: Improving Reasoning through Visual and Linguistic Inference
1161	DeepSFM: Structure From Motion Via Deep Bundle Adjustment
1162	IsoNN: Isomorphic Neural Network for Graph Representation Learning and Classification
1163	Uncertainty-guided Continual Learning with Bayesian Neural Networks
1164	Spline Templated Based Handwriting Generation
1165	On Empirical Comparisons of Optimizers for Deep Learning
1166	On Evaluating Explainability Algorithms
1167	Deep Hierarchical-Hyperspherical Learning (DH^2L)
1168	Versatile Anomaly Detection with Outlier Preserving Distribution Mapping Autoencoders
1169	Ladder Polynomial Neural Networks
1170	Training Recurrent Neural Networks Online by Learning Explicit State Variables
1171	How fine can fine-tuning be? Learning efficient language models
1172	Improved Modeling of Complex Systems Using Hybrid Physics/Machine Learning/Stochastic Models
1173	LEARNING TO LEARN WITH BETTER CONVERGENCE
1174	Deep Expectation-Maximization in Hidden Markov Models via Simultaneous Perturbation Stochastic Approximation
1175	Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework
1176	Compositional Visual Generation with Energy Based Models
1177	Learning Sparsity and Quantization Jointly and Automatically for Neural Network Compression via Constrained Optimization
1178	Hierarchical Bayes Autoencoders
1179	Wyner VAE: A Variational Autoencoder with Succinct Common Representation Learning
1180	Granger Causal Structure Reconstruction from Heterogeneous Multivariate Time Series
1181	CGT: Clustered Graph Transformer for Urban Spatio-temporal Prediction
1182	Robust Reinforcement Learning for Continuous Control with Model Misspecification
1183	Decoupling Representation and Classifier for Long-Tailed Recognition
1184	SDGM: Sparse Bayesian Classifier Based on a Discriminative Gaussian Mixture Model
1185	Which Tasks Should Be Learned Together in Multi-task Learning?
1186	COMBINED FLEXIBLE ACTIVATION FUNCTIONS FOR DEEP NEURAL NETWORKS
1187	Empirical observations pertaining to learned priors for deep latent variable models
1188	MetaPoison: Learning to craft adversarial poisoning examples via meta-learning
1189	Teacher-Student Compression with Generative Adversarial Networks
1190	Visual Hide and Seek
1191	Unsupervised Temperature Scaling: Robust Post-processing Calibration for Domain Shift
1192	Pareto Optimality in No-Harm Fairness
1193	Domain Adaptation Through Label Propagation: Learning Clustered and Aligned Features
1194	Visual Representation Learning with 3D View-Contrastive Inverse Graphics Networks
1195	Dream to Control: Learning Behaviors by Latent Imagination
1196	From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech
1197	Active Learning Graph Neural Networks via Node Feature Propagation
1198	Real or Not Real, that is the Question
1199	Deep Reinforcement Learning with Implicit Human Feedback
1200	Multi-Sample Dropout for Accelerated Training and Better Generalization
1201	MelNet: A Generative Model for Audio in the Frequency Domain
1202	Semi-Supervised Semantic Dependency Parsing Using CRF Autoencoders
1203	Image Classification Through Top-Down Image Pyramid Traversal
1204	Cross Domain Imitation Learning
1205	FAST LEARNING VIA EPISODIC MEMORY: A PERSPECTIVE FROM ANIMAL DECISION-MAKING
1206	DCTD: Deep Conditional Target Densities for Accurate Regression
1207	Blending Diverse Physical Priors with Neural Networks
1208	VISUALIZING POINT CLOUD CLASSIFIERS BY MORPHING POINT CLOUDS INTO POTATOES
1209	Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach
1210	Posterior Control of Blackbox Generation
1211	A closer look at network resolution for efficient network design
1212	Efficient Systolic Array Based on Decomposable MAC for Quantized Deep Neural Networks
1213	Improved Image Augmentation for Convolutional Neural Networks by Copyout and CopyPairing
1214	On the Evaluation of Conditional GANs
1215	JAUNE: Justified And Unified Neural language Evaluation
1216	Classification as Decoder: Trading Flexibility for Control in Multi Domain Dialogue
1217	Statistical Adaptive Stochastic Optimization
1218	Scalable Neural Learning for Verifiable Consistency with Temporal Specifications
1219	Model Comparison of Beer data classification using an electronic nose
1220	Non-linear System Identification from Partial Observations via Iterative Smoothing and Learning
1221	Evaluating Lossy Compression Rates of Deep Generative Models
1222	LambdaNet: Probabilistic Type Inference using Graph Neural Networks
1223	Variational Autoencoders with Normalizing Flow Decoders
1224	Model-Augmented Actor-Critic: Backpropagating through Paths
1225	Metagross: Meta Gated Recursive Controller Units for Sequence Modeling
1226	Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension
1227	Variational Autoencoders for Highly Multivariate Spatial Point Processes Intensities
1228	Stochastic Mirror Descent on Overparameterized Nonlinear Models
1229	Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
1230	Recurrent Chunking Mechanisms for Conversational Machine Reading Comprehension
1231	Frequency Analysis for Graph Convolution Network
1232	Network Deconvolution
1233	Revisiting Self-Training for Neural Sequence Generation
1234	Generative Cleaning Networks with Quantized Nonlinear Transform for Deep Neural Network Defense
1235	Mutual Exclusivity as a Challenge for Deep Neural Networks
1236	Meta-Q-Learning
1237	CURSOR-BASED ADAPTIVE QUANTIZATION FOR DEEP NEURAL NETWORK
1238	Natural Image Manipulation for Autoregressive Models Using Fisher Scores
1239	Unifying Part Detection And Association For Multi-person Pose Estimation
1240	Towards a Deep Network Architecture for Structured Smoothness
1241	A novel text representation which enables image classifiers to perform text classification
1242	On the Global Convergence of Training Deep Linear ResNets
1243	A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
1244	Perceptual Generative Autoencoders
1245	Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning
1246	JAX MD: End-to-End Differentiable, Hardware Accelerated, Molecular Dynamics in Pure Python
1247	Deflecting Adversarial Attacks
1248	Biologically inspired sleep algorithm for increased generalization and adversarial robustness in deep neural networks
1249	MUSE: Multi-Scale Attention Model for Sequence to Sequence Learning
1250	Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication
1251	Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
1252	Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks
1253	Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks
1254	Intriguing Properties of Adversarial Training at Scale
1255	Point Process Flows
1256	Cover Filtration and Stable Paths in the Mapper
1257	Fully Polynomial-Time Randomized Approximation Schemes for Global Optimization of High-Dimensional Folded Concave Penalized Generalized Linear Models
1258	Learning Neural Surrogate Model for Warm-Starting Bayesian Optimization
1259	Scalable Differentially Private Data Generation via Private Aggregation of Teacher Ensembles
1260	Knowledge Graph Embedding: A Probabilistic Perspective and Generalization Bounds
1261	Stabilizing Neural ODE Networks with Stochasticity
1262	Adversarial Partial Multi-label Learning
1263	Adversarial Interpolation Training: A Simple Approach for Improving Model Robustness
1264	Agent as Scientist: Learning to Verify Hypotheses
1265	CRNet: Image Super-Resolution Using A Convolutional Sparse Coding Inspired Network
1266	Deep Double Descent: Where Bigger Models and More Data Hurt
1267	Multigrid Neural Memory
1268	ASGen: Answer-containing Sentence Generation to Pre-Train Question Generator for Scale-up Data in Question Answering
1269	Distribution-Guided Local Explanation for Black-Box Classifiers
1270	Decoding As Dynamic Programming For Recurrent Autoregressive Models
1271	Compressed Sensing with Deep Image Prior and Learned Regularization
1272	Gradient Surgery for Multi-Task Learning
1273	SINGLE PATH ONE-SHOT NEURAL ARCHITECTURE SEARCH WITH UNIFORM SAMPLING
1274	Synthesizing Programmatic Policies that Inductively Generalize
1275	Transformer-XH: Multi-hop question answering with eXtra Hop attention
1276	Variational Hyper RNN for Sequence Modeling
1277	Generalization through Memorization: Nearest Neighbor Language Models
1278	Comparing Fine-tuning and Rewinding in Neural Network Pruning
1279	Simple is Better: Training an End-to-end Contract Bridge Bidding Agent without Human Knowledge
1280	The Sooner The Better: Investigating Structure of Early Winning Lottery Tickets
1281	Long History Short-Term Memory for Long-Term Video Prediction
1282	Adversarial training with perturbation generator networks
1283	Single episode transfer for differing environmental dynamics in reinforcement learning
1284	Inducing Stronger Object Representations in Deep Visual Trackers
1285	TOWARDS STABILIZING BATCH STATISTICS IN BACKWARD PROPAGATION OF BATCH NORMALIZATION
1286	STABILITY AND CONVERGENCE THEORY FOR LEARNING RESNET: A FULL CHARACTERIZATION
1287	Training Deep Neural Networks with Partially Adaptive Momentum
1288	NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension
1289	Learning Latent Representations for Inverse Dynamics using Generalized Experiences
1290	Learning The Difference That Makes A Difference With Counterfactually-Augmented Data
1291	Differentiable Architecture Compression
1292	The Early Phase of Neural Network Training
1293	Chordal-GCN: Exploiting sparsity in training large-scale graph convolutional networks
1294	On The Difficulty of Warm-Starting Neural Network Training
1295	NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks
1296	Distilled embedding: non-linear embedding factorization using knowledge distillation
1297	Incremental RNN: A Dynamical View.
1298	Domain-Relevant Embeddings for Question Similarity
1299	Actor-Critic Approach for Temporal Predictive Clustering
1300	Adversarial Privacy Preservation under Attribute Inference Attack
1301	Behavior-Guided Reinforcement Learning
1302	Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
1303	Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
1304	Extreme Tensoring for Low-Memory Preconditioning
1305	Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
1306	Collapsed amortized variational inference for switching nonlinear dynamical systems
1307	Non-Autoregressive Dialog State Tracking
1308	Channel Equilibrium Networks
1309	Independence-aware Advantage Estimation
1310	Bayesian Meta Sampling for Fast Uncertainty Adaptation
1311	Salient Explanation for Fine-grained Classification
1312	SIMULTANEOUS ATTRIBUTED NETWORK EMBEDDING AND CLUSTERING
1313	Stochastic Gradient Methods with Block Diagonal Matrix Adaptation
1314	Harnessing Structures for Value-Based Planning and Reinforcement Learning
1315	The Dynamics of Signal Propagation in Gated Recurrent Neural Networks
1316	Economy Statistical Recurrent Units For Inferring Nonlinear Granger Causality
1317	Discriminability Distillation in Group Representation Learning
1318	Calibration, Entropy Rates, and Memory in Language Models
1319	Rethinking Generalized Matrix Factorization for Recommendation: The Importance of Multi-hot Encoding
1320	Efficient Saliency Maps for Explainable AI
1321	Reinforcement Learning with Probabilistically Complete Exploration
1322	Unaligned Image-to-Sequence Transformation with Loop Consistency
1323	Learning to Generate 3D Training Data through Hybrid Gradient
1324	Removing the Representation Error of GAN Image Priors Using the Deep Decoder
1325	MEMO: A Deep Network for Flexible Combination of Episodic Memories
1326	Superbloom: Bloom filter meets Transformer
1327	Longitudinal Enrichment of Imaging Biomarker Representations for Improved Alzheimer's Disease Diagnosis
1328	Probabilistic Connection Importance Inference and Lossless Compression of Deep Neural Networks
1329	Generating Semantic Adversarial Examples with Differentiable Rendering
1330	Guided variational autoencoder for disentanglement learning
1331	ManiGAN: Text-Guided Image Manipulation
1332	Quantum algorithm for finding the negative curvature direction
1333	Dual-module Inference for Efficient Recurrent Neural Networks
1334	GUIDEGAN: ATTENTION BASED SPATIAL GUIDANCE FOR IMAGE-TO-IMAGE TRANSLATION
1335	MixUp as Directional Adversarial Training
1336	Towards Interpretable Molecular Graph Representation Learning
1337	Representation Learning Through Latent Canonicalizations
1338	Winning Privately: The Differentially Private Lottery Ticket Mechanism
1339	Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization
1340	WHAT ILLNESS OF LANDSCAPE CAN OVER-PARAMETERIZATION ALONE CURE?
1341	Correctness Verification of Neural Network
1342	Generalizing Natural Language Analysis through Span-relation Representations
1343	Jelly Bean World: A Testbed for Never-Ending Learning
1344	Characterizing convolutional neural networks with one-pixel signature
1345	A Deep Dive into Count-Min Sketch for Extreme Classification in Logarithmic Memory
1346	Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs
1347	Learning from Explanations with Neural Module Execution Tree
1348	A Coordinate-Free Construction of Scalable Natural Gradient
1349	Discovering Motor Programs by Recomposing Demonstrations
1350	How Aggressive Can Adversarial Attacks Be: Learning Ordered Top-k Attacks
1351	Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier
1352	Convergence Behaviour of Some Gradient-Based Methods on Bilinear Zero-Sum Games
1353	Aging Memories Generate More Fluent Dialogue Responses with Memory Networks
1354	DSReg: Using Distant Supervision as a Regularizer
1355	Iterative Target Augmentation for Effective Conditional Generation
1356	Composing Task-Agnostic Policies with Deep Reinforcement Learning
1357	The Local Elasticity of Neural Networks
1358	Gradient-Based Neural DAG Learning
1359	On Concept-Based Explanations in Deep Neural Networks
1360	Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
1361	Learning to Control Latent Representations for Few-Shot Learning of Named Entities
1362	Amortized Nesterov's Momentum: Robust and Lightweight Momentum for Deep Learning
1363	Recurrent Event Network: Global Structure Inference Over Temporal Knowledge Graph
1364	Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data
1365	Composition-based Multi-Relational Graph Convolutional Networks
1366	Capsules with Inverted Dot-Product Attention Routing
1367	The Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions
1368	Insights on Visual Representations for Embodied Navigation Tasks
1369	Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos
1370	On the Unintended Social Bias of Training Language Generation Models with News Articles
1371	Role-Wise Data Augmentation for Knowledge Distillation
1372	Learning Classifier Synthesis for Generalized Few-Shot Learning
1373	Attention Forcing for Sequence-to-sequence Model Training
1374	Topic Models with Survival Supervision: Archetypal Analysis and Neural Approaches
1375	FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
1376	On Need for Topology-Aware Generative Models for Manifold-Based Defenses
1377	Neural Execution of Graph Algorithms
1378	Objective Mismatch in Model-based Reinforcement Learning
1379	Molecular Graph Enhanced Transformer for Retrosynthesis Prediction
1380	Non-Sequential Melody Generation
1381	Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
1382	Visual Explanation for Deep Metric Learning
1383	Deep Innovation Protection
1384	Alternating Recurrent Dialog Model with Large-Scale Pre-Trained Language Models
1385	BERTScore: Evaluating Text Generation with BERT
1386	Octave Graph Convolutional Network
1387	Learning from Imperfect Annotations: An End-to-End Approach
1388	Zeroth Order Optimization by a Mixture of Evolution Strategies
1389	Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History
1390	Machine Truth Serum
1391	Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control
1392	GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding
1393	Sensible adversarial learning
1394	Attention Interpretability Across NLP Tasks
1395	Neuron ranking - an informed way to compress convolutional neural networks
1396	MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees
1397	AdaScale SGD: A Scale-Invariant Algorithm for Distributed Training
1398	INTERNAL-CONSISTENCY CONSTRAINTS FOR EMERGENT COMMUNICATION
1399	Bio-Inspired Hashing for Unsupervised Similarity Search
1400	Simplicial Complex Networks
1401	BEYOND SUPERVISED LEARNING: RECOGNIZING UNSEEN ATTRIBUTE-OBJECT PAIRS WITH VISION-LANGUAGE FUSION AND ATTRACTOR NETWORKS
1402	Underwhelming Generalization Improvements From Controlling Feature Attribution
1403	Graph Constrained Reinforcement Learning for Natural Language Action Spaces
1404	Solving Packing Problems by Conditional Query Learning
1405	Task-Relevant Adversarial Imitation Learning
1406	Generative Restricted Kernel Machines
1407	Towards Fast Adaptation of Neural Architectures with Meta Learning
1408	RL-ST: Reinforcing Style, Fluency and Content Preservation for Unsupervised Text Style Transfer
1409	A Functional Characterization of Randomly Initialized Gradient Descent in Deep ReLU Networks
1410	Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
1411	Toward Understanding Generalization of Over-parameterized Deep ReLU network trained with SGD in Student-teacher Setting
1412	Asymptotics of Wide Networks from Feynman Diagrams
1413	Symplectic Recurrent Neural Networks
1414	Representational Disentanglement for Multi-Domain Image Completion
1415	Generalized Bayesian Posterior Expectation Distillation for Deep Neural Networks
1416	Learning Cross-Context Entity Representations from Text
1417	SPECTRA: Sparse Entity-centric Transitions
1418	DeepSimplex: Reinforcement Learning of Pivot Rules Improves the Efficiency of Simplex Algorithm in Solving Linear Programming Problems
1419	Learning Temporal Abstraction with Information-theoretic Constraints for Hierarchical Reinforcement Learning
1420	Selective Brain Damage: Measuring the Disparate Impact of Model Pruning
1421	Asynchronous Stochastic Subgradient Methods for General Nonsmooth Nonconvex Optimization
1422	Improved Structural Discovery and Representation Learning of Multi-Agent Data
1423	Quantized Reinforcement Learning (QuaRL)
1424	R-TRANSFORMER: RECURRENT NEURAL NETWORK ENHANCED TRANSFORMER
1425	NADS: Neural Architecture Distribution Search for Uncertainty Awareness
1426	Rigging the Lottery: Making All Tickets Winners
1427	CAPACITY-LIMITED REINFORCEMENT LEARNING: APPLICATIONS IN DEEP ACTOR-CRITIC METHODS FOR CONTINUOUS CONTROL
1428	Discovering the compositional structure of vector representations with Role Learning Networks
1429	Higher-Order Function Networks for Learning Composable 3D Object Representations
1430	Adapting to Label Shift with Bias-Corrected Calibration
1431	Neural Module Networks for Reasoning over Text
1432	Strong Baseline Defenses Against Clean-Label Poisoning Attacks
1433	MANIFOLD FORESTS: CLOSING THE GAP ON NEURAL NETWORKS
1434	Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees
1435	Improved memory in recurrent neural networks with sequential non-normal dynamics
1436	Model Imitation for Model-Based Reinforcement Learning
1437	Embodied Language Grounding with Implicit 3D Visual Feature Representations
1438	Likelihood Contribution based Multi-scale Architecture for Generative Flows
1439	A Base Model Selection Methodology for Efficient Fine-Tuning
1440	Rethinking Curriculum Learning With Incremental Labels And Adaptive Compensation
1441	Graph Neural Networks for Reasoning 2-Quantified Boolean Formulas
1442	Learn to Explain Efficiently via Neural Logic Inductive Learning
1443	NormLime: A New Feature Importance Metric for Explaining Deep Neural Networks
1444	Pre-trained Contextual Embedding of Source Code
1445	Certified Robustness to Adversarial Label-Flipping Attacks via Randomized Smoothing
1446	Benefit of Interpolation in Nearest Neighbor Algorithms
1447	{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
1448	Neural Clustering Processes
1449	Improving Neural Language Generation with Spectrum Control
1450	Span Recovery for Deep Neural Networks with Applications to Input Obfuscation
1451	Unknown-Aware Deep Neural Network
1452	MODELLING BIOLOGICAL ASSAYS WITH ADAPTIVE DEEP KERNEL LEARNING
1453	A Memory-augmented Neural Network by Resembling Human Cognitive Process of Memorization
1454	A Perturbation Analysis of Input Transformations for Adversarial Attacks
1455	ADA+: A GENERIC FRAMEWORK WITH MORE ADAPTIVE EXPLICIT ADJUSTMENT FOR LEARNING RATE
1456	Locally Constant Networks
1457	Smooth Kernels Improve Adversarial Robustness and Perceptually-Aligned Gradients
1458	Multi-View Summarization and Activity Recognition Meet Edge Computing in IoT Environments
1459	Neural ODEs for Image Segmentation with Level Sets
1460	Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
1461	PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction
1462	Low Rank Training of Deep Neural Networks for Emerging Memory Technology
1463	Decentralized Distributed PPO: Mastering PointGoal Navigation
1464	MultiGrain: a unified image embedding for classes and instances
1465	Learning to Learn by Zeroth-Order Oracle
1466	Neural Embeddings for Nearest Neighbor Search Under Edit Distance
1467	ADAPTING PRETRAINED LANGUAGE MODELS FOR LONG DOCUMENT CLASSIFICATION
1468	Robust Federated Learning Through Representation Matching and Adaptive Hyper-parameters
1469	ROS-HPL: Robotic Object Search with Hierarchical Policy Learning and Intrinsic-Extrinsic Modeling
1470	Knockoff-Inspired Feature Selection via Generative Models
1471	MetaPix: Few-Shot Video Retargeting
1472	SloMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
1473	Stochastic Prototype Embeddings
1474	Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog
1475	Generalized Transformation-based Gradient
1476	Targeted sampling of enlarged neighborhood via Monte Carlo tree search for TSP
1477	Black-box Adversarial Attacks with Bayesian Optimization
1478	Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
1479	Learning to Combat Compounding-Error in Model-Based Reinforcement Learning
1480	Understanding Attention Mechanisms
1481	Beyond GANs: Transforming without a Target Distribution
1482	Four Things Everyone Should Know to Improve Batch Normalization
1483	Learning to solve the credit assignment problem
1484	Improving Multi-Manifold GANs with a Learned Noise Prior
1485	Overparameterized Neural Networks Can Implement Associative Memory
1486	Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts
1487	Sampling-Free Learning of Bayesian Quantized Neural Networks
1488	A Hierarchy of Graph Neural Networks Based on Learnable Local Features
1489	The Blessing of Dimensionality: An Empirical Study of Generalization
1490	DeFINE: Deep Factorized Input Word Embeddings for Neural Sequence Modeling
1491	NEURAL EXECUTION ENGINES
1492	Learning to Make Generalizable and Diverse Predictions for Retrosynthesis
1493	Disentangled GANs for Controllable Generation of High-Resolution Images
1494	Continuous Graph Flow
1495	Benchmarking Adversarial Robustness
1496	ROBUST SINGLE-STEP ADVERSARIAL TRAINING
1497	Wasserstein-Bounded Generative Adversarial Networks
1498	DBA: Distributed Backdoor Attacks against Federated Learning
1499	Learning Generative Models using Denoising Density Estimators
1500	Fast is better than free: Revisiting adversarial training
1501	LOSSLESS SINGLE IMAGE SUPER RESOLUTION FROM LOW-QUALITY JPG IMAGES
1502	Improving Neural Abstractive Summarization Using Transfer Learning and Factuality-Based Evaluation: Towards Automating Science Journalism
1503	Deep Multivariate Mixture of Gaussians for Object Detection under Occlusion
1504	iWGAN: an Autoencoder WGAN for Inference
1505	BERT-AL: BERT for Arbitrarily Long Document Understanding
1506	Novelty Search in representational space for sample efficient exploration
1507	Switched linear projections and inactive state sensitivity for deep neural network interpretability
1508	An Optimization Principle Of Deep Learning?
1509	Testing Robustness Against Unforeseen Adversaries
1510	Thieves on Sesame Street! Model Extraction of BERT-based APIs
1511	Understanding Knowledge Distillation in Non-autoregressive Machine Translation
1512	Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
1513	Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
1514	Locality and Compositionality in Zero-Shot Learning
1515	Optimistic Adaptive Acceleration for Optimization
1516	Situating Sentence Embedders with Nearest Neighbor Overlap
1517	Posterior Sampling: Make Reinforcement Learning Sample Efficient Again
1518	Generalized Clustering by Learning to Optimize Expected Normalized Cuts
1519	Mix-review: Alleviate Forgetting in the Pretrain-Finetune Framework for Neural Language Generation Models
1520	The function of contextual illusions
1521	Disentangling neural mechanisms for perceptual grouping
1522	Adversarial Imitation Attack
1523	Regularizing Trajectories to Mitigate Catastrophic Forgetting
1524	When Do Variational Autoencoders Know What They Don't Know?
1525	Semantic Pruning for Single Class Interpretability
1526	Analyzing the Role of Model Uncertainty for Electronic Health Records
1527	Chameleon: Adaptive Code Optimization For Expedited Deep Neural Network Compilation
1528	Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
1529	Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders
1530	Not All Features Are Equal: Feature Leveling Deep Neural Networks for Better Interpretation
1531	Intrinsic Motivation for Encouraging Synergistic Behavior
1532	Noisy Machines: Understanding noisy neural networks and enhancing robustness to analog hardware errors using distillation
1533	Perceptual Regularization: Visualizing and Learning Generalizable Representations
1534	Neural networks with motivation
1535	Improving One-Shot NAS By Suppressing The Posterior Fading
1536	Toward Amortized Ranking-Critical Training For Collaborative Filtering
1537	ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
1538	Curriculum Learning for Deep Generative Models with Clustering
1539	Should All Cross-Lingual Embeddings Speak English?
1540	Sign-OPT: A Query-Efficient Hard-label Adversarial Attack
1541	Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
1542	Learning Space Partitions for Nearest Neighbor Search
1543	Visual Interpretability Alone Helps Adversarial Robustness
1544	One-Shot Neural Architecture Search via Compressive Sensing
1545	Learning Adversarial Grammars for Future Prediction
1546	End-to-end named entity recognition and relation extraction using pre-trained language models
1547	How noise affects the Hessian spectrum in overparameterized neural networks
1548	A Simple Recurrent Unit with Reduced Tensor Product Representations
1549	Parallel Neural Text-to-Speech
1550	Context-Aware Object Detection With Convolutional Neural Networks
1551	DeepV2D: Video to Depth with Differentiable Structure from Motion
1552	TPO: TREE SEARCH POLICY OPTIMIZATION FOR CONTINUOUS ACTION SPACES
1553	Gaussian Process Meta-Representations Of Neural Networks
1554	CAN ALTQ LEARN FASTER: EXPERIMENTS AND THEORY
1555	The Break-Even Point on the Optimization Trajectories of Deep Neural Networks
1556	Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
1557	Exploration Based Language Learning for Text-Based Games
1558	Robust And Interpretable Blind Image Denoising Via Bias-Free Convolutional Neural Networks
1559	CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
1560	Deep Imitative Models for Flexible Inference, Planning, and Control
1561	Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness
1562	Defensive Quantization Layer For Convolutional Network Against Adversarial Attack
1563	Defective Convolutional Layers Learn Robust CNNs
1564	DASGrad: Double Adaptive Stochastic Gradient
1565	Finding Mixed Strategy Nash Equilibrium for Continuous Games through Deep Learning
1566	The Logical Expressiveness of Graph Neural Networks
1567	GOING BEYOND TOKEN-LEVEL PRE-TRAINING FOR EMBEDDING-BASED LARGE-SCALE RETRIEVAL
1568	Conditional Out-of-Sample Generation For Unpaired Data using trVAE
1569	The Benefits of Over-parameterization at Initialization in Deep ReLU Networks
1570	UniLoss: Unified Surrogate Loss by Adaptive Interpolation
1571	A Training Scheme for the Uncertain Neuromorphic Computing Chips
1572	Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently
1573	Deep Graph Translation
1574	Are Transformers universal approximators of sequence-to-sequence functions?
1575	Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
1576	Decoupling Weight Regularization from Batch Size for Model Compression
1577	Zero-Shot Out-of-Distribution Detection with Feature Correlations
1578	Proactive Sequence Generator via Knowledge Acquisition
1579	Interpretable Deep Neural Network Models: Hybrid of Image Kernels and Neural Networks
1580	Multi-scale Attributed Node Embedding
1581	$\textrm{D}^2$GAN: A Few-Shot Learning Approach with Diverse and Discriminative Feature Synthesis
1582	Understanding the functional and structural differences across excitatory and inhibitory neurons
1583	One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
1584	Differentially Private Meta-Learning
1585	Leveraging Adversarial Examples to Obtain Robust Second-Order Representations
1586	CLEVRER: Collision Events for Video Representation and Reasoning
1587	Using Logical Specifications of Objectives in Multi-Objective Reinforcement Learning
1588	Efficient Training of Robust and Verifiable Neural Networks
1589	Learning Compositional Koopman Operators for Model-Based Control
1590	Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
1591	Confidence-Calibrated Adversarial Training: Towards Robust Models Generalizing Beyond the Attack Used During Training
1592	All SMILES Variational Autoencoder for Molecular Property Prediction and Optimization
1593	Generating Dialogue Responses From A Semantic Latent Space
1594	Is There Mode Collapse? A Case Study on Face Generation and Its Black-box Calibration
1595	Overlearning Reveals Sensitive Attributes
1596	Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks
1597	A Kolmogorov Complexity Approach to Generalization in Deep Learning
1598	Towards Modular Algorithm Induction
1599	Optimal Strategies Against Generative Attacks
1600	One Generation Knowledge Distillation by Utilizing Peer Samples
1601	Stein Self-Repulsive Dynamics: Benefits from Past Samples
1602	Adversarially robust transfer learning
1603	One Demonstration Imitation Learning
1604	Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
1605	Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
1606	Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time
1607	Contextual Text Style Transfer
1608	Modeling question asking using neural program generation
1609	Learning to Link
1610	Adversarial Attacks on Copyright Detection Systems
1611	Detecting Extrapolation with Local Ensembles
1612	Revisiting Fine-tuning for Few-shot Learning
1613	Global Relational Models of Source Code
1614	MONET: Debiasing Graph Embeddings via the Metadata-Orthogonal Training Unit
1615	Selection via Proxy: Efficient Data Selection for Deep Learning
1616	Deep Learning-Based Average Consensus
1617	Meta Learning via Learned Loss
1618	Short and Sparse Deconvolution --- A Geometric Approach
1619	If MaxEnt RL is the Answer, What is the Question?
1620	Stochastic Weight Averaging in Parallel: Large-Batch Training That Generalizes Well
1621	Characterizing Missing Information in Deep Networks Using Backpropagated Gradients
1622	INVOCMAP: MAPPING METHOD NAMES TO METHOD INVOCATIONS VIA MACHINE LEARNING
1623	Scaleable input gradient regularization for adversarial robustness
1624	Adjustable Real-time Style Transfer
1625	Unsupervised Progressive Learning and the STAM Architecture
1626	Wasserstein Robust Reinforcement Learning
1627	Knowledge Hypergraphs: Prediction Beyond Binary Relations
1628	Dynamics-Aware Unsupervised Skill Discovery
1629	A Fine-Grained Spectral Perspective on Neural Networks
1630	Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
1631	UNPAIRED POINT CLOUD COMPLETION ON REAL SCANS USING ADVERSARIAL TRAINING
1632	Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform
1633	DIME: AN INFORMATION-THEORETIC DIFFICULTY MEASURE FOR AI DATASETS
1634	Structured consistency loss for semi-supervised semantic segmentation
1635	AMRL: Aggregated Memory For Reinforcement Learning
1636	Adapting Behaviour for Learning Progress
1637	Pretraining boosts out-of-domain robustness for pose estimation
1638	GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning
1639	Synthetic vs Real: Deep Learning on Controlled Noise
1640	Detecting malicious PDF using CNN
1641	NESTED LEARNING FOR MULTI-GRANULAR TASKS
1642	Scalable Model Compression by Entropy Penalized Reparameterization
1643	Stochastic Geodesic Optimization for Neural Networks
1644	Dynamic Time Lag Regression: Predicting What & When
1645	Scholastic-Actor-Critic For Multi Agent Reinforcement Learning
1646	On summarized validation curves and generalization
1647	Convolutional Bipartite Attractor Networks
1648	Anomaly Detection by Deep Direct Density Ratio Estimation
1649	New Loss Functions for Fast Maximum Inner Product Search
1650	Lipschitz Lifelong Reinforcement Learning
1651	Local Label Propagation for Large-Scale Semi-Supervised Learning
1652	GumbelClip: Off-Policy Actor-Critic Using Experience Replay
1653	Going Deeper with Lean Point Networks
1654	Improved Mutual Information Estimation
1655	Semi-Supervised Generative Modeling for Controllable Speech Synthesis
1656	Towards Physics-informed Deep Learning for Turbulent Flow Prediction
1657	Unsupervised Learning from Video with Deep Neural Embeddings
1658	Neural Text Generation With Unlikelihood Training
1659	Pure and Spurious Critical Points: a Geometric Study of Linear Networks
1660	Surrogate-Based Constrained Langevin Sampling With Applications to Optimal Material Configuration Design
1661	Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning
1662	Mean Field Models for Neural Networks in Teacher-student Setting
1663	A Causal View on Robustness of Neural Networks
1664	Striving for Simplicity in Off-Policy Deep Reinforcement Learning
1665	White Box Network: Obtaining a right composition ordering of functions
1666	Deep neuroethology of a virtual rodent
1667	DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
1668	Emergence of functional and structural properties of the head direction system by optimization of recurrent neural networks
1669	Causal Induction from Visual Observations for Goal Directed Tasks
1670	Duration-of-Stay Storage Assignment under Uncertainty
1671	CAQL: Continuous Action Q-Learning
1672	GRAPH ANALYSIS AND GRAPH POOLING IN THE SPATIAL DOMAIN
1673	Your classifier is secretly an energy based model and you should treat it like one
1674	On the Linguistic Capacity of Real-time Counter Automata
1675	Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels
1676	Adaptive Structural Fingerprints for Graph Attention Networks
1677	Inductive Matrix Completion Based on Graph Neural Networks
1678	Neural Operator Search
1679	Time2Vec: Learning a Vector Representation of Time
1680	ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring
1681	Conditional Learning of Fair Representations
1682	Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks
1683	TabNet: Attentive Interpretable Tabular Learning
1684	Adapt-to-Learn: Policy Transfer in Reinforcement Learning
1685	Identity Crisis: Memorization and Generalization Under Extreme Overparameterization
1686	Stiffness: A New Perspective on Generalization in Neural Networks
1687	Linguistic Embeddings as a Common-Sense Knowledge Repository: Challenges and Opportunities
1688	First-Order Preconditioning via Hypergradient Descent
1689	Feature Partitioning for Efficient Multi-Task Architectures
1690	Layer Flexible Adaptive Computation Time for Recurrent Neural Networks
1691	Curvature-based Robustness Certificates against Adversarial Examples
1692	Adversarial Video Generation on Complex Datasets
1693	Topological Autoencoders
1694	Context-Gated Convolution
1695	Reinforcement Learning without Ground-Truth State
1696	Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin
1697	In-Domain Representation Learning For Remote Sensing
1698	Training Neural Networks for and by Interpolation
1699	FAN: Focused Attention Networks
1700	Unsupervised Data Augmentation for Consistency Training
1701	Assessing Generalization in TD methods for Deep Reinforcement Learning
1702	Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
1703	Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
1704	The Effect of Neural Net Architecture on Gradient Confusion & Training Performance
1705	Making DenseNet Interpretable: A Case Study in Clinical Radiology
1706	Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space
1707	Regularizing Deep Multi-Task Networks using Orthogonal Gradients
1708	Fast Training of Sparse Graph Neural Networks on Dense Hardware
1709	Simultaneous Classification and Out-of-Distribution Detection Using Deep Neural Networks
1710	Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
1711	Long-term planning, short-term adjustments
1712	Imitation Learning via Off-Policy Distribution Matching
1713	Unsupervised Learning of Automotive 3D Crash Simulations using LSTMs
1714	Augmenting Transformers with KNN-Based Composite Memory
1715	SGD with Hardness Weighted Sampling for Distributionally Robust Deep Learning
1716	Constrained Markov Decision Processes via Backward Value Functions
1717	Reanalysis of Variance Reduced Temporal Difference Learning
1718	Meta-Learning for Variational Inference
1719	CONFEDERATED MACHINE LEARNING ON HORIZONTALLY AND VERTICALLY SEPARATED MEDICAL DATA FOR LARGE-SCALE HEALTH SYSTEM INTELLIGENCE
1720	Defending Against Adversarial Examples by Regularized Deep Embedding
1721	Minimizing FLOPs to Learn Efficient Sparse Representations
1722	Neural-Guided Symbolic Regression with Asymptotic Constraints
1723	Policy Optimization In the Face of Uncertainty
1724	DropGrad: Gradient Dropout Regularization for Meta-Learning
1725	Understanding Top-k Sparsification in Distributed Deep Learning
1726	Entropy Penalty: Towards Generalization Beyond the IID Assumption
1727	Improving Semantic Parsing with Neural Generator-Reranker Architecture
1728	Learning a Behavioral Repertoire from Demonstrations
1729	GRAPH NEIGHBORHOOD ATTENTIVE POOLING
1730	Deep symbolic regression
1731	Autoencoders and Generative Adversarial Networks for Imbalanced Sequence Classification
1732	Doubly Normalized Attention
1733	Uncertainty-Aware Prediction for Graph Neural Networks
1734	Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space
1735	Lattice Representation Learning
1736	Omnibus Dropout for Improving The Probabilistic Classification Outputs of ConvNets
1737	Deep Multiple Instance Learning for Taxonomic Classification of Metagenomic read sets
1738	Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
1739	RoBERTa: A Robustly Optimized BERT Pretraining Approach
1740	Deep Semi-Supervised Anomaly Detection
1741	GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
1742	Out-of-distribution Detection in Few-shot Classification
1743	Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification
1744	Mirror-Generative Neural Machine Translation
1745	Frustratingly easy quasi-multitask learning
1746	Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
1747	TrojanNet: Exposing the Danger of Trojan Horse Attack on Neural Networks
1748	Robust Learning with Jacobian Regularization
1749	Generalized Inner Loop Meta-Learning
1750	Sign Bits Are All You Need for Black-Box Attacks
1751	Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
1752	Pre-training as Batch Meta Reinforcement Learning with tiMe
1753	On Global Feature Pooling for Fine-grained Visual Categorization
1754	Exploring by Exploiting Bad Models in Model-Based Reinforcement Learning
1755	Reinforced active learning for image segmentation
1756	Variational inference of latent hierarchical dynamical systems in neuroscience: an application to calcium imaging data
1757	Neural Architecture Search by Learning Action Space for Monte Carlo Tree Search
1758	Gradientless Descent: High-Dimensional Zeroth-Order Optimization
1759	Equivariant Entity-Relationship Networks
1760	Modeling Fake News in Social Networks with Deep Multi-Agent Reinforcement Learning
1761	Unsupervised Few-shot Object Recognition by Integrating Adversarial, Self-supervision, and Deep Metric Learning of Latent Parts
1762	On the "steerability" of generative adversarial networks
1763	GASL: Guided Attention for Sparsity Learning in Deep Neural Networks
1764	Affine Self Convolution
1765	Improving Differentially Private Models with Active Learning
1766	Matrix Multilayer Perceptron
1767	BEAN: Interpretable Representation Learning with Biologically-Enhanced Artificial Neuronal Assembly Regularization
1768	Feature-Robustness, Flatness and Generalization Error for Deep Neural Networks
1769	TriMap: Large-scale Dimensionality Reduction Using Triplets
1770	LEARNED STEP SIZE QUANTIZATION
1771	Frontal low-rank random tensors for high-order feature representation
1772	Learning General and Reusable Features via Racecar-Training
1773	Higher-order Weighted Graph Convolutional Networks
1774	Estimating counterfactual treatment outcomes over time through adversarially balanced representations
1775	Poincaré Wasserstein Autoencoder
1776	Robust Instruction-Following in a Situated Agent via Transfer-Learning from Text
1777	Stochastic Conditional Generative Networks with Basis Decomposition
1778	Task-Based Top-Down Modulation Network for Multi-Task-Learning Applications
1779	Global reasoning network for image super-resolution
1780	Tensor Graph Convolutional Networks for Prediction on Dynamic Graphs
1781	Matching Distributions via Optimal Transport for Semi-Supervised Learning
1782	GraphNVP: an Invertible Flow-based Model for Generating Molecular Graphs
1783	Language GANs Falling Short
1784	GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations
1785	Last-iterate convergence rates for min-max optimization
1786	Poisoning Attacks with Generative Adversarial Nets
1787	Parameterized Action Reinforcement Learning for Inverted Index Match Plan Generation
1788	Learnable Group Transform For Time-Series
1789	From English to Foreign Languages: Transferring Pre-trained Language Models
1790	COPHY: Counterfactual Learning of Physical Dynamics
1791	Semi-Supervised Few-Shot Learning with Prototypical Random Walks
1792	Why Convolutional Networks Learn Oriented Bandpass Filters: A Hypothesis
1793	Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning
1794	Unsupervised Out-of-Distribution Detection with Batch Normalization
1795	Understanding the Limitations of Variational Mutual Information Estimators
1796	Latent Question Reformulation and Information Accumulation for Multi-Hop Machine Reading
1797	Hamiltonian Generative Networks
1798	Customizing Sequence Generation with Multi-Task Dynamical Systems
1799	Extracting and Leveraging Feature Interaction Interpretations
1800	Zero-Shot Medical Image Artifact Reduction
1801	Quantum Expectation-Maximization for Gaussian Mixture Models
1802	Behavior Regularized Offline Reinforcement Learning
1803	Encoder-Agnostic Adaptation for Conditional Language Generation
1804	Optimizing Data Usage via Differentiable Rewards
1805	Dropout: Explicit Forms and Capacity Control
1806	Training Interpretable Convolutional Neural Networks towards Class-specific Filters
1807	Faster Neural Network Training with Data Echoing
1808	Kronecker Attention Networks
1809	Farkas layers: don't shift the data, fix the geometry
1810	Non-Gaussian processes and neural networks at finite widths
1811	Unsupervised Model Selection for Variational Disentangled Representation Learning
1812	Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation
1813	How much Position Information Do Convolutional Neural Networks Encode?
1814	A Theoretical Analysis of the Number of Shots in Few-Shot Learning
1815	Event extraction from unstructured Amharic text
1816	Representation Learning for Remote Sensing: An Unsupervised Sensor Fusion Approach
1817	Natural Language State Representation for Reinforcement Learning
1818	Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
1819	Project and Forget: Solving Large Scale Metric Constrained Problems
1820	On the Variance of the Adaptive Learning Rate and Beyond
1821	Translation Between Waves, wave2wave
1822	Quantifying the Cost of Reliable Photo Authentication via High-Performance Learned Lossy Representations
1823	Improving End-to-End Object Tracking Using Relational Reasoning
1824	Attention Privileged Reinforcement Learning for Domain Transfer
1825	Sliced Cramer Synaptic Consolidation for Preserving Deeply Learned Representations
1826	On Variational Learning of Controllable Representations for Text without Supervision
1827	Disentangled Representation Learning with Sequential Residual Variational Autoencoder
1828	Improved Training Speed, Accuracy, and Data Utilization via Loss Function Optimization
1829	Using Hindsight to Anchor Past Knowledge in Continual Learning
1830	Empirical confidence estimates for classification by deep neural networks
1831	iSOM-GSN: An Integrative Approach for Transforming Multi-omic Data into Gene Similarity Networks via Self-organizing Maps
1832	Learning Numeral Embedding
1833	Localized Generations with Deep Neural Networks for Multi-Scale Structured Datasets
1834	AlgoNet: $C^\infty$ Smooth Algorithmic Neural Networks
1835	Temporal-difference learning for nonlinear value function approximation in the lazy training regime
1836	A Bayes-Optimal View on Adversarial Examples
1837	Efficient Content-Based Sparse Attention with Routing Transformers
1838	Good Semi-supervised VAE Requires Tighter Evidence Lower Bound
1839	Option Discovery using Deep Skill Chaining
1840	HOPPITY: LEARNING GRAPH TRANSFORMATIONS TO DETECT AND FIX BUGS IN PROGRAMS
1841	PowerSGD: Powered Stochastic Gradient Descent Methods for Accelerated Non-Convex Optimization
1842	Deep Randomized Least Squares Value Iteration
1843	Self-Supervised Policy Adaptation
1844	RTC-VAE: HARNESSING THE PECULIARITY OF TOTAL CORRELATION IN LEARNING DISENTANGLED REPRESENTATIONS
1845	OmniNet: A unified architecture for multi-modal multi-task learning
1846	Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition
1847	LEVERAGING AUXILIARY TEXT FOR DEEP RECOGNITION OF UNSEEN VISUAL RELATIONSHIPS
1848	TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising
1849	V4D: 4D Convolutional Neural Networks for Video-level Representation Learning
1850	ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs
1851	Learning to Represent Programs with Property Signatures
1852	Unified recurrent network for many feature types
1853	Restoration of Video Frames from a Single Blurred Image with Motion Understanding
1854	Improving Dirichlet Prior Network for Out-of-Distribution Example Detection
1855	Variational Autoencoders for Opponent Modeling in Multi-Agent Systems
1856	Prototype Recalls for Continual Learning
1857	Generative Ratio Matching Networks
1858	Emergence of Compositional Language with Deep Generational Transmission
1859	Deep Gradient Boosting -- Layer-wise Input Normalization of Neural Networks
1860	A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
1861	Bridging ELBO objective and MMD
1862	In Search for a SAT-friendly Binarized Neural Network Architecture
1863	EfferenceNets for latent space planning
1864	Neural networks are a priori biased towards Boolean functions with low entropy
1865	DUAL ADVERSARIAL MODEL FOR GENERATING 3D POINT CLOUD
1866	Wider Networks Learn Better Features
1867	Conditional Invertible Neural Networks for Guided Image Generation
1868	Cost-Effective Testing of a Deep Learning Model through Input Reduction
1869	Hebbian Graph Embeddings
1870	NeuralUCB: Contextual Bandits with Neural Network-Based Exploration
1871	Meta-Graph: Few shot Link Prediction via Meta Learning
1872	Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
1873	An implicit function learning approach for parametric modal regression
1874	The asymptotic spectrum of the Hessian of DNN throughout training
1875	Auto-Encoding Explanatory Examples
1876	RISE and DISE: Two Frameworks for Learning from Time Series with Missing Data
1877	Fast Machine Learning with Byzantine Workers and Servers
1878	How the Softmax Activation Hinders the Detection of Adversarial and Out-of-Distribution Examples in Neural Networks
1879	Tree-Structured Attention with Hierarchical Accumulation
1880	Deep 3D Pan via Local adaptive "t-shaped" convolutions with global and local adaptive dilations
1881	MANAS: Multi-Agent Neural Architecture Search
1882	SimulS2S: End-to-End Simultaneous Speech to Speech Translation
1883	Enhancing Attention with Explicit Phrasal Alignments
1884	LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning
1885	Robust saliency maps with distribution-preserving decoys
1886	Role of two learning rates in convergence of model-agnostic meta-learning
1887	Low-Resource Knowledge-Grounded Dialogue Generation
1888	Generative Multi Source Domain Adaptation
1889	GResNet: Graph Residual Network for Reviving Deep GNNs from Suspended Animation
1890	Realism Index: Interpolation in Generative Models With Arbitrary Prior
1891	Deep RL for Blood Glucose Control: Lessons, Challenges, and Opportunities
1892	A TARGET-AGNOSTIC ATTACK ON DEEP MODELS: EXPLOITING SECURITY VULNERABILITIES OF TRANSFER LEARNING
1893	Training Provably Robust Models by Polyhedral Envelope Regularization
1894	FleXOR: Trainable Fractional Quantization
1895	DP-LSSGD: An Optimization Method to Lift the Utility in Privacy-Preserving ERM
1896	Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head
1897	AdaX: Adaptive Gradient Descent with Exponential Long Term Memory
1898	ON COMPUTATION AND GENERALIZATION OF GENERATIVE ADVERSARIAL IMITATION LEARNING
1899	Disentangling Improves VAEs' Robustness to Adversarial Attacks
1900	Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets
1901	FEW-SHOT LEARNING ON GRAPHS VIA SUPER-CLASSES BASED ON GRAPH SPECTRAL MEASURES
1902	On Recovering Latent Factors From Sampling And Firing Graph
1903	Influence-Based Multi-Agent Exploration
1904	Demonstration Actor Critic
1905	Deep Coordination Graphs
1906	Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation
1907	How Well Do WGANs Estimate the Wasserstein Metric?
1908	Revisiting the Generalization of Adaptive Gradient Methods
1909	An Information Theoretic Perspective on Disentangled Representation Learning
1910	Multiplicative Interactions and Where to Find Them
1911	SELF-KNOWLEDGE DISTILLATION ADVERSARIAL ATTACK
1912	DIVA: Domain Invariant Variational Autoencoder
1913	Continual Learning with Bayesian Neural Networks for Non-Stationary Data
1914	RPGAN: random paths as a latent space for GAN interpretability
1915	SAdam: A Variant of Adam for Strongly Convex Functions
1916	Improving the Generalization of Visual Navigation Policies using Invariance Regularization
1917	Improving the robustness of ImageNet classifiers using elements of human visual cognition
1918	Differentially Private Survival Function Estimation
1919	Size-free generalization bounds for convolutional neural networks
1920	Scaling Laws for the Principled Design, Initialization, and Preconditioning of ReLU Networks
1921	A Fair Comparison of Graph Neural Networks for Graph Classification
1922	Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents
1923	Computation Reallocation for Object Detection
1924	MULTI-LABEL METRIC LEARNING WITH BIDIRECTIONAL REPRESENTATION DEEP NEURAL NETWORKS
1925	Sparse Networks from Scratch: Faster Training without Losing Performance
1926	Modeling Winner-Take-All Competition in Sparse Binary Projections
1927	Laplacian Denoising Autoencoder
1928	Training Data Distribution Search with Ensemble Active Learning
1929	Meta-Learning without Memorization
1930	COMMUNITY PRESERVING NODE EMBEDDING
1931	From Variational to Deterministic Autoencoders
1932	Adversarially Robust Representations with Smooth Encoders
1933	AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
1934	Representation Quality Explain Adversarial Attacks
1935	Inferring Dynamical Systems with Long-Range Dependencies through Line Attractor Regularization
1936	End-To-End Input Selection for Deep Neural Networks
1937	Hierarchical Graph-to-Graph Translation for Molecules
1938	Teaching GAN to generate per-pixel annotation
1939	ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
1940	DeepEnFM: Deep neural networks with Encoder enhanced Factorization Machine
1941	A NEW POINTWISE CONVOLUTION IN DEEP NEURAL NETWORKS THROUGH EXTREMELY FAST AND NON PARAMETRIC TRANSFORMS
1942	Decaying momentum helps neural network training
1943	Regularizing Black-box Models for Improved Interpretability
1944	GPNET: MONOCULAR 3D VEHICLE DETECTION BASED ON LIGHTWEIGHT WHEEL GROUNDING POINT DETECTION NETWORK
1945	Needles in Haystacks: On Classifying Tiny Objects in Large Images
1946	Quadratic GCN for graph classification
1947	The advantage of using Student's t-priors in variational autoencoders
1948	Finite Depth and Width Corrections to the Neural Tangent Kernel
1949	Order Learning and Its Application to Age Estimation
1950	Couple-VAE: Mitigating the Encoder-Decoder Incompatibility in Variational Text Modeling with Coupled Deterministic Networks
1951	Distilling Neural Networks for Faster and Greener Dependency Parsing
1952	Model-based Saliency for the Detection of Adversarial Examples
1953	Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
1954	BUZz: BUffer Zones for defending adversarial examples in image classification
1955	Efficient and Information-Preserving Future Frame Prediction and Beyond
1956	Path Space for Recurrent Neural Networks with ReLU Activations
1957	Wasserstein Adversarial Regularization (WAR) on label noise
1958	Self-Supervised Speech Recognition via Local Prior Matching
1959	SRDGAN: learning the noise prior for Super Resolution with Dual Generative Adversarial Networks
1960	Amata: An Annealing Mechanism for Adversarial Training Acceleration
1961	An Inter-Layer Weight Prediction and Quantization for Deep Neural Networks based on Smoothly Varying Weight Hypothesis
1962	Context Based Machine Translation With Recurrent Neural Network For English-Amharic Translation
1963	Robust Domain Randomization for Reinforcement Learning
1964	NAS evaluation is frustratingly hard
1965	Ellipsoidal Trust Region Methods for Neural Network Training
1966	Learning Semantically Meaningful Representations Through Embodiment
1967	Superseding Model Scaling by Penalizing Dead Units and Points with Separation Constraints
1968	Artificial Design: Modeling Artificial Super Intelligence with Extended General Relativity and Universal Darwinism via Geometrization for Universal Design Automation
1969	Robust Graph Representation Learning via Neural Sparsification
1970	Hyperbolic Discounting and Learning Over Multiple Horizons
1971	CLN2INV: Learning Loop Invariants with Continuous Logic Networks
1972	Gated Channel Transformation for Visual Recognition
1973	Federated User Representation Learning
1974	INSTANCE CROSS ENTROPY FOR DEEP METRIC LEARNING
1975	Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base
1976	Variational pSOM: Deep Probabilistic Clustering with Self-Organizing Maps
1977	Augmenting Self-attention with Persistent Memory
1978	Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels
1979	Ridge Regression: Structure, Cross-Validation, and Sketching
1980	Hindsight Trust Region Policy Optimization
1981	Policy Optimization with Stochastic Mirror Descent
1982	Graph convolutional networks for learning with few clean and many noisy labels
1983	A Constructive Prediction of the Generalization Error Across Scales
1984	MLModelScope: A Distributed Platform for ML Model Evaluation and Benchmarking at Scale
1985	A Mention-Pair Model of Annotation with Nonparametric User Communities
1986	An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
1987	NPTC-net: Narrow-Band Parallel Transport Convolutional Neural Network on Point Clouds
1988	Mogrifier LSTM
1989	Individualised Dose-Response Estimation using Generative Adversarial Nets
1990	Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video
1991	Trajectory representation learning for Multi-Task NMRDPs planning
1992	Incorporating Horizontal Connections in Convolution by Spatial Shuffling
1993	Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field
1994	Counterfactuals uncover the modular structure of deep generative models
1995	Pushing the bounds of dropout
1996	Confidence Scores Make Instance-dependent Label-noise Learning Possible
1997	Gap-Aware Mitigation of Gradient Staleness
1998	Evaluating and Calibrating Uncertainty Prediction in Regression Tasks
1999	Ensemble Distribution Distillation
2000	Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation
2001	On the Tunability of Optimizers in Deep Learning
2002	Gradient Perturbation is Underrated for Differentially Private Convex Optimization
2003	VL-BERT: Pre-training of Generic Visual-Linguistic Representations
2004	Credible Sample Elicitation by Deep Learning, for Deep Learning
2005	Neural Markov Logic Networks
2006	Optimistic Exploration even with a Pessimistic Initialisation
2007	Better Optimization for Neural Architecture Search with Mixed-Level Reformulation
2008	Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning
2009	Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing
2010	LabelFool: A Trick in the Label Space
2011	RGTI: Response generation via templates integration for End to End dialog
2012	Towards Disentangling Non-Robust and Robust Components in Performance Metric
2013	A Mechanism of Implicit Regularization in Deep Learning
2014	Feature-map-level Online Adversarial Knowledge Distillation
2015	Optimising Neural Network Architectures for Provable Adversarial Robustness
2016	Recurrent Independent Mechanisms
2017	An Explicitly Relational Neural Network Architecture
2018	Branched Multi-Task Networks: Deciding What Layers To Share
2019	MxPool: Multiplex Pooling for Hierarchical Graph Representation Learning
2020	Mixture-of-Experts Variational Autoencoder for clustering and generating from similarity-based representations
2021	Temporal Difference Weighted Ensemble For Reinforcement Learning
2022	Task Level Data Augmentation for Meta-Learning
2023	Effect of top-down connections in Hierarchical Sparse Coding
2024	Compressive Recovery Defense: A Defense Framework for $\ell_0, \ell_2$ and $\ell_\infty$ norm attacks.
2025	Match prediction from group comparison data using neural networks
2026	Extractor-Attention Network: A New Attention Network with Hybrid Encoders for Chinese Text Classification
2027	Identifying through Flows for Recovering Latent Representations
2028	Robust training with ensemble consensus
2029	Fault Tolerant Reinforcement Learning via A Markov Game of Control and Stopping
2030	BRIDGING ADVERSARIAL SAMPLES AND ADVERSARIAL NETWORKS
2031	Hierarchical Summary-to-Article Generation
2032	Unsupervised-Learning of time-varying features
2033	Self-Adversarial Learning with Comparative Discrimination for Text Generation
2034	A General Upper Bound for Unsupervised Domain Adaptation
2035	Vid2Game: Controllable Characters Extracted from Real-World Videos
2036	Action Semantics Network: Considering the Effects of Actions in Multiagent Systems
2037	Growing Action Spaces
2038	Learning Generative Image Object Manipulations from Language Instructions
2039	Discourse-Based Evaluation of Language Understanding
2040	Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
2041	Relational State-Space Model for Stochastic Multi-Object Systems
2042	TSInsight: A local-global attribution framework for interpretability in time-series data
2043	OPTIMAL TRANSPORT, CYCLEGAN, AND PENALIZED LS FOR UNSUPERVISED LEARNING IN INVERSE PROBLEMS
2044	Structural Language Models for Any-Code Generation
2045	How does Lipschitz Regularization Influence GAN Training?
2046	Simple and Effective Stochastic Neural Networks
2047	Robust Reinforcement Learning with Wasserstein Constraint
2048	Cross-Iteration Batch Normalization
2049	Model Ensemble-Based Intrinsic Reward for Sparse Reward Reinforcement Learning
2050	The Effect of Residual Architecture on the Per-Layer Gradient of Deep Networks
2051	Prune or quantize? Strategy for Pareto-optimally low-cost and accurate CNN
2052	Graph Residual Flow for Molecular Graph Generation
2053	Nonlinearities in activations substantially shape the loss surfaces of neural networks
2054	Attention over Parameters for Dialogue Systems
2055	The Convex Information Bottleneck Lagrangian
2056	The problem with DDPG: understanding failures in deterministic environments with sparse rewards
2057	LocalGAN: Modeling Local Distributions for Adversarial Response Generation
2058	Hierarchical Image-to-image Translation with Nested Distributions Modeling
2059	Generative Adversarial Networks For Data Scarcity Industrial Positron Images With Attention
2060	OvA-INN: Continual Learning with Invertible Neural Networks
2061	Contextual Inverse Reinforcement Learning
2062	Mining GANs for knowledge transfer to small domains
2063	Learning Time-Aware Assistance Functions for Numerical Fluid Solvers
2064	Transition Based Dependency Parser for Amharic Language Using Deep Learning
2065	Samples Are Useful? Not Always: denoising policy gradient updates using variance explained
2066	Learning Surrogate Losses
2067	Boosting Network: Learn by Growing Filters and Layers via SplitLBI
2068	Split LBI for Deep Learning: Structural Sparsity via Differential Inclusion Paths
2069	Generalizing Deep Multi-task Learning with Heterogeneous Structured Networks
2070	Unsupervised Universal Self-Attention Network for Graph Classification
2071	FairFace: A Novel Face Attribute Dataset for Bias Measurement and Mitigation
2072	Manifold Modeling in Embedded Space: A Perspective for Interpreting "Deep Image Prior"
2073	Novelty Detection Via Blurring
2074	Small-GAN: Speeding up GAN Training using Core-Sets
2075	Bounds on Over-Parameterization for Guaranteed Existence of Descent Paths in Shallow ReLU Networks
2076	Data-Independent Neural Pruning via Coresets
2077	Deeper Insights into Weight Sharing in Neural Architecture Search
2078	Learnable Higher-order Representation for Action Recognition
2079	Dirichlet Wrapper to Quantify Classification Uncertainty in Black-Box Systems
2080	S2VG: Soft Stochastic Value Gradient method
2081	Deep Network classification by Scattering and Homotopy dictionary learning
2082	Scalable Generative Models for Graphs with Graph Attention Mechanism
2083	Continuous Adaptation in Multi-agent Competitive Environments
2084	Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
2085	Combiner: Inductively Learning Tree Structured Attention in Transformers
2086	Robust Cross-lingual Embeddings from Parallel Sentences
2087	Semi-supervised Learning by Coaching
2088	DYNAMIC SELF-TRAINING FRAMEWORK FOR GRAPH CONVOLUTIONAL NETWORKS
2089	Blockwise Self-Attention for Long Document Understanding
2090	Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
2091	I am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively
2092	Black-Box Adversarial Attack with Transferable Model-based Embedding
2093	Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients
2094	Understanding Distributional Ambiguity via Non-robust Chance Constraint
2095	MobileBERT: Task-Agnostic Compression of BERT by Progressive Knowledge Transfer
2096	Do Image Classifiers Generalize Across Time?
2097	Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation
2098	Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
2099	A shallow feature extraction network with a large receptive field for stereo matching tasks
2100	Learning Boolean Circuits with Neural Networks
2101	ProxNet: End-to-End Learning of Structured Representation by Proximal Mapping
2102	Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
2103	Towards Principled Objectives for Contrastive Disentanglement
2104	Compositional languages emerge in a neural iterated learning model
2105	Population-Guided Parallel Policy Search for Reinforcement Learning
2106	Classification Logit Two-sample Testing by Neural Networks
2107	Variational Recurrent Models for Solving Partially Observable Control Tasks
2108	Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
2109	Towards Unifying Neural Architecture Space Exploration and Generalization
2110	Composable Semi-parametric Modelling for Long-range Motion Generation
2111	Towards an Adversarially Robust Normalization Approach
2112	Generative Latent Flow
2113	Adversarial Example Detection and Classification with Asymmetrical Adversarial Training
2114	CZ-GEM: A FRAMEWORK FOR DISENTANGLED REPRESENTATION LEARNING
2115	Generalized Natural Language Grounded Navigation via Environment-agnostic Multitask Learning
2116	Global Concavity and Optimization in a Class of Dynamic Discrete Choice Models
2117	Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information
2118	On the Pareto Efficiency of Quantized CNN
2119	BANANAS: Bayesian Optimization with Neural Networks for Neural Architecture Search
2120	Potential Flow Generator with $L_2$ Optimal Transport Regularity for Generative Models
2121	Integrative Tensor-based Anomaly Detection System For Satellites
2122	Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions
2123	MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius
2124	TinyBERT: Distilling BERT for Natural Language Understanding
2125	UW-NET: AN INCEPTION-ATTENTION NETWORK FOR UNDERWATER IMAGE CLASSIFICATION
2126	Semantically-Guided Representation Learning for Self-Supervised Monocular Depth
2127	Stochastic AUC Maximization with Deep Neural Networks
2128	Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures
2129	Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity
2130	Why ADAM Beats SGD for Attention Models
2131	Reflection-based Word Attribute Transfer
2132	Difference-Seeking Generative Adversarial Network--Unseen Sample Generation
2133	EINS: Long Short-Term Memory with Extrapolated Input Network Simplification
2134	FasterSeg: Searching for Faster Real-time Semantic Segmentation
2135	LEARNING EXECUTION THROUGH NEURAL CODE FUSION
2136	Meta Module Network for Compositional Visual Reasoning
2137	Min-max Entropy for Weakly Supervised Pointwise Localization
2138	Editable Neural Networks
2139	Parallel Scheduled Sampling
2140	Learning Explainable Models Using Attribution Priors
2141	Efficient Inference and Exploration for Reinforcement Learning
2142	Leveraging inductive bias of neural networks for learning without explicit human annotations
2143	Bias-Resilient Neural Network
2144	Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
2145	Accelerating Reinforcement Learning Through GPU Atari Emulation
2146	Can gradient clipping mitigate label noise?
2147	Concise Multi-head Attention Models
2148	Tensorized Embedding Layers for Efficient Model Compression
2149	Rethinking Neural Network Quantization
2150	Zero-shot task adaptation by homoiconic meta-mapping
2151	iSparse: Output Informed Sparsification of Neural Networks
2152	HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing enabled embedding of n-gram statistics
2153	Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
2154	Fast Linear Interpolation for Piecewise-Linear Functions, GAMs, and Deep Lattice Networks
2155	Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system
2156	Collaborative Generated Hashing for Market Analysis and Fast Cold-start Recommendation
2157	Pruned Graph Scattering Transforms
2158	DDSP: Differentiable Digital Signal Processing
2159	Continual Learning via Neural Pruning
2160	Min-Max Optimization without Gradients: Convergence and Applications to Adversarial ML
2161	XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
2162	Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning
2163	GLAD: Learning Sparse Graph Recovery
2164	PDP: A General Neural Framework for Learning SAT Solvers
2165	Adaptive Loss Scaling for Mixed Precision Training
2166	Quantifying Exposure Bias for Neural Language Generation
2167	How many weights are enough: can tensor factorization learn efficient policies?
2168	Domain Aggregation Networks for Multi-Source Domain Adaptation
2169	Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming
2170	AHash: A Load-Balanced One Permutation Hash
2171	Ordinary differential equations on graph networks
2172	Lift-the-flap: what, where and when for context reasoning
2173	Unifying Question Answering, Text Classification, and Regression via Span Extraction
2174	Supervised learning with incomplete data via sparse representations
2175	Conversation Generation with Concept Flow
2176	The Probabilistic Fault Tolerance of Neural Networks in the Continuous Limit
2177	Variational Hashing-based Collaborative Filtering with Self-Masking
2178	Neural Network Branching for Neural Network Verification
2179	SoftLoc: Robust Temporal Localization under Label Misalignment
2180	VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
2181	Adaptive Data Augmentation with Deep Parallel Generative Models
2182	Domain-invariant Learning using Adaptive Filter Decomposition
2183	Topology of deep neural networks
2184	Adversarial Policies: Attacking Deep Reinforcement Learning
2185	Escaping Saddle Points Faster with Stochastic Momentum
2186	Few-shot Text Classification with Distributional Signatures
2187	RotationOut as a Regularization Method for Neural Network
2188	Universal Approximation with Deep Narrow Networks
2189	A Dynamic Approach to Accelerate Deep Learning Training
2190	Geometric Insights into the Convergence of Nonlinear TD Learning
2191	Efficient Multivariate Bandit Algorithm with Path Planning
2192	Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling
2193	Exploring Model-based Planning with Policy Networks
2194	Benchmarking Model-Based Reinforcement Learning
2195	Encoder-decoder Network as Loss Function for Summarization
2196	Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks
2197	On Identifiability in Transformers
2198	Automated curriculum generation through setter-solver interactions
2199	Deep Multi-View Learning via Task-Optimal CCA
2200	Bandlimiting Neural Networks Against Adversarial Attacks
2201	Progressive Memory Banks for Incremental Domain Adaptation
2202	MMD GAN with Random-Forest Kernels
2203	What graph neural networks cannot learn: depth vs width
2204	INFERENCE, PREDICTION, AND ENTROPY RATE OF CONTINUOUS-TIME, DISCRETE-EVENT PROCESSES
2205	Learning an off-policy predictive state representation for deep reinforcement learning for vision-based steering in autonomous driving
2206	RTFM: Generalising to New Environment Dynamics via Reading
2207	MIM: Mutual Information Machine
2208	Real or Fake: An Empirical Study and Improved Model for Fake Face Detection
2209	Constant Time Graph Neural Networks
2210	AutoLR: A Method for Automatic Tuning of Learning Rate
2211	Generating Robust Audio Adversarial Examples using Iterative Proportional Clipping
2212	Optimal Attacks on Reinforcement Learning Policies
2213	Multi-Agent Hierarchical Reinforcement Learning for Humanoid Navigation
2214	SMiRL: Surprise Minimizing RL in Entropic Environments
2215	Mesh-Free Unsupervised Learning-Based PDE Solver of Forward and Inverse problems
2216	Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models
2217	Sparse and Structured Visual Attention
2218	Network Pruning for Low-Rank Binary Index
2219	Style-based Encoder Pre-training for Multi-modal Image Synthesis
2220	LDMGAN: Reducing Mode Collapse in GANs with Latent Distribution Matching
2221	Bootstrapping the Expressivity with Model-based Planning
2222	DeepAGREL: Biologically plausible deep learning via direct reinforcement
2223	Homogeneous Linear Inequality Constraints for Neural Network Activations
2224	Leveraging Simple Model Predictions for Enhancing its Performance
2225	Modeling treatment events in disease progression
2226	DG-GAN: the GAN with the duality gap
2227	Stochastic Gradient Descent with Biased but Consistent Gradient Estimators
2228	One-way prototypical networks
2229	Encoding word order in complex embeddings
2230	ADASAMPLE: ADAPTIVE SAMPLING OF HARD POSITIVES FOR DESCRIPTOR LEARNING
2231	Functional vs. parametric equivalence of ReLU networks
2232	A New Multi-input Model with the Attention Mechanism for Text Classification
2233	Multi-Dimensional Explanation of Reviews
2234	A Uniform Generalization Error Bound for Generative Adversarial Networks
2235	QGAN: Quantize Generative Adversarial Networks to Extreme low-bits
2236	Learning to Transfer Learn
2237	Contrastive Learning of Structured World Models
2238	Disentangling Factors of Variations Using Few Labels
2239	Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality
2240	EDUCE: Explaining model Decision through Unsupervised Concepts Extraction
2241	Target-directed Atomic Importance Estimation via Reverse Self-attention
2242	A critical analysis of self-supervision, or what we can learn from a single image
2243	Accelerating SGD with momentum for over-parameterized learning
2244	Discrete InfoMax Codes for Meta-Learning
2245	The Geometry of Sign Gradient Descent
2246	Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
2247	Attributes Obfuscation with Complex-Valued Features
2248	V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
2249	MDE: Multiple Distance Embeddings for Link Prediction in Knowledge Graphs
2250	Improving Adversarial Robustness Requires Revisiting Misclassified Examples
2251	Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control
2252	InfoCNF: Efficient Conditional Continuous Normalizing Flow Using Adaptive Solvers
2253	Mirror Descent View For Neural Network Quantization
2254	Hierarchical Disentangle Network for Object Representation Learning
2255	Deep Multiple Instance Learning with Gaussian Weighting
2256	Mitigating Posterior Collapse in Strongly Conditioned Variational Autoencoders
2257	Zeno++: Robust Fully Asynchronous SGD
2258	DivideMix: Learning with Noisy Labels as Semi-supervised Learning
2259	PAD-Nets: Learning Dynamic Receptive Fields via Pixel-Wise Adaptive Dilation
2260	PLEX: PLanner and EXecutor for Embodied Learning in Navigation
2261	DeepObfusCode: Source Code Obfuscation Through Sequence-to-Sequence Networks
2262	Extreme Value k-means Clustering
2263	Adaptive network sparsification with dependent variational beta-Bernoulli dropout
2264	Data-dependent Gaussian Prior Objective for Language Generation
2265	Learning Representations in Reinforcement Learning: an Information Bottleneck Approach
2266	LSTOD: Latent Spatial-Temporal Origin-Destination prediction model and its applications in ride-sharing platforms
2267	Ecological Reinforcement Learning
2268	Dual-Component Deep Domain Adaptation: A New Approach for Cross Project Software Vulnerability Detection
2269	Towards Understanding the Regularization of Adversarial Robustness on Neural Networks
2270	MaskConvNet: Training Efficient ConvNets from Scratch via Budget-constrained Filter Pruning
2271	Fast Bilinear Matrix Normalization via Rank-1 Update
2272	Scale-Equivariant Neural Networks with Decomposed Convolutional Filters
2273	A novel Bayesian estimation-based word embedding model for sentiment analysis
2274	Attacking Lifelong Learning Models with Gradient Reversion
2275	Learning with Long-term Remembering: Following the Lead of Mixed Stochastic Gradient
2276	A Harmonic Structure-Based Neural Network Model for Musical Pitch Detection
2277	Fooling Detection Alone is Not Enough: Adversarial Attack against Multiple Object Tracking
2278	Towards A Unified Min-Max Framework for Adversarial Exploration and Robustness
2279	Domain-Agnostic Few-Shot Classification by Learning Disparate Modulators
2280	Anomaly Detection and Localization in Images using Guided Attention
2281	Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards
2282	Logic and the 2-Simplicial Transformer
2283	PAC-Bayes Few-shot Meta-learning with Implicit Learning of Model Prior Distribution
2284	Reinforcement Learning with Chromatic Networks
2285	AE-OT: A NEW GENERATIVE MODEL BASED ON EXTENDED SEMI-DISCRETE OPTIMAL TRANSPORT
2286	Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning
2287	A Data-Efficient Mutual Information Neural Estimator for Statistical Dependency Testing
2288	Enhancing Adversarial Defense by k-Winners-Take-All
2289	Thwarting finite difference adversarial attacks with output randomization
2290	Exploration in Reinforcement Learning with Deep Covering Options
2291	Towards Controllable and Interpretable Face Completion via Structure-Aware and Frequency-Oriented Attentive GANs
2292	Learning audio representations with self-supervision
2293	Learning Disentangled Representations for CounterFactual Regression
2294	Learning relevant features for statistical inference
2295	VILD: Variational Imitation Learning with Diverse-quality Demonstrations
2296	Entropy Minimization In Emergent Languages
2297	A Unified framework for randomized smoothing based certified defenses
2298	Analysis of Video Feature Learning in Two-Stream CNNs on the Example of Zebrafish Swim Bout Classification
2299	MIST: Multiple Instance Spatial Transformer Networks
2300	ISBNet: Instance-aware Selective Branching Networks
2301	MODiR: Multi-Objective Dimensionality Reduction for Joint Data Visualisation
2302	Robust Local Features for Improving the Generalization of Adversarial Training
2303	Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach
2304	Distributed Online Optimization with Long-Term Constraints
2305	Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
2306	Learning the Arrow of Time for Problems in Reinforcement Learning
2307	Topological based classification using graph convolutional networks
2308	The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget
2309	AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
2310	Sequence-level Intrinsic Exploration Model for Partially Observable Domains
2311	Pipelined Training with Stale Weights of Deep Convolutional Neural Networks
2312	StacNAS: Towards Stable and Consistent Optimization for Differentiable Neural Architecture Search
2313	Universal Learning Approach for Adversarial Defense
2314	Boosting Generative Models by Leveraging Cascaded Meta-Models
2315	Quantitatively Disentangling and Understanding Part Information in CNNs
2316	The Implicit Bias of Depth: How Incremental Learning Drives Generalization
2317	FAKE CAN BE REAL IN GANS
2318	Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness
2319	Measuring Compositional Generalization: A Comprehensive Method on Realistic Data
2320	Theory and Evaluation Metrics for Learning Disentangled Representations
2321	Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks
2322	Dynamically Pruned Message Passing Networks for Large-scale Knowledge Graph Reasoning
2323	A TWO-STAGE FRAMEWORK FOR MATHEMATICAL EXPRESSION RECOGNITION
2324	Universal Source-Free Domain Adaptation
2325	Learning Invariants through Soft Unification
2326	Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
2327	Macro Action Ensemble Searching Methodology for Deep Reinforcement Learning
2328	INTERPRETING CNN COMPRESSION USING INFORMATION BOTTLENECK
2329	Increasing batch size through instance repetition improves generalization
2330	FSPool: Learning Set Representations with Featurewise Sort Pooling
2331	Recurrent Neural Networks are Universal Filters
2332	On the Convergence of FedAvg on Non-IID Data
2333	Adversarially Robust Neural Networks via Optimal Control: Bridging Robustness with Lyapunov Stability
2334	Multi-agent Reinforcement Learning for Networked System Control
2335	Learning to Anneal and Prune Proximity Graphs for Similarity Search
2336	Deep Bayesian Structure Networks
2337	Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
2338	Keyframing the Future: Discovering Temporal Hierarchy with Keyframe-Inpainter Prediction
2339	Differential Privacy in Adversarial Learning with Provable Robustness
2340	Topology-Aware Pooling via Graph Attention
2341	Siamese Attention Networks
2342	Neural Stored-program Memory
2343	ES-MAML: Simple Hessian-Free Meta Learning
2344	Enforcing Physical Constraints in Neural Networks through Differentiable PDE Layer
2345	TabFact: A Large-scale Dataset for Table-based Fact Verification
2346	Evidence-Aware Entropy Decomposition For Active Deep Learning
2347	Learning to Generate Grounded Visual Captions without Localization Supervision
2348	Extreme Triplet Learning: Effectively Optimizing Easy Positives and Hard Negatives
2349	Implicit Bias of Gradient Descent based Adversarial Training on Separable Data
2350	Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis
2351	BERT Wears GloVes: Distilling Static Embeddings from Pretrained Contextual Representations
2352	The Visual Task Adaptation Benchmark
2353	Input Alignment along Chaotic directions increases Stability in Recurrent Neural Networks
2354	3D-SIC: 3D Semantic Instance Completion for RGB-D Scans
2355	Learning Similarity Metrics for Numerical Simulations
2356	Image-guided Neural Object Rendering
2357	MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics
2358	Effective and Robust Detection of Adversarial Examples via Benford-Fourier Coefficients
2359	Stabilizing Adversarial Invariance Induction by Discriminator Matching
2360	Natural Language Adversarial Attack and Defense in Word Level
2361	Amharic Light Stemmer
2362	Dynamical Clustering of Time Series Data Using Multi-Decoder RNN Autoencoder
2363	POP-Norm: A Theoretically Justified and More Accelerated Normalization Approach
2364	Programmable Neural Network Trojan for Pre-trained Feature Extractor
2365	Cost-Effective Interactive Neural Attention Learning
2366	On Layer Normalization in the Transformer Architecture
2367	PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search
2368	Knowledge Consistency between Neural Networks and Beyond
2369	Temporal Probabilistic Asymmetric Multi-task Learning
2370	Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information
2371	Corpus Based Amharic Sentiment Lexicon Generation
2372	Principled Weight Initialization for Hypernetworks
2373	Additive Powers-of-Two Quantization: A Non-uniform Discretization for Neural Networks
2374	Transfer Alignment Network for Double Blind Unsupervised Domain Adaptation
2375	Towards understanding the true loss surface of deep neural networks using random matrix theory and iterative spectral methods
2376	Neural Architecture Search in Embedding Space
2377	Enhancing Transformation-Based Defenses Against Adversarial Attacks with a Distribution Classifier
2378	Single Deep Counterfactual Regret Minimization
2379	HaarPooling: Graph Pooling with Compressive Haar Basis
2380	Safe Policy Learning for Continuous Control
2381	A Stochastic Trust Region Method for Non-convex Minimization
2382	Learning Effective Exploration Strategies For Contextual Bandits
2383	Improving Batch Normalization with Skewness Reduction for Deep Neural Networks
2384	Adversarial Inductive Transfer Learning with input and output space adaptation
2385	Graph Neural Networks For Multi-Image Matching
2386	An Empirical Study on Post-processing Methods for Word Embeddings
2387	AN EFFICIENT HOMOTOPY TRAINING ALGORITHM FOR NEURAL NETWORKS
2388	High performance RNNs with spiking neurons
2389	CLAREL: classification via retrieval loss for zero-shot learning
2390	Observational Overfitting in Reinforcement Learning
2391	On Mutual Information Maximization for Representation Learning
2392	Localizing and Amortizing: Efficient Inference for Gaussian Processes
2393	PNAT: Non-autoregressive Transformer by Position Learning
2394	On unsupervised-supervised risk and one-class neural networks
2395	Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds
2396	Distillation $\approx$ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized NN
2397	Bayesian Inference for Large Scale Image Classification
2398	Ranking Policy Gradient
2399	How Does Learning Rate Decay Help Modern Neural Networks?
2400	Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
2401	SVQN: Sequential Variational Soft Q-Learning Networks
2402	Classification Attention for Chinese NER
2403	Understanding Isomorphism Bias in Graph Data Sets
2404	Neural Machine Translation with Universal Visual Representation
2405	Towards More Realistic Neural Network Uncertainties
2406	Understanding Architectures Learnt by Cell-based Neural Architecture Search
2407	Soft Token Matching for Interpretable Low-Resource Classification
2408	Beyond Classical Diffusion: Ballistic Graph Neural Network
2409	Hierarchical Complement Objective Training
2410	Understanding and Stabilizing GANs' Training Dynamics with Control Theory
2411	Variance Reduced Local SGD with Lower Communication Complexity
2412	AutoQ: Automated Kernel-Wise Neural Network Quantization
2413	Quantifying Layerwise Information Discarding of Neural Networks and Beyond
2414	GDP: Generalized Device Placement for Dataflow Graphs
2415	Unveiling Hidden Biases in Deep Networks with Classification Images and Spike Triggered Analysis
2416	Generalization Puzzles in Deep Networks
2417	Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization
2418	Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
2419	HighRes-net: Multi-Frame Super-Resolution by Recursive Fusion
2420	A Learning-based Iterative Method for Solving Vehicle Routing Problems
2421	Transferable Perturbations of Deep Feature Distributions
2422	Rethinking the Security of Skip Connections in ResNet-like Neural Networks
2423	ProtoAttend: Attention-Based Prototypical Learning
2424	A Signal Propagation Perspective for Pruning Neural Networks at Initialization
2425	Wildly Unsupervised Domain Adaptation and Its Powerful and Efficient Solution
2426	Automatically Learning Feature Crossing from Model Interpretation for Tabular Data
2427	Continual Learning with Adaptive Weights (CLAW)
2428	Interpretability Evaluation Framework for Deep Neural Networks
2429	Progressive Upsampling Audio Synthesis via Effective Adversarial Training
2430	Learning Compact Reward for Image Captioning
2431	S-Flow GAN
2432	Gradient-free Neural Network Training by Multi-convex Alternating Optimization
2433	Semi-supervised Semantic Segmentation using Auxiliary Network
2434	Intensity-Free Learning of Temporal Point Processes
2435	Scalable and Order-robust Continual Learning with Additive Parameter Decomposition
2436	Discriminator Based Corpus Generation for General Code Synthesis
2437	Storage Efficient and Dynamic Flexible Runtime Channel Pruning via Deep Reinforcement Learning
2438	BOOSTING ENCODER-DECODER CNN FOR INVERSE PROBLEMS
2439	Weakly Supervised Clustering by Exploiting Unique Class Count
2440	Domain Adaptation via Low-Rank Basis Approximation
2441	Learning to Control PDEs with Differentiable Physics
2442	Linear Symmetric Quantization of Neural Networks for Low-precision Integer Hardware
2443	Estimating Gradients for Discrete Random Variables by Sampling without Replacement
2444	Structural Multi-agent Learning
2445	A Gradient-based Architecture HyperParameter Optimization Approach
2446	On importance-weighted autoencoders
2447	FALCON: Fast and Lightweight Convolution for Compressing and Accelerating CNN
2448	Multi-Task Adapters for On-Device Audio Inference
2449	Mincut Pooling in Graph Neural Networks
2450	Dual Graph Representation Learning
2451	Unsupervised Few Shot Learning via Self-supervised Training
2452	To Relieve Your Headache of Training an MRF, Take AdVIL
2453	ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization
2454	On the Dynamics and Convergence of Weight Normalization for Training Neural Networks
2455	CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition
2456	Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
2457	Revisit Knowledge Distillation: a Teacher-free Framework
2458	SesameBERT: Attention for Anywhere
2459	Automated Relational Meta-learning
2460	Training Deep Networks with Stochastic Gradient Normalized by Layerwise Adaptive Second Moments
2461	Boosting Ticket: Towards Practical Pruning for Adversarial Training with Lottery Ticket Hypothesis
2462	Moniqua: Modulo Quantized Communication in Decentralized SGD
2463	Defending Against Physically Realizable Attacks on Image Classification
2464	Certifying Distributional Robustness using Lipschitz Regularisation
2465	A SPIKING SEQUENTIAL MODEL: RECURRENT LEAKY INTEGRATE-AND-FIRE
2466	N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
2467	Subgraph Attention for Node Classification and Hierarchical Graph Pooling
2468	Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes?
2469	Learning Human Postural Control with Hierarchical Acquisition Functions
2470	Unsupervised Intuitive Physics from Past Experiences
2471	Expected Tight Bounds for Robust Deep Neural Network Training
2472	Analytical Moment Regularizer for Training Robust Networks
2473	Model Architecture Controls Gradient Descent Dynamics: A Combinatorial Path-Based Formula
2474	Deep Learning of Determinantal Point Processes via Proper Spectral Sub-gradient
2475	Collaborative Filtering With A Synthetic Feedback Loop
2476	Self-Supervised State-Control through Intrinsic Mutual Information Rewards
2477	Stagnant zone segmentation with U-net
2478	Distance-Based Learning from Errors for Confidence Calibration
2479	Curvature Graph Network
2480	Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer
2481	Generative Imputation and Stochastic Prediction
2482	PROTOTYPE-ASSISTED ADVERSARIAL LEARNING FOR UNSUPERVISED DOMAIN ADAPTATION
2483	Learning Expensive Coordination: An Event-Based Deep RL Approach
2484	Unifying Graph Convolutional Networks as Matrix Factorization
2485	Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks
2486	Model-free Learning Control of Nonlinear Stochastic Systems with Stability Guarantee
2487	Depth-Recurrent Residual Connections for Super-Resolution of Real-Time Renderings
2488	LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning
2489	GenDICE: Generalized Offline Estimation of Stationary Values
2490	Deep Audio Prior
2491	Compressing Deep Neural Networks With Learnable Regularization
2492	ATLPA: ADVERSARIAL TOLERANT LOGIT PAIRING WITH ATTENTION FOR CONVOLUTIONAL NEURAL NETWORK
2493	SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering
2494	Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization
2495	Learning Out-of-distribution Detection without Out-of-distribution Data
2496	Prox-SGD: Training Structured Neural Networks under Regularization and Constraints
2497	Unsupervised Learning of Node Embeddings by Detecting Communities
2498	Diverse Trajectory Forecasting with Determinantal Point Processes
2499	Bridging the domain gap in cross-lingual document classification
2500	Evaluating The Search Phase of Neural Architecture Search
2501	Learning to Defense by Learning to Attack
2502	Smooth Regularized Reinforcement Learning
2503	On Robustness of Neural Ordinary Differential Equations
2504	Diving into Optimization of Topology in Neural Networks
2505	FoveaBox: Beyond Anchor-based Object Detection
2506	Cascade Style Transfer
2507	Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
2508	Unifying Graph Convolutional Neural Networks and Label Propagation
2509	Equivariant neural networks and equivarification
2510	Towards a Unified Evaluation of Explanation Methods without Ground Truth
2511	Data Valuation using Reinforcement Learning
2512	RL-LIM: Reinforcement Learning-based Locally Interpretable Modeling
2513	BackPACK: Packing more into Backprop
2514	DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures
2515	Regional based query in graph active learning
2516	Group-Connected Multilayer Perceptron Networks
2517	Towards Stable and comprehensive Domain Alignment: Max-Margin Domain-Adversarial Training
2518	Depth-Adaptive Transformer
2519	VUSFA: Variational Universal Successor Features Approximator
2520	InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization
2521	Federated Adversarial Domain Adaptation
2522	CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning
2523	Learning Structured Communication for Multi-agent Reinforcement Learning
2524	Utilizing Edge Features in Graph Neural Networks via Variational Information Maximization
2525	Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters
2526	Utility Analysis of Network Architectures for 3D Point Cloud Processing
2527	Effective Mechanism to Mitigate Injuries During NFL Plays
2528	TechKG: A Large-Scale Chinese Technology-Oriented Knowledge Graph
2529	Learning Reusable Options for Multi-Task Reinforcement Learning
2530	Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
2531	X-Forest: Approximate Random Projection Trees for Similarity Measurement
2532	From Here to There: Video Inbetweening Using Direct 3D Convolutions
2533	Low Bias Gradient Estimates for Very Deep Boolean Stochastic Networks
2534	Automatically Discovering and Learning New Visual Categories with Ranking Statistics
2535	Support-guided Adversarial Imitation Learning
2536	Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification
2537	Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells
2538	Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data
2539	Data augmentation instead of explicit regularization
2540	SCL: Towards Accurate Domain Adaptive Object Detection via Gradient Detach Based Stacked Complementary Losses
2541	SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
2542	Label Cleaning with Likelihood Ratio Test
2543	Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks
2544	Graph Neural Networks Exponentially Lose Expressive Power for Node Classification
2545	VIDEO AFFECTIVE IMPACT PREDICTION WITH MULTIMODAL FUSION AND LONG-SHORT TEMPORAL CONTEXT
2546	Graph inference learning for semi-supervised classification
2547	Sparse Coding with Gated Learned ISTA
2548	Dimensional Reweighting Graph Convolution Networks
2549	ROBUST DISCRIMINATIVE REPRESENTATION LEARNING VIA GRADIENT RESCALING: AN EMPHASIS REGULARISATION PERSPECTIVE
2550	Explaining A Black-box By Using A Deep Variational Information Bottleneck Approach
2551	Learning deep graph matching with channel-independent embedding and Hungarian attention
2552	EnsembleNet: End-to-End Optimization of Multi-headed Models
2553	Out-of-Distribution Detection Using Layerwise Uncertainty in Deep Neural Networks
2554	Semantics Preserving Adversarial Attacks
2555	Ensemble methods and LSTM outperformed other eight machine learning classifiers in an EEG-based BCI experiment
2556	Scaling Up Neural Architecture Search with Big Single-Stage Models
2557	AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
2558	Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching
2559	EgoMap: Projective mapping and structured egocentric memory for Deep RL
2560	Accelerated Information Gradient flow
2561	Adversarial Attribute Learning by Exploiting negative correlated attributes
2562	StructPool: Structured Graph Pooling via Conditional Random Fields
2563	On the Decision Boundaries of Deep Neural Networks: A Tropical Geometry Perspective
2564	Probabilistic modeling the hidden layers of Deep Neural Networks
2565	IEG: Robust neural net training with severe label noises
2566	VideoEpitoma: Efficient Recognition of Long-range Actions
2567	On the Weaknesses of Reinforcement Learning for Neural Machine Translation
2568	Stochastically Controlled Compositional Gradient for the Composition problem
2569	Sharing Knowledge in Multi-Task Deep Reinforcement Learning
2570	HOW IMPORTANT ARE NETWORK WEIGHTS? TO WHAT EXTENT DO THEY NEED AN UPDATE?
2571	Deep Reasoning Networks: Thinking Fast and Slow, for Pattern De-mixing
2572	When Does Self-supervision Improve Few-shot Learning?
2573	Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
2574	Context-aware Attention Model for Coreference Resolution
2575	SELF: Learning to Filter Noisy Labels with Self-Ensembling
2576	Neural Maximum Common Subgraph Detection with Guided Subgraph Extraction
2577	Amharic Negation Handling
2578	Noise Regularization for Conditional Density Estimation
2579	Star-Convexity in Non-Negative Matrix Factorization
2580	Count-guided Weakly Supervised Localization Based on Density Map
2581	Scoring-Aggregating-Planning: Learning task-agnostic priors from interactions and sparse rewards for zero-shot generalization
2582	SSE-PT: Sequential Recommendation Via Personalized Transformer
2583	Wide Neural Networks are Interpolating Kernel Methods: Impact of Initialization on Generalization
2584	Improving Evolutionary Strategies with Generative Neural Networks
2585	Analysis and Interpretation of Deep CNN Representations as Perceptual Quality Features
2586	Program Guided Agent
2587	Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
2588	Prestopping: How Does Early Stopping Help Generalization Against Label Noise?
2589	Carpe Diem, Seize the Samples Uncertain "at the Moment" for Adaptive Batch Selection
2590	Large Batch Optimization for Deep Learning: Training BERT in 76 minutes