ICLR 2020 Submissions

1Empirical Bayes Transductive Meta-Learning with Synthetic Gradients
2Contextualized Sparse Representation with Rectified N-Gram Attention for Open-Domain Question Answering
3Generalized Domain Adaptation with Covariate and Label Shift CO-ALignment
4Quaternion Equivariant Capsule Networks for 3D Point Clouds
5Pay Attention to Features, Transfer Learn faster CNNs
6Differentiable Hebbian Consolidation for Continual Learning
7Generative Hierarchical Models for Parts, Objects, and Scenes
8Mixture Distributions for Scalable Bayesian Inference
9Best feature performance in codeswitched hate speech texts
10Geom-GCN: Geometric Graph Convolutional Networks
11Smart Ternary Quantization
14Gradients as Features for Deep Representation Learning
15Deceptive Opponent Modeling with Proactive Network Interdiction for Stochastic Goal Recognition Control
16Monotonic Multihead Attention
17Massively Multilingual Sparse Word Representations
18Attention over Phrases
19Query-efficient Meta Attack to Deep Neural Networks
21Meta-Learning Initializations for Image Segmentation
22Privacy-preserving Representation Learning by Disentanglement
23Building Hierarchical Interpretations in Natural Language via Feature Interaction Detection
25End-to-end learning of energy-based representations for irregularly-sampled signals and images
26Enabling Deep Spiking Neural Networks with Hybrid Conversion and Spike Timing Dependent Backpropagation
27How to 0wn the NAS in Your Spare Time
28Generalized Zero-shot ICD Coding
30WEEGNET: a wavelet-based Convnet for Brain-computer interfaces
31Meta Label Correction for Learning with Weak Supervision
32Toward Controllable Text Content Manipulation
33NAMSG: An Efficient Method for Training Neural Networks
34Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning
35The Shape of Data: Intrinsic Distance for Data Distributions
36Measuring Numerical Common Sense: Is A Word Embedding Approach Effective?
37Learning DNA folding patterns with Recurrent Neural Networks
38Generative Adversarial Nets for Multiple Text Corpora
39Understanding Generalization in Recurrent Neural Networks
40Measure by Measure: Automatic Music Composition with Traditional Western Music Notation
41Weakly-Supervised Trajectory Segmentation for Learning Reusable Skills
42Learn Interpretable Word Embeddings Efficiently with von Mises-Fisher Distribution
43Goten: GPU-Outsourcing Trusted Execution of Neural Network Training and Prediction
44Limitations for Learning from Point Clouds
46Conservative Uncertainty Estimation By Fitting Prior Networks
47Re-Examining Linear Embeddings for High-dimensional Bayesian Optimization
49Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards
50NORML: Nodal Optimization for Recurrent Meta-Learning
51Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data
53Defense against Adversarial Examples by Encoder-Assisted Search in the Latent Coding Space
54Fuzzing-Based Hard-Label Black-Box Attacks Against Machine Learning Models
55Conditional generation of molecules from disentangled representations
56Dataset Distillation
57Learning RNNs with Commutative State Transitions
58XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings
59LAVAE: Disentangling Location and Appearance
60Sparse Skill Coding: Learning Behavioral Hierarchies with Sparse Codes
63A Bilingual Generative Transformer for Semantic Sentence Embedding
64Learning to Coordinate Manipulation Skills via Skill Behavior Diversification
65DeepPCM: Predicting Protein-Ligand Binding using Unsupervised Learned Representations
66Ternary MobileNets via Per-Layer Hybrid Filter Banks
67Constant Curvature Graph Convolutional Networks
68Variational Information Bottleneck for Unsupervised Clustering: Deep Gaussian Mixture Embedding
69Combining graph and sequence information to learn protein representations
71Cancer homogeneity in single cell revealed by Bi-state model and Binary matrix factorization
72Robust Subspace Recovery Layer for Unsupervised Anomaly Detection
73Learning Nearly Decomposable Value Functions Via Communication Minimization
74Batch Normalization is a Cause of Adversarial Vulnerability
75Undersensitivity in Neural Reading Comprehension
76Extreme Classification via Adversarial Softmax Approximation
78Information Geometry of Orthogonal Initializations and Training
79Multi-Step Decentralized Domain Adaptation
80Mixed Precision DNNs: All you need is a good parametrization
82Co-Attentive Equivariant Neural Networks: Focusing Equivariance On Transformations Co-Occurring in Data
83Improving the Gating Mechanism of Recurrent Neural Networks
84Learning to Transfer via Modelling Multi-level Task Dependency
85Latent Variables on Spheres for Sampling and Inference
86Deep Orientation Uncertainty Learning based on a Bingham Loss
87Analyzing Privacy Loss in Updates of Natural Language Models
88Learning from Positive and Unlabeled Data with Adversarial Training
89Deep exploration by novelty-pursuit with maximum state entropy
90Reconstructing continuous distributions of 3D protein structure from cryo-EM images
91Deep Evidential Uncertainty
92Tree-structured Attention Module for Image Classification
93Generalization of Two-layer Neural Networks: An Asymptotic Viewpoint
94Better Knowledge Retention through Metric Learning
95Winning the Lottery with Continuous Sparsification
96Critical initialisation in continuous approximations of binary neural networks
97Learning to Learn via Gradient Component Corrections
99Filter redistribution templates for iteration-less convolutional model reduction
100Universal Safeguarded Learned Convex Optimization with Guaranteed Convergence
101A Gradient-Based Approach to Neural Networks Structure Learning
102Sub-policy Adaptation for Hierarchical Reinforcement Learning
103AdvCodec: Towards A Unified Framework for Adversarial Text Generation
105Learning Latent State Spaces for Planning through Reward Prediction
106Variational lower bounds on mutual information based on nonextensive statistical mechanics
107Hope For The Best But Prepare For The Worst: Cautious Adaptation In RL Agents
108Semi-Supervised Boosting via Self Labelling
109Fractional Graph Convolutional Networks (FGCN) for Semi-Supervised Learning
110Antifragile and Robust Heteroscedastic Bayesian Optimisation
111Generalizing Reinforcement Learning to Unseen Actions
112Provable Representation Learning for Imitation Learning via Bi-level Optimization
113Episodic Reinforcement Learning with Associative Memory
114Flexible and Efficient Long-Range Planning Through Curious Exploration
115Learning to Prove Theorems by Learning to Generate Theorems
116Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem
117Common sense and Semantic-Guided Navigation via Language in Embodied Environments
118Gradient-based training of Gaussian Mixture Models in High-Dimensional Spaces
119Neural Phrase-to-Phrase Machine Translation
120At Your Fingertips: Automatic Piano Fingering Detection
121Energy-based models for atomic-resolution protein conformations
122Federated Learning with Matched Averaging
123Clustered Reinforcement Learning
124Understanding the (Un)interpretability of Natural Image Distributions Using Generative Models
125Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning
126Efficient and Robust Asynchronous Federated Learning with Stragglers
127Handwritten Amharic Character Recognition System Using Convolutional Neural Networks
128Effects of Linguistic Labels on Learned Visual Representations in Convolutional Neural Networks: Labels matter!
129Differentiable Programming for Physical Simulation
130Fooling Pre-trained Language Models: An Evolutionary Approach to Generate Wrong Sentences with High Acceptability Score
131Implicit Rugosity Regularization via Data Augmentation
132A Mutual Information Maximization Perspective of Language Representation Learning
133Goal-Conditioned Video Prediction
134Accelerate DNN Inference By Inter-Operator Parallelization
135Compression without Quantization
136Geometry-Aware Visual Predictive Models of Intuitive Physics
137Growing Up Together: Structured Exploration for Large Action Spaces
138Adversarial Training with Voronoi Constraints
139A Non-asymptotic comparison of SVRG and SGD: tradeoffs between compute and speed
140RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
141Towards Understanding the Spectral Bias of Deep Learning
142Domain Adaptive Multiflow Networks
143Unbiased Contrastive Divergence Algorithm for Training Energy-Based Latent Variable Models
144Unsupervised Distillation of Syntactic Information from Contextualized Word Representations
145Optimal Unsupervised Domain Translation
146Multi-task Network Embedding with Adaptive Loss Weighting
147Biologically Plausible Neural Networks via Evolutionary Dynamics and Dopaminergic Plasticity
149Continual Learning using the SHDL Framework with Skewed Replay Distributions
150Semi-supervised Autoencoding Projective Dependency Parsing
151Differentiable Reasoning over a Virtual Knowledge Base
152Making Sense of Reinforcement Learning and Probabilistic Inference
153Negative Sampling in Variational Autoencoders
154Improved Training of Certifiably Robust Models
155Unsupervised Generative 3D Shape Learning from Natural Images
156Diagnosing the Environment Bias in Vision-and-Language Navigation
157Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
158Learning Mahalanobis Metric Spaces via Geometric Approximation Algorithms
159Laconic Image Classification: Human vs. Machine Performance
160Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks
161Reinforcement Learning with Structured Hierarchical Grammar Representations of Actions
162The Usual Suspects? Reassessing Blame for VAE Posterior Collapse
163Dynamical System Embedding for Efficient Intrinsically Motivated Artificial Agents
164BERT for Sequence-to-Sequence Multi-Label Text Classification
166Evaluations and Methods for Explanation through Robustness Analysis
167Attributed Graph Learning with 2-D Graph Convolution
168Stochastic Neural Physics Predictor
169Neural tangent kernels, transportation mappings, and universal approximation
170Pragmatic Evaluation of Adversarial Examples in Natural Language
171Learning to Move with Affordance Maps
172Towards Interpreting Deep Neural Networks via Understanding Layer Behaviors
173Deep Learning For Symbolic Mathematics
174Deep Interaction Processes for Time-Evolving Graphs
175Differentiable learning of numerical rules in knowledge graphs
176Consistency Regularization for Generative Adversarial Networks
177On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning
178Lyceum: An efficient and scalable ecosystem for robot learning
179SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models
180In-training Matrix Factorization for Parameter-frugal Neural Machine Translation
181Benefits of Overparameterization in Single-Layer Latent Variable Generative Models
182Implicit competitive regularization in GANs
183Scale-Equivariant Steerable Networks
184Extreme Language Model Compression with Optimal Subwords and Shared Projections
185DeepSphere: a graph-based spherical CNN
186Improved Training Techniques for Online Neural Machine Translation
188Overcoming Catastrophic Forgetting via Hessian-free Curvature Estimates
189Score and Lyrics-Free Singing Voice Generation
190Neural Video Encoding
191Interactive Classification by Asking Informative Questions
192Classification-Based Anomaly Detection for General Data
193Mixture Density Networks Find Viewpoint the Dominant Factor for Accurate Spatial Offset Regression
194Distributed Training Across the World
195Unrestricted Adversarial Examples via Semantic Manipulation
196Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
197Closed loop deep Bayesian inversion: Uncertainty driven acquisition for fast MRI
199Discriminative Particle Filter Reinforcement Learning for Complex Partial observations
200Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories
201State Alignment-based Imitation Learning
202Reweighted Proximal Pruning for Large-Scale Language Representation
203Neural Arithmetic Units
204Lipschitz constant estimation for Neural Networks via sparse polynomial optimization
205Random Bias Initialization Improving Binary Neural Network Training
206Meta-RCNN: Meta Learning for Few-Shot Object Detection
207Adversarially learned anomaly detection for time series data
209Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs
210Deep Spike Decoder (DSD)
211Isolating Latent Structure with Cross-population Variational Autoencoders
212Learning Compact Embedding Layers via Differentiable Product Quantization
213Accelerating First-Order Optimization Algorithms
214Physics-Aware Flow Data Completion Using Neural Inpainting
215Imagine That! Leveraging Emergent Affordances for Tool Synthesis in Reaching Tasks
216Provable Filter Pruning for Efficient Neural Networks
218Learning transitional skills with intrinsic motivation
219Quantifying uncertainty with GAN-based priors
220End to End Trainable Active Contours via Differentiable Rendering
221Plan2Vec: Unsupervised Representation Learning by Latent Plans
222Uncertainty-aware Variational-Recurrent Imputation Network for Clinical Time Series
223Compositional Continual Language Learning
224Out-of-Distribution Image Detection Using the Normalized Compression Distance
225Discriminative Variational Autoencoder for Continual Learning with Generative Replay
226Connectivity-constrained interactive annotations for panoptic segmentation
227On learning visual odometry errors
228Regularization Matters in Policy Optimization
229Adaptive Online Planning for Continual Lifelong Learning
230Measuring causal influence with back-to-back regression: the linear case
231Regularizing Predictions via Class-wise Self-knowledge Distillation
232Multi-source Multi-view Transfer Learning in Neural Topic Modeling with Pretrained Topic and Word Embeddings
233Adversarial Lipschitz Regularization
234Reasoning-Aware Graph Convolutional Network for Visual Question Answering
235SGD Learns One-Layer Networks in WGANs
236Localized Meta-Learning: A PAC-Bayes Analysis for Meta-Learning Beyond Global Prior
237FNNP: Fast Neural Network Pruning Using Adaptive Batch Normalization
238Adversarial Training and Provable Defenses: Bridging the Gap
239Finding Deep Local Optima Using Network Pruning
240Adversarial Training Generalizes Data-dependent Spectral Norm Regularization
241Knowledge Transfer via Student-Teacher Collaboration
242A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case
243Weight-space symmetry in neural network loss landscapes revisited
244Differentiable Bayesian Neural Network Inference for Data Streams
245Efficient Transformer for Mobile Applications
246Learning by shaking: Computing policy gradients by physical forward-propagation
247Occlusion resistant learning of intuitive physics from videos
248Quantum Graph Neural Networks
249Statistical Verification of General Perturbations by Gaussian Smoothing
250Localised Generative Flows
251TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces
252Robust Few-Shot Learning with Adversarially Queried Meta-Learners
253Certifying Neural Network Audio Classifiers
254Collaborative Training of Balanced Random Forests for Open Set Domain Adaptation
255PAC-Bayesian Neural Network Bounds
256Semi-Implicit Back Propagation
257Mutual Information Gradient Estimation for Representation Learning
258Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning
259Iterative Deep Graph Learning for Graph Neural Networks
260Mint: Matrix-Interleaving for Multi-Task Learning
261Learning Cluster Structured Sparsity by Reweighting
262Selfish Emergent Communication
263Decoupling Adaptation from Modeling with Meta-Optimizers for Meta Learning
264Imitation Learning of Robot Policies using Language, Vision and Motion
265Improving Visual Relation Detection using Depth Maps
266Semi-supervised Pose Estimation with Geometric Latent Representations
267Identifying Weights and Architectures of Unknown ReLU Networks
268Unsupervised Domain Adaptation through Self-Supervision
269Improving Gradient Estimation in Evolutionary Strategies With Past Descent Directions
270$\alpha^{\alpha}$-Rank: Scalable Multi-agent Evaluation through Evolution
271Variable Complexity in the Univariate and Multivariate Structural Causal Model
272Regularizing activations in neural networks via distribution matching with the Wasserstein metric
273RefNet: Automatic Essay Scoring by Pairwise Comparison
274Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
275Mixed Precision Training With 8-bit Floating Point
276An Empirical and Comparative Analysis of Data Valuation with Scalable Algorithms
277Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling
278Transferring Optimality Across Data Distributions via Homotopy Methods
279Latent Normalizing Flows for Many-to-Many Cross Domain Mappings
280Learning Multi-Agent Communication Through Structured Attentive Reasoning
281Dynamic Model Pruning with Feedback
282$\ell_1$ Adversarial Robustness Certificates: a Randomized Smoothing Approach
283On the interaction between supervision and self-play in emergent communication
284CNAS: Channel-Level Neural Architecture Search
286Slow Thinking Enables Task-Uncertain Lifelong and Sequential Few-Shot Learning
287A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
288Expected Information Maximization: Using the I-Projection for Mixture Density Estimation
289Through the Lens of Neural Network: Analyzing Neural QA Models via Quantized Latent Representation
290All Simulations Are Not Equal: Simulation Reweighing for Imperfect Information Games
291Truth or backpropaganda? An empirical investigation of deep learning theory
292Learning to Rank Learning Curves
293Set Functions for Time Series
294I love your chain mail! Making knights smile in a fantasy game world
295Masked Translation Model
296MissDeepCausal: causal inference from incomplete data using deep latent variable models
297Variational Constrained Reinforcement Learning with Application to Planning at Roundabout
298Efficient Deep Representation Learning by Adaptive Latent Space Sampling
299Learning Functionally Decomposed Hierarchies for Continuous Navigation Tasks
300Deep Audio Priors Emerge From Harmonic Convolutional Networks
301Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks
302On Understanding Knowledge Graph Representation
303Encoding Musical Style with Transformer Autoencoders
304Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning
305Gauge Equivariant Spherical CNNs
307Preventing Imitation Learning with Adversarial Policy Ensembles
308On the Anomalous Generalization of GANs
309Improving Generalization in Meta Reinforcement Learning using Neural Objectives
310A closer look at the approximation capabilities of neural networks
311VIMPNN: A physics informed neural network for estimating potential energies of out-of-equilibrium systems
312SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
313Resolving Lexical Ambiguity in English–Japanese Neural Machine Translation
314Data-Efficient Image Recognition with Contrastive Predictive Coding
315Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps
317Residual Energy-Based Models for Text Generation
318AtomNAS: Fine-Grained End-to-End Neural Architecture Search
319The Power of Semantic Similarity based Soft-Labeling for Generalized Zero-Shot Learning
320AugMix: A Simple Method to Improve Robustness and Uncertainty under Data Shift
321Learning Latent Dynamics for Partially-Observed Chaotic Systems
322Exploration via Flow-Based Intrinsic Rewards
323Learning Underlying Physical Properties From Observations For Trajectory Prediction
325GraphQA: Protein Model Quality Assessment using Graph Convolutional Network
326Disentanglement through Nonlinear ICA with General Incompressible-flow Networks (GIN)
328Angular Visual Hardness
329Deep Relational Factorization Machines
330Towards Scalable Imitation Learning for Multi-Agent Systems with Graph Neural Networks
331On the Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
333Mem2Mem: Learning to Summarize Long Texts with Memory-to-Memory Transfer
334GQ-Net: Training Quantization-Friendly Deep Networks
335An Empirical Study of Encoders and Decoders in Graph-Based Dependency Parsing
336ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks
337Variational Template Machine for Data-to-Text Generation
338Phase Transitions for the Information Bottleneck in Representation Learning
339PopSGD: Decentralized Stochastic Gradient Descent in the Population Model
340Symmetric-APL Activations: Training Insights and Robustness to Adversarial Attacks
341Faster and Just As Accurate: A Simple Decomposition for Transformer Models
342Hidden incentives for self-induced distributional shift
343The divergences minimized by non-saturating GAN training
344The Differentiable Cross-Entropy Method
345Atomic Compression Networks
346Continual learning with hypernetworks
347Few-Shot Regression via Learning Sparsifying Basis Functions
348Understanding and Training Deep Diagonal Circulant Neural Networks
349Removing input features via a generative model to explain their attributions to classifier's decisions
350Top-down training for neural networks
351Demystifying Graph Neural Network Via Graph Filter Assessment
352Towards Certified Defense for Unrestricted Adversarial Attacks
353Permutation Equivariant Models for Compositional Generalization in Language
354Training binary neural networks with real-to-binary convolutions
355DO-AutoEncoder: Learning and Intervening Bivariate Causal Mechanisms in Images
356StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
357Multichannel Generative Language Models
358Smooth markets: A basic mechanism for organizing gradient-based learners
359Enhancing the Transformer with explicit relational encoding for math problem solving
360Ergodic Inference: Accelerate Convergence by Optimisation
361SemanticAdv: Generating Adversarial Examples via Attribute-Conditional Image Editing
362Uncertainty-sensitive learning and planning with ensembles
363Fair Resource Allocation in Federated Learning
364Continual Learning via Principal Components Projection
365Task-Mediated Representation Learning
366Convolutional Conditional Neural Processes
367Self-Induced Curriculum Learning in Neural Machine Translation
368CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem
369A Quality-Diversity Controllable GAN for Text Generation
370Newton Residual Learning
371Hydra: Preserving Ensemble Diversity for Model Distillation
372Few-Shot Few-Shot Learning and the role of Spatial Attention
373BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
374Lossless Data Compression with Transformer
375Meta-Learning with Warped Gradient Descent
376Never Give Up: Learning Directed Exploration Strategies
377AdvectiveNet: An Eulerian-Lagrangian Fluidic Reservoir for Point Cloud Processing
378Unsupervised Spatiotemporal Data Inpainting
379Transferable Recognition-Aware Image Processing
380GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modelling
381Transfer Active Learning For Graph Neural Networks
382Trajectory growth through random deep ReLU networks
383Frequency Pooling: Shift-Equivalent and Anti-Aliasing Down Sampling
384Improving Sequential Latent Variable Models with Autoregressive Flows
385SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
386Sparse Transformer: Concentrated Attention Through Explicit Selection
387Minimizing Change in Classifier Likelihood to Mitigate Catastrophic Forgetting
388Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration
389You CAN Teach an Old Dog New Tricks! On Training Knowledge Graph Embeddings
390Unsupervised Learning of Graph Hierarchical Abstractions with Differentiable Coarsening and Optimal Transport
391Defensive Tensorization: Randomized Tensor Parametrization for Robust Neural Networks
392Question Generation from Paragraphs: A Tale of Two Hierarchical Models
393Robust Reinforcement Learning via Adversarial Training with Langevin Dynamics
394Embodied Multimodal Multitask Learning
395High Fidelity Speech Synthesis with Adversarial Networks
396Autoencoder-based Initialization for Recurrent Neural Networks with a Linear Memory
397Test-Time Training for Out-of-Distribution Generalization
398Distance-based Composable Representations with Neural Networks
399At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
400GPU Memory Management for Deep Neural Networks Using Deep Q-Network
402Walking on the Edge: Fast, Low-Distortion Adversarial Examples
403Disentangling Trainability and Generalization in Deep Learning
404Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
405Functional Regularisation for Continual Learning with Gaussian Processes
406Verification of Generative-Model-Based Visual Transformations
407A Graph Neural Network Assisted Monte Carlo Tree Search Approach to Traveling Salesman Problem
408Residual EBMs: Does Real vs. Fake Text Discrimination Generalize?
409Learning Likelihoods with Conditional Normalizing Flows
410Informed Temporal Modeling via Logical Specification of Factorial LSTMs
411Auto Network Compression with Cross-Validation Gradient
412Regularly varying representation for sentence embedding
413A Simple and Scalable Shape Representation for 3D Reconstruction
414Learning Through Limited Self-Supervision: Improving Time-Series Classification Without Additional Data via Auxiliary Tasks
415EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs
416Few-Shot One-Class Classification via Meta-Learning
417Training a Constrained Natural Media Painting Agent using Reinforcement Learning
418Fix-Net: pure fixed-point representation of deep neural networks
419Learning Semantic Correspondences from Noisy Data-text Pairs by Local-to-Global Alignments
420The Role of Embedding Complexity in Domain-invariant Representations
421Learning Curves for Deep Neural Networks: A field theory perspective
422Zero-Shot Policy Transfer with Disentangled Attention
423Disentangled Cumulants Help Successor Representations Transfer to New Tasks
424Learning vector representation of local content and matrix representation of local motion, with implications for V1
425Online Learned Continual Compression with Stacked Quantization Modules
426Gumbel-Matrix Routing for Flexible Multi-task Learning
427The Fréchet Distance of training and test distribution predicts the generalization gap
428Mixed Setting Training Methods for Incremental Slot-Filling Tasks
429Selective sampling for accelerating training of deep neural networks
430Representing Unordered Data Using Multiset Automata and Complex Numbers
431Robust Natural Language Representation Learning for Natural Language Inference by Projecting Superficial Words out
432Deep Nonlinear Stochastic Optimal Control for Systems with Multiplicative Uncertainties
433Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
434Sentence embedding with contrastive multi-views learning
435Dynamics-Aware Embeddings
436Learning Multi-facet Embeddings of Phrases and Sentences using Sparse Coding for Unsupervised Semantic Applications
438RaPP: Novelty Detection with Reconstruction along Projection Pathway
439SAFE-DNN: A Deep Neural Network with Spike Assisted Feature Extraction for Noise Robust Inference
440Putting Machine Translation in Context with the Noisy Channel Model
441Deep geometric matrix completion: Are we doing it right?
442Progressive Compressed Records: Taking a Byte Out of Deep Learning Data
443Robustness and/or Redundancy Emerge in Overparametrized Deep Neural Networks
444The Intriguing Effects of Focal Loss on the Calibration of Deep Neural Networks
445Hypermodels for Exploration
446Denoising Improves Latent Space Geometry in Text Autoencoders
447Provable Convergence and Global Optimality of Generative Adversarial Network
448On Symmetry and Initialization for Neural Networks
449Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies
450Policy path programming
451Meta-Learning with Network Pruning for Overfitting Reduction
452Kernel and Rich Regimes in Overparametrized Models
453A Boolean Task Algebra for Reinforcement Learning
454Explanation by Progressive Exaggeration
455Quantum Optical Experiments Modeled by Long Short-Term Memory
456Why do These Match? Explaining the Behavior of Image Similarity Models
457Mode Connectivity and Sparse Neural Networks
458Monte Carlo Deep Neural Network Arithmetic
459Shape Features Improve General Model Robustness
460Random Partition Relaxation for Training Binary and Ternary Weight Neural Network
461How can we generalise learning distributed representations of graphs?
462Relation-based Generalized Zero-shot Classification with the Domain Discriminator on the shared representation
463Self-supervised Training of Proposal-based Segmentation via Background Prediction
464Influence-aware Memory for Deep Reinforcement Learning
465Gating Revisited: Deep Multi-layer RNNs That Can Be Trained
466Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses
467A Simple Geometric Proof for the Benefit of Depth in ReLU Networks
468Avoiding Negative Side-Effects and Promoting Safe Exploration with Imaginative Planning
469BayesOpt Adversarial Attack
470CrossNorm: On Normalization for Off-Policy Reinforcement Learning
471A Simple Technique to Enable Saliency Methods to Pass the Sanity Checks
472Directional Message Passing for Molecular Graphs
473Unsupervised Learning of Efficient and Robust Speech Representations
474Compositional Embeddings: Joint Perception and Comparison of Class Label Sets
475Model-based reinforcement learning for biological sequence design
476Learning to Optimize via Dual space Preconditioning
477Self-Attentional Credit Assignment for Transfer in Reinforcement Learning
478AdaGAN: Adaptive GAN for Many-to-Many Non-Parallel Voice Conversion
479City Metro Network Expansion with Reinforcement Learning
480BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations
481ShardNet: One Filter Set to Rule Them All
482Towards Interpretable Evaluations: A Case Study of Named Entity Recognition
483Mixed-curvature Variational Autoencoders
484Rethinking deep active learning: Using unlabeled data at model training
485Blurring Structure and Learning to Optimize and Adapt Receptive Fields
486Layerwise Learning Rates for Object Features in Unsupervised and Supervised Neural Networks And Consequent Predictions for the Infant Visual System
487Continual Deep Learning by Functional Regularisation of Memorable Past
488Demystifying Inter-Class Disentanglement
489On the implicit minimization of alternative loss functions when training deep networks
490Dynamic Graph Message Passing Networks
491A Deep Recurrent Neural Network via Unfolding Reweighted l1-l1 Minimization
492Differentially Private Mixed-Type Data Generation For Unsupervised Learning
493Learning from Rules Generalizing Labeled Exemplars
494Group-Transformer: Towards A Lightweight Character-level Language Model
495Language-independent Cross-lingual Contextual Representations
496Understanding the Limitations of Conditional Generative Models
497Skew-Explore: Learn faster in continuous spaces with sparse rewards
498Diversely Stale Parameters for Efficient Training of Deep Convolutional Networks
499Exploring the Correlation between Likelihood of Flow-based Generative Models and Image Semantics
500Anomaly Detection Based on Unsupervised Disentangled Representation Learning in Combination with Manifold Learning
501Neural Arithmetic Unit by reusing many small pre-trained networks
502On Stochastic Sign Descent Methods
503GENN: Predicting Correlated Drug-drug Interactions with Graph Energy Neural Networks
504Event Discovery for History Representation in Reinforcement Learning
505Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning
506Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification
507Domain-Invariant Representations: A Look on Compression and Weights
508Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
509Spike-based causal inference for weight alignment
510Symmetry and Systematicity
511Efficacy of Pixel-Level OOD Detection for Semantic Segmentation
512PatchFormer: A neural architecture for self-supervised representation learning on images
513Address2vec: Generating vector embeddings for blockchain analytics
514Attack-Resistant Federated Learning with Residual-based Reweighting
515Learning scalable and transferable multi-robot/machine sequential assignment planning via graph embedding
516Learning a Spatio-Temporal Embedding for Video Instance Segmentation
517Efficient Exploration via State Marginal Matching
518Side-Tuning: Network Adaptation via Additive Side Networks
519Lookahead: A Far-sighted Alternative of Magnitude-based Pruning
520SCELMo: Source Code Embeddings from Language Models
521Detecting Change in Seasonal Pattern via Autoencoder and Temporal Regularization
522CopyCAT: Taking Control of Neural Policies with Constant Attacks
523VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
524A Generalized Training Approach for Multiagent Learning
525Quantum Semi-Supervised Kernel Learning
526Unsupervised Meta-Learning for Reinforcement Learning
527Making Efficient Use of Demonstrations to Solve Hard Exploration Problems
528Training individually fair ML models with sensitive subspace robustness
529Meta-learning curiosity algorithms
530vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
531The Secret Revealer: Generative Model Inversion Attacks Against Deep Neural Networks
532Leveraging Entanglement Entropy for Deep Understanding of Attention Matrix in Text Matching
533Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
534Under what circumstances do local codes emerge in feed-forward neural networks
535MMA Training: Direct Input Space Margin Maximization through Adversarial Training
536Forecasting Deep Learning Dynamics with Applications to Hyperparameter Tuning
537Batch Normalization has Multiple Benefits: An Empirical Study on Residual Networks
538Building Deep Equivariant Capsule Networks
539Learning to Infer User Interface Attributes from Images
540Attacking Graph Convolutional Networks via Rewiring
541Incorporating BERT into Neural Machine Translation
542Unsupervised Hierarchical Graph Representation Learning with Variational Bayes
543Copy That! Editing Sequences by Copying Spans
544DeepXML: Scalable & Accurate Deep Extreme Classification for Matching User Queries to Advertiser Bid Phrases
545What Can Neural Networks Reason About?
546Structured Object-Aware Physics Prediction for Video Modeling and Planning
547A multi-task U-net for segmentation with lazy labels
548Neural Design of Contests and All-Pay Auctions using Multi-Agent Simulation
549CaptainGAN: Navigate Through Embedding Space For Better Text Generation
550Learning-Augmented Data Stream Algorithms
551word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement
552On Weight-Sharing and Bilevel Optimization in Architecture Search
553Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
554Imbalanced Classification via Adversarial Minority Over-sampling
555Compositional Transfer in Hierarchical Reinforcement Learning
556On the Relationship between Self-Attention and Convolutional Layers
557PolyGAN: High-Order Polynomial Generators
558Dynamic Scale Inference by Entropy Minimization
559SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes
560Rethinking Data Augmentation: Self-Supervision and Self-Distillation
562Learning to Remember from a Multi-Task Teacher
563Gradient $\ell_1$ Regularization for Quantization Robustness
564Coloring graph neural networks for node disambiguation
565Spectral Embedding of Regularized Block Models
566On Federated Learning of Deep Networks from Non-IID Data: Parameter Divergence and the Effects of Hyperparametric Methods
567Improved Detection of Adversarial Attacks via Penetration Distortion Maximization
568Barcodes as summary of objective functions' topology
569Unsupervised Video-to-Video Translation via Self-Supervised Learning
570Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control
573Geometry-aware Generation of Adversarial and Cooperative Point Clouds
574Crafting Data-free Universal Adversaries with Dilate Loss
575Efficient Bi-Directional Verification of ReLU Networks via Quadratic Programming
576Improving Sample Efficiency in Model-Free Reinforcement Learning from Images
577Improving Exploration of Deep Reinforcement Learning using Planning for Policy Search
578Spatial Information is Overrated for Image Classification
579A Theoretical Analysis of Deep Q-Learning
580Decentralized Deep Learning with Arbitrary Communication Compression
581Can I Trust the Explainer? Verifying Post-Hoc Explanatory Methods
582D3PG: Deep Differentiable Deterministic Policy Gradients
583Deep Ensembles: A Loss Landscape Perspective
584A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
586Impact of the latent space on the ability of GANs to fit the distribution
587Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators
588Combining Q-Learning and Search with Amortized Value Estimates
589Hyperbolic Image Embeddings
590Infinite-Horizon Differentiable Model Predictive Control
591Neural Reverse Engineering of Stripped Binaries
592Anchor & Transform: Learning Sparse Representations of Discrete Objects
593Emergence of Collective Policies Inside Simulations with Biased Representations
594Projection Based Constrained Policy Optimization
595GraphFlow: Exploiting Conversation Flow with Graph Neural Networks for Conversational Machine Comprehension
596Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning
597Recurrent Layer Attention Network
598Towards Effective 2-bit Quantization: Pareto-optimal Bit Allocation for Deep CNNs Compression
599You Only Train Once: Loss-Conditional Training of Deep Networks
600Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization
601Using Explainability to Detect Adversarial Attacks
602Feature Selection using Stochastic Gates
603SpectroBank: A filter-bank convolutional layer for CNN-based audio applications
604Testing For Typicality with Respect to an Ensemble of Learned Distributions
605Emergent Communication in Networked Multi-Agent Reinforcement Learning
606GraphSAINT: Graph Sampling Based Inductive Learning Method
607Adversarial Filters of Dataset Biases
608Value-Driven Hindsight Modelling
609Incorporating Perceptual Prior to Improve Model's Adversarial Robustness
610Learning Neural Causal Models from Unknown Interventions
611Adaptive Generation of Unrestricted Adversarial Inputs
612P-BN: Towards Effective Batch Normalization in the Path Space
613Efficient Probabilistic Logic Reasoning with Graph Neural Networks
614On the geometry and learning low-dimensional embeddings for directed graphs
615GATO: Gates Are Not the Only Option
616Probabilistic View of Multi-agent Reinforcement Learning: A Unified Approach
617Neural Subgraph Isomorphism Counting
618RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments
619Continual Learning with Delayed Feedback
620Neural Non-additive Utility Aggregation
621Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection
622"Best-of-Many-Samples" Distribution Matching
623Dynamically Balanced Value Estimates for Actor-Critic Methods
624Spatially Parallel Attention and Component Extraction for Scene Decomposition
625Efficient generation of structured objects with Constrained Adversarial Networks
626Deep Variational Semi-Supervised Novelty Detection
627Cross-Lingual Ability of Multilingual BERT: An Empirical Study
628Towards Understanding Generalization in Gradient-Based Meta-Learning
629Towards Finding Longer Proofs
630Probing Emergent Semantics in Predictive Agents via Question Answering
631Revisiting the Information Plane
632Deep 3D-Zoom Net: Unsupervised Learning of Photo-Realistic 3D-Zoom
633Hierarchical Graph Matching Networks for Deep Graph Similarity Learning
634A Simple Approach to the Noisy Label Problem Through the Gambler's Loss
635On the Reflection of Sensitivity in the Generalization Error
636Redundancy-Free Computation Graphs for Graph Neural Networks
637Toward Understanding The Effect of Loss Function on The Performance of Knowledge Graph Embedding
638Reducing Transformer Depth on Demand with Structured Dropout
639Semi-Supervised Learning with Normalizing Flows
640Neural Communication Systems with Bandwidth-limited Channel
641Reducing Computation in Recurrent Networks by Selectively Updating State Neurons
642A Novel Analysis Framework of Lower Complexity Bounds for Finite-Sum Optimization
643Neural Outlier Rejection for Self-Supervised Keypoint Learning
644Exploring the Pareto-Optimality between Quality and Diversity in Text Generation
645B-Spline CNNs on Lie groups
646EMS: End-to-End Model Search for Network Architecture, Pruning and Quantization
647Feature-based Augmentation for Semi-Supervised Learning
648Quantifying Point-Prediction Uncertainty in Neural Networks via Residual Estimation with an I/O Kernel
649Progressive Knowledge Distillation For Generative Modeling
650EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness Against Adversarial Attacks
651Learning To Explore Using Active Neural Mapping
652Adversarial Robustness Against the Union of Multiple Perturbation Models
653Understanding and Improving Information Transfer in Multi-Task Learning
654Hyperparameter Tuning and Implicit Regularization in Minibatch SGD
655Searching for Stage-wise Neural Graphs In the Limit
656Restricting the Flow: Information Bottlenecks for Attribution
657Stein Bridging: Enabling Mutual Reinforcement between Explicit and Implicit Generative Models
658Step Size Optimization
659Equilibrium Propagation with Continual Weight Updates
660Global Adversarial Robustness Guarantees for Neural Networks
661A Stochastic Derivative Free Optimization Method with Momentum
662Coresets for Accelerating Incremental Gradient Methods
663A Greedy Approach to Max-Sliced Wasserstein GANs
664Off-Policy Actor-Critic with Shared Experience Replay
665Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems
666The Ingredients of Real World Robotic Reinforcement Learning
667Causal Discovery with Reinforcement Learning
668Modelling the influence of data structure on learning in neural networks
669Task-agnostic Continual Learning via Growing Long-Term Memory Networks
670Scaling Autoregressive Video Models
672Generative Integration Networks
673Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Nonconvex Optimization
674Compressive Transformers for Long-Range Sequence Modelling
675Global Momentum Compression for Sparse Communication in Distributed SGD
676State2vec: Off-Policy Successor Feature Approximators
677Differentiation of Blackbox Combinatorial Solvers
678Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs
679Lagrangian Fluid Simulation with Continuous Convolutions
680Graph-based motion planning networks
681Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
682Semi-supervised semantic segmentation needs strong, high-dimensional perturbations
683Learning to Guide Random Search
684Attentive Sequential Neural Processes
685The intriguing role of module criticality in the generalization of deep networks
686Yet another but more efficient black-box adversarial attack: tiling and evolution strategies
687TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing
688Learning with Social Influence through Interior Policy Differentiation
689SPROUT: Self-Progressing Robust Training
690Alleviating Privacy Attacks via Causal Learning
691Hybrid Weight Representation: A Quantization Method Represented with Ternary and Sparse-Large Weights
692Self-labelling via simultaneous clustering and representation learning
693Meta Decision Trees for Explainable Recommendation Systems
694Continual Learning with Gated Incremental Memories for Sequential Data Processing
695Policy Optimization by Local Improvement through Search
696Improving Model Compatibility of Generative Adversarial Networks by Boundary Calibration
697Data Annealing Transfer learning Procedure for Informal Language Understanding Tasks
698Robust anomaly detection and backdoor attack detection via differential privacy
699CAT: Compression-Aware Training for bandwidth reduction
700Scheduling the Learning Rate Via Hypergradients: New Insights and a New Algorithm
701Learning Entailment-Based Sentence Embeddings from Natural Language Inference
702Invariance vs Robustness of Neural Networks
703Asymptotic learning curves of kernel methods: empirical data vs. Teacher-Student paradigm
705Irrationality can help reward inference
706Learning to Reach Goals Without Reinforcement Learning
707Pruning Depthwise Separable Convolutions for Extra Efficiency Gain of Lightweight Models
708Subjective Reinforcement Learning for Open Complex Environments
709Deep probabilistic subsampling for task-adaptive compressed sensing
710Text Embedding Bank Module for Detailed Image Paragraph Caption
711Semi-supervised 3D Face Reconstruction with Nonlinear Disentangled Representations
712Representing Model Uncertainty of Neural Networks in Sparse Information Form
713GroSS Decomposition: Group-Size Series Decomposition for Whole Search-Space Training
714Neural Tangents: Fast and Easy Infinite Neural Networks in Python
715Sparse Weight Activation Training
716Learning Robust Representations via Multi-View Information Bottleneck
717Batch-shaping for learning conditional channel gated networks
718Making the Shoe Fit: Architectures, Initializations, and Tuning for Learning with Privacy
719Universal Adversarial Attack Using Very Few Test Examples
720Rotation-invariant clustering of functional cell types in primary visual cortex
721Solving single-objective tasks by preference multi-objective reinforcement learning
722Deep automodulators
723Enhanced Convolutional Neural Tangent Kernels
724Revisiting Gradient Episodic Memory for Continual Learning
725Inductive and Unsupervised Representation Learning on Graph Structured Objects
726A new perspective in understanding of Adam-Type algorithms and beyond
727Causally Correct Partial Models for Reinforcement Learning
728Spectral Nonlocal Block for Neural Network
729U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
730Masked Based Unsupervised Content Transfer
731Efficient meta reinforcement learning via meta goal generation
732Learning robust visual representations using data augmentation invariance
733A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs
734DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
735Simple but effective techniques to reduce dataset biases
736Projected Canonical Decomposition for Knowledge Base Completion
737Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
738AMUSED: A Multi-Stream Vector Representation Method for Use In Natural Dialogue
739Measuring the Reliability of Reinforcement Learning Algorithms
740Semi-Supervised Named Entity Recognition with CRF-VAEs
741Stable Rank Normalization for Improved Generalization in Neural Networks and GANs
742Graph Neural Networks for Soft Semi-Supervised Learning on Hypergraphs
743Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks
744Deep Neural Forests: An Architecture for Tabular Data
745Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks
747Data Augmentation in Training CNNs: Injecting Noise to Images
748VAENAS: Sampling Matters in Neural Architecture Search
749Self-Educated Language Agent with Hindsight Experience Replay for Instruction Following
750Model-Agnostic Feature Selection with Additional Mutual Information
751Do Deep Neural Networks for Segmentation Understand Insideness?
752Adversarial Robustness as a Prior for Learned Representations
753Explaining Time Series by Counterfactuals
754Variational Diffusion Autoencoders with Random Walk Sampling
755Probability Calibration for Knowledge Graph Embedding Models
756Contrastive Multiview Coding
757Fast Sparse ConvNets
758Reformer: The Efficient Transformer
759BasisVAE: Orthogonal Latent Space for Deep Disentangled Representation
760Target-Embedding Autoencoders for Supervised Representation Learning
761Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
762Conditional Flow Variational Autoencoders for Structured Sequence Prediction
763High-Frequency guided Curriculum Learning for Class-specific Object Boundary Detection
764On the Equivalence between Node Embeddings and Structural Graph Representations
765Disagreement-Regularized Imitation Learning
766Shifted Randomized Singular Value Decomposition
767PassNet: Learning pass probability surfaces from single-location labels. An architecture for visually-interpretable soccer analytics
768On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints
769Are Few-shot Learning Benchmarks Too Simple?
771Universality Theorems for Generative Models
772Function Feature Learning of Neural Networks
773Manifold Learning and Alignment with Generative Adversarial Networks
774Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders
775Scalable Deep Neural Networks via Low-Rank Matrix Factorization
777Fast Task Adaptation for Few-Shot Learning
778Weighted Empirical Risk Minimization: Transfer Learning based on Importance Sampling
779Neural Program Synthesis By Self-Learning
780Neural Epitome Search for Architecture-Agnostic Network Compression
781Learning from Label Proportions with Consistency Regularization
782Do recent advancements in model-based deep reinforcement learning really improve data efficiency?
783Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search
784Mixing Up Real Samples and Adversarial Samples for Semi-Supervised Learning
785Task-Agnostic Robust Encodings for Combating Adversarial Typos
786When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It
787Accelerated Variance Reduced Stochastic Extragradient Method for Sparse Machine Learning Problems
788AdamT: A Stochastic Optimization with Trend Correction Scheme
789The Variational InfoMax AutoEncoder
790Skew-Fit: State-Covering Self-Supervised Reinforcement Learning
791LOGAN: Latent Optimisation for Generative Adversarial Networks
792Hyper-SAGNN: a self-attention based graph neural network for hypergraphs
793A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning
794Global-Local Network for Learning Depth with Very Sparse Supervision
795CEB Improves Model Robustness
796Music Source Separation in the Waveform Domain
797Information lies in the eye of the beholder: The effect of representations on observed mutual information
798On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach
799Distributionally Robust Neural Networks
800Distilling the Knowledge of BERT for Text Generation
801Kernel of CycleGAN as a principal homogeneous space
802Cross-Lingual Vision-Language Navigation
803Molecule Property Prediction and Classification with Graph Hypernetworks
804A Syntax-Aware Approach for Unsupervised Text Style Transfer
805Relevant-features based Auxiliary Cells for Robust and Energy Efficient Deep Learning
806Don't Use Large Mini-batches, Use Local SGD
807Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$
808Model Based Reinforcement Learning for Atari
809Generating Multi-Sentence Abstractive Summaries of Interleaved Texts
810On Universal Equivariant Set Networks
811Compressive Hyperspherical Energy Minimization
813Deep End-to-end Unsupervised Anomaly Detection
814Tensor Decompositions for Temporal Knowledge Base Completion
815CloudLSTM: A Recurrent Neural Model for Spatiotemporal Point-cloud Stream Forecasting
816Neural Approximation of an Auto-Regressive Process through Confidence Guided Sampling
817A Simple Randomization Technique for Generalization in Deep Reinforcement Learning
818Stochastic Latent Residual Video Prediction
819AlignNet: Self-supervised Alignment Module
820Learning with Protection: Rejection of Suspicious Samples under Adversarial Environment
821QXplore: Q-Learning Exploration by Maximizing Temporal Difference Error
822Walking the Tightrope: An Investigation of the Convolutional Autoencoder Bottleneck
823Partial Simulation for Imitation Learning
824Few-shot Learning by Focusing on Differences
825Robustness Verification for Transformers
826EnsembleNet: A novel architecture for Incremental Learning
827Anomalous Pattern Detection in Activations and Reconstruction Error of Autoencoders
828Fantastic Generalization Measures and Where to Find Them
829Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks
830Learning De-biased Representations with Biased Representations
831Weakly Supervised Disentanglement with Guarantees
832Imagining the Latent Space of a Variational Auto-Encoders
833A Copula approach for hyperparameter transfer learning
835Provenance detection through learning transformation-resilient watermarking
836Regulatory Focus: Promotion and Prevention Inclinations in Policy Search
837Fairness with Wasserstein Adversarial Networks
838Diagonal Graph Convolutional Networks with Adaptive Neighborhood Aggregation
839Discrepancy Ratio: Evaluating Model Performance When Even Experts Disagree on the Truth
840The Dual Information Bottleneck
841Deep Auto-Deferring Policy for Combinatorial Optimization
842Towards trustworthy predictions from deep neural networks with fast adversarial calibration
843Abductive Commonsense Reasoning
844Variance Reduction With Sparse Gradients
845BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget
846RNA Secondary Structure Prediction By Learning Unrolled Algorithms
847Learning transport cost from subset correspondence
848Attentive Weights Generation for Few Shot Learning via Information Maximization
849Semi-Supervised Few-Shot Learning with a Controlled Degree of Task-Adaptive Conditioning
850Detecting Noisy Training Data with Loss Curves
851Reducing Sentiment Bias in Language Models via Counterfactual Evaluation
852Near-Zero-Cost Differentially Private Deep Learning with Teacher Ensembles
853Neural Network Out-of-Distribution Detection for Regression Tasks
854Rényi Fair Inference
855Reject Illegal Inputs: Scaling Generative Classifiers with Supervised Deep Infomax
856Lean Images for Geo-Localization
857WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
858Deep Lifetime Clustering
859Towards Understanding the Transferability of Deep Representations
860Meta Dropout: Learning to Perturb Latent Features for Generalization
861Adversarial AutoAugment
862When Robustness Doesn’t Promote Robustness: Synthetic vs. Natural Distribution Shifts on ImageNet
863Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
864State-only Imitation with Transition Dynamics Mismatch
865Measuring and Improving the Use of Graph Information in Graph Neural Networks
866Meta-Learning by Hallucinating Useful Examples
867Pixel Co-Occurrence Based Loss Metrics for Super Resolution Texture Recovery
868A Latent Morphology Model for Open-Vocabulary Neural Machine Translation
869Sample-Based Point Cloud Decoder Networks
871BETANAS: Balanced Training and selective drop for Neural Architecture Search
872Connecting the Dots Between MLE and RL for Sequence Prediction
873Universal Approximation with Certified Networks
874Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency
875SEERL: Sample Efficient Ensemble Reinforcement Learning
876Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks
877DyNet: Dynamic Convolution for Accelerating Convolution Neural Networks
878Deep Symbolic Superoptimization Without Human Knowledge
879Unsupervised domain adaptation with imputation
880Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
881A Generative Model for Molecular Distance Geometry
882Generating Biased Datasets for Neural Natural Language Processing
883Robustified Importance Sampling for Covariate Shift
884Fast Task Inference with Variational Intrinsic Successor Features
885Certified Defenses for Adversarial Patches
886Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework
887Contrastive Representation Distillation
888Generating valid Euclidean distance matrices
889Perturbations are not Enough: Generating Adversarial Examples with Spatial Distortions
890Information Theoretic Model Predictive Q-Learning
891On Predictive Information Sub-optimality of RNNs
892Model Inversion Networks for Model-Based Optimization
893Learning to Recognize the Unseen Visual Predicates
894Continuous Control with Contexts, Provably
895Stabilizing Transformers for Reinforcement Learning
897The Detection of Distributional Discrepancy for Text Generation
898Relative Pixel Prediction For Autoregressive Image Generation
900Natural- to formal-language generation using Tensor Product Representations
901Three-Head Neural Network Architecture for AlphaZero Learning
902Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Budget
903Interpretable Network Structure for Modeling Contextual Dependency
904Policy Tree Network
905Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
906Characterize and Transfer Attention in Graph Neural Networks
907Adversarial Neural Pruning
908Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
909A Baseline for Few-Shot Image Classification
910Abstract Diagrammatic Reasoning with Multiplex Graph Networks
911Emergent Systematic Generalization In a Situated Agent
912SoftAdam: Unifying SGD and Adam for better stochastic gradient descent
913ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
914Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
915Amharic Text Normalization with Sequence-to-Sequence Models
916Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
918On the expected running time of nonconvex optimization with early stopping
919Knossos: Compiling AI with AI
920Multiagent Reinforcement Learning in Games with an Iterated Dominance Solution
921CP-GAN: Towards a Better Global Landscape of GANs
922Jacobian Adversarially Regularized Networks for Robustness
923Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
924Improving Federated Learning Personalization via Model Agnostic Meta Learning
925Towards Verified Robustness under Text Deletion Interventions
926Discovering Topics With Neural Topic Models Built From PLSA Loss
927And the Bit Goes Down: Revisiting the Quantization of Neural Networks
928Meta-Learning Runge-Kutta
929RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis
930Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
931Instant Quantization of Neural Networks using Monte Carlo Methods
932Hallucinative Topological Memory for Zero-Shot Visual Planning
933Learning Good Policies By Learning Good Perceptual Models
934Implementation Matters in Deep RL: A Case Study on PPO and TRPO
935A Closer Look at Deep Policy Gradients
936Plug and Play Language Model: A simple baseline for controlled language generation
937Efficient High-Dimensional Data Representation Learning via Semi-Stochastic Block Coordinate Descent Methods
938Understanding and Robustifying Differentiable Architecture Search
939Rethinking the Hyperparameters for Fine-tuning
940UNITER: Learning UNiversal Image-TExt Representations
941Self-Supervised GAN Compression
942Retrieving Signals in the Frequency Domain with Deep Complex Extractors
943Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings
944Implementing Inductive bias for different navigation tasks through diverse RNN attractors
945Disentangling Style and Content in Anime Illustrations
946Dynamic Instance Hardness
947Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning
948A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions
949Is my Deep Learning Model Learning more than I want it to?
950LIA: Latently Invertible Autoencoder with Adversarial Learning
951PCMC-Net: Feature-based Pairwise Choice Markov Chains
952Multi-Agent Interactions Modeling with Correlated Policies
953Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
954Once for All: Train One Network and Specialize it for Efficient Deployment
955Generalized Convolutional Forest Networks for Domain Generalization and Visual Recognition
956Acutum: When Generalization Meets Adaptability
957FR-GAN: Fair and Robust Training
958SNODE: Spectral Discretization of Neural ODEs for System Identification
959Guiding Program Synthesis by Learning to Generate Examples
960Fast Neural Network Adaptation via Parameters Remapping
961Measuring Calibration in Deep Learning
962R2D2: Reuse & Reduce via Dynamic Weight Diffusion for Training Efficient NLP Models
963Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep RL
964On the Distribution of Penultimate Activations of Classification Networks
965Divide-and-Conquer Adversarial Learning for High-Resolution Image Enhancement
966Meta-Learning Deep Energy-Based Memory Models
967Mutual Information Maximization for Robust Plannable Representations
968Depth creates no more spurious local minima in linear networks
970YaoGAN: Learning Worst-case Competitive Algorithms from Self-generated Inputs
971Annealed Denoising score matching: learning Energy based model in high-dimensional spaces
972Finding Winning Tickets with Limited (or No) Supervision
973Graph Convolutional Reinforcement Learning
974Open-Set Domain Adaptation with Category-Agnostic Clusters
975Deep Generative Classifier for Out-of-distribution Sample Detection
976Reparameterized Variational Divergence Minimization for Stable Imitation
977Learning Function-Specific Word Representations
978Swoosh! Rattle! Thump! - Actions that Sound
979Improving and Stabilizing Deep Energy-Based Learning
980Perception-Driven Curiosity with Bayesian Surprise
981Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning
982Towards Effective and Efficient Zero-shot Learning by Fine-tuning with Task Descriptions
984Continual Density Ratio Estimation (CDRE): A new method for evaluating generative models in continual learning
986Kernelized Wasserstein Natural Gradient
987The Curious Case of Neural Text Degeneration
988Universal approximations of permutation invariant/equivariant functions by deep neural networks
989Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation
990What Can Learned Intrinsic Rewards Capture?
991On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
992Implicit Generative Modeling for Efficient Exploration
993Continuous Meta-Learning without Tasks
994Counterfactual Regularization for Model-Based Reinforcement Learning
995Multilingual Alignment of Contextual Word Representations
996A bi-diffusion based layer-wise sampling method for deep learning in large graphs
997Learning Video Representations using Contrastive Bidirectional Transformer
998Unrestricted Adversarial Attacks For Semantic Segmentation
999Randomness in Deconvolutional Networks for Visual Representation
1000HUBERT Untangles BERT to Improve Transfer across NLP Tasks
1001The Gambler's Problem and Beyond
1002CRAP: Semi-supervised Learning via Conditional Rotation Angle Prediction
1003Noisy $\ell^{0}$-Sparse Subspace Clustering on Dimensionality Reduced Data
1004GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation
1005Off-policy Multi-step Q-learning
1006Axial Attention in Multidimensional Transformers
1007Joint text classification on multiple levels with multiple labels
1008Fully Quantized Transformer for Improved Translation
1009The Surprising Behavior Of Graph Neural Networks
1010Double Neural Counterfactual Regret Minimization
1011Resizable Neural Networks
1012Multitask Soft Option Learning
1013Adaptive Adversarial Imitation Learning
1014Representation Learning with Multisets
1015Improving Confident-Classifiers For Out-of-distribution Detection
1016Cyclic Graph Dynamic Multilayer Perceptron for Periodic Signals
1017Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over the Simplex
1018Capsule Networks without Routing Procedures
1019Certifiably Robust Interpretation in Deep Learning
1020Continuous Convolutional Neural Network for Nonuniform Time Series
1021DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL
1022Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
1023Multi-objective Neural Architecture Search via Predictive Network Performance Optimization
1024Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference
1025Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
1026A Mean-Field Theory for Kernel Alignment with Random Features in Generative Adversarial Networks
1027Learning Key Steps to Attack Deep Reinforcement Learning Agents
1028Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
1029On PAC-Bayes Bounds for Deep Neural Networks using the Loss Curvature
1030Deep Graph Matching Consensus
1031Self-Supervised Learning of Appliance Usage
1032Gaussian Conditional Random Fields for Classification
1033Fourier networks for uncertainty estimates and out-of-distribution detection
1034Semantic Hierarchy Emerges in the Deep Generative Representations for Scene Synthesis
1035Quantum Algorithms for Deep Convolutional Neural Networks
1038Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
1039Abstractive Dialog Summarization with Semantic Scaffolds
1040Evaluating Semantic Representations of Source Code
1041Searching to Exploit Memorization Effect in Learning from Corrupted Labels
1042Study of a Simple, Expressive and Consistent Graph Feature Representation
1043Understanding l4-based Dictionary Learning: Interpretation, Stability, and Robustness
1044Balancing Cost and Benefit with Tied-Multi Transformers
1045End-to-End Multi-Domain Task-Oriented Dialogue Systems with Multi-level Neural Belief Tracker
1046All Neural Networks are Created Equal
1047Construction of Macro Actions for Deep Reinforcement Learning
1048BOSH: An Efficient Meta Algorithm for Decision-based Attacks
1049MGP-AttTCN: An Interpretable Machine Learning Model for the Prediction of Sepsis
1050Unsupervised Representation Learning by Predicting Random Distances
1051ConQUR: Mitigating Delusional Bias in Deep Q-Learning
1052Where is the Information in a Deep Network?
1053Extreme Values are Accurate and Robust in Deep Networks
1054Statistically Consistent Saliency Estimation
1055Domain-Independent Dominance of Adaptive Methods
1056Neural Networks for Principal Component Analysis: A New Loss Function Provably Yields Ordered Exact Eigenvectors
1057Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control
1058PNEN: Pyramid Non-Local Enhanced Networks
1059Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
1060FreeLB: Enhanced Adversarial Training for Language Understanding
1061Behaviour Suite for Reinforcement Learning
1062Strategies for Pre-training Graph Neural Networks
1064Refining the variational posterior through iterative optimization
1065Aggregating explanation methods for neural networks stabilizes explanations
1066Recurrent Hierarchical Topic-Guided Neural Language Models
1067Invertible generative models for inverse problems: mitigating representation error and dataset bias
1068An Algorithm-Agnostic NAS Benchmark
1069Learning World Graph Decompositions To Accelerate Reinforcement Learning
1070Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
1071Controlling generative models with continuous factors of variations
1072Emergent Tool Use From Multi-Agent Autocurricula
1073The fairness-accuracy landscape of neural classifiers
1074Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
1075Unsupervised Clustering using Pseudo-semi-supervised Learning
1076Geometric Analysis of Nonconvex Optimization Landscapes for Overcomplete Learning
1078PairNorm: Tackling Oversmoothing in GNNs
1079Training-Free Uncertainty Estimation for Neural Networks
1080Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
1081Empirical Studies on the Properties of Linear Regions in Deep Neural Networks
1082SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning
1083Smoothness and Stability in GANs
1084Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
1085On Bonus Based Exploration Methods In The Arcade Learning Environment
1086Power up! Robust Graph Convolutional Network based on Graph Powering
1087Global graph curvature
1088Deep k-NN for Noisy Labels
1089Filling the Soap Bubbles: Efficient Black-Box Adversarial Certification with Non-Gaussian Smoothing
1090Guided Adaptive Credit Assignment for Sample Efficient Policy Optimization
1091A Theory of Usable Information under Computational Constraints
1092On the Invertibility of Invertible Neural Networks
1093Shallow VAEs with RealNVP Prior Can Perform as Well as Deep Hierarchical VAEs
1094GAN-based Gaussian Mixture Model Responsibility Learning
1095Information-Theoretic Local Minima Characterization and Regularization
1096Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
1097IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
1099HiLLoC: lossless image compression with hierarchical latent variable models
1100Learning to Learn Kernels with Variational Random Features
1101Efficient Wrapper Feature Selection using Autoencoder and Model Based Elimination
1102Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics
1103Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
1104Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs
1105Enhancing Language Emergence through Empathy
1106The Generalization-Stability Tradeoff in Neural Network Pruning
1107Word embedding re-examined: is the symmetrical factorization optimal?
1108Empowering Graph Representation Learning with Paired Training and Graph Co-Attention
1109Learning representations for binary-classification without backpropagation
1110Deep unsupervised feature selection
1111WaveFlow: A Compact Flow-based Model for Raw Audio
1112Mathematical Reasoning in Latent Space
1113Black Box Recursive Translations for Molecular Optimization
1114Improved Generalization Bound of Permutation Invariant Deep Neural Networks
1115Frequency-based Search-control in Dyna
1116Off-policy Bandits with Deficient Support
1117Implicit λ-Jeffreys Autoencoders: Taking the Best of Both Worlds
1118Super-AND: A Holistic Approach to Unsupervised Embedding Learning
1120Recognizing Plans by Learning Embeddings from Observed Action Distributions
1121LEX-GAN: Layered Explainable Rumor Detector Based on Generative Adversarial Networks
1122Towards Stable and Efficient Training of Verifiably Robust Neural Networks
1123Multi-hop Question Answering via Reasoning Chains
1124Factorized Multimodal Transformer for Multimodal Sequential Learning
1125Learning in Confusion: Batch Active Learning with Noisy Oracle
1126Iterative energy-based projection on a normal data manifold for anomaly localization
1127Counting the Paths in Deep Neural Networks as a Performance Predictor
1128Chart Auto-Encoders for Manifold Structured Data
1129Optimizing Loss Landscape Connectivity via Neuron Alignment
1131V1Net: A computational model of cortical horizontal connections
1132Distribution Matching Prototypical Network for Unsupervised Domain Adaptation
1133Deep amortized clustering
1134Using Objective Bayesian Methods to Determine the Optimal Degree of Curvature within the Loss Landscape
1135Towards neural networks that provably know when they don't know
1136BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning
1137Fully Convolutional Graph Neural Networks using Bipartite Graph Convolutions
1138Inductive representation learning on temporal graphs
1139Attention on Abstract Visual Reasoning
1140Starfire: Regularization-Free Adversarially-Robust Structured Sparse Training
1141Convolutional Tensor-Train LSTM for Long-Term Video Prediction
1142An Information Theoretic Approach to Distributed Representation Learning
1143PatchVAE: Learning Local Latent Codes for Recognition
1144A Probabilistic Formulation of Unsupervised Text Style Transfer
1146Feature Map Transform Coding for Energy-Efficient CNN Inference
1147Generative Models for Effective ML on Private, Decentralized Datasets
1148Learning from Partially-Observed Multimodal Data with Variational Autoencoders
1150A Group-Theoretic Framework for Knowledge Graph Embedding
1152Picking Winning Tickets Before Training by Preserving Gradient Flow
1153Exploring Cellular Protein Localization Through Semantic Image Synthesis
1154Learning Calibratable Policies using Programmatic Style-Consistency
1155Contextual Temperature for Language Modeling
1156Retrospection: Leveraging the Past for Efficient Training of Deep Neural Networks
1157Curriculum Loss: Robust Learning and Generalization against Label Corruption
1158Discrete Transformer
1159Adversarially Robust Generalization Just Requires More Unlabeled Data
1160Why Does the VQA Model Answer No?: Improving Reasoning through Visual and Linguistic Inference
1161DeepSFM: Structure From Motion Via Deep Bundle Adjustment
1162IsoNN: Isomorphic Neural Network for Graph Representation Learning and Classification
1163Uncertainty-guided Continual Learning with Bayesian Neural Networks
1164Spline Templated Based Handwriting Generation
1165On Empirical Comparisons of Optimizers for Deep Learning
1166On Evaluating Explainability Algorithms
1167Deep Hierarchical-Hyperspherical Learning (DH^2L)
1168Versatile Anomaly Detection with Outlier Preserving Distribution Mapping Autoencoders
1169Ladder Polynomial Neural Networks
1170Training Recurrent Neural Networks Online by Learning Explicit State Variables
1171How fine can fine-tuning be? Learning efficient language models
1172Improved Modeling of Complex Systems Using Hybrid Physics/Machine Learning/Stochastic Models
1174Deep Expectation-Maximization in Hidden Markov Models via Simultaneous Perturbation Stochastic Approximation
1175Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework
1176Compositional Visual Generation with Energy Based Models
1177Learning Sparsity and Quantization Jointly and Automatically for Neural Network Compression via Constrained Optimization
1178Hierarchical Bayes Autoencoders
1179Wyner VAE: A Variational Autoencoder with Succinct Common Representation Learning
1180Granger Causal Structure Reconstruction from Heterogeneous Multivariate Time Series
1181CGT: Clustered Graph Transformer for Urban Spatio-temporal Prediction
1182Robust Reinforcement Learning for Continuous Control with Model Misspecification
1183Decoupling Representation and Classifier for Long-Tailed Recognition
1184SDGM: Sparse Bayesian Classifier Based on a Discriminative Gaussian Mixture Model
1185Which Tasks Should Be Learned Together in Multi-task Learning?
1187Empirical observations pertaining to learned priors for deep latent variable models
1188MetaPoison: Learning to craft adversarial poisoning examples via meta-learning
1189Teacher-Student Compression with Generative Adversarial Networks
1190Visual Hide and Seek
1191Unsupervised Temperature Scaling: Robust Post-processing Calibration for Domain Shift
1192Pareto Optimality in No-Harm Fairness
1193Domain Adaptation Through Label Propagation: Learning Clustered and Aligned Features
1194Visual Representation Learning with 3D View-Contrastive Inverse Graphics Networks
1195Dream to Control: Learning Behaviors by Latent Imagination
1196From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech
1197Active Learning Graph Neural Networks via Node Feature Propagation
1198Real or Not Real, that is the Question
1199Deep Reinforcement Learning with Implicit Human Feedback
1200Multi-Sample Dropout for Accelerated Training and Better Generalization
1201MelNet: A Generative Model for Audio in the Frequency Domain
1202Semi-Supervised Semantic Dependency Parsing Using CRF Autoencoders
1203Image Classification Through Top-Down Image Pyramid Traversal
1204Cross Domain Imitation Learning
1206DCTD: Deep Conditional Target Densities for Accurate Regression
1207Blending Diverse Physical Priors with Neural Networks
1209Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach
1210Posterior Control of Blackbox Generation
1211A closer look at network resolution for efficient network design
1212Efficient Systolic Array Based on Decomposable MAC for Quantized Deep Neural Networks
1213Improved Image Augmentation for Convolutional Neural Networks by Copyout and CopyPairing
1214On the Evaluation of Conditional GANs
1215JAUNE: Justified And Unified Neural language Evaluation
1216Classification as Decoder: Trading Flexibility for Control in Multi Domain Dialogue
1217Statistical Adaptive Stochastic Optimization
1218Scalable Neural Learning for Verifiable Consistency with Temporal Specifications
1219Model Comparison of Beer data classification using an electronic nose
1220Non-linear System Identification from Partial Observations via Iterative Smoothing and Learning
1221Evaluating Lossy Compression Rates of Deep Generative Models
1222LambdaNet: Probabilistic Type Inference using Graph Neural Networks
1223Variational Autoencoders with Normalizing Flow Decoders
1224Model-Augmented Actor-Critic: Backpropagating through Paths
1225Metagross: Meta Gated Recursive Controller Units for Sequence Modeling
1226Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension
1227Variational Autoencoders for Highly Multivariate Spatial Point Processes Intensities
1228Stochastic Mirror Descent on Overparameterized Nonlinear Models
1229Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
1230Recurrent Chunking Mechanisms for Conversational Machine Reading Comprehension
1231Frequency Analysis for Graph Convolution Network
1232Network Deconvolution
1233Revisiting Self-Training for Neural Sequence Generation
1234Generative Cleaning Networks with Quantized Nonlinear Transform for Deep Neural Network Defense
1235Mutual Exclusivity as a Challenge for Deep Neural Networks
1238Natural Image Manipulation for Autoregressive Models Using Fisher Scores
1239Unifying Part Detection And Association For Multi-person Pose Estimation
1240Towards a Deep Network Architecture for Structured Smoothness
1241A novel text representation which enables image classifiers to perform text classification
1242On the Global Convergence of Training Deep Linear ResNets
1243A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
1244Perceptual Generative Autoencoders
1245Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning
1246JAX MD: End-to-End Differentiable, Hardware Accelerated, Molecular Dynamics in Pure Python
1247Deflecting Adversarial Attacks
1248Biologically inspired sleep algorithm for increased generalization and adversarial robustness in deep neural networks
1249MUSE: Multi-Scale Attention Model for Sequence to Sequence Learning
1250Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication
1251Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?
1252Shifted and Squeezed 8-bit Floating Point format for Low-Precision Training of Deep Neural Networks
1253Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks
1254Intriguing Properties of Adversarial Training at Scale
1255Point Process Flows
1256Cover Filtration and Stable Paths in the Mapper
1257Fully Polynomial-Time Randomized Approximation Schemes for Global Optimization of High-Dimensional Folded Concave Penalized Generalized Linear Models
1258Learning Neural Surrogate Model for Warm-Starting Bayesian Optimization
1259Scalable Differentially Private Data Generation via Private Aggregation of Teacher Ensembles
1260Knowledge Graph Embedding: A Probabilistic Perspective and Generalization Bounds
1261Stabilizing Neural ODE Networks with Stochasticity
1262Adversarial Partial Multi-label Learning
1263Adversarial Interpolation Training: A Simple Approach for Improving Model Robustness
1264Agent as Scientist: Learning to Verify Hypotheses
1265CRNet: Image Super-Resolution Using A Convolutional Sparse Coding Inspired Network
1266Deep Double Descent: Where Bigger Models and More Data Hurt
1267Multigrid Neural Memory
1268ASGen: Answer-containing Sentence Generation to Pre-Train Question Generator for Scale-up Data in Question Answering
1269Distribution-Guided Local Explanation for Black-Box Classifiers
1270Decoding As Dynamic Programming For Recurrent Autoregressive Models
1271Compressed Sensing with Deep Image Prior and Learned Regularization
1272Gradient Surgery for Multi-Task Learning
1274Synthesizing Programmatic Policies that Inductively Generalize
1275Transformer-XH: Multi-hop question answering with eXtra Hop attention
1276Variational Hyper RNN for Sequence Modeling
1277Generalization through Memorization: Nearest Neighbor Language Models
1278Comparing Fine-tuning and Rewinding in Neural Network Pruning
1279Simple is Better: Training an End-to-end Contract Bridge Bidding Agent without Human Knowledge
1280The Sooner The Better: Investigating Structure of Early Winning Lottery Tickets
1281Long History Short-Term Memory for Long-Term Video Prediction
1282Adversarial training with perturbation generator networks
1283Single episode transfer for differing environmental dynamics in reinforcement learning
1284Inducing Stronger Object Representations in Deep Visual Trackers
1287Training Deep Neural Networks with Partially Adaptive Momentum
1288NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension
1289Learning Latent Representations for Inverse Dynamics using Generalized Experiences
1290Learning The Difference That Makes A Difference With Counterfactually-Augmented Data
1291Differentiable Architecture Compression
1292The Early Phase of Neural Network Training
1293Chordal-GCN: Exploiting sparsity in training large-scale graph convolutional networks
1294On The Difficulty of Warm-Starting Neural Network Training
1295NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks
1296Distilled embedding: non-linear embedding factorization using knowledge distillation
1297Incremental RNN: A Dynamical View
1298Domain-Relevant Embeddings for Question Similarity
1299Actor-Critic Approach for Temporal Predictive Clustering
1300Adversarial Privacy Preservation under Attribute Inference Attack
1301Behavior-Guided Reinforcement Learning
1302Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
1303Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
1304Extreme Tensoring for Low-Memory Preconditioning
1305Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
1306Collapsed amortized variational inference for switching nonlinear dynamical systems
1307Non-Autoregressive Dialog State Tracking
1308Channel Equilibrium Networks
1309Independence-aware Advantage Estimation
1310Bayesian Meta Sampling for Fast Uncertainty Adaptation
1311Salient Explanation for Fine-grained Classification
1313Stochastic Gradient Methods with Block Diagonal Matrix Adaptation
1314Harnessing Structures for Value-Based Planning and Reinforcement Learning
1315The Dynamics of Signal Propagation in Gated Recurrent Neural Networks
1316Economy Statistical Recurrent Units For Inferring Nonlinear Granger Causality
1317Discriminability Distillation in Group Representation Learning
1318Calibration, Entropy Rates, and Memory in Language Models
1319Rethinking Generalized Matrix Factorization for Recommendation: The Importance of Multi-hot Encoding
1320Efficient Saliency Maps for Explainable AI
1321Reinforcement Learning with Probabilistically Complete Exploration
1322Unaligned Image-to-Sequence Transformation with Loop Consistency
1323Learning to Generate 3D Training Data through Hybrid Gradient
1324Removing the Representation Error of GAN Image Priors Using the Deep Decoder
1325MEMO: A Deep Network for Flexible Combination of Episodic Memories
1326Superbloom: Bloom filter meets Transformer
1327Longitudinal Enrichment of Imaging Biomarker Representations for Improved Alzheimer's Disease Diagnosis
1328Probabilistic Connection Importance Inference and Lossless Compression of Deep Neural Networks
1329Generating Semantic Adversarial Examples with Differentiable Rendering
1330Guided variational autoencoder for disentanglement learning
1331ManiGAN: Text-Guided Image Manipulation
1332Quantum algorithm for finding the negative curvature direction
1333Dual-module Inference for Efficient Recurrent Neural Networks
1335MixUp as Directional Adversarial Training
1336Towards Interpretable Molecular Graph Representation Learning
1337Representation Learning Through Latent Canonicalizations
1338Winning Privately: The Differentially Private Lottery Ticket Mechanism
1339Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization
1341Correctness Verification of Neural Network
1342Generalizing Natural Language Analysis through Span-relation Representations
1343Jelly Bean World: A Testbed for Never-Ending Learning
1344Characterizing convolutional neural networks with one-pixel signature
1345A Deep Dive into Count-Min Sketch for Extreme Classification in Logarithmic Memory
1346Large-scale Pretraining for Neural Machine Translation with Tens of Billions of Sentence Pairs
1347Learning from Explanations with Neural Module Execution Tree
1348A Coordinate-Free Construction of Scalable Natural Gradient
1349Discovering Motor Programs by Recomposing Demonstrations
1350How Aggressive Can Adversarial Attacks Be: Learning Ordered Top-k Attacks
1351Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier
1352Convergence Behaviour of Some Gradient-Based Methods on Bilinear Zero-Sum Games
1353Aging Memories Generate More Fluent Dialogue Responses with Memory Networks
1354DSReg: Using Distant Supervision as a Regularizer
1355Iterative Target Augmentation for Effective Conditional Generation
1356Composing Task-Agnostic Policies with Deep Reinforcement Learning
1357The Local Elasticity of Neural Networks
1358Gradient-Based Neural DAG Learning
1359On Concept-Based Explanations in Deep Neural Networks
1360Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
1361Learning to Control Latent Representations for Few-Shot Learning of Named Entities
1362Amortized Nesterov's Momentum: Robust and Lightweight Momentum for Deep Learning
1363Recurrent Event Network: Global Structure Inference Over Temporal Knowledge Graph
1364Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data
1365Composition-based Multi-Relational Graph Convolutional Networks
1366Capsules with Inverted Dot-Product Attention Routing
1367The Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions
1368Insights on Visual Representations for Embodied Navigation Tasks
1369Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos
1370On the Unintended Social Bias of Training Language Generation Models with News Articles
1371Role-Wise Data Augmentation for Knowledge Distillation
1372Learning Classifier Synthesis for Generalized Few-Shot Learning
1373Attention Forcing for Sequence-to-sequence Model Training
1374Topic Models with Survival Supervision: Archetypal Analysis and Neural Approaches
1375FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
1376On Need for Topology-Aware Generative Models for Manifold-Based Defenses
1377Neural Execution of Graph Algorithms
1378Objective Mismatch in Model-based Reinforcement Learning
1379Molecular Graph Enhanced Transformer for Retrosynthesis Prediction
1380Non-Sequential Melody Generation
1381Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
1382Visual Explanation for Deep Metric Learning
1383Deep Innovation Protection
1384Alternating Recurrent Dialog Model with Large-Scale Pre-Trained Language Models
1385BERTScore: Evaluating Text Generation with BERT
1386Octave Graph Convolutional Network
1387Learning from Imperfect Annotations: An End-to-End Approach
1388Zeroth Order Optimization by a Mixture of Evolution Strategies
1389Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History
1390Machine Truth Serum
1391Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control
1392GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding
1393Sensible adversarial learning
1394Attention Interpretability Across NLP Tasks
1395Neuron ranking - an informed way to compress convolutional neural networks
1396MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees
1397AdaScale SGD: A Scale-Invariant Algorithm for Distributed Training
1399Bio-Inspired Hashing for Unsupervised Similarity Search
1400Simplicial Complex Networks
1402Underwhelming Generalization Improvements From Controlling Feature Attribution
1403Graph Constrained Reinforcement Learning for Natural Language Action Spaces
1404Solving Packing Problems by Conditional Query Learning
1405Task-Relevant Adversarial Imitation Learning
1406Generative Restricted Kernel Machines
1407Towards Fast Adaptation of Neural Architectures with Meta Learning
1408RL-ST: Reinforcing Style, Fluency and Content Preservation for Unsupervised Text Style Transfer
1409A Functional Characterization of Randomly Initialized Gradient Descent in Deep ReLU Networks
1410Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
1411Toward Understanding Generalization of Over-parameterized Deep ReLU network trained with SGD in Student-teacher Setting
1412Asymptotics of Wide Networks from Feynman Diagrams
1413Symplectic Recurrent Neural Networks
1414Representational Disentanglement for Multi-Domain Image Completion
1415Generalized Bayesian Posterior Expectation Distillation for Deep Neural Networks
1416Learning Cross-Context Entity Representations from Text
1417SPECTRA: Sparse Entity-centric Transitions
1418DeepSimplex: Reinforcement Learning of Pivot Rules Improves the Efficiency of Simplex Algorithm in Solving Linear Programming Problems
1419Learning Temporal Abstraction with Information-theoretic Constraints for Hierarchical Reinforcement Learning
1420Selective Brain Damage: Measuring the Disparate Impact of Model Pruning
1421Asynchronous Stochastic Subgradient Methods for General Nonsmooth Nonconvex Optimization
1422Improved Structural Discovery and Representation Learning of Multi-Agent Data
1423Quantized Reinforcement Learning (QuaRL)
1425NADS: Neural Architecture Distribution Search for Uncertainty Awareness
1426Rigging the Lottery: Making All Tickets Winners
1428Discovering the compositional structure of vector representations with Role Learning Networks
1429Higher-Order Function Networks for Learning Composable 3D Object Representations
1430Adapting to Label Shift with Bias-Corrected Calibration
1431Neural Module Networks for Reasoning over Text
1432Strong Baseline Defenses Against Clean-Label Poisoning Attacks
1434Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees
1435Improved memory in recurrent neural networks with sequential non-normal dynamics
1436Model Imitation for Model-Based Reinforcement Learning
1437Embodied Language Grounding with Implicit 3D Visual Feature Representations
1438Likelihood Contribution based Multi-scale Architecture for Generative Flows
1439A Base Model Selection Methodology for Efficient Fine-Tuning
1440Rethinking Curriculum Learning With Incremental Labels And Adaptive Compensation
1441Graph Neural Networks for Reasoning 2-Quantified Boolean Formulas
1442Learn to Explain Efficiently via Neural Logic Inductive Learning
1443NormLime: A New Feature Importance Metric for Explaining Deep Neural Networks
1444Pre-trained Contextual Embedding of Source Code
1445Certified Robustness to Adversarial Label-Flipping Attacks via Randomized Smoothing
1446Benefit of Interpolation in Nearest Neighbor Algorithms
1447{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery
1448Neural Clustering Processes
1449Improving Neural Language Generation with Spectrum Control
1450Span Recovery for Deep Neural Networks with Applications to Input Obfuscation
1451Unknown-Aware Deep Neural Network
1453A Memory-augmented Neural Network by Resembling Human Cognitive Process of Memorization
1454A Perturbation Analysis of Input Transformations for Adversarial Attacks
1456Locally Constant Networks
1457Smooth Kernels Improve Adversarial Robustness and Perceptually-Aligned Gradients
1458Multi-View Summarization and Activity Recognition Meet Edge Computing in IoT Environments
1459Neural ODEs for Image Segmentation with Level Sets
1460Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
1461PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction
1462Low Rank Training of Deep Neural Networks for Emerging Memory Technology
1463Decentralized Distributed PPO: Mastering PointGoal Navigation
1464MultiGrain: a unified image embedding for classes and instances
1465Learning to Learn by Zeroth-Order Oracle
1466Neural Embeddings for Nearest Neighbor Search Under Edit Distance
1468Robust Federated Learning Through Representation Matching and Adaptive Hyper-parameters
1469ROS-HPL: Robotic Object Search with Hierarchical Policy Learning and Intrinsic-Extrinsic Modeling
1470Knockoff-Inspired Feature Selection via Generative Models
1471MetaPix: Few-Shot Video Retargeting
1472SloMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
1473Stochastic Prototype Embeddings
1474Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog
1475Generalized Transformation-based Gradient
1476Targeted sampling of enlarged neighborhood via Monte Carlo tree search for TSP
1477Black-box Adversarial Attacks with Bayesian Optimization
1478Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving
1479Learning to Combat Compounding-Error in Model-Based Reinforcement Learning
1480Understanding Attention Mechanisms
1481Beyond GANs: Transforming without a Target Distribution
1482Four Things Everyone Should Know to Improve Batch Normalization
1483Learning to solve the credit assignment problem
1484Improving Multi-Manifold GANs with a Learned Noise Prior
1485Overparameterized Neural Networks Can Implement Associative Memory
1486Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts
1487Sampling-Free Learning of Bayesian Quantized Neural Networks
1488A Hierarchy of Graph Neural Networks Based on Learnable Local Features
1489The Blessing of Dimensionality: An Empirical Study of Generalization
1490DeFINE: Deep Factorized Input Word Embeddings for Neural Sequence Modeling
1492Learning to Make Generalizable and Diverse Predictions for Retrosynthesis
1493Disentangled GANs for Controllable Generation of High-Resolution Images
1494Continuous Graph Flow
1495Benchmarking Adversarial Robustness
1497Wasserstein-Bounded Generative Adversarial Networks
1498DBA: Distributed Backdoor Attacks against Federated Learning
1499Learning Generative Models using Denoising Density Estimators
1500Fast is better than free: Revisiting adversarial training
1502Improving Neural Abstractive Summarization Using Transfer Learning and Factuality-Based Evaluation: Towards Automating Science Journalism
1503Deep Multivariate Mixture of Gaussians for Object Detection under Occlusion
1504iWGAN: an Autoencoder WGAN for Inference
1505BERT-AL: BERT for Arbitrarily Long Document Understanding
1506Novelty Search in representational space for sample efficient exploration
1507Switched linear projections and inactive state sensitivity for deep neural network interpretability
1508An Optimization Principle Of Deep Learning?
1509Testing Robustness Against Unforeseen Adversaries
1510Thieves on Sesame Street! Model Extraction of BERT-based APIs
1511Understanding Knowledge Distillation in Non-autoregressive Machine Translation
1512Coordinated Exploration via Intrinsic Rewards for Multi-Agent Reinforcement Learning
1513Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
1514Locality and Compositionality in Zero-Shot Learning
1515Optimistic Adaptive Acceleration for Optimization
1516Situating Sentence Embedders with Nearest Neighbor Overlap
1517Posterior Sampling: Make Reinforcement Learning Sample Efficient Again
1518Generalized Clustering by Learning to Optimize Expected Normalized Cuts
1519Mix-review: Alleviate Forgetting in the Pretrain-Finetune Framework for Neural Language Generation Models
1520The function of contextual illusions
1521Disentangling neural mechanisms for perceptual grouping
1522Adversarial Imitation Attack
1523Regularizing Trajectories to Mitigate Catastrophic Forgetting
1524When Do Variational Autoencoders Know What They Don't Know?
1525Semantic Pruning for Single Class Interpretability
1526Analyzing the Role of Model Uncertainty for Electronic Health Records
1527Chameleon: Adaptive Code Optimization For Expedited Deep Neural Network Compilation
1528Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
1529Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders
1530Not All Features Are Equal: Feature Leveling Deep Neural Networks for Better Interpretation
1531Intrinsic Motivation for Encouraging Synergistic Behavior
1532Noisy Machines: Understanding noisy neural networks and enhancing robustness to analog hardware errors using distillation
1533Perceptual Regularization: Visualizing and Learning Generalizable Representations
1534Neural networks with motivation
1535Improving One-Shot NAS By Suppressing The Posterior Fading
1536Toward Amortized Ranking-Critical Training For Collaborative Filtering
1537ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
1538Curriculum Learning for Deep Generative Models with Clustering
1539Should All Cross-Lingual Embeddings Speak English?
1540Sign-OPT: A Query-Efficient Hard-label Adversarial Attack
1541Playing the lottery with rewards and multiple languages: lottery tickets in RL and NLP
1542Learning Space Partitions for Nearest Neighbor Search
1543Visual Interpretability Alone Helps Adversarial Robustness
1544One-Shot Neural Architecture Search via Compressive Sensing
1545Learning Adversarial Grammars for Future Prediction
1546End-to-end named entity recognition and relation extraction using pre-trained language models
1547How noise affects the Hessian spectrum in overparameterized neural networks
1548A Simple Recurrent Unit with Reduced Tensor Product Representations
1549Parallel Neural Text-to-Speech
1550Context-Aware Object Detection With Convolutional Neural Networks
1551DeepV2D: Video to Depth with Differentiable Structure from Motion
1553Gaussian Process Meta-Representations Of Neural Networks
1555The Break-Even Point on the Optimization Trajectories of Deep Neural Networks
1556Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
1557Exploration Based Language Learning for Text-Based Games
1558Robust And Interpretable Blind Image Denoising Via Bias-Free Convolutional Neural Networks
1559CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
1560Deep Imitative Models for Flexible Inference, Planning, and Control
1561Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness
1562Defensive Quantization Layer For Convolutional Network Against Adversarial Attack
1563Defective Convolutional Layers Learn Robust CNNs
1564DASGrad: Double Adaptive Stochastic Gradient
1565Finding Mixed Strategy Nash Equilibrium for Continuous Games through Deep Learning
1566The Logical Expressiveness of Graph Neural Networks
1568Conditional Out-of-Sample Generation For Unpaired Data using trVAE
1569The Benefits of Over-parameterization at Initialization in Deep ReLU Networks
1570UniLoss: Unified Surrogate Loss by Adaptive Interpolation
1571A Training Scheme for the Uncertain Neuromorphic Computing Chips
1572Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently
1573Deep Graph Translation
1574Are Transformers universal approximators of sequence-to-sequence functions?
1575Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
1576Decoupling Weight Regularization from Batch Size for Model Compression
1577Zero-Shot Out-of-Distribution Detection with Feature Correlations
1578Proactive Sequence Generator via Knowledge Acquisition
1579Interpretable Deep Neural Network Models: Hybrid of Image Kernels and Neural Networks
1580Multi-scale Attributed Node Embedding
1581$\textrm{D}^2$GAN: A Few-Shot Learning Approach with Diverse and Discriminative Feature Synthesis
1582Understanding the functional and structural differences across excitatory and inhibitory neurons
1583One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
1584Differentially Private Meta-Learning
1585Leveraging Adversarial Examples to Obtain Robust Second-Order Representations
1586CLEVRER: Collision Events for Video Representation and Reasoning
1587Using Logical Specifications of Objectives in Multi-Objective Reinforcement Learning
1588Efficient Training of Robust and Verifiable Neural Networks
1589Learning Compositional Koopman Operators for Model-Based Control
1590Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
1591Confidence-Calibrated Adversarial Training: Towards Robust Models Generalizing Beyond the Attack Used During Training
1592All SMILES Variational Autoencoder for Molecular Property Prediction and Optimization
1593Generating Dialogue Responses From A Semantic Latent Space
1594Is There Mode Collapse? A Case Study on Face Generation and Its Black-box Calibration
1595Overlearning Reveals Sensitive Attributes
1596Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks
1597A Kolmogorov Complexity Approach to Generalization in Deep Learning
1598Towards Modular Algorithm Induction
1599Optimal Strategies Against Generative Attacks
1600One Generation Knowledge Distillation by Utilizing Peer Samples
1601Stein Self-Repulsive Dynamics: Benefits from Past Samples
1602Adversarially robust transfer learning
1603One Demonstration Imitation Learning
1604Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
1605Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
1606Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time
1607Contextual Text Style Transfer
1608Modeling question asking using neural program generation
1609Learning to Link
1610Adversarial Attacks on Copyright Detection Systems
1611Detecting Extrapolation with Local Ensembles
1612Revisiting Fine-tuning for Few-shot Learning
1613Global Relational Models of Source Code
1614MONET: Debiasing Graph Embeddings via the Metadata-Orthogonal Training Unit
1615Selection via Proxy: Efficient Data Selection for Deep Learning
1616Deep Learning-Based Average Consensus
1617Meta Learning via Learned Loss
1618Short and Sparse Deconvolution --- A Geometric Approach
1619If MaxEnt RL is the Answer, What is the Question?
1620Stochastic Weight Averaging in Parallel: Large-Batch Training That Generalizes Well
1621Characterizing Missing Information in Deep Networks Using Backpropagated Gradients
1623Scaleable input gradient regularization for adversarial robustness
1624Adjustable Real-time Style Transfer
1625Unsupervised Progressive Learning and the STAM Architecture
1626Wasserstein Robust Reinforcement Learning
1627Knowledge Hypergraphs: Prediction Beyond Binary Relations
1628Dynamics-Aware Unsupervised Skill Discovery
1629A Fine-Grained Spectral Perspective on Neural Networks
1630Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
1632Efficient Riemannian Optimization on the Stiefel Manifold via the Cayley Transform
1634Structured consistency loss for semi-supervised semantic segmentation
1635AMRL: Aggregated Memory For Reinforcement Learning
1636Adapting Behaviour for Learning Progress
1637Pretraining boosts out-of-domain robustness for pose estimation
1638GraphMix: Regularized Training of Graph Neural Networks for Semi-Supervised Learning
1639Synthetic vs Real: Deep Learning on Controlled Noise
1640Detecting malicious PDF using CNN
1642Scalable Model Compression by Entropy Penalized Reparameterization
1643Stochastic Geodesic Optimization for Neural Networks
1644Dynamic Time Lag Regression: Predicting What & When
1645Scholastic-Actor-Critic For Multi Agent Reinforcement Learning
1646On summarized validation curves and generalization
1647Convolutional Bipartite Attractor Networks
1648Anomaly Detection by Deep Direct Density Ratio Estimation
1649New Loss Functions for Fast Maximum Inner Product Search
1650Lipschitz Lifelong Reinforcement Learning
1651Local Label Propagation for Large-Scale Semi-Supervised Learning
1652GumbelClip: Off-Policy Actor-Critic Using Experience Replay
1653Going Deeper with Lean Point Networks
1654Improved Mutual Information Estimation
1655Semi-Supervised Generative Modeling for Controllable Speech Synthesis
1656Towards Physics-informed Deep Learning for Turbulent Flow Prediction
1657Unsupervised Learning from Video with Deep Neural Embeddings
1658Neural Text Generation With Unlikelihood Training
1659Pure and Spurious Critical Points: a Geometric Study of Linear Networks
1660Surrogate-Based Constrained Langevin Sampling With Applications to Optimal Material Configuration Design
1661Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning
1662Mean Field Models for Neural Networks in Teacher-student Setting
1663A Causal View on Robustness of Neural Networks
1664Striving for Simplicity in Off-Policy Deep Reinforcement Learning
1665White Box Network: Obtaining a right composition ordering of functions
1666Deep neuroethology of a virtual rodent
1667DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
1668Emergence of functional and structural properties of the head direction system by optimization of recurrent neural networks
1669Causal Induction from Visual Observations for Goal Directed Tasks
1670Duration-of-Stay Storage Assignment under Uncertainty
1671CAQL: Continuous Action Q-Learning
1673Your classifier is secretly an energy based model and you should treat it like one
1674On the Linguistic Capacity of Real-time Counter Automata
1675Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels
1676Adaptive Structural Fingerprints for Graph Attention Networks
1677Inductive Matrix Completion Based on Graph Neural Networks
1678Neural Operator Search
1679Time2Vec: Learning a Vector Representation of Time
1680ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring
1681Conditional Learning of Fair Representations
1682Mean-field Behaviour of Neural Tangent Kernel for Deep Neural Networks
1683TabNet: Attentive Interpretable Tabular Learning
1684Adapt-to-Learn: Policy Transfer in Reinforcement Learning
1685Identity Crisis: Memorization and Generalization Under Extreme Overparameterization
1686Stiffness: A New Perspective on Generalization in Neural Networks
1687Linguistic Embeddings as a Common-Sense Knowledge Repository: Challenges and Opportunities
1688First-Order Preconditioning via Hypergradient Descent
1689Feature Partitioning for Efficient Multi-Task Architectures
1690Layer Flexible Adaptive Computation Time for Recurrent Neural Networks
1691Curvature-based Robustness Certificates against Adversarial Examples
1692Adversarial Video Generation on Complex Datasets
1693Topological Autoencoders
1694Context-Gated Convolution
1695Reinforcement Learning without Ground-Truth State
1696Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin
1697In-Domain Representation Learning For Remote Sensing
1698Training Neural Networks for and by Interpolation
1699FAN: Focused Attention Networks
1700Unsupervised Data Augmentation for Consistency Training
1701Assessing Generalization in TD methods for Deep Reinforcement Learning
1702Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
1703Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
1704The Effect of Neural Net Architecture on Gradient Confusion & Training Performance
1705Making DenseNet Interpretable: A Case Study in Clinical Radiology
1706Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space
1707Regularizing Deep Multi-Task Networks using Orthogonal Gradients
1708Fast Training of Sparse Graph Neural Networks on Dense Hardware
1709Simultaneous Classification and Out-of-Distribution Detection Using Deep Neural Networks
1710Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
1711Long-term planning, short-term adjustments
1712Imitation Learning via Off-Policy Distribution Matching
1713Unsupervised Learning of Automotive 3D Crash Simulations using LSTMs
1714Augmenting Transformers with KNN-Based Composite Memory
1715SGD with Hardness Weighted Sampling for Distributionally Robust Deep Learning
1716Constrained Markov Decision Processes via Backward Value Functions
1717Reanalysis of Variance Reduced Temporal Difference Learning
1718Meta-Learning for Variational Inference
1720Defending Against Adversarial Examples by Regularized Deep Embedding
1721Minimizing FLOPs to Learn Efficient Sparse Representations
1722Neural-Guided Symbolic Regression with Asymptotic Constraints
1723Policy Optimization In the Face of Uncertainty
1724DropGrad: Gradient Dropout Regularization for Meta-Learning
1725Understanding Top-k Sparsification in Distributed Deep Learning
1726Entropy Penalty: Towards Generalization Beyond the IID Assumption
1727Improving Semantic Parsing with Neural Generator-Reranker Architecture
1728Learning a Behavioral Repertoire from Demonstrations
1730Deep symbolic regression
1731Autoencoders and Generative Adversarial Networks for Imbalanced Sequence Classification
1732Doubly Normalized Attention
1733Uncertainty-Aware Prediction for Graph Neural Networks
1734Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space
1735Lattice Representation Learning
1736Omnibus Dropout for Improving The Probabilistic Classification Outputs of ConvNets
1737Deep Multiple Instance Learning for Taxonomic Classification of Metagenomic read sets
1738Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
1739RoBERTa: A Robustly Optimized BERT Pretraining Approach
1740Deep Semi-Supervised Anomaly Detection
1741GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
1742Out-of-distribution Detection in Few-shot Classification
1743Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification
1744Mirror-Generative Neural Machine Translation
1745Frustratingly easy quasi-multitask learning
1746Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
1747TrojanNet: Exposing the Danger of Trojan Horse Attack on Neural Networks
1748Robust Learning with Jacobian Regularization
1749Generalized Inner Loop Meta-Learning
1750Sign Bits Are All You Need for Black-Box Attacks
1751Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
1752Pre-training as Batch Meta Reinforcement Learning with tiMe
1753On Global Feature Pooling for Fine-grained Visual Categorization
1754Exploring by Exploiting Bad Models in Model-Based Reinforcement Learning
1755Reinforced active learning for image segmentation
1756Variational inference of latent hierarchical dynamical systems in neuroscience: an application to calcium imaging data
1757Neural Architecture Search by Learning Action Space for Monte Carlo Tree Search
1758Gradientless Descent: High-Dimensional Zeroth-Order Optimization
1759Equivariant Entity-Relationship Networks
1760Modeling Fake News in Social Networks with Deep Multi-Agent Reinforcement Learning
1761Unsupervised Few-shot Object Recognition by Integrating Adversarial, Self-supervision, and Deep Metric Learning of Latent Parts
1762On the "steerability" of generative adversarial networks
1763GASL: Guided Attention for Sparsity Learning in Deep Neural Networks
1764Affine Self Convolution
1765Improving Differentially Private Models with Active Learning
1766Matrix Multilayer Perceptron
1767BEAN: Interpretable Representation Learning with Biologically-Enhanced Artificial Neuronal Assembly Regularization
1768Feature-Robustness, Flatness and Generalization Error for Deep Neural Networks
1769TriMap: Large-scale Dimensionality Reduction Using Triplets
1771Frontal low-rank random tensors for high-order feature representation
1772Learning General and Reusable Features via Racecar-Training
1773Higher-order Weighted Graph Convolutional Networks
1774Estimating counterfactual treatment outcomes over time through adversarially balanced representations
1775Poincaré Wasserstein Autoencoder
1776Robust Instruction-Following in a Situated Agent via Transfer-Learning from Text
1777Stochastic Conditional Generative Networks with Basis Decomposition
1778Task-Based Top-Down Modulation Network for Multi-Task-Learning Applications
1779Global reasoning network for image super-resolution
1780Tensor Graph Convolutional Networks for Prediction on Dynamic Graphs
1781Matching Distributions via Optimal Transport for Semi-Supervised Learning
1782GraphNVP: an Invertible Flow-based Model for Generating Molecular Graphs
1783Language GANs Falling Short
1784GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations
1785Last-iterate convergence rates for min-max optimization
1786Poisoning Attacks with Generative Adversarial Nets
1787Parameterized Action Reinforcement Learning for Inverted Index Match Plan Generation
1788Learnable Group Transform For Time-Series
1789From English to Foreign Languages: Transferring Pre-trained Language Models
1790COPHY: Counterfactual Learning of Physical Dynamics
1791Semi-Supervised Few-Shot Learning with Prototypical Random Walks
1792Why Convolutional Networks Learn Oriented Bandpass Filters: A Hypothesis
1793Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning
1794Unsupervised Out-of-Distribution Detection with Batch Normalization
1795Understanding the Limitations of Variational Mutual Information Estimators
1796Latent Question Reformulation and Information Accumulation for Multi-Hop Machine Reading
1797Hamiltonian Generative Networks
1798Customizing Sequence Generation with Multi-Task Dynamical Systems
1799Extracting and Leveraging Feature Interaction Interpretations
1800Zero-Shot Medical Image Artifact Reduction
1801Quantum Expectation-Maximization for Gaussian Mixture Models
1802Behavior Regularized Offline Reinforcement Learning
1803Encoder-Agnostic Adaptation for Conditional Language Generation
1804Optimizing Data Usage via Differentiable Rewards
1805Dropout: Explicit Forms and Capacity Control
1806Training Interpretable Convolutional Neural Networks towards Class-specific Filters
1807Faster Neural Network Training with Data Echoing
1808Kronecker Attention Networks
1809Farkas layers: don't shift the data, fix the geometry
1810Non-Gaussian processes and neural networks at finite widths
1811Unsupervised Model Selection for Variational Disentangled Representation Learning
1812Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation
1813How much Position Information Do Convolutional Neural Networks Encode?
1814A Theoretical Analysis of the Number of Shots in Few-Shot Learning
1815Event extraction from unstructured Amharic text
1816Representation Learning for Remote Sensing: An Unsupervised Sensor Fusion Approach
1817Natural Language State Representation for Reinforcement Learning
1818Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
1819Project and Forget: Solving Large Scale Metric Constrained Problems
1820On the Variance of the Adaptive Learning Rate and Beyond
1821Translation Between Waves, wave2wave
1822Quantifying the Cost of Reliable Photo Authentication via High-Performance Learned Lossy Representations
1823Improving End-to-End Object Tracking Using Relational Reasoning
1824Attention Privileged Reinforcement Learning for Domain Transfer
1825Sliced Cramer Synaptic Consolidation for Preserving Deeply Learned Representations
1826On Variational Learning of Controllable Representations for Text without Supervision
1827Disentangled Representation Learning with Sequential Residual Variational Autoencoder
1828Improved Training Speed, Accuracy, and Data Utilization via Loss Function Optimization
1829Using Hindsight to Anchor Past Knowledge in Continual Learning
1830Empirical confidence estimates for classification by deep neural networks
1831iSOM-GSN: An Integrative Approach for Transforming Multi-omic Data into Gene Similarity Networks via Self-organizing Maps
1832Learning Numeral Embedding
1833Localized Generations with Deep Neural Networks for Multi-Scale Structured Datasets
1834AlgoNet: $C^\infty$ Smooth Algorithmic Neural Networks
1835Temporal-difference learning for nonlinear value function approximation in the lazy training regime
1836A Bayes-Optimal View on Adversarial Examples
1837Efficient Content-Based Sparse Attention with Routing Transformers
1838Good Semi-supervised VAE Requires Tighter Evidence Lower Bound
1839Option Discovery using Deep Skill Chaining
1841PowerSGD: Powered Stochastic Gradient Descent Methods for Accelerated Non-Convex Optimization
1842Deep Randomized Least Squares Value Iteration
1843Self-Supervised Policy Adaptation
1845OmniNet: A unified architecture for multi-modal multi-task learning
1846Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition
1848TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising
1849V4D: 4D Convolutional Neural Networks for Video-level Representations Learning
1850ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs
1851Learning to Represent Programs with Property Signatures
1852Unified recurrent network for many feature types
1853Restoration of Video Frames from a Single Blurred Image with Motion Understanding
1854Improving Dirichlet Prior Network for Out-of-Distribution Example Detection
1855Variational Autoencoders for Opponent Modeling in Multi-Agent Systems
1856Prototype Recalls for Continual Learning
1857Generative Ratio Matching Networks
1858Emergence of Compositional Language with Deep Generational Transmission
1859Deep Gradient Boosting -- Layer-wise Input Normalization of Neural Networks
1860A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
1861Bridging ELBO objective and MMD
1862In Search for a SAT-friendly Binarized Neural Network Architecture
1863EfferenceNets for latent space planning
1864Neural networks are a priori biased towards Boolean functions with low entropy
1866Wider Networks Learn Better Features
1867Conditional Invertible Neural Networks for Guided Image Generation
1868Cost-Effective Testing of a Deep Learning Model through Input Reduction
1869Hebbian Graph Embeddings
1870NeuralUCB: Contextual Bandits with Neural Network-Based Exploration
1871Meta-Graph: Few shot Link Prediction via Meta Learning
1872Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
1873An implicit function learning approach for parametric modal regression
1874The asymptotic spectrum of the Hessian of DNN throughout training
1875Auto-Encoding Explanatory Examples
1876RISE and DISE: Two Frameworks for Learning from Time Series with Missing Data
1877Fast Machine Learning with Byzantine Workers and Servers
1878How the Softmax Activation Hinders the Detection of Adversarial and Out-of-Distribution Examples in Neural Networks
1879Tree-Structured Attention with Hierarchical Accumulation
1880Deep 3D Pan via Local adaptive "t-shaped" convolutions with global and local adaptive dilations
1881MANAS: Multi-Agent Neural Architecture Search
1882SimulS2S: End-to-End Simultaneous Speech to Speech Translation
1883Enhancing Attention with Explicit Phrasal Alignments
1884LightPAFF: A Two-Stage Distillation Framework for Pre-training and Fine-tuning
1885Robust saliency maps with distribution-preserving decoys
1886Role of two learning rates in convergence of model-agnostic meta-learning
1887Low-Resource Knowledge-Grounded Dialogue Generation
1888Generative Multi Source Domain Adaptation
1889GResNet: Graph Residual Network for Reviving Deep GNNs from Suspended Animation
1890Realism Index: Interpolation in Generative Models With Arbitrary Prior
1891Deep RL for Blood Glucose Control: Lessons, Challenges, and Opportunities
1893Training Provably Robust Models by Polyhedral Envelope Regularization
1894FleXOR: Trainable Fractional Quantization
1895DP-LSSGD: An Optimization Method to Lift the Utility in Privacy-Preserving ERM
1896Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head
1897AdaX: Adaptive Gradient Descent with Exponential Long Term Memory
1899Disentangling Improves VAEs' Robustness to Adversarial Attacks
1900Sparsity Meets Robustness: Channel Pruning for the Feynman-Kac Formalism Principled Robust Deep Neural Nets
1902On Recovering Latent Factors From Sampling And Firing Graph
1903Influence-Based Multi-Agent Exploration
1904Demonstration Actor Critic
1905Deep Coordination Graphs
1906Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation
1907How Well Do WGANs Estimate the Wasserstein Metric?
1908Revisiting the Generalization of Adaptive Gradient Methods
1909An Information Theoretic Perspective on Disentangled Representation Learning
1910Multiplicative Interactions and Where to Find Them
1912DIVA: Domain Invariant Variational Autoencoder
1913Continual Learning with Bayesian Neural Networks for Non-Stationary Data
1914RPGAN: random paths as a latent space for GAN interpretability
1915SAdam: A Variant of Adam for Strongly Convex Functions
1916Improving the Generalization of Visual Navigation Policies using Invariance Regularization
1917Improving the robustness of ImageNet classifiers using elements of human visual cognition
1918Differentially Private Survival Function Estimation
1919Size-free generalization bounds for convolutional neural networks
1920Scaling Laws for the Principled Design, Initialization, and Preconditioning of ReLU Networks
1921A Fair Comparison of Graph Neural Networks for Graph Classification
1922Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents
1923Computation Reallocation for Object Detection
1925Sparse Networks from Scratch: Faster Training without Losing Performance
1926Modeling Winner-Take-All Competition in Sparse Binary Projections
1927Laplacian Denoising Autoencoder
1928Training Data Distribution Search with Ensemble Active Learning
1929Meta-Learning without Memorization
1931From Variational to Deterministic Autoencoders
1932Adversarially Robust Representations with Smooth Encoders
1933AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
1934Representation Quality Explain Adversarial Attacks
1935Inferring Dynamical Systems with Long-Range Dependencies through Line Attractor Regularization
1936End-To-End Input Selection for Deep Neural Networks
1937Hierarchical Graph-to-Graph Translation for Molecules
1938Teaching GAN to generate per-pixel annotation
1939ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
1940DeepEnFM: Deep neural networks with Encoder enhanced Factorization Machine
1942Decaying momentum helps neural network training
1943Regularizing Black-box Models for Improved Interpretability
1945Needles in Haystacks: On Classifying Tiny Objects in Large Images
1946Quadratic GCN for graph classification
1947The advantage of using Student's t-priors in variational autoencoders
1948Finite Depth and Width Corrections to the Neural Tangent Kernel
1949Order Learning and Its Application to Age Estimation
1950Couple-VAE: Mitigating the Encoder-Decoder Incompatibility in Variational Text Modeling with Coupled Deterministic Networks
1951Distilling Neural Networks for Faster and Greener Dependency Parsing
1952Model-based Saliency for the Detection of Adversarial Examples
1953Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
1954BUZz: BUffer Zones for defending adversarial examples in image classification
1955Efficient and Information-Preserving Future Frame Prediction and Beyond
1956Path Space for Recurrent Neural Networks with ReLU Activations
1957Wasserstein Adversarial Regularization (WAR) on label noise
1958Self-Supervised Speech Recognition via Local Prior Matching
1959SRDGAN: learning the noise prior for Super Resolution with Dual Generative Adversarial Networks
1960Amata: An Annealing Mechanism for Adversarial Training Acceleration
1961An Inter-Layer Weight Prediction and Quantization for Deep Neural Networks based on Smoothly Varying Weight Hypothesis
1962Context Based Machine Translation With Recurrent Neural Network For English-Amharic Translation
1963Robust Domain Randomization for Reinforcement Learning
1964NAS evaluation is frustratingly hard
1965Ellipsoidal Trust Region Methods for Neural Network Training
1966Learning Semantically Meaningful Representations Through Embodiment
1967Superseding Model Scaling by Penalizing Dead Units and Points with Separation Constraints
1968Artificial Design: Modeling Artificial Super Intelligence with Extended General Relativity and Universal Darwinism via Geometrization for Universal Design Automation
1969Robust Graph Representation Learning via Neural Sparsification
1970Hyperbolic Discounting and Learning Over Multiple Horizons
1971CLN2INV: Learning Loop Invariants with Continuous Logic Networks
1972Gated Channel Transformation for Visual Recognition
1973Federated User Representation Learning
1975Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base
1976Variational pSOM: Deep Probabilistic Clustering with Self-Organizing Maps
1977Augmenting Self-attention with Persistent Memory
1978Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels
1979Ridge Regression: Structure, Cross-Validation, and Sketching
1980Hindsight Trust Region Policy Optimization
1981Policy Optimization with Stochastic Mirror Descent
1982Graph convolutional networks for learning with few clean and many noisy labels
1983A Constructive Prediction of the Generalization Error Across Scales
1984MLModelScope: A Distributed Platform for ML Model Evaluation and Benchmarking at Scale
1985A Mention-Pair Model of Annotation with Nonparametric User Communities
1986An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
1987NPTC-net: Narrow-Band Parallel Transport Convolutional Neural Network on Point Clouds
1988Mogrifier LSTM
1989Individualised Dose-Response Estimation using Generative Adversarial Nets
1990Physics-as-Inverse-Graphics: Unsupervised Physical Parameter Estimation from Video
1991Trajectory representation learning for Multi-Task NMRDPs planning
1992Incorporating Horizontal Connections in Convolution by Spatial Shuffling
1993Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field
1994Counterfactuals uncover the modular structure of deep generative models
1995Pushing the bounds of dropout
1996Confidence Scores Make Instance-dependent Label-noise Learning Possible
1997Gap-Aware Mitigation of Gradient Staleness
1998Evaluating and Calibrating Uncertainty Prediction in Regression Tasks
1999Ensemble Distribution Distillation
2000Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation
2001On the Tunability of Optimizers in Deep Learning
2002Gradient Perturbation is Underrated for Differentially Private Convex Optimization
2003VL-BERT: Pre-training of Generic Visual-Linguistic Representations
2004Credible Sample Elicitation by Deep Learning, for Deep Learning
2005Neural Markov Logic Networks
2006Optimistic Exploration even with a Pessimistic Initialisation
2007Better Optimization for Neural Architecture Search with Mixed-Level Reformulation
2008Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning
2009Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing
2010LabelFool: A Trick in the Label Space
2011RGTI: Response generation via templates integration for End to End dialog
2012Towards Disentangling Non-Robust and Robust Components in Performance Metric
2013A Mechanism of Implicit Regularization in Deep Learning
2014Feature-map-level Online Adversarial Knowledge Distillation
2015Optimising Neural Network Architectures for Provable Adversarial Robustness
2016Recurrent Independent Mechanisms
2017An Explicitly Relational Neural Network Architecture
2018Branched Multi-Task Networks: Deciding What Layers To Share
2019MxPool: Multiplex Pooling for Hierarchical Graph Representation Learning
2020Mixture-of-Experts Variational Autoencoder for clustering and generating from similarity-based representations
2021Temporal Difference Weighted Ensemble For Reinforcement Learning
2022Task Level Data Augmentation for Meta-Learning
2023Effect of top-down connections in Hierarchical Sparse Coding
2024Compressive Recovery Defense: A Defense Framework for $\ell_0, \ell_2$ and $\ell_\infty$ norm attacks.
2025Match prediction from group comparison data using neural networks
2026Extractor-Attention Network: A New Attention Network with Hybrid Encoders for Chinese Text Classification
2027Identifying through Flows for Recovering Latent Representations
2028Robust training with ensemble consensus
2029Fault Tolerant Reinforcement Learning via A Markov Game of Control and Stopping
2031Hierarchical Summary-to-Article Generation
2032Unsupervised-Learning of time-varying features
2033Self-Adversarial Learning with Comparative Discrimination for Text Generation
2034A General Upper Bound for Unsupervised Domain Adaptation
2035Vid2Game: Controllable Characters Extracted from Real-World Videos
2036Action Semantics Network: Considering the Effects of Actions in Multiagent Systems
2037Growing Action Spaces
2038Learning Generative Image Object Manipulations from Language Instructions
2039Discourse-Based Evaluation of Language Understanding
2040Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
2041Relational State-Space Model for Stochastic Multi-Object Systems
2042TSInsight: A local-global attribution framework for interpretability in time-series data
2044Structural Language Models for Any-Code Generation
2045How does Lipschitz Regularization Influence GAN Training?
2046Simple and Effective Stochastic Neural Networks
2047Robust Reinforcement Learning with Wasserstein Constraint
2048Cross-Iteration Batch Normalization
2049Model Ensemble-Based Intrinsic Reward for Sparse Reward Reinforcement Learning
2050The Effect of Residual Architecture on the Per-Layer Gradient of Deep Networks
2051Prune or quantize? Strategy for Pareto-optimally low-cost and accurate CNN
2052Graph Residual Flow for Molecular Graph Generation
2053Nonlinearities in activations substantially shape the loss surfaces of neural networks
2054Attention over Parameters for Dialogue Systems
2055The Convex Information Bottleneck Lagrangian
2056The problem with DDPG: understanding failures in deterministic environments with sparse rewards
2057LocalGAN: Modeling Local Distributions for Adversarial Response Generation
2058Hierarchical Image-to-image Translation with Nested Distributions Modeling
2059Generative Adversarial Networks For Data Scarcity Industrial Positron Images With Attention
2060OvA-INN: Continual Learning with Invertible Neural Networks
2061Contextual Inverse Reinforcement Learning
2062Mining GANs for knowledge transfer to small domains
2063Learning Time-Aware Assistance Functions for Numerical Fluid Solvers
2064Transition Based Dependency Parser for Amharic Language Using Deep Learning
2065Samples Are Useful? Not Always: denoising policy gradient updates using variance explained
2066Learning Surrogate Losses
2067Boosting Network: Learn by Growing Filters and Layers via SplitLBI
2068Split LBI for Deep Learning: Structural Sparsity via Differential Inclusion Paths
2069Generalizing Deep Multi-task Learning with Heterogeneous Structured Networks
2070Unsupervised Universal Self-Attention Network for Graph Classification
2071FairFace: A Novel Face Attribute Dataset for Bias Measurement and Mitigation
2072Manifold Modeling in Embedded Space: A Perspective for Interpreting "Deep Image Prior"
2073Novelty Detection Via Blurring
2074Small-GAN: Speeding up GAN Training using Core-Sets
2075Bounds on Over-Parameterization for Guaranteed Existence of Descent Paths in Shallow ReLU Networks
2076Data-Independent Neural Pruning via Coresets
2077Deeper Insights into Weight Sharing in Neural Architecture Search
2078Learnable Higher-order Representation for Action Recognition
2079Dirichlet Wrapper to Quantify Classification Uncertainty in Black-Box Systems
2080S2VG: Soft Stochastic Value Gradient method
2081Deep Network classification by Scattering and Homotopy dictionary learning
2082Scalable Generative Models for Graphs with Graph Attention Mechanism
2083Continuous Adaptation in Multi-agent Competitive Environments
2084Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
2085Combiner: Inductively Learning Tree Structured Attention in Transformers
2086Robust Cross-lingual Embeddings from Parallel Sentences
2087Semi-supervised Learning by Coaching
2089Blockwise Self-Attention for Long Document Understanding
2090Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
2091I am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively
2092Black-Box Adversarial Attack with Transferable Model-based Embedding
2093Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients
2094Understanding Distributional Ambiguity via Non-robust Chance Constraint
2095MobileBERT: Task-Agnostic Compression of BERT by Progressive Knowledge Transfer
2096Do Image Classifiers Generalize Across Time?
2097Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation
2098Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
2099A shallow feature extraction network with a large receptive field for stereo matching tasks
2100Learning Boolean Circuits with Neural Networks
2101ProxNet: End-to-End Learning of Structured Representation by Proximal Mapping
2102Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
2103Towards Principled Objectives for Contrastive Disentanglement
2104Compositional languages emerge in a neural iterated learning model
2105Population-Guided Parallel Policy Search for Reinforcement Learning
2106Classification Logit Two-sample Testing by Neural Networks
2107Variational Recurrent Models for Solving Partially Observable Control Tasks
2108Learning to Discretize: Solving 1D Scalar Conservation Laws via Deep Reinforcement Learning
2109Towards Unifying Neural Architecture Space Exploration and Generalization
2110Composable Semi-parametric Modelling for Long-range Motion Generation
2111Towards an Adversarially Robust Normalization Approach
2112Generative Latent Flow
2113Adversarial Example Detection and Classification with Asymmetrical Adversarial Training
2115Generalized Natural Language Grounded Navigation via Environment-agnostic Multitask Learning
2116Global Concavity and Optimization in a Class of Dynamic Discrete Choice Models
2117Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information
2118On the Pareto Efficiency of Quantized CNN
2119BANANAS: Bayesian Optimization with Neural Networks for Neural Architecture Search
2120Potential Flow Generator with $L_2$ Optimal Transport Regularity for Generative Models
2121Integrative Tensor-based Anomaly Detection System For Satellites
2122Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions
2123MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius
2124TinyBERT: Distilling BERT for Natural Language Understanding
2126Semantically-Guided Representation Learning for Self-Supervised Monocular Depth
2127Stochastic AUC Maximization with Deep Neural Networks
2128Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures
2129Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity
2130Why ADAM Beats SGD for Attention Models
2131Reflection-based Word Attribute Transfer
2132Difference-Seeking Generative Adversarial Network--Unseen Sample Generation
2133EINS: Long Short-Term Memory with Extrapolated Input Network Simplification
2134FasterSeg: Searching for Faster Real-time Semantic Segmentation
2136Meta Module Network for Compositional Visual Reasoning
2137Min-max Entropy for Weakly Supervised Pointwise Localization
2138Editable Neural Networks
2139Parallel Scheduled Sampling
2140Learning Explainable Models Using Attribution Priors
2141Efficient Inference and Exploration for Reinforcement Learning
2142Leveraging inductive bias of neural networks for learning without explicit human annotations
2143Bias-Resilient Neural Network
2144Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis
2145Accelerating Reinforcement Learning Through GPU Atari Emulation
2146Can gradient clipping mitigate label noise?
2147Concise Multi-head Attention Models
2148Tensorized Embedding Layers for Efficient Model Compression
2149Rethinking Neural Network Quantization
2150Zero-shot task adaptation by homoiconic meta-mapping
2151iSparse: Output Informed Sparsification of Neural Networks
2152HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing enabled embedding of n-gram statistics
2153Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
2154Fast Linear Interpolation for Piecewise-Linear Functions, GAMs, and Deep Lattice Networks
2155Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system
2156Collaborative Generated Hashing for Market Analysis and Fast Cold-start Recommendation
2157Pruned Graph Scattering Transforms
2158DDSP: Differentiable Digital Signal Processing
2159Continual Learning via Neural Pruning
2160Min-Max Optimization without Gradients: Convergence and Applications to Adversarial ML
2161XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
2162Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning
2163GLAD: Learning Sparse Graph Recovery
2164PDP: A General Neural Framework for Learning SAT Solvers
2165Adaptive Loss Scaling for Mixed Precision Training
2166Quantifying Exposure Bias for Neural Language Generation
2167How many weights are enough: can tensor factorization learn efficient policies?
2168Domain Aggregation Networks for Multi-Source Domain Adaptation
2169Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming
2170AHash: A Load-Balanced One Permutation Hash
2171Ordinary differential equations on graph networks
2172Lift-the-flap: what, where and when for context reasoning
2173Unifying Question Answering, Text Classification, and Regression via Span Extraction
2174Supervised learning with incomplete data via sparse representations
2175Conversation Generation with Concept Flow
2176The Probabilistic Fault Tolerance of Neural Networks in the Continuous Limit
2177Variational Hashing-based Collaborative Filtering with Self-Masking
2178Neural Network Branching for Neural Network Verification
2179SoftLoc: Robust Temporal Localization under Label Misalignment
2180VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
2181Adaptive Data Augmentation with Deep Parallel Generative Models
2182Domain-invariant Learning using Adaptive Filter Decomposition
2183Topology of deep neural networks
2184Adversarial Policies: Attacking Deep Reinforcement Learning
2185Escaping Saddle Points Faster with Stochastic Momentum
2186Few-shot Text Classification with Distributional Signatures
2187RotationOut as a Regularization Method for Neural Network
2188Universal Approximation with Deep Narrow Networks
2189A Dynamic Approach to Accelerate Deep Learning Training
2190Geometric Insights into the Convergence of Nonlinear TD Learning
2191Efficient Multivariate Bandit Algorithm with Path Planning
2192Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling
2193Exploring Model-based Planning with Policy Networks
2194Benchmarking Model-Based Reinforcement Learning
2195Encoder-decoder Network as Loss Function for Summarization
2196Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks
2197On Identifiability in Transformers
2198Automated curriculum generation through setter-solver interactions
2199Deep Multi-View Learning via Task-Optimal CCA
2200Bandlimiting Neural Networks Against Adversarial Attacks
2201Progressive Memory Banks for Incremental Domain Adaptation
2202MMD GAN with Random-Forest Kernels
2203What graph neural networks cannot learn: depth vs width
2205Learning an off-policy predictive state representation for deep reinforcement learning for vision-based steering in autonomous driving
2206RTFM: Generalising to New Environment Dynamics via Reading
2207MIM: Mutual Information Machine
2208Real or Fake: An Empirical Study and Improved Model for Fake Face Detection
2209Constant Time Graph Neural Networks
2210AutoLR: A Method for Automatic Tuning of Learning Rate
2211Generating Robust Audio Adversarial Examples using Iterative Proportional Clipping
2212Optimal Attacks on Reinforcement Learning Policies
2213Multi-Agent Hierarchical Reinforcement Learning for Humanoid Navigation
2214SMiRL: Surprise Minimizing RL in Entropic Environments
2215Mesh-Free Unsupervised Learning-Based PDE Solver of Forward and Inverse problems
2216Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models
2217Sparse and Structured Visual Attention
2218Network Pruning for Low-Rank Binary Index
2219Style-based Encoder Pre-training for Multi-modal Image Synthesis
2220LDMGAN: Reducing Mode Collapse in GANs with Latent Distribution Matching
2221Bootstrapping the Expressivity with Model-based Planning
2222DeepAGREL: Biologically plausible deep learning via direct reinforcement
2223Homogeneous Linear Inequality Constraints for Neural Network Activations
2224Leveraging Simple Model Predictions for Enhancing its Performance
2225Modeling treatment events in disease progression
2226DG-GAN: the GAN with the duality gap
2227Stochastic Gradient Descent with Biased but Consistent Gradient Estimators
2228One-way prototypical networks
2229Encoding word order in complex embeddings
2231Functional vs. parametric equivalence of ReLU networks
2232A New Multi-input Model with the Attention Mechanism for Text Classification
2233Multi-Dimensional Explanation of Reviews
2234A Uniform Generalization Error Bound for Generative Adversarial Networks
2235QGAN: Quantize Generative Adversarial Networks to Extreme low-bits
2236Learning to Transfer Learn
2237Contrastive Learning of Structured World Models
2238Disentangling Factors of Variations Using Few Labels
2239Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality
2240EDUCE: Explaining model Decision through Unsupervised Concepts Extraction
2241Target-directed Atomic Importance Estimation via Reverse Self-attention
2242A critical analysis of self-supervision, or what we can learn from a single image
2243Accelerating SGD with momentum for over-parameterized learning
2244Discrete InfoMax Codes for Meta-Learning
2245The Geometry of Sign Gradient Descent
2246Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
2247Attributes Obfuscation with Complex-Valued Features
2248V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
2249MDE: Multiple Distance Embeddings for Link Prediction in Knowledge Graphs
2250Improving Adversarial Robustness Requires Revisiting Misclassified Examples
2251Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control
2252InfoCNF: Efficient Conditional Continuous Normalizing Flow Using Adaptive Solvers
2253Mirror Descent View For Neural Network Quantization
2254Hierarchical Disentangle Network for Object Representation Learning
2255Deep Multiple Instance Learning with Gaussian Weighting
2256Mitigating Posterior Collapse in Strongly Conditioned Variational Autoencoders
2257Zeno++: Robust Fully Asynchronous SGD
2258DivideMix: Learning with Noisy Labels as Semi-supervised Learning
2259PAD-Nets: Learning Dynamic Receptive Fields via Pixel-Wise Adaptive Dilation
2260PLEX: PLanner and EXecutor for Embodied Learning in Navigation
2261DeepObfusCode: Source Code Obfuscation Through Sequence-to-Sequence Networks
2262Extreme Value k-means Clustering
2263Adaptive network sparsification with dependent variational beta-Bernoulli dropout
2264Data-dependent Gaussian Prior Objective for Language Generation
2265Learning Representations in Reinforcement Learning: an Information Bottleneck Approach
2266LSTOD: Latent Spatial-Temporal Origin-Destination prediction model and its applications in ride-sharing platforms
2267Ecological Reinforcement Learning
2268Dual-Component Deep Domain Adaptation: A New Approach for Cross Project Software Vulnerability Detection
2269Towards Understanding the Regularization of Adversarial Robustness on Neural Networks
2270MaskConvNet: Training Efficient ConvNets from Scratch via Budget-constrained Filter Pruning
2271Fast Bilinear Matrix Normalization via Rank-1 Update
2272Scale-Equivariant Neural Networks with Decomposed Convolutional Filters
2273A novel Bayesian estimation-based word embedding model for sentiment analysis
2274Attacking Lifelong Learning Models with Gradient Reversion
2275Learning with Long-term Remembering: Following the Lead of Mixed Stochastic Gradient
2276A Harmonic Structure-Based Neural Network Model for Musical Pitch Detection
2277Fooling Detection Alone is Not Enough: Adversarial Attack against Multiple Object Tracking
2278Towards A Unified Min-Max Framework for Adversarial Exploration and Robustness
2279Domain-Agnostic Few-Shot Classification by Learning Disparate Modulators
2280Anomaly Detection and Localization in Images using Guided Attention
2281Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards
2282Logic and the 2-Simplicial Transformer
2283PAC-Bayes Few-shot Meta-learning with Implicit Learning of Model Prior Distribution
2284Reinforcement Learning with Chromatic Networks
2286Deep Mining: Detecting Anomalous Patterns in Neural Network Activations with Subset Scanning
2287A Data-Efficient Mutual Information Neural Estimator for Statistical Dependency Testing
2288Enhancing Adversarial Defense by k-Winners-Take-All
2289Thwarting finite difference adversarial attacks with output randomization
2290Exploration in Reinforcement Learning with Deep Covering Options
2291Towards Controllable and Interpretable Face Completion via Structure-Aware and Frequency-Oriented Attentive GANs
2292Learning audio representations with self-supervision
2293Learning Disentangled Representations for CounterFactual Regression
2294Learning relevant features for statistical inference
2295VILD: Variational Imitation Learning with Diverse-quality Demonstrations
2296Entropy Minimization In Emergent Languages
2297A Unified framework for randomized smoothing based certified defenses
2298Analysis of Video Feature Learning in Two-Stream CNNs on the Example of Zebrafish Swim Bout Classification
2299MIST: Multiple Instance Spatial Transformer Networks
2300ISBNet: Instance-aware Selective Branching Networks
2301MODiR: Multi-Objective Dimensionality Reduction for Joint Data Visualisation
2302Robust Local Features for Improving the Generalization of Adversarial Training
2303Online and stochastic optimization beyond Lipschitz continuity: A Riemannian approach
2304Distributed Online Optimization with Long-Term Constraints
2305Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
2306Learning the Arrow of Time for Problems in Reinforcement Learning
2307Topological based classification using graph convolutional networks
2308The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget
2309AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
2310Sequence-level Intrinsic Exploration Model for Partially Observable Domains
2311Pipelined Training with Stale Weights of Deep Convolutional Neural Networks
2312StacNAS: Towards Stable and Consistent Optimization for Differentiable Neural Architecture Search
2313Universal Learning Approach for Adversarial Defense
2314Boosting Generative Models by Leveraging Cascaded Meta-Models
2315Quantitatively Disentangling and Understanding Part Information in CNNs
2316The Implicit Bias of Depth: How Incremental Learning Drives Generalization
2318Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness
2319Measuring Compositional Generalization: A Comprehensive Method on Realistic Data
2320Theory and Evaluation Metrics for Learning Disentangled Representations
2321Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks
2322Dynamically Pruned Message Passing Networks for Large-scale Knowledge Graph Reasoning
2324Universal Source-Free Domain Adaptation
2325Learning Invariants through Soft Unification
2326Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
2327Macro Action Ensemble Searching Methodology for Deep Reinforcement Learning
2329Increasing batch size through instance repetition improves generalization
2330FSPool: Learning Set Representations with Featurewise Sort Pooling
2331Recurrent Neural Networks are Universal Filters
2332On the Convergence of FedAvg on Non-IID Data
2333Adversarially Robust Neural Networks via Optimal Control: Bridging Robustness with Lyapunov Stability
2334Multi-agent Reinforcement Learning for Networked System Control
2335Learning to Anneal and Prune Proximity Graphs for Similarity Search
2336Deep Bayesian Structure Networks
2337Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
2338Keyframing the Future: Discovering Temporal Hierarchy with Keyframe-Inpainter Prediction
2339Differential Privacy in Adversarial Learning with Provable Robustness
2340Topology-Aware Pooling via Graph Attention
2341Siamese Attention Networks
2342Neural Stored-program Memory
2343ES-MAML: Simple Hessian-Free Meta Learning
2344Enforcing Physical Constraints in Neural Neural Networks through Differentiable PDE Layer
2345TabFact: A Large-scale Dataset for Table-based Fact Verification
2346Evidence-Aware Entropy Decomposition For Active Deep Learning
2347Learning to Generate Grounded Visual Captions without Localization Supervision
2348Extreme Triplet Learning: Effectively Optimizing Easy Positives and Hard Negatives
2349Implicit Bias of Gradient Descent based Adversarial Training on Separable Data
2350Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis
2351BERT Wears GloVes: Distilling Static Embeddings from Pretrained Contextual Representations
2352The Visual Task Adaptation Benchmark
2353Input Alignment along Chaotic directions increases Stability in Recurrent Neural Networks
2354 3D-SIC: 3D Semantic Instance Completion for RGB-D Scans
2355Learning Similarity Metrics for Numerical Simulations
2356Image-guided Neural Object Rendering
2357MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics
2358Effective and Robust Detection of Adversarial Examples via Benford-Fourier Coefficients
2359Stablizing Adversarial Invariance Induction by Discriminator Matching
2360Natural Language Adversarial Attack and Defense in Word Level
2361Amharic Light Stemmer
2362Dynamical Clustering of Time Series Data Using Multi-Decoder RNN Autoencoder
2363POP-Norm: A Theoretically Justified and More Accelerated Normalization Approach
2364Programmable Neural Network Trojan for Pre-trained Feature Extractor
2365Cost-Effective Interactive Neural Attention Learning
2366On Layer Normalization in the Transformer Architecture
2367PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search
2368Knowledge Consistency between Neural Networks and Beyond
2369Temporal Probabilistic Asymmetric Multi-task Learning
2370Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information
2371Corpus Based Amharic Sentiment Lexicon Generation
2372Principled Weight Initialization for Hypernetworks
2373Additive Powers-of-Two Quantization: A Non-uniform Discretization for Neural Networks
2374Transfer Alignment Network for Double Blind Unsupervised Domain Adaptation
2375Towards understanding the true loss surface of deep neural networks using random matrix theory and iterative spectral methods
2376Neural Architecture Search in Embedding Space
2377Enhancing Transformation-Based Defenses Against Adversarial Attacks with a Distribution Classifier
2378Single Deep Counterfactual Regret Minimization
2379HaarPooling: Graph Pooling with Compressive Haar Basis
2380Safe Policy Learning for Continuous Control
2381A Stochastic Trust Region Method for Non-convex Minimization
2382Learning Effective Exploration Strategies For Contextual Bandits
2383Improving Batch Normalization with Skewness Reduction for Deep Neural Networks
2384Adversarial Inductive Transfer Learning with input and output space adaptation
2385Graph Neural Networks For Multi-Image Matching
2386An Empirical Study on Post-processing Methods for Word Embeddings
2388High performance RNNs with spiking neurons
2389CLAREL: classification via retrieval loss for zero-shot learning
2390Observational Overfitting in Reinforcement Learning
2391On Mutual Information Maximization for Representation Learning
2392Localizing and Amortizing: Efficient Inference for Gaussian Processes
2393PNAT: Non-autoregressive Transformer by Position Learning
2394On unsupervised-supervised risk and one-class neural networks
2395Tranquil Clouds: Neural Networks for Learning Temporally Coherent Features in Point Clouds
2396Distillation $\approx$ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized NN
2397Bayesian Inference for Large Scale Image Classification
2398Ranking Policy Gradient
2399How Does Learning Rate Decay Help Modern Neural Networks?
2400Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
2401SVQN: Sequential Variational Soft Q-Learning Networks
2402Classification Attention for Chinese NER
2403Understanding Isomorphism Bias in Graph Data Sets
2404Neural Machine Translation with Universal Visual Representation
2405Towards More Realistic Neural Network Uncertainties
2406Understanding Architectures Learnt by Cell-based Neural Architecture Search
2407Soft Token Matching for Interpretable Low-Resource Classification
2408Beyond Classical Diffusion: Ballistic Graph Neural Network
2409Hierarchical Complement Objective Training
2410Understanding and Stabilizing GANs' Training Dynamics with Control Theory
2411Variance Reduced Local SGD with Lower Communication Complexity
2412AutoQ: Automated Kernel-Wise Neural Network Quantization
2413Quantifying Layerwise Information Discarding of Neural Networks and Beyond
2414GDP: Generalized Device Placement for Dataflow Graphs
2415Unveiling Hidden Biases in Deep Networks with Classification Images and Spike Triggered Analysis
2416Generalization Puzzles in Deep Networks
2417Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization
2418Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
2419HighRes-net: Multi-Frame Super-Resolution by Recursive Fusion
2420A Learning-based Iterative Method for Solving Vehicle Routing Problems
2421Transferable Perturbations of Deep Feature Distributions
2422Rethinking the Security of Skip Connections in ResNet-like Neural Networks
2423ProtoAttend: Attention-Based Prototypical Learning
2424A Signal Propagation Perspective for Pruning Neural Networks at Initialization
2425Wildly Unsupervised Domain Adaptation and Its Powerful and Efficient Solution
2426Automatically Learning Feature Crossing from Model Interpretation for Tabular Data
2427Continual Learning with Adaptive Weights (CLAW)
2428Interpretability Evaluation Framework for Deep Neural Networks
2429Progressive Upsampling Audio Synthesis via Effective Adversarial Training
2430Learning Compact Reward for Image Captioning
2431S-Flow GAN
2432Gradient-free Neural Network Training by Multi-convex Alternating Optimization
2433Semi-supervised Semantic Segmentation using Auxiliary Network
2434Intensity-Free Learning of Temporal Point Processes
2435Scalable and Order-robust Continual Learning with Additive Parameter Decomposition
2436Discriminator Based Corpus Generation for General Code Synthesis
2437Storage Efficient and Dynamic Flexible Runtime Channel Pruning via Deep Reinforcement Learning
2439Weakly Supervised Clustering by Exploiting Unique Class Count
2440Domain Adaptation via Low-Rank Basis Approximation
2441Learning to Control PDEs with Differentiable Physics
2442Linear Symmetric Quantization of Neural Networks for Low-precision Integer Hardware
2443Estimating Gradients for Discrete Random Variables by Sampling without Replacement
2444Structural Multi-agent Learning
2445A Gradient-based Architecture HyperParameter Optimization Approach
2446On importance-weighted autoencoders
2447FALCON: Fast and Lightweight Convolution for Compressing and Accelerating CNN
2448Multi-Task Adapters for On-Device Audio Inference
2449Mincut Pooling in Graph Neural Networks
2450Dual Graph Representation Learning
2451Unsupervised Few Shot Learning via Self-supervised Training
2452To Relieve Your Headache of Training an MRF, Take AdVIL
2453ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization
2454On the Dynamics and Convergence of Weight Normalization for Training Neural Networks
2455CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition
2456Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
2457Revisit Knowledge Distillation: a Teacher-free Framework
2458SesameBERT: Attention for Anywhere
2459Automated Relational Meta-learning
2460Training Deep Networks with Stochastic Gradient Normalized by Layerwise Adaptive Second Moments
2461Boosting Ticket: Towards Practical Pruning for Adversarial Training with Lottery Ticket Hypothesis
2462Moniqua: Modulo Quantized Communication in Decentralized SGD
2463Defending Against Physically Realizable Attacks on Image Classification
2464Certifying Distributional Robustness using Lipschitz Regularisation
2466N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
2467Subgraph Attention for Node Classification and Hierarchical Graph Pooling
2468Are there any 'object detectors' in the hidden layers of CNNs trained to identify objects or scenes?
2469Learning Human Postural Control with Hierarchical Acquisition Functions
2470Unsupervised Intuitive Physics from Past Experiences
2471Expected Tight Bounds for Robust Deep Neural Network Training
2472Analytical Moment Regularizer for Training Robust Networks
2473Model Architecture Controls Gradient Descent Dynamics: A Combinatorial Path-Based Formula
2474Deep Learning of Determinantal Point Processes via Proper Spectral Sub-gradient
2475Collaborative Filtering With A Synthetic Feedback Loop
2476Self-Supervised State-Control through Intrinsic Mutual Information Rewards
2477Stagnant zone segmentation with U-net
2478Distance-Based Learning from Errors for Confidence Calibration
2479Curvature Graph Network
2480Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer
2481Generative Imputation and Stochastic Prediction
2483Learning Expensive Coordination: An Event-Based Deep RL Approach
2484Unifying Graph Convolutional Networks as Matrix Factorization
2485Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks
2486Model-free Learning Control of Nonlinear Stochastic Systems with Stability Guarantee
2487Depth-Recurrent Residual Connections for Super-Resolution of Real-Time Renderings
2488LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning
2489GenDICE: Generalized Offline Estimation of Stationary Values
2490Deep Audio Prior
2491Compressing Deep Neural Networks With Learnable Regularization
2493SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering
2494Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization
2495Learning Out-of-distribution Detection without Out-of-distribution Data
2496Prox-SGD: Training Structured Neural Networks under Regularization and Constraints
2497Unsupervised Learning of Node Embeddings by Detecting Communities
2498Diverse Trajectory Forecasting with Determinantal Point Processes
2499Bridging the domain gap in cross-lingual document classification
2500Evaluating The Search Phase of Neural Architecture Search
2501Learning to Defense by Learning to Attack
2502Smooth Regularized Reinforcement Learning
2503On Robustness of Neural Ordinary Differential Equations
2504Diving into Optimization of Topology in Neural Networks
2505FoveaBox: Beyond Anchor-based Object Detection
2506Cascade Style Transfer
2507Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
2508Unifying Graph Convolutional Neural Networks and Label Propagation
2509Equivariant neural networks and equivarification
2510Towards a Unified Evaluation of Explanation Methods without Ground Truth
2511Data Valuation using Reinforcement Learning
2512RL-LIM: Reinforcement Learning-based Locally Interpretable Modeling
2513BackPACK: Packing more into Backprop
2514DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures
2515Regional based query in graph active learning
2516Group-Connected Multilayer Perceptron Networks
2517Towards Stable and comprehensive Domain Alignment: Max-Margin Domain-Adversarial Training
2518Depth-Adaptive Transformer
2519VUSFA: Variational Universal Successor Features Approximator
2520InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization
2521Federated Adversarial Domain Adaptation
2522CATER: A diagnostic dataset for Compositional Actions & TEmporal Reasoning
2523Learning Structured Communication for Multi-agent Reinforcement Learning
2524Utilizing Edge Features in Graph Neural Networks via Variational Information Maximization
2525Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters
2526Utility Analysis of Network Architectures for 3D Point Cloud Processing
2527Effective Mechanism to Mitigate Injuries During NFL Plays
2528TechKG: A Large-Scale Chinese Technology-Oriented Knowledge Graph
2529Learning Reusable Options for Multi-Task Reinforcement Learning
2530Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
2531X-Forest: Approximate Random Projection Trees for Similarity Measurement
2532From Here to There: Video Inbetweening Using Direct 3D Convolutions
2533Low Bias Gradient Estimates for Very Deep Boolean Stochastic Networks
2534Automatically Discovering and Learning New Visual Categories with Ranking Statistics
2535Support-guided Adversarial Imitation Learning
2536Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification
2537Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells
2538Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data
2539Data augmentation instead of explicit regularization
2540SCL: Towards Accurate Domain Adaptive Object Detection via Gradient Detach Based Stacked Complementary Losses
2541SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
2542Label Cleaning with Likelihood Ratio Test
2543Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks
2544Graph Neural Networks Exponentially Lose Expressive Power for Node Classification
2546Graph inference learning for semi-supervised classification
2547Sparse Coding with Gated Learned ISTA
2548Dimensional Reweighting Graph Convolution Networks
2550Explaining A Black-box By Using A Deep Variational Information Bottleneck Approach
2551Learning deep graph matching with channel-independent embedding and Hungarian attention
2552EnsembleNet: End-to-End Optimization of Multi-headed Models
2553Out-of-Distribution Detection Using Layerwise Uncertainty in Deep Neural Networks
2554Semantics Preserving Adversarial Attacks
2555Ensemble methods and LSTM outperformed other eight machine learning classifiers in an EEG-based BCI experiment
2556Scaling Up Neural Architecture Search with Big Single-Stage Models
2557AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
2558Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching
2559EgoMap: Projective mapping and structured egocentric memory for Deep RL
2560Accelerated Information Gradient flow
2561Adversarial Attribute Learning by Exploiting negative correlated attributes
2562StructPool: Structured Graph Pooling via Conditional Random Fields
2563On the Decision Boundaries of Deep Neural Networks: A Tropical Geometry Perspective
2564Probabilistic modeling the hidden layers of Deep Neural Networks
2565IEG: Robust neural net training with severe label noises
2566VideoEpitoma: Efficient Recognition of Long-range Actions
2567On the Weaknesses of Reinforcement Learning for Neural Machine Translation
2568Stochastically Controlled Compositional Gradient for the Composition problem
2569Sharing Knowledge in Multi-Task Deep Reinforcement Learning
2571Deep Reasoning Networks: Thinking Fast and Slow, for Pattern De-mixing
2572When Does Self-supervision Improve Few-shot Learning?
2573Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
2574Context-aware Attention Model for Coreference Resolution
2575SELF: Learning to Filter Noisy Labels with Self-Ensembling
2576Neural Maximum Common Subgraph Detection with Guided Subgraph Extraction
2577Amharic Negation Handling
2578Noise Regularization for Conditional Density Estimation
2579Star-Convexity in Non-Negative Matrix Factorization
2580Count-guided Weakly Supervised Localization Based on Density Map
2581Scoring-Aggregating-Planning: Learning task-agnostic priors from interactions and sparse rewards for zero-shot generalization
2582SSE-PT: Sequential Recommendation Via Personalized Transformer
2583Wide Neural Networks are Interpolating Kernel Methods: Impact of Initialization on Generalization
2584Improving Evolutionary Strategies with Generative Neural Networks
2585Analysis and Interpretation of Deep CNN Representations as Perceptual Quality Features
2586Program Guided Agent
2587Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
2588Prestopping: How Does Early Stopping Help Generalization Against Label Noise?
2589Carpe Diem, Seize the Samples Uncertain "at the Moment" for Adaptive Batch Selection
2590Large Batch Optimization for Deep Learning: Training BERT in 76 minutes