Machine Learning - LingLab


702 views, uploaded 2020-08-15 09:38:02

The following article is from 语言学探索.

Today's cs.LG listing contains 74 papers in total.

Graph (1 paper)

[1]: GraphKKE: Graph Kernel Koopman Embedding for Human Microbiome Analysis
Authors: Kateryna Melnyk, Stefan Klus, Grégoire Montavon, Tim Conrad
Link: https://arxiv.org/abs/2008.05903

Abstract: More and more diseases have been found to be strongly correlated with disturbances in the microbiome constitution, e.g., obesity, diabetes, or some cancer types. Thanks to modern high-throughput omics technologies, it has become possible to directly analyze the human microbiome and its influence on the health status. Microbial communities are monitored over long periods of time and the associations between their members are explored. These relationships can be described by a time-evolving graph. In order to understand responses of the microbial community members to a distinct range of perturbations such as antibiotics exposure or diseases and general dynamical properties, the time-evolving graph of the human microbial communities has to be analyzed. This becomes especially challenging due to dozens of complex interactions among microbes and metastable dynamics. The key to solving this problem is the representation of the time-evolving graphs as fixed-length feature vectors preserving the original dynamics. We propose a method for learning the embedding of the time-evolving graph that is based on the spectral analysis of transfer operators and graph kernels. We demonstrate that our method can capture temporary changes in the time-evolving graph on both created synthetic data and real-world data. Our experiments demonstrate the efficacy of the method. Furthermore, we show that our method can be applied to human microbiome data to study dynamic processes.

Federated Learning (1 paper)

[1]: WAFFLe: Weight Anonymized Factorization for Federated Learning
Authors: Weituo Hao, Nikhil Mehta, Kevin J Liang, Pengyu Cheng, Mostafa El-Khamy, Lawrence Carin
Link: https://arxiv.org/abs/2008.05687

Abstract: In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices. In light of this need, federated learning has emerged as a popular training paradigm. However, many federated learning approaches trade transmitting data for communicating updated weight parameters for each local device. Therefore, a successful breach that would have otherwise directly compromised the data instead grants whitebox access to the local model, which opens the door to a number of attacks, including exposing the very data federated learning seeks to protect. Additionally, in distributed scenarios, individual client devices commonly exhibit high statistical heterogeneity. Many common federated approaches learn a single global model; while this may do well on average, performance degrades when the i.i.d. assumption is violated, underfitting individuals further from the mean, and raising questions of fairness. To address these issues, we propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks. Experiments on MNIST, FashionMNIST, and CIFAR-10 demonstrate WAFFLe's significant improvement to local test performance and fairness while simultaneously providing an extra layer of security.

Adversarial Examples/GAN (2 papers)

[1]: Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion
Authors: Dipjyoti Paul, Muhammed PV Shifas, Yannis Pantazis, Yannis Stylianou
Comments: Accepted in INTERSPEECH 2020
Link: https://arxiv.org/abs/2008.05809

Abstract: The increased adoption of digital assistants makes text-to-speech (TTS) synthesis systems an indispensable feature of modern mobile devices. It is hence desirable to build a system capable of generating highly intelligible speech in the presence of noise. Past studies have investigated style conversion in TTS synthesis, yet degraded synthesized quality often leads to worse intelligibility. To overcome such limitations, we proposed a novel transfer learning approach using Tacotron and WaveRNN based TTS synthesis. The proposed speech system exploits two modification strategies: (a) Lombard speaking style data and (b) Spectral Shaping and Dynamic Range Compression (SSDRC) which has been shown to provide high intelligibility gains by redistributing the signal energy on the time-frequency domain. We refer to this extension as Lombard-SSDRC TTS system. Intelligibility enhancement as quantified by the Intelligibility in Bits (SIIB-Gauss) measure shows that the proposed Lombard-SSDRC TTS system shows significant relative improvement between 110% and 130% in speech-shaped noise (SSN), and 47% to 140% in competing-speaker noise (CSN) against the state-of-the-art TTS approach. Additional subjective evaluation shows that Lombard-SSDRC TTS successfully increases the speech intelligibility with relative improvement of 455% for SSN and 104% for CSN in median keyword correction rate compared to the baseline TTS method.

[2]: Process Discovery for Structured Program Synthesis
Authors: Dell Zhang, Alexander Kuhnle, Julian Richardson, Murat Sensoy
Link: https://arxiv.org/abs/2008.05804

Abstract: A core task in process mining is process discovery which aims to learn an accurate process model from event log data. In this paper, we propose to use (block-) structured programs directly as target process models so as to establish connections to the field of program synthesis and facilitate the translation from abstract process models to executable processes, e.g., for robotic process automation. Furthermore, we develop a novel bottom-up agglomerative approach to the discovery of such structured program process models. In comparison with the popular top-down recursive inductive miner, our proposed agglomerative miner enjoys a similar theoretical guarantee to produce sound process models (without deadlocks and other anomalies) while exhibiting some advantages like avoiding silent activities and accommodating duplicate activities. The proposed algorithm works by iteratively applying a few graph rewriting rules to the directly-follows-graph of activities. For real-world (sparse) directly-follows-graphs, the algorithm has quadratic computational complexity with respect to the number of distinct activities. To our knowledge, this is the first process discovery algorithm that is made for the purpose of program synthesis. Experiments on the BPI-Challenge 2020 dataset and the Karel programming dataset have demonstrated that our proposed algorithm can outperform the inductive miner not only according to the traditional process discovery metrics but also in terms of the effectiveness in finding out the true underlying structured program from a small number of its execution traces.

Weakly/Semi-/Unsupervised Learning (2 papers)

[1]: A statistical theory of semi-supervised learning
Authors: Laurence Aitchison
Link: https://arxiv.org/abs/2008.05913

Abstract: We currently lack a solid statistical understanding of semi-supervised learning methods, instead treating them as a collection of highly effective tricks. This precludes the principled combination e.g. of Bayesian methods and semi-supervised learning, as semi-supervised learning objectives are not currently formulated as likelihoods for an underlying generative model of the data. Here, we note that standard image benchmark datasets such as CIFAR-10 are carefully curated, and we provide a generative model describing the curation process. Under this generative model, several state-of-the-art semi-supervised learning techniques, including entropy minimization, pseudo-labelling and the FixMatch family emerge naturally as variational lower-bounds on the log-likelihood.
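
As a rough illustration of two of the objectives named in this abstract, the sketch below computes an entropy term and a pseudo-labelling loss for a single unlabeled prediction. The function names and the 0.95 confidence threshold are illustrative assumptions, not taken from the paper, and the paper's variational derivation is not reproduced here.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def pseudo_label_loss(probs, threshold=0.95):
    """Cross-entropy against the argmax class, counted only when the model is confident.

    Returns 0.0 when the top probability is below `threshold`
    (the threshold value is an assumption made for this sketch).
    """
    top = max(probs)
    if top < threshold:
        return 0.0
    return -math.log(top)

confident = [0.97, 0.02, 0.01]
uncertain = [0.4, 0.35, 0.25]
print(entropy(confident) < entropy(uncertain))  # True
print(pseudo_label_loss(uncertain))             # 0.0
```

Minimizing either quantity on unlabeled data pushes predictions toward confident, low-entropy outputs, which is the behaviour the abstract says can be read as a lower bound on a log-likelihood.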

[2]: Neural collaborative filtering for unsupervised mitral valve segmentation in echocardiography
Authors: Luca Corinzia, Fabian Laumer, Alessandro Candreva, Maurizio Taramasso, Francesco Maisano, Joachim M. Buhmann
Link: https://arxiv.org/abs/2008.05867

Abstract: The segmentation of the mitral valve annulus and leaflets specifies a crucial first step to establish a machine learning pipeline that can support physicians in performing multiple tasks, e.g., diagnosis of mitral valve diseases, surgical planning, and intraoperative procedures. Current methods for mitral valve segmentation on 2D echocardiography videos require extensive interaction with annotators and perform poorly on low-quality and noisy videos. We propose an automated and unsupervised method for the mitral valve segmentation based on a low dimensional embedding of the echocardiography videos using neural network collaborative filtering. The method is evaluated in a collection of echocardiography videos of patients with a variety of mitral valve diseases, and additionally on an independent test cohort. It outperforms state-of-the-art unsupervised and supervised methods on low-quality videos or in the case of sparse annotation.

Zero/One-Shot, Transfer Learning, Domain Adaptation (1 paper)

[1]: MLNET: An Adaptive Multiple Receptive-field Attention Neural Network for Voice Activity Detection
Authors: Zhenpeng Zheng, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
Comments: will be presented in INTERSPEECH 2020
Link: https://arxiv.org/abs/2008.05650

Abstract: Voice activity detection (VAD) makes a distinction between speech and non-speech and its performance is of crucial importance for speech based services. Recently, deep neural network (DNN)-based VADs have achieved better performance than conventional signal processing methods. Existing DNN-based models typically rely on a handcrafted fixed window to exploit contextual speech information and improve VAD performance. However, a fixed window of contextual speech information cannot handle varied, unpredictable noise environments or highlight the speech information that is critical to the VAD task. To solve this problem, this paper proposes an adaptive multiple receptive-field attention neural network, called MLNET, for the VAD task. MLNET uses multiple branches to extract multiple kinds of contextual speech information and an attention block to weight the most crucial parts of the context for the final classification. Experiments in real-world scenarios demonstrate that the proposed MLNET-based model outperforms the other baselines.

Reinforcement Learning (4 papers)

[1]: Offline Meta-Reinforcement Learning with Advantage Weighting
Authors: Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn
Comments: 8 pages main text; 18 pages total
Link: https://arxiv.org/abs/2008.06043

Abstract: Massive datasets have proven critical to successfully applying deep learning to real-world problems, catalyzing progress on tasks such as object recognition, speech transcription, and machine translation. In this work, we study an analogous problem within reinforcement learning: can we enable an agent to leverage large, diverse experiences from previous tasks in order to quickly learn a new task? While recent work has shown some promise towards offline reinforcement learning, considerably less work has studied how we might leverage offline behavioral data when transferring to new tasks. To address this gap, we consider the problem setting of offline meta-reinforcement learning. By nature of being offline, algorithms for offline meta-RL can utilize the largest possible pool of training data available, and eliminate potentially unsafe or costly data collection during meta-training. Targeting this setting, we propose Meta-Actor Critic with Advantage Weighting (MACAW), an optimization-based meta-learning algorithm that uses simple, supervised regression objectives for both inner-loop adaptation and outer-loop meta-learning. To our knowledge, MACAW is the first successful combination of gradient-based meta-learning and value-based reinforcement learning. We empirically find that this approach enables fully offline meta-reinforcement learning and achieves notable gains over prior methods in some settings.

[2]: Reinforcement Learning with Trajectory Feedback
Authors: Yonathan Efroni, Nadav Merlis, Shie Mannor
Link: https://arxiv.org/abs/2008.06036

Abstract: The computational model of reinforcement learning is based upon the ability to query a score of every visited state-action pair, i.e., to observe a per state-action reward signal. However, in practice, it is often the case that such a score is not readily available to the algorithm designer. In this work, we relax this assumption and require a weaker form of feedback, which we refer to as trajectory feedback. Instead of observing the reward from every visited state-action pair, we assume we only receive a score that represents the quality of the whole trajectory observed by the agent. We study natural extensions of reinforcement learning algorithms to this setting, based on least-squares estimation of the unknown reward, for both the known and unknown transition model cases, and study the performance of these algorithms by analyzing the regret. For cases where the transition model is unknown, we offer a hybrid optimistic-Thompson Sampling approach that results in a computationally efficient algorithm.

[3]: Model-Based Deep Reinforcement Learning for High-Dimensional Problems, a Survey
Authors: Aske Plaat, Walter Kosters, Mike Preuss
Link: https://arxiv.org/abs/2008.05598

Abstract: Deep reinforcement learning has shown remarkable success in the past few years. Highly complex sequential decision making problems have been solved in tasks such as game playing and robotics. Unfortunately, the sample complexity of most deep reinforcement learning methods is high, precluding their use in some important applications. Model-based reinforcement learning creates an explicit model of the environment dynamics to reduce the need for environment samples. Current deep learning methods use high-capacity networks to solve high-dimensional problems. Unfortunately, high-capacity models typically require many samples, negating the potential benefit of lower sample complexity in model-based methods. A challenge for deep model-based methods is therefore to achieve high predictive power while maintaining low sample complexity. In recent years, many model-based methods have been introduced to address this challenge. In this paper, we survey the contemporary model-based landscape. First we discuss definitions and relations to other fields. We propose a taxonomy based on three approaches: using explicit planning on given transitions, using explicit planning on learned transitions, and end-to-end learning of both planning and transitions. We use these approaches to organize a comprehensive overview of important recent developments such as latent models. We describe methods and benchmarks, and we suggest directions for future work for each of the approaches. Among promising research directions are curriculum learning, uncertainty modeling, and use of latent models for transfer learning.

[4]: Overcoming Model Bias for Robust Offline Deep Reinforcement Learning
Authors: Phillip Swazinna, Steffen Udluft, Thomas Runkler
Link: https://arxiv.org/abs/2008.05533

Abstract: State-of-the-art reinforcement learning algorithms mostly rely on being allowed to directly interact with their environment to collect millions of observations. This makes it hard to transfer their success to industrial control problems, where simulations are often very costly or do not exist at all. Furthermore, interacting with (and especially exploring in) the real, physical environment has the potential to lead to catastrophic events. We thus propose a novel model-based RL algorithm, called MOOSE (MOdel-based Offline policy Search with Ensembles) which can train a policy from a pre-existing, fixed dataset. It ensures that dynamics models are able to accurately assess policy performance by constraining the policy to stay within the support of the data. We design MOOSE deliberately similar to state-of-the-art model-free, offline (a.k.a. batch) RL algorithms BEAR and BCQ, with the main difference being that our algorithm is model-based. We compare the algorithms on the Industrial Benchmark and MuJoCo continuous control tasks in terms of robust performance and find that MOOSE almost always outperforms its model-free counterparts by far.

Active Learning (1 paper)

[1]: Iterative Surrogate Model Optimization (ISMO): An active learning algorithm for PDE constrained optimization with deep neural networks
Authors: Kjetil O. Lye, Siddhartha Mishra, Deep Ray, Praveen Chandrasekhar
Link: https://arxiv.org/abs/2008.05730

Abstract: We present a novel active learning algorithm, termed iterative surrogate model optimization (ISMO), for robust and efficient numerical approximation of PDE constrained optimization problems. This algorithm is based on deep neural networks and its key feature is the iterative selection of training data through a feedback loop between deep neural networks and any underlying standard optimization algorithm. Under suitable hypotheses, we show that the resulting optimizers converge exponentially fast (and with exponentially decaying variance), with respect to increasing number of training samples. Numerical examples for optimal control, parameter identification and shape optimization problems for PDEs are provided to validate the proposed theory and to illustrate that ISMO significantly outperforms a standard deep neural network based surrogate optimization algorithm.

Neural Networks (4 papers)

[1]: Deep-Lock: Secure Authorization for Deep Neural Networks
Authors: Manaar Alam, Sayandeep Saha, Debdeep Mukhopadhyay, Sandip Kundu
Link: https://arxiv.org/abs/2008.05966

Abstract: Trained Deep Neural Network (DNN) models are considered valuable Intellectual Properties (IP) in several business models. Prevention of IP theft and unauthorized usage of such DNN models has been raised as a significant concern by industry. In this paper, we address the problem of preventing unauthorized usage of DNN models by proposing a generic and lightweight key-based model-locking scheme, which ensures that a locked model functions correctly only upon applying the correct secret key. The proposed scheme, known as Deep-Lock, utilizes S-Boxes with good security properties to encrypt each parameter of a trained DNN model with secret keys generated from a master key via a key scheduling algorithm. The resulting dense network of encrypted weights is found robust against model fine-tuning attacks. Finally, Deep-Lock does not require any intervention in the structure and training of the DNN models, making it applicable for all existing software and hardware implementations of DNN.

[2]: An Efficient Confidence Measure-Based Evaluation Metric for Breast Cancer Screening Using Bayesian Neural Networks
Authors: Anika Tabassum, Naimul Khan
Comments: To be presented at the IEEE ICHI 2020
Link: https://arxiv.org/abs/2008.05566

Abstract: Screening mammograms is the gold standard for detecting breast cancer early. While a good amount of work has been performed on mammography image classification, especially with deep neural networks, there has not been much exploration into the confidence or uncertainty measurement of the classification. In this paper, we propose a confidence measure-based evaluation metric for breast cancer screening. We propose a modular network architecture, where a traditional neural network is used as a feature extractor with transfer learning, followed by a simple Bayesian neural network. Utilizing a two-stage approach helps reduce the computational complexity, making the proposed framework attractive for wider deployment. We show that by providing the medical practitioners with a tool to tune two hyperparameters of the Bayesian neural network, namely, fraction of sampled number of networks and minimum probability, the framework can be adapted as needed by the domain expert. Finally, we argue that instead of just a single number such as accuracy, a tuple (accuracy, coverage, sampled number of networks, and minimum probability) can be utilized as an evaluation metric of our framework. We provide experimental results on the CBIS-DDSM dataset, where we show the trends in accuracy-coverage tradeoff while tuning the two hyperparameters. We also show that our confidence tuning results in increased accuracy with a reduced set of images with high confidence when compared to the baseline transfer learning. To make the proposed framework readily deployable, we provide (anonymized) source code with reproducible results at this https URL.
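
The accuracy-coverage tradeoff described in this abstract can be sketched in a few lines. The acceptance rule below is one plausible reading of the two hyperparameters (a case is accepted when at least `min_fraction` of the sampled networks back the majority class with probability at least `min_prob`); the rule, the function name, and the toy data are assumptions for illustration and are not taken from the paper's code.

```python
def accuracy_and_coverage(samples, labels, min_fraction, min_prob):
    """Return (accuracy on accepted cases, coverage).

    samples[i] is a list of (predicted_class, probability) pairs, one per
    sampled network, for case i. The acceptance rule is an assumed reading
    of the abstract, used here for illustration only.
    """
    accepted = correct = 0
    for votes, label in zip(samples, labels):
        classes = [c for c, _ in votes]
        majority = max(set(classes), key=classes.count)
        confident = sum(1 for c, p in votes if c == majority and p >= min_prob)
        if confident / len(votes) >= min_fraction:
            accepted += 1
            correct += int(majority == label)
    coverage = accepted / len(labels)
    accuracy = correct / accepted if accepted else float("nan")
    return accuracy, coverage

# Toy data: three cases, three sampled networks each (hypothetical values).
samples = [
    [(1, 0.9), (1, 0.95), (1, 0.8)],
    [(0, 0.6), (1, 0.55), (0, 0.7)],
    [(0, 0.99), (0, 0.9), (0, 0.97)],
]
labels = [1, 1, 0]
print(accuracy_and_coverage(samples, labels, 0.0, 0.0))  # every case accepted
print(accuracy_and_coverage(samples, labels, 1.0, 0.8))  # only confident cases accepted
```

Tightening the two thresholds drops the uncertain second case, so accuracy on the accepted set rises while coverage falls, which is the trend the abstract reports.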

[3]: A statistical theory of cold posteriors in deep neural networks
Authors: Laurence Aitchison
Link: https://arxiv.org/abs/2008.05912

Abstract: To get Bayesian neural networks to perform comparably to standard neural networks it is usually necessary to artificially reduce uncertainty using a "tempered" or "cold" posterior. This is extremely concerning: if the prior is accurate, Bayes inference/decision theory is optimal, and any artificial changes to the posterior should harm performance. While this suggests that the prior may be at fault, here we argue that in fact, BNNs for image classification use the wrong likelihood. In particular, standard image benchmark datasets such as CIFAR-10 are carefully curated. We develop a generative model describing curation which gives a principled Bayesian account of cold posteriors, because the likelihood under this new generative model closely matches the tempered likelihoods used in past work.
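
For readers unfamiliar with the terminology, the "cold" and "tempered" posteriors mentioned in this abstract are commonly written with a temperature $T$. The notation below follows common usage in the Bayesian neural network literature and is an assumption of this note, not copied from the paper.

```latex
p_{\text{cold}}(\theta \mid \mathcal{D}) \;\propto\; \bigl[\, p(\mathcal{D} \mid \theta)\, p(\theta) \,\bigr]^{1/T},
\qquad
p_{\text{tempered}}(\theta \mid \mathcal{D}) \;\propto\; p(\mathcal{D} \mid \theta)^{1/T}\, p(\theta),
\qquad 0 < T < 1 .
```

Setting $T = 1$ recovers the ordinary Bayes posterior, while $T < 1$ sharpens it, which is the artificial reduction of uncertainty the abstract refers to.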

[4]: Interpretable Partial Discharge Detection with Temporal Convolution and Pulse Activation Maps: An application to Power Lines
Authors: Chi-Ching Hsu, Gabriel Michau, Olga Fink
Comments: 12 pages, 3 figures, 2 tables
Link: https://arxiv.org/abs/2008.05838

Abstract: Partial discharge (PD) is a common indication of insulation damages in power systems and cables. These damages can eventually result in costly repairs and substantial power outages. PD detection traditionally relies on hand-crafted features and domain expertise to identify very specific pulses in the electrical current, and the performance declines in the presence of noise or of superposed pulses. In this paper, we propose a novel end-to-end framework based on convolutional neural networks. The framework has two contributions. First, it does not require any feature extraction and enables robust PD detection. Second, we devise the pulse activation map. It provides interpretability of the results for the domain experts with the identification of the pulses that led to the detection of the PDs. The performance is evaluated on a public dataset for the detection of damaged power lines. An ablation study demonstrates the benefits of each part of the proposed framework.

Gradients (1 paper)

[1]: Training Faster with Compressed Gradient
Authors: An Xu, Zhouyuan Huo, Heng Huang
Link: https://arxiv.org/abs/2008.05823

Abstract: Although the distributed machine learning methods show the potential for the speed-up of training large deep neural networks, the communication cost has been the notorious bottleneck to constrain the performance. To address this challenge, the gradient compression based communication-efficient distributed learning methods were designed to reduce the communication cost, and more recently the local error feedback was incorporated to compensate for the performance loss. However, in this paper, we will show the "gradient mismatch" problem of the local error feedback in centralized distributed training and this issue can lead to degraded performance compared with full-precision training. To solve this critical problem, we propose two novel techniques: 1) step ahead; 2) error averaging. Both our theoretical and empirical results show that our new methods can alleviate the "gradient mismatch" problem. Experiments show that we can even train faster with compressed gradient than full-precision training regarding training epochs.
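
The baseline this abstract builds on, gradient compression with local error feedback, can be sketched as follows. Top-k sparsification is an assumed compressor chosen for illustration, and the function names are hypothetical; the paper's own "step ahead" and "error averaging" techniques are not shown.

```python
def top_k_compress(vector, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    keep = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)[:k]
    kept = set(keep)
    return [v if i in kept else 0.0 for i, v in enumerate(vector)]

def compress_with_error_feedback(gradient, error, k):
    """One worker step: add the carried-over error, compress, store the new error.

    Returns (message_to_send, new_error). Top-k is an assumed compressor
    used only for this sketch.
    """
    corrected = [g + e for g, e in zip(gradient, error)]
    message = top_k_compress(corrected, k)
    new_error = [c - m for c, m in zip(corrected, message)]
    return message, new_error

error = [0.0, 0.0, 0.0, 0.0]
message, error = compress_with_error_feedback([0.5, -2.0, 0.25, 1.0], error, k=2)
print(message)  # [0.0, -2.0, 0.0, 1.0]
print(error)    # [0.5, 0.0, 0.25, 0.0]
```

The entries dropped by the compressor are not lost: they stay in `error` and are added back to the next gradient before compression, which is what "compensating for the performance loss" amounts to in this scheme.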

Other (35 papers)

[1]: Variance Regularization for Accelerating Stochastic Optimization
Authors: Tong Yang, Long Sha, Pengyu Hong
Comments: 22 pages, 3 figures
Link: https://arxiv.org/abs/2008.05969

Abstract: While nowadays most gradient-based optimization methods focus on exploring the high-dimensional geometric features, the random error accumulated in a stochastic version of any algorithm implementation has not been stressed yet. In this work, we propose a universal principle which reduces the random error accumulation by exploiting statistical information hidden in mini-batch gradients. This is achieved by regularizing the learning-rate according to mini-batch variances. Due to the complementarity of our perspective, this regularization could provide a further improvement for stochastic implementation of generic first-order approaches. With empirical results, we demonstrate that variance regularization can speed up the convergence as well as stabilize the stochastic optimization.

[2]: Learning Stability Certificates from Data
Authors: Nicholas M. Boffi, Stephen Tu, Nikolai Matni, Jean-Jacques E. Slotine, Vikas Sindhwani
Link: https://arxiv.org/abs/2008.05952

Abstract: Many existing tools in nonlinear control theory for establishing stability or safety of a dynamical system can be distilled to the construction of a certificate function that guarantees a desired property. However, algorithms for synthesizing certificate functions typically require a closed-form analytical expression of the underlying dynamics, which rules out their use on many modern robotic platforms. To circumvent this issue, we develop algorithms for learning certificate functions only from trajectory data. We establish bounds on the generalization error - the probability that a certificate will not certify a new, unseen trajectory - when learning from trajectories, and we convert such generalization error bounds into global stability guarantees. We demonstrate empirically that certificates for complex dynamics can be efficiently learned, and that the learned certificates can be used for downstream tasks such as adaptive control.

[3]: A maximum value for the Kullback-Leibler divergence between quantum discrete distributions
Authors: Vincenzo Bonnici
Link: https://arxiv.org/abs/2008.05932

Abstract: This work presents an upper-bound for the maximum value that the Kullback-Leibler (KL) divergence from a given discrete probability distribution P can reach. In particular, the aim is to find a discrete distribution Q which maximizes the KL divergence from a given P under the assumption that P and Q have been generated by distributing a fixed discretized quantity. In addition, infinite divergences are avoided. The theoretical findings are used for proposing a notion of normalized KL divergence that is empirically shown to behave differently from already known measures.
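
For context, the sketch below computes the standard KL divergence between two discrete distributions and shows where the infinite case mentioned in the abstract comes from. The direction $D(P \,\|\, Q)$ is an assumption of this sketch, and the paper's upper bound and normalized variant are not reproduced.

```python
import math

def kl_divergence(p, q):
    """D(P || Q) = sum_i p_i * log(p_i / q_i), in nats.

    Returns math.inf when Q assigns zero probability to an outcome that P
    considers possible.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total

print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0
print(kl_divergence([0.5, 0.5], [1.0, 0.0]))  # inf
```

Because the plain divergence is unbounded in this way, a maximum only exists under extra constraints such as the fixed discretized quantity the abstract assumes.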

[4]: So You Need Datasets for Your COVID-19 Detection Research Using Machine Learning?
Authors: Md Fahimuzzman Sohan
Comments: 6 pages, 1 figure, 4 tables
Link: https://arxiv.org/abs/2008.05906

Abstract: Millions of people are infected by the coronavirus disease 2019 (COVID-19) around the world. Machine Learning (ML) techniques have been used for COVID-19 detection research since the beginning of the epidemic. This article presents detailed information on frequently used datasets in COVID-19 detection using Machine Learning (ML). We investigated 96 papers on COVID-19 detection published between January 2020 and June 2020. We extracted the information about the datasets used from these articles and present it here together. This investigation will help future researchers find COVID-19 datasets without difficulty.

[5]: Lifelong Property Price Prediction: A Case Study for the Toronto Real Estate Market
Authors: Hao Peng, Jianxin Li, Zheng Wang, Renyu Yang, Mingzhe Liu, Mingming Zhang, Philip S. Yu, Lifang He
Comments: 14 pages, journal
Link: https://arxiv.org/abs/2008.05880

Abstract: We present Luce, the first life-long predictive model for automated property valuation. Luce addresses two critical issues of property valuation: the lack of recent sold prices and the sparsity of house data. It is designed to operate on a limited volume of recent house transaction data. As a departure from prior work, Luce organizes the house data in a heterogeneous information network (HIN) where graph nodes are house entities and attributes that are important for house price valuation. We employ a Graph Convolutional Network (GCN) to extract the spatial information from the HIN for house-related data like geographical locations, and then use a Long Short Term Memory (LSTM) network to model the temporal dependencies for house transaction data over time. Unlike prior work, Luce can make effective use of the limited house transactions data in the past few months to update valuation information for all house entities within the HIN. By providing a complete and up-to-date house valuation dataset, Luce thus massively simplifies the downstream valuation task for the target properties. We demonstrate the benefit of Luce by applying it to large, real-life datasets obtained from the Toronto real estate market. Extensive experimental results show that Luce not only significantly outperforms prior property valuation methods but also often reaches and sometimes exceeds the valuation accuracy given by independent experts when using the actual realization price as the ground truth.

[6]: Single-Photon Image Classification
Authors: Thomas Fischbacher, Luciano Sbaiz
Comments: See ancillary files for training code and pre-trained models
Link: https://arxiv.org/abs/2008.05859

Abstract: Quantum computing-based machine learning mainly focuses on quantum computing hardware that is experimentally challenging to realize due to requiring quantum gates that operate at very low temperature. Instead, we demonstrate the existence of a lower performance and much lower effort island on the accuracy-vs-qubits graph that may well be experimentally accessible with room temperature optics. This high temperature "quantum computing toy model" is nevertheless interesting to study as it allows rather accessible explanations of key concepts in quantum computing, in particular interference, entanglement, and the measurement process.
We specifically study the problem of classifying an example from the MNIST and Fashion-MNIST datasets, subject to the constraint that we have to make a prediction after the detection of the very first photon that passed a coherently illuminated filter showing the example. Whereas a classical set-up in which a photon is detected after falling on one of the $28\times 28$ image pixels is limited to a (maximum likelihood estimation) accuracy of $21.27\%$ for MNIST, respectively $18.27\%$ for Fashion-MNIST, we show that the theoretically achievable accuracy when exploiting inference by optically transforming the quantum state of the photon is at least $41.27\%$ for MNIST, respectively $36.14\%$ for Fashion-MNIST.
We show in detail how to train the corresponding transformation with TensorFlow and also explain how this example can serve as a teaching tool for the measurement process in quantum mechanics.
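
The classical baseline in this abstract, a maximum-likelihood prediction from the single pixel where the photon lands, can be sketched as below. The sketch assumes the photon lands on a pixel with probability proportional to that pixel's intensity and that all examples are equally likely; the function name and the toy 2-pixel images are hypothetical, and this is not the authors' code.

```python
from collections import defaultdict

def single_photon_ml_accuracy(images, labels):
    """Best achievable accuracy when classifying from one detected photon.

    Each image is a flat list of non-negative pixel intensities (not all zero).
    The photon is assumed to land on pixel j with probability
    intensity[j] / sum(intensity), and every example is assumed equally likely.
    The optimal rule predicts, for each pixel, the class with the largest joint
    probability, so the accuracy is sum_j max_c P(class=c, pixel=j).
    """
    joint = defaultdict(float)  # (pixel, class) -> joint probability
    n = len(images)
    for image, label in zip(images, labels):
        total = sum(image)
        for j, intensity in enumerate(image):
            joint[(j, label)] += intensity / total / n
    best = defaultdict(float)  # pixel -> largest joint probability over classes
    for (j, _), prob in joint.items():
        best[j] = max(best[j], prob)
    return sum(best.values())

# Two toy 2-pixel "images": class 0 is bright on the left, class 1 on the right.
print(single_photon_ml_accuracy([[3.0, 1.0], [1.0, 3.0]], [0, 1]))  # 0.75
```

In the toy case the photon reveals the class three times out of four; on real image sets the classes overlap far more per pixel, which is why the classical figures quoted in the abstract are so low.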

[7]: Unifying supervised learning and VAEs -- automating statistical inference in high-energy physics
Authors: Thorsten Glüsenkamp
Link: https://arxiv.org/abs/2008.05825

Abstract: A KL-divergence objective of the joint distribution of data and labels makes it possible to unify supervised learning, VAEs and semi-supervised learning under one umbrella of variational inference. This viewpoint has several advantages. For VAEs, it clarifies the interpretation of encoder and decoder parts. For supervised learning, it re-iterates that the training procedure approximates the true posterior over labels and can always be viewed as approximate likelihood-free inference. This is typically not discussed, even though the derivation is well-known in the literature. In the context of semi-supervised learning it motivates an extended supervised scheme which makes it possible to calculate a goodness-of-fit p-value using posterior predictive simulations. Flow-based networks with a standard normal base distribution are crucial. We discuss how they make it possible to rigorously define coverage for arbitrary joint posteriors on $\mathbb{R}^n \times \mathcal{S}^m$, which encompasses posteriors over directions. Finally, systematic uncertainties are naturally included in the variational viewpoint. With the three ingredients of (1) systematics, (2) coverage and (3) goodness-of-fit, flow-based neural networks have the potential to replace a large part of the statistical toolbox of the contemporary high-energy physicist.

[8]: Small Towers Make Big Differences
Authors: Yuyan Wang, Zhe Zhao, Bo Dai, Christopher Fifty, Dong Lin, Lichan Hong, Ed H. Chi
Link: https://arxiv.org/abs/2008.05808

Abstract: Multi-task learning aims at solving multiple machine learning tasks at the same time. A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal. In this paper, we provide some insights on understanding the trade-off between Pareto efficiency and generalization as a result of parameterization in multi-task deep learning models. As a multi-objective optimization problem, enough parameterization is needed for handling task conflicts in a constrained solution space; however, from a multi-task generalization perspective, over-parameterization undermines the benefit of learning a shared representation which helps harder tasks or tasks with limited training examples. A delicate balance between multi-task generalization and multi-objective optimization is therefore needed for finding a better trade-off between efficiency and generalization. To this end, we propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds. It is task-agnostic and works with other multi-task learning algorithms. Empirical results show that small towers of under-parameterized self-auxiliaries can make big differences in improving Pareto efficiency in various multi-task applications.

[9]: Explaining Naive Bayes and Other Linear Classifiers with Polynomial Time and Delay
Authors: Joao Marques-Silva, Thomas Gerspacher, Martin C. Cooper, Alexey Ignatiev, Nina Narodytska
Link: https://arxiv.org/abs/2008.05803

Abstract: Recent work proposed the computation of so-called PI-explanations of Naive Bayes Classifiers (NBCs). PI-explanations are subset-minimal sets of feature-value pairs that are sufficient for the prediction, and have been computed with state-of-the-art exact algorithms that are worst-case exponential in time and space. In contrast, we show that the computation of one PI-explanation for an NBC can be achieved in log-linear time, and that the same result also applies to the more general class of linear classifiers. Furthermore, we show that the enumeration of PI-explanations can be obtained with polynomial delay. Experimental results demonstrate the performance gains of the new algorithms when compared with earlier work. The experimental results also investigate ways to measure the quality of heuristic explanations.
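
To make the log-linear claim above concrete, the following is a minimal sketch of how one subset-minimal sufficient feature set might be found for a linear classifier by sorting. It is not taken from the paper: the function name `pi_explanation`, the box-shaped feature domains, and the assumption that the prediction is positive when the score exceeds zero are all illustrative choices.

```python
def pi_explanation(weights, bias, instance, domains):
    """Return indices of a subset-minimal set of features that, when fixed to
    their values in `instance`, keep the linear score positive for every
    choice of the remaining features within `domains` (hypothetical helper)."""
    score = bias + sum(w * x for w, x in zip(weights, instance))
    if score <= 0:
        raise ValueError("this sketch assumes a positive prediction (score > 0)")

    # drop[j]: how far the score can fall if feature j is no longer fixed.
    drops = []
    for j, (w, x, (lo, hi)) in enumerate(zip(weights, instance, domains)):
        worst = min(w * lo, w * hi)
        drops.append((w * x - worst, j))

    # Sorting dominates the cost, hence O(n log n).
    drops.sort()

    fixed = set(range(len(weights)))
    budget = score  # worst-case score with the currently freed features
    for drop, j in drops:
        if drop < budget:
            budget -= drop       # freeing j still leaves a positive worst case
            fixed.discard(j)
        else:
            break                # every later feature has an even larger drop
    return sorted(fixed)


# Three binary features; only feature 0 is needed to guarantee the prediction.
print(pi_explanation([3, 1, -2], 0, [1, 1, 0], [(0, 1)] * 3))  # prints [0]
```

Freeing the features with the smallest potential score drop first means that, once one feature cannot be freed, none of the remaining ones can, which is what makes the returned set subset-minimal under these assumptions.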

[10]: Statistical Evaluation of Anomaly Detectors for Sequences
Authors: Erik Scharwächter, Emmanuel Müller
Comments: 5 pages, 6 figures, accepted at the 6th KDD Workshop on Mining and Learning from Time Series (KDD MiLeTS 2020), source code available at this https URL
Link: https://arxiv.org/abs/2008.05788

Abstract: Although precision and recall are standard performance measures for anomaly detection, their statistical properties in sequential detection settings are poorly understood. In this work, we formalize a notion of precision and recall with temporal tolerance for point-based anomaly detection in sequential data. These measures are based on time-tolerant confusion matrices that may be used to compute time-tolerant variants of many other standard measures. However, care has to be taken to preserve interpretability. We perform a statistical simulation study to demonstrate that precision and recall may overestimate the performance of a detector, when computed with temporal tolerance. To alleviate this problem, we show how to obtain null distributions for the two measures to assess the statistical significance of reported results.
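
As a rough illustration of precision and recall with temporal tolerance, the sketch below uses one plausible definition: a predicted time point counts as correct if a true anomaly lies within `tolerance` steps, and a true anomaly counts as detected if some prediction lies within `tolerance` steps. This definition and the function name `tolerant_precision_recall` are assumptions for illustration and may differ from the paper's formalization.

```python
def tolerant_precision_recall(true_times, pred_times, tolerance):
    """Point-based precision and recall where matches may be off by up to
    `tolerance` time steps (assumed definition, not the paper's)."""
    # Predictions that have a true anomaly close enough.
    hits = sum(
        any(abs(p - t) <= tolerance for t in true_times) for p in pred_times
    )
    # True anomalies that have a prediction close enough.
    detected = sum(
        any(abs(p - t) <= tolerance for p in pred_times) for t in true_times
    )
    precision = hits / len(pred_times) if pred_times else 0.0
    recall = detected / len(true_times) if true_times else 0.0
    return precision, recall


true_times = [10, 50]
pred_times = [11, 30, 49, 52]
print(tolerant_precision_recall(true_times, pred_times, tolerance=0))  # (0.0, 0.0)
print(tolerant_precision_recall(true_times, pred_times, tolerance=2))  # (0.75, 1.0)
```

The same detector output moves from zero to high scores once a small tolerance is allowed, which is the kind of sensitivity that motivates comparing reported values against a null distribution.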

[11]: Imitating Unknown Policies via Exploration
Authors: Nathan Gavenski, Juarez Monteiro, Roger Granada, Felipe Meneguzzi, Rodrigo C. Barros
Comments: This paper has been accepted in the British Machine Vision Virtual Conference (BMVC) 2020
Link: https://arxiv.org/abs/2008.05660

Abstract: Behavioral cloning is an imitation learning technique that teaches an agent how to behave through expert demonstrations. Recent approaches use self-supervision of fully-observable unlabeled snapshots of the states to decode state-pairs into actions. However, the iterative learning scheme of these techniques is prone to getting stuck in bad local minima. We address these limitations by incorporating a two-phase model into the original framework, which learns from unlabeled observations via exploration, substantially improving traditional behavioral cloning by exploiting (i) a sampling mechanism to prevent bad local minima, (ii) a sampling mechanism to improve exploration, and (iii) self-attention modules to capture global features. The resulting technique outperforms the previous state-of-the-art in four different environments by a large margin.

[12]: The Slow Deterioration of the Generalization Error of the Random Feature Model
Authors: Chao Ma, Lei Wu, Weinan E
Link: https://arxiv.org/abs/2008.05621

Abstract: The random feature model exhibits a kind of resonance behavior when the number of parameters is close to the training sample size. This behavior is characterized by the appearance of a large generalization gap, and is due to the occurrence of very small eigenvalues for the associated Gram matrix. In this paper, we examine the dynamic behavior of the gradient descent algorithm in this regime. We show, both theoretically and experimentally, that there is a dynamic self-correction mechanism at work: The larger the eventual generalization gap, the slower it develops, both because of the small eigenvalues. This gives us ample time to stop the training process and obtain solutions with good generalization properties.

[13]: Modeling the Field Value Variations and Field Interactions Simultaneously for Fraud Detection
Authors: Dongbo Xi, Bowen Song, Fuzhen Zhuang, Yongchun Zhu, Shuai Chen, Tianyi Zhang, Yuan Qi, Qing He
Comments: 11 pages, 4 figures
Link: https://arxiv.org/abs/2008.05600

Abstract: With the explosive growth of e-commerce, online transaction fraud has become one of the biggest challenges for e-commerce platforms. The historical behaviors of users provide rich information for digging into the users' fraud risk. While considerable efforts have been made in this direction, a long-standing challenge is how to effectively exploit internal user information and provide explainable prediction results. In fact, the value variations of the same field across different events and the interactions of different fields inside one event have proven to be strong indicators for fraudulent behaviors. In this paper, we propose the Dual Importance-aware Factorization Machines (DIFM), which exploits the internal field information among users' behavior sequence from dual perspectives, i.e., field value variations and field interactions simultaneously for fraud detection. The proposed model is deployed in the risk management system of one of the world's largest e-commerce platforms, which utilizes it to provide real-time transaction fraud detection. Experimental results on real industrial data from different regions in the platform clearly demonstrate that our model achieves significant improvements compared with various state-of-the-art baseline models. Moreover, the DIFM could also give an insight into the explanation of the prediction results from dual perspectives.

[14]: Machine Learning for Robust Identification of Complex Nonlinear Dynamical Systems: Applications to Earth Systems Modeling
Authors: Nishant Yadav, Sai Ravela, Auroop R. Ganguly
Comments: 10 pages
Link: https://arxiv.org/abs/2008.05590

Abstract: Systems exhibiting nonlinear dynamics, including but not limited to chaos, are ubiquitous across Earth Sciences such as Meteorology, Hydrology, Climate and Ecology, as well as Biology such as neural and cardiac processes. However, System Identification remains a challenge. In climate and earth systems models, while governing equations follow from first principles and understanding of key processes has steadily improved, the largest uncertainties are often caused by parameterizations such as cloud physics, which in turn have witnessed limited improvements over the last several decades. Climate scientists have pointed to Machine Learning enhanced parameter estimation as a possible solution, with proof-of-concept methodological adaptations being examined on idealized systems. While climate science has been highlighted as a "Big Data" challenge owing to the volume and complexity of archived model-simulations and observations from remote and in-situ sensors, the parameter estimation process is often relatively a "small data" problem. A crucial question for data scientists in this context is the relevance of state-of-the-art data-driven approaches including those based on deep neural networks or kernel-based processes. Here we consider a chaotic system - two-level Lorenz-96 - used as a benchmark model in the climate science literature, adopt a methodology based on Gaussian Processes for parameter estimation and compare the gains in predictive understanding with a suite of Deep Learning and strawman Linear Regression methods. Our results show that adaptations of kernel-based Gaussian Processes can outperform other approaches under small data constraints along with uncertainty quantification, and need to be considered as a viable approach in climate science and earth system modeling.

[15]: Sequential recommendation with metric models based on frequent sequences
Authors: Corentin Lonjarret, Roch Auburtin, Céline Robardet, Marc Plantevit
Comments: 25 pages, 6 figures, submitted to DAMI (under review)
Link: https://arxiv.org/abs/2008.05587

Abstract: Modeling user preferences (long-term history) and user dynamics (short-term history) is of great importance for building efficient sequential recommender systems. The challenge lies in the successful combination of the user's whole history and his recent actions (sequential dynamics) to provide personalized recommendations. Existing methods capture the sequential dynamics of a user using fixed-order Markov chains (usually first order chains) regardless of the user, which limits both the impact of the past of the user on the recommendation and the ability to adapt its length to the user profile. In this article, we propose to use frequent sequences to identify the most relevant part of the user history for the recommendation. The most salient items are then used in a unified metric model that embeds items based on user preferences and sequential dynamics. Extensive experiments demonstrate that our method outperforms the state of the art, especially on sparse datasets. We show that considering sequences of varying lengths improves the recommendations and we also emphasize that these sequences provide explanations on the recommendation.

[16]: Comprehensive forecasting based analysis using stacked stateless and stateful Gated Recurrent Unit models
Authors: Swayamjit Saha, Niladri Majumder, Devansh Sangani
Comments: 12 pages, 2 figures
Link: https://arxiv.org/abs/2008.05575

Abstract: Photovoltaic power is a renewable source of energy that is widely used in industry. In economically struggling countries it can be a potential source of electric energy, as non-renewable resources are already being exhausted. If a photovoltaic cell is installed in a region without prior research, it may not provide the energy output required for that region. Forecasting is therefore required to estimate the output of a particular region from its geometrical coordinates, solar parameters such as GHI, and weather parameters such as temperature and wind speed. Our paper explores forecasting of solar irradiance for four such regions, three of which are in West Bengal and one outside, using stacked Gated Recurrent Unit (GRU) models. We find that the stateful stacked gated recurrent unit model improves the prediction accuracy significantly.

[17]: Model-Based Offline Planning
Authors: Arthur Argenson, Gabriel Dulac-Arnold
Link: https://arxiv.org/abs/2008.05556

Abstract: Offline learning is a key part of making reinforcement learning (RL) usable in real systems. Offline RL looks at scenarios where there is data from a system's operation, but no direct access to the system when learning a policy. Recent work on training RL policies from offline data has shown results both with model-free policies learned directly from the data, and with planning on top of learnt models of the data. Model-free policies tend to be more performant, but are more opaque, harder to command externally, and less easy to integrate into larger systems. We propose an offline learner that generates a model that can be used to control the system directly through planning. This allows us to have easily controllable policies directly from data, without ever interacting with the system. We show the performance of our algorithm, Model-Based Offline Planning (MBOP), on a series of robotics-inspired tasks, and demonstrate its ability to leverage planning to respect environmental constraints. We are able to find near-optimal policies for certain simulated systems from as little as 50 seconds of real-time system interaction, and create zero-shot goal-conditioned policies on a series of environments.

[18]: Non-Stochastic Control with Bandit Feedback
Authors: Paula Gradu, John Hallman, Elad Hazan
Link: https://arxiv.org/abs/2008.05523

Abstract: We study the problem of controlling a linear dynamical system with adversarial perturbations where the only feedback available to the controller is the scalar loss, and the loss function itself is unknown. For this problem, with either a known or unknown system, we give an efficient sublinear regret algorithm. The main algorithmic difficulty is the dependence of the loss on past controls. To overcome this issue, we propose an efficient algorithm for the general setting of bandit convex optimization for loss functions with memory, which may be of independent interest.

[19]: Textual Echo Cancellation
Authors: Shaojin Ding, Ye Jia, Ke Hu, Quan Wang
Link: https://arxiv.org/abs/2008.06006

Abstract: In this paper, we propose Textual Echo Cancellation (TEC) - a framework for cancelling the text-to-speech (TTS) playback echo from overlapped speech recordings. Such a system can largely improve speech recognition performance and user experience for intelligent devices such as smart speakers, as the user can talk to the device while the device is still playing the TTS signal responding to the previous query. We implement this system by using a novel sequence-to-sequence model with multi-source attention that takes both the microphone mixture signal and the source text of the TTS playback as inputs, and predicts the enhanced audio. Experiments show that the textual information of the TTS playback is critical to the enhancement performance. Besides, the text sequence is much smaller in size compared with the raw acoustic signal of the TTS playback, and can be immediately transmitted to the device and the ASR server even before the playback is synthesized. Therefore, our proposed approach effectively reduces Internet communication and latency compared with alternative approaches such as acoustic echo cancellation (AEC).

[20]: A community-powered search of machine learning strategy space to find NMR property prediction models
Authors: Lars A. Bratholm, Will Gerrard, Brandon Anderson, Shaojie Bai, Sunghwan Choi, Lam Dang, Pavel Hanchar, Addison Howard, Guillaume Huard, Sanghoon Kim, Zico Kolter, Risi Kondor, Mordechai Kornbluth, Youhan Lee, Youngsoo Lee, Jonathan P. Mailoa, Thanh Tu Nguyen, Milos Popovic, Goran Rakocevic, Walter Reade, Wonho Song, Luka Stojanovic, Erik H. Thiede, Nebojsa Tijanic, Andres Torrubia, Devin Willmott, Craig P. Butts, David R. Glowacki, Kaggle participants
Link: https://arxiv.org/abs/2008.05994

Abstract: The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published "in-house" efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.

[21]: A Novel CMAQ-CNN Hybrid Model to Forecast Hourly Surface-Ozone Concentrations Fourteen Days in Advance
Authors: Alqamah Sayeed, Yunsoo Choi, Ebrahim Eslami, Jia Jung, Yannic Lops, Ahmed Khan Salman
Comments: 15 pages, 4 main figures and supplementary figures and tables
Link: https://arxiv.org/abs/2008.05987

Abstract: Issues regarding air quality and related health concerns have prompted this study, which develops an accurate and computationally fast, efficient hybrid modeling system that combines numerical modeling and machine learning for forecasting concentrations of surface ozone. Currently available numerical modeling systems for air quality predictions (e.g., CMAQ, NCEP EMP) can forecast 24 to 48 hours in advance. In this study, we develop a modeling system based on a convolutional neural network (CNN) model that is not only fast but covers a temporal period of two weeks with a resolution as small as a single hour for 255 stations. The CNN model uses forecasted meteorology from the Weather Research and Forecasting model (processed by the Meteorology-Chemistry Interface Processor), forecasted air quality from the Community Multi-scale Air Quality Model (CMAQ), and previous 24-hour concentrations of various measurable air quality parameters as inputs and predicts the following 14-day hourly surface ozone concentrations. The model achieves an average accuracy of 0.91 in terms of the index of agreement for the first day and 0.78 for the fourteenth day, while the average index of agreement for one day ahead prediction from the CMAQ is 0.77. Through this study, we intend to amalgamate the best features of numerical modeling (i.e., fine spatial resolution) and a deep neural network (i.e., computation speed and accuracy) to achieve more accurate spatio-temporal predictions of hourly ozone concentrations. Although the primary purpose of this study is the prediction of hourly ozone concentrations, the system can be extended to various other pollutants.

[22]: Meta Learning MPC using Finite-Dimensional Gaussian Process Approximations
Authors: Elena Arcari, Andrea Carron, Melanie N. Zeilinger
Link: https://arxiv.org/abs/2008.05984

Abstract: Data availability has dramatically increased in recent years, driving model-based control methods to exploit learning techniques for improving the system description, and thus control performance. Two key factors that hinder the practical applicability of learning methods in control are their high computational complexity and limited generalization capabilities to unseen conditions. Meta-learning is a powerful tool that enables efficient learning across a finite set of related tasks, easing adaptation to new unseen tasks. This paper makes use of a meta-learning approach for adaptive model predictive control, by learning a system model that leverages data from previous related tasks, while enabling fast fine-tuning to the current task during closed-loop operation. The dynamics is modeled via Gaussian process regression and, building on the Karhunen-Loève expansion, can be approximately reformulated as a finite linear combination of kernel eigenfunctions. Using data collected over a set of tasks, the eigenfunction hyperparameters are optimized in a meta-training phase by maximizing a variational bound for the log-marginal likelihood. During meta-testing, the eigenfunctions are fixed, so that only the linear parameters are adapted to the new unseen task in an online adaptive fashion via Bayesian linear regression, providing a simple and efficient inference scheme. Simulation results are provided for autonomous racing with miniature race cars adapting to unseen road conditions.

[23]: Creativity in the era of artificial intelligence
Authors: Philippe Esling, Ninon Devis
Comments: Keynote paper - JIM Conference 2020 - 12 pages
Link: https://arxiv.org/abs/2008.05959

Abstract: Creativity is a deeply debated topic, as this concept is arguably quintessential to our humanity. Across different epochs, it has been infused with an extensive variety of meanings relevant to that era. Along with these, the evolution of technology has provided a plurality of novel tools for creative purposes. Recently, the advent of Artificial Intelligence (AI), through deep learning approaches, has seen notable successes across various applications. The use of such technologies for creativity appears in a natural continuity with the artistic trend of this century. However, the aura of a technological artefact labeled as intelligent has unleashed passionate and somewhat unhinged debates on its implication for creative endeavors. In this paper, we aim to provide a new perspective on the question of creativity in the era of AI, by blurring the frontier between social and computational sciences. To do so, we rely on reflections from social science studies of creativity to view how current AI would be considered through this lens. As creativity is a highly context-prone concept, we underline the limits and deficiencies of current AI, which require a move towards artificial creativity. We argue that the objective of trying to purely mimic human creative traits towards a self-contained ex-nihilo generative machine would be highly counterproductive, putting us at risk of not harnessing the almost unlimited possibilities offered by the sheer computational power of artificial agents.

[24]: An Exploratory Study of COVID-19 Information on Twitter in the Greater Region
Authors: Ninghan Chen, Zhiqiang Zhong, Jun Pang
Link: https://arxiv.org/abs/2008.05900

Abstract: The outbreak of the Coronavirus disease (COVID-19) leads to an outbreak of pandemic information in major online social networks (OSNs). In the constantly changing situation, OSNs are becoming a critical conduit for people to express opinions and seek up-to-the-minute information. Thus, social behaviour on OSNs may become a predictor or reflection of reality. This paper aims to study the social behaviour of the public in the Greater Region (GR) and related countries based on Twitter information with machine learning and representation learning methods. We find that tweets volume only can be a predictor of outbreaks in a particular period of the pandemic. Moreover, we map out the evolution of public behaviour in each country from 2020/01/22 to 2020/06/05, figuring out the main differences in public behaviour between GR and related countries. Finally, we conclude that tweets volume of anti-contiguous measures may affect the effectiveness of the government policy.

[25]: Revealing the Hidden Patterns: A Comparative Study on Profiling Subpopulations of MOOC Students
Authors: Lei Shi, Alexandra I. Cristea, Armando M. Toda, Wilk Oliveira
Comments: Information Systems Development: Information Systems Beyond 2020 (ISD2019)
Link: https://arxiv.org/abs/2008.05850

Abstract: Massive Open Online Courses (MOOCs) exhibit a remarkable heterogeneity of students. The advent of complex "big data" from MOOC platforms is a challenging yet rewarding opportunity to deeply understand how students are engaged in MOOCs. Past research, looking mainly into overall behavior, may have missed patterns related to student diversity. Using a large dataset from a MOOC offered by FutureLearn, we delve into a new way of investigating hidden patterns through both machine learning and statistical modelling. In this paper, we report on clustering analysis of student activities and comparative analysis on both behavioral patterns and demographical patterns between student subpopulations in the MOOC. Our approach allows for a deeper understanding of how MOOC students behave and achieve. Our findings may be used to design adaptive strategies towards an enhanced MOOC experience.

[26]: Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week's Activities
Authors: Ahmed Alamri, Mohammad Alshehri, Alexandra I. Cristea, Filipe D. Pereira, Elaine Oliveira, Lei Shi, Craig Stewart
Comments: Intelligent Tutoring Systems. ITS 2019. Lecture Notes in Computer Science, vol 11528. Springer, Cham
Link: https://arxiv.org/abs/2008.05849

Abstract: While Massive Open Online Course (MOOC) platforms provide knowledge in a new and unique way, the very high number of dropouts is a significant drawback. Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout. The jury is still out on which factors are the most appropriate predictors. However, the literature agrees that early prediction is vital to allow for a timely intervention. Whilst feature-rich predictors may have the best chance for high accuracy, they may be unwieldy. This study aims to predict learner dropout early on, from the first week, by comparing several machine-learning approaches, including Random Forest, Adaptive Boost, XGBoost and GradientBoost Classifiers. The results show promising accuracies (82%-94%) using as few as 2 features. We show that the accuracies obtained outperform state-of-the-art approaches, even when the latter deploy several features.

[27]: Conservative Stochastic Optimization with Expectation Constraints
Authors: Zeeshan Akhtar, Amrit Singh Bedi, Ketan Rajawat
Link: https://arxiv.org/abs/2008.05758

Abstract: This paper considers stochastic convex optimization problems where the objective and constraint functions involve expectations with respect to the data indices or environmental variables, in addition to deterministic convex constraints on the domain of the variables. Although the setting is generic and arises in different machine learning applications, online and efficient approaches for solving such problems have not been widely studied. Since the underlying data distribution is unknown a priori, a closed-form solution is generally not available, and classical deterministic optimization paradigms are not applicable. State-of-the-art approaches, such as those using the saddle point framework, can ensure that the optimality gap as well as the constraint violation decay as $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$ where $T$ is the number of stochastic gradients. The domain constraints are assumed simple and handled via projection at every iteration. In this work, we propose a novel conservative stochastic optimization algorithm (CSOA) that achieves zero constraint violation and $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$ optimality gap.
Further, the projection operation (for scenarios when calculating projection is expensive) in the proposed algorithm can be avoided by considering the conditional gradient or Frank-Wolfe (FW) variant of the algorithm. The state-of-the-art stochastic FW variants achieve an optimality gap of $\mathcal{O}\left(T^{-\frac{1}{3}}\right)$ after $T$ iterations, though these algorithms have not been applied to problems with functional expectation constraints. In this work, we propose the FW-CSOA algorithm that is not only projection-free but also achieves zero constraint violation with $\mathcal{O}\left(T^{-\frac{1}{4}}\right)$ decay of the optimality gap. The efficacy of the proposed algorithms is tested on two relevant problems: fair classification and structured matrix completion.
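
To show what "projection-free" means for a conditional gradient method, here is a minimal sketch of the classical deterministic Frank-Wolfe iteration over the probability simplex. It is not CSOA or FW-CSOA and handles no expectation constraints; the function name `frank_wolfe_simplex`, the step size $2/(k+2)$, and the quadratic test objective are illustrative assumptions.

```python
def frank_wolfe_simplex(grad, dim, steps=2000):
    """Minimize a smooth convex function over the probability simplex using
    plain Frank-Wolfe; `grad(x)` returns the gradient as a list."""
    x = [1.0 / dim] * dim  # start at the centre of the simplex
    for k in range(steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex with the smallest gradient entry.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (k + 2)
        # Convex combination of x and the vertex e_i; no projection needed.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Example objective: f(x) = 0.5 * ||x - target||^2 with target in the simplex.
target = [0.2, 0.5, 0.3]
solution = frank_wolfe_simplex(lambda x: [a - b for a, b in zip(x, target)], 3)
print(solution)
```

Each iterate is a convex combination of simplex vertices, so feasibility is maintained by construction, at the price of a slower rate than projected methods.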

[28]: Metrics for Multi-Class Classification: an Overview
Authors: Margherita Grandini, Enrico Bagli, Giorgio Visani
Link: https://arxiv.org/abs/2008.05756

Abstract: Classification tasks in machine learning involving more than two classes are known by the name of "multi-class classification". Performance indicators are very useful when the aim is to evaluate and compare different classification models or machine learning techniques. Many metrics come in handy to test the ability of a multi-class classifier. Those metrics turn out to be useful at different stages of the development process, e.g. comparing the performance of two different models or analysing the behaviour of the same model by tuning different parameters. In this white paper we review a list of the most promising multi-class metrics, we highlight their advantages and disadvantages and show their possible usages during the development of a classification model.
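
For readers who want a concrete reference point, the sketch below computes a few standard multi-class metrics (accuracy and macro-averaged precision, recall and F1) from a confusion matrix. The selection of metrics and the function name `multiclass_metrics` are illustrative assumptions and are not claimed to match the list reviewed in the paper.

```python
def multiclass_metrics(cm):
    """Compute accuracy and macro-averaged precision/recall/F1 from a
    confusion matrix where cm[i][j] counts samples of true class i that
    were predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(k))

    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = cm[c][c]
        predicted = sum(cm[i][c] for i in range(k))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "macro_precision": sum(precisions) / k,
        "macro_recall": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
    }


confusion = [
    [5, 0, 0],
    [0, 3, 2],
    [0, 1, 4],
]
print(multiclass_metrics(confusion))
```

Macro averaging weights every class equally regardless of its size, which is one reason it can diverge from plain accuracy on imbalanced data.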

[29]: LAC: LSTM AUTOENCODER with Community for Insider Threat Detection
Authors: Sudipta Paul, Subhankar Mishra
Comments: 10 pages, 8 figures, 5 tables, Accepted to the 3rd ICIST 2020, Tokyo, Japan
Link: https://arxiv.org/abs/2008.05646

Abstract: The employees of any organization, institute, or industry spend a significant amount of time on a computer network, where they develop their own routine of activities in the form of network transactions over a time period. Insider threat detection involves identifying deviations in the routines or anomalies which may cause harm to the organization in the form of data leaks and secrets sharing. If not automated, this process involves feature engineering for modeling human behavior, which is a tedious and time-consuming task. Anomalies in human behavior are forwarded to a human analyst for final threat classification. We developed an unsupervised deep neural network model using LSTM AUTOENCODER which learns to mimic the behavior of individual employees from their day-wise time-stamped sequence of activities. It predicts the threat scenario via significant loss from anomalous routine. Employees in a community tend to align their routine with each other rather than with the employees outside their communities; this motivates us to explore a variation of the AUTOENCODER, the LSTM AUTOENCODER trained on the interleaved sequences of activities in the Community (LAC). We evaluate the model on the CERT v6.2 dataset and perform analysis on the loss for normal and anomalous routine across 4000 employees. The aim of our paper is to detect the anomalous employees as well as to explore how the surrounding employees are affecting that employee's routine over time.

[30]: A Deep Learning Approach for COVID-19 Trend Prediction
Authors: Tong Yang, Long Sha, Justin Li, Pengyu Hong
Comments: 7 pages, 11 figures, accepted by KDD 2020 epiDAMIK workshop
Link: https://arxiv.org/abs/2008.05644

Abstract: In this work, we developed a deep learning model-based approach to forecast the spreading trend of SARS-CoV-2 in the United States. We implemented the designed model using United States confirmed-case data and state demographic data and achieved promising trend prediction results. The model incorporates demographic information and epidemic time-series data through a Gated Recurrent Unit structure. The identification of dominating demographic factors is delivered in the end.

[31]: A clarification of misconceptions, myths and desired status of artificial intelligence
Authors: Frank Emmert-Streib, Olli Yli-Harja, Matthias Dehmer
Link: https://arxiv.org/abs/2008.05607

Abstract: The field of artificial intelligence (AI) was founded over 65 years ago. Starting with great hopes and ambitious goals, the field progressed through various stages of popularity and recently received a revival in the form of deep neural networks. One problem of AI is that so far neither 'intelligence' nor the goals of AI are formally defined, causing confusion when comparing AI to other fields. In this paper, we present a perspective on the desired and current status of AI in relation to machine learning and statistics and clarify common misconceptions and myths. Our discussion is intended to lift the veil of vagueness surrounding AI to see its true countenance.

[32]：On the complexity of finding a local minimizer of a quadratic function over a polytope
Authors: Amir Ali Ahmadi, Jeffrey Zhang
Comments: 9 pages
Link: https://arxiv.org/abs/2008.05558

Abstract: We show that unless P=NP, there cannot be a polynomial-time algorithm that finds a point within Euclidean distance $c^n$ (for any constant $c \ge 0$) of a local minimizer of an $n$-variate quadratic function over a polytope. This result (even with $c=0$) answers a question of Pardalos and Vavasis that appeared in 1992 on a list of seven open problems in complexity theory for numerical optimization. Our proof technique also implies that the problem of deciding whether a quadratic function has a local minimizer over an (unbounded) polyhedron, and that of deciding if a quartic polynomial has a local minimizer, are NP-hard.

[33]：Reparametrization Invariance in non-parametric Causal Discovery
Authors: Martin Jørgensen, Søren Hauberg
Link: https://arxiv.org/abs/2008.05552

Abstract: Causal discovery estimates the underlying physical process that generates the observed data: does X cause Y or does Y cause X? Current methodologies use structural conditions to turn the causal query into a statistical query when only observational data is available. But what if these statistical queries are sensitive to causal invariants? This study investigates one such invariant: the causal relationship between X and Y is invariant to the marginal distributions of X and Y. We propose an algorithm that uses a non-parametric estimator that is robust to changes in the marginal distributions. This way we may marginalize the marginals and inspect what relationship is intrinsically there. The resulting causal estimator is competitive with current methodologies and places strong emphasis on the uncertainty in the causal query, an aspect just as important as the query itself.

[34]：Convergence of Deep Fictitious Play for Stochastic Differential Games
Authors: Jiequn Han, Ruimeng Hu, Jihao Long
Link: https://arxiv.org/abs/2008.05519

Abstract: Stochastic differential games have been used extensively to model agents' competitions in finance, for instance, in P2P lending platforms from the Fintech industry, the banking system for systemic risk, and insurance markets. The recently proposed machine learning algorithm, deep fictitious play, provides a novel and efficient tool for finding the Markovian Nash equilibrium of large $N$-player asymmetric stochastic differential games [J. Han and R. Hu, Mathematical and Scientific Machine Learning Conference, 2020]. By incorporating the idea of fictitious play, the algorithm decouples the game into $N$ sub-optimization problems, and identifies each player's optimal strategy with the deep backward stochastic differential equation (BSDE) method, in parallel and repeatedly. In this paper, under appropriate conditions, we prove the convergence of deep fictitious play (DFP) to the true Nash equilibrium. We can also show that the strategy based on DFP forms an $\epsilon$-Nash equilibrium. We generalize the algorithm by proposing a new approach to decouple the games, and present numerical results of large population games showing the empirical convergence of the algorithm beyond the technical assumptions in the theorems.

[35]：Synergy between Machine/Deep Learning and Software Engineering: How Far Are We?
Authors: Simin Wang, Liguo Huang, Jidong Ge, Tengfei Zhang, Haitao Feng, Ming Li, He Zhang, Vincent Ng
Link: https://arxiv.org/abs/2008.05515

Abstract: Since 2009, the deep learning revolution, which was triggered by the introduction of ImageNet, has stimulated the synergy between Machine Learning (ML)/Deep Learning (DL) and Software Engineering (SE). Meanwhile, critical reviews have emerged that suggest that ML/DL should be used cautiously. To improve the quality (especially the applicability and generalizability) of ML/DL-related SE studies, and to stimulate and enhance future collaborations between SE/AI researchers and industry practitioners, we conducted a 10-year Systematic Literature Review (SLR) on 906 ML/DL-related SE papers published between 2009 and 2018. Our trend analysis demonstrated the mutual impacts that ML/DL and SE have had on each other. At the same time, however, we also observed a paucity of replicable and reproducible ML/DL-related SE studies and identified five factors that influence their replicability and reproducibility. To improve the applicability and generalizability of research results, we analyzed what ingredients in a study would facilitate an understanding of why an ML/DL technique was selected for a specific SE problem. In addition, we identified the unique trends of impacts of DL models on SE tasks, as well as five unique challenges that need to be met in order to better leverage DL to improve the productivity of SE tasks. Finally, we outlined a road-map that we believe can facilitate the transfer of ML/DL-based SE research results into real-world industry practices.

Cross-listed with CV (18 papers)

[1]：Weight Equalizing Shift Scaler-Coupled Post-training Quantization
Authors: Jihun Oh, SangJeong Lee, Meejeong Park, Pooni Walagaurav, Kiseok Kwon
Comments: 9 pages, 4 figures, 4 tables
Link: https://arxiv.org/abs/2008.05767

Abstract: Post-training, layer-wise quantization is preferable because it is free from retraining and is hardware-friendly. Nevertheless, accuracy degrades when a neural network model has large differences in per-output-channel weight ranges. In particular, the MobileNet family suffers a catastrophic drop in top-1 accuracy from 70.60% ~ 71.87% to 0.1% on the ImageNet dataset after 8-bit weight quantization. To mitigate this significant accuracy reduction, we propose a new weight equalizing shift scaler, i.e. rescaling the weight range per channel by a 4-bit binary shift, prior to a layer-wise quantization. To recover the original output range, inverse binary shifting is efficiently fused to the existing per-layer scale compounding in the fixed-computing convolutional operator of the custom neural processing unit. The binary shift is a key feature of our algorithm, which significantly improved the accuracy performance without impeding the memory footprint. As a result, our proposed method achieved a top-1 accuracy of 69.78% ~ 70.96% in MobileNets and showed robust performance in varying network models and tasks, which is competitive with channel-wise quantization results.
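
The idea in the abstract above (a per-channel power-of-two rescaling applied before a single per-layer quantization scale, then undone afterwards) can be illustrated with a small sketch. This is not the authors' implementation: the rule used to pick each shift, the symmetric rounding scheme, and all function names are assumptions made for illustration only.

```python
import math


def quantize_layerwise(weights, bits=8):
    """Symmetric quantization using ONE scale for the whole layer."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [[max(-qmax, min(qmax, round(w / scale))) for w in row] for row in weights]
    return q, scale


def channel_shifts(weights, shift_bits=4):
    """Assumed rule: largest left shift that keeps a channel within the layer range."""
    max_shift = 2 ** shift_bits - 1  # a 4-bit shift can encode 0..15
    layer_max = max(abs(w) for row in weights for w in row)
    shifts = []
    for row in weights:
        ch_max = max(abs(w) for w in row)
        if ch_max == 0 or layer_max == 0:
            shifts.append(0)
            continue
        s = int(math.floor(math.log2(layer_max / ch_max)))
        shifts.append(max(0, min(max_shift, s)))
    return shifts


def quantize_with_shift(weights, bits=8, shift_bits=4):
    """Scale each output channel by 2**shift, then quantize layer-wise."""
    shifts = channel_shifts(weights, shift_bits)
    scaled = [[w * (2 ** s) for w in row] for row, s in zip(weights, shifts)]
    q, scale = quantize_layerwise(scaled, bits)
    return q, scale, shifts


def dequantize(q, scale, shifts=None):
    """Map integers back to real values, undoing the per-channel shift."""
    shifts = shifts or [0] * len(q)
    return [[v * scale / (2 ** s) for v in row] for row, s in zip(q, shifts)]


def max_abs_error(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    # Two output channels with very different weight ranges (made-up values).
    w = [[1.0, -0.5, 0.25], [0.004, -0.002, 0.001]]
    q_plain, s_plain = quantize_layerwise(w)
    q_shift, s_shift, shifts = quantize_with_shift(w)
    print("shifts:", shifts)
    print("small-channel error, plain:", max_abs_error(w[1:], dequantize(q_plain, s_plain)[1:]))
    print("small-channel error, shifted:", max_abs_error(w[1:], dequantize(q_shift, s_shift, shifts)[1:]))
```

In this toy layer the small-range channel is almost entirely rounded to zero under plain layer-wise quantization, while the shifted variant preserves it; because the shift is a power of two, undoing it on hardware would be a bit shift rather than a multiplication.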

[2]：Towards Visually Explaining Similarity Models
Authors: Meng Zheng, Srikrishna Karanam, Terrence Chen, Richard J. Radke, Ziyan Wu
Comments: 14 pages, 6 figures, 3 tables
Link: https://arxiv.org/abs/2008.06035

Abstract: We consider the problem of visually explaining similarity models, i.e., explaining why a model predicts two images to be similar in addition to producing a scalar score. While much recent work in visual model interpretability has focused on gradient-based attention, these methods rely on a classification module to generate visual explanations. Consequently, they cannot readily explain other kinds of models that do not use or need classification-like loss functions (e.g., similarity models trained with a metric learning loss). In this work, we bridge this crucial gap, presenting the first method to generate gradient-based visual explanations for image similarity predictors. By relying solely on the learned feature embedding, we show that our approach can be applied to any kind of CNN-based similarity architecture, an important step towards generic visual explainability. We show that our resulting visual explanations serve more than just interpretability; they can be infused into the model learning process itself with new trainable constraints based on our similarity explanations. We show that the resulting similarity models perform, and can be visually explained, better than the corresponding baseline models trained without our explanation constraints. We demonstrate our approach using extensive experiments on three different kinds of tasks: generic image retrieval, person re-identification, and low-shot semantic segmentation.

[3]：Multi-Mask Self-Supervised Learning for Physics-Guided Neural Networks in Highly Accelerated MRI
Authors: Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Steen Moeller, Jutta Ellermann, Kâmil Uğurbil, Mehmet Akçakaya
Link: https://arxiv.org/abs/2008.06029

Abstract: Purpose: To develop an improved self-supervised learning strategy that efficiently uses the acquired data for training a physics-guided reconstruction network without a database of fully-sampled data.
Methods: Current self-supervised learning for physics-guided reconstruction networks splits acquired undersampled data into two disjoint sets, where one is used for data consistency (DC) in the unrolled network and the other to define the training loss. The proposed multi-mask self-supervised learning via data undersampling (SSDU) splits acquired measurements into multiple pairs of disjoint sets for each training sample, while using one of these sets for DC units and the other for defining loss, thereby more efficiently using the undersampled data. Multi-mask SSDU is applied on fully-sampled 3D knee and prospectively undersampled 3D brain MRI datasets, which are retrospectively subsampled to acceleration rate (R)=8, and compared to CG-SENSE and single-mask SSDU DL-MRI, as well as supervised DL-MRI when fully-sampled data is available.
Results: Results on knee MRI show that the proposed multi-mask SSDU outperforms SSDU and performs closely with supervised DL-MRI, while significantly outperforming CG-SENSE. A clinical reader study further ranks the multi-mask SSDU higher than supervised DL-MRI in terms of SNR and aliasing artifacts. Results on brain MRI show that multi-mask SSDU achieves better reconstruction quality compared to SSDU and CG-SENSE. The reader study demonstrates that multi-mask SSDU at R=8 significantly improves reconstruction compared to single-mask SSDU at R=8, as well as CG-SENSE at R=2.
Conclusion: The proposed multi-mask SSDU approach enables improved training of physics-guided neural networks without fully-sampled data, by enabling efficient use of the undersampled data with multiple masks.
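
The splitting step described in the Methods paragraph above can be sketched as follows. This only illustrates partitioning one sample's acquired measurement indices into several (data-consistency, loss) pairs; the uniform random selection, the `loss_fraction` value, and the function name are hypothetical choices, not details taken from the paper.

```python
import random


def multi_mask_split(sampled, num_masks=3, loss_fraction=0.4, seed=0):
    """Return num_masks pairs (dc_set, loss_set) of disjoint index sets.

    Every pair partitions the same acquired indices, so each measurement
    is used either for data consistency or for the loss in a given pair.
    """
    rng = random.Random(seed)
    sampled = sorted(set(sampled))
    if len(sampled) < 2:
        raise ValueError("need at least two acquired indices to split")
    # Keep at least one index on each side of the split.
    n_loss = min(len(sampled) - 1, max(1, round(loss_fraction * len(sampled))))
    pairs = []
    for _ in range(num_masks):
        loss_set = set(rng.sample(sampled, n_loss))  # assumed: uniform random choice
        dc_set = set(sampled) - loss_set
        pairs.append((dc_set, loss_set))
    return pairs


if __name__ == "__main__":
    # Every 8th line of a 256-line acquisition, mimicking R=8 undersampling.
    acquired = list(range(0, 256, 8))
    for dc, loss in multi_mask_split(acquired):
        print(len(dc), "DC indices,", len(loss), "loss indices")
```

With a single mask each measurement plays only one role per training sample; with several masks the same measurement can serve data consistency in one pair and the loss in another, which is the more efficient use of undersampled data that the abstract refers to.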

[4]：BioMetricNet: deep unconstrained face verification through learning of metrics regularized onto Gaussian distributions
Authors: Arslan Ali, Matteo Testa, Tiziano Bianchi, Enrico Magli
Comments: Accepted at ECCV20
Link: https://arxiv.org/abs/2008.06021

Abstract: We present BioMetricNet: a novel framework for deep unconstrained face verification which learns a regularized metric to compare facial features. Unlike popular methods such as FaceNet, the proposed approach does not impose any specific metric on facial features; instead, it shapes the decision space by learning a latent representation in which matching and non-matching pairs are mapped onto clearly separated and well-behaved target distributions. In particular, the network jointly learns the best feature representation, and the best metric that follows the target distributions, to be used to discriminate face images. In this paper we present this general framework, the first of its kind for facial verification, and tailor it to Gaussian distributions. This choice enables the use of a simple linear decision boundary that can be tuned to achieve the desired trade-off between false alarm and genuine acceptance rate, and leads to a loss function that can be written in closed form. Extensive analysis and experimentation on publicly available datasets such as Labeled Faces in the Wild (LFW), YouTube Faces (YTF), Celebrities in Frontal-Profile in the Wild (CFP), and challenging datasets like cross-age LFW (CALFW), cross-pose LFW (CPLFW), and In-the-wild Age Dataset (AgeDB) show a significant performance improvement and confirm the effectiveness and superiority of BioMetricNet over existing state-of-the-art methods.

[5]：Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction
Authors: Kelvin Wong, Qiang Zhang, Ming Liang, Bin Yang, Renjie Liao, Abbas Sadat, Raquel Urtasun
Comments: ECCV 2020
Link: https://arxiv.org/abs/2008.06020

Abstract: We present a novel method for testing the safety of self-driving vehicles in simulation. We propose an alternative to sensor simulation, as sensor simulation is expensive and has large domain gaps. Instead, we directly simulate the outputs of the self-driving vehicle's perception and prediction system, enabling realistic motion planning testing. Specifically, we use paired data in the form of ground truth labels and real perception and prediction outputs to train a model that predicts what the online system will produce. Importantly, the inputs to our system consist of high definition maps, bounding boxes, and trajectories, which can be easily sketched by a test engineer in a matter of minutes. This makes our approach a much more scalable solution. Quantitative results on two large-scale datasets demonstrate that we can realistically test motion planning using our simulations.

[6]：SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition with Distractors
Authors: Mona Jalal, Josef Spjut, Ben Boudaoud, Margrit Betke
Comments: 3 pages, 4 figures, 1 table, Accepted at CVPR 2019 Workshop
Link: https://arxiv.org/abs/2008.05955

Abstract: We present a new, publicly-available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (chosen randomly from the 21 object models of the YCB dataset [1]) and flying distractors. Object and camera pose, scene lighting, and quantity of objects and distractors were randomized. Each provided view includes RGB, depth, segmentation, and surface normal images, all at pixel level. We describe our approach for domain randomization and provide insight into the decisions that produced the dataset.

[7]：On failures of RGB cameras and their effects in autonomous driving applications
Authors: Francesco Secci, Andrea Ceccarelli
Comments: preprint - accepted to the 31st International Symposium on Software Reliability Engineering (ISSRE 2020)
Link: https://arxiv.org/abs/2008.05938

Abstract: RGB cameras are arguably one of the most relevant sensors for autonomous driving applications. It is undeniable that failures of vehicle cameras may compromise the autonomous driving task, possibly leading to unsafe behaviors when images that are subsequently processed by the driving system are altered. To support the definition of safe and robust vehicle architectures and intelligent systems, in this paper we define the failure model of a vehicle camera, together with an analysis of effects and known mitigations. Further, we build a software library for the generation of the corresponding failed images and we feed them to the trained agent of an autonomous driving simulator: the misbehavior of the trained agent allows a better understanding of failure effects and especially of the resulting safety risk.

[8]：Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations
Authors: Abbas Sadat, Sergio Casas, Mengye Ren, Xinyu Wu, Pranaab Dhawan, Raquel Urtasun
Comments: European Conference on Computer Vision (ECCV) 2020
Link: https://arxiv.org/abs/2008.05930

Abstract: In this paper we propose a novel end-to-end learnable network that performs joint perception, prediction and motion planning for self-driving vehicles and produces interpretable intermediate representations. Unlike existing neural motion planners, our motion planning costs are consistent with our perception and prediction estimates. This is achieved by a novel differentiable semantic occupancy representation that is explicitly used as cost by the motion planning process. Our network is learned end-to-end from human demonstrations. Experiments on a large-scale manual-driving dataset and in closed-loop simulation show that the proposed model significantly outperforms state-of-the-art planners in imitating human behavior while producing much safer trajectories.

[9]：LGNN: a Context-aware Line Segment Detector
Authors: Quan Meng, Jiakai Zhang, Qiang Hu, Xuming He, Jingyi Yu
Comments: 9 pages, 7 figures
Link: https://arxiv.org/abs/2008.05892

Abstract: We present a novel real-time line segment detection scheme called Line Graph Neural Network (LGNN). Existing approaches require a computationally expensive verification or postprocessing step. Our LGNN employs a deep convolutional neural network (DCNN) for proposing line segments directly, with a graph neural network (GNN) module for reasoning about their connectivity. Specifically, LGNN exploits a new quadruplet representation for each line segment where the GNN module takes the predicted candidates as vertices and constructs a sparse graph to enforce structural context. Compared with the state-of-the-art, LGNN achieves near real-time performance without compromising accuracy. LGNN further enables time-sensitive 3D applications. When a 3D point cloud is accessible, we present a multi-modal line segment classification technique for extracting a 3D wireframe of the environment robustly and efficiently.

[10]：Localizing the Common Action Among a Few Videos
Authors: Pengwan Yang, Vincent Tao Hu, Pascal Mettes, Cees G. M. Snoek
Comments: ECCV 2020
Link: https://arxiv.org/abs/2008.05826

Abstract: This paper strives to localize the temporal extent of an action in a long untrimmed video. Where existing work leverages many examples with their start, their ending, and/or the class of the action during training time, we propose few-shot common action localization. The start and end of an action in a long untrimmed video are determined based on just a handful of trimmed video examples containing the same action, without knowing their common class label. To address this task, we introduce a new 3D convolutional network architecture able to align representations from the support videos with the relevant query video segments. The network contains: (i) a mutual enhancement module to simultaneously complement the representation of the few trimmed support videos and the untrimmed query video; (ii) a progressive alignment module that iteratively fuses the support videos into the query branch; and (iii) a pairwise matching module to weigh the importance of different support videos. Evaluation of few-shot common action localization in untrimmed videos containing a single or multiple action instances demonstrates the effectiveness and general applicability of our proposal.

[11]：Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Authors: Ying Cheng, Ruize Wang, Zhihao Pan, Rui Feng, Yuejie Zhang
Comments: Accepted by the 28th ACM International Conference on Multimedia (ACM MM 2020)
Link: https://arxiv.org/abs/2008.05789

Abstract: When watching videos, the occurrence of a visual event is often accompanied by an audio event, e.g., the voice accompanying lip motion, or the music of instruments being played. There is an underlying correlation between audio and visual events, which can be utilized as free supervisory information to train a neural network by solving the pretext task of audio-visual synchronization. In this paper, we propose a novel self-supervised framework with a co-attention mechanism to learn generic cross-modal representations from unlabelled videos in the wild, and further benefit downstream tasks. Specifically, we explore three different co-attention modules to focus on discriminative visual regions correlated to the sounds and introduce the interactions between them. Experiments show that our model achieves state-of-the-art performance on the pretext task while having fewer parameters compared with existing methods. To further evaluate the generalizability and transferability of our approach, we apply the pre-trained model on two downstream tasks, i.e., sound source localization and action recognition. Extensive experiments demonstrate that our model provides competitive results with other self-supervised methods, and also indicate that our approach can tackle challenging scenes which contain multiple sound sources.

[12]：Predicting Visual Overlap of Images Through Interpretable Non-Metric Box Embeddings
Authors: Anita Rau, Guillermo Garcia-Hernando, Danail Stoyanov, Gabriel J. Brostow, Daniyar Turmukhambetov
Comments: ECCV 2020
Link: https://arxiv.org/abs/2008.05785

Abstract: To what extent are two images picturing the same 3D surfaces? Even when this is a known scene, the answer typically requires an expensive search across scale space, with matching and geometric verification of large sets of local features. This expense is further multiplied when a query image is evaluated against a gallery, e.g. in visual relocalization. While we don't obviate the need for geometric verification, we propose an interpretable image-embedding that cuts the search in scale space to essentially a lookup.
Our approach measures the asymmetric relation between two images. The model then learns a scene-specific measure of similarity, from training examples with known 3D visible-surface overlaps. The result is that we can quickly identify, for example, which test image is a close-up version of another, and by what scale factor. Subsequently, local features need only be detected at that scale. We validate our scene-specific model by showing how this embedding yields competitive image-matching results, while being simpler, faster, and also interpretable by humans.
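
To make the "asymmetric relation" above concrete, the sketch below represents each image as an axis-aligned box and measures what fraction of one box lies inside the other. It is a toy illustration of why box overlap is non-metric (the score from A to B differs from B to A); the box coordinates and function names are hypothetical, and the paper's learned embedding is not reproduced here.

```python
def box_volume(lo, hi):
    """Volume of an axis-aligned box; zero if the box is empty in any dimension."""
    vol = 1.0
    for low, high in zip(lo, hi):
        vol *= max(0.0, high - low)
    return vol


def intersection(box_a, box_b):
    """Corner-wise intersection of two boxes given as (lo, hi) pairs."""
    lo = [max(a, b) for a, b in zip(box_a[0], box_b[0])]
    hi = [min(a, b) for a, b in zip(box_a[1], box_b[1])]
    return lo, hi


def enclosure(box_a, box_b):
    """Fraction of box_a's volume that lies inside box_b (asymmetric)."""
    vol_a = box_volume(*box_a)
    if vol_a == 0.0:
        return 0.0
    return box_volume(*intersection(box_a, box_b)) / vol_a


if __name__ == "__main__":
    wide_view = ([0.0, 0.0], [4.0, 4.0])  # hypothetical embedding of a wide shot
    close_up = ([1.0, 1.0], [3.0, 3.0])   # hypothetical embedding of a close-up
    print(enclosure(close_up, wide_view))  # close-up is fully covered by the wide view
    print(enclosure(wide_view, close_up))  # only part of the wide view is in the close-up
```

The two directions give different values, and their ratio hints at the relative scale between the views, which is the kind of lookup the abstract describes as replacing a search across scale space.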

[13]：CycleMorph: Cycle Consistent Unsupervised Deformable Image Registration
Authors: Boah Kim, Dong Hwan Kim, Seong Ho Park, Jieun Kim, June-Goo Lee, Jong Chul Ye
Link: https://arxiv.org/abs/2008.05772

Abstract: Image registration is a fundamental task in medical image analysis. Recently, deep learning based image registration methods have been extensively investigated due to their excellent performance combined with ultra-fast computation time. However, the existing deep learning methods still have limitations in preserving the original topology during the deformation with registration vector fields. To address this issue, here we present a cycle-consistent deformable image registration. The cycle consistency enhances image registration performance by providing an implicit regularization to preserve topology during the deformation. The proposed method is flexible enough to be applied to both 2D and 3D registration problems for various applications, and can be easily extended to a multi-scale implementation to deal with the memory issues in large volume registration. Experimental results on various datasets from medical and non-medical applications demonstrate that the proposed method provides effective and accurate registration on diverse image pairs within a few seconds. Qualitative and quantitative evaluations on deformation fields also verify the effectiveness of the cycle consistency of the proposed method.

[14]：AdaIN-Switchable CycleGAN for Efficient Unsupervised Low-Dose CT Denoising
Authors: Jawook Gu, Jong Chul Ye
Comments: 12 pages, 10 figures
Link: https://arxiv.org/abs/2008.05753

Abstract: Recently, deep learning approaches have been extensively studied for low-dose CT denoising thanks to their superior performance and fast computation time. In particular, cycleGAN has been demonstrated as a powerful unsupervised learning scheme to improve the low-dose CT image quality without requiring matched high-dose reference data. Unfortunately, one of the main limitations of the cycleGAN approach is that it requires two deep neural network generators at the training phase, although only one of them is used at the inference phase. The secondary auxiliary generator is needed to enforce the cycle-consistency, but the additional memory requirement and the increase in learnable parameters are the main hurdles for cycleGAN training. To address this issue, here we propose a novel cycleGAN architecture using a single switchable generator. In particular, a single generator is implemented using adaptive instance normalization (AdaIN) layers so that the baseline generator converting a low-dose CT image to a routine-dose CT image can be switched to a generator converting high-dose to low-dose by simply changing the AdaIN code. Thanks to the shared baseline network, the additional memory requirement and weight increases are minimized, and the training can be done more stably even with small training data. Experimental results show that the proposed method outperforms the previous cycleGAN approaches while using only about half the parameters.
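
The switching mechanism mentioned above relies on adaptive instance normalization, which normalizes each feature channel and then applies a scale and shift supplied from outside the layer. The following is a minimal sketch of that operation alone, assuming flat per-channel activation lists; the two "codes" in the demo are made-up numbers, and nothing here reproduces the paper's generator or how its codes are produced.

```python
import math


def adain(features, gamma, beta, eps=1e-5):
    """Adaptive instance normalization over a list of channels.

    Each channel is normalized to zero mean and unit variance, then
    scaled by gamma[c] and shifted by beta[c]. The (gamma, beta) pair
    plays the role of the externally supplied code.
    """
    out = []
    for channel, g, b in zip(features, gamma, beta):
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)
        out.append([g * (x - mean) / std + b for x in channel])
    return out


if __name__ == "__main__":
    feats = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.5, 9.5, 10.0]]
    # Hypothetical codes: the same features are modulated differently
    # depending on which code is plugged in.
    codes = {
        "direction_a": ([1.0, 1.0], [0.0, 0.0]),
        "direction_b": ([0.5, 2.0], [1.0, -1.0]),
    }
    for name, (gamma, beta) in codes.items():
        print(name, adain(feats, gamma, beta))
```

Because only the small (gamma, beta) code changes between the two directions while the convolutional weights are shared, a design of this kind needs far fewer parameters than two separate generators, which is consistent with the parameter saving the abstract reports.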

[15]：Few shot clustering for indoor occupancy detection with extremely low-quality images from battery free cameras
Authors: Homagni Saha, Sin Yon Tan, Ali Saffari, Mohamad Katanbaf, Joshua R. Smith, Soumik Sarkar
Comments: 9 pages
Link: https://arxiv.org/abs/2008.05654

Abstract: Reliable detection of human occupancy in indoor environments is critical for various energy efficiency, security, and safety applications. We consider this challenge of occupancy detection using extremely low-quality, privacy-preserving images from low power image sensors. We propose a combined few shot learning and clustering algorithm to address this challenge that has very low commissioning and maintenance cost. While the few shot learning concept enables us to commission our system with a few labeled examples, the clustering step serves the purpose of online adaptation to a changing imaging environment over time. Apart from validating and comparing our algorithm on benchmark datasets, we also demonstrate the performance of our algorithm on streaming images collected from real homes using our novel battery free camera hardware.

[16]：Towards Modality Transferable Visual Information Representation with Optimal Model Compression
Authors: Rongqun Lin, Linwei Zhu, Shiqi Wang, Sam Kwong
Comments: Accepted in ACM Multimedia 2020
Link: https://arxiv.org/abs/2008.05642

Abstract: Compactly representing visual signals is of fundamental importance in various image/video-centered applications. Although numerous approaches were developed for improving the image and video coding performance by removing the redundancies within visual signals, much less work has been dedicated to the transformation of the visual signals to another well-established modality for better representation capability. In this paper, we propose a new scheme for visual signal representation that leverages the philosophy of transferable modality. In particular, the deep learning model, which characterizes and absorbs the statistics of the input scene with online training, could be efficiently represented in the sense of rate-utility optimization to serve as the enhancement layer in the bitstream. As such, the overall performance can be further guaranteed by optimizing the new modality incorporated. The proposed framework is implemented on the state-of-the-art video coding standard (i.e., versatile video coding), and significantly better representation capability has been observed based on extensive evaluations.

[17]：Procedural Urban Forestry
Authors: Till Niese, Sören Pirk, Bedrich Benes, Oliver Deussen
Comments: 14 pages
Link: https://arxiv.org/abs/2008.05567

Abstract: The placement of vegetation plays a central role in the realism of virtual scenes. We introduce procedural placement models (PPMs) for vegetation in urban layouts. PPMs are environmentally sensitive to city geometry and allow identifying plausible plant positions based on structural and functional zones in an urban layout. PPMs can either be directly used by defining their parameters or can be learned from satellite images and land register data. Together with approaches for generating buildings and trees, this allows us to populate urban landscapes with complex 3D vegetation. The effectiveness of our framework is shown through examples of large-scale city scenes and close-ups of individually grown tree models; we also validate it by a perceptual user study.

[18]：Multi-level Stress Assessment Using Multi-domain Fusion of ECG Signal
Authors: Zeeshan Ahmad, Naimul Khan
Link: https://arxiv.org/abs/2008.05503

Abstract: Stress analysis and assessment of affective states of mind using ECG as a physiological signal is an active research topic in biomedical signal processing. However, existing literature provides only binary assessment of stress, while multiple levels of assessment may be more beneficial for healthcare applications. Furthermore, in present research, the ECG signal for stress analysis is examined independently in the spatial domain or in transform domains, but the advantage of fusing these domains has not been fully utilized. To get the maximum advantage of fusing different domains, we introduce a dataset with multiple stress levels and then classify these levels using a novel deep learning approach by converting the ECG signal into signal images based on R-R peaks without any feature extraction. Moreover, we made signal images multimodal and multidomain by converting them into the time-frequency and frequency domains using the Gabor wavelet transform (GWT) and the Discrete Fourier Transform (DFT), respectively. Convolutional neural networks (CNNs) are used to extract features from different modalities, and then decision level fusion is performed to improve the classification accuracy. The experimental results on an in-house dataset collected with 15 users show that with the proposed fusion framework and ECG signal to image conversion, we reach an average accuracy of 85.45%.

Cross-listed with NLP (4 papers)

[1]：On the Importance of Local Information in Transformer Based Models
Authors: Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra
Comments: 10 pages, 4 figures
Link: https://arxiv.org/abs/2008.05828

Abstract: The self-attention module is a key component of Transformer-based models, wherein each token pays attention to every other token. Recent studies have shown that attention heads exhibit syntactic, semantic, or local behaviour. Some studies have also identified promise in restricting this attention to be local, i.e., a token attending to other tokens only in a small neighbourhood around it. However, no conclusive evidence exists that such local attention alone is sufficient to achieve high accuracy on multiple NLP tasks. In this work, we systematically analyse the role of locality information in learnt models and contrast it with the role of syntactic information. More specifically, we first do a sensitivity analysis and show that, at every layer, the representation of a token is much more sensitive to tokens in a small neighbourhood around it than to tokens which are syntactically related to it. We then define an attention bias metric to determine whether a head pays more attention to local tokens or to syntactically related tokens. We show that a larger fraction of heads have a locality bias as compared to a syntactic bias. Having established the importance of local attention heads, we train and evaluate models where varying fractions of the attention heads are constrained to be local. Such models would be more efficient as they would have fewer computations in the attention layer. We evaluate these models on 4 GLUE datasets (QQP, SST-2, MRPC, QNLI) and 2 MT datasets (En-De, En-Ru) and clearly demonstrate that such constrained models have comparable performance to the unconstrained models. Through this systematic evaluation we establish that attention in Transformer-based models can be constrained to be local without affecting performance.
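
Constraining a head to be local, as described above, amounts to masking out attention scores for tokens outside a small window before the softmax. The sketch below shows one such mask and a masked softmax for a single query; the symmetric window definition and the function names are assumptions for illustration and are not taken from the paper.

```python
import math


def local_attention_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j (|i - j| <= window)."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, mask_row):
    """Softmax over the allowed positions only; masked positions get weight 0."""
    kept = [s for s, keep in zip(scores, mask_row) if keep]
    if not kept:
        raise ValueError("at least one position must be allowed")
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


def allowed_pairs(mask):
    """Number of (query, key) pairs that still need a score computed."""
    return sum(sum(1 for keep in row if keep) for row in mask)


if __name__ == "__main__":
    mask = local_attention_mask(seq_len=6, window=1)
    print("pairs with local window:", allowed_pairs(mask), "of", 6 * 6)
    # Attention weights for the first token given made-up raw scores.
    print(masked_softmax([0.2, 1.5, -0.3, 0.9, 0.0, 2.0], mask[0]))
```

For a fixed window the number of allowed pairs grows linearly with sequence length rather than quadratically, which is the source of the efficiency gain the abstract mentions.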

[2]：MASRI-HEADSET: A Maltese Corpus for Speech Recognition
Authors: Carlos Mena, Albert Gatt, Andrea DeMarco, Claudia Borg, Lonneke van der Plas, Amanda Muscat, Ian Padovani
Comments: 8 pages, 2 figures, 4 tables, 1 appendix. Appears in Proceedings of the 12th edition of the Language Resources and Evaluation Conference (LREC'20)
Link: https://arxiv.org/abs/2008.05760

Abstract: Maltese, the national language of Malta, is spoken by approximately 500,000 people. Speech processing for Maltese is still in its early stages of development. In this paper, we present the first spoken Maltese corpus designed specifically for Automatic Speech Recognition (ASR). The MASRI-HEADSET corpus was developed by the MASRI project at the University of Malta. It consists of 8 hours of speech paired with text, recorded using short text snippets in a laboratory environment. The speakers were recruited from different geographical locations all over the Maltese islands, and were roughly evenly distributed by gender. This paper also presents some initial results achieved in baseline experiments for Maltese ASR using Sphinx and Kaldi. The MASRI-HEADSET Corpus is publicly available for research/academic purposes.

[3]: MICE: Mining Idioms with Contextual Embeddings
Title: MICE: idiom mining based on contextual embeddings
Authors: Tadej Škvorc, Polona Gantar, Marko Robnik-Šikonja
Link: https://arxiv.org/abs/2008.05759

Abstract: Idiomatic expressions can be problematic for natural language processing applications as their meaning cannot be inferred from their constituent words. A lack of successful methodological approaches and sufficiently large datasets prevents the development of machine learning approaches for detecting idioms, especially for expressions that do not occur in the training set. We present an approach, called MICE, that uses contextual embeddings for that purpose. We present a new dataset of multi-word expressions with literal and idiomatic meanings and use it to train a classifier based on two state-of-the-art contextual word embeddings: ELMo and BERT. We show that deep neural networks using both embeddings perform much better than existing approaches, and are capable of detecting idiomatic word use, even for expressions that were not present in the training set. We demonstrate cross-lingual transfer of developed models and analyze the size of the required dataset.

[4]: Semantics-preserving adversarial attacks in NLP
Title: Methods for preserving semantics in adversarial attacks in NLP
Authors: Rahul Singh, Tarun Joshi, Vijayan N. Nair, Agus Sudjianto
Notes: 12 pages, 3 figures, 10 tables
Link: https://arxiv.org/abs/2008.05536

Abstract: We propose algorithms to create adversarial attacks to assess model robustness in text classification problems. They can be used to create white-box and black-box attacks while preserving the semantics and syntax of the original text. The attacks cause a significant number of flips in the white-box setting, and the same rule-based approach can be used in the black-box setting. In the black-box setting, the attacks created are able to reverse the decisions of transformer-based architectures.
