Uploaded 2021-05-24 10:39:13
The following article is from 浙江语言学
【1】 Stage-wise Fine-tuning for Graph-to-Text Generation
Authors: Qingyun Wang, Semih Yavuz, Victoria Lin, Heng Ji, Nazneen Rajani
Affiliations: University of Illinois at Urbana-Champaign; Salesforce Research; Facebook Research
Comments: 9 pages, accepted by the Proceedings of the ACL-IJCNLP 2021 Student Research Workshop; code and resources at this https URL
Abstract: Graph-to-text generation has benefited from pre-trained language models (PLMs), which achieve better performance than structured graph encoders. However, they fail to fully utilize the structure information of the input graph. In this paper, we aim to further improve the performance of the pre-trained language model by proposing a structured graph-to-text model with a two-step fine-tuning mechanism, which first fine-tunes the model on Wikipedia before adapting it to graph-to-text generation. In addition to using the traditional token and position embeddings to encode the knowledge graph (KG), we propose a novel tree-level embedding method to capture the inter-dependency structures of the input graph. This new approach significantly improves performance on all text generation metrics for the English WebNLG 2017 dataset.
【2】 Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial
Authors: Swagat Kumar
Comments: 8 pages, 3 figures, 2 tables
Abstract: This paper provides the details of implementing two important policy gradient methods to solve the inverted pendulum problem: the Deep Deterministic Policy Gradient (DDPG) algorithm and the Proximal Policy Optimization (PPO) algorithm. The problem is solved using an actor-critic model, where an actor network learns the policy function and a critic network evaluates the actor network by learning to estimate the Q function. Apart from briefly explaining the mathematics behind these two algorithms, the paper provides the details of a Python implementation, which helps demystify the underlying complexity of the algorithms. In the process, readers will be introduced to OpenAI Gym, TensorFlow 2.x and the Keras utilities used for implementing the above concepts.
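As a rough illustration of the PPO side of this tutorial, the sketch below computes the clipped surrogate objective that is commonly associated with PPO. The abstract does not say which PPO variant the tutorial implements, so the clipped form, the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions made for illustration, and plain Python lists stand in for the TensorFlow tensors the tutorial would use.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of samples.

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (hypothetical default, chosen for illustration)
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

An actor update would maximise this quantity (or minimise its negative); the clipping removes the incentive to move the new policy far from the old one within a single update.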
【3】 Learning User Embeddings from Temporal Social Media Data: A Survey
Authors: Fatema Hasan, Kevin S. Xu, James R. Foulds, Shimei Pan
Affiliations: Information Systems, University of Maryland, Baltimore County; Electrical Engineering & Computer Science, University of Toledo
Abstract: User-generated data on social media contain rich information about who we are, what we like and how we make decisions. In this paper, we survey representative work on learning a concise latent user representation (a.k.a. user embedding) that can capture the main characteristics of a social media user. The learned user embeddings can later be used to support different downstream user analysis tasks such as personality modeling, suicidal risk assessment and purchase decision prediction. The temporal nature of user-generated data on social media has largely been overlooked in much of the existing user embedding literature. In this survey, we focus on research that bridges the gap by incorporating temporal/sequential information in user representation learning. We categorize relevant papers along several key dimensions, identify limitations in the current work and suggest future research directions.
【4】 Learning to Automatically Catch Potholes in Worldwide Road Scene Images
Authors: J. Javier Yebes, David Montero, Ignacio Arriola
Affiliations: the Department of Transport in the UK stated in 2014 that more than £3 billion was spent nationally on road repairs
Comments: In IEEE Intelligent Transportation Systems Magazine
Abstract: Among the many road hazards present on any paved way in the world, potholes are among the most annoying and also involve higher maintenance costs. There is increasing interest in the automated detection of these hazards, enabled by technological and research progress. Our research work tackled the challenge of pothole detection from images of real-world road scenes. The main novelty lies in applying the latest progress in AI to learn the visual appearance of potholes. We built a large dataset of images with pothole annotations. The images contain road scenes from different cities around the world, taken with different cameras, vehicles and viewpoints under varied environmental conditions. We then fine-tuned four different object detection models based on Faster R-CNN and SSD deep neural networks. We achieved high average precision, and the pothole detector was tested on the Nvidia DrivePX2 platform with GPGPU capability, which can be embedded in vehicles. Moreover, it was deployed on a real vehicle to report the detected potholes to a given IoT platform as part of the AUTOPILOT H2020 project.
【5】 Gradient Masking and the Underestimated Robustness Threats of Differential Privacy in Deep Learning
Authors: Franziska Boenisch, Philip Sperl, Konstantin Böttinger
Affiliations: Fraunhofer Institute for Applied and Integrated Security
Abstract: An important problem in deep learning is the privacy and security of neural networks (NNs). Both aspects have long been considered separately. To date, it is still poorly understood how privacy-enhancing training affects the robustness of NNs. This paper experimentally evaluates the impact of training with Differential Privacy (DP), a standard method for privacy preservation, on model vulnerability against a broad range of adversarial attacks. The results suggest that private models are less robust than their non-private counterparts, and that adversarial examples transfer better among DP models than between non-private and private ones. Furthermore, detailed analyses of DP and non-DP models suggest significant differences between their gradients. Additionally, this work is the first to observe that an unfavorable choice of parameters in DP training can lead to gradient masking and thereby result in a false sense of security.
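To make "parameters in DP training" concrete, here is a minimal sketch of the gradient step used in DP-SGD-style training, where each per-example gradient is clipped to a norm bound and Gaussian noise is added to the sum. The abstract does not name the exact DP training procedure, so treating it as DP-SGD is an assumption, and the names `dp_sgd_gradient`, `clip_norm` and `noise_multiplier` are hypothetical.

```python
import math
import random


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    per_example_grads: list of gradient vectors (lists of floats), one per example
    clip_norm:         maximum L2 norm allowed for each per-example gradient
    noise_multiplier:  noise standard deviation expressed as a multiple of clip_norm
    rng:               object providing gauss(mu, sigma), e.g. random.Random(seed)
    """
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale the gradient down only if its norm exceeds the bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            summed[i] += x * scale
    sigma = noise_multiplier * clip_norm
    # Noise is added to the clipped sum, before dividing by the batch size.
    return [(s + rng.gauss(0.0, sigma)) / len(per_example_grads) for s in summed]
```

The two knobs interact: a small `clip_norm` shrinks every gradient, while a large `noise_multiplier` can drown the remaining signal, which is the kind of parameter choice the abstract links to gradient masking.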
【6】 Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare
Authors: Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe
Affiliations: Harvard University; Google Research
Comments: To appear in the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)
Abstract: In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks. Unfortunately, beneficiaries may gradually disengage from such programs, which is detrimental to their health. A concrete example of gradual disengagement has been observed by an organization that carries out a free automated call-based program for spreading preventive care information among pregnant women. Many women stop picking up calls after being enrolled for a few months. To avoid such disengagements, it is important to provide timely interventions. Such interventions are often expensive and can be provided to only a small fraction of the beneficiaries. We model this scenario as a restless multi-armed bandit (RMAB) problem, where each beneficiary is assumed to transition from one state to another depending on the intervention. Moreover, since the transition probabilities are unknown a priori, we propose a Whittle index based Q-Learning mechanism and show that it converges to the optimal solution. Our method improves over existing learning-based methods for RMABs on multiple benchmarks from the literature and also on the maternal healthcare dataset.
【7】 Behavior-based Neuroevolutionary Training in Reinforcement Learning
Authors: Jörg Stork, Martin Zaefferer, Nils Eisler, Patrick Tichelmann, Thomas Bartz-Beielstein, A. E. Eiben
Affiliations: TH Köln, Cologne, Germany; Vrije Universiteit Amsterdam, Amsterdam, Netherlands
Abstract: In addition to their undisputed success in solving classical optimization problems, neuroevolutionary and population-based algorithms have become an alternative to standard reinforcement learning methods. However, evolutionary methods often lack the sample efficiency of standard value-based methods that leverage gathered state and value experience. If reinforcement learning for real-world problems with significant resource cost is considered, sample efficiency is essential. The enhancement of evolutionary algorithms with experience-exploiting methods is thus desired and promises valuable insights. This work presents a hybrid algorithm that combines topology-changing neuroevolutionary optimization with value-based reinforcement learning. We illustrate how the behavior of policies can be used to create distance and loss functions, which benefit from stored experiences and calculated state values. They allow us to model behavior and perform a directed search in the behavior space by gradient-free evolutionary algorithms and surrogate-based optimization. For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population. We exemplify the performance of our algorithm on standard benchmarks and a purpose-built real-world problem. Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.
【8】 Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers
Authors: Daniel Tanneberg, Elmar Rueckert, Jan Peters
Affiliations: Intelligent Autonomous Systems, Technische Universität Darmstadt, Darmstadt, Germany; Institute for Robotics and Cognitive Systems, Universität zu Lübeck, Lübeck, Germany
Comments: Nature Machine Intelligence
Abstract: A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity, like algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns, and learning complex functions, but they struggle to learn such iterative, sequential or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacities in learning such strategies, but they are still prone to data variations, struggle to learn scalable and transferable solutions, and require massive training data. We present the Neural Harvard Computer (NHC), a memory-augmented network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow and separated modules. This abstraction mechanism and evolutionary training enable the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms with varying complexities, we show that the NHC reliably learns algorithmic solutions with strong generalization and abstraction: perfect generalization and scaling to arbitrary task configurations and complexities far beyond those seen during training, independent of the data representation and the task domain.
【9】 MMGET: A Markov model for generalized evidence theory
Authors: Yuanpeng He
Affiliations: School of Computer and Information Science, Southwest University
Comments: 20 pages, 24 figures
Abstract: In real life, lots of information merges from time to time. To appropriately describe the actual situations, many theories have been proposed. Among them, Dempster-Shafer evidence theory is a very useful tool for managing uncertain information. To better adapt to complex situations of the open world, a generalized evidence theory has been designed. However, everything occurs in sequence and has some underlying relationships with everything else. In order to further embody the details of information and better conform to real-world situations, a Markov model is introduced into the generalized evidence theory, which helps extract the complete information volume from the evidence provided. In addition, some numerical examples are offered to verify the correctness and rationality of the proposed method.
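For readers unfamiliar with the evidence theory this paper builds on, the sketch below implements the classical Dempster rule of combination for two mass functions. It is background only: it covers neither the generalized evidence theory nor the Markov model proposed in the paper, and the function name `dempster_combine` and the dictionary representation are assumptions made for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses of each function are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory pairs is the conflict K.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by 1 - K so the combined masses sum to 1 again.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Usage: two sources reporting on the hypotheses "a" and "b".
A, B, AB = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})
fused = dempster_combine({A: 0.6, AB: 0.4}, {B: 0.7, AB: 0.3})
```

In the classical rule, the mass on the empty set is always renormalised away, as in the final line of the function.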
【10】 TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning
Authors: Lu Wang, Xiaofu Chang, Shuang Li, Yunfei Chu, Hui Li, Wei Zhang, Xiaofeng He, Le Song, Jingren Zhou, Hongxia Yang
Affiliations: East China Normal University; Damo Academy, Alibaba Group; Harvard University, USA; Ant Group; Gatech
Abstract: Dynamic graph modeling has recently attracted much attention due to its extensive applications in many real-world scenarios, such as recommendation systems, financial transactions, and social networks. Although many works have been proposed for dynamic graph modeling in recent years, effective and scalable models are yet to be developed. In this paper, we propose a novel graph neural network approach, called TCL, which deals with the dynamically evolving graph in a continuous-time fashion and enables effective dynamic node representation learning that captures both the temporal and topology information. Technically, our model contains three novel aspects. First, we generalize the vanilla Transformer to temporal graph learning scenarios and design a graph-topology-aware transformer. Second, on top of the proposed graph transformer, we introduce a two-stream encoder that separately extracts representations from temporal neighborhoods associated with the two interaction nodes and then utilizes a co-attentional transformer to model inter-dependencies at a semantic level. Lastly, inspired by recently developed contrastive learning, we propose to optimize our model by maximizing mutual information (MI) between the predictive representations of two future interaction nodes. Benefiting from this, our dynamic representations can preserve high-level (or global) semantics about interactions and are thus robust to noisy interactions. To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs. We evaluate our model on four benchmark datasets for interaction prediction, and experimental results demonstrate the superiority of our model.
【11】 Mean Field Games Flock! The Reinforcement Learning Way
Authors: Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin
Affiliations: Univ. Lille, CNRS, Inria, Centrale Lille, UMR , CRIStAL; Princeton University, ORFE; DeepMind Paris; Google Research, Brain Team
Abstract: We present a method enabling a large number of agents to learn how to flock, which is a natural behavior observed in large populations of animals. This problem has drawn a lot of interest but requires many structural assumptions and is tractable only in small dimensions. We phrase this problem as a Mean Field Game (MFG), where each individual chooses its acceleration depending on the population behavior. Combining Deep Reinforcement Learning (RL) and Normalizing Flows (NF), we obtain a tractable solution requiring only very weak assumptions. Our algorithm finds a Nash Equilibrium, and the agents adapt their velocity to match the neighboring flock's average one. We use Fictitious Play and alternate: (1) computing an approximate best response with Deep RL, and (2) estimating the next population distribution with NF. We show numerically that our algorithm learns multi-group or high-dimensional flocking with obstacles.
【12】 Physics-informed attention-based neural network for solving non-linear partial differential equations
Authors: Ruben Rodriguez-Torrado, Pablo Ruiz, Luis Cueto-Felgueroso, Michael Cerny Green, Tyler Friesen, Sebastien Matringe, Julian Togelius
Affiliations: OriGen.AI and Universidad Politecnica de Madrid; Hess Corporation
Abstract: Physics-Informed Neural Networks (PINNs) have enabled significant improvements in modelling physical processes described by partial differential equations (PDEs). PINNs are based on simple architectures, and learn the behavior of complex physical systems by optimizing the network parameters to minimize the residual of the underlying PDE. Current network architectures share some of the limitations of classical numerical discretization schemes when applied to non-linear differential equations in continuum mechanics. A paradigmatic example is the solution of hyperbolic conservation laws that develop highly localized nonlinear shock waves. Learning solutions of PDEs with dominant hyperbolic character is a challenge for current PINN approaches, which rely, like most grid-based numerical schemes, on adding artificial dissipation. Here, we address the fundamental question of which network architectures are best suited to learn the complex behavior of non-linear PDEs. We focus on network architecture rather than on residual regularization. Our new methodology, called Physics-Informed Attention-based Neural Networks (PIANNs), is a combination of recurrent neural networks and attention mechanisms. The attention mechanism adapts the behavior of the deep neural network to the non-linear features of the solution, and breaks the current limitations of PINNs. We find that PIANNs effectively capture the shock front in a hyperbolic model problem, and are capable of providing high-quality solutions inside and beyond the training set.
【13】 HetMAML: Task-Heterogeneous Model-Agnostic Meta-Learning for Few-Shot Learning Across Modalities
Authors: Jiayi Chen, Aidong Zhang
Affiliations: University of Virginia
Abstract: Most existing gradient-based meta-learning approaches to few-shot learning assume that all tasks have the same input feature space. However, in real-world scenarios, there are many cases in which the input structures of tasks differ; that is, different tasks may vary in the number of input modalities or the data structure of each modality. Existing gradient-based approaches cannot handle such a heterogeneous task distribution (HTD), as different types of tasks only share partial meta-parameters. In this paper, we propose HetMAML, a task-heterogeneous model-agnostic meta-learning framework that can generalize not only common meta-parameters shared across different types of tasks but also type-specific meta-parameters. Specifically, we design a multi-channel backbone module that encodes the input of each type of task into a same-length sequence of modality-specific embeddings. Then, we propose a task-aware multimodal encoder which can automatically take into account the context of task-specific input structures and adaptively project the heterogeneous input spaces to the same lower-dimensional concept space. Extensive experiments on five task-heterogeneous datasets demonstrate that HetMAML successfully captures both type-specific and shared meta-parameters across heterogeneous tasks, which adapt quickly to all types of new tasks.
【14】 Efficient and accurate group testing via Belief Propagation: an empirical study
Authors: Amin Coja-Oghlan, Max Hahn-Klimroth, Philipp Loick, Manuel Penschuck
Abstract: The group testing problem asks for efficient pooling schemes and algorithms that allow moderately large numbers of samples to be screened for rare infections. The goal is to accurately identify the infected samples while conducting the least possible number of tests. Exploring the use of techniques centred around the Belief Propagation message passing algorithm, we suggest a new test design that significantly increases the accuracy of the results. The new design comes with Belief Propagation as an efficient inference algorithm. Aiming for results on practical rather than asymptotic problem sizes, we conduct an experimental study.
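The pooling idea behind group testing can be sketched with a much simpler decoder than the one studied in the paper. The code below uses COMP-style decoding, in which every sample that appears in a negative pool is cleared and the rest are reported as possibly infected. This is a baseline substituted for illustration, not the Belief Propagation algorithm or the test design from the paper; the function names and the example pools are hypothetical.

```python
def run_pools(pools, infected):
    """Simulate noiseless pooled tests: a pool is positive if it holds any infected sample."""
    return [any(sample in infected for sample in pool) for pool in pools]


def comp_decode(num_samples, pools, results):
    """Clear every sample seen in a negative pool; return the remaining candidates."""
    cleared = set()
    for pool, positive in zip(pools, results):
        if not positive:
            cleared.update(pool)
    return set(range(num_samples)) - cleared


# Usage: six samples, five pools, and sample 4 as the only infected one.
pools = [[0, 1, 2], [3, 4, 5], [0, 3], [1, 4], [2, 5]]
results = run_pools(pools, infected={4})
candidates = comp_decode(6, pools, results)
```

This decoder never clears an infected sample when tests are noiseless, but it can leave healthy samples among the candidates; more accurate inference, such as Belief Propagation, aims to resolve those with fewer tests.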
【15】 Conscious AI
Authors: Hadi Esmaeilzadeh, Reza Vaezi
Affiliations: University of California San Diego; Kennesaw State University
Abstract: Recent advances in artificial intelligence (AI) have achieved human-scale speed and accuracy for classification tasks. In turn, these capabilities have made AI a viable replacement for many human activities that at their core involve classification, such as basic mechanical and analytical tasks in low-level service jobs. Current systems do not need to be conscious to recognize patterns and classify them. However, for AI to progress to more complicated tasks requiring intuition and empathy, it must develop capabilities such as metathinking, creativity, and empathy akin to human self-awareness or consciousness. We contend that such a paradigm shift is possible only through a fundamental shift in the state of artificial intelligence toward consciousness, a shift similar to what took place for humans through the process of natural selection and evolution. As such, this paper aims to theoretically explore the requirements for the emergence of consciousness in AI. It also provides a principled understanding of how conscious AI can be detected and how it might be manifested in contrast to the dominant paradigm that seeks to ultimately create machines that are linguistically indistinguishable from humans.
【16】 A Review on Explainability in Multimodal Deep Neural Nets
Authors: Gargi Joshi, Rahee Walambe, Ketan Kotecha
Comments: 24 pages, 6 figures
Abstract: Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most notably in Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled research into applications where different modalities among language, vision, sensory data and text play an important role in accurate prediction and identification. Several multimodal fusion methods employing deep learning models have been proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on explainability in multimodal deep neural nets, especially for vision and language tasks. Several topics on multimodal AI and its applications for generic domains are covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain.
【17】 Quantum Uncertainty in Decision Theory
Authors: V. I. Yukalov
Affiliations: Bogolubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research; Universidade de São Paulo
Comments: 17 pages
Abstract: An approach is presented treating decision theory as a probabilistic theory based on quantum techniques. Accurate definitions are given and a thorough analysis is carried out for the quantum probabilities describing the choice between separate alternatives, sequential alternatives characterizing conditional quantum probabilities, and behavioral quantum probabilities taking into account the rational-irrational duality of decision making. The comparison between quantum and classical probabilities is explained. The analysis demonstrates that quantum probabilities serve as an essentially more powerful tool for characterizing various decision-making situations, including the influence of psychological behavioral effects.
【18】 The challenges and realities of retailing in a COVID-19 world: Identifying trending and Vital During Crisis keywords during Covid-19 using Machine Learning (Austria as a case study)
Authors: Reda Mastouri et al., Joseph Gilkey
Comments: EasyChair; ENSIAS Rabat, Morocco; Saint Peter's University, NJ, USA
Abstract: From global pandemics to geopolitical turmoil, leaders in logistics, product allocation, procurement and operations are facing increasing difficulty in safeguarding their organizations against supply chain vulnerabilities. It is recommended to opt for forecasting against a trending-based benchmark, because auditing a future forecast puts more focus on seasonality. The forecasting models provide end-to-end, real-time oversight of the entire supply chain, while utilizing predictive analytics and artificial intelligence to identify potential disruptions before they occur. By combining internal and external data points, an AI-enabled modelling engine can greatly reduce risk by helping retail companies proactively respond to supply and demand variability. This research paper focuses on creating an ingenious way to tackle the impact of COVID-19 on supply chain, product allocation, trending and seasonality. Keywords: supply chain, COVID-19, forecasting, coronavirus, manufacturing, seasonality, trending, retail.
【19】 An Extensive Analytical Approach on Human Resources using Random Forest Algorithm
Authors: Swarajya Lakshmi V Papineni, A. Mallikarjuna Reddy, Sudeepti Yarlagadda, Snigdha Yarlagadda, Haritha Akkinen
Affiliations: Professor, Department of IT, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada, AP, India; Assistant Professor, Department of CSE, Anurag University; Freelance HR Consultant, Hyderabad, Telangana State, India
Abstract: The current job survey shows that most software employees are planning to change their job role due to high pay for recent jobs in fields such as data science, business analysis and artificial intelligence. The survey also indicated that work-life imbalance, low pay, uneven shifts and many other factors make employees think about changing their work life. In this paper, to support the efficient organisation of a company's human resources, the proposed system builds a model with a random forest algorithm that considers different employee parameters. This helps the HR department retain employees by identifying gaps, and helps the organisation run smoothly with a good employee retention ratio. This combination of HR and data science can improve the productivity, collaboration and well-being of the organisation's employees. It also helps to develop strategies that have an impact on the performance of employees in terms of external and social factors.
【20】 Hard Choices and Hard Limits for Artificial Intelligence
Authors: Bryce Goodman
Affiliations: Department of Philosophy, University of Oxford, United Kingdom
Abstract: Artificial intelligence (AI) is supposed to help us make better choices. Some of these choices are small, like what route to take to work, or what music to listen to. Others are big, like what treatment to administer for a disease or how long to sentence someone for a crime. If AI can assist with these big decisions, we might think it can also help with hard choices, cases where alternatives are neither better, worse nor equal but on a par. The aim of this paper, however, is to show that this view is mistaken: the fact of parity shows that there are hard limits on AI in decision making and choices that AI cannot, and should not, resolve.
【21】 The Flipped Classroom model for teaching Conditional Random Fields in an NLP course
Authors: Manex Agirrezabal
Affiliations: Centre for Language Technology (CST), Department of Nordic Studies and Linguistics, University of Copenhagen (Københavns Universitet), Emil Holms Kanal , Copenhagen (Denmark)
Comments: Accepted to the 5th Workshop on Teaching NLP at NAACL-HLT 2021
Abstract: In this article, we show and discuss our experience in applying the flipped classroom method for teaching Conditional Random Fields in a Natural Language Processing course. We present the activities that we developed together with their relationship to a cognitive complexity model (Bloom's taxonomy). After this, we provide our own reflections and expectations of the model itself. Based on the evaluation received from students, it seems that students learn about the topic and also that the method is rewarding for some students. Additionally, we discuss some shortcomings and propose possible solutions to them. We conclude the paper with some possible future work.
【22】 How to Explain Neural Networks: A perspective of data space division
Authors: Hangcheng Dong, Bingguo Liu, Fengdong Chen, Dong Ye, Guodong Liu
Affiliations: School of Instrumentation Science and Engineering, Harbin Institute of Technology
Abstract: The interpretability of intelligent algorithms represented by deep learning remains an open problem. We discuss the shortcomings of existing explainable methods based on two attributes of explanation, called completeness and explicitness. Furthermore, we point out that a model that relies completely on feed-forward mapping very easily leads to inexplicability, because it is hard to quantify the relationship between this mapping and the final model. Based on the perspective of data space division, the principle of complete local interpretable model-agnostic explanations (CLIMEP) is proposed in this paper. To study classification problems, we further discuss the equivalence of the CLIMEP and the decision boundary. In practice, CLIMEP is also difficult to implement. To tackle this challenge, motivated by the fact that a fully-connected neural network (FCNN) with piece-wise linear activation functions (PWLs) can partition the input space into several linear regions, we extend this result to arbitrary FCNNs by linearizing the activation functions. Applying this technique to classification problems, we obtain the complete decision boundary of FCNNs for the first time. Finally, we propose the DecisionNet (DNet), which divides the input space by the hyper-planes of the decision boundary. Hence, each linear interval of the DNet contains only samples of the same label. Experiments show the surprising model compression efficiency of the DNet with arbitrarily controlled precision.
【23】 TopicsRanksDC: Distance-based Topic Ranking applied on Two-Class Data
Authors: Malik Yousef, Jamal Al Qundus, Silvio Peikert, Adrian Paschke
Affiliations: The Galilee Digital Health Research Center (GDH), Zefat; Data Analytics Center (DANA), Fraunhofer FOKUS, Berlin
Comments: 10 pages, 5 figures
Abstract: In this paper, we introduce a novel approach named TopicsRanksDC for topic ranking based on the distance between two clusters that are generated by each topic. We assume that our data consists of text documents that are associated with two classes. Our approach ranks each topic contained in these text documents by its significance for separating the two classes. First, the algorithm detects topics using Latent Dirichlet Allocation (LDA). The words defining each topic are represented as two clusters, where each one is associated with one of the classes. We compute four distance metrics: Single Linkage, Complete Linkage, Average Linkage and the distance between the centroids. We compare the results of LDA topics and random topics. The results show that the rank for LDA topics is much higher than for random topics. The results of the TopicsRanksDC tool are promising for future work to enable search engines to suggest related topics.
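The four distance metrics named in the abstract can be sketched for two small clusters of points as follows. How TopicsRanksDC turns topic words into points is not described in the abstract, so the two-dimensional coordinates in the usage example and the function name `linkage_distances` are hypothetical; Euclidean distance is also an assumption.

```python
from itertools import product
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of a cluster of equal-length points."""
    return tuple(sum(column) / len(cluster) for column in zip(*cluster))


def linkage_distances(cluster_a, cluster_b):
    """Return the single, complete, average and centroid distances between two clusters."""
    pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]
    return {
        "single": min(pairwise),                     # closest pair across clusters
        "complete": max(pairwise),                   # farthest pair across clusters
        "average": sum(pairwise) / len(pairwise),    # mean over all cross-cluster pairs
        "centroid": dist(centroid(cluster_a), centroid(cluster_b)),
    }


# Usage: one cluster per class for a single topic (made-up coordinates).
class_one = [(0.0, 0.0), (0.0, 2.0)]
class_two = [(3.0, 0.0), (3.0, 2.0)]
distances = linkage_distances(class_one, class_two)
```

A topic whose two clusters lie far apart under these metrics separates the classes well, which is the intuition the ranking relies on.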
【24】 Designer-User Communication for XAI: An epistemological approach to discuss XAI design
Authors: Juliana Jansen Ferreira, Mateus Monteiro
Affiliations: IBM Research, Rio de Janeiro, Brazil; Federal Fluminense University
Comments: ACM CHI Workshop on Operationalizing Human-Centered Perspectives in Explainable AI at CHI 2021. 6 pages
Abstract: Artificial Intelligence is becoming part of any technology we use nowadays. If the AI informs people's decisions, the explanation of the AI's outcomes, results, and behavior becomes a necessary capability. However, discussing XAI features with various stakeholders is not a trivial task. Most of the available frameworks and methods for XAI focus on data scientists and ML developers as users. Our research is about XAI for end-users of AI systems. We argue that we need to discuss XAI early in the AI-system design process and with all stakeholders. In this work, we aimed to investigate how to operationalize the discussion about XAI scenarios and opportunities among designers and developers of AI and its end-users. We took the Signifying Message as our conceptual tool to structure and discuss XAI scenarios. We experiment with its use for the discussion of a healthcare AI system.
【25】 DISCO Verification: Division of Input Space into COnvex polytopes for neural network verification
作者:Julien Girard-Satabin,Aymeric Varasse,Marc Schoenauer,Guillaume Charpiat,Zakaria Chihani
机构: Université Paris-Saclay, CEA, List, F-, Palaiseau, France, TAU team, LISN (Université Paris-Saclay and CNRS), INRIA
摘要:The impressive results of modern neural networks partly come from their nonlinear behaviour. Unfortunately, this property makes it very difficult to apply formal verification tools, even if we restrict ourselves to networks with a piecewise linear structure. However, such networks yield subregions that are linear and thus simpler to analyse independently. In this paper, we propose a method to simplify the verification problem by partitioning it into multiple linear subproblems. To evaluate the feasibility of such an approach, we perform an empirical analysis of neural networks to estimate the number of linear regions, and compare them to the bounds currently known. We also present the impact of a technique aiming at reducing the number of linear regions during training.
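One simple way to estimate the number of linear regions of a piecewise linear (ReLU) network is to count the distinct on/off activation patterns reached by sampled inputs, since the network is affine wherever the pattern is fixed. The sketch below assumes this sampling approach, which gives only a lower bound and is not necessarily the estimator used in the paper; the function name and the tiny one-layer network are hypothetical.

```python
import numpy as np

def count_activation_patterns(weights, biases, inputs):
    """Count distinct ReLU on/off patterns reached by the given inputs."""
    patterns = set()
    for x in inputs:
        h = np.asarray(x, dtype=float)
        bits = []
        for w, b in zip(weights, biases):
            pre = w @ h + b                  # pre-activations of this layer
            bits.extend((pre > 0).tolist())  # which units are active
            h = np.maximum(pre, 0.0)         # ReLU output feeds the next layer
        patterns.add(tuple(bits))
    return len(patterns)

# Hypothetical network: one hidden layer with two ReLU units on a 1-D input.
# The units switch on at x = 0 and x = 1, which splits the line into 3 regions.
weights = [np.array([[1.0], [1.0]])]
biases = [np.array([0.0, -1.0])]
samples = np.linspace(-2.0, 3.0, 101).reshape(-1, 1)
region_estimate = count_activation_patterns(weights, biases, samples)
```

Each pattern corresponds to one convex polytope of the input space, so every pattern found identifies a linear subproblem that could be verified independently.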
【26】 Automated Biodesign Engineering by Abductive Meta-Interpretive Learning
作者:Wang-Zhou Dai,Liam Hallett,Stephen H. Muggleton,Geoff S. Baldwin
机构:Department of Computing, Imperial College London, SW,AZ, UK., Department of Life Science, Imperial College London, SW,AZ, UK.
备注:Accepted by SSS-21 (AAAI Spring Symposium Series 2021), Artificial Intelligence for Synthetic Biology (AI4Synbio) track
摘要:The application of Artificial Intelligence (AI) to synthetic biology will provide the foundation for the creation of a high-throughput automated platform for genetic design, in which a learning machine is used to iteratively optimise the system through a design-build-test-learn (DBTL) cycle. However, mainstream machine learning techniques represented by deep learning lack the capability to represent relational knowledge and require prodigious amounts of annotated training data. These drawbacks strongly restrict AI's role in synthetic biology, in which experimentation is inherently resource- and time-intensive. In this work, we propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_{Abd}$), a novel machine learning approach that combines symbolic and sub-symbolic machine learning, to further enhance the DBTL cycle by enabling the learning machine to 1) exploit domain knowledge and learn human-interpretable models that are expressed by formal languages such as first-order logic; 2) simultaneously optimise the structure and parameters of the models to make accurate numerical predictions; 3) reduce the cost of experiments and effort on data annotation by actively generating hypotheses and examples. To verify the effectiveness of $Meta_{Abd}$, we have modelled a synthetic dataset for the production of proteins from a three-gene operon in a microbial host, which represents a common synthetic biology problem.
【27】 Explicit Semantic Cross Feature Learning via Pre-trained Graph Neural Networks for CTR Prediction
作者:Feng Li,Bencheng Yan,Qingqing Long,Pengjie Wang,Wei Lin,Jian Xu,Bo Zheng
机构:Alibaba Group
备注:SIGIR 2021, 5 pages; The first two authors contributed equally to this work; Pengjie Wang gave a lot of guidance in this work
摘要:Cross features play an important role in click-through rate (CTR) prediction. Most of the existing methods adopt a DNN-based model to capture the cross features in an implicit manner. These implicit methods may lead to suboptimal performance due to their limitations in explicit semantic modeling. Although traditional statistical explicit semantic cross features can address the problem in these implicit methods, they still suffer from some challenges, including lack of generalization and expensive memory cost. Few works focus on tackling these challenges. In this paper, we take the first step in learning the explicit semantic cross features and propose Pre-trained Cross Feature learning Graph Neural Networks (PCF-GNN), a GNN-based pre-trained model aiming at generating cross features in an explicit fashion. Extensive experiments are conducted on both public and industrial datasets, where PCF-GNN shows competence in both performance and memory efficiency in various tasks.
【28】 Towards a Better Tradeoff between Effectiveness and Efficiency in Pre-Ranking: A Learnable Feature Selection based Approach
作者:Xu Ma,Pengjie Wang,Hui Zhao,Shaoguo Liu,Chuhan Zhao,Wei Lin,Kuang-Chih Lee,Jian Xu,Bo Zheng
机构:Alibaba Group
摘要:In real-world search, recommendation, and advertising systems, a multi-stage ranking architecture is commonly adopted. Such an architecture usually consists of matching, pre-ranking, ranking, and re-ranking stages. In the pre-ranking stage, vector-product based models with representation-focused architecture are commonly adopted to account for system efficiency. However, this brings a significant loss to the effectiveness of the system. In this paper, a novel pre-ranking approach is proposed which supports complicated models with interaction-focused architecture. It achieves a better tradeoff between effectiveness and efficiency by utilizing the proposed learnable Feature Selection method based on feature Complexity and variational Dropout (FSCD). Evaluations in a real-world e-commerce sponsored search system for a search engine demonstrate that the proposed pre-ranking approach significantly improves the effectiveness of the system. Moreover, it consumes the same amount of computational resources as systems with conventional pre-ranking models.
【29】 Approximate Novelty Search
作者:Anubhav Singh,Nir Lipovetzky,Miquel Ramirez,Javier Segovia-Aguas
机构: School of Computing and Information Systems, University of Melbourne, Australia, Electrical and Electronic Engineering, University of Melbourne, Australia, Dept. Information and Communication Technologies, Universitat Pompeu Fabra, Spain
摘要:Width-based search algorithms seek plans by prioritizing states according to a suitably defined measure of novelty that maps states into a set of novelty categories. The space and time complexity of evaluating state novelty is known to be exponential in the cardinality of the set. We present novel methods to obtain polynomial approximations of novelty and width-based search. First, we approximate novelty computation via random sampling and Bloom filters, reducing the runtime and memory footprint. Second, we approximate the best-first search using an adaptive policy that decides whether to forgo the expansion of nodes in the open list. These two techniques are integrated into existing width-based algorithms, resulting in new planners that perform significantly better than other state-of-the-art planners over benchmarks from the International Planning Competitions.
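The Bloom-filter idea in this abstract can be illustrated with a small novelty check: a state counts as novel if it contains a tuple of atoms that has not been recorded before, and the record of seen tuples is kept in a Bloom filter instead of an exact table. This is a rough sketch under assumptions: the class and function names are hypothetical, the atoms are made-up strings, and the random sampling of tuples described in the paper is omitted.

```python
import hashlib
from itertools import combinations

class BloomFilter:
    """Fixed-size bit array; membership tests may give false positives only."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(repr((seed, item)).encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

def is_novel(state_atoms, seen, width=1):
    """Return True if some size-`width` tuple of atoms is unseen; record all tuples."""
    novel = False
    for combo in combinations(sorted(state_atoms), width):
        if combo not in seen:
            novel = True
            seen.add(combo)
    return novel

seen = BloomFilter()
first = is_novel({"at(a)", "holding(b)"}, seen)   # new atoms
repeat = is_novel({"at(a)", "holding(b)"}, seen)  # same atoms again
third = is_novel({"at(a)", "at(c)"}, seen)        # one new atom
```

A false positive in the filter can make a genuinely novel state look already seen, which is the approximation being traded for a fixed memory footprint; a state whose tuples were all recorded is never reported as novel.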
【30】 OntoEA: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding
作者:Yuejia Xiang,Ziheng Zhang,Jiaoyan Chen,Xi Chen,Zhenxi Lin,Yefeng Zheng
机构:Tencent Jarvis Lab, Shenzhen, China, Department of Computer Science, University of Oxford, UK
摘要:Semantic embedding has been widely investigated for aligning knowledge graph (KG) entities. Current methods have explored and utilized the graph structure, the entity names and attributes, but ignore the ontology (or ontological schema) which contains critical meta information such as classes and their membership relationships with entities. In this paper, we propose an ontology-guided entity alignment method named OntoEA, where both KGs and their ontologies are jointly embedded, and the class hierarchy and the class disjointness are utilized to avoid false mappings. Extensive experiments on seven public and industrial benchmarks have demonstrated the state-of-the-art performance of OntoEA and the effectiveness of the ontologies.
【31】 Continual Learning with Echo State Networks
作者:Andrea Cossu,Davide Bacciu,Antonio Carta,Claudio Gallicchio,Vincenzo Lomonaco
机构:University of Pisa, Department of Computer Science, Largo B. Pontecorvo, Pisa, Italy; Scuola Normale Superiore, Piazza dei Cavalieri, Pisa, Italy
备注:In review at ESANN 2021
摘要:Continual Learning (CL) refers to a learning setup where data is non-stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns revolves around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catastrophic forgetting in ESNs and we highlight the benefits of using CL strategies which are not applicable to trained recurrent models. Our results confirm the ESN as a promising model for CL and open the way to its use in streaming scenarios.
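The key property mentioned here, a recurrent component that is kept fixed, can be sketched with a minimal Echo State Network in which only a linear readout is fitted. Everything specific in this sketch is an assumption for illustration: the reservoir size, the spectral radius of 0.9, the ridge regression readout, and the sine-wave task are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_reservoir = 1, 50

# Random input and recurrent weights; these are never trained.
w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
w_res = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
# Rescale so the largest eigenvalue magnitude (spectral radius) is 0.9.
w_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_res)))

def run_reservoir(inputs):
    """Collect the reservoir state after each input step."""
    state = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        state = np.tanh(w_in @ u + w_res @ state)
        states.append(state)
    return np.array(states)

def fit_readout(states, targets, ridge=1e-4):
    """Fit the linear readout by ridge regression (the only trained part)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

# Toy task: predict the next value of a sine wave.
signal = np.sin(np.linspace(0.0, 20.0, 400)).reshape(-1, 1)
states = run_reservoir(signal[:-1])
readout = fit_readout(states, signal[1:])
mse = float(np.mean((states @ readout - signal[1:]) ** 2))
```

Because the reservoir weights never change, any forgetting across tasks can only arise in the readout, which is why strategies that operate on a fixed feature extractor become applicable.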
【32】 Dependency Parsing as MRC-based Span-Span Prediction
作者:Leilei Gan,Yuxian Meng,Kun Kuang,Xiaofei Sun,Chun Fan,Fei Wu,Jiwei Li
机构:Zhejiang University, Peking University, Peng Cheng Laboratory, Shannon.AI
摘要:Higher-order methods for dependency parsing can partially but not fully address the issue that edges in a dependency tree should be constructed at the text span/subtree level rather than the word level. This shortcoming can cause an incorrect span to be covered by the tree rooted at a certain word, even though the word is correctly linked to its head. In this paper, we propose a new method for dependency parsing to address this issue. The proposed method constructs dependency trees by directly modeling span-span (in other words, subtree-subtree) relations. It consists of two modules: the text span proposal module, which proposes candidate text spans, each of which represents a subtree in the dependency tree denoted by (root, start, end); and the span linking module, which constructs links between proposed spans. We use the machine reading comprehension (MRC) framework as the backbone to formalize the span linking module in an MRC setup, where one span is used as a query to extract the text span/subtree it should be linked to. The proposed method comes with the following merits: (1) it addresses the fundamental problem that edges in a dependency tree should be constructed between subtrees; (2) the MRC framework allows the method to retrieve missing spans in the span proposal stage, which leads to higher recall for eligible spans. Extensive experiments on the PTB, CTB and Universal Dependencies (UD) benchmarks demonstrate the effectiveness of the proposed method. We are able to achieve new SOTA performance on the PTB and UD benchmarks, and performance competitive with previous SOTA models on the CTB dataset. Code is available at https://github.com/ShannonAI/mrc-for-dependency-parsing.
【33】 Traffic-Aware Service Relocation in Cloud-Oriented Elastic Optical Networks
作者:Róża Goścień
机构: Department of Systems and Computer Networks, Wroclaw University of Science and Technology
摘要:In this paper, we study the problem of efficient service relocation (i.e., changing the assigned data center for a selected client node) in elastic optical networks (EONs) in order to increase network performance (measured by the volume of accepted traffic). To this end, we first propose a novel traffic model for cloud-ready transport networks. The model takes into account four flow types (i.e., city-to-city, city-to-data center, data center-to-data center and data center-to-data center) while the flow characteristics are based on real economical and geographical parameters of the cities related to network nodes. Then, we propose a dedicated flow allocation algorithm that can be supported by the service relocation process. We also introduce 21 different relocation policies, which use three types of data for decision making: network topological characteristics, rejection history and traffic prediction. Finally, we perform extensive numerical experiments in order to (i) tune the proposed optimization approaches and (ii) evaluate and compare their efficiency and select the best one. The results of the investigation demonstrate the high efficiency of the proposed policies. A properly designed relocation policy made it possible to allocate up to 3% more traffic (compared to the allocation without that policy). The results also reveal that the most efficient relocation policy bases its decisions on two types of data simultaneously: the rejection history and traffic prediction.
【34】 A Formal Framework for Reasoning about Agents' Independence in Self-organizing Multi-agent Systems
作者:Jieting Luo,Beishui Liao,John-Jules Meyer
机构: Zhejiang University, Hangzhou, Zhejiang Province, China, Utrecht University, Utrecht, the Netherlands
摘要:Self-organization is a process where a stable pattern is formed by the cooperative behavior between parts of an initially disordered system without external control or influence. It has been introduced to multi-agent systems as an internal control process or mechanism to solve difficult problems spontaneously. However, because a self-organizing multi-agent system has autonomous agents and local interactions between them, it is difficult to predict the behavior of the system from the behavior of the local agents we design. This paper proposes a logic-based framework of self-organizing multi-agent systems, where agents interact with each other by following their prescribed local rules. The dependence relation between coalitions of agents regarding their contributions to the global behavior of the system is reasoned about from the structural and semantic perspectives. We show that the computational complexity of verifying such a self-organizing multi-agent system remains close to the domain of standard ATL. We then combine our framework with graph theory to decompose a system into different coalitions located in different layers, which allows us to verify agents' full contributions more efficiently. The resulting information about agents' full contributions allows us to understand the complex link between local agent behavior and system level behavior in a self-organizing multi-agent system. Finally, we show how we can use our framework to model a constraint satisfaction problem.
【35】 DOC3-Deep One Class Classification using Contradictions
作者:Sauptik Dhar,Bernardo Gonzalez Torres
机构: USA; University of California
备注:Deep Learning, Anomaly Detection, Visual Inspection, Learning from Contradictions, Outlier Exposure, 18 pages, 14 tables, 6 Figures
摘要:This paper introduces the notion of learning from contradictions (a.k.a. Universum learning) for deep one-class classification problems. We formalize this notion for the widely adopted one-class large-margin loss, and propose the Deep One Class Classification using Contradictions (DOC3) algorithm. We show that learning from contradictions incurs lower generalization error by comparing the Empirical Rademacher Complexity (ERC) of DOC3 against its traditional inductive learning counterpart. Our empirical results demonstrate the efficacy of the DOC3 algorithm, which improves test AUC by more than 30% on CIFAR-10 and more than 50% on the MV-Tec AD data sets compared to its inductive learning counterpart, and in many cases improves the state of the art in anomaly detection.
【36】 Convex optimization for actionable & plausible counterfactual explanations
作者:André Artelt,Barbara Hammer
机构:CITEC - Cognitive Interaction Technology, Inspiration, Bielefeld, Germany
摘要:Transparency is an essential requirement of machine learning based decision making systems that are deployed in the real world. Often, transparency of a given system is achieved by providing explanations of the behavior and predictions of the given system. Counterfactual explanations are a prominent instance of particularly intuitive explanations of decision making systems. While many different methods for computing counterfactual explanations exist, only very few works (apart from work from the causality domain) consider feature dependencies as well as plausibility, which might limit the set of possible counterfactual explanations. In this work, we enhance our previous work on convex modeling for computing counterfactual explanations with a mechanism for ensuring actionability and plausibility of the resulting counterfactual explanations.
【37】 TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance
作者:Fengbin Zhu,Wenqiang Lei,Youcheng Huang,Chao Wang,Shuo Zhang,Jiancheng Lv,Fuli Feng,Tat-Seng Chua
机构:National University of Singapore, Estates Pte Ltd, Sichuan University, Bloomberg
备注:Accepted by ACL 2021
摘要:Hybrid data combining both tabular and textual content (e.g., financial reports) are quite pervasive in the real world. However, Question Answering (QA) over such hybrid data is largely neglected in existing research. In this work, we extract samples from real financial reports to build a new large-scale QA dataset containing both Tabular And Textual data, named TAT-QA, where numerical reasoning is usually required to infer the answer, such as addition, subtraction, multiplication, division, counting, comparison/sorting, and their compositions. We further propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text. It adopts sequence tagging to extract relevant cells from the table along with relevant spans from the text to infer their semantics, and then applies symbolic reasoning over them with a set of aggregation operators to arrive at the final answer. TAGOP achieves 58.0% in F1, which is an 11.1% absolute increase over the previous best baseline model, according to our experiments on TAT-QA. But this result still lags far behind the performance of expert humans, i.e., 90.8% in F1. This demonstrates that TAT-QA is very challenging and can serve as a benchmark for training and testing powerful QA models that address hybrid-form data.
【38】 Towards Unsupervised Domain Adaptation for Deep Face Recognition under Privacy Constraints via Federated Learning
作者:Weiming Zhuang,Xin Gan,Yonggang Wen,Xuesen Zhang,Shuai Zhang,Shuai Yi
机构:Nanyang Technological University, Singapore, SenseTime Research, China
摘要:Unsupervised domain adaptation has been widely adopted to generalize models for unlabeled data in a target domain, given labeled data in a source domain, whose data distributions differ from the target domain. However, existing works are inapplicable to face recognition under privacy constraints because they require sharing sensitive face images between two domains. To address this problem, we propose a novel unsupervised federated face recognition approach (FedFR). FedFR improves the performance in the target domain by iteratively aggregating knowledge from the source domain through federated learning. It protects data privacy by transferring models instead of raw data between domains. Besides, we propose a new domain constraint loss (DCL) to regularize source domain training. DCL suppresses the data volume dominance of the source domain. We also enhance a hierarchical clustering algorithm to predict pseudo labels for the unlabeled target domain accurately. To this end, FedFR forms an end-to-end training pipeline: (1) pre-train in the source domain; (2) predict pseudo labels by clustering in the target domain; (3) conduct domain-constrained federated learning across two domains. Extensive experiments and analysis on two newly constructed benchmarks demonstrate the effectiveness of FedFR. It outperforms the baseline and classic methods in the target domain by over 4% on the more realistic benchmark. We believe that FedFR will shed light on applying federated learning to more computer vision tasks under privacy constraints.
【39】 EasyFL: A Low-code Federated Learning Platform For Dummies
作者:Weiming Zhuang,Xin Gan,Yonggang Wen,Shuai Zhang
机构:Nanyang Technological University, Singapore, SenseTime Research, China
摘要:Academia and industry have developed several platforms to support the popular privacy-preserving distributed learning method -- Federated Learning (FL). However, these platforms are complex to use and require a deep understanding of FL, which imposes high barriers to entry for beginners, limits the productivity of data scientists, and compromises deployment efficiency. In this paper, we propose the first low-code FL platform, EasyFL, to enable users with various levels of expertise to experiment with and prototype FL applications with little coding. We achieve this goal while ensuring great flexibility for customization by unifying simple API design, modular design, and granular training flow abstraction. With only a few lines of code, EasyFL empowers users with many out-of-the-box functionalities to accelerate experimentation and deployment. These practical functionalities are heterogeneity simulation, distributed training optimization, comprehensive tracking, and seamless deployment. They are proposed based on challenges identified in the proposed FL life cycle. Our implementations show that EasyFL requires only three lines of code to build a vanilla FL application, at least 10x fewer than other platforms. Besides, our evaluations demonstrate that EasyFL expedites training by 1.5x. It also improves the efficiency of experiments and deployment. We believe that EasyFL will increase the productivity of data scientists and democratize FL to wider audiences.
【40】 Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation
作者:Peter Karkus,Shaojun Cai,David Hsu
机构:National University of Singapore
备注:CVPR 2021
摘要:Simultaneous localization and mapping (SLAM) remains challenging for a number of downstream applications, such as visual robot navigation, because of rapid turns, featureless walls, and poor camera quality. We introduce the Differentiable SLAM Network (SLAM-net) along with a navigation architecture to enable planar robot navigation in previously unseen indoor environments. SLAM-net encodes a particle filter based SLAM algorithm in a differentiable computation graph, and learns task-oriented neural network components by backpropagating through the SLAM algorithm. Because it can optimize all model components jointly for the end-objective, SLAM-net learns to be robust in challenging conditions. We run experiments in the Habitat platform with different real-world RGB and RGB-D datasets. SLAM-net significantly outperforms the widely adopted ORB-SLAM in noisy conditions. Our navigation architecture with SLAM-net improves the state of the art for the Habitat Challenge 2020 PointNav task by a large margin (37% to 64% success). Project website: http://sites.google.com/view/slamnet
【41】 Monitoring electrical systems data-network equipment by means of Fuzzy and Paraconsistent Annotated Logic
作者:Hyghor Miranda Cortes,Paulo Eduardo Santos,Joao Inacio da Silva Filho
机构: Centro Universitário FEI; School of Science and Engineering, Flinders University; Universidade Santa Cecília
备注:38 pages; 14 figures; Under submission
摘要:The constant increase in the amount and complexity of information obtained from IT data network elements, for its correct monitoring and management, is a reality. The same happens to data networks in electrical systems that provide effective supervision and control of substations and hydroelectric plants. Contributing to this fact is the growing number of installations and new environments monitored by such data networks and the constant evolution of the technologies involved. This situation potentially leads to incomplete and/or contradictory data, issues that must be addressed in order to maintain a good level of monitoring and, consequently, management of these systems. In this paper, a prototype of an expert system is developed to monitor the status of equipment of data networks in electrical systems, which deals with inconsistencies without trivialising the inferences. This is accomplished in the context of the remote control of hydroelectric plants and substations by a Regional Operation Centre (ROC). The expert system is developed with algorithms defined upon a combination of Fuzzy logic and Paraconsistent Annotated Logic with Annotation of Two Values (PAL2v) in order to analyse uncertain signals and generate the operating conditions (faulty, normal, unstable or inconsistent / indeterminate) of the equipment that are identified as important for the remote control of hydroelectric plants and substations. A prototype of this expert system was installed on a virtualised server with CLP500 software (from the EFACEC manufacturer) that was applied to investigate scenarios consisting of a Regional (Brazilian) Operation Centre, with a Generic Substation and a Generic Hydroelectric Plant, representing a remote control environment.
【42】 Collaborative Graph Learning with Auxiliary Text for Temporal Event Prediction in Healthcare
作者:Chang Lu,Chandan K. Reddy,Prithwish Chakraborty,Samantha Kleinberg,Yue Ning
机构:Department of Computer Science, Stevens Institute of Technology, Department of Computer Science, Virginia Tech, IBM Research
摘要:Accurate and explainable health event predictions are becoming crucial for healthcare providers to develop care plans for patients. The availability of electronic health records (EHR) has enabled machine learning advances in providing these predictions. However, many deep learning based methods are not satisfactory in solving several key challenges: 1) effectively utilizing disease domain knowledge; 2) collaboratively learning representations of patients and diseases; and 3) incorporating unstructured text. To address these issues, we propose a collaborative graph learning model to explore patient-disease interactions and medical domain knowledge. Our solution is able to capture structural features of both patients and diseases. The proposed model also utilizes unstructured text data by employing an attention regulation strategy and then integrates attentive text features into a sequential learning process. We conduct extensive experiments on two important healthcare problems to show the competitive prediction performance of the proposed method compared with various state-of-the-art models. We also confirm the effectiveness of learned representations and model interpretability by a set of ablation and case studies.
【43】 Private Facial Diagnosis as an Edge Service for Parkinson's DBS Treatment Valuation
作者:Richard Jiang,Paul Chazot,Danny Crookes,Ahmed Bouridane,M Emre Celebi
机构: Durham University, United Kingdom; Department of Computer Science, Queen's University Belfast
备注:Under review
摘要:Facial phenotyping has recently been successfully exploited for medical diagnosis as a novel way to diagnose a range of diseases, where facial biometrics has been revealed to have rich links to underlying genetic or medical causes. In this paper, taking Parkinson's Disease (PD) as a case study, we propose an Artificial-Intelligence-of-Things (AIoT) edge-oriented privacy-preserving facial diagnosis framework to analyze the treatment of Deep Brain Stimulation (DBS) on PD patients. In the proposed framework, a new edge-based, information-theoretically secure framework is proposed to implement private deep facial diagnosis as a service over a privacy-preserving AIoT-oriented multi-party communication scheme, where partial homomorphic encryption (PHE) is leveraged to enable privacy-preserving deep facial diagnosis directly on encrypted facial patterns. In our experiments with a collected facial dataset from PD patients, we demonstrated for the first time that facial patterns could be used to evaluate the improvement of PD patients undergoing DBS treatment. We further implemented a privacy-preserving deep facial diagnosis framework that can achieve the same accuracy as the non-encrypted one, showing the potential of our privacy-preserving facial diagnosis as a trustworthy edge service for grading the severity of PD in patients.
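To give a feel for what partial homomorphic encryption allows, the sketch below uses a toy Paillier cryptosystem, which is additively homomorphic: ciphertexts can be combined so that the hidden plaintexts are added or scaled without decryption. The abstract does not say which PHE scheme is used, so Paillier is an assumption here, and the tiny primes, function names, and values are purely illustrative and offer no real security. The modular inverse via `pow(x, -1, n)` requires Python 3.8 or later.

```python
import math
import random

# Toy primes for illustration only; real keys use primes of 1024+ bits.
P, Q = 293, 433
N = P * Q                                           # public modulus
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(p-1, q-1), private
MU = pow(LAM, -1, N)                                # valid for generator g = N + 1

def encrypt(message, rng):
    """Encrypt an integer 0 <= message < N under the public key N."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(ciphertext):
    """Recover the plaintext with the private values LAM and MU."""
    x = pow(ciphertext, LAM, N_SQ)
    return ((x - 1) // N) * MU % N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo N."""
    return (c1 * c2) % N_SQ

rng = random.Random(0)  # not a secure source of randomness; demo only
enc_a = encrypt(17, rng)
enc_b = encrypt(25, rng)
total = decrypt(add_encrypted(enc_a, enc_b))   # 17 + 25
scaled = decrypt(pow(enc_a, 3, N_SQ))          # 3 * 17
```

Addition and multiplication by a known constant are enough to evaluate a linear layer on encrypted features, which is the kind of operation an edge service would need to apply to encrypted facial patterns.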
【44】 DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling
作者:Yuping Fan,Zhiling Lan
机构:Illinois Institute of Technology, Chicago, IL
摘要:For decades, system administrators have been striving to design and tune cluster scheduling policies to improve the performance of high performance computing (HPC) systems. However, the increasingly complex HPC systems combined with highly diverse workloads make such a manual process challenging, time-consuming, and error-prone. We present a reinforcement learning based HPC scheduling framework named DRAS-CQSim to automatically learn an optimal scheduling policy. DRAS-CQSim encapsulates simulation environments, agents, hyperparameter tuning options, and different reinforcement learning algorithms, which allows system administrators to quickly obtain customized scheduling policies.
【45】 Graph-Free Knowledge Distillation for Graph Neural Networks
作者:Xiang Deng,Zhongfei Zhang
机构:State University of New York at Binghamton
摘要:Knowledge distillation (KD) transfers knowledge from a teacher network to a student by forcing the student to mimic the outputs of the pretrained teacher on training data. However, data samples are not always accessible in many cases due to large data sizes, privacy, or confidentiality. Many efforts have been made to address this problem for convolutional neural networks (CNNs), whose inputs lie in a grid domain within a continuous space such as images and videos, but they largely overlook graph neural networks (GNNs), which handle non-grid data with different topology structures within a discrete space. The inherent differences between their inputs make these CNN-based approaches not applicable to GNNs. In this paper, we propose, to the best of our knowledge, the first dedicated approach to distilling knowledge from a GNN without graph data. The proposed graph-free KD (GFKD) learns graph topology structures for knowledge transfer by modeling them with a multinomial distribution. We then introduce a gradient estimator to optimize this framework. Essentially, the gradients w.r.t. graph structures are obtained by only using GNN forward-propagation without back-propagation, which means that GFKD is compatible with modern GNN libraries such as DGL and Geometric. Moreover, we provide strategies for handling different types of prior knowledge in the graph data or the GNNs. Extensive experiments demonstrate that GFKD achieves state-of-the-art performance for distilling knowledge from GNNs without training data.
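As a rough illustration of what "mimicking the teacher's outputs" usually means, the sketch below computes the generic soft-target distillation loss (a temperature-scaled KL divergence between teacher and student outputs). It is not GFKD itself: the function names and the temperature value are assumptions made for this example, and the graph-free part of the method (learning graph structures without data) is not shown.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    The T^2 factor is the common convention for keeping gradient
    magnitudes comparable across temperatures (an assumption here).
    """
    p = softmax(teacher_logits, temperature)  # soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student that matches the teacher exactly incurs (near) zero loss;
# a student with flat outputs incurs a positive loss.
print(kd_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0]))
print(kd_loss([0.0, 0.0, 0.0], [2.0, 0.0, -1.0]))
```

In a data-free setting the inputs fed to both networks would have to be synthesized, which is the part the abstract attributes to GFKD.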
【46】 Decision Making with Differential Privacy under a Fairness Lens
作者:Ferdinando Fioretto,Cuong Tran,Pascal Van Hentenryck
机构:Syracuse University, Georgia Institute of Technology
备注:This paper is an extended version of the homonymous one, accepted at IJCAI-21
摘要:Agencies, such as the U.S. Census Bureau, release data sets and statistics about groups of individuals that are used as input to a number of critical decision processes. To conform to privacy and confidentiality requirements, these agencies are often required to release privacy-preserving versions of the data. This paper studies the release of differentially private data sets and analyzes their impact on some critical resource allocation tasks from a fairness perspective. The paper shows that, when the decisions take differentially private data as input, the noise added to achieve privacy disproportionately impacts some groups over others. The paper analyzes the reasons for these disproportionate impacts and proposes guidelines to mitigate these effects. The proposed approaches are evaluated on critical decision problems that use differentially private census data.
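One simple way to see how privacy noise can weigh more heavily on some groups is to add Laplace noise of the same scale to a small and a large count and compare relative errors. The sketch below does only that. It assumes the standard Laplace mechanism with sensitivity 1, and the function names, counts, and epsilon are illustrative choices, not taken from the paper, whose analysis of allocation decisions is more general.

```python
import random

def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    The difference of two i.i.d. exponential draws with mean `scale`
    follows a Laplace(0, scale) distribution.
    """
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

def mean_relative_error(true_count, epsilon, trials=20000, seed=0):
    """Monte Carlo estimate of E[|noisy - true|] / true."""
    rng = random.Random(seed)
    total = sum(
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials / true_count

# Same privacy budget, very different relative distortion:
small_group = mean_relative_error(50, epsilon=0.1)
large_group = mean_relative_error(5000, epsilon=0.1)
print(small_group, large_group)
```

Because the noise scale depends only on epsilon and the sensitivity, the expected absolute error is the same for both groups, so the relative error shrinks in proportion to group size.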
【47】 Substitutional Neural Image Compression
作者:Xiao Wang,Wei Jiang,Wei Wang,Shan Liu,Brian Kulis,Peter Chin
摘要:We describe Substitutional Neural Image Compression (SNIC), a general approach for enhancing any neural image compression model that requires no data or additional tuning of the trained model. It boosts compression performance toward a flexible distortion metric and enables bit-rate control using a single model instance. The key idea is to replace the image to be compressed with a substitutional one that outperforms the original one in a desired way. Finding such a substitute is inherently difficult for conventional codecs, yet surprisingly favorable for neural compression models thanks to their fully differentiable structures. With gradients of a particular loss backpropagated to the input, a desired substitute can be efficiently crafted iteratively. We demonstrate the effectiveness of SNIC, when combined with various neural compression models and target metrics, in improving compression quality and performing bit-rate control measured by rate-distortion curves. Empirical results of control precision and generation speed are also discussed.
【48】 Doc2Dict: Information Extraction as Text Generation
作者:Benjamin Townsend,Eamon Ito-Fisher,Lily Zhang,Madison May
机构:Indico Data Solutions, Franklin W. Olin College of Engineering, New York University
摘要:Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, a model predicts spans which are then post-processed and standardized to convert the information into a database entry. We replace this labor-intensive workflow with a transformer language model trained on existing database records to directly generate structured JSON. Our solution removes the workload associated with producing token-level annotations and takes advantage of a data source which is generally quite plentiful (e.g. database records). As long documents are common in information extraction tasks, we use gradient checkpointing and chunked encoding to apply our method to sequences of up to 32,000 tokens on a single GPU. Our Doc2Dict approach is competitive with more complex, hand-engineered pipelines and offers a simple but effective baseline for document-level information extraction. We release our Doc2Dict model and code to reproduce our experiments and facilitate future work.
【49】 Abstraction, Validation, and Generalization for Explainable Artificial Intelligence
作者:Scott Cheng-Hsin Yang,Tomas Folke,Patrick Shafto
机构:Department of Mathematics and Computer Science, Rutgers University, Warren Street, Newark, NJ
摘要:Neural network architectures are achieving superhuman performance on an expanding range of tasks. To effectively and safely deploy these systems, their decision-making must be understandable to a wide range of stakeholders. Methods to explain AI have been proposed to answer this challenge, but a lack of theory impedes the development of systematic abstractions which are necessary for cumulative knowledge gains. We propose Bayesian Teaching as a framework for unifying explainable AI (XAI) by integrating machine learning and human learning. Bayesian Teaching formalizes explanation as a communication act of an explainer to shift the beliefs of an explainee. This formalization decomposes any XAI method into four components: (1) the inference to be explained, (2) the explanatory medium, (3) the explainee model, and (4) the explainer model. The abstraction afforded by Bayesian Teaching to decompose any XAI method elucidates the invariances among them. The decomposition of XAI systems enables modular validation, as each of the first three components listed can be tested semi-independently. This decomposition also promotes generalization through recombination of components from different XAI systems, which facilitates the generation of novel variants. These new variants need not be evaluated one by one provided that each component has been validated, leading to an exponential decrease in development time. Finally, by making the goal of explanation explicit, Bayesian Teaching helps developers to assess how suitable an XAI system is for its intended real-world use case. Thus, Bayesian Teaching provides a theoretical framework that encourages systematic, scientific investigation of XAI.
【50】 Uncertainty in Minimum Cost Multicuts for Image and Motion Segmentation
作者:Amirhossein Kardoost,Margret Keuper
机构:Data and Web Science Group, University of Mannheim, Germany
备注:Accepted in the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)
摘要:The minimum cost lifted multicut approach has shown good practical performance in a wide range of applications such as image decomposition, mesh segmentation, multiple object tracking, and motion segmentation. It addresses such problems in a graph-based model, where real-valued costs are assigned to the edges between entities such that the minimum cut decomposes the graph into an optimal number of segments. Driven by a probabilistic formulation of minimum cost multicuts, we provide a measure for the uncertainties of the decisions made during the optimization. We argue that access to such uncertainties is crucial for many practical applications and conduct an evaluation by means of sparsifications on three different, widely used datasets in the context of image decomposition (BSDS-500) and motion segmentation (DAVIS2016 and FBMS59) in terms of variation of information (VI) and Rand index (RI).
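The two evaluation metrics named above can be computed directly from a pair of label assignments. The sketch below uses the standard definitions (RI as the fraction of element pairs on which two segmentations agree, VI as H(A|B) + H(B|A) in nats). The function names and toy labelings are assumptions for illustration, and real benchmark evaluation code may differ in details such as boundary handling.

```python
import math
from collections import Counter
from itertools import combinations

def rand_index(a, b):
    """Fraction of element pairs grouped consistently by both labelings.

    Requires at least two elements.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def variation_of_information(a, b):
    """VI(A, B) = H(A|B) + H(B|A), in nats; 0 means identical partitions."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    vi = 0.0
    for (x, y), n_xy in count_ab.items():
        p_xy = n_xy / n
        vi -= p_xy * (math.log(p_xy / (count_a[x] / n))
                      + math.log(p_xy / (count_b[y] / n)))
    return vi

# Label names do not matter, only the induced partition does.
same = ([0, 0, 1, 1], [1, 1, 0, 0])
crossed = ([0, 0, 1, 1], [0, 1, 0, 1])
print(rand_index(*same), variation_of_information(*same))
print(rand_index(*crossed), variation_of_information(*crossed))
```

Higher RI and lower VI both indicate closer agreement with the reference segmentation.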
【51】 Few-NERD: A Few-Shot Named Entity Recognition Dataset
作者:Ning Ding,Guangwei Xu,Yulin Chen,Xiaobin Wang,Xu Han,Pengjun Xie,Hai-Tao Zheng,Zhiyuan Liu
机构:Department of Computer Science and Technology, Tsinghua University, Alibaba Group, Shenzhen International Graduate School, Tsinghua University
备注:Accepted by ACL-IJCNLP 2021, accepted version
摘要:Recently, considerable literature has grown up around the theme of few-shot named entity recognition (NER), but there is little published benchmark data specifically focused on this practical and challenging task. Current approaches collect existing supervised NER datasets and reorganize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice, most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia, comprising 4,601,160 words, each annotated as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.
【52】 3D to 4D Facial Expressions Generation Guided by Landmarks
作者:Naima Otberdout,Claudio Ferrari,Mohamed Daoudi,Stefano Berretti,Alberto Del Bimbo
机构:Univ. Lille, CNRS, Centrale Lille, UMR , CRIStAL, F-, Lille, France, Media Integration and Communication Center, University of Florence, Italy, IMT Lille Douai, Institut Mines-Télécom, Univ. Lille, Centre for Digital Systems, F-, Lille, France
摘要:While deep learning-based 3D face generation has made progress recently, the problem of dynamic 3D (4D) facial expression synthesis is less investigated. In this paper, we propose a novel solution to the following question: given one input 3D neutral face, can we generate dynamic 3D (4D) facial expressions from it? To tackle this problem, we first propose a mesh encoder-decoder architecture (Expr-ED) that exploits a set of 3D landmarks to generate an expressive 3D face from its neutral counterpart. Then, we extend it to 4D by modeling the temporal dynamics of facial expressions using a manifold-valued GAN capable of generating a sequence of 3D landmarks from an expression label (Motion3DGAN). The generated landmarks are fed into the mesh encoder-decoder, ultimately producing a sequence of 3D expressive faces. By decoupling the two steps, we separately address the non-linearity induced by the mesh deformation and motion dynamics. The experimental results on the CoMA dataset show that our mesh encoder-decoder guided by landmarks brings a significant improvement with respect to other landmark-based 3D fitting approaches, and that we can generate high quality dynamic facial expressions. This framework further enables the 3D expression intensity to be continuously adapted from low to high intensity. Finally, we show our framework can be applied to other tasks, such as 2D-3D facial expression transfer.
【53】 How Can Robots Trust Each Other? A Relative Needs Entropy Based Trust Assessment Models
作者:Qin Yang,Ramviyas Parasuraman
机构: Department of Computer Science, University of Georgia
备注:This paper has already been submitted to the SMC 2021 conference
摘要:Cooperation in multi-agent and multi-robot systems can help agents build various formations, shapes, and patterns presenting corresponding functions and purposes adapting to different situations. Relationships between agents, such as their spatial proximity and functional similarities, could play a crucial role in cooperation between agents. The trust level between agents is an essential factor in evaluating the reliability and stability of their relationships, much as it is among people. This paper proposes a new model called Relative Needs Entropy (RNE) to assess trust between robotic agents. RNE measures the distance of needs distribution between individual agents or groups of agents. To exemplify its utility, we implement and demonstrate our trust model through experiments simulating a heterogeneous multi-robot grouping task in a persistent urban search and rescue mission consisting of tasks at two levels of difficulty. The results suggest that RNE trust-based grouping of robots can achieve better performance and adaptability for diverse task execution compared to the state-of-the-art energy-based or distance-based grouping models.
【54】 Curiosity-driven Intuitive Physics Learning
作者:Tejas Gaikwad,Romi Banerjee
机构:Dept. of Computer Science and Engineering, Indian Institute of Technology Jodhpur, Rajasthan, India
摘要:Biological infants are naturally curious and try to comprehend their physical surroundings by interacting, in myriad multisensory ways, with different objects - primarily macroscopic solid objects - around them. Through their various interactions, they build hypotheses and predictions, and eventually learn, infer and understand the nature of the physical characteristics and behavior of these objects. Inspired thus, we propose a model for curiosity-driven learning and inference for real-world AI agents. This model is based on the arousal of curiosity, deriving from observations along discontinuities in the fundamental macroscopic solid-body physics parameters, i.e., shape constancy, spatial-temporal continuity, and object permanence. We use the term body-budget to represent the perceived fundamental properties of solid objects. The model aims to support the emulation of learning from scratch followed by substantiation through experience, irrespective of domain, in real-world AI agents.
【55】 Resource Planning for Hospitals Under Special Consideration of the COVID-19 Pandemic: Optimization and Sensitivity Analysis
作者:Thomas Bartz-Beielstein,Marcel Dröscher,Alpar Gür,Alexander Hinterleitner,Olaf Mersmann,Dessislava Peeva,Lennard Reese,Nicolas Rehbach,Frederik Rehbach,Amrita Sen,Aleksandr Subbotin,Martin Zaefferer
机构:TH Köln, Cologne, Germany
摘要:Crises like the COVID-19 pandemic pose a serious challenge to health-care institutions. They need to plan the resources required for handling the increased load, for instance, hospital beds and ventilators. To support the resource planning of local health authorities from the Cologne region, BaBSim.Hospital, a tool for capacity planning based on discrete event simulation, was created. The predictive quality of the simulation is determined by 29 parameters. Reasonable default values of these parameters were obtained in detailed discussions with medical professionals. We aim to investigate and optimize these parameters to improve BaBSim.Hospital. First approaches with "out-of-the-box" optimization algorithms failed. Implementing a surrogate-based optimization approach generated useful results in a reasonable time. To understand the behavior of the algorithm and to get valuable insights into the fitness landscape, an in-depth sensitivity analysis was performed. The sensitivity analysis is crucial for the optimization process because it allows focusing the optimization on the most important parameters. We illustrate how this reduces the problem dimension without compromising the resulting accuracy. The presented approach is applicable to many other real-world problems, e.g., the development of new elevator systems to cover the last mile or simulation of student flow in academic study periods.
【56】 Uncertainty Measurement of Basic Probability Assignment Integrity Based on Approximate Entropy in Evidence Theory
作者:Tianxiang Zhan,Yuanpeng He,Hanwen Li,Fuyuan Xiao
机构:School of Computer and Information Science, Southwest University, Chongqing, China
摘要:Evidence theory is an extension of probability theory that can better deal with unknown and inaccurate information. Uncertainty measurement plays a vital role in both evidence theory and probability theory. Approximate Entropy (ApEn) was proposed by Pincus to describe the irregularity of complex systems: the more irregular the time series, the greater the approximate entropy. The ApEn of a network represents the ability of the network to generate new nodes, or the possibility of undiscovered nodes. By associating network characteristics with the basic probability assignment (BPA), a measure of the uncertainty of a BPA regarding its completeness can be obtained. The main contribution of this paper is to define the integrity of the BPA and to propose the approximate entropy of the BPA as a measure of the uncertainty of that integrity. The proposed method uses a logical network structure to calculate the uncertainty of a BPA in evidence theory. The resulting uncertainty represents the uncertainty of the integrity of the BPA and helps identify the credibility of the BPA.
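For readers unfamiliar with ApEn, the sketch below implements Pincus's definition for an ordinary time series: ApEn(m, r) = Φ^m(r) − Φ^{m+1}(r), where Φ^m averages the log-fraction of length-m templates lying within Chebyshev distance r of each template. It covers only the time-series quantity the abstract builds on. The BPA-to-network construction is not specified in the abstract and is not attempted here, and the parameter defaults and test series are assumptions.

```python
import math
import random

def _phi(series, m, r):
    """Average log-fraction of length-m templates within distance r."""
    n = len(series) - m + 1
    templates = [series[i:i + m] for i in range(n)]
    total = 0.0
    for t in templates:
        # Chebyshev distance; each template matches itself, so matches >= 1.
        matches = sum(
            1 for u in templates
            if max(abs(x - y) for x, y in zip(t, u)) <= r
        )
        total += math.log(matches / n)
    return total / n

def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) of a sequence of numbers (quadratic time in its length)."""
    return _phi(series, m, r) - _phi(series, m + 1, r)

rng = random.Random(0)
noisy = [rng.random() for _ in range(200)]   # irregular series
periodic = [0.0, 1.0] * 100                  # perfectly regular series
constant = [1.0] * 50

print(approximate_entropy(constant))
print(approximate_entropy(periodic))
print(approximate_entropy(noisy))
```

The regular series score at or near zero while the random series scores much higher, which is the "more irregular, greater ApEn" behavior the abstract relies on.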
【57】 Set2setRank: Collaborative Set to Set Ranking for Implicit Feedback based Recommendation
作者:Lei Chen,Le Wu,Kun Zhang,Richang Hong,Meng Wang
机构:Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, China, School of Computer Science and Information Engineering, Hefei University of Technology, China
备注:The paper is accepted by SIGIR 2021
摘要:As users often express their preferences with binary behavior data (implicit feedback), such as clicking items or buying products, implicit feedback based Collaborative Filtering (CF) models predict the top ranked items a user might like by leveraging implicit user-item interaction data. For each user, the implicit feedback is divided into two sets: an observed item set with limited observed behaviors, and a large unobserved item set that mixes negative item behaviors and unknown behaviors. Given any user preference prediction model, researchers have either designed ranking based optimization goals or relied on negative item mining techniques for better optimization. Despite the performance gain of these implicit feedback based models, the recommendation results are still far from satisfactory due to the sparsity of the observed item set for each user. To this end, in this paper, we explore the unique characteristics of implicit feedback and propose the Set2setRank framework for recommendation. The optimization criteria of Set2setRank are twofold. First, we design an item-to-item-set comparison that encourages each observed item from the sampled observed set to be ranked higher than any unobserved item from the sampled unobserved set. Second, we model a set-level comparison that encourages a margin between the distance summarized from the observed item set and the most "hard" unobserved item from the sampled negative set. Further, an adaptive sampling technique is designed to implement these two goals. We note that our proposed framework is model-agnostic, can be easily applied to most recommendation prediction approaches, and is time efficient in practice. Finally, extensive experiments on three real-world datasets demonstrate the superiority of our proposed approach.
【58】 Order Effects in Bayesian Updates
作者:Catarina Moreira,Jose Acacio de Barros
机构:School of Information Systems, Queensland University of Technology, Brisbane, Australia, School of Humanities and Liberal Studies, San Francisco State University, San Francisco, CA, USA
摘要:Order effects occur when judgments about a hypothesis's probability given a sequence of information do not equal the probability of the same hypothesis when the information is reversed. Various experiments in the literature provide evidence of order effects. We propose a Bayesian update model for order effects in which each question can be thought of as a mini-experiment where the respondents reflect on their beliefs. We show that order effects appear and that they have a simple cognitive explanation: the respondent's prior belief that two questions are correlated. The proposed Bayesian model allows us to make several predictions: (1) we find certain conditions on the priors that limit the existence of order effects; (2) we show that, for our model, the QQ equality is not necessarily satisfied (due to symmetry assumptions); and (3) the proposed Bayesian model has the advantage of possessing fewer parameters than its quantum counterpart.
【59】 Model-Based Offline Planning with Trajectory Pruning
作者:Xianyuan Zhan,Xiangyu Zhu,Haoran Xu
机构:JD Intelligent Cities Research, Beijing, China, Xidian University, China
摘要:Offline reinforcement learning (RL) enables learning policies using pre-collected datasets without environment interaction, which provides a promising direction to make RL usable in real-world systems. Although recent offline RL studies have achieved much progress, existing methods still face many practical challenges in real-world system control tasks, such as computational restrictions during agent training and the requirement of extra control flexibility. Model-based planning frameworks provide an attractive solution for such tasks. However, most model-based planning algorithms are not designed for offline settings. Simply combining the ingredients of offline RL with existing methods either provides over-restrictive planning or leads to inferior performance. We propose a new lightweight model-based offline planning framework, namely MOPP, which tackles the dilemma between the restrictions of offline learning and high-performance planning. MOPP encourages more aggressive trajectory rollout guided by the behavior policy learned from data, and prunes out problematic trajectories to avoid potential out-of-distribution samples. Experimental results show that MOPP provides competitive performance compared with existing model-based offline planning and RL approaches, and allows easy adaptation to varying objectives and extra constraints.
【60】 Explainable Hierarchical Imitation Learning for Robotic Drink Pouring
作者:Dandan Zhang,Yu Zheng,Qiang Li,Lei Wei,Dongsheng Zhang,Zhengyou Zhang
机构: Bielefeld University
备注:15 pages, 12 figures
摘要:To accurately pour drinks into various containers is an essential skill for service robots. However, drink pouring is a dynamic process and difficult to model. Traditional deep imitation learning techniques for implementing autonomous robotic pouring have an inherent black-box effect and require a large amount of demonstration data for model training. To address these issues, an Explainable Hierarchical Imitation Learning (EHIL) method is proposed in this paper such that a robot can learn high-level general knowledge and execute low-level actions across multiple drink pouring scenarios. Moreover, with EHIL, a logical graph can be constructed for task execution, through which the decision-making process for action generation can be made explainable to users and the causes of failure can be traced out. Based on the logical graph, the framework is manipulable to achieve different targets while the adaptability to unseen scenarios can be achieved in an explainable manner. A series of experiments have been conducted to verify the effectiveness of the proposed method. Results indicate that EHIL outperforms the traditional behavior cloning method in terms of success rate, adaptability, manipulability and explainability.
【61】 Understanding the Effect of Bias in Deep Anomaly Detection
作者:Ziyu Ye,Yuxin Chen,Haitao Zheng
机构:University of Chicago
备注:Accepted at IJCAI '21. Code available at github.com/ZIYU-DEEP/Understanding-Bias-in-Deep-Anomaly-Detection-PyTorch
摘要:Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data. Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples. However, the labeled data often does not align with the target distribution and introduces harmful bias to the trained model. In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection. Concretely, we view anomaly detection as a supervised learning task where the objective is to optimize the recall at a given false positive rate. We formally study the relative scoring bias of an anomaly detector, defined as the difference in performance with respect to a baseline anomaly detector. We establish the first finite sample rates for estimating the relative scoring bias for deep anomaly detection, and empirically validate our theoretical results on both synthetic and real-world datasets. We also provide an extensive empirical study on how a biased training anomaly set affects the anomaly score function and therefore the detection performance on different anomaly classes. Our study demonstrates scenarios in which the biased anomaly set can be useful or problematic, and provides a solid benchmark for future research.
【62】 Self-supervised on Graphs: Contrastive, Generative, or Predictive
作者:Lirong Wu,Haitao Lin,Zhangyang Gao,Cheng Tan,Stan Z. Li
机构: School of Engineering, Westlake University
摘要:Deep learning on graphs has recently achieved remarkable success on a variety of tasks while such success relies heavily on the massive and carefully labeled data. However, precise annotations are generally very expensive and time-consuming. To address this problem, self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks without relying on manual labels. In this survey, we extend the concept of SSL, which first emerged in the fields of computer vision and natural language processing, to present a timely and comprehensive review of the existing SSL techniques for graph data. Specifically, we divide existing graph SSL methods into three categories: contrastive, generative, and predictive. More importantly, unlike many other surveys that only provide a high-level description of published research, we present an additional mathematical summary of the existing works in a unified framework. Furthermore, to facilitate methodological development and empirical comparisons, we also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms. Finally, we discuss the technical challenges and potential future directions for improving graph self-supervised learning.
【63】 Real-time Detection of Practical Universal Adversarial Perturbations
作者:Kenneth T. Co,Luis Muñoz-González,Leslie Kanthan,Emil C. Lupu
机构:Imperial College London, London, United Kingdom, DataSpartan, London, United Kingdom
摘要:Universal Adversarial Perturbations (UAPs) are a prominent class of adversarial examples that exploit the systemic vulnerabilities and enable physically realizable and robust attacks against Deep Neural Networks (DNNs). UAPs generalize across many different inputs; this leads to realistic and effective attacks that can be applied at scale. In this paper we propose HyperNeuron, an efficient and scalable algorithm that allows for the real-time detection of UAPs by identifying suspicious neuron hyper-activations. Our results show the effectiveness of HyperNeuron on multiple tasks (image classification, object detection), against a wide variety of universal attacks, and in realistic scenarios, like perceptual ad-blocking and adversarial patches. HyperNeuron is able to simultaneously detect both adversarial mask and patch UAPs with comparable or better performance than existing UAP defenses whilst introducing a significantly reduced latency of only 0.86 milliseconds per image. This suggests that many realistic and practical universal attacks can be reliably mitigated in real-time, which shows promise for the robust deployment of machine learning systems.
【64】 Towards a Predictive Processing Implementation of the Common Model of Cognition
作者:M. A. Kelly,Alexander Ororbia
机构:Rochester Institute of Technology, Rochester, NY, USA, Bucknell University, Lewisburg, PA, USA, Carleton University, Ottawa, ON, Canada
备注:6 pages, 2 figures
摘要:In this article, we present a cognitive architecture that is built from powerful yet simple neural models. Specifically, we describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory. The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales than is possible with existing cognitive architectures.
【65】 Texture Generation with Neural Cellular Automata
作者:Alexander Mordvintsev,Eyvind Niklasson,Ettore Randazzo
机构:Google Research
备注:AI for Content Creation Workshop, CVPR 2021
摘要:Neural Cellular Automata (NCA) have shown a remarkable ability to learn the required rules to "grow" images, classify morphologies, segment images, as well as to do general computation such as path-finding. We believe the inductive prior they introduce lends itself to the generation of textures. Textures in the natural world are often generated by variants of locally interacting reaction-diffusion systems. Human-made textures are likewise often generated in a local manner (textile weaving, for instance) or using rules with local dependencies (regular grids or geometric patterns). We demonstrate learning a texture generator from a single template image, with the generation method being embarrassingly parallel, exhibiting quick convergence and high fidelity of output, and requiring only some minimal assumptions around the underlying state manifold. Furthermore, we investigate properties of the learned models that are both useful and interesting, such as non-stationary dynamics and an inherent robustness to damage. Finally, we make qualitative claims that the behaviour exhibited by the NCA model is a learned, distributed, local algorithm to generate a texture, setting our method apart from existing work on texture generation. We discuss the advantages of such a paradigm.
【66】 Annotation Uncertainty in the Context of Grammatical Change
作者:Marie-Luis Merten,Marcel Wever,Michaela Geierhos,Doris Tophinke,Eyke Hüllermeier
机构:University of Zurich, Zurich, Switzerland, Paderborn University, Paderborn, Germany, Universität der Bundeswehr München, Munich, Germany, LMU Munich, Munich, Germany
摘要:This paper elaborates on the notion of uncertainty in the context of annotation in large text corpora, specifically focusing on (but not limited to) historical languages. Such uncertainty might be due to inherent properties of the language, for example, linguistic ambiguity and overlapping categories of linguistic description, but could also be caused by a lack of annotation expertise. By examining annotation uncertainty in more detail, we identify the sources and deepen our understanding of the nature and different types of uncertainty encountered in daily annotation practice. Moreover, some practical implications of our theoretical findings are also discussed. Last but not least, this article can be seen as an attempt to reconcile the perspectives of the main scientific disciplines involved in corpus projects, linguistics and computer science, to develop a unified view and to highlight the potential synergies between these disciplines.
【67】 A Deep Metric Learning Approach to Account Linking
作者:Aleem Khan,Elizabeth Fleming,Noah Schofield,Marcus Bishop,Nicholas Andrews
机构:Human Language Technology Center of Excellence, Johns Hopkins University
备注:13 pages; to be published in NAACL 2021
摘要:We consider the task of linking social media accounts that belong to the same author in an automated fashion on the basis of the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity -- ranging from single posts to entire months of activity -- to a vector space, where samples by the same author map to nearby points. The approach does not require human-annotated data for training purposes, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy, even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.
【68】 Regret Minimization Experience Replay
作者:Zhenghai Xue,Xu-Hui Liu,Jing-Cheng Pang,Shengyi Jiang,Feng Xu,Yang Yu
机构: Nanjing University
备注:9 pages, 5 figures
摘要:Experience replay is widely used in various deep off-policy reinforcement learning (RL) algorithms. It stores previously collected samples for further reuse. To better utilize these samples, prioritized sampling is a promising technique to improve the performance of RL agents. Previous prioritization methods based on temporal-difference (TD) error are highly heuristic and divergent from the objective of RL. In this work, we analyze the optimal prioritization strategy that can minimize the regret of RL policy theoretically. Our theory suggests that the data with higher TD error, better on-policiness and more corrective feedback should be assigned with higher weights during sampling. Based on this theory, we propose two practical algorithms, RM-DisCor and RM-TCE. RM-DisCor is a general algorithm and RM-TCE is a more efficient variant relying on the temporal ordering of states. Both algorithms improve the performance of off-policy RL algorithms in challenging RL benchmarks, including MuJoCo, Atari and Meta-World.
【69】 Composite Localization for Human Pose Estimation
作者:ZiFan Chen,Xin Qin,Chao Yang,Li Zhang
摘要:The existing human pose estimation methods are confronted with inaccurate long-distance regression or high computational cost due to the complex learning objectives. This work proposes a novel deep learning framework for human pose estimation called composite localization to divide the complex learning objective into two simpler ones: a sparse heatmap to find the keypoint's approximate location and two short-distance offsetmaps to obtain its final precise coordinates. To realize the framework, we construct two types of composite localization networks: CLNet-ResNet and CLNet-Hourglass. We evaluate the networks on three benchmark datasets, including the Leeds Sports Pose dataset, the MPII Human Pose dataset, and the COCO keypoints detection dataset. The experimental results show that our CLNet-ResNet50 outperforms SimpleBaseline by 1.14% with about 1/2 GFLOPs. Our CLNet-Hourglass outperforms the original stacked-hourglass by 4.45% on COCO.
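The decoding step implied by this abstract (a coarse heatmap peak refined by two short-distance offset maps) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `decode_keypoint`, the nested-list map layout, and the `stride` between heatmap cells are all assumptions made for the example.

```python
from typing import List, Tuple

Grid = List[List[float]]


def decode_keypoint(heatmap: Grid, offset_x: Grid, offset_y: Grid,
                    stride: int = 8) -> Tuple[float, float]:
    """Return (x, y) in input-image pixels for one keypoint.

    Assumption: each heatmap cell covers a stride x stride patch, and the
    offset maps hold the remaining short-distance displacement in pixels.
    """
    best_row, best_col, best_score = 0, 0, float("-inf")
    # Step 1: the sparse heatmap gives the approximate (cell-level) location.
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best_score:
                best_row, best_col, best_score = r, c, score
    # Step 2: the two offset maps refine that location to precise coordinates.
    x = best_col * stride + offset_x[best_row][best_col]
    y = best_row * stride + offset_y[best_row][best_col]
    return x, y


heatmap = [[0.1, 0.2, 0.1],
           [0.0, 0.3, 0.9]]
offset_x = [[0.0, 0.0, 0.0],
            [0.0, 0.0, 3.5]]
offset_y = [[0.0, 0.0, 0.0],
            [0.0, 0.0, 1.25]]
print(decode_keypoint(heatmap, offset_x, offset_y))
```

Splitting the target this way means the heatmap only has to be right to within one cell, while the offsets only have to regress a distance smaller than the stride.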
【70】 AgeFlow: Conditional Age Progression and Regression with Normalizing Flows
作者:Zhizhong Huang,Shouzhen Chen,Junping Zhang,Hongming Shan
机构:Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Institute of Science and Technology for Brain-inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
备注:IJCAI 2021
摘要:Age progression and regression aim to synthesize photorealistic appearance of a given face image with aging and rejuvenation effects, respectively. Existing generative adversarial networks (GANs) based methods suffer from the following three major issues: 1) unstable training introducing strong ghost artifacts in the generated faces, 2) unpaired training leading to unexpected changes in facial attributes such as genders and races, and 3) non-bijective age mappings increasing the uncertainty in the face transformation. To overcome these issues, this paper proposes a novel framework, termed AgeFlow, to integrate the advantages of both flow-based models and GANs. The proposed AgeFlow contains three parts: an encoder that maps a given face to a latent space through an invertible neural network, a novel invertible conditional translation module (ICTM) that translates the source latent vector to target one, and a decoder that reconstructs the generated face from the target latent vector using the same encoder network; all parts are invertible achieving bijective age mappings. The novelties of ICTM are two-fold. First, we propose an attribute-aware knowledge distillation to learn the manipulation direction of age progression while keeping other unrelated attributes unchanged, alleviating unexpected changes in facial attributes. Second, we propose to use GANs in the latent space to ensure the learned latent vector indistinguishable from the real ones, which is much easier than traditional use of GANs in the image domain. Experimental results demonstrate superior performance over existing GANs-based methods on two benchmarked datasets. The source code is available at https://github.com/Hzzone/AgeFlow.
【71】 Heterogeneous Causal Effect of Polysubstance Usage on Drug Overdose
作者:Vaishali Mahipal,Mohammad Arif Ul Alam
机构:Department of Computer Science, University of Massachusetts Lowell, USA
备注:Submitted to EMBS BHI
摘要:In this paper, we propose a system to estimate heterogeneous concurrent drug usage effects on overdose estimation, that consists of efficient co-variate selection, sub-group selection, generation of and heterogeneous causal effect estimation. Although several association studies have been proposed among state-of-the-art methods, heterogeneous causal effects have never been studied for the problem of concurrent drug usage and drug overdose. We apply our framework to answer a critical question: "Does concurrent usage of benzodiazepines and opioids have heterogeneous causal effects on the opioid overdose epidemic?" Results on Truven MarketScan claim data collected from 2001 to 2013 show significant promise for the efficacy of our proposed framework. Our efficient causal inference model estimated a higher causal effect (19%) than regression studies (15%) for the risk that concurrent usage of opioids and benzodiazepines poses for opioid overdose.
【72】 XAI Method Properties: A (Meta-)study
作者:Gesina Schwalbe,Bettina Finzel
机构: Continental AG, Regensburg, Germany, Cognitive Systems Group, University of Bamberg, Germany
备注:37 pages, 2 figures, submitted to Data Mining and Knowledge Discovery
摘要:In the meantime, a wide variety of terminologies, motivations, approaches and evaluation criteria have been developed within the scope of research on explainable artificial intelligence (XAI). Many taxonomies can be found in the literature, each with a different focus, but also showing many points of overlap. In this paper, we summarize the most cited and current taxonomies in a meta-analysis in order to highlight the essential aspects of the state-of-the-art in XAI. We also present and add terminologies as well as concepts from a large number of survey articles on the topic. Last but not least, we illustrate concepts from the higher-level taxonomy with more than 50 example methods, which we categorize accordingly, thus providing a wide-ranging overview of aspects of XAI and paving the way for use case-appropriate as well as context-specific subsequent research.
【73】 Content Analysis Application in Nursing: A Synthetic Knowledge Synthesis Meta-Study
作者:Helena Blažun Vošner,Peter Kokol,Jernej Završnik,Danica Železnik
机构:Zdravstveni dom dr. Adolfa Drolca Maribor, Ulica talcev, Maribor; Fakulteta za zdravstvene in socialne vede Slovenj Gradec, Glavni trg, Slovenj Gradec; Alma Mater Europaea, Slovenska ulica, Maribor
摘要:Theoretical issues: With the explosive growth in research literature production, the need for new approaches to structuring knowledge has emerged. Method: Synthetic content analysis was used in our meta-study. Results and discussion: Our meta-study showed that content analysis is frequently used in nursing research across a very wide spectrum of applications. The trend of its use is positive, and it is used globally in a variety of research settings. The synthetic content analysis used in our study proved to be a very helpful tool for knowledge synthesis, replacing many of the routine activities of conventional synthesis with automated activities, thus making such studies more economically viable and easier to perform.
【74】 Stacked Deep Multi-Scale Hierarchical Network for Fast Bokeh Effect Rendering from a Single Image
作者:Saikat Dutta,Sourya Dipta Das,Nisarg A. Shah,Anil Kumar Tiwari
机构:IIT Madras, Chennai, India, Jadavpur University, Kolkata, India, IIT Jodhpur, Jodhpur, India
备注:Accepted to MAI workshop, CVPR 2021. Code and models: this https URL
摘要:The Bokeh Effect is one of the most desirable effects in photography for rendering artistic and aesthetic photos. Usually, it requires a DSLR camera with different aperture and shutter settings and certain photography skills to generate this effect. In smartphones, computational methods and additional sensors are used to overcome the physical lens and sensor limitations to achieve such an effect. Most existing methods use additional sensor data or a pretrained network for fine depth estimation of the scene, and sometimes a pretrained portrait segmentation module to segment salient objects in the image. As a result, these networks have many parameters, become runtime intensive, and are unable to run on mid-range devices. In this paper, we use an end-to-end Deep Multi-Scale Hierarchical Network (DMSHN) model for direct Bokeh effect rendering of images captured from a monocular camera. To further improve the perceptual quality of the effect, a stacked model consisting of two DMSHN modules is also proposed. Our model does not rely on any pretrained network module for Monocular Depth Estimation or Saliency Detection, thus significantly reducing model size and run time. Stacked DMSHN achieves state-of-the-art results on the large-scale EBB! dataset with around 6x less runtime than the current state-of-the-art model when processing HD quality images.
【75】 Cohort Shapley value for algorithmic fairness
作者:Masayoshi Mase,Art B. Owen,Benjamin B. Seiler
机构:Hitachi, Ltd., Stanford University
摘要:Cohort Shapley value is a model-free method of variable importance grounded in game theory that does not use any unobserved and potentially impossible feature combinations. We use it to evaluate algorithmic fairness, using the well known COMPAS recidivism data as our example. This approach allows one to identify for each individual in a data set the extent to which they were adversely or beneficially affected by their value of a protected attribute such as their race. The method can do this even if race was not one of the original predictors and even if it does not have access to a proprietary algorithm that has made the predictions. The grounding in game theory lets us define aggregate variable importance for a data set consistently with its per subject definitions. We can investigate variable importance for multiple quantities of interest in the fairness literature including false positive predictions.
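As a rough illustration of the idea in this abstract, the sketch below computes exact Shapley values where the value of a feature subset is the mean outcome over the cohort of observed subjects that match the target subject on that subset. It is a toy under stated assumptions, not the authors' code: exact equality is used as the similarity rule, and the names `cohort_mean` and `cohort_shapley` are hypothetical. Only observed rows are ever averaged, so no unobserved feature combinations are synthesized.

```python
from itertools import combinations
from math import factorial
from typing import List, Sequence, Tuple


def cohort_mean(rows: Sequence[Tuple], outcomes: Sequence[float],
                target: int, subset: Tuple[int, ...]) -> float:
    """Mean outcome over subjects matching the target on every feature in subset."""
    members = [y for row, y in zip(rows, outcomes)
               if all(row[j] == rows[target][j] for j in subset)]
    # The target always matches itself, so the cohort is never empty.
    return sum(members) / len(members)


def cohort_shapley(rows: Sequence[Tuple], outcomes: Sequence[float],
                   target: int) -> List[float]:
    """Exact Shapley value of each feature for one target subject."""
    d = len(rows[0])
    values = []
    for j in range(d):
        others = [k for k in range(d) if k != j]
        phi = 0.0
        for size in range(d):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                gain = (cohort_mean(rows, outcomes, target, subset + (j,))
                        - cohort_mean(rows, outcomes, target, subset))
                phi += weight * gain
        values.append(phi)
    return values


rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
outcomes = [0.0, 1.0, 2.0, 3.0]
print(cohort_shapley(rows, outcomes, target=3))
```

The per-feature values sum to the difference between the target's fully refined cohort mean and the overall mean, which is what allows per-subject values to be aggregated consistently over a data set. The exact enumeration is exponential in the number of features, so it is only practical for a handful of variables.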
【76】 Analyzing Images for Music Recommendation
作者:Anant Baijal,Vivek Agarwal,Danny Hyun
备注:IEEE International Conference on Consumer Electronics (IEEE ICCE 2021)
摘要:Experiencing images with suitable music can greatly enrich the overall user experience. The proposed image analysis method treats an artwork image differently from a photograph image. Automatic image classification is performed using deep-learning based models. An illustrative analysis showcasing the ability of our deep-models to inherently learn and utilize perceptually relevant features when classifying artworks is also presented. The Mean Opinion Score (MOS) obtained from subjective assessments of the respective image and recommended music pairs supports the effectiveness of our approach.
【77】 Hardware Synthesis of State-Space Equations; Application to FPGA Implementation of Shallow and Deep Neural Networks
作者:Amir-Hossein Kiamarzi,Pezhman Torabi,Reza Sameni
机构:Department of Biomedical Informatics, Emory University
摘要:Nowadays, shallow and deep Neural Networks (NNs) have vast applications including biomedical engineering, image processing, computer vision, and speech recognition. Many researchers have developed hardware accelerators including field-programmable gate arrays (FPGAs) for implementing high-performance and energy efficient NNs. Apparently, the hardware architecture design process is specific and time-consuming for each NN. Therefore, a systematic way to design, implement and optimize NNs is highly demanded. The paper presents a systematic approach to implement state-space models in register transfer level (RTL), with special interest for NN implementation. The proposed design flow is based on the iterative nature of state-space models and the analogy between state-space formulations and finite-state machines. The method can be used in linear/nonlinear and time-varying/time-invariant systems. It can also be used to implement either intrinsically iterative systems (widely used in various domains such as signal processing, numerical analysis, computer arithmetic, and control engineering), or systems that could be rewritten in equivalent iterative forms. The implementation of recurrent NNs such as long short-term memory (LSTM) NNs, which have intrinsic state-space forms, is another major application for this framework. As a case study, it is shown that state-space systems can be used for the systematic implementation and optimization of NNs (as nonlinear and time-varying dynamic systems). An RTL code generating software is also provided online, which simplifies the automatic generation of NNs of arbitrary size.
【78】 Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
作者:Zahra Dasht Bozorgi,Irene Teinemaa,Marlon Dumas,Marcello La Rosa
机构: The University of Melbourne, Melbourne, Australia, University of Tartu, Tartu, Estonia
摘要:Reducing cycle time is a recurrent concern in the field of business process management. Depending on the process, various interventions may be triggered to reduce the cycle time of a case, for example, using a faster shipping service in an order-to-delivery process or giving a phone call to a customer to obtain missing information rather than waiting passively. Each of these interventions comes with a cost. This paper tackles the problem of determining if and when to trigger a time-reducing intervention in a way that maximizes the total net gain. The paper proposes a prescriptive process monitoring method that uses orthogonal random forest models to estimate the causal effect of triggering a time-reducing intervention for each ongoing case of a process. Based on this causal effect estimate, the method triggers interventions according to a user-defined policy. The method is evaluated on two real-life logs.
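The decision rule described here (trigger an intervention only when the estimated time saving is worth more than its cost) can be sketched as below. This is an illustrative assumption about what a user-defined policy might look like, not the paper's method: the causal effect estimates are taken as given (the paper obtains them from orthogonal random forests), and `value_per_time_unit`, `intervention_cost`, and `min_net_gain` are hypothetical parameters.

```python
from typing import Dict, List


def net_gain(estimated_reduction: float, value_per_time_unit: float,
             intervention_cost: float) -> float:
    """Monetary value of the estimated cycle-time reduction minus the cost."""
    return estimated_reduction * value_per_time_unit - intervention_cost


def cases_to_intervene(effects: Dict[str, float], value_per_time_unit: float,
                       intervention_cost: float,
                       min_net_gain: float = 0.0) -> List[str]:
    """Return the ongoing cases whose estimated net gain exceeds the threshold.

    `effects` maps a case id to its estimated reduction in cycle time
    (assumed to come from an upstream causal model).
    """
    return sorted(
        case for case, reduction in effects.items()
        if net_gain(reduction, value_per_time_unit, intervention_cost) > min_net_gain
    )


effects = {"case-a": 5.0, "case-b": 1.0, "case-c": 2.5}
print(cases_to_intervene(effects, value_per_time_unit=10.0, intervention_cost=20.0))
```

Raising `min_net_gain` makes the policy more conservative, which is one simple way a user could trade intervention volume against expected benefit.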
【79】 An Effective Baseline for Robustness to Distributional Shift
作者:Sunil Thulasidasan,Sushil Thapa,Sayera Dhaubhadel,Gopinath Chennupati,Tanmoy Bhattacharya,Jeff Bilmes
机构:Los Alamos National Laboratory, Los Alamos, NM, USA, Dept. of Computer Science and Engineering, New Mexico Tech, Socorro, NM, USA, Dept. of Electrical & Computer Engineering, University of Washington, Seattle, USA
摘要:Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems. While simple to state, this has been a particularly challenging problem in deep learning, where models often end up making overconfident predictions in such situations. In this work we present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention: when encountering a sample from an unseen class, the desired behavior is to abstain from predicting. Our approach uses a network with an extra abstention class and is trained on a dataset that is augmented with an uncurated set that consists of a large number of out-of-distribution (OoD) samples that are assigned the label of the abstention class; the model is then trained to learn an effective discriminator between in and out-of-distribution samples. We compare this relatively simple approach against a wide variety of more complex methods that have been proposed both for out-of-distribution detection as well as uncertainty modeling in deep learning, and empirically demonstrate its effectiveness on a wide variety of benchmarks and deep architectures for image recognition and text classification, often outperforming existing approaches by significant margins. Given the simplicity and effectiveness of this method, we propose that this approach be used as a new additional baseline for future work in this domain.
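At inference time, the extra abstention class described above could be used roughly as follows. This is a minimal sketch under assumptions: the network's logits are taken as given, the abstention class is assumed to be the last output, and `predict_or_abstain` is a hypothetical helper name.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits: Sequence[float],
                       class_names: Sequence[str]) -> Tuple[Optional[str], float]:
    """Return (label or None, abstention probability).

    Assumption: `logits` has len(class_names) + 1 entries and the final
    entry belongs to the abstention class trained on OoD samples.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    abstain_prob = probs[-1]
    if best == len(class_names):
        return None, abstain_prob  # treat the input as out-of-distribution
    return class_names[best], abstain_prob


print(predict_or_abstain([2.0, 0.5, 0.1], ["cat", "dog"]))
print(predict_or_abstain([0.1, 0.2, 3.0], ["cat", "dog"]))
```

The abstention probability can also serve directly as an OoD score when a threshold is preferred over an argmax decision.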
【80】 Verification of Image-based Neural Network Controllers Using Generative Models
作者:Sydney M. Katz,Anthony L. Corso,Christopher A. Strong,Mykel J. Kochenderfer
备注:10 pages, 12 figures, presented at the 2021 AIAA Digital Avionics Systems Conference (DASC)
摘要:Neural networks are often used to process information from image-based sensors to produce control actions. While they are effective for this task, the complex nature of neural networks makes their output difficult to verify and predict, limiting their use in safety-critical systems. For this reason, recent work has focused on combining techniques in formal methods and reachability analysis to obtain guarantees on the closed-loop performance of neural network controllers. However, these techniques do not scale to the high-dimensional and complicated input space of image-based neural network controllers. In this work, we propose a method to address these challenges by training a generative adversarial network (GAN) to map states to plausible input images. By concatenating the generator network with the control network, we obtain a network with a low-dimensional input space. This insight allows us to use existing closed-loop verification tools to obtain formal guarantees on the performance of image-based controllers. We apply our approach to provide safety guarantees for an image-based neural network controller for an autonomous aircraft taxi problem. We guarantee that the controller will keep the aircraft on the runway and guide the aircraft towards the center of the runway. The guarantees we provide are with respect to the set of input images modeled by our generator network, so we provide a recall metric to evaluate how well the generator captures the space of plausible images.
【81】 Interpretable Drug Synergy Prediction with Graph Neural Networks for Human-AI Collaboration in Healthcare
作者:Zehao Dong,Heming Zhang,Yixin Chen,Fuhai Li
机构:Computer Science, Washington University in St. Louis, St. Louis, MO, USA; Institute for Informatics, Washington University School of Medicine; Department of Pediatrics, Washington University School of Medicine
摘要:We investigate molecular mechanisms of resistant or sensitive response of cancer drug combination therapies in an inductive and interpretable manner. Though deep learning algorithms are widely used in the drug synergy prediction problem, it is still an open problem to formulate the prediction model with biological meaning to investigate the mysterious mechanisms of synergy (MoS) for the human-AI collaboration in healthcare systems. To address the challenges, we propose a deep graph neural network, IDSP (Interpretable Deep Signaling Pathways), to incorporate the gene-gene as well as gene-drug regulatory relationships in synergic drug combination predictions. IDSP automatically learns weights of edges based on the gene and drug node relations, i.e., signaling interactions, by a multi-layer perceptron (MLP) and aggregates information in an inductive manner. The proposed architecture generates interpretable drug synergy prediction by detecting important signaling interactions, and can be implemented when the underlying molecular mechanism encounters unseen genes or signaling pathways. We test IDSP on signaling networks formulated by genes from 46 core cancer signaling pathways and drug combinations from NCI ALMANAC drug combination screening data. The experimental results demonstrate that 1) IDSP can learn from the underlying molecular mechanism to make predictions without additional drug chemical information while achieving performance highly comparable to current state-of-the-art methods; 2) IDSP shows superior generality and flexibility in implementing the synergy prediction task on both transductive and inductive tasks; and 3) IDSP can generate interpretable results by detecting different salient signaling patterns (i.e., MoS) for different cell lines.
【82】 High-Robustness, Low-Transferability Fingerprinting of Neural Networks
作者:Siyue Wang,Xiao Wang,Pin-Yu Chen,Pu Zhao,Xue Lin
机构:Northeastern University, Boston University, IBM Research
备注:ICLR 2021 Workshop on Security and Safety in Machine Learning Systems
摘要:This paper proposes Characteristic Examples for effectively fingerprinting deep neural networks, featuring high-robustness to the base model against model pruning as well as low-transferability to unassociated models. This is the first work taking both robustness and transferability into consideration for generating realistic fingerprints, whereas current methods lack practical assumptions and may incur large false positive rates. To achieve a better trade-off between robustness and transferability, we propose three kinds of characteristic examples: vanilla C-examples, RC-examples, and LTRC-examples, to derive fingerprints from the original base model. To fairly characterize the trade-off between robustness and transferability, we propose Uniqueness Score, a comprehensive metric that measures the difference between robustness and transferability, which also serves as an indicator of the false alarm problem.
【83】 Node Selection Toward Faster Convergence for Federated Learning on Non-IID Data
作者:Hongda Wu,Ping Wang
机构:Department of Electrical Engineering and Computer Science, Lassonde School of Engineering, York University
摘要:Federated Learning (FL) is a distributed learning paradigm that enables a large number of resource-limited nodes to collaboratively train a model without data sharing. The non-independent-and-identically-distributed (non-i.i.d.) data samples induce a discrepancy between global and local objectives, making the FL model slow to converge. In this paper, we propose an Optimal Aggregation algorithm for better aggregation, which finds the optimal subset of local updates of participating nodes in each global round by identifying and excluding the adverse local updates via checking the relationship between the local gradient and the global gradient. Then, we propose a Probabilistic Node Selection framework (FedPNS) to dynamically change the probability for each node to be selected based on the output of Optimal Aggregation. FedPNS can preferentially select nodes that propel faster model convergence. The unbiasedness of the proposed FedPNS design is illustrated and the convergence rate improvement of FedPNS over the commonly adopted Federated Averaging (FedAvg) algorithm is analyzed theoretically. Experimental results demonstrate the effectiveness of FedPNS in accelerating the FL convergence rate, as compared to FedAvg with random node selection.
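One way to picture "excluding adverse local updates" is the toy filter below. The abstract only says that the relationship between local and global gradients is checked, so the specific test used here (drop a local gradient whose inner product with the mean gradient is negative) is an assumed stand-in, not the paper's Optimal Aggregation algorithm; `filter_adverse_updates` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Coordinate-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def filter_adverse_updates(local_grads: Sequence[Sequence[float]]) -> Tuple[List[int], Vector]:
    """Keep nodes whose gradient does not oppose the plain average.

    Returns the kept node indices and the average over the kept gradients.
    The sum of all inner products with the mean is non-negative, so at
    least one node is always kept.
    """
    global_grad = mean_vector(local_grads)
    kept = [i for i, grad in enumerate(local_grads) if dot(grad, global_grad) >= 0.0]
    return kept, mean_vector([local_grads[i] for i in kept])


grads = [[1.0, 0.0], [1.0, 0.2], [-1.0, 0.0]]
print(filter_adverse_updates(grads))
```

In a selection scheme along the lines of FedPNS, the list of excluded nodes could then be used to lower their selection probability in later rounds; that step is not shown here.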
【84】 Visual analogy: Deep learning versus compositional models
作者:Nicholas Ichien,Qing Liu,Shuhao Fu,Keith J. Holyoak,Alan Yuille,Hongjing Lu
机构:Department of Psychology and Department of Statistics, University of California, Los Angeles, Los Angeles, CA, USA; Department of Computer Science and Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
摘要:Is analogical reasoning a task that must be learned to solve from scratch by applying deep learning models to massive numbers of reasoning problems? Or are analogies solved by computing similarities between structured representations of analogs? We address this question by comparing human performance on visual analogies created using images of familiar three-dimensional objects (cars and their subregions) with the performance of alternative computational models. Human reasoners achieved above-chance accuracy for all problem types, but made more errors in several conditions (e.g., when relevant subregions were occluded). We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations. The compositional model based on part representations, but not the deep learning models, generated qualitative performance similar to that of human reasoners.
【85】 Improving Graph Neural Networks with Simple Architecture Design
作者:Sunil Kumar Maurya,Xin Liu,Tsuyoshi Murata
机构:Tokyo Institute of Technology, Tokyo, Japan, AIRC, AIST
摘要:Graph Neural Networks have emerged as a useful tool to learn on the data by applying additional constraints based on the graph structure. These graphs are often created with assumed intrinsic relations between the entities. In recent years, there have been tremendous improvements in the architecture design, pushing the performance up in various prediction tasks. In general, these neural architectures combine layer depth and node feature aggregation steps. This makes it challenging to analyze the importance of features at various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in features and class label distribution, it becomes essential to understand which features are important for the prediction tasks without any prior information. In this work, we decouple the node feature aggregation step and depth of graph neural network and introduce several key design strategies for graph neural networks. More specifically, we propose to use softmax as a regularizer and "Soft-Selector" of features aggregated from neighbors at different hop distances; and "Hop-Normalization" over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model outperforms other state of the art GNN models and achieves up to 64% improvements in accuracy on node classification tasks. Moreover, analyzing the learned soft-selection parameters of the model provides a simple way to study the importance of features in the prediction tasks. Finally, we demonstrate with experiments that the model is scalable for large graphs with millions of nodes and billions of edges.
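The two ingredients named in this abstract, a softmax "Soft-Selector" over hop-wise features and "Hop-Normalization", can be sketched for a single node as follows. The details are assumptions for illustration: hop features are taken as precomputed, normalization is plain L2 per hop, the selector scores would be learnable in a real model, and the weighted hop features are concatenated. `soft_select` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return list(vec) if norm == 0.0 else [v / norm for v in vec]


def soft_select(hop_features: Sequence[Sequence[float]],
                hop_scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Combine one node's features from several hop distances.

    hop_features[k] is the feature vector aggregated from k-hop neighbors;
    hop_scores[k] is a scalar score (learnable in practice) for that hop.
    Returns the concatenated weighted features and the softmax weights.
    """
    weights = softmax(hop_scores)
    combined: List[float] = []
    for weight, feature in zip(weights, hop_features):
        combined.extend(weight * v for v in l2_normalize(feature))
    return combined, weights


features = [[3.0, 4.0], [0.0, 2.0]]
print(soft_select(features, hop_scores=[0.0, 0.0]))
```

Because the softmax weights sum to one, inspecting them after training gives a direct reading of which hop distances the model relies on, which is the interpretability benefit the abstract mentions.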
【86】 Deep learning for detecting pulmonary tuberculosis via chest radiography: an international study across 10 countries
作者:Sahar Kazemzadeh,Jin Yu,Shahar Jamshy,Rory Pilgrim,Zaid Nabulsi,Christina Chen,Neeral Beladia,Charles Lau,Scott Mayer McKinney,Thad Hughes,Atilla Kiraly,Sreenivasa Raju Kalidindi,Monde Muyoyeta,Jameson Malemela,Ting Shih,Greg S. Corrado,Lily Peng,Katherine Chou,Po-Hsuan Cameron Chen,Yun Liu,Krish Eswaran,Daniel Tse,Shravya Shetty,Shruthi Prabhakara
机构:Google Health, Palo Alto, CA, USA; Advanced Clinical, Deerfield, IL, USA (work done at Google via Advanced Clinical); Apollo Radiology International, Hyderabad, India; TB Department, Center of Infectious Disease Research in Zambia, Lusaka, Zambia
摘要:Tuberculosis (TB) is a top-10 cause of death worldwide. Though the WHO recommends chest radiographs (CXRs) for TB screening, the limited availability of CXR interpretation is a barrier. We trained a deep learning system (DLS) to detect active pulmonary TB using CXRs from 9 countries across Africa, Asia, and Europe, and utilized large-scale CXR pretraining, attention pooling, and noisy student semi-supervised learning. Evaluation was on (1) a combined test set spanning China, India, US, and Zambia, and (2) an independent mining population in South Africa. Given WHO targets of 90% sensitivity and 70% specificity, the DLS's operating point was prespecified to favor sensitivity over specificity. On the combined test set, the DLS's ROC curve was above all 9 India-based radiologists, with an AUC of 0.90 (95%CI 0.87-0.92). The DLS's sensitivity (88%) was higher than the India-based radiologists (75% mean sensitivity), p<0.001 for superiority; and its specificity (79%) was non-inferior to the radiologists (84% mean specificity), p=0.004. Similar trends were observed within HIV positive and sputum smear positive sub-groups, and in the South Africa test set. We found that 5 US-based radiologists (where TB isn't endemic) were more sensitive and less specific than the India-based radiologists (where TB is endemic). The DLS also remained non-inferior to the US-based radiologists. In simulations, using the DLS as a prioritization tool for confirmatory testing reduced the cost per positive case detected by 40-80% compared to using confirmatory testing alone. To conclude, our DLS generalized to 5 countries, and merits prospective evaluation to assist cost-effective screening efforts in radiologist-limited settings. Operating point flexibility may permit customization of the DLS to account for site-specific factors such as TB prevalence, demographics, clinical resources, and customary practice patterns.
【87】 Optimal control of robust team stochastic games
作者:Feng Huang,Ming Cao,Long Wang
机构:University of Groningen
备注:under review
摘要:In stochastic dynamic environments, team stochastic games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to the model parameters, which are typically unknown and required to be estimated from noisy data in practice. To mitigate the sensitivity of the optimal policy to these uncertain parameters, in this paper, we propose a model of "robust" team stochastic games, where players utilize a robust optimization approach to make decisions. This model extends team stochastic games to the scenario of incomplete information and meanwhile provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a learning algorithm in the form of a Gauss-Seidel modified policy iteration and prove its convergence. This algorithm, compared with robust dynamic programming, not only possesses a faster convergence rate, but also allows for using approximation calculations to alleviate the curse of dimensionality. Moreover, some numerical simulations are presented to demonstrate the effectiveness of the algorithm by generalizing the game model of social dilemmas to sequential robust scenarios.
【88】 A brain basis of dynamical intelligence for AI and computational neuroscience
作者:Joseph D. Monaco,Kanaka Rajan,Grace M. Hwang
机构: Department of Biomedical Engineering, Johns Hopkins University (JHU) School of Medicine, Baltimore, MD, USA; Icahn School of Medicine at Mount Sinai, New York, NY, USA; JHU Applied Physics Lab, Laurel, MD, USA; JHU Kavli Neuroscience Discovery
备注:Perspective article: 178 references, 24 pages, 3 figures, and 1 glossary box
摘要:The deep neural nets of modern artificial intelligence (AI) have not achieved defining features of biological intelligence, including abstraction, causal learning, and energy-efficiency. While scaling to larger models has delivered performance improvements for current applications, more brain-like capacities may demand new theories, models, and methods for designing artificial learning systems. Here, we argue that this opportunity to reassess insights from the brain should stimulate cooperation between AI research and theory-driven computational neuroscience (CN). To motivate a brain basis of neural computation, we present a dynamical view of intelligence from which we elaborate concepts of sparsity in network structure, temporal dynamics, and interactive learning. In particular, we suggest that temporal dynamics, as expressed through neural synchrony, nested oscillations, and flexible sequences, provide a rich computational layer for reading and updating hierarchical models distributed in long-term memory networks. Moreover, embracing agent-centered paradigms in AI and CN will accelerate our understanding of the complex dynamics and behaviors that build useful world models. A convergence of AI/CN theories and objectives will reveal dynamical principles of intelligence for brains and engineered learning systems. This article was inspired by our symposium on dynamical neuroscience and machine learning at the 6th Annual US/NIH BRAIN Initiative Investigators Meeting.
【89】 A Monotone Approximate Dynamic Programming Approach for the Stochastic Scheduling, Allocation, and Inventory Replenishment Problem: Applications to Drone and Electric Vehicle Battery Swap Stations
作者:Amin Asadi,Sarah Nurre Pinkley
机构:Department of Industrial Engineering, University of Arkansas, Bell Engineering, Fayetteville, AR
摘要:There is a growing interest in using electric vehicles (EVs) and drones for many applications. However, battery-oriented issues, including range anxiety and battery degradation, impede adoption. Battery swap stations are one alternative to reduce these concerns that allow the swap of depleted for full batteries in minutes. We consider the problem of deriving actions at a battery swap station when explicitly considering the uncertain arrival of swap demand, battery degradation, and replacement. We model the operations at a battery swap station using a finite horizon Markov Decision Process model for the stochastic scheduling, allocation, and inventory replenishment problem (SAIRP), which determines when and how many batteries are charged, discharged, and replaced over time. We present theoretical proofs for the monotonicity of the value function and monotone structure of an optimal policy for special SAIRP cases. Due to the curses of dimensionality, we develop a new monotone approximate dynamic programming (ADP) method, which intelligently initializes a value function approximation using regression. In computational tests, we demonstrate the superior performance of the new regression-based monotone ADP method as compared to exact methods and other monotone ADP methods. Further, with the tests, we deduce policy insights for drone swap stations.
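A finite-horizon Markov Decision Process of the kind described above can be solved exactly by backward induction when the state space is tiny. The sketch below is a heavily simplified stand-in and not the paper's SAIRP model or its monotone ADP method. It assumes the state is just the number of full batteries, the only action is how many depleted batteries to charge, charged batteries become available in the next period, and degradation, replacement, and discharging are ignored. All parameter names and values are hypothetical.

```python
def backward_induction(n_batteries, horizon, price, charge_cost, demand_pmf):
    """Exact finite-horizon dynamic programming for a toy swap station.

    State s  = number of full batteries at the start of a period.
    Action a = number of depleted batteries put on charge (0..n_batteries - s).
    Each served swap earns `price`; each charged battery costs `charge_cost`.
    """
    V = [[0.0] * (n_batteries + 1) for _ in range(horizon + 1)]  # V[horizon] = 0
    policy = [[0] * (n_batteries + 1) for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_batteries + 1):
            best_value, best_action = None, 0
            for a in range(n_batteries - s + 1):
                value = -charge_cost * a
                for demand, prob in demand_pmf.items():
                    swaps = min(s, demand)          # cannot serve more than we hold
                    next_s = s - swaps + a          # charged batteries arrive next period
                    value += prob * (price * swaps + V[t + 1][next_s])
                if best_value is None or value > best_value:
                    best_value, best_action = value, a
            V[t][s] = best_value
            policy[t][s] = best_action
    return V, policy


def is_nondecreasing(values):
    """Check the monotone structure: more full batteries is never worse."""
    return all(x <= y for x, y in zip(values, values[1:]))


# Hypothetical instance: 2 batteries, 2 periods, demand of 0 or 1 swap per period.
V, policy = backward_induction(
    n_batteries=2, horizon=2, price=3.0, charge_cost=1.0,
    demand_pmf={0: 0.5, 1: 0.5},
)
print(V[0], policy[0], is_nondecreasing(V[0]))
```

Exact enumeration like this grows quickly with the number of batteries and state components, which is the curse of dimensionality that leads the authors to an approximate method that exploits monotonicity.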