2021-05-24 10:39:13 上传
以下文章来源于 浙江语言学
cs.AI人工智能,共计89篇
【1】 Stage-wise Fine-tuning for Graph-to-Text Generation
标题:面向图到文本生成的分阶段微调
作者:Qingyun Wang,Semih Yavuz,Victoria Lin,Heng Ji,Nazneen Rajani
机构: University of Illinois at Urbana-Champaign, Salesforce Research, Facebook Research
备注:9 pages, Accepted by Proceedings of ACL-IJCNLP 2021 Student Research Workshop, Code and Resources at this https URL
链接:https://arxiv.org/abs/2105.08021
摘要:图到文本生成受益于预训练语言模型(PLM)，其性能优于结构化图编码器。然而，这类模型并未充分利用输入图的结构信息。为了进一步提升预训练语言模型的性能，本文提出了一种带两步微调机制的结构化图到文本模型：首先在Wikipedia上对模型进行微调，然后再使其适应图到文本生成任务。除了使用传统的词元嵌入和位置嵌入对知识图谱进行编码外，我们还提出了一种新的树级嵌入方法来捕获输入图的相互依赖结构。这一新方法显著提升了英语WebNLG 2017数据集上所有文本生成指标的表现。
摘要:Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders. However, they fail to fully utilize the structure information of the input graph. In this paper, we aim to further improve the performance of the pre-trained language model by proposing a structured graph-to-text model with a two-step fine-tuning mechanism which first fine-tunes model on Wikipedia before adapting to the graph-to-text generation. In addition to using the traditional token and position embeddings to encode the knowledge graph (KG), we propose a novel tree-level embedding method to capture the inter-dependency structures of the input graph. This new approach has significantly improved the performance of all text generation metrics for the English WebNLG 2017 dataset.
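论文中"树级嵌入"的思路可以用一个极简示意来说明(以下为假设性实现，非论文原代码)：把知识图谱三元组看作以主实体为根的树，用 BFS 为每个实体分配层级编号，该编号即可作为额外层级嵌入的索引。

```python
from collections import deque

def tree_levels(triples, root):
    """对知识图谱三元组构成的树做 BFS，为每个节点分配层级编号。
    层级编号可作为"树级嵌入"的查表索引(仅为原理示意)。"""
    children = {}
    for head, _rel, tail in triples:
        children.setdefault(head, []).append(tail)
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in levels:
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels

# 玩具三元组(实体与关系均为假设的示例)
triples = [("Alan_Bean", "occupation", "astronaut"),
           ("Alan_Bean", "mission", "Apollo_12"),
           ("Apollo_12", "operator", "NASA")]
levels = tree_levels(triples, "Alan_Bean")
```

实际模型中，这些层级编号会像位置编号一样映射为可学习的向量，与词元嵌入相加。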
【2】 Controlling an Inverted Pendulum with Policy Gradient Methods-A Tutorial
标题:使用策略梯度方法控制倒立摆-教程
作者:Swagat Kumar
机构:Edge Hill University
备注:8 pages, 3 figures, 2 tables etc
链接:https://arxiv.org/abs/2105.07998
摘要:本文详细介绍了如何实现两种重要的策略梯度方法来求解倒立摆问题，即深度确定性策略梯度(DDPG)算法和近端策略优化(PPO)算法。该问题通过演员-评论家(actor-critic)模型求解：演员网络学习策略函数，评论家网络通过学习估计Q函数来评估演员网络。除了简要解释这两种算法背后的数学原理外，本文还提供了Python实现的细节，有助于揭开算法潜在复杂性的神秘面纱。在此过程中，读者将了解实现上述概念所用的OpenAI/Gym、TensorFlow 2.x和Keras工具。
摘要:This paper provides the details of implementing two important policy gradient methods to solve the inverted pendulum problem. These are namely the Deep Deterministic Policy Gradient (DDPG) and the Proximal Policy Optimization (PPO) algorithm. The problem is solved by using an actor-critic model where an actor-network is used to learn the policy function and a critic network is to evaluate the actor-network by learning to estimate the Q function. Apart from briefly explaining the mathematics behind these two algorithms, the details of python implementation are provided which helps in demystifying the underlying complexity of the algorithm. In the process, the readers will be introduced to OpenAI/Gym, Tensorflow 2.x and Keras utilities used for implementing the above concepts.
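摘要中提到的 PPO 算法，其核心是裁剪代理目标。下面用纯 Python 给出该目标的一个示意(假设重要性比率与优势值已算好，仅说明公式本身，并非该教程的原代码)：

```python
def ppo_clip_loss(ratios, advantages, eps=0.2):
    """PPO 裁剪代理目标(取负号作为最小化的损失)：
    L = -E[ min(r * A, clip(r, 1-eps, 1+eps) * A) ]"""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(min(r, 1 + eps), 1 - eps)  # 把比率裁剪到 [1-eps, 1+eps]
        total += min(r * a, clipped * a)          # 取未裁剪与裁剪项的较小者
    return -total / len(ratios)
```

裁剪项限制了新旧策略的偏离幅度，这正是 PPO 相对普通策略梯度更稳定的原因。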
【3】 Learning User Embeddings from Temporal Social Media Data: A Survey
标题:从时序社交媒体数据中学习用户嵌入:综述
作者:Fatema Hasan,Kevin S. Xu,James R. Foulds,Shimei Pan
机构:Information Systems, University of Maryland, Baltimore County, Electrical Engineering & Computer Science, University of Toledo
链接:https://arxiv.org/abs/2105.07996
摘要:社交媒体上用户生成的数据包含关于我们是谁、我们喜欢什么以及我们如何做决定的丰富信息。在本文中，我们调研了学习简明的潜在用户表示(又称用户嵌入)的代表性工作，这类表示能够捕捉社交媒体用户的主要特征。学习到的用户嵌入可用于支持不同的下游用户分析任务，如个性建模、自杀风险评估和购买决策预测。在许多现有的用户嵌入文献中，用户在社交媒体上生成的数据的时间特性在很大程度上被忽略了。在这篇综述中，我们专注于通过在用户表示学习中纳入时间/序列信息来弥补这一差距的研究。我们按几个关键维度对相关论文进行分类，指出当前工作的局限性，并提出未来的研究方向。
摘要:User-generated data on social media contain rich information about who we are, what we like and how we make decisions. In this paper, we survey representative work on learning a concise latent user representation (a.k.a. user embedding) that can capture the main characteristics of a social media user. The learned user embeddings can later be used to support different downstream user analysis tasks such as personality modeling, suicidal risk assessment and purchase decision prediction. The temporal nature of user-generated data on social media has largely been overlooked in much of the existing user embedding literature. In this survey, we focus on research that bridges the gap by incorporating temporal/sequential information in user representation learning. We categorize relevant papers along several key dimensions, identify limitations in the current work and suggest future research directions.
【4】 Learning to Automatically Catch Potholes in Worldwide Road Scene Images
标题:学习自动捕捉全球道路场景图像中的坑洞
作者:J. Javier Yebes,David Montero,Ignacio Arriola
机构: the Department of Transport in UK stated in 2014 that more than £3 billion were spent nationally on road repairs
备注:in IEEE Intelligent Transportation Systems Magazine
链接:https://arxiv.org/abs/2105.07986
摘要:在世界上任何铺装道路都存在的多种道路危险中，坑洞是最恼人的一种，也带来更高的养护成本。得益于技术和研究的进步，人们对这类危险的自动检测越来越感兴趣。我们的研究工作解决了从真实道路场景图像中检测坑洞的难题，主要创新在于应用人工智能的最新进展来学习坑洞的视觉外观。我们构建了一个带有坑洞标注的大型图像数据集，其中包含来自世界不同城市的道路场景，由不同的相机、车辆和视角在不同环境条件下拍摄。然后，我们基于Faster R-CNN和SSD深度神经网络微调了四种不同的目标检测模型。检测器取得了较高的平均精度，并在具备GPGPU能力、可嵌入车辆的Nvidia DrivePX2平台上进行了测试。此外，作为AUTOPILOT H2020项目的一部分，它被部署在实车上，将检测到的坑洞通知给定的物联网平台。
摘要:Among several road hazards that are present in any paved way in the world, potholes are one of the most annoying and also involving higher maintenance costs. There exists an increasing interest on the automated detection of these hazards enabled by technological and research progress. Our research work tackled the challenge of pothole detection from images of real world road scenes. The main novelty resides on the application of the latest progress in AI to learn the visual appearance of potholes. We built a large dataset of images with pothole annotations. They contained road scenes from different cities in the world, taken with different cameras, vehicles and viewpoints under varied environmental conditions. Then, we fine-tuned four different object detection models based on Faster R-CNN and SSD deep neural networks. We achieved high average precision and the pothole detector was tested on the Nvidia DrivePX2 platform with GPGPU capability, which can be embedded on vehicles. Moreover, it was deployed on a real vehicle to notify the detected potholes to a given IoT platform as part of AUTOPILOT H2020 project.
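评估此类检测器的平均精度时，需要用交并比(IoU)判断预测框与标注框是否匹配。下面是一个自包含的 IoU 示意实现(与论文代码无关)：

```python
def iou(box_a, box_b):
    """计算两个 (x1, y1, x2, y2) 边界框的交并比(IoU)。"""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # 相交区域的左上角与右下角
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

通常当 IoU 超过某个阈值(如 0.5)时，预测框才被计为真正例。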
【5】 Gradient Masking and the Underestimated Robustness Threats of Differential Privacy in Deep Learning
标题:梯度掩蔽与深度学习中差分隐私被低估的鲁棒性威胁
作者:Franziska Boenisch,Philip Sperl,Konstantin Böttinger
机构:Fraunhofer Institute for Applied and Integrated Security
链接:https://arxiv.org/abs/2105.07985
摘要:深度学习中的一个重要问题是神经网络(NN)的隐私性和安全性。长期以来，这两个方面一直被分开考虑。迄今为止，人们对隐私增强训练如何影响神经网络的鲁棒性仍知之甚少。本文通过实验评估了使用差分隐私(DP)这一标准隐私保护方法进行训练对模型抵御多种对抗攻击能力的影响。结果表明，私有模型的鲁棒性不如非私有模型，而且对抗样本在DP模型之间的迁移性好于在非私有模型与私有模型之间的迁移性。此外，对DP和非DP模型的详细分析表明，它们的梯度存在显著差异。本文还首次发现，DP训练中参数选择不当会导致梯度掩蔽，从而产生错误的安全感。
摘要:An important problem in deep learning is the privacy and security of neural networks (NNs). Both aspects have long been considered separately. To date, it is still poorly understood how privacy enhancing training affects the robustness of NNs. This paper experimentally evaluates the impact of training with Differential Privacy (DP), a standard method for privacy preservation, on model vulnerability against a broad range of adversarial attacks. The results suggest that private models are less robust than their non-private counterparts, and that adversarial examples transfer better among DP models than between non-private and private ones. Furthermore, detailed analyses of DP and non-DP models suggest significant differences between their gradients. Additionally, this work is the first to observe that an unfavorable choice of parameters in DP training can lead to gradient masking, and, thereby, results in a wrong sense of security.
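论文讨论的 DP 训练通常指 DP-SGD：逐样本裁剪梯度范数，再加高斯噪声。下面是该核心步骤的纯 Python 示意(clip_norm 与 noise_mult 即文中指出可能引发梯度掩蔽的超参数；数值纯属示例)：

```python
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.1, seed=0):
    """DP-SGD 核心步骤示意：逐样本梯度按 L2 范数裁剪，求和后加高斯噪声再平均。"""
    rng = random.Random(seed)
    dim = len(per_example_grads[0])
    clipped_sum = [0.0] * dim
    for g in per_example_grads:
        norm = sum(x * x for x in g) ** 0.5
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0  # 范数裁剪
        for i, x in enumerate(g):
            clipped_sum[i] += x * scale
    sigma = noise_mult * clip_norm  # 噪声尺度与裁剪范数成正比
    n = len(per_example_grads)
    return [(s + rng.gauss(0.0, sigma)) / n for s in clipped_sum]
```

把 noise_mult 设为 0 即可单独观察裁剪的效果；论文的发现提示，这类超参数的选择也会影响对抗鲁棒性的表象。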
【6】 Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in Application to Preventive Healthcare
标题:学会干预:应用于预防性医疗的不安分多臂老虎机(restless bandits)自适应学习策略
作者:Arpita Biswas,Gaurav Aggarwal,Pradeep Varakantham,Milind Tambe
机构:Harvard University, Google Research
备注:To appear in the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021)
链接:https://arxiv.org/abs/2105.07965
摘要:在许多公共卫生场景中，患者坚持健康计划(例如按时服药和定期体检)非常重要。不幸的是，受益人可能会逐渐脱离这类计划，这对他们的健康有害。一个开展免费自动电话项目、在孕妇中传播预防保健信息的组织就观察到了这种逐渐脱离的具体例子：很多女性在注册几个月后就不再接听电话。为了避免这种脱离，必须及时提供干预。这类干预往往成本高昂，只能提供给一小部分受益人。我们将这种情况建模为不安分多臂老虎机(RMAB)问题，假设每个受益人根据是否受到干预从一个状态转移到另一个状态。此外，由于转移概率先验未知，我们提出了一种基于Whittle指数的Q学习机制，并证明其收敛于最优解。在文献中的多个基准以及孕产妇保健数据集上，我们的方法优于现有的基于学习的RMAB方法。
摘要:In many public health settings, it is important for patients to adhere to health programs, such as taking medications and periodic health checks. Unfortunately, beneficiaries may gradually disengage from such programs, which is detrimental to their health. A concrete example of gradual disengagement has been observed by an organization that carries out a free automated call-based program for spreading preventive care information among pregnant women. Many women stop picking up calls after being enrolled for a few months. To avoid such disengagements, it is important to provide timely interventions. Such interventions are often expensive and can be provided to only a small fraction of the beneficiaries. We model this scenario as a restless multi-armed bandit (RMAB) problem, where each beneficiary is assumed to transition from one state to another depending on the intervention. Moreover, since the transition probabilities are unknown a priori, we propose a Whittle index based Q-Learning mechanism and show that it converges to the optimal solution. Our method improves over existing learning-based methods for RMABs on multiple benchmarks from literature and also on the maternal healthcare dataset.
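这类方法的骨架可以用一个极简示意说明(并非论文的完整 Whittle 指数 Q 学习)：每臂维护表格型 Q 值，用"主动与被动动作的 Q 值之差"近似该臂的干预价值，每轮选取差值最大的 k 个受益人进行干预。

```python
def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """单臂的表格型 Q 学习更新；q[s] = [被动动作 Q 值, 主动动作 Q 值]。"""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])

def select_arms(q_values, states, k):
    """以 Q(s, 主动) - Q(s, 被动) 作为干预价值的粗略指数，选取前 k 个臂。"""
    scores = [(q_values[arm][states[arm]][1] - q_values[arm][states[arm]][0], arm)
              for arm in range(len(states))]
    scores.sort(reverse=True)
    return sorted(arm for _, arm in scores[:k])

# 3 名受益人，均处于状态 0；数值纯属示例
q_values = [[[0.0, 0.5]], [[0.0, 0.9]], [[0.0, 0.1]]]
chosen = select_arms(q_values, states=[0, 0, 0], k=2)
```

论文的贡献在于把这种选取规则与 Whittle 指数理论联系起来，并证明学习过程收敛到最优解。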
【7】 Behavior-based Neuroevolutionary Training in Reinforcement Learning
标题:强化学习中基于行为的神经进化训练
作者:Jörg Stork,Martin Zaefferer,Nils Eisler,Patrick Tichelmann,Thomas Bartz-Beielstein,A. E. Eiben
机构:TH Köln, Cologne, Germany, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
链接:https://arxiv.org/abs/2105.07960
摘要:除了在求解经典优化问题方面无可争议的成功之外，神经进化算法和基于种群的算法已成为标准强化学习方法的一种替代。然而，进化方法通常缺乏标准的基于值的方法所具有的样本效率，后者能够利用收集到的状态和值经验。对于资源消耗可观的现实强化学习问题而言，样本效率至关重要。因此，用利用经验的方法增强进化算法是人们所期望的，并有望带来有价值的见解。本文提出了一种将改变拓扑的神经进化优化与基于值的强化学习相结合的混合算法。我们说明了如何利用策略的行为来构造距离函数和损失函数，二者受益于存储的经验和计算出的状态值。它们使我们能够对行为建模，并通过无梯度进化算法和基于代理模型的优化在行为空间中执行定向搜索。为此，我们整合了生成和优化智能体策略的不同方法，从而创建多样化的种群。我们在标准基准和一个专门构建的现实问题上展示了算法的性能。结果表明，组合这些方法可以提高进化方法的样本效率和学习速度。
摘要:In addition to their undisputed success in solving classical optimization problems, neuroevolutionary and population-based algorithms have become an alternative to standard reinforcement learning methods. However, evolutionary methods often lack the sample efficiency of standard value-based methods that leverage gathered state and value experience. If reinforcement learning for real-world problems with significant resource cost is considered, sample efficiency is essential. The enhancement of evolutionary algorithms with experience exploiting methods is thus desired and promises valuable insights. This work presents a hybrid algorithm that combines topology-changing neuroevolutionary optimization with value-based reinforcement learning. We illustrate how the behavior of policies can be used to create distance and loss functions, which benefit from stored experiences and calculated state values. They allow us to model behavior and perform a directed search in the behavior space by gradient-free evolutionary algorithms and surrogate-based optimization. For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population. We exemplify the performance of our algorithm on standard benchmarks and a purpose-built real-world problem. Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.
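"利用策略的行为构造距离函数"可以这样粗略示意(假设性实现，仅说明思路)：在一组探测状态上比较两个策略的输出动作，以动作差的均方作为行为空间中的距离。

```python
def behavior_distance(policy_a, policy_b, probe_states):
    """两个(确定性)策略在探测状态集合上的行为距离：动作差的均方。
    这类距离可用于在行为空间中做定向搜索或训练代理模型。"""
    diffs = [(policy_a(s) - policy_b(s)) ** 2 for s in probe_states]
    return sum(diffs) / len(diffs)
```

与按参数比较不同，行为距离对"参数不同但行为等价"的策略给出零距离，这正是它适合神经进化搜索的原因。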
【8】 Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers
标题:进化训练与抽象使神经计算机实现算法泛化
作者:Daniel Tanneberg,Elmar Rueckert,Jan Peters
机构:Intelligent Autonomous Systems, Technische Universität Darmstadt, Darmstadt, Germany, Institute for Robotics and Cognitive Systems, Universität zu Lübeck, Lübeck, Germany
备注:Nature Machine Intelligence
链接:https://arxiv.org/abs/2105.07957
摘要:智能行为的一个关键特征是学习抽象策略的能力,这种策略可以扩展并转移到不熟悉的问题。抽象策略解决问题类中的每一个样本,无论其表示形式或复杂性如何——就像计算机科学中的算法一样。神经网络是处理感官数据、发现隐藏模式和学习复杂函数的强大模型,但它们很难学习这种迭代、顺序或层次算法策略。扩展具有外部记忆的神经网络提高了它们学习这些策略的能力,但是它们仍然容易发生数据变化,难以学习可伸缩和可转移的解决方案,并且需要大量的训练数据。我们提出了一种基于记忆增强网络的神经哈佛计算机(NHC),它通过将算法操作与数据操作解耦来实现抽象,通过信息流的分裂和模块的分离来实现。这种抽象机制和进化训练使得学习鲁棒的和可伸缩的算法解决方案成为可能。在11种不同复杂度的算法上,我们证明了NHC能够可靠地学习具有强泛化和抽象性的算法解:完美的泛化和扩展到任意的任务配置和复杂度远远超出了训练中所看到的,独立于数据表示和任务域。
摘要:A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity -- like algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns, and learning complex functions, but they struggle to learn such iterative, sequential or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacities in learning such strategies, but they are still prone to data variations, struggle to learn scalable and transferable solutions, and require massive training data. We present the Neural Harvard Computer (NHC), a memory-augmented network based architecture, that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow and separated modules. This abstraction mechanism and evolutionary training enable the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms with varying complexities, we show that the NHC reliably learns algorithmic solutions with strong generalization and abstraction: perfect generalization and scaling to arbitrary task configurations and complexities far beyond seen during training, and being independent of the data representation and the task domain.
【9】 MMGET: A Markov model for generalized evidence theory
标题:MMGET:广义证据理论的马尔可夫模型
作者:Yuanpeng He
机构: School of Computer and Information Science, Southwest University
备注:20 pages, 24 figures
链接:https://arxiv.org/abs/2105.07952
摘要:在现实生活中,大量的信息不时地融合在一起。为了恰当地描述实际情况,人们提出了许多理论。其中,Dempster-Shafer证据理论是一种非常有用的不确定信息管理工具。为了更好地适应开放世界的复杂环境,设计了一种广义证据理论。然而,每件事都是按顺序发生的,并且彼此之间有一些潜在的关系。为了进一步体现信息的细节,更好地符合现实世界的情况,在广义证据理论中引入马尔可夫模型,从所提供的证据中提取完整的信息量。此外,通过数值算例验证了该方法的正确性和合理性。
摘要:In real life, lots of information merges from time to time. To appropriately describe actual situations, many theories have been proposed. Among them, Dempster-Shafer evidence theory is a very useful tool for managing uncertain information. To better adapt to complex open-world situations, a generalized evidence theory has been designed. However, everything occurs in sequence and has underlying relationships with everything else. To further capture the details of information and better conform to real-world situations, a Markov model is introduced into the generalized evidence theory, which helps extract the complete information volume from the evidence provided. Besides, some numerical examples are offered to verify the correctness and rationality of the proposed method.
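该工作所扩展的基础是 Dempster-Shafer 证据理论中的 Dempster 组合规则。下面给出经典组合规则的一个示意实现(焦元用 frozenset 表示；这只是论文出发点的基础规则，并非文中的马尔可夫模型)：

```python
def dempster_combine(m1, m2):
    """Dempster 组合规则：对两组基本概率赋值(mass function)逐对取交，
    交为空的质量计入冲突，并用 1 - 冲突 归一化。"""
    combined = {}
    conflict = 0.0
    for a, w1 in m1.items():
        for b, w2 in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("evidence is totally conflicting")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# 两条证据(数值纯属示例)
m1 = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
m2 = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}
fused = dempster_combine(m1, m2)
```

组合后的质量仍归一化为 1；广义证据理论进一步允许把冲突质量赋给空集以刻画开放世界。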
【10】 TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning
标题:TCL:基于对比学习的Transformer动态图建模
作者:Lu Wang,Xiaofu Chang,Shuang Li,Yunfei Chu,Hui Li,Wei Zhang,Xiaofeng He,Le Song,Jingren Zhou,Hongxia Yang
机构:East China Normal University, Damo Academy, Alibaba Group, Harvard University, USA, Ant Group, Gatech
链接:https://arxiv.org/abs/2105.07944
摘要:动态图建模由于其在推荐系统、金融交易和社交网络等现实场景中的广泛应用，近年来受到了广泛关注。尽管近年来已有许多关于动态图建模的工作被提出，但有效且可扩展的模型仍有待开发。本文提出了一种新的图神经网络方法TCL，它以连续时间的方式处理动态演化的图，实现了能同时捕获时间信息和拓扑信息的有效动态节点表示学习。从技术上讲，我们的模型包含三个新的方面。首先，我们将vanilla Transformer推广到时序图学习场景，设计了一个图拓扑感知的Transformer。其次，在所提出的图Transformer之上，我们引入了一个双流编码器，分别从与两个交互节点相关联的时间邻域中提取表示，然后利用共同注意力Transformer在语义层面建模相互依赖关系。最后，受最近发展的对比学习启发，我们提出通过最大化两个未来交互节点的预测表示之间的互信息来优化模型。得益于此，我们的动态表示能够保留交互的高层(或全局)语义，因而对噪声交互具有鲁棒性。据我们所知，这是将对比学习应用于动态图表示学习的首次尝试。我们在四个交互预测基准数据集上评估了模型，实验结果证明了模型的优越性。
摘要:Dynamic graph modeling has recently attracted much attention due to its extensive applications in many real-world scenarios, such as recommendation systems, financial transactions, and social networks. Although many works have been proposed for dynamic graph modeling in recent years, effective and scalable models are yet to be developed. In this paper, we propose a novel graph neural network approach, called TCL, which deals with the dynamically-evolving graph in a continuous-time fashion and enables effective dynamic node representation learning that captures both the temporal and topology information. Technically, our model contains three novel aspects. First, we generalize the vanilla Transformer to temporal graph learning scenarios and design a graph-topology-aware transformer. Secondly, on top of the proposed graph transformer, we introduce a two-stream encoder that separately extracts representations from temporal neighborhoods associated with the two interaction nodes and then utilizes a co-attentional transformer to model inter-dependencies at a semantic level. Lastly, we are inspired by the recently developed contrastive learning and propose to optimize our model by maximizing mutual information (MI) between the predictive representations of two future interaction nodes. Benefiting from this, our dynamic representations can preserve high-level (or global) semantics about interactions and thus is robust to noisy interactions. To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs. We evaluate our model on four benchmark datasets for interaction prediction and experiment results demonstrate the superiority of our model.
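"最大化互信息"在对比学习中通常通过 InfoNCE 这类互信息下界来实现。下面是单个样本的 InfoNCE 损失的纯 Python 示意(假设相似度分数已算好；论文的具体目标形式可能不同)：

```python
import math

def info_nce(sim_pos, sim_negs, temperature=0.1):
    """单样本 InfoNCE 损失：-log( e^{s+/t} / (e^{s+/t} + Σ e^{s-/t}) )。
    最小化该损失等价于最大化互信息的一个下界。"""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # log-sum-exp 的数值稳定技巧
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(sim_pos / temperature - log_sum)
```

正样本相似度越高于负样本，损失越接近 0；这促使两个未来交互节点的预测表示彼此对齐。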
【11】 Mean Field Games Flock! The Reinforcement Learning Way
标题:平均场博弈中的群集行为！强化学习方法
作者:Sarah Perrin,Mathieu Laurière,Julien Pérolat,Matthieu Geist,Romuald Élie,Olivier Pietquin
机构:Univ. Lille, CNRS, Inria, Centrale Lille, UMR , CRIStAL, Princeton University, ORFE, DeepMind Paris, Google Research, Brain Team
链接:https://arxiv.org/abs/2105.07933
摘要:我们提出了一种方法，使大量智能体学会群集(flocking)，这是在大规模动物种群中观察到的一种自然行为。这一问题引起了许多关注，但通常需要许多结构性假设，且只在低维情形下可解。我们将该问题表述为平均场博弈(MFG)，其中每个个体根据种群行为选择自己的加速度。结合深度强化学习(RL)和标准化流(NF)，我们得到了一个只需非常弱假设的可解方案。我们的算法找到一个纳什均衡，智能体会调整自己的速度以匹配邻近群体的平均速度。我们使用虚拟博弈(Fictitious Play)并交替进行:(1)用深度RL计算近似最佳响应，(2)用NF估计下一步的种群分布。数值结果表明，该算法能学习多群体或带障碍物的高维群集。
摘要:We present a method enabling a large number of agents to learn how to flock, which is a natural behavior observed in large populations of animals. This problem has drawn a lot of interest but requires many structural assumptions and is tractable only in small dimensions. We phrase this problem as a Mean Field Game (MFG), where each individual chooses its acceleration depending on the population behavior. Combining Deep Reinforcement Learning (RL) and Normalizing Flows (NF), we obtain a tractable solution requiring only very weak assumptions. Our algorithm finds a Nash Equilibrium and the agents adapt their velocity to match the neighboring flock's average one. We use Fictitious Play and alternate: (1) computing an approximate best response with Deep RL, and (2) estimating the next population distribution with NF. We show numerically that our algorithm learns multi-group or high-dimensional flocking with obstacles.
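文中的虚拟博弈(Fictitious Play)思想可以在一个小得多的场景里示意：在零和矩阵博弈中，双方每步都对对手的历史经验分布做最佳响应，经验频率收敛到纳什均衡(此处用匹配硬币博弈代替论文的平均场设定，纯属教学示意)。

```python
def fictitious_play(payoff, steps=2000):
    """零和矩阵博弈上的虚拟博弈：双方对对手历史动作的经验分布做最佳响应。
    返回行玩家动作的经验频率，应逼近混合纳什均衡。"""
    n, m = len(payoff), len(payoff[0])
    counts_a, counts_b = [1] * n, [1] * m  # 历史动作计数(含平滑初值)
    for _ in range(steps):
        # 行玩家最大化对对手经验分布的期望收益；列玩家最小化之
        br_a = max(range(n), key=lambda i: sum(payoff[i][j] * counts_b[j] for j in range(m)))
        br_b = max(range(m), key=lambda j: -sum(payoff[i][j] * counts_a[i] for i in range(n)))
        counts_a[br_a] += 1
        counts_b[br_b] += 1
    total = sum(counts_a)
    return [c / total for c in counts_a]

# 匹配硬币博弈：唯一纳什均衡是双方各以 1/2 概率混合
pennies = [[1, -1], [-1, 1]]
freq = fictitious_play(pennies)
```

论文把"对手的经验分布"换成由标准化流估计的种群分布，把"最佳响应"换成深度 RL，骨架与此一致。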
【12】 Physics-informed attention-based neural network for solving non-linear partial differential equations
标题:求解非线性偏微分方程的基于物理信息的注意力神经网络
作者:Ruben Rodriguez-Torrado,Pablo Ruiz,Luis Cueto-Felgueroso,Michael Cerny Green,Tyler Friesen,Sebastien Matringe,Julian Togelius
机构:OriGen.AI and Universidad Politecnica de Madrid, Hess Corporation
链接:https://arxiv.org/abs/2105.07898
摘要:物理信息神经网络(PINN)在对偏微分方程(PDE)描述的物理过程建模方面带来了显著改进。PINN基于简单的体系结构，通过优化网络参数以最小化底层PDE的残差来学习复杂物理系统的行为。当应用于连续介质力学中的非线性微分方程时，现有的网络结构与经典数值离散格式有一些共同的局限。一个典型例子是求解会发展出高度局部化非线性激波的双曲守恒律。学习以双曲特征为主的偏微分方程的解对当前的PINN方法是一个挑战，它们与大多数基于网格的数值格式一样依赖于添加人工耗散。在这里，我们讨论一个基本问题:哪种网络体系结构最适合学习非线性偏微分方程的复杂行为。我们关注网络结构而不是残差正则化。我们的新方法称为基于物理信息的注意力神经网络(PIANN)，是循环神经网络与注意力机制的结合。注意力机制使深度神经网络的行为适应解的非线性特征，突破了当前PINN的局限。我们发现PIANN能有效捕捉双曲模型问题中的激波前沿，并能在训练集内外提供高质量的解。
摘要:Physics-Informed Neural Networks (PINNs) have enabled significant improvements in modelling physical processes described by partial differential equations (PDEs). PINNs are based on simple architectures, and learn the behavior of complex physical systems by optimizing the network parameters to minimize the residual of the underlying PDE. Current network architectures share some of the limitations of classical numerical discretization schemes when applied to non-linear differential equations in continuum mechanics. A paradigmatic example is the solution of hyperbolic conservation laws that develop highly localized nonlinear shock waves. Learning solutions of PDEs with dominant hyperbolic character is a challenge for current PINN approaches, which rely, like most grid-based numerical schemes, on adding artificial dissipation. Here, we address the fundamental question of which network architectures are best suited to learn the complex behavior of non-linear PDEs. We focus on network architecture rather than on residual regularization. Our new methodology, called Physics-Informed Attention-based Neural Networks (PIANNs), is a combination of recurrent neural networks and attention mechanisms. The attention mechanism adapts the behavior of the deep neural network to the non-linear features of the solution, and breaks the current limitations of PINNs. We find that PIANNs effectively capture the shock front in a hyperbolic model problem, and are capable of providing high-quality solutions inside and beyond the training set.
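PINN 的训练目标"最小化 PDE 残差"可以在最简单的场合示意出来：以 ODE u'(x) + u(x) = 0 为例，在配点上计算候选解的均方残差(这里用普通函数代替神经网络，差分求导代替自动微分，纯属原理示意)：

```python
import math

def pde_residual(u, xs, h=1e-4):
    """候选解 u 在配点 xs 上对方程 u'(x) + u(x) = 0 的均方残差。
    PINN 正是通过最小化这类残差来训练网络参数的。"""
    total = 0.0
    for x in xs:
        du = (u(x + h) - u(x - h)) / (2 * h)  # 中心差分近似导数
        total += (du + u(x)) ** 2
    return total / len(xs)

xs = [0.1 * i for i in range(1, 10)]
good = pde_residual(lambda x: math.exp(-x), xs)  # 精确解 e^{-x}，残差应接近 0
bad = pde_residual(lambda x: 1.0 - x, xs)        # 线性近似，残差明显更大
```

残差把"物理约束"变成了无需标注数据的损失函数；PIANN 改变的是承载这一损失的网络结构，而非损失本身。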
【13】 HetMAML: Task-Heterogeneous Model-Agnostic Meta-Learning for Few-Shot Learning Across Modalities
标题:HetMAML:面向跨模态小样本学习的任务异构模型无关元学习
作者:Jiayi Chen,Aidong Zhang
机构:University of Virginia
链接:https://arxiv.org/abs/2105.07889
摘要:现有的基于梯度的小样本元学习方法大多假设所有任务具有相同的输入特征空间。然而，在现实场景中，任务的输入结构可能不同，即不同任务的输入模态数量或每个模态的数据结构可能各异。现有的基于梯度的方法无法处理这种异构任务分布(HTD)，因为不同类型的任务只共享部分元参数。在本文中，我们提出了任务异构的模型无关元学习框架HetMAML，它不仅能学到不同类型任务共享的公共元参数，还能学到特定类型的元参数。具体来说，我们设计了一个多通道主干模块，将每种类型任务的输入编码为相同长度的模态特定嵌入序列。然后，我们提出了一种任务感知的多模态编码器，它能自动考虑任务特定输入结构的上下文，并自适应地将异构输入空间投影到同一个低维概念空间。在五个任务异构数据集上的大量实验表明，HetMAML成功地捕获了异构任务之间特定类型与共享的元参数，能够快速适应所有类型的新任务。
摘要:Most existing gradient-based meta-learning approaches to few-shot learning assume that all tasks have the same input feature space. However, in real-world scenarios, there are many cases where the input structures of tasks differ; that is, different tasks may vary in the number of input modalities or the data structure of each modality. Existing gradient-based approaches cannot handle such heterogeneous task distributions (HTD), as different types of tasks only share partial meta-parameters. In this paper, we propose HetMAML, a task-heterogeneous model-agnostic meta-learning framework that can generalize not only common meta-parameters shared across different types of tasks but also type-specific meta-parameters. Specifically, we design a multi-channel backbone module that encodes the input of each type of task into a same-length sequence of modality-specific embeddings. Then, we propose a task-aware multimodal encoder which can automatically take into account the context of task-specific input structures and adaptively project the heterogeneous input spaces to the same lower-dimensional concept space. Extensive experiments on five task-heterogeneous datasets demonstrate that HetMAML successfully captures both type-specific and shared meta-parameters across heterogeneous tasks, and fast adapts to all types of new tasks.
【14】 Efficient and accurate group testing via Belief Propagation: an empirical study
标题:基于信念传播的高效准确组测试:实证研究
作者:Amin Coja-Oghlan,Max Hahn-Klimroth,Philipp Loick,Manuel Penschuck
链接:https://arxiv.org/abs/2105.07882
摘要:组测试问题要求设计高效的混检方案和算法，以便在中等规模的样本中筛查罕见感染。目标是在进行尽可能少的检测的同时，准确识别被感染的样本。通过探索以信念传播(Belief Propagation)消息传递算法为核心的技术，我们提出了一种新的检测设计，显著提高了结果的准确性。新设计以信念传播作为高效的推断算法。我们着眼于实际而非渐近的问题规模，开展了实验研究。
摘要:The group testing problem asks for efficient pooling schemes and algorithms that allow to screen moderately large numbers of samples for rare infections. The goal is to accurately identify the infected samples while conducting the least possible number of tests. Exploring the use of techniques centred around the Belief Propagation message passing algorithm, we suggest a new test design that significantly increases the accuracy of the results. The new design comes with Belief Propagation as an efficient inference algorithm. Aiming for results on practical rather than asymptotic problem sizes, we conduct an experimental study.
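组测试中"由混检池结果倒推感染样本"的最基本思路，可以用简单的 COMP 规则示意(这是比论文中的信念传播粗糙得多的基线解码，仅用于说明问题设定)：

```python
def comp_decode(pools, results, n):
    """COMP 解码规则：凡出现在任一阴性池中的样本判为未感染，
    其余样本为疑似感染。pools 给出每次检测包含的样本编号。"""
    definitely_negative = set()
    for pool, positive in zip(pools, results):
        if not positive:
            definitely_negative.update(pool)  # 阴性池中所有样本都未感染
    return [i for i in range(n) if i not in definitely_negative]

# 5 个样本、4 次混检(设计纯属示例)；真实感染者为样本 2
pools = [[0, 1, 2], [2, 3, 4], [0, 3], [1, 4]]
infected = {2}
results = [bool(set(p) & infected) for p in pools]
candidates = comp_decode(pools, results, 5)
```

信念传播在同样的池结构上传递概率消息而非硬判定，因而能在更少的检测次数下达到更高的准确率。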
【15】 Conscious AI
标题:有意识的人工智能
作者:Hadi Esmaeilzadeh,Reza Vaezi
机构:University of California San Diego, Kennesaw State University
链接:https://arxiv.org/abs/2105.07879
摘要:人工智能(AI)的最新进展使分类任务的速度和精度达到了人的尺度。反过来,这些能力使人工智能成为许多人类活动的可行替代品,这些活动的核心涉及分类,例如低级服务工作中的基本机械和分析任务。当前的系统不需要有意识地识别模式并对其进行分类。然而,要让人工智能发展到需要直觉和移情的更复杂的任务,它必须发展类似于人类自我意识或意识的元思维、创造力和移情等能力。我们认为,只有通过人工智能向意识状态的根本转变,这种转变才有可能实现,这种转变类似于人类通过自然选择和进化过程所发生的转变。因此,本文旨在从理论上探讨人工智能中意识出现的要求。它还提供了一个有意识的人工智能是如何被检测到的原则性理解,以及它是如何表现出来的,这与试图最终创造语言上与人类无法区分的机器的主导范式形成了对比。
摘要:Recent advances in artificial intelligence (AI) have achieved human-scale speed and accuracy for classification tasks. In turn, these capabilities have made AI a viable replacement for many human activities that at their core involve classification, such as basic mechanical and analytical tasks in low-level service jobs. Current systems do not need to be conscious to recognize patterns and classify them. However, for AI to progress to more complicated tasks requiring intuition and empathy, it must develop capabilities such as metathinking, creativity, and empathy akin to human self-awareness or consciousness. We contend that such a paradigm shift is possible only through a fundamental shift in the state of artificial intelligence toward consciousness, a shift similar to what took place for humans through the process of natural selection and evolution. As such, this paper aims to theoretically explore the requirements for the emergence of consciousness in AI. It also provides a principled understanding of how conscious AI can be detected and how it might be manifested in contrast to the dominant paradigm that seeks to ultimately create machines that are linguistically indistinguishable from humans.
【16】 A Review on Explainability in Multimodal Deep Neural Nets
标题:多模态深度神经网络的可解释性研究综述
作者:Gargi Joshi,Rahee Walambe,Ketan Kotecha
备注:24 pages 6 figures
链接:https://arxiv.org/abs/2105.07878
摘要:基于深度神经网络的人工智能技术在许多应用领域取得了巨大的成功，尤其是在计算机视觉应用和自然语言处理领域。超越人类水平的表现推动了语言、视觉、感官、文本等不同模态在准确预测和识别中的应用研究。文献中提出了几种采用深度学习模型的多模态融合方法。尽管它们表现出色，但深度神经网络的复杂、不透明和黑盒特性限制了其社会接受度和可用性。这引起了对模型可解释性的追求，在涉及多模态人工智能方法的复杂任务中更是如此。本文对多模态深度神经网络的可解释性，特别是视觉和语言任务的可解释性进行了综述，涵盖了多模态人工智能及其在一般领域中应用的多个主题，包括其意义、数据集、方法和技术的基本组成部分、挑战、应用以及该领域的未来趋势。
摘要:Artificial Intelligence techniques powered by deep neural nets have achieved much success in several application domains, most significantly and notably in the Computer Vision applications and Natural Language Processing tasks. Surpassing human-level performance propelled the research in the applications where different modalities amongst language, vision, sensory, text play an important role in accurate predictions and identification. Several multimodal fusion methods employing deep learning models are proposed in the literature. Despite their outstanding performance, the complex, opaque and black-box nature of the deep neural nets limits their social acceptance and usability. This has given rise to the quest for model interpretability and explainability, more so in the complex tasks involving multimodal AI methods. This paper extensively reviews the present literature to present a comprehensive survey and commentary on the explainability in multimodal deep neural nets, especially for the vision and language tasks. Several topics on multimodal AI and its applications for generic domains have been covered in this paper, including the significance, datasets, fundamental building blocks of the methods and techniques, challenges, applications, and future trends in this domain
【17】 Quantum Uncertainty in Decision Theory
标题:决策论中的量子不确定性
作者:V. I. Yukalov
机构: Bogolubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research, Universidade de São Paulo
备注:17 pages
链接:https://arxiv.org/abs/2105.07877
摘要:提出了一种将决策理论视为基于量子技术的概率理论的方法。给出了描述独立方案选择的量子概率、表征条件量子概率的序贯方案和考虑决策的理性-非理性二元性的行为量子概率的精确定义,并进行了深入分析。解释了量子概率与经典概率的比较。分析表明,量子概率作为一个本质上更强大的工具来描述各种决策情况,包括心理行为效应的影响。
摘要:An approach is presented treating decision theory as a probabilistic theory based on quantum techniques. Accurate definitions are given and thorough analysis is accomplished for the quantum probabilities describing the choice between separate alternatives, sequential alternatives characterizing conditional quantum probabilities, and behavioral quantum probabilities taking into account rational-irrational duality of decision making. The comparison between quantum and classical probabilities is explained. The analysis demonstrates that quantum probabilities serve as an essentially more powerful tool of characterizing various decision-making situations including the influence of psychological behavioral effects.
【18】 The challenges and realities of retailing in a COVID-19 world: Identifying trending and Vital During Crisis keywords during Covid-19 using Machine Learning (Austria as a case study)
标题:新冠疫情下零售业的挑战与现实:使用机器学习识别Covid-19期间的趋势关键词与"危机期间至关重要"的关键词(以奥地利为案例研究)
作者:Reda Mastouri et al.,Joseph Gilkey
备注:easychair, ENSIAS Rabat, Morocco. Saint Peter's University, NJ- USA
链接:https://arxiv.org/abs/2105.07876
摘要:从全球大流行到地缘政治动荡，物流、产品分配、采购和运营领域的领导者在保护自身组织免受供应链脆弱性影响方面正面临越来越大的困难。建议选择基于趋势基准的预测，因为审核未来预测会更关注季节性。预测模型提供对整个供应链的端到端实时监控，同时利用预测分析和人工智能在潜在中断发生之前将其识别。通过结合内部和外部数据点构建AI驱动的建模引擎，可以帮助零售企业主动应对供需波动，从而大大降低风险。本文的研究重点在于创造一种巧妙的方法来应对COVID-19对供应链、产品分配、趋势和季节性的影响。关键词:供应链、COVID-19、预测、冠状病毒、制造业、季节性、趋势、零售。
摘要:From global pandemics to geopolitical turmoil, leaders in logistics, product allocation, procurement and operations are facing increasing difficulty with safeguarding their organizations against supply chain vulnerabilities. It is recommended to opt for forecasting against trending based benchmark because auditing a future forecast puts more focus on seasonality. The forecasting models provide with end-to-end, real time oversight of the entire supply chain, while utilizing predictive analytics and artificial intelligence to identify potential disruptions before they occur. By combining internal and external data points, coming up with an AI-enabled modelling engine can greatly reduce risk by helping retail companies proactively respond to supply and demand variability. This research paper puts focus on creating an ingenious way to tackle the impact of COVID19 on Supply chain, product allocation, trending and seasonality. Key words: Supply chain, covid-19, forecasting, coronavirus, manufacturing, seasonality, trending, retail.
【19】 An Extensive Analytical Approach on Human Resources using Random Forest Algorithm
标题:一种使用随机森林算法的人力资源广泛分析方法
作者:Swarajya Lakshmi V Papineni,A. Mallikarjuna Reddy,Sudeepti Yarlagadda,Snigdha Yarlagadda,Haritha Akkinen
机构: Professor, Department of IT, Prasad V Potluri Siddhartha Institute of Technology, Vijayawada, AP, India, Assistant Professor, Department of CSE, Anurag University, Freelance HR Consultant, Hyderabad, Telangana State, India
链接:https://arxiv.org/abs/2105.07855
摘要:最近的职业调查显示，由于数据科学家、商业分析师和人工智能等新兴岗位的高薪，大多数软件从业者正计划改变自己的工作角色。调查还表明，工作与生活失衡、薪酬偏低、轮班不均等诸多因素也让员工考虑改变职业生涯。在本文中，为了使公司在人力资源方面高效组织，所提出的系统在考虑不同员工参数的情况下，借助随机森林算法设计了一个模型。这有助于人力资源部门通过找出差距来留住员工，并帮助组织以良好的员工保留率平稳运行。人力资源与数据科学的这种结合有助于提升组织员工的生产力、协作和幸福感，还有助于制定应对影响员工绩效的外部与社会因素的战略。
摘要:The current job survey shows that most software employees are planning to change their job role due to high pay for recent jobs such as data scientists, business analysts and artificial intelligence fields. The survey also indicated that work life imbalances, low pay, uneven shifts and many other factors also make employees think about changing their work life. In this paper, for an efficient organisation of the company in terms of human resources, the proposed system designed a model with the help of a random forest algorithm by considering different employee parameters. This helps the HR department retain the employee by identifying gaps and helping the organisation to run smoothly with a good employee retention ratio. This combination of HR and data science can help the productivity, collaboration and well-being of employees of the organisation. It also helps to develop strategies that have an impact on the performance of employees in terms of external and social factors.
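随机森林"自助采样 + 多树投票"的核心思想可以用一个极简示意说明(单特征、深度为 1 的树桩，数据纯属虚构，并非论文的模型或数据)：

```python
import random

def train_stump(sample):
    """在单特征上寻找最佳阈值的深度为 1 的决策树(树桩)：
    特征值 <= 阈值 时预测离职(1)，否则预测留任(0)。"""
    best_thresh, best_acc = None, -1.0
    for thresh, _ in sample:
        preds = [1 if x <= thresh else 0 for x, _ in sample]
        acc = sum(p == y for p, (_, y) in zip(preds, sample)) / len(sample)
        if acc > best_acc:
            best_thresh, best_acc = thresh, acc
    return best_thresh

def forest_predict(data, x, n_trees=25, seed=7):
    """对每棵树自助采样(bootstrap)训练一个树桩，最后多数投票。"""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # 有放回抽样
        votes += 1 if x <= train_stump(sample) else 0
    return 1 if votes * 2 > n_trees else 0

# 玩具数据:(工作满意度, 是否离职)，满意度低的员工倾向于离职
data = [(0.1, 1), (0.2, 1), (0.3, 1), (0.7, 0), (0.8, 0), (0.9, 0)]
```

真实系统会使用多特征的深树(如 scikit-learn 的随机森林)，但"采样扰动 + 集成投票"降低方差的原理与此一致。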
【20】 Hard Choices and Hard Limits for Artificial Intelligence
标题:人工智能的艰难抉择与硬限制
作者:Bryce Goodman
机构:Department of Philosophy, University of Oxford, United Kingdom
链接:https://arxiv.org/abs/2105.07852
摘要:人工智能(AI)应该能帮助我们做出更好的选择。其中有些选择很小,比如上班走哪条路,或者听什么音乐。另一些则很重大,比如对一种疾病采取什么治疗,或者对某人的罪行判处多长刑期。如果人工智能能帮助做出这些重大决策,我们可能会认为它也能帮助做出艰难的选择,即备选方案之间既非更好,也非更差,也不完全相等,而是不相上下(on a par)的情形。然而,本文的目的是证明这种观点是错误的:不相上下这一事实表明,人工智能在决策和选择上存在硬性限制,有些选择是人工智能无法、也不应该解决的。
摘要:Artificial intelligence (AI) is supposed to help us make better choices. Some of these choices are small, like what route to take to work, or what music to listen to. Others are big, like what treatment to administer for a disease or how long to sentence someone for a crime. If AI can assist with these big decisions, we might think it can also help with hard choices, cases where alternatives are neither better, worse nor equal but on a par. The aim of this paper, however, is to show that this view is mistaken: the fact of parity shows that there are hard limits on AI in decision making and choices that AI cannot, and should not, resolve.
【21】 The Flipped Classroom model for teaching Conditional Random Fields in an NLP course
标题:NLP课程条件随机场教学的翻转课堂模式
作者:Manex Agirrezabal
机构:Centre for Language Technology (CST), Department of Nordic Studies and Linguistics, University of Copenhagen (Københavns Universitet), Emil Holms Kanal, Copenhagen (Denmark)
备注:Accepted to the 5th Workshop on Teaching NLP at NAACL-HLT 2021
链接:https://arxiv.org/abs/2105.07850
摘要:在这篇文章中,我们展示并讨论了我们在自然语言处理课程中应用翻转课堂教学法来教授条件随机场的经验。我们介绍了我们共同开发的活动及其与认知复杂性模型(布鲁姆分类法)的关系。在此之后,我们将对模型本身提出自己的思考和期望。从学生的评价来看,学生似乎了解了这个话题,而且这种方法对一些学生来说是有益的。此外,我们还讨论了一些不足之处,并提出了可能的解决方案。最后对本文的工作进行了展望。
摘要:In this article, we show and discuss our experience in applying the flipped classroom method for teaching Conditional Random Fields in a Natural Language Processing course. We present the activities that we developed together with their relationship to a cognitive complexity model (Bloom's taxonomy). After this, we provide our own reflections and expectations of the model itself. Based on the evaluation received from students, it seems that students learn about the topic and that the method is rewarding for some of them. Additionally, we discuss some shortcomings and propose possible solutions to them. We conclude the paper with some possible future work.
【22】 How to Explain Neural Networks: A perspective of data space division
标题:如何解释神经网络:数据空间划分的视角
作者:Hangcheng Dong,Bingguo Liu,Fengdong Chen,Dong Ye,Guodong Liu
机构:cn) are with School ofInstrumentation Science and Engineering, Harbin Institute of Techonoloy
链接:https://arxiv.org/abs/2105.07831
摘要:以深度学习为代表的智能算法的可解释性一直是一个开放的问题。我们从解释的两个属性,即完整性和明确性出发,讨论了现有解释方法的不足。此外,我们还指出,完全依赖前馈映射的模型极易造成不可解释性,因为很难量化这种映射与最终模型之间的关系。从数据空间划分的角度出发,提出了完全局部可解释模型无关解释(CLIMEP)的原理。针对分类问题,我们进一步讨论了CLIMEP与决策边界的等价性。事实上,CLIMEP的实现也很困难。为了解决这一问题,基于具有分段线性激活函数(PWL)的全连接神经网络(FCNN)可以将输入空间划分为若干个线性区域这一事实,我们通过线性化激活函数的策略将这一结果推广到任意FCNN。将该方法应用于分类问题,首次得到了FCNN的完整决策边界。最后,我们提出了决策网(DecisionNet,DNet),它用决策边界的超平面划分输入空间,因此DNet的每个线性区间仅包含相同标签的样本。实验结果表明,在任意可控精度下,DNet具有惊人的模型压缩效率。
摘要:Interpretability of intelligent algorithms represented by deep learning remains an open problem. We discuss the shortcomings of existing explanation methods based on two attributes of explanation, namely completeness and explicitness. Furthermore, we point out that a model relying completely on a feed-forward mapping easily becomes inexplicable, because it is hard to quantify the relationship between this mapping and the final model. Based on the perspective of data space division, the principle of complete local interpretable model-agnostic explanations (CLIMEP) is proposed in this paper. To study classification problems, we further discuss the equivalence of CLIMEP and the decision boundary. As a matter of fact, CLIMEP is also difficult to implement. To tackle the challenge, motivated by the fact that a fully-connected neural network (FCNN) with piece-wise linear activation functions (PWLs) can partition the input space into several linear regions, we extend this result to arbitrary FCNNs by the strategy of linearizing the activation functions. Applying this technique to classification problems, we obtain the complete decision boundary of FCNNs for the first time. Finally, we propose the DecisionNet (DNet), which divides the input space by the hyper-planes of the decision boundary, so that each linear interval of the DNet merely contains samples of the same label. Experiments show the surprising model compression efficiency of the DNet at an arbitrarily controlled precision.
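The abstract's starting point, that a ReLU FCNN partitions its input space into linear regions, can be illustrated directly. A toy sketch (not the paper's CLIMEP or DNet): each distinct ReLU on/off pattern identifies one linear region, and within a region the network is exactly linear.

```python
# Toy illustration: a ReLU network is piecewise linear, and each distinct
# ReLU on/off pattern marks one convex linear region of the input space.
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((1, 8)), rng.standard_normal(1)

def f(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def pattern(x):
    return tuple((W1 @ x + b1) > 0)   # on/off pattern = region identifier

xs = rng.uniform(-3, 3, size=(5000, 2))
regions = {}
for x in xs:
    regions.setdefault(pattern(x), []).append(x)
n_regions = len(regions)              # linear regions visited by the samples

# Inside one (convex) region the network is exactly linear, so the midpoint
# of two same-pattern points must map to the mean of their outputs.
a, b = next(v[:2] for v in regions.values() if len(v) >= 2)
assert np.allclose(f((a + b) / 2), (f(a) + f(b)) / 2)
```

Enumerating these regions exhaustively, rather than by sampling, is the hard part that the paper's linearization strategy addresses.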
【23】 TopicsRanksDC: Distance-based Topic Ranking applied on Two-Class Data
标题:TopicsRanksDC:基于距离的两类数据主题排序
作者:Malik Yousef,Jamal Al Qundus,Silvio Peikert,Adrian Paschke
机构:The Galilee Digital Health Research Center (GDH),Zefat, Data Analytics Center (DANA), Fraunhofer FOKUS, Berlin
备注:10 pages, 5 figures
链接:https://arxiv.org/abs/2105.07826
摘要:本文提出了一种新的主题排序方法TopicsRanksDC,该方法基于每个主题生成的两个聚类之间的距离来进行主题排序。我们假设数据由与两个类别相关联的文本文档组成。我们的方法将这些文本文档中包含的每个主题按其对区分这两个类别的重要性进行排序。该算法首先利用潜在狄利克雷分配(LDA)进行主题检测。定义每个主题的词被表示为两个聚类,其中每个聚类与一个类别相关联。我们计算了四种距离度量:单链接(Single Linkage)、全链接(Complete Linkage)、平均链接(Average Linkage)以及质心间距离。我们比较了LDA主题和随机主题的结果。结果表明,LDA主题的排名远高于随机主题。TopicsRanksDC工具的结果对未来的工作很有希望,有望使搜索引擎能够推荐相关主题。
摘要:In this paper, we introduce a novel approach named TopicsRanksDC for topic ranking based on the distance between the two clusters that are generated by each topic. We assume that our data consists of text documents that are associated with two classes. Our approach ranks each topic contained in these text documents by its significance for separating the two classes. Firstly, the algorithm detects topics using Latent Dirichlet Allocation (LDA). The words defining each topic are represented as two clusters, where each one is associated with one of the classes. We compute four distance metrics: Single Linkage, Complete Linkage, Average Linkage and the distance between the centroids. We compare the results of LDA topics and random topics. The results show that the rank for LDA topics is much higher than for random topics. The results of the TopicsRanksDC tool are promising for future work to enable search engines to suggest related topics.
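The four inter-cluster distances named in the abstract can be sketched in a few lines of NumPy; how TopicsRanksDC combines them into a final rank is described in the paper, so this only illustrates the metrics themselves.

```python
# The four inter-cluster distances from the abstract, in plain NumPy.
import numpy as np

def cluster_distances(A, B):
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # all pairs
    return {
        "single": float(d.min()),      # closest pair of points
        "complete": float(d.max()),    # farthest pair of points
        "average": float(d.mean()),    # mean over all pairs
        "centroid": float(np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))),
    }

# Two word clusters, one per class (toy 2-D coordinates, not real embeddings)
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[4.0, 0.0], [5.0, 0.0]])
dist = cluster_distances(A, B)
# Larger separation suggests the topic is more useful for telling classes apart
```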
【24】 Designer-User Communication for XAI: An epistemological approach to discuss XAI design
标题:XAI的设计者-用户沟通:探讨XAI设计的认识论视角
作者:Juliana Jansen Ferreira,Mateus Monteiro
机构:IBM Research, Rio de Janeiro, Brazil; Federal Fluminense University
备注:ACM CHI Workshop on Operationalizing Human-Centered Perspectives in Explainable AI at CHI 2021. 6 pages
链接:https://arxiv.org/abs/2105.07804
摘要:人工智能正在成为我们现在使用的任何技术的一部分。如果人工智能通知人们的决策,那么对人工智能的结果、结果和行为的解释就成为一种必要的能力。然而,与不同的涉众讨论XAI特性并不是一项简单的任务。大多数可用的XAI框架和方法都将数据科学家和ML开发人员作为用户。我们的研究是针对人工智能系统终端用户的XAI。我们认为,我们需要在AI系统设计过程的早期与所有利益相关者讨论XAI。在这项工作中,我们旨在研究如何在人工智能的设计者和开发者及其最终用户之间实施关于XAI场景和机会的讨论。我们将符号化信息作为概念工具来构建和讨论XAI场景。我们尝试使用它来讨论医疗AI系统。
摘要:Artificial Intelligence is becoming part of any technology we use nowadays. If AI informs people's decisions, the explanation of AI's outcomes, results, and behavior becomes a necessary capability. However, the discussion of XAI features with various stakeholders is not a trivial task. Most of the available frameworks and methods for XAI focus on data scientists and ML developers as users. Our research is about XAI for end-users of AI systems. We argue that we need to discuss XAI early in the AI-system design process and with all stakeholders. In this work, we aimed at investigating how to operationalize the discussion about XAI scenarios and opportunities among designers and developers of AI and its end-users. We took the Signifying Message as our conceptual tool to structure and discuss XAI scenarios. We experiment with its use for the discussion of a healthcare AI system.
【25】 DISCO Verification: Division of Input Space into COnvex polytopes for neural network verification
标题:DISCO验证:用于神经网络验证的输入空间凸多面体划分
作者:Julien Girard-Satabin,Aymeric Varasse,Marc Schoenauer,Guillaume Charpiat,Zakaria Chihani
机构: Université Paris-Saclay, CEA, List, F-, Palaiseau, France, TAU team, LISN (Université Paris-Saclay and CNRS), INRIA
链接:https://arxiv.org/abs/2105.07776
摘要:现代神经网络令人印象深刻的结果部分来自于它们的非线性行为。不幸的是,这种特性使得形式化验证工具很难应用,即使我们将自己局限于具有分段线性结构的网络。然而,这种网络产生的子区域是线性的,因此更容易独立分析。在本文中,我们提出了一种简化验证问题的方法,将其划分为多个线性子问题。为了评估这种方法的可行性,我们对神经网络进行了实证分析,以估计线性区域的数量,并将其与目前已知的界进行了比较。我们还展示了一种旨在减少训练过程中线性区域数量的技术的影响。
摘要:The impressive results of modern neural networks partly come from their non-linear behaviour. Unfortunately, this property makes it very difficult to apply formal verification tools, even if we restrict ourselves to networks with a piecewise linear structure. However, such networks yield subregions that are linear and thus simpler to analyse independently. In this paper, we propose a method to simplify the verification problem by partitioning it into multiple linear subproblems. To evaluate the feasibility of such an approach, we perform an empirical analysis of neural networks to estimate the number of linear regions, and compare them to the bounds currently known. We also present the impact of a technique aimed at reducing the number of linear regions during training.
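The divide-and-verify idea can be illustrated on a 1-D ReLU network (a deliberately tiny stand-in for the paper's setting): the network is linear between its breakpoints, so a bound can be checked exactly by solving one trivial linear subproblem per region.

```python
# Toy 1-D illustration (not the DISCO tool itself): between its ReLU
# breakpoints the network is linear, so the exact maximum on an interval
# is attained at a breakpoint -- one easy linear subproblem per region.
import numpy as np

w1 = np.array([1.0, -2.0, 0.5]); b1 = np.array([0.0, 1.0, -1.0])
w2 = np.array([1.0, 1.0, -3.0]); b2 = 0.2

def f(x):
    return float(w2 @ np.maximum(w1 * x + b1, 0.0) + b2)

lo, hi = -4.0, 4.0
# Each hidden unit switches at x = -b/w; these are the region boundaries.
inner = [-b / w for w, b in zip(w1, b1) if w != 0 and lo < -b / w < hi]
breaks = sorted({lo, hi, *inner})

upper_bound = max(f(x) for x in breaks)   # exact max of the piecewise-linear f
verified = all(f(x) <= upper_bound + 1e-9 for x in np.linspace(lo, hi, 1001))
```

In higher dimensions the regions become convex polytopes rather than intervals, which is where the paper's polytope division comes in.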
【26】 Automated Biodesign Engineering by Abductive Meta-Interpretive Learning
标题:基于溯因元解释学习的自动化生物设计工程
作者:Wang-Zhou Dai,Liam Hallett,Stephen H. Muggleton,Geoff S. Baldwin
机构:Department of Computing, Imperial College London, SW,AZ, UK., Department of Life Science, Imperial College London, SW,AZ, UK.
备注:Accepted by SSS-21 (AAAI Spring Symposium Series 2021), Artificial Intelligence for Synthetic Biology (AI4Synbio) track
链接:https://arxiv.org/abs/2105.07758
摘要:人工智能(AI)在合成生物学中的应用将为构建遗传设计的高通量自动化平台奠定基础,其中学习机通过设计-构建-测试-学习(DBTL)循环迭代优化系统。然而,以深度学习为代表的主流机器学习技术缺乏表达关系知识的能力,且需要大量带注释的训练数据。这些缺点严重限制了人工智能在合成生物学中的作用,因为合成生物学中的实验本身就是资源和时间密集型的。在这项工作中,我们提出了一个由溯因元解释学习($Meta_{Abd}$)驱动的自动化生物设计工程框架。$Meta_{Abd}$是一种结合符号与亚符号机器学习的新型机器学习方法,它通过使学习机能够:1) 利用领域知识并学习由一阶逻辑等形式语言表示的人类可解释模型;2) 同时优化模型的结构和参数,以做出准确的数值预测;3) 通过主动生成假设和示例,降低实验成本和数据标注工作量,从而进一步增强DBTL循环。为了验证$Meta_{Abd}$的有效性,我们对微生物宿主中三基因操纵子产生蛋白质的过程建立了一个合成数据集,这代表了一个常见的合成生物学问题。
摘要:The application of Artificial Intelligence (AI) to synthetic biology will provide the foundation for the creation of a high-throughput automated platform for genetic design, in which a learning machine is used to iteratively optimise the system through a design-build-test-learn (DBTL) cycle. However, mainstream machine learning techniques represented by deep learning lack the capability to represent relational knowledge and require prodigious amounts of annotated training data. These drawbacks strongly restrict AI's role in synthetic biology, in which experimentation is inherently resource- and time-intensive. In this work, we propose an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning ($Meta_{Abd}$), a novel machine learning approach that combines symbolic and sub-symbolic machine learning, to further enhance the DBTL cycle by enabling the learning machine to 1) exploit domain knowledge and learn human-interpretable models that are expressed by formal languages such as first-order logic; 2) simultaneously optimise the structure and parameters of the models to make accurate numerical predictions; 3) reduce the cost of experiments and effort on data annotation by actively generating hypotheses and examples. To verify the effectiveness of $Meta_{Abd}$, we have modelled a synthetic dataset for the production of proteins from a three-gene operon in a microbial host, which represents a common synthetic biology problem.
【27】 Explicit Semantic Cross Feature Learning via Pre-trained Graph Neural Networks for CTR Prediction
标题:基于预训练图神经网络的CTR预测显式语义交叉特征学习
作者:Feng Li,Bencheng Yan,Qingqing Long,Pengjie Wang,Wei Lin,Jian Xu,Bo Zheng
机构:Alibaba Group
备注:SIGIR 2021, 5 pages; The first two authors contributed equally to this work; Pengjie Wang gave a lot of guidance in this work
链接:https://arxiv.org/abs/2105.07752
摘要:交叉特征在点击率预测中起着重要的作用。现有的方法大多采用基于DNN的模型来隐式地获取交叉特征。由于显式语义建模的局限性,这些隐式方法可能导致性能的次优化。虽然传统的统计显式语义交叉特征可以解决这些隐式方法中的问题,但是它仍然面临一些挑战,包括缺乏泛化和昂贵的内存开销。很少有工作专注于应对这些挑战。本文从学习显式语义交叉特征入手,提出了一种基于GNN的预训练交叉特征学习图神经网络(PCF-GNN),旨在以显式方式生成交叉特征。在公共和工业数据集上进行了广泛的实验,其中PCF-GNN在各种任务中表现出在性能和内存效率方面的能力。
摘要:Cross features play an important role in click-through rate (CTR) prediction. Most of the existing methods adopt a DNN-based model to capture the cross features in an implicit manner. These implicit methods may lead to suboptimal performance due to the limitation in explicit semantic modeling. Although traditional statistical explicit semantic cross features can address the problem in these implicit methods, they still suffer from some challenges, including a lack of generalization and expensive memory cost. Few works focus on tackling these challenges. In this paper, we take the first step in learning the explicit semantic cross features and propose Pre-trained Cross Feature learning Graph Neural Networks (PCF-GNN), a GNN-based pre-trained model aiming at generating cross features in an explicit fashion. Extensive experiments are conducted on both public and industrial datasets, where PCF-GNN shows competence in both performance and memory-efficiency in various tasks.
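For contrast with the implicit DNN approach, the "traditional statistical explicit semantic cross feature" the abstract refers to can be computed as the empirical CTR of each feature pair; the field names and log rows below are made up for illustration.

```python
# Statistical explicit cross feature: the empirical CTR of each
# (feature_a, feature_b) pair from logged impressions (toy data).
from collections import defaultdict

logs = [  # (user_group, item_category, clicked)
    ("young", "games", 1), ("young", "games", 1), ("young", "books", 0),
    ("senior", "games", 0), ("senior", "books", 1), ("young", "games", 0),
]

shows, clicks = defaultdict(int), defaultdict(int)
for a, b, y in logs:
    shows[(a, b)] += 1
    clicks[(a, b)] += y

cross_ctr = {k: clicks[k] / shows[k] for k in shows}
# ("young", "games") -> 2/3; but pairs never seen in the logs get no estimate,
# the generalization gap a pre-trained model like PCF-GNN aims to close.
```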
【28】 Towards a Better Tradeoff between Effectiveness and Efficiency in Pre-Ranking: A Learnable Feature Selection based Approach
标题:一种基于可学习特征选择的预排序有效性与效率折衷方法
作者:Xu Ma,Pengjie Wang,Hui Zhao,Shaoguo Liu,Chuhan Zhao,Wei Lin,Kuang-Chih Lee,Jian Xu,Bo Zheng
机构:Alibaba Group
链接:https://arxiv.org/abs/2105.07706
摘要:在现实世界的搜索、推荐和广告系统中,通常采用多级排名体系结构。这种体系结构通常包括匹配、预排序、排序和重新排序阶段。在预排序阶段,通常采用基于向量积的模型和基于表示的体系结构来考虑系统效率。然而,这给系统的有效性带来了很大的损失。本文提出了一种新的预排序方法,该方法支持以交互为中心的复杂模型。提出了一种基于特征复杂度和变分丢失的可学习特征选择方法(FSCD),在有效性和效率之间取得了较好的折衷。在一个真实的电子商务赞助搜索系统中对一个搜索引擎的评价表明,利用所提出的预排序,系统的有效性得到了显著的提高。此外,与传统的预排序模型相比,系统消耗的计算资源是相同的。
摘要:In real-world search, recommendation, and advertising systems, the multi-stage ranking architecture is commonly adopted. Such architecture usually consists of matching, pre-ranking, ranking, and re-ranking stages. In the pre-ranking stage, vector-product based models with representation-focused architecture are commonly adopted to account for system efficiency. However, it brings a significant loss to the effectiveness of the system. In this paper, a novel pre-ranking approach is proposed which supports complicated models with interaction-focused architecture. It achieves a better tradeoff between effectiveness and efficiency by utilizing the proposed learnable Feature Selection method based on feature Complexity and variational Dropout (FSCD). Evaluations in a real-world e-commerce sponsored search system for a search engine demonstrate that utilizing the proposed pre-ranking, the effectiveness of the system is significantly improved. Moreover, compared to the systems with conventional pre-ranking models, an identical amount of computational resource is consumed.
【29】 Approximate Novelty Search
标题:近似新颖度搜索
作者:Anubhav Singh,Nir Lipovetzky,Miquel Ramirez,Javier Segovia-Aguas
机构: School of Computing and Information Systems, University of Melbourne, Australia, Electrical and Electronic Engineering, University of Melbourne, Australia, Dept. Information and Communication Technologies, Universitat Pompeu Fabra, Spain
链接:https://arxiv.org/abs/2105.07691
摘要:基于宽度的搜索算法根据适当定义的新颖性度量对状态进行优先级排序来寻找规划,该度量将状态映射到一组新颖性类别中。已知评估状态新颖性的空间和时间复杂度随该集合的基数呈指数增长。我们提出了获得新颖性与基于宽度的搜索的多项式近似的新方法。首先,我们通过随机抽样和Bloom过滤器来近似新颖性计算,减少了运行时间和内存占用。其次,我们使用一个自适应策略来近似最佳优先搜索,该策略决定是否放弃扩展开放列表中的节点。这两种技术被集成到现有的基于宽度的算法中,得到的新规划器在国际规划竞赛的基准上表现显著优于其他最先进的规划器。
摘要:Width-based search algorithms seek plans by prioritizing states according to a suitably defined measure of novelty, which maps states into a set of novelty categories. Space and time complexity to evaluate state novelty is known to be exponential in the cardinality of the set. We present novel methods to obtain polynomial approximations of novelty and width-based search. First, we approximate novelty computation via random sampling and Bloom filters, reducing the runtime and memory footprint. Second, we approximate the best-first search using an adaptive policy that decides whether to forgo the expansion of nodes in the open list. These two techniques are integrated into existing width-based algorithms, resulting in new planners that perform significantly better than other state-of-the-art planners over benchmarks from the International Planning Competitions.
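The Bloom-filter approximation of novelty can be sketched as follows; this is a generic Bloom filter over state tuples, not the authors' planner code.

```python
# Generic Bloom filter: constant memory for "have we seen this state tuple?",
# at the cost of a small false-positive ("seen before") rate.
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1 << 16, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        for i in range(self.k):  # k independent hashes via salted BLAKE2b
            h = hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8)
            yield int.from_bytes(h.digest(), "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def seen(self, item):
        return all(self.bits[p // 8] >> (p % 8) & 1 for p in self._positions(item))

bf = BloomFilter()
state = ("at", "roomA", "holding", "key")   # hypothetical planning-state tuple
novel_before = not bf.seen(state)           # True: nothing stored yet
bf.add(state)
novel_after = not bf.seen(state)            # False: now reported as seen
```

A false positive here only makes the planner treat a genuinely novel state as non-novel, which degrades guidance but never correctness of the plans found.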
【30】 OntoEA: Ontology-guided Entity Alignment via Joint Knowledge Graph Embedding
标题:OntoEA:基于联合知识图嵌入的本体引导实体对齐
作者:Yuejia Xiang,Ziheng Zhang,Jiaoyan Chen,Xi Chen,Zhenxi Lin,Yefeng Zheng
机构:Tencent Jarvis Lab, Shenzhen, China, Department of Computer Science, University of Oxford, UK
链接:https://arxiv.org/abs/2105.07688
摘要:语义嵌入在知识图谱(KG)实体对齐中得到了广泛的研究。目前的方法已经探索和利用了图结构、实体名称和属性,但忽略了包含关键元信息(例如类及其与实体的成员关系)的本体(或本体模式)。本文提出了一种名为OntoEA的本体引导实体对齐方法,其中KG及其本体被联合嵌入,并利用类的层次结构和类的不相交性来避免错误映射。在七个公共和工业基准上的大量实验证明了OntoEA的最先进性能和本体的有效性。
摘要:Semantic embedding has been widely investigated for aligning knowledge graph (KG) entities. Current methods have explored and utilized the graph structure, the entity names and attributes, but ignore the ontology (or ontological schema) which contains critical meta information such as classes and their membership relationships with entities. In this paper, we propose an ontology-guided entity alignment method named OntoEA, where both KGs and their ontologies are jointly embedded, and the class hierarchy and the class disjointness are utilized to avoid false mappings. Extensive experiments on seven public and industrial benchmarks have demonstrated the state-of-the-art performance of OntoEA and the effectiveness of the ontologies.
【31】 Continual Learning with Echo State Networks
标题:利用回声状态网络进行持续学习
作者:Andrea Cossu,Davide Bacciu,Antonio Carta,Claudio Gallicchio,Vincenzo Lomonaco
机构:- University of Pisa - Department of Computer Science, Largo B. Pontecorvo, Pisa - Italy, - Scuola Normale Superiore, Piazza dei Cavalieri, Pisa - Italy
备注:In review at ESANN 2021
链接:https://arxiv.org/abs/2105.07674
摘要:持续学习(CL)指的是一种学习设置,其中数据是非平稳的,模型必须在不忘记已有知识的情况下进行学习。针对序列模式的CL研究主要围绕可训练的递归网络展开。相反,在这项工作中,我们在回声状态网络(ESN)的背景下引入CL,其中循环组件保持固定。我们首次评估了ESN中的灾难性遗忘,并强调了使用那些不适用于可训练递归模型的CL策略的好处。我们的结果证实ESN是一种很有前途的CL模型,并为其在流式场景中的使用开辟了道路。
摘要:Continual Learning (CL) refers to a learning setup where data is non-stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns revolves around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catastrophic forgetting in ESNs and we highlight the benefits of using CL strategies which are not applicable to trained recurrent models. Our results confirm the ESN as a promising model for CL and open the way to its use in streaming scenarios.
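A minimal echo state network makes the abstract's point concrete: the random recurrent reservoir is kept fixed and only a linear readout is trained, here by ridge regression on a toy one-step-ahead prediction task. The reservoir size and 0.9 spectral radius are conventional choices, not the paper's settings.

```python
# Minimal ESN: fixed random reservoir + trained linear readout.
import numpy as np

rng = np.random.default_rng(2)
n_res = 50
W_in = rng.uniform(-0.5, 0.5, (n_res,))
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # heuristic echo-state scaling

def run_reservoir(u):
    h, states = np.zeros(n_res), []
    for x in u:                      # the recurrent part is never trained
        h = np.tanh(W_in * x + W @ h)
        states.append(h.copy())
    return np.array(states)

u = np.sin(np.linspace(0, 8 * np.pi, 400))
H, y = run_reservoir(u)[:-1], u[1:]              # predict the next sample
W_out = np.linalg.solve(H.T @ H + 1e-6 * np.eye(n_res), H.T @ y)  # ridge
mse = float(np.mean((H @ W_out - y) ** 2))
```

Because only `W_out` is learned, CL strategies can act on a single linear layer, which is what makes ESNs attractive for streaming settings.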
【32】 Dependency Parsing as MRC-based Span-Span Prediction
标题:将依存句法分析视为基于MRC的跨度-跨度预测
作者:Leilei Gan,Yuxian Meng,Kun Kuang,Xiaofei Sun,Chun Fan,Fei Wu,Jiwei Li
机构:♦Zhejiang University, ♠ Peking University, Peng Cheng Laboratory, ♣Shannon.AI
链接:https://arxiv.org/abs/2105.07654
摘要:高阶依存句法分析方法可以部分但不能完全解决依存树中的边应该在文本跨度/子树级别而不是单词级别构造的问题。本文提出了一种新的依存分析方法来解决这个问题。该方法通过直接建模跨度-跨度(即子树-子树)关系来构造依存树。它由两个模块组成:文本跨度提议模块(text span proposal module)提出候选文本跨度,每个候选跨度表示依存树中的一个子树,用(root, start, end)表示;以及跨度链接模块(span linking module),它在提议的跨度之间构建链接。我们使用机器阅读理解(MRC)框架作为主干,在MRC设置中形式化跨度链接模块,其中一个跨度作为查询来提取它应该链接到的文本跨度/子树。该方法具有以下优点:(1) 解决了依存树中的边应该在子树之间构造这一基本问题;(2) MRC框架允许该方法在跨度提议阶段检索缺失的跨度,从而提高合格跨度的召回率。在PTB、CTB和Universal Dependencies (UD)基准上的大量实验证明了该方法的有效性。我们在PTB和UD基准上实现了新的SOTA性能,并在CTB数据集上取得了与以前SOTA模型相当的性能。代码位于https://github.com/ShannonAI/mrc-for-dependency-parsing.
摘要:Higher-order methods for dependency parsing can partially but not fully address the issue that edges in a dependency tree should be constructed at the text span/subtree level rather than the word level. In this paper, we propose a new method for dependency parsing to address this issue. The proposed method constructs dependency trees by directly modeling span-span (in other words, subtree-subtree) relations. It consists of two modules: the {\it text span proposal module}, which proposes candidate text spans, each of which represents a subtree in the dependency tree denoted by (root, start, end); and the {\it span linking module}, which constructs links between proposed spans. We use the machine reading comprehension (MRC) framework as the backbone to formalize the span linking module in an MRC setup, where one span is used as a query to extract the text span/subtree it should be linked to. The proposed method comes with the following merits: (1) it addresses the fundamental problem that edges in a dependency tree should be constructed between subtrees; (2) the MRC framework allows the method to retrieve missing spans in the span proposal stage, which leads to higher recall for eligible spans. Extensive experiments on the PTB, CTB and Universal Dependencies (UD) benchmarks demonstrate the effectiveness of the proposed method. We are able to achieve new SOTA performance on the PTB and UD benchmarks, and competitive performance with previous SOTA models on the CTB dataset. Code is available at https://github.com/ShannonAI/mrc-for-dependency-parsing.
【33】 Traffic-Aware Service Relocation in Cloud-Oriented Elastic Optical Networks
标题:面向云的弹性光网络中流量感知的业务重定位
作者:Róża Goścień
机构: R. Goścień is with the Department of Systems and Computer Networks, Wroclaw University of Science and Technology
链接:https://arxiv.org/abs/2105.07653
摘要:本文研究了弹性光网络(EON)中高效的业务重定位问题(即为选定的客户节点更改指定的数据中心),以提高网络性能(以接受的业务量衡量)。为此,我们首先为云就绪传输网络提出了一种新的流量模型。该模型考虑了四种流类型(即城市到城市、城市到数据中心、数据中心到城市、数据中心到数据中心),而流的特性基于与网络节点相关的城市的实际经济和地理参数。然后,我们提出了可由业务重定位过程支持的专用流分配算法。我们还介绍了21种不同的重定位策略,这些策略使用三类数据进行决策:网络拓扑特征、拒绝历史和流量预测。最后,我们进行了大量数值实验,以便:(i) 调整所提出的优化方法;(ii) 评估和比较它们的效率并选择最佳方法。调查结果证明了所提策略的高效率。适当设计的重定位策略最多可以多分配3%的流量(与没有该策略的分配相比)。结果还表明,最有效的重定位策略同时基于拒绝历史和流量预测这两类数据做出决策。
摘要:In this paper, we study the problem of efficient service relocation (i.e., changing the assigned data center for a selected client node) in elastic optical networks (EONs) in order to increase network performance (measured by the volume of accepted traffic). To this end, we first propose a novel traffic model for cloud-ready transport networks. The model takes into account four flow types (i.e., city-to-city, city-to-data center, data center-to-city and data center-to-data center) while the flow characteristics are based on real economical and geographical parameters of the cities related to network nodes. Then, we propose a dedicated flow allocation algorithm that can be supported by the service relocation process. We also introduce 21 different relocation policies, which use three types of data for decision making: network topological characteristics, rejection history and traffic prediction. Eventually, we perform extensive numerical experiments in order to: (i) tune the proposed optimization approaches and (ii) evaluate and compare their efficiency and select the best one. The results of the investigation prove the high efficiency of the proposed policies. A properly designed relocation policy allowed up to 3% more traffic to be allocated (compared to the allocation without that policy). The results also reveal that the most efficient relocation policy bases its decisions on two types of data simultaneously: the rejection history and traffic prediction.
【34】 A Formal Framework for Reasoning about Agents' Independence in Self-organizing Multi-agent Systems
标题:自组织多Agent系统中Agent独立性的形式化推理框架
作者:Jieting Luo,Beishui Liao,John-Jules Meyer
机构: Zhejiang University, Hangzhou, Zhejiang Province, China, Utrecht University, Utrecht, the Netherlands
链接:https://arxiv.org/abs/2105.07648
摘要:自组织是一个过程,在没有外部控制或影响的情况下,由最初无序系统的各个部分之间的合作行为形成一个稳定的模式。它作为一种内部控制过程或机制被引入到多智能体系统中,以自发地解决难题。然而,由于自组织多智能体系统具有自治的智能体和它们之间的局部交互作用,因此很难从我们设计的局部智能体的行为来预测系统的行为。提出了一种基于逻辑的自组织多智能体系统框架,其中智能体之间通过遵循其指定的局部规则进行交互。从结构和语义两个角度分析了agent联盟对系统全局行为的依赖关系。我们证明了验证这种自组织多智能体系统的计算复杂度仍然接近于标准ATL。然后结合图论将系统分解为不同层次的联盟,从而更有效地验证代理的贡献。由此产生的关于代理的全部贡献的信息使我们能够理解自组织多代理系统中本地代理行为和系统级行为之间的复杂联系。最后,我们将展示如何使用我们的框架来建模约束满足问题。
摘要:Self-organization is a process where a stable pattern is formed by the cooperative behavior between parts of an initially disordered system without external control or influence. It has been introduced to multi-agent systems as an internal control process or mechanism to solve difficult problems spontaneously. However, because a self-organizing multi-agent system has autonomous agents and local interactions between them, it is difficult to predict the behavior of the system from the behavior of the local agents we design. This paper proposes a logic-based framework of self-organizing multi-agent systems, where agents interact with each other by following their prescribed local rules. The dependence relation between coalitions of agents regarding their contributions to the global behavior of the system is reasoned about from the structural and semantic perspectives. We show that the computational complexity of verifying such a self-organizing multi-agent system remains close to the domain of standard ATL. We then combine our framework with graph theory to decompose a system into different coalitions located in different layers, which allows us to verify agents' full contributions more efficiently. The resulting information about agents' full contributions allows us to understand the complex link between local agent behavior and system level behavior in a self-organizing multi-agent system. Finally, we show how we can use our framework to model a constraint satisfaction problem.
【35】 DOC3-Deep One Class Classification using Contradictions
标题:DOC3-基于矛盾的深度一类分类
作者:Sauptik Dhar,Bernardo Gonzalez Torres
机构: USA, University of California
备注:Deep Learning, Anomaly Detection, Visual Inspection, Learning from Contradictions, Outlier Exposure, 18 pages, 14 tables, 6 Figures
链接:https://arxiv.org/abs/2105.07636
摘要:本文介绍了从矛盾中学习(又称Universum学习)的概念,用于解决深度一类分类问题。我们针对广泛采用的一类大间隔损失形式化了这一概念,并提出了基于矛盾的深度一类分类(DOC3)算法。通过比较DOC3与传统归纳学习方法的经验Rademacher复杂度(ERC),我们证明从矛盾中学习会产生更低的泛化误差。我们的实验结果表明,与归纳学习算法相比,DOC3算法在CIFAR-10和MV-Tec AD数据集上的测试AUC分别提高超过30%和50%,并且在许多情况下改进了异常检测的最新水平。
摘要:This paper introduces the notion of learning from contradictions (a.k.a. Universum learning) for deep one-class classification problems. We formalize this notion for the widely adopted one-class large-margin loss, and propose the Deep One Class Classification using Contradictions (DOC3) algorithm. We show that learning from contradictions incurs lower generalization error by comparing the Empirical Rademacher Complexity (ERC) of DOC3 against its traditional inductive learning counterpart. Our empirical results demonstrate the efficacy of the DOC3 algorithm, achieving >30% (CIFAR-10) and >50% (MV-Tec AD) improvements in test AUC compared to its inductive learning counterpart, and in many cases improving the state-of-the-art in anomaly detection.
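The idea of learning from contradictions can be sketched with a linear toy model (not the deep DOC3 algorithm): in-class samples get a large-margin hinge loss, while Universum ("contradiction") samples are pushed toward scores near zero via an epsilon-insensitive penalty. All data and hyperparameters below are illustrative.

```python
# Linear toy version of Universum/contradiction learning, trained by
# subgradient descent; DOC3 applies the same idea to a deep one-class loss.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(loc=2.0, scale=0.3, size=(200, 2))   # the one class of interest
U = rng.uniform(-1.0, 1.0, size=(200, 2))           # Universum samples

w, b = np.zeros(2), 0.0
lr, lam, eps = 0.05, 1e-3, 0.1
for _ in range(300):
    s_x, s_u = X @ w + b, U @ w + b
    gx = (s_x < 1).astype(float)                    # active hinge terms
    gu = np.sign(s_u) * (np.abs(s_u) > eps)         # active Universum terms
    w -= lr * (lam * w - (gx[:, None] * X).mean(0) + (gu[:, None] * U).mean(0))
    b -= lr * (-gx.mean() + gu.mean())

inlier_score = float((X @ w + b).mean())
universum_score = float(np.abs(U @ w + b).mean())
# In-class points end up well above the boundary, contradictions near it.
```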
【36】 Convex optimization for actionable & plausible counterfactual explanations
标题:面向可操作且合理的反事实解释的凸优化
作者:André Artelt,Barbara Hammer
机构:CITEC - Cognitive Interaction Technology, Inspiration , Bielefeld - Germany
链接:https://arxiv.org/abs/2105.07630
摘要:透明性是部署在现实世界中的基于机器学习的决策系统的一项基本要求。通常,给定系统的透明性是通过对该系统的行为和预测提供解释来实现的。反事实解释是决策系统中特别直观的解释的一个突出例子。虽然存在许多计算反事实解释的不同方法,但只有极少数工作(因果领域的工作除外)考虑了特征依赖性以及可能限制可行反事实解释集合的合理性。在这项工作中,我们通过一种确保所得反事实解释的可操作性和合理性的机制,扩展了我们之前关于用凸建模计算反事实解释的工作。
摘要:Transparency is an essential requirement of machine learning based decision making systems that are deployed in the real world. Often, transparency of a given system is achieved by providing explanations of the behavior and predictions of the given system. Counterfactual explanations are a prominent instance of particularly intuitive explanations of decision making systems. While a lot of different methods for computing counterfactual explanations exist, only very few works (apart from work from the causality domain) consider feature dependencies as well as plausibility, which might limit the set of possible counterfactual explanations. In this work we enhance our previous work on convex modeling for computing counterfactual explanations with a mechanism for ensuring actionability and plausibility of the resulting counterfactual explanations.
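For a linear scorer the closest unconstrained counterfactual has a closed form, which makes the convex-optimization view concrete; the paper's contribution is adding actionability and plausibility constraints on top, which would turn this one-line projection into a small convex program.

```python
# Closed-form counterfactual for a linear scorer f(x) = w.x + b: project x
# onto the decision hyperplane and step a margin delta past it.
import numpy as np

def linear_counterfactual(x, w, b, delta=0.1):
    x, w = np.asarray(x, float), np.asarray(w, float)
    score = w @ x + b
    step = (score + np.sign(score) * delta) / (w @ w)
    return x - step * w          # minimal-norm change that flips the decision

w, b = np.array([1.0, -1.0]), 0.0
x = np.array([2.0, 0.0])                 # currently classified positive (score 2)
x_cf = linear_counterfactual(x, w, b)
new_score = float(w @ x_cf + b)          # now just on the negative side
```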
【37】 TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance
标题:TAT-QA:金融学中表格内容与文本内容混合的问答基准
作者:Fengbin Zhu,Wenqiang Lei,Youcheng Huang,Chao Wang,Shuo Zhang,Jiancheng Lv,Fuli Feng,Tat-Seng Chua
机构:National University of Singapore,Estates Pte Ltd,Sichuan University,Bloomberg
备注:Accepted by ACL 2021
链接:https://arxiv.org/abs/2105.07624
摘要:结合表格和文本内容的混合数据(如财务报告)在现实世界中非常普遍。然而,在现有研究中,对这种混合数据的问答(QA)基本上被忽略了。在这项工作中,我们从真实的财务报告中提取样本,构建了一个同时包含表格和文本数据的新的大规模QA数据集,命名为TAT-QA,其中通常需要数值推理来推断答案,例如加法、减法、乘法、除法、计数、比较/排序及其组合。我们进一步提出了一个名为TAGOP的新型QA模型,它能够同时对表格和文本进行推理。该模型采用序列标注的方法从表中提取相关单元格、从文本中提取相关跨度来推断其语义,然后用一组聚合算子对它们进行符号推理,得到最终答案。根据我们在TAT-QA上的实验,TAGOP在F1上达到58.0%,比之前最佳基线模型绝对提高了11.1%。但这一结果仍远远落后于人类专家的表现,即F1为90.8%。这表明我们的TAT-QA极具挑战性,可以作为训练和测试处理混合形式数据的强大QA模型的基准。
摘要:Hybrid data combining both tabular and textual content (e.g., financial reports) are quite pervasive in the real world. However, Question Answering (QA) over such hybrid data is largely neglected in existing research. In this work, we extract samples from real financial reports to build a new large-scale QA dataset containing both Tabular And Textual data, named TAT-QA, where numerical reasoning is usually required to infer the answer, such as addition, subtraction, multiplication, division, counting, comparison/sorting, and their compositions. We further propose a novel QA model termed TAGOP, which is capable of reasoning over both tables and text. It adopts sequence tagging to extract relevant cells from the table along with relevant spans from the text to infer their semantics, and then applies symbolic reasoning over them with a set of aggregation operators to arrive at the final answer. TAGOP achieves 58.0% in F1, which is an 11.1% absolute increase over the previous best baseline model, according to our experiments on TAT-QA. But this result still lags far behind the performance of human experts, i.e., 90.8% in F1. It is demonstrated that our TAT-QA is very challenging and can serve as a benchmark for training and testing powerful QA models that address hybrid-form data.
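The symbolic step the abstract describes, applying an aggregation operator to extracted numbers, can be mimicked in a few lines; the operator names here are illustrative, not TAGOP's actual inventory.

```python
# Toy aggregation step: the tagging stage extracts relevant numbers, then a
# (predicted) operator computes the final answer symbolically.
operators = {
    "sum": lambda v: sum(v),
    "diff": lambda v: v[0] - v[1],
    "div": lambda v: v[0] / v[1],
    "count": lambda v: len(v),
    "avg": lambda v: sum(v) / len(v),
}

extracted = [120.0, 80.0]                             # e.g. two years' revenue
change = operators["diff"](extracted)                 # absolute change
pct_change = operators["div"]([change, extracted[1]]) # relative change
```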
【38】 Towards Unsupervised Domain Adaptation for Deep Face Recognition under Privacy Constraints via Federated Learning
标题:基于联邦学习的隐私约束下深度人脸识别的无监督域自适应
作者:Weiming Zhuang,Xin Gan,Yonggang Wen,Xuesen Zhang,Shuai Zhang,Shuai Yi
机构:Nanyang Technological University, Singapore, SenseTime Research, China
链接:https://arxiv.org/abs/2105.07606
摘要:无监督域自适应被广泛用于在给定源域标记数据(其数据分布与目标域不同)的情况下,将模型泛化到目标域中的未标记数据。然而,现有工作不适用于隐私约束下的人脸识别,因为它们需要在两个域之间共享敏感的人脸图像。为了解决这个问题,我们提出了一种新的无监督联邦人脸识别方法(FedFR)。FedFR通过联邦学习迭代地聚合来自源域的知识,提高目标域的性能。它通过在域之间传输模型而不是原始数据来保护数据隐私。此外,本文还提出了一种新的域约束损失(DCL)来正则化源域训练。DCL抑制了源域的数据量优势。我们还改进了一种层次聚类算法来准确预测未标记目标域的伪标签。由此,FedFR形成了一个端到端的训练管道:(1) 在源域进行预训练;(2) 在目标域通过聚类预测伪标签;(3) 跨两个域进行域约束联邦学习。在两个新构建的基准上进行的大量实验和分析证明了FedFR的有效性。在更真实的基准上,它在目标域中比基线方法和经典方法的性能提高了4%以上。我们相信FedFR将为在隐私约束下将联邦学习应用于更多计算机视觉任务提供启示。
摘要:Unsupervised domain adaptation has been widely adopted to generalize models for unlabeled data in a target domain, given labeled data in a source domain, whose data distributions differ from the target domain. However, existing works are inapplicable to face recognition under privacy constraints because they require sharing sensitive face images between two domains. To address this problem, we propose a novel unsupervised federated face recognition approach (FedFR). FedFR improves the performance in the target domain by iteratively aggregating knowledge from the source domain through federated learning. It protects data privacy by transferring models instead of raw data between domains. Besides, we propose a new domain constraint loss (DCL) to regularize source domain training. DCL suppresses the data volume dominance of the source domain. We also enhance a hierarchical clustering algorithm to predict pseudo labels for the unlabeled target domain accurately. To this end, FedFR forms an end-to-end training pipeline: (1) pre-train in the source domain; (2) predict pseudo labels by clustering in the target domain; (3) conduct domain-constrained federated learning across two domains. Extensive experiments and analysis on two newly constructed benchmarks demonstrate the effectiveness of FedFR. It outperforms the baseline and classic methods in the target domain by over 4% on the more realistic benchmark. We believe that FedFR will shed light on applying federated learning to more computer vision tasks under privacy constraints.
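The federated aggregation step that lets models rather than raw face images cross domain boundaries can be sketched as a FedAvg-style weighted average of parameters (a generic sketch, not FedFR's training code).

```python
# Generic FedAvg-style aggregation: the server averages client parameters,
# weighted by local data size; raw data never leaves the clients.
import numpy as np

def fedavg(client_weights, client_sizes):
    total = sum(client_sizes)
    return [
        sum((n / total) * w[k] for w, n in zip(client_weights, client_sizes))
        for k in range(len(client_weights[0]))
    ]

# Two clients, each holding one weight matrix and one bias vector
w_a = [np.array([[1.0, 1.0]]), np.array([0.0])]
w_b = [np.array([[3.0, 3.0]]), np.array([1.0])]
global_w = fedavg([w_a, w_b], client_sizes=[100, 300])  # 0.25*w_a + 0.75*w_b
```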
【39】 EasyFL: A Low-code Federated Learning Platform For Dummies
标题:EasyFL:一个面向初学者的低代码联邦学习平台
作者:Weiming Zhuang,Xin Gan,Yonggang Wen,Shuai Zhang
机构:Nanyang Technological University, Singapore, SenseTime Research, China
链接:https://arxiv.org/abs/2105.07603
摘要:学术界和工业界已经开发了若干平台来支持流行的隐私保护分布式学习方法——联邦学习(FL)。然而,这些平台使用起来很复杂,需要对FL有深入的了解,这给初学者设置了很高的入门门槛,限制了数据科学家的生产力,并影响了部署效率。在本文中,我们提出了第一个低代码FL平台EasyFL,使具有不同专业水平的用户能够用很少的代码来试验和原型化FL应用程序。我们通过统一简洁的API设计、模块化设计和细粒度的训练流抽象实现了这一目标,同时确保了极大的定制灵活性。只需几行代码,EasyFL就能提供许多开箱即用的功能来加速实验和部署。这些实用功能包括异构性仿真、分布式训练优化、全面跟踪和无缝部署,它们是针对所提出的FL生命周期中识别出的挑战而提出的。我们的实现表明,EasyFL只需要三行代码就可以构建一个vanilla FL应用程序,至少比其他平台少10倍。此外,我们的评估表明,EasyFL将训练速度提高了1.5倍,并提高了实验和部署的效率。我们相信EasyFL将提高数据科学家的生产力,并让更广泛的受众用上FL。
摘要:Academia and industry have developed several platforms to support the popular privacy-preserving distributed learning method -- Federated Learning (FL). However, these platforms are complex to use and require a deep understanding of FL, which imposes high barriers to entry for beginners, limits the productivity of data scientists, and compromises deployment efficiency. In this paper, we propose the first low-code FL platform, EasyFL, to enable users with various levels of expertise to experiment and prototype FL applications with little coding. We achieve this goal while ensuring great flexibility for customization by unifying simple API design, modular design, and granular training flow abstraction. With only a few lines of code, EasyFL empowers them with many out-of-the-box functionalities to accelerate experimentation and deployment. These practical functionalities are heterogeneity simulation, distributed training optimization, comprehensive tracking, and seamless deployment. They are proposed based on challenges identified in the proposed FL life cycle. Our implementations show that EasyFL requires only three lines of code to build a vanilla FL application, at least 10x less than other platforms. Besides, our evaluations demonstrate that EasyFL expedites training by 1.5x. It also improves the efficiency of experiments and deployment. We believe that EasyFL will increase the productivity of data scientists and democratize FL to wider audiences.
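The federated workflow described above never moves raw data between parties; only model parameters are exchanged and aggregated. The sketch below shows federated averaging (FedAvg), the canonical aggregation step at the heart of FL platforms. Note this is a generic illustration in plain Python, not EasyFL's actual API.

```python
# Minimal sketch of federated averaging (FedAvg): the server combines
# per-client model weights, weighted by local dataset size, without ever
# seeing the clients' raw data. Models are flat lists of floats for brevity.

def fed_avg(client_weights, client_sizes):
    """Weighted average of client model parameters."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    agg = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, v in enumerate(weights):
            agg[i] += v * (size / total)
    return agg

# Two clients with tiny 3-parameter "models" and different data volumes;
# the client with 30 samples contributes 3x the weight of the one with 10.
global_model = fed_avg([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], [30, 10])
```

In a real round, each client would first train locally on its private data before sending updated weights to the server for this aggregation.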
【40】 Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation
标题:可微SLAM网:视觉导航的粒子SLAM学习
作者:Peter Karkus,Shaojun Cai,David Hsu
机构:National University of Singapore
备注:CVPR 2021
链接:https://arxiv.org/abs/2105.07593
摘要:由于转弯速度快、墙壁缺乏特征、相机质量差等原因,同步定位与建图(SLAM)在视觉机器人导航等下游应用中仍然具有挑战性。我们引入了可微SLAM网络(SLAM-net)和一种导航架构,使平面机器人能够在以前未见过的室内环境中导航。SLAM-net将基于粒子滤波的SLAM算法编码到可微计算图中,并通过SLAM算法进行反向传播来学习面向任务的神经网络组件。由于SLAM-net可以针对最终目标联合优化所有模型组件,因此它能够学会在具有挑战性的条件下保持鲁棒。我们在Habitat平台上用不同的真实RGB和RGB-D数据集进行了实验。SLAM-net在噪声环境下的性能明显优于广泛采用的ORB-SLAM。采用SLAM-net的导航架构大大提高了Habitat Challenge 2020 PointNav任务的最新水平(成功率从37%提高到64%)。项目网站:http://sites.google.com/view/slamnet
摘要:Simultaneous localization and mapping (SLAM) remains challenging for a number of downstream applications, such as visual robot navigation, because of rapid turns, featureless walls, and poor camera quality. We introduce the Differentiable SLAM Network (SLAM-net) along with a navigation architecture to enable planar robot navigation in previously unseen indoor environments. SLAM-net encodes a particle filter based SLAM algorithm in a differentiable computation graph, and learns task-oriented neural network components by backpropagating through the SLAM algorithm. Because it can optimize all model components jointly for the end-objective, SLAM-net learns to be robust in challenging conditions. We run experiments in the Habitat platform with different real-world RGB and RGB-D datasets. SLAM-net significantly outperforms the widely adopted ORB-SLAM in noisy conditions. Our navigation architecture with SLAM-net improves the state-of-the-art for the Habitat Challenge 2020 PointNav task by a large margin (37% to 64% success). Project website: http://sites.google.com/view/slamnet
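The particle-filter SLAM that SLAM-net makes differentiable maintains a set of pose hypotheses, reweights them by an observation likelihood, and resamples. A toy 1-D version of that update, in plain Python and without any learning, might look like this (the observation model and values are illustrative, not the paper's):

```python
import math

# Toy particle filter step: particles are 1-D pose hypotheses. They are
# reweighted by a Gaussian observation likelihood, then systematically
# resampled so hypotheses near the observation survive.

def pf_step(particles, weights, observation, sigma=1.0):
    # 1) Reweight each particle by how well it explains the observation.
    new_w = [w * math.exp(-0.5 * ((p - observation) / sigma) ** 2)
             for p, w in zip(particles, weights)]
    total = sum(new_w)
    new_w = [w / total for w in new_w]
    # 2) Systematic resampling with evenly spaced quantiles.
    n = len(particles)
    cum, acc = [], 0.0
    for w in new_w:
        acc += w
        cum.append(acc)
    resampled, j = [], 0
    for i in range(n):
        u = (i + 0.5) / n
        while cum[j] < u:
            j += 1
        resampled.append(particles[j])
    return resampled, [1.0 / n] * n

# Particles far from the observation (5.0, 9.0) are culled on resampling.
particles, weights = pf_step([0.0, 1.0, 5.0, 9.0], [0.25] * 4, observation=1.2)
```

SLAM-net's contribution is making this pipeline differentiable so the observation model itself can be trained end-to-end for the navigation objective.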
【41】 Monitoring electrical systems data-network equipment by means of Fuzzy and Paraconsistent Annotated Logic
标题:用模糊逻辑和次协调注释逻辑监测电力系统数据网络设备
作者:Hyghor Miranda Cortes,Paulo Eduardo Santos,Joao Inacio da Silva Filho
机构:Centro Universitário FEI; School of Science and Engineering, Flinders University; Universidade Santa Cecília
备注:38 pages; 14 figures; Under submission
链接:https://arxiv.org/abs/2105.07579
摘要:为了进行正确的监控和管理,从IT数据网络元件获取的信息在数量和复杂性上不断增加,这是一个现实。为变电站和水电站提供有效监督和控制的电力系统数据网络也是如此。促成这一事实的是,由此类数据网络监控的装置和新环境的数量不断增加,所涉及的技术也在不断演进。这种情况可能导致不完整和/或矛盾的数据,必须解决这些问题,以保持良好的监测水平,从而管理好这些系统。本文开发了一个专家系统原型,用于监测电力系统中数据网络设备的状态,该系统在处理不一致性的同时不使推断平凡化。这是在区域操作中心(ROC)远程控制水电站和变电站的背景下完成的。该专家系统采用模糊逻辑和带两值注释的次协调注释逻辑(PAL2v)相结合的算法,分析不确定信号,并生成被确定为对水电站和变电站远程控制十分重要的设备的运行工况(故障、正常、不稳定或不一致/不确定)。该专家系统的原型安装在带有CLP500软件(来自EFACEC制造商)的虚拟化服务器上,用于调查由区域(巴西)操作中心、通用变电站和通用水电站组成、代表远程控制环境的场景。
摘要:The constant increase in the amount and complexity of information obtained from IT data network elements, for its correct monitoring and management, is a reality. The same happens to data networks in electrical systems that provide effective supervision and control of substations and hydroelectric plants. Contributing to this fact is the growing number of installations and new environments monitored by such data networks and the constant evolution of the technologies involved. This situation potentially leads to incomplete and/or contradictory data, issues that must be addressed in order to maintain a good level of monitoring and, consequently, management of these systems. In this paper, a prototype of an expert system is developed to monitor the status of equipment of data networks in electrical systems, which deals with inconsistencies without trivialising the inferences. This is accomplished in the context of the remote control of hydroelectric plants and substations by a Regional Operation Centre (ROC). The expert system is developed with algorithms defined upon a combination of Fuzzy logic and Paraconsistent Annotated Logic with Annotation of Two Values (PAL2v) in order to analyse uncertain signals and generate the operating conditions (faulty, normal, unstable or inconsistent/indeterminate) of the equipment that are identified as important for the remote control of hydroelectric plants and substations. A prototype of this expert system was installed on a virtualised server with CLP500 software (from the EFACEC manufacturer) that was applied to investigate scenarios consisting of a Regional (Brazilian) Operation Centre, with a Generic Substation and a Generic Hydroelectric Plant, representing a remote control environment.
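In PAL2v, a proposition is annotated with a favourable-evidence degree μ and an unfavourable-evidence degree λ, from which a certainty degree (μ − λ) and a contradiction degree (μ + λ − 1) are derived. The sketch below maps an annotation to the four operating conditions the abstract names; the thresholds and decision rules here are illustrative simplifications, not the paper's exact lattice analysis.

```python
# PAL2v-style state classifier sketch. An annotation (mu, lam) holds
# favourable evidence mu and unfavourable evidence lam, both in [0, 1].
#   certainty degree     Gc  = mu - lam
#   contradiction degree Gct = mu + lam - 1
# High |Gc| gives a confident normal/faulty verdict; high |Gct| signals
# contradictory or missing evidence. Thresholds are illustrative only.

def pal2v_state(mu, lam, threshold=0.5):
    gc = mu - lam           # certainty degree
    gct = mu + lam - 1.0    # contradiction degree
    if gc >= threshold:
        return "normal"
    if gc <= -threshold:
        return "faulty"
    if abs(gct) >= threshold:
        return "inconsistent/indeterminate"
    return "unstable"

# Strong favourable evidence with little unfavourable evidence -> normal.
state = pal2v_state(0.9, 0.1)
```

The appeal for monitoring is that conflicting sensor reports (μ and λ both high) yield an explicit "inconsistent" verdict instead of a spurious confident one.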
【42】 Collaborative Graph Learning with Auxiliary Text for Temporal Event Prediction in Healthcare
标题:带辅助文本的协作图学习在医疗时间事件预测中的应用
作者:Chang Lu,Chandan K. Reddy,Prithwish Chakraborty,Samantha Kleinberg,Yue Ning
机构:Department of Computer Science, Stevens Institute of Technology, Department of Computer Science, Virginia Tech, IBM Research
链接:https://arxiv.org/abs/2105.07542
摘要:准确和可解释的健康事件预测对于医疗保健提供者制定患者护理计划至关重要。电子健康记录(EHR)的可用性使得机器学习在提供这些预测方面取得了进展。然而,许多基于深度学习的方法并不能很好地解决几个关键问题:1)有效地利用疾病领域知识;2)协同学习患者和疾病的表示;以及3)融合非结构化文本。为了解决这些问题,我们提出了一个协同图学习模型来探索患者-疾病交互和医学领域知识。我们的方法能够捕获患者和疾病的结构特征。该模型还通过注意力调节策略利用非结构化文本数据,并将注意力文本特征整合到一个序列学习过程中。我们在两个重要的医疗保健问题上进行了大量实验,与各种最新模型相比,证明了该方法具有竞争力的预测性能。我们还通过一组消融实验和案例研究证实了所学表示的有效性和模型的可解释性。
摘要:Accurate and explainable health event predictions are becoming crucial for healthcare providers to develop care plans for patients. The availability of electronic health records (EHR) has enabled machine learning advances in providing these predictions. However, many deep learning based methods are not satisfactory in solving several key challenges: 1) effectively utilizing disease domain knowledge; 2) collaboratively learning representations of patients and diseases; and 3) incorporating unstructured text. To address these issues, we propose a collaborative graph learning model to explore patient-disease interactions and medical domain knowledge. Our solution is able to capture structural features of both patients and diseases. The proposed model also utilizes unstructured text data by employing an attention regulation strategy and then integrates attentive text features into a sequential learning process. We conduct extensive experiments on two important healthcare problems to show the competitive prediction performance of the proposed method compared with various state-of-the-art models. We also confirm the effectiveness of learned representations and model interpretability by a set of ablation and case studies.
【43】 Private Facial Diagnosis as an Edge Service for Parkinson's DBS Treatment Valuation
标题:私人面部诊断作为帕金森DBS治疗评估的边缘服务
作者:Richard Jiang,Paul Chazot,Danny Crookes,Ahmed Bouridane,M Emre Celebi
机构:Durham University, United Kingdom; Danny Crookes is an emeritus professor with the Department of Computer Science, Queen's University Belfast
备注:Under review
链接:https://arxiv.org/abs/2105.07533
摘要:面部表型分析作为一种诊断一系列疾病的新方法,最近被成功地应用于医学诊断,面部生物特征已被揭示与潜在的遗传或医学原因有着密切的联系。本文以帕金森病(PD)为例,提出了一种面向人工智能物联网(AIoT)边缘的隐私保护面部诊断框架,用于分析深部脑刺激(DBS)对PD患者的治疗效果。在该框架中,提出了一个新的基于边缘的信息论安全框架,在面向隐私保护的AIoT多方通信方案上实现作为服务的私有深度面部诊断,其中利用部分同态加密(PHE)直接在加密的面部模式上实现隐私保护的深度面部诊断。在我们对收集的帕金森病患者面部数据集的实验中,我们首次证明了面部模式可以用来评估接受DBS治疗的帕金森病患者的改善情况。我们进一步实现了一个隐私保护的深度面部诊断框架,该框架可以达到与非加密框架相同的准确性,显示了我们的隐私保护面部诊断作为可信边缘服务用于患者PD严重程度分级的潜力。
摘要:Facial phenotyping has recently been successfully exploited for medical diagnosis as a novel way to diagnose a range of diseases, where facial biometrics has been revealed to have rich links to underlying genetic or medical causes. In this paper, taking Parkinson's Diseases (PD) as a case study, we proposed an Artificial-Intelligence-of-Things (AIoT) edge-oriented privacy-preserving facial diagnosis framework to analyze the treatment of Deep Brain Stimulation (DBS) on PD patients. In the proposed framework, a new edge-based information theoretically secure framework is proposed to implement private deep facial diagnosis as a service over a privacy-preserving AIoT-oriented multi-party communication scheme, where partial homomorphic encryption (PHE) is leveraged to enable privacy-preserving deep facial diagnosis directly on encrypted facial patterns. In our experiments with a collected facial dataset from PD patients, for the first time, we demonstrated that facial patterns could be used to valuate the improvement of PD patients undergoing DBS treatment. We further implemented a privacy-preserving deep facial diagnosis framework that can achieve the same accuracy as the non-encrypted one, showing the potential of our privacy-preserving facial diagnosis as a trustworthy edge service for grading the severity of PD in patients.
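The PHE primitive the abstract relies on is additive homomorphism: sums can be computed on ciphertexts without decrypting the inputs. A toy Paillier instance with tiny primes (deliberately insecure, for illustration only; a real deployment would use a vetted library and 2048-bit moduli) demonstrates the property:

```python
import math
import random

# Toy Paillier cryptosystem (tiny primes, NOT secure) illustrating the
# additive homomorphism that PHE gives private inference: multiplying two
# ciphertexts modulo n^2 adds the underlying plaintexts.

p, q = 17, 19
n, n2 = p * q, (p * q) ** 2
g = n + 1                        # standard simplified generator
lam = math.lcm(p - 1, q - 1)     # Carmichael's lambda(n)
mu = pow(lam, -1, n)             # modular inverse of lam mod n

def encrypt(m, rng=random.Random(1)):   # fixed rng for reproducibility
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n recovers m*lam mod n; multiply by mu to get m.
    return ((pow(c, lam, n2) - 1) // n) * mu % n

c1, c2 = encrypt(41), encrypt(58)
# Ciphertext multiplication mod n^2 == plaintext addition:
total = decrypt((c1 * c2) % n2)
```

This is why an edge server can accumulate encrypted feature contributions (e.g. weighted sums in a linear layer) without ever seeing the plaintext facial patterns.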
【44】 DRAS-CQSim: A Reinforcement Learning based Framework for HPC Cluster Scheduling
标题:DRAS-CQSim:一种基于强化学习的HPC集群调度框架
作者:Yuping Fan,Zhiling Lan
机构:Illinois Institute of Technology, Chicago, IL
备注:None
链接:https://arxiv.org/abs/2105.07526
摘要:几十年来,系统管理员一直致力于设计和调整集群调度策略,以提高高性能计算(HPC)系统的性能。然而,日益复杂的HPC系统与高度多样化的工作负载相结合,使得这种手动过程具有挑战性、耗时性和易出错性。提出了一种基于强化学习的高性能计算机调度框架DRAS-CQSim,用于自动学习最优调度策略。DRAS-CQSim封装了仿真环境、代理、超参数优化选项和不同的强化学习算法,使系统管理员能够快速获得定制的调度策略。
摘要:For decades, system administrators have been striving to design and tune cluster scheduling policies to improve the performance of high performance computing (HPC) systems. However, the increasingly complex HPC systems combined with highly diverse workloads make such manual process challenging, time-consuming, and error-prone. We present a reinforcement learning based HPC scheduling framework named DRAS-CQSim to automatically learn optimal scheduling policy. DRAS-CQSim encapsulates simulation environments, agents, hyperparameter tuning options, and different reinforcement learning algorithms, which allows the system administrators to quickly obtain customized scheduling policies.
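The reinforcement-learning loop such a framework automates can be reduced to a minimal example: an agent repeatedly chooses which job to schedule, observes a reward, and updates a value table. The toy below is a one-state tabular Q-learning sketch, far simpler than DRAS-CQSim's actual deep-RL agents and simulated cluster state, and intended only to show the update rule.

```python
import random

# Toy tabular Q-learning for a one-step scheduling decision: which of two
# jobs to run first. Reward is negative runtime, so the agent should learn
# to prefer the short job. Illustrative only; real HPC schedulers have rich
# state (queue contents, node availability) and use deep RL.

random.seed(0)
jobs = {"short": 2, "long": 8}          # job -> runtime
q = {a: 0.0 for a in jobs}              # action-value table
alpha, epsilon = 0.1, 0.2               # learning rate, exploration rate

for episode in range(200):
    # Epsilon-greedy action selection.
    if random.random() < epsilon:
        action = random.choice(list(jobs))
    else:
        action = max(q, key=q.get)
    reward = -jobs[action]              # shorter jobs waste less time
    q[action] += alpha * (reward - q[action])   # one-step Q update

best = max(q, key=q.get)                # learned greedy policy
```

After training, `q` approximates the true returns (−2 and −8), so the greedy policy schedules the short job first.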
【45】 Graph-Free Knowledge Distillation for Graph Neural Networks
标题:面向图神经网络的无图知识蒸馏
作者:Xiang Deng,Zhongfei Zhang
机构:State University of New York at Binghamton
链接:https://arxiv.org/abs/2105.07519
摘要:知识蒸馏(Knowledge Distillation,KD)通过强制学生模仿预训练教师在训练数据上的输出,将知识从教师网络传递给学生网络。然而,由于数据量大、隐私性或保密性,在许多情况下数据样本并不总是可获取的。针对这一问题,已有许多工作面向卷积神经网络(CNN),其输入位于连续空间的网格域中(如图像和视频),但在很大程度上忽略了处理离散空间中具有不同拓扑结构的非网格数据的图神经网络(GNN)。它们的输入之间的固有差异使得这些基于CNN的方法不适用于GNN。在本文中,我们提出了据我们所知第一个无需图数据即可从GNN中蒸馏知识的专门方法。所提出的无图KD(GFKD)用多项分布对图拓扑结构建模,从而学习用于知识迁移的图拓扑结构。然后我们引入一个梯度估计器来优化这个框架。从本质上讲,关于图结构的梯度仅通过GNN前向传播而无需反向传播即可获得,这意味着GFKD与DGL和Geometric等现代GNN库兼容。此外,我们还提供了处理图数据或GNN中不同类型先验知识的策略。大量实验表明,GFKD在无需训练数据的情况下实现了从GNN中蒸馏知识的最新性能。
摘要:Knowledge distillation (KD) transfers knowledge from a teacher network to a student by enforcing the student to mimic the outputs of the pretrained teacher on training data. However, data samples are not always accessible in many cases due to large data sizes, privacy, or confidentiality. Many efforts have been made on addressing this problem for convolutional neural networks (CNNs) whose inputs lie in a grid domain within a continuous space such as images and videos, but largely overlook graph neural networks (GNNs) that handle non-grid data with different topology structures within a discrete space. The inherent differences between their inputs make these CNN-based approaches not applicable to GNNs. In this paper, we propose to our best knowledge the first dedicated approach to distilling knowledge from a GNN without graph data. The proposed graph-free KD (GFKD) learns graph topology structures for knowledge transfer by modeling them with multinomial distribution. We then introduce a gradient estimator to optimize this framework. Essentially, the gradients w.r.t. graph structures are obtained by only using GNN forward-propagation without back-propagation, which means that GFKD is compatible with modern GNN libraries such as DGL and Geometric. Moreover, we provide the strategies for handling different types of prior knowledge in the graph data or the GNNs. Extensive experiments demonstrate that GFKD achieves the state-of-the-art performance for distilling knowledge from GNNs without training data.
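Getting gradients with respect to a discrete distribution's parameters using only forward evaluations is typically done with a score-function (REINFORCE-style) identity: ∇θ E[f(x)] = E[f(x) ∇θ log p(x)]. The sketch below computes this exactly by enumeration for a single categorical variable standing in for a graph-structure choice; GFKD's actual estimator and multinomial parameterization may differ, and `f` here is just a black-box stand-in for the teacher's score.

```python
import math

# Score-function gradient sketch: gradient of E_{x~softmax(theta)}[f(x)]
# w.r.t. theta, using only forward evaluations of f (no backprop through f).
# For softmax, d log p(x) / d theta_k = 1[x == k] - p_k.

def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    s = sum(e)
    return [x / s for x in e]

def score_function_grad(theta, f):
    """Exact-by-enumeration: grad_k = sum_x p(x) f(x) (1[x=k] - p_k)."""
    p = softmax(theta)
    ks = range(len(theta))
    return [sum(p[x] * f(x) * ((1.0 if x == k else 0.0) - p[k])
                for x in ks) for k in ks]

theta = [0.0, 1.0, -1.0]
f = lambda x: [2.0, -1.0, 0.5][x]   # forward-only black-box objective
grad = score_function_grad(theta, f)
```

In GFKD the analogous quantity is estimated by sampling graph structures, evaluating the teacher GNN forward on each sample, and weighting the score function by the teacher's output.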
【46】 Decision Making with Differential Privacy under a Fairness Lens
标题:公平镜头下的差分隐私决策
作者:Ferdinando Fioretto,Cuong Tran,Pascal Van Hentenryck
机构:Syracuse University, Georgia Institute of Technology
备注:This paper is an extended version of the homonymous one, accepted at IJCAI-21
链接:https://arxiv.org/abs/2105.07513
摘要:美国人口普查局(U.S. Census Bureau)等机构发布关于个人群体的数据集和统计数据,这些数据被用作许多关键决策过程的输入。为了符合隐私和保密要求,这些机构经常被要求发布数据的隐私保护版本。本文从公平的角度研究差分隐私数据集的发布,并分析它们对一些关键资源分配任务的影响。本文表明,当决策以差分隐私数据作为输入时,为实现隐私而添加的噪声对某些群体的影响不成比例。本文分析了这些不成比例影响的原因,并提出了减轻这些影响的准则。所提出的方法在使用差分隐私人口普查数据的关键决策问题上进行了评估。
摘要:Agencies, such as the U.S. Census Bureau, release data sets and statistics about groups of individuals that are used as input to a number of critical decision processes. To conform to privacy and confidentiality requirements, these agencies are often required to release privacy-preserving versions of the data. This paper studies the release of differentially private data sets and analyzes their impact on some critical resource allocation tasks under a fairness perspective. The paper shows that, when decisions take differentially private data as input, the noise added to achieve privacy disproportionately impacts some groups over others. The paper analyzes the reasons for these disproportionate impacts and proposes guidelines to mitigate these effects. The proposed approaches are evaluated on critical decision problems that use differentially private census data.
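The standard building block behind such differentially private releases is the Laplace mechanism: add noise with scale sensitivity/ε to a query answer. A minimal sketch (using inverse-CDF sampling so it is fully deterministic under a seeded generator):

```python
import math
import random

# Laplace mechanism sketch: a counting query has sensitivity 1 (one person
# changes the count by at most 1), so adding Laplace(0, 1/epsilon) noise
# gives epsilon-differential privacy. Smaller epsilon -> stronger privacy
# but noisier counts, which is exactly the accuracy/fairness tension the
# paper studies when the noisy counts feed allocation decisions.

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    b = sensitivity / epsilon                    # Laplace scale
    u = rng.random() - 0.5                       # uniform on (-0.5, 0.5)
    noise = -b * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

rng = random.Random(42)
private_count = laplace_mechanism(1000, sensitivity=1, epsilon=0.1, rng=rng)
```

The paper's point is downstream of this step: when these noisy counts drive proportional allocations, groups with small true counts suffer relatively larger distortions.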
【47】 Substitutional Neural Image Compression
标题:替代神经图像压缩
作者:Xiao Wang,Wei Jiang,Wei Wang,Shan Liu,Brian Kulis,Peter Chin
链接:https://arxiv.org/abs/2105.07512
摘要:我们描述了替代神经图像压缩(SNIC),一种增强任何神经图像压缩模型的通用方法,它不需要数据,也不需要对已训练模型进行额外调优。它面向灵活的失真度量提升压缩性能,并使用单个模型实例实现码率控制。其核心思想是用一个替代图像来代替要压缩的图像,该替代图像以期望的方式优于原始图像。对于传统编解码器来说,找到这样一个替代品本来就很困难,但得益于神经压缩模型完全可微的结构,这对它们来说却出人意料地容易。通过将特定损失的梯度反向传播到输入,可以高效地迭代构造所需的替代图像。我们证明了SNIC与各种神经压缩模型和目标度量相结合时,在提高压缩质量和执行由率失真曲线度量的码率控制方面的有效性。文中还讨论了控制精度和生成速度的实验结果。
摘要:We describe Substitutional Neural Image Compression (SNIC), a general approach for enhancing any neural image compression model, that requires no data or additional tuning of the trained model. It boosts compression performance toward a flexible distortion metric and enables bit-rate control using a single model instance. The key idea is to replace the image to be compressed with a substitutional one that outperforms the original one in a desired way. Finding such a substitute is inherently difficult for conventional codecs, yet surprisingly favorable for neural compression models thanks to their fully differentiable structures. With gradients of a particular loss backpropagated to the input, a desired substitute can be efficiently crafted iteratively. We demonstrate the effectiveness of SNIC, when combined with various neural compression models and target metrics, in improving compression quality and performing bit-rate control measured by rate-distortion curves. Empirical results of control precision and generation speed are also discussed.
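The core move above is gradient descent on the *input* rather than on model weights. A scalar toy with an analytic gradient shows the mechanics; the "codec" here is a stand-in lossy scaling and the rate proxy is invented for illustration, not SNIC's actual loss.

```python
# SNIC-style input optimization on a scalar toy "codec": keep the trained
# model fixed and descend on the input x_sub so that its reconstruction
# better matches the original x_orig under a rate penalty.

A = 0.8       # toy codec: reconstruction = A * input (fixed, "trained")
LAM = 0.01    # rate-distortion trade-off weight

def loss(x_sub, x_orig):
    distortion = (A * x_sub - x_orig) ** 2
    rate = x_sub ** 2          # toy stand-in for bit cost
    return distortion + LAM * rate

def grad(x_sub, x_orig):       # analytic d(loss)/d(x_sub)
    return 2 * A * (A * x_sub - x_orig) + 2 * LAM * x_sub

x_orig = 1.0
x_sub, lr = x_orig, 0.1        # start the substitute at the original image
losses = [loss(x_sub, x_orig)]
for _ in range(100):
    x_sub -= lr * grad(x_sub, x_orig)
    losses.append(loss(x_sub, x_orig))
```

With a real neural codec the gradient comes from autodiff through the encoder-decoder, and raising or lowering the rate weight on a per-image basis is what gives single-model bit-rate control.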
【48】 Doc2Dict: Information Extraction as Text Generation
标题:Doc2Dict:作为文本生成的信息提取
作者:Benjamin Townsend,Eamon Ito-Fisher,Lily Zhang,Madison May
机构:∗Indico Data Solutions, †Franklin W. Olin College of Engineering, ‡New York University
链接:https://arxiv.org/abs/2105.07510
摘要:通常,信息提取(IE)需要一种流水线方法:首先,在手工标注的文档上训练序列标记模型,以提取相关的跨度;然后,当一个新文档到达时,模型预测跨度,然后对跨度进行后处理和标准化,将信息转换为数据库条目。我们将这个劳动密集型的工作流替换为一个在现有数据库记录上训练的transformer语言模型,以直接生成结构化JSON。我们的解决方案消除了与生成令牌级注释相关的工作负载,并利用了通常非常丰富的数据源(例如数据库记录)。由于长文档在信息提取任务中很常见,因此我们使用梯度检查点和分块编码将我们的方法应用于单个GPU上多达32000个令牌的序列。我们的Doc2Dict方法与更复杂、手工设计的管道具有竞争力,并为文档级信息提取提供了一个简单但有效的基线。我们发布了Doc2Dict模型和代码来重现我们的实验,并促进未来的工作。
摘要:Typically, information extraction (IE) requires a pipeline approach: first, a sequence labeling model is trained on manually annotated documents to extract relevant spans; then, when a new document arrives, a model predicts spans which are then post-processed and standardized to convert the information into a database entry. We replace this labor-intensive workflow with a transformer language model trained on existing database records to directly generate structured JSON. Our solution removes the workload associated with producing token-level annotations and takes advantage of a data source which is generally quite plentiful (e.g. database records). As long documents are common in information extraction tasks, we use gradient checkpointing and chunked encoding to apply our method to sequences of up to 32,000 tokens on a single GPU. Our Doc2Dict approach is competitive with more complex, hand-engineered pipelines and offers a simple but effective baseline for document-level information extraction. We release our Doc2Dict model and code to reproduce our experiments and facilitate future work.
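The chunked encoding mentioned above amounts to splitting a long token sequence into overlapping windows so a fixed-context model can cover the whole document. A minimal sketch (toy sizes; the paper's actual chunk length and overlap are not stated in the abstract):

```python
# Chunked-encoding sketch: split a long token sequence into overlapping
# windows. Assumes overlap < chunk_size. Overlap gives each chunk some
# context from its neighbour so entities spanning a boundary aren't lost.

def chunk_tokens(tokens, chunk_size, overlap):
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(10))                      # toy "document"
chunks = chunk_tokens(tokens, chunk_size=4, overlap=1)
```

Combined with gradient checkpointing (trading recomputation for activation memory), this is what lets a single GPU handle sequences in the tens of thousands of tokens.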
【49】 Abstraction, Validation, and Generalization for Explainable Artificial Intelligence
标题:可解释人工智能的抽象、验证和泛化
作者:Scott Cheng-Hsin Yang,Tomas Folke,Patrick Shafto
机构:Department of Mathematics and Computer Science, Rutgers University, Warren Street, Newark, NJ
链接:https://arxiv.org/abs/2105.07508
摘要:神经网络架构在不断扩大的任务范围内实现了超人的性能。为了有效和安全地部署这些系统,它们的决策必须能被广泛的利益相关者理解。解释人工智能的方法已经被提出来应对这一挑战,但理论的缺乏阻碍了系统性抽象的发展,而系统性抽象是积累知识所必需的。我们提出贝叶斯教学作为一个框架,通过整合机器学习和人类学习来统一可解释人工智能(XAI)。贝叶斯教学将解释形式化为解释者改变被解释者信念的一种交流行为。这种形式化将任何XAI方法分解为四个部分:(1)要解释的推理,(2)解释介质,(3)被解释者模型,和(4)解释者模型。贝叶斯教学提供的对任何XAI方法进行分解的抽象阐明了它们之间的不变性。XAI系统的分解实现了模块化验证,因为前三个组件中的每一个都可以半独立地进行测试。这种分解还通过重组来自不同XAI系统的组件来促进泛化,这有助于生成新的变体。只要每个组件都经过验证,这些新的变体就不需要逐个进行评估,从而使开发时间呈指数级减少。最后,通过明确解释的目标,贝叶斯教学帮助开发人员评估XAI系统对其预期的实际用例的适用性。因此,贝叶斯教学提供了一个理论框架,鼓励对XAI进行系统的、科学的研究。
摘要:Neural network architectures are achieving superhuman performance on an expanding range of tasks. To effectively and safely deploy these systems, their decision-making must be understandable to a wide range of stakeholders. Methods to explain AI have been proposed to answer this challenge, but a lack of theory impedes the development of systematic abstractions which are necessary for cumulative knowledge gains. We propose Bayesian Teaching as a framework for unifying explainable AI (XAI) by integrating machine learning and human learning. Bayesian Teaching formalizes explanation as a communication act of an explainer to shift the beliefs of an explainee. This formalization decomposes any XAI method into four components: (1) the inference to be explained, (2) the explanatory medium, (3) the explainee model, and (4) the explainer model. The abstraction afforded by Bayesian Teaching to decompose any XAI method elucidates the invariances among them. The decomposition of XAI systems enables modular validation, as each of the first three components listed can be tested semi-independently. This decomposition also promotes generalization through recombination of components from different XAI systems, which facilitates the generation of novel variants. These new variants need not be evaluated one by one provided that each component has been validated, leading to an exponential decrease in development time. Finally, by making the goal of explanation explicit, Bayesian Teaching helps developers to assess how suitable an XAI system is for its intended real-world use case. Thus, Bayesian Teaching provides a theoretical framework that encourages systematic, scientific investigation of XAI.
【50】 Uncertainty in Minimum Cost Multicuts for Image and Motion Segmentation
标题:图像和运动分割中最小代价多切割的不确定性
作者:Amirhossein Kardoost,Margret Keuper
机构:Data and Web Science Group, University of Mannheim, Germany
备注:Accepted in the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021)
链接:https://arxiv.org/abs/2105.07469
摘要:最小代价提升多切割(lifted multicut)方法已在图像分解、网格分割、多目标跟踪和运动分割等广泛应用中被证明具有良好的实用性能。它在一个基于图的模型中求解这类问题:实体之间的边被赋予实值代价,使得最小割将图分解为最优数量的片段。基于最小代价多切割的概率表述,我们为优化过程中所做决策的不确定性提供了一种度量。我们认为,获取这些不确定性对许多实际应用至关重要,并在图像分解(BSDS-500)和运动分割(DAVIS2016和FBMS59)的背景下,以信息变差(VI)和兰德指数(RI)为指标,在三个广泛使用的数据集上通过稀疏化进行了评估。
摘要:The minimum cost lifted multicut approach has proven practically good performance in a wide range of applications such as image decomposition, mesh segmentation, multiple object tracking, and motion segmentation. It addresses such problems in a graph-based model, where real-valued costs are assigned to the edges between entities such that the minimum cut decomposes the graph into an optimal number of segments. Driven by a probabilistic formulation of minimum cost multicuts, we provide a measure for the uncertainties of the decisions made during the optimization. We argue that access to such uncertainties is crucial for many practical applications and conduct an evaluation by means of sparsifications on three different, widely used datasets in the context of image decomposition (BSDS-500) and motion segmentation (DAVIS2016 and FBMS59) in terms of variation of information (VI) and Rand index (RI).
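Under a probabilistic reading of multicuts, a real-valued edge cost can be interpreted as the logit of the probability that the edge is cut, and the entropy of that probability quantifies how uncertain the cut/join decision is. The sketch below illustrates that intuition only; the paper's concrete measure and its use in sparsification may differ.

```python
import math

# Edge-decision uncertainty sketch for a probabilistic multicut reading:
# cost -> cut probability via the logistic function, then binary entropy
# of that probability as the decision uncertainty (in bits).

def cut_probability(cost):
    return 1.0 / (1.0 + math.exp(-cost))

def decision_uncertainty(cost):
    p = cut_probability(cost)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Near-zero cost: a coin-flip decision (~1 bit of uncertainty).
u_ambiguous = decision_uncertainty(0.0)
# Large-magnitude cost: a confident decision (entropy near 0).
u_confident = decision_uncertainty(6.0)
```

Sparsification then means discarding the highest-entropy (least certain) decisions first and measuring how VI and RI improve on the remainder.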
【51】 Few-NERD: A Few-Shot Named Entity Recognition Dataset
标题:Few-NERD:一个小样本命名实体识别数据集
作者:Ning Ding,Guangwei Xu,Yulin Chen,Xiaobin Wang,Xu Han,Pengjun Xie,Hai-Tao Zheng,Zhiyuan Liu
机构:Department of Computer Science and Technology, Tsinghua University, Alibaba Group,Shenzhen International Graduate School, Tsinghua University
备注:Accepted by ACL-IJCNLP 2021, accepted version
链接:https://arxiv.org/abs/2105.07464
摘要:近年来,围绕小样本命名实体识别(few-shot NER)这一主题已有大量文献,但很少有专门针对这一实用且具有挑战性任务的公开基准数据。目前的方法是收集现有的有监督NER数据集,并将其重新组织为小样本设置进行实证研究。这些策略通常旨在用少量样例识别粗粒度实体类型,而实际上,大多数未见过的实体类型都是细粒度的。本文提出了Few-NERD,一个大规模人工标注的小样本NER数据集,具有8种粗粒度和66种细粒度实体类型的层次结构。Few-NERD包含来自维基百科的188,238个句子,共计4,601,160个单词,每个单词都被标注为上下文或两级实体类型的一部分。据我们所知,这是第一个小样本NER数据集,也是最大的人工构建NER数据集。我们构建了不同侧重点的基准任务,以综合评估模型的泛化能力。大量的实证结果和分析表明,Few-NERD具有挑战性,这一问题需要进一步研究。我们已将Few-NERD公开于https://ningding97.github.io/fewnerd/。
摘要:Recently, considerable literature has grown up around the theme of few-shot named entity recognition (NER), but little published benchmark data specifically focused on the practical and challenging task. Current approaches collect existing supervised NER datasets and re-organize them to the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice, most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia, 4,601,160 words are included and each is annotated as context or a part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.
【52】 3D to 4D Facial Expressions Generation Guided by Landmarks
标题:基于地标的3D到4D人脸表情生成
作者:Naima Otberdout,Claudio Ferrari,Mohamed Daoudi,Stefano Berretti,Alberto Del Bimbo
机构:Univ. Lille, CNRS, Centrale Lille, UMR , CRIStAL, F-, Lille, France, Media Integration ad Communication Center, University of Florence, Italy, IMT Lille Douai, Institut Mines-T´el´ecom, Univ. Lille, Centre for Digital Systems, F-, Lille, France
链接:https://arxiv.org/abs/2105.07463
摘要:近年来,基于深度学习的三维人脸生成取得了一定进展,而动态三维(4D)人脸表情合成的研究却相对较少。本文针对以下问题提出了一种新的解决方案:给定一个输入的三维中性人脸,能否从中生成动态的三维(4D)面部表情?为了解决这个问题,我们首先提出了一个网格编码器-解码器架构(Expr-ED),它利用一组三维地标从中性人脸生成一个富有表现力的三维人脸。然后,我们通过使用能够从表情标签生成三维地标序列的流形值GAN(Motion3DGAN)对面部表情的时间动态建模,将其扩展到4D。生成的地标被输入到网格编码器-解码器中,最终生成一系列三维表情人脸。通过解耦这两个步骤,我们分别处理了由网格变形和运动动态引起的非线性。在CoMA数据集上的实验结果表明,与其他基于地标的三维拟合方法相比,我们由地标引导的网格编码器-解码器带来了显著的改进,并且我们能够生成高质量的动态面部表情。该框架进一步使三维表情强度能够从低到高连续调节。最后,我们展示了该框架可以应用于其他任务,如二维到三维的面部表情迁移。
摘要:While deep learning-based 3D face generation has made a progress recently, the problem of dynamic 3D (4D) facial expression synthesis is less investigated. In this paper, we propose a novel solution to the following question: given one input 3D neutral face, can we generate dynamic 3D (4D) facial expressions from it? To tackle this problem, we first propose a mesh encoder-decoder architecture (Expr-ED) that exploits a set of 3D landmarks to generate an expressive 3D face from its neutral counterpart. Then, we extend it to 4D by modeling the temporal dynamics of facial expressions using a manifold-valued GAN capable of generating a sequence of 3D landmarks from an expression label (Motion3DGAN). The generated landmarks are fed into the mesh encoder-decoder, ultimately producing a sequence of 3D expressive faces. By decoupling the two steps, we separately address the non-linearity induced by the mesh deformation and motion dynamics. The experimental results on the CoMA dataset show that our mesh encoder-decoder guided by landmarks brings a significant improvement with respect to other landmark-based 3D fitting approaches, and that we can generate high quality dynamic facial expressions. This framework further enables the 3D expression intensity to be continuously adapted from low to high intensity. Finally, we show our framework can be applied to other tasks, such as 2D-3D facial expression transfer.
【53】 How Can Robots Trust Each Other? A Relative Needs Entropy Based Trust Assessment Model
标题:机器人如何相互信任?一种基于相对需求熵的信任评估模型
作者:Qin Yang,Ramviyas Parasuraman
机构: Department of Computer Science, University of Georgia
备注:This paper already submitted to the SMC 2021 conference
链接:https://arxiv.org/abs/2105.07443
摘要:多智能体和多机器人系统中的协作可以帮助智能体构建各种队形、形状和模式,以适应不同情况并呈现相应的功能和目的。智能体之间的空间邻近性和功能相似性等关系在它们的协作中起着至关重要的作用。与人类一样,智能体之间的信任水平是评价其关系可靠性和稳定性的重要因素。本文提出了一种新的机器人智能体间信任评估模型——相对需求熵(RNE)。RNE度量单个智能体或智能体组之间需求分布的距离。为了验证其实用性,我们在一项包含两个难度等级任务的持久性城市搜索救援任务中,通过模拟异构多机器人分组任务来实现并演示我们的信任模型。结果表明,与最新的基于能量或基于距离的分组模型相比,基于RNE信任的机器人分组可以在多样化任务执行中获得更好的性能和适应性。
摘要:Cooperation in multi-agent and multi-robot systems can help agents build various formations, shapes, and patterns presenting corresponding functions and purposes adapting to different situations. Relationship between agents such as their spatial proximity and functional similarities could play a crucial role in cooperation between agents. Trust level between agents is an essential factor in evaluating their relationships' reliability and stability, much as people do. This paper proposes a new model called Relative Needs Entropy (RNE) to assess trust between robotic agents. RNE measures the distance of needs distribution between individual agents or groups of agents. To exemplify its utility, we implement and demonstrate our trust model through experiments simulating a heterogeneous multi-robot grouping task in a persistent urban search and rescue mission consisting of tasks at two levels of difficulty. The results suggest that RNE trust-based grouping of robots can achieve better performance and adaptability for diverse task execution compared to the state-of-the-art energy-based or distance-based grouping models.
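RNE is described as a distance between agents' "needs distributions". The abstract does not give the exact definition, so the sketch below uses the Jensen-Shannon divergence, a standard symmetric and bounded way to compare two discrete distributions, purely as an illustration of the kind of quantity involved.

```python
import math

# Illustrative needs-distribution distance between two robots using the
# Jensen-Shannon divergence (in bits): symmetric, zero iff identical, and
# bounded above by 1. This is a stand-in, not the paper's RNE formula.

def kl(p, q):
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical needs over (energy, compute, bandwidth) for two robots.
robot_a = [0.6, 0.3, 0.1]
robot_b = [0.2, 0.3, 0.5]
d = js_divergence(robot_a, robot_b)
```

Under such a measure, robots with similar needs profiles (small distance) would be grouped together as higher-trust partners.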
【54】 Curiosity-driven Intuitive Physics Learning
标题:好奇心驱动下的直观物理学习
作者:Tejas Gaikwad,Romi Banerjee
机构:Dept. of Computer Science and Engineering, Indian Institute of Technology Jodhpur, Rajasthan, India
链接:https://arxiv.org/abs/2105.07426
摘要:生物婴儿天生好奇,试图通过与周围不同的物体(主要是宏观的固体物体)以各种各样的多感官方式相互作用来理解他们的物理环境。通过他们的各种互动,他们建立假设和预测,并最终学习、推断和理解这些物体的物理特征和行为的本质。受此启发,我们提出了一个真实世界人工智能代理的好奇心驱动学习和推理模型。这个模型是基于好奇心的激发,从基本宏观固体物理参数(即形状恒常性、时空连续性和物体永久性)的不连续性的观察中得出的。我们用身体预算这个术语来表示固体物体的基本属性。该模型的目的是支持模拟从零开始的学习,然后通过经验,无论领域,在真实世界的人工智能代理的实证。
摘要:Biological infants are naturally curious and try to comprehend their physical surroundings by interacting, in myriad multisensory ways, with different objects - primarily macroscopic solid objects - around them. Through their various interactions, they build hypotheses and predictions, and eventually learn, infer and understand the nature of the physical characteristics and behavior of these objects. Inspired thus, we propose a model for curiosity-driven learning and inference for real-world AI agents. This model is based on the arousal of curiosity, deriving from observations along discontinuities in the fundamental macroscopic solid-body physics parameters, i.e., shape constancy, spatial-temporal continuity, and object permanence. We use the term body-budget to represent the perceived fundamental properties of solid objects. The model aims to support the emulation of learning from scratch followed by substantiation through experience, irrespective of domain, in real-world AI agents.
【55】 Resource Planning for Hospitals Under Special Consideration of the COVID-19 Pandemic: Optimization and Sensitivity Analysis
标题:冠状病毒大流行特殊考虑下的医院资源规划:优化与敏感度分析
作者:Thomas Bartz-Beielstein,Marcel Dröscher,Alpar Gür,Alexander Hinterleitner,Olaf Mersmann,Dessislava Peeva,Lennard Reese,Nicolas Rehbach,Frederik Rehbach,Amrita Sen,Aleksandr Subbotin,Martin Zaefferer
机构:TH Köln, Cologne, Germany
链接:https://arxiv.org/abs/2105.07420
摘要:像COVID-19大流行这样的危机对医疗卫生机构构成了严峻挑战。它们需要规划处理增加的负载所需的资源,例如病床和呼吸机。为了支持科隆地区地方卫生当局的资源规划,我们创建了基于离散事件模拟的容量规划工具BaBSim.Hospital。模拟的预测质量由29个参数决定。这些参数的合理默认值是在与医学专业人员的详细讨论中获得的。我们的目标是研究并优化这些参数,以改进BaBSim.Hospital。最初采用"开箱即用"优化算法的尝试失败了。实现基于代理模型的优化方法在合理的时间内产生了有用的结果。为了理解该算法的行为并获得对适应度景观的有价值的见解,我们进行了深入的敏感性分析。敏感性分析对优化过程至关重要,因为它允许将优化集中在最重要的参数上。我们说明了这如何在不影响结果准确性的情况下降低问题维度。所提出的方法适用于许多其他实际问题,例如开发覆盖最后一英里的新型电梯系统,或模拟学业学习阶段的学生流。
摘要:Crises like the COVID-19 pandemic pose a serious challenge to health-care institutions. They need to plan the resources required for handling the increased load, for instance, hospital beds and ventilators. To support the resource planning of local health authorities from the Cologne region, BaBSim.Hospital, a tool for capacity planning based on discrete event simulation, was created. The predictive quality of the simulation is determined by 29 parameters. Reasonable default values of these parameters were obtained in detailed discussions with medical professionals. We aim to investigate and optimize these parameters to improve BaBSim.Hospital. First approaches with "out-of-the-box" optimization algorithms failed. Implementing a surrogate-based optimization approach generated useful results in a reasonable time. To understand the behavior of the algorithm and to get valuable insights into the fitness landscape, an in-depth sensitivity analysis was performed. The sensitivity analysis is crucial for the optimization process because it allows focusing the optimization on the most important parameters. We illustrate how this reduces the problem dimension without compromising the resulting accuracy. The presented approach is applicable to many other real-world problems, e.g., the development of new elevator systems to cover the last mile or simulation of student flow in academic study periods.
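Surrogate-based optimization replaces expensive simulator runs with a cheap fitted model: evaluate the simulation at a few design points, fit the surrogate, and move to the surrogate's minimizer. A 1-D sketch with an exact quadratic surrogate (the real BaBSim.Hospital problem is 29-dimensional and would use e.g. Kriging; the "simulation" below is a toy stand-in):

```python
# Surrogate-based optimization sketch: fit a quadratic through three
# evaluations of an expensive simulation and jump to its minimiser.

def expensive_simulation(x):          # toy stand-in for a costly simulator
    return (x - 3.0) ** 2 + 1.0

def fit_quadratic(xs, ys):
    """Exact quadratic a*x^2 + b*x + c through three points
    (Newton divided differences)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    a = ((y2 - y0) / (x2 - x0) - (y1 - y0) / (x1 - x0)) / (x2 - x1)
    b = (y1 - y0) / (x1 - x0) - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c

xs = [0.0, 2.0, 5.0]                  # initial design points
ys = [expensive_simulation(x) for x in xs]
a, b, c = fit_quadratic(xs, ys)
x_next = -b / (2 * a)                 # surrogate minimiser: next point to try
```

In practice this loop repeats: the new evaluation is added to the design set and the surrogate is refit, which is why sensitivity analysis (pruning unimportant parameters) pays off so directly.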
【56】 Uncertainty Measurement of Basic Probability Assignment Integrity Based on Approximate Entropy in Evidence Theory
标题:基于证据理论近似熵的基本概率赋值完整性不确定性度量
作者:Tianxiang Zhan,Yuanpeng He,Hanwen Li,Fuyuan Xiao
机构:School of Computer and Information Science, Southwest University, Chongqing, China
链接:https://arxiv.org/abs/2105.07382
摘要:证据理论认为,对概率的扩展可以更好地处理未知和不精确的信息。不确定性度量在证据理论和概率论中都占有重要地位。Pincus提出近似熵(ApEn)来描述复杂系统的不规则性:时间序列越不规则,近似熵越大。网络的ApEn表示一个网络生成新节点的能力,或存在未发现节点的可能性。通过网络特征与基本概率赋值(BPA)的关联,可以得到BPA关于完整性的不确定性度量。本文的主要贡献是定义了基本概率赋值的完整性,并在此基础上提出BPA的近似熵来度量BPA完整性的不确定性。所提方法基于逻辑网络结构来计算证据理论中BPA的不确定性。基于该方法的不确定性刻画了BPA完整性的不确定性,有助于识别BPA的可信度。
摘要:Evidence theory holds that the extension of probability can better deal with unknowns and inaccurate information. Uncertainty measurement plays a vital role in both evidence theory and probability theory. Approximate Entropy (ApEn) was proposed by Pincus to describe the irregularities of complex systems. The more irregular the time series, the greater the approximate entropy. The ApEn of a network represents the ability of a network to generate new nodes, or the possibility of undiscovered nodes. Through the association of network characteristics and basic probability assignment (BPA), a measure of the uncertainty of BPA regarding completeness can be obtained. The main contribution of this paper is to define the integrity of the basic probability assignment; the approximate entropy of the BPA is then proposed to measure the uncertainty of the integrity of the BPA. The proposed method is based on the logical network structure to calculate the uncertainty of BPA in evidence theory. The uncertainty based on the proposed method represents the uncertainty of integrity of BPA and contributes to the identification of the credibility of BPA.
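作为参照,Pincus的近似熵可按如下通用方式计算(这是时间序列ApEn的标准定义草图,参数m、r为常用取值;论文中针对BPA与网络结构的具体定义请以原文为准):

```python
import numpy as np

def apen(u, m=2, r=0.2):
    """Approximate Entropy (Pincus): larger values mean a more irregular sequence."""
    u = np.asarray(u, dtype=float)
    def phi(m):
        n = len(u) - m + 1
        x = np.array([u[i:i + m] for i in range(n)])               # all length-m windows
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)  # Chebyshev distances
        c = np.sum(d <= r, axis=1) / n                             # similar-window fractions
        return np.mean(np.log(c))
    return phi(m) - phi(m + 1)

regular = [0.0] * 60                                           # perfectly regular signal
noisy = list(np.random.default_rng(1).standard_normal(200))    # white noise
```

对完全规则的序列,ApEn为0;对白噪声,ApEn明显更大,印证了“越不规则、近似熵越大”。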
【57】 Set2setRank: Collaborative Set to Set Ranking for Implicit Feedback based Recommendation
标题:Set2setRank:协同设置隐式反馈推荐排名
作者:Lei Chen,Le Wu,Kun Zhang,Richang Hong,Meng Wang
机构: Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology,China, School of Computer Science and Information Engineering, Hefei University of Technology,China
备注:The paper is accepted by SIGIR 2021
链接:https://arxiv.org/abs/2105.07377
摘要:由于用户经常以二元行为数据(隐式反馈)表达偏好,例如点击物品或购买产品,基于隐式反馈的协同过滤(CF)模型通过利用隐式用户-物品交互数据来预测用户可能喜欢的排名靠前的物品。对每个用户而言,隐式反馈被分为两个集合:一个观察行为有限的已观察物品集,以及一个混合了负面物品行为和未知行为的大规模未观察物品集。给定任何用户偏好预测模型,研究人员要么设计基于排序的优化目标,要么依赖负物品挖掘技术来获得更好的优化。尽管这些基于隐式反馈的模型在性能上有所提高,但由于每个用户的已观察物品集的稀疏性,推荐结果仍然不尽如人意。为此,本文探索隐式反馈的独特性质,提出了Set2setRank推荐框架。Set2setRank的优化准则有两方面:首先,我们设计了物品到物品集的比较,鼓励抽样观察集中的每个已观察物品的排序高于抽样未观察集中的任何未观察物品。其次,我们建立集合级比较模型,鼓励在从已观察物品集总结出的距离与抽样负集中最“难”的未观察物品之间保持一定的间隔。此外,还设计了一种自适应采样技术来实现这两个目标。需要注意的是,我们提出的框架是模型无关的,可以很容易地应用于大多数推荐预测方法,并且在实践中具有时间效率。最后,在三个真实数据集上进行的大量实验证明了该方法的优越性。
摘要:As users often express their preferences with binary behavior data~(implicit feedback), such as clicking items or buying products, implicit feedback based Collaborative Filtering~(CF) models predict the top ranked items a user might like by leveraging implicit user-item interaction data. For each user, the implicit feedback is divided into two sets: an observed item set with limited observed behaviors, and a large unobserved item set that is mixed with negative item behaviors and unknown behaviors. Given any user preference prediction model, researchers either designed ranking based optimization goals or relied on negative item mining techniques for better optimization. Despite the performance gain of these implicit feedback based models, the recommendation results are still far from satisfactory due to the sparsity of the observed item set for each user. To this end, in this paper, we explore the unique characteristics of the implicit feedback and propose the Set2setRank framework for recommendation. The optimization criteria of Set2setRank are two-fold: First, we design an item to item-set comparison that encourages each observed item from the sampled observed set to be ranked higher than any unobserved item from the sampled unobserved set. Second, we model a set-level comparison that encourages a margin between the distance summarized from the observed item set and the most "hard" unobserved item from the sampled negative set. Further, an adaptive sampling technique is designed to implement these two goals. We note that our proposed framework is model-agnostic and can be easily applied to most recommendation prediction approaches, and is time efficient in practice. Finally, extensive experiments on three real-world datasets demonstrate the superiority of our proposed approach.
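摘要中的两级比较目标可以用如下示意性损失写出(仅为示意草图:打分、集合摘要方式(此处取均值)与间隔值均为本文假设,并非论文的官方公式):

```python
import numpy as np

def item_to_set_loss(obs_scores, unobs_scores, margin=0.5):
    """Each observed item should out-rank every sampled unobserved item."""
    diff = np.asarray(obs_scores)[:, None] - np.asarray(unobs_scores)[None, :]
    return np.mean(np.maximum(0.0, margin - diff))

def set_to_set_loss(obs_scores, unobs_scores, margin=0.5):
    """The observed-set summary should beat the hardest unobserved item by a margin."""
    obs_summary = np.mean(obs_scores)        # set-level summary (assumption: mean score)
    hardest_neg = np.max(unobs_scores)       # most confusing sampled negative
    return max(0.0, margin - (obs_summary - hardest_neg))

loss = item_to_set_loss([2.0, 1.5], [0.2, 0.1]) + set_to_set_loss([2.0, 1.5], [0.2, 0.1])
```

当所有已观察物品都以足够间隔高于所有负样本时,两项损失均为0;任一比较被违反,对应损失即为正并驱动模型更新。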
【58】 Order Effects in Bayesian Updates
标题:贝叶斯更新中的顺序效应
作者:Catarina Moreira,Jose Acacio de Barros
机构:School of Information Systems, Queensland University of Technology, Brisbane, Australia, School of Humanities and Liberal Studies, San Francisco State University, San Francisco, CA, USA
备注:None
链接:https://arxiv.org/abs/2105.07354
摘要:当基于某一信息序列对假设概率的判断,不等于信息顺序反转后对同一假设概率的判断时,就会产生顺序效应。文献中已有多项实验支持顺序效应存在的证据。我们提出了一个顺序效应的贝叶斯更新模型,其中每个问题都可以看作一个小实验,被调查者在其中反思自己的信念。我们证明了顺序效应会出现,且有一个简单的认知解释:被调查者先验地认为两个问题是相关的。所提出的贝叶斯模型允许我们做出若干预测:(1)我们找到了先验上限制顺序效应存在的某些条件;(2)我们证明,对于我们的模型,QQ等式不一定满足(由于对称性假设);(3)与其量子对应模型相比,所提出的贝叶斯模型具有参数更少的优点。
摘要:Order effects occur when judgments about a hypothesis's probability given a sequence of information do not equal the probability of the same hypothesis when the information is reversed. Different experiments have been performed in the literature that support evidence of order effects. We proposed a Bayesian update model for order effects where each question can be thought of as a mini-experiment where the respondents reflect on their beliefs. We showed that order effects appear, and they have a simple cognitive explanation: the respondent's prior belief that two questions are correlated. The proposed Bayesian model allows us to make several predictions: (1) we found certain conditions on the priors that limit the existence of order effects; (2) we show that, for our model, the QQ equality is not necessarily satisfied (due to symmetry assumptions); and (3) the proposed Bayesian model has the advantage of possessing fewer parameters than its quantum counterpart.
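顺序效应为何会出现,可用一个玩具数值模型体会(纯属本文的假设性示例,并非论文原模型):标准贝叶斯更新是可交换的,但若在每次回答后加入一个向刚给出答案方向“反思”调整信念的步骤,更新便不再可交换,问题顺序就会影响最终判断。

```python
def update(belief, likelihood_yes):
    """Plain Bayes update of P(H) after a 'yes' answer; order-invariant on its own."""
    num = belief * likelihood_yes
    return num / (num + (1.0 - belief) * (1.0 - likelihood_yes))

def reflect(belief, weight=0.3):
    """Toy reflection step: the respondent drifts toward the answer just given."""
    return (1.0 - weight) * belief + weight * 1.0

def ask(belief, likelihood_yes):
    return reflect(update(belief, likelihood_yes))

prior = 0.5
p_ab = ask(ask(prior, 0.8), 0.6)    # question A first, then B
p_ba = ask(ask(prior, 0.6), 0.8)    # question B first, then A
```

两次ask的复合在两种顺序下给出不同的最终信念,而去掉reflect后的纯贝叶斯更新与提问顺序无关。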
【59】 Model-Based Offline Planning with Trajectory Pruning
标题:基于模型的轨迹剪枝离线规划
作者:Xianyuan Zhan,Xiangyu Zhu,Haoran Xu
机构:JD Intelligent Cities Research, Beijing, China, Xidian University, China
链接:https://arxiv.org/abs/2105.07351
摘要:离线强化学习(RL)使得无需环境交互即可利用预先收集的数据集学习策略,这为RL在现实系统中的应用提供了一个很有前景的方向。虽然近年来离线RL研究取得了很大进展,但现有方法在实际系统控制任务中仍面临许多实际挑战,如智能体训练过程中的计算限制和对额外控制灵活性的要求。基于模型的规划框架为此类任务提供了一个有吸引力的解决方案。然而,大多数基于模型的规划算法并不是为离线设置而设计的。简单地将离线RL的成分与现有方法结合起来,要么带来过度限制性的规划,要么导致较差的性能。本文提出了一种新的轻量级基于模型的离线规划框架MOPP,解决了离线学习的限制与高性能规划之间的矛盾。MOPP在从数据中学习到的行为策略的指导下鼓励更激进的轨迹推出,并剪除有问题的轨迹以避免潜在的分布外样本。实验结果表明,与现有的基于模型的离线规划和RL方法相比,MOPP具有有竞争力的性能,并且可以很容易地适应不同的目标和额外的约束条件。
摘要:Offline reinforcement learning (RL) enables learning policies using pre-collected datasets without environment interaction, which provides a promising direction to make RL useable in real-world systems. Although recent offline RL studies have achieved much progress, existing methods still face many practical challenges in real-world system control tasks, such as computational restriction during agent training and the requirement of extra control flexibility. Model-based planning framework provides an attractive solution for such tasks. However, most model-based planning algorithms are not designed for offline settings. Simply combining the ingredients of offline RL with existing methods either provides over-restrictive planning or leads to inferior performance. We propose a new light-weighted model-based offline planning framework, namely MOPP, which tackles the dilemma between the restrictions of offline learning and high-performance planning. MOPP encourages more aggressive trajectory rollout guided by the behavior policy learned from data, and prunes out problematic trajectories to avoid potential out-of-distribution samples. Experimental results show that MOPP provides competitive performance compared with existing model-based offline planning and RL approaches, and allows easy adaptation to varying objectives and extra constraints.
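MOPP“在行为策略指导下激进推出轨迹、并剪除可疑轨迹”的核心流程,可用如下玩具草图示意(一维动作空间、高斯行为策略、回报函数与剪枝阈值均为本文假设,并非论文实现):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_b, sigma_b = 0.0, 1.0          # behavior policy ~ N(mu_b, sigma_b), assumed learned from data

def rollout(horizon=5, scale=2.0):
    """Aggressive rollout: sample actions wider than the behavior policy suggests."""
    actions = rng.normal(mu_b, scale * sigma_b, size=horizon)
    ret = -np.sum((actions - 0.5) ** 2)                       # toy return: best actions near 0.5
    logp = np.sum(-0.5 * ((actions - mu_b) / sigma_b) ** 2)   # trajectory log-density (unnormalized)
    return actions, ret, logp

def plan(n_rollouts=200, logp_threshold=-20.0):
    kept = []
    for _ in range(n_rollouts):
        actions, ret, logp = rollout()
        if logp >= logp_threshold:                            # prune likely out-of-distribution rollouts
            kept.append((ret, actions))
    best_ret, best_actions = max(kept, key=lambda t: t[0])
    return best_ret, best_actions

best_ret, best_actions = plan()
```

剪枝阈值把行为策略下密度过低(即疑似分布外)的轨迹排除在外,随后才在余下轨迹中按回报择优,这正是摘要所述两个要素的组合。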
【60】 Explainable Hierarchical Imitation Learning for Robotic Drink Pouring
标题:机器人灌装饮料的可解释分层模仿学习
作者:Dandan Zhang,Yu Zheng,Qiang Li,Lei Wei,Dongsheng Zhang,Zhengyou Zhang
机构: Bielefeld University
备注:15 pages, 12 figures
链接:https://arxiv.org/abs/2105.07348
摘要:准确地将饮料倒入各种容器是服务机器人的一项基本技能。然而,倒饮料是一个动态过程,难以建模。传统的用于实现机器人自主倾倒的深度模仿学习技术存在固有的黑箱效应,且需要大量的演示数据进行模型训练。为了解决这些问题,本文提出了一种可解释的分层模仿学习(EHIL)方法,使机器人能够学习高层通用知识并在多个倒饮料场景中执行底层动作。此外,利用EHIL可以为任务执行构造逻辑图,使动作生成的决策过程对用户可解释,并能追溯失败的原因。基于逻辑图,该框架具有可操纵性,可以实现不同的目标,同时对未见场景的适应性也能以可解释的方式实现。我们进行了一系列实验来验证所提方法的有效性。结果表明,EHIL在成功率、适应性、可操纵性和可解释性方面优于传统的行为克隆方法。
摘要:To accurately pour drinks into various containers is an essential skill for service robots. However, drink pouring is a dynamic process and difficult to model. Traditional deep imitation learning techniques for implementing autonomous robotic pouring have an inherent black-box effect and require a large amount of demonstration data for model training. To address these issues, an Explainable Hierarchical Imitation Learning (EHIL) method is proposed in this paper such that a robot can learn high-level general knowledge and execute low-level actions across multiple drink pouring scenarios. Moreover, with EHIL, a logical graph can be constructed for task execution, through which the decision-making process for action generation can be made explainable to users and the causes of failure can be traced out. Based on the logical graph, the framework is manipulable to achieve different targets while the adaptability to unseen scenarios can be achieved in an explainable manner. A series of experiments have been conducted to verify the effectiveness of the proposed method. Results indicate that EHIL outperforms the traditional behavior cloning method in terms of success rate, adaptability, manipulability and explainability.
【61】 Understanding the Effect of Bias in Deep Anomaly Detection
标题:认识偏差在深层异常检测中的作用
作者:Ziyu Ye,Yuxin Chen,Haitao Zheng
机构:University of Chicago
备注:Accepted at IJCAI '21. Codes available on github.com/ZIYU-DEEP/Understanding-Bias-in-Deep-Anomaly-Detection-PyTorch
链接:https://arxiv.org/abs/2105.07346
摘要:由于标记的异常数据稀缺,异常检测在机器学习中是一个独特的挑战。最近的工作试图通过额外的标记异常样本增强深度异常检测模型的训练来缓解这一问题。然而,标记数据往往与目标分布不一致,会给训练模型引入有害的偏差。在本文中,我们旨在理解有偏异常集对异常检测的影响。具体而言,我们将异常检测视为一个有监督学习任务,其目标是在给定的假阳性率下优化召回率。我们正式研究了异常检测器的相对评分偏差,其定义为相对于基线异常检测器的性能差异。我们建立了用于估计深度异常检测相对评分偏差的首个有限样本率,并在合成数据集和真实数据集上验证了我们的理论结果。我们还提供了一项广泛的实证研究,考察有偏的训练异常集如何影响异常评分函数,进而影响不同异常类别上的检测性能。我们的研究展示了有偏异常集可能有用或有问题的情形,并为未来研究提供了坚实的基准。
摘要:Anomaly detection presents a unique challenge in machine learning, due to the scarcity of labeled anomaly data. Recent work attempts to mitigate such problems by augmenting training of deep anomaly detection models with additional labeled anomaly samples. However, the labeled data often does not align with the target distribution and introduces harmful bias to the trained model. In this paper, we aim to understand the effect of a biased anomaly set on anomaly detection. Concretely, we view anomaly detection as a supervised learning task where the objective is to optimize the recall at a given false positive rate. We formally study the relative scoring bias of an anomaly detector, defined as the difference in performance with respect to a baseline anomaly detector. We establish the first finite sample rates for estimating the relative scoring bias for deep anomaly detection, and empirically validate our theoretical results on both synthetic and real-world datasets. We also provide an extensive empirical study on how a biased training anomaly set affects the anomaly score function and therefore the detection performance on different anomaly classes. Our study demonstrates scenarios in which the biased anomaly set can be useful or problematic, and provides a solid benchmark for future research.
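“在给定假阳性率下优化召回率”这一指标本身可按如下通用方式计算(示意草图;阈值取正常样本分数的分位数,分数数组为虚构示例):

```python
import numpy as np

def recall_at_fpr(neg_scores, pos_scores, fpr=0.05):
    """Recall of anomalies when the threshold allows a fixed false-positive rate."""
    thr = np.quantile(np.asarray(neg_scores), 1.0 - fpr)   # (1 - fpr) quantile of normal scores
    return float(np.mean(np.asarray(pos_scores) > thr))

normal_scores = np.linspace(0.0, 1.0, 100)                  # detector scores on normal samples
anomaly_scores = np.array([0.5, 0.9, 0.97, 1.2, 1.5])       # detector scores on anomalies
r = recall_at_fpr(normal_scores, anomaly_scores, fpr=0.05)
```

对两个检测器在同一假阳性率下比较该召回率,其差值即对应文中“相对评分偏差”所度量的性能差异。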
【62】 Self-supervised on Graphs: Contrastive, Generative,or Predictive
标题:图的自我监督:对比、生成或预测
作者:Lirong Wu,Haitao Lin,Zhangyang Gao,Cheng Tan,Stan Z. Li
机构: School of Engineering, Westlake University
链接:https://arxiv.org/abs/2105.07342
摘要:图上的深度学习最近在各种任务上取得了显著成功,而这种成功在很大程度上依赖于大量精心标注的数据。然而,精确的标注通常非常昂贵且耗时。为了解决这个问题,自监督学习(SSL)正在成为一种新的范式,通过精心设计的借口任务提取信息性知识,而不依赖于人工标注。在这篇综述中,我们扩展了最早出现在计算机视觉和自然语言处理领域的SSL概念,对现有的图数据SSL技术进行了及时而全面的回顾。具体来说,我们将现有的图SSL方法分为三类:对比式、生成式和预测式。更重要的是,与许多仅对已发表研究进行高层描述的综述不同,我们在一个统一的框架中对现有工作给出了额外的数学总结。此外,为了促进方法发展和实证比较,我们还总结了常用的数据集、评估指标、下游任务以及各种算法的开源实现。最后,我们讨论了改进图自监督学习的技术挑战和潜在的未来方向。
摘要:Deep learning on graphs has recently achieved remarkable success on a variety of tasks while such success relies heavily on the massive and carefully labeled data. However, precise annotations are generally very expensive and time-consuming. To address this problem, self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks without relying on manual labels. In this survey, we extend the concept of SSL, which first emerged in the fields of computer vision and natural language processing, to present a timely and comprehensive review of the existing SSL techniques for graph data. Specifically, we divide existing graph SSL methods into three categories: contrastive, generative, and predictive. More importantly, unlike many other surveys that only provide a high-level description of published research, we present an additional mathematical summary of the existing works in a unified framework. Furthermore, to facilitate methodological development and empirical comparisons, we also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms. Finally, we discuss the technical challenges and potential future directions for improving graph self-supervised learning.
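以综述三分类中的对比式方法为例,常见的InfoNCE型对比损失可示意如下(通用草图:余弦相似度、温度参数等均为常见取法,并非综述中某一特定方法的实现):

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.5):
    """InfoNCE: the anchor should match its positive view, not the negatives."""
    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    logits = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()                             # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                           # loss is low when the positive wins

anchor = np.array([1.0, 0.0])        # e.g. a node embedding from one graph view
positive = np.array([0.9, 0.1])      # the same node under another augmented view
negatives = [np.array([0.0, 1.0]), np.array([-1.0, 0.0])]
loss = info_nce(anchor, positive, negatives)
```

在图SSL中,anchor与positive通常来自同一节点(或子图)的两个增强视图,negatives来自其他节点;正样本与anchor越相似,损失越小。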
【63】 Real-time Detection of Practical Universal Adversarial Perturbations
标题:实用通用对抗性扰动的实时检测
作者:Kenneth T. Co,Luis Muñoz-González,Leslie Kanthan,Emil C. Lupu
机构:Imperial College London, London, United Kingdom; DataSpartan, London, United Kingdom
链接:https://arxiv.org/abs/2105.07334
摘要:通用对抗性扰动(UAPs)是一类突出的对抗样本,它们利用系统性漏洞,实现对深度神经网络(DNNs)的物理可实现且鲁棒的攻击。UAP可以在许多不同的输入之间泛化,这导致了可以大规模应用的现实而有效的攻击。在本文中,我们提出了HyperNeuron,一种高效且可扩展的算法,它通过识别可疑的神经元超激活来实时检测UAP。我们的结果显示了HyperNeuron在多个任务(图像分类、目标检测)上、对抗各种通用攻击、以及在感知广告屏蔽和对抗补丁等现实场景中的有效性。HyperNeuron能够同时检测对抗掩模和补丁两类UAP,性能与现有UAP防御相当或更好,同时每幅图像仅引入0.86毫秒的显著降低的延迟。这表明许多现实而实用的通用攻击可以被实时可靠地缓解,为机器学习系统的鲁棒部署展现了前景。
摘要:Universal Adversarial Perturbations (UAPs) are a prominent class of adversarial examples that exploit the systemic vulnerabilities and enable physically realizable and robust attacks against Deep Neural Networks (DNNs). UAPs generalize across many different inputs; this leads to realistic and effective attacks that can be applied at scale. In this paper we propose HyperNeuron, an efficient and scalable algorithm that allows for the real-time detection of UAPs by identifying suspicious neuron hyper-activations. Our results show the effectiveness of HyperNeuron on multiple tasks (image classification, object detection), against a wide variety of universal attacks, and in realistic scenarios, like perceptual ad-blocking and adversarial patches. HyperNeuron is able to simultaneously detect both adversarial mask and patch UAPs with comparable or better performance than existing UAP defenses whilst introducing a significantly reduced latency of only 0.86 milliseconds per image. This suggests that many realistic and practical universal attacks can be reliably mitigated in real-time, which shows promise for the robust deployment of machine learning systems.
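“通过可疑的神经元超激活检测UAP”的基本直觉可用如下草图体会(统计量与阈值均为本文的假设性简化,并非HyperNeuron算法本身):先在干净数据上统计各神经元激活的均值与方差,再检查新输入的激活是否整体异常偏高。

```python
import numpy as np

rng = np.random.default_rng(0)
clean_acts = rng.normal(0.0, 1.0, size=(1000, 32))     # per-neuron activations on clean inputs
mu, sd = clean_acts.mean(axis=0), clean_acts.std(axis=0)

def is_suspicious(activation, thr=1.5):
    """Flag inputs whose neurons are, on average, far outside the clean-data range."""
    z = np.abs((activation - mu) / sd)
    return bool(z.mean() > thr)                        # clean inputs give mean |z| near 0.8

benign = rng.normal(0.0, 1.0, size=32)
perturbed = benign + 3.0                               # UAP-like uniform hyper-activation shift
```

由于UAP需在许多输入上同时奏效,它往往系统性地抬高一批神经元的激活,这类聚合统计量因此能以极低的单样本开销完成实时筛查。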
【64】 Towards a Predictive Processing Implementation of the Common Model of Cognition
标题:共同认知模型的预测性加工实现研究
作者:M. A. Kelly,Alexander Ororbia
机构:Rochester Institute of Technology, Rochester, NY, USA; Bucknell University, Lewisburg, PA, USA; Carleton University, Ottawa, ON, Canada
备注:6 pages, 2 figures
链接:https://arxiv.org/abs/2105.07308
摘要:在这篇文章中,我们提出了一个认知架构,是建立在强大而简单的神经模型。具体来说,我们描述了一个基于神经生成编码和全息联想记忆的通用认知模型的实现。所提出的系统为开发代理打下了基础,这些代理可以从不同的任务中不断学习,并在比现有认知架构更大的范围内模拟人的表现。
摘要:In this article, we present a cognitive architecture that is built from powerful yet simple neural models. Specifically, we describe an implementation of the common model of cognition grounded in neural generative coding and holographic associative memory. The proposed system creates the groundwork for developing agents that learn continually from diverse tasks as well as model human performance at larger scales than what is possible with existing cognitive architectures.
【65】 Texture Generation with Neural Cellular Automata
标题:基于神经元胞自动机的纹理生成
作者:Alexander Mordvintsev,Eyvind Niklasson,Ettore Randazzo
机构:Google Research
备注:AI for Content Creation Workshop, CVPR 2021
链接:https://arxiv.org/abs/2105.07299
摘要:神经元胞自动机(Neural Cellular Automata,NCA)在学习“生长”图像、形态分类、图像分割以及路径搜索等一般计算所需规则方面表现出了卓越的能力。我们相信它们引入的归纳先验适合于纹理生成。自然界中的纹理通常由局部相互作用的反应扩散系统的变体产生。人造纹理同样通常以局部方式生成(例如织物编织),或使用具有局部依赖性的规则(规则网格或几何图案)。我们演示了如何从单个模板图像中学习纹理生成器,其生成方法具有高度并行性(embarrassingly parallel)、收敛速度快、输出保真度高,并且只需对底层状态流形作一些最小假设。此外,我们还研究了所学模型的一些既有用又有趣的性质,如非平稳动力学和对损伤的固有鲁棒性。最后,我们给出定性论断:NCA模型所表现出的行为是一种习得的、分布式的、局部的纹理生成算法,这使我们的方法有别于现有的纹理生成工作。我们讨论了这种范式的优点。
摘要:Neural Cellular Automata (NCA) have shown a remarkable ability to learn the required rules to "grow" images, classify morphologies, segment images, as well as to do general computation such as path-finding. We believe the inductive prior they introduce lends itself to the generation of textures. Textures in the natural world are often generated by variants of locally interacting reaction-diffusion systems. Human-made textures are likewise often generated in a local manner (textile weaving, for instance) or using rules with local dependencies (regular grids or geometric patterns). We demonstrate learning a texture generator from a single template image, with the generation method being embarrassingly parallel, exhibiting quick convergence and high fidelity of output, and requiring only some minimal assumptions around the underlying state manifold. Furthermore, we investigate properties of the learned models that are both useful and interesting, such as non-stationary dynamics and an inherent robustness to damage. Finally, we make qualitative claims that the behaviour exhibited by the NCA model is a learned, distributed, local algorithm to generate a texture, setting our method apart from existing work on texture generation. We discuss the advantages of such a paradigm.
【66】 Annotation Uncertainty in the Context of Grammatical Change
标题:语法变化语境中的注释不确定性
作者:Marie-Luis Merten,Marcel Wever,Michaela Geierhos,Doris Tophinke,Eyke Hüllermeier
机构:∗ University of Zurich, Zurich, Switzerland, † Paderborn University, Paderborn, Germany, ♦ Universität der Bundeswehr München, Munich, Germany, △ LMU Munich, Munich, Germany
链接:https://arxiv.org/abs/2105.07270
摘要:本文阐述了大型文本语料库标注语境下的不确定性概念,尤其(但不限于)关注历史语言。这种不确定性可能源于语言的固有特性,例如语言歧义和语言描述类别的重叠,但也可能由标注专业知识的缺乏造成。通过更细致地考察标注不确定性,我们识别了其来源,加深了对日常标注实践中所遇到的不确定性的本质及其不同类型的理解。此外,还讨论了我们理论发现的一些实际意义。最后但并非最不重要的是,本文可视为一次尝试:调和语料库项目所涉及的主要科学学科(语言学和计算机科学)的视角,形成统一的观点,并强调这些学科之间潜在的协同作用。
摘要:This paper elaborates on the notion of uncertainty in the context of annotation in large text corpora, specifically focusing on (but not limited to) historical languages. Such uncertainty might be due to inherent properties of the language, for example, linguistic ambiguity and overlapping categories of linguistic description, but could also be caused by lacking annotation expertise. By examining annotation uncertainty in more detail, we identify the sources and deepen our understanding of the nature and different types of uncertainty encountered in daily annotation practice. Moreover, some practical implications of our theoretical findings are also discussed. Last but not least, this article can be seen as an attempt to reconcile the perspectives of the main scientific disciplines involved in corpus projects, linguistics and computer science, to develop a unified view and to highlight the potential synergies between these disciplines.
【67】 A Deep Metric Learning Approach to Account Linking
标题:一种用于账户链接的深度度量学习方法
作者:Aleem Khan,Elizabeth Fleming,Noah Schofield,Marcus Bishop,Nicholas Andrews
机构:Human Language Technology Center of Excellence, Johns Hopkins University
备注:13 pages; to be published in NAACL 2021
链接:https://arxiv.org/abs/2105.07263
摘要:我们考虑的任务是:根据相应文档流的内容和元数据,以自动方式链接属于同一作者的社交媒体帐户。我们专注于学习一种嵌入,它将可变大小的用户活动样本(从单个帖子到整月的活动)映射到一个向量空间,使同一作者的样本映射到邻近的点。该方法不需要人工标注数据进行训练,这使我们能够利用大量的社交媒体内容。在一个仿照其他领域既有识别基准设计的新评估框架下,所提模型优于多个有竞争力的基线。我们的方法达到了很高的链接精度,即使对训练时未见过的帐户的小样本也是如此,这是所提链接框架实际应用的先决条件。
摘要:We consider the task of linking social media accounts that belong to the same author in an automated fashion on the basis of the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity -- ranging from single posts to entire months of activity -- to a vector space, where samples by the same author map to nearby points. The approach does not require human-annotated data for training purposes, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy, even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.
【68】 Regret Minimization Experience Replay
标题:遗憾最小化体验回放
作者:Zhenghai Xue,Xu-Hui Liu,Jing-Cheng Pang,Shengyi Jiang,Feng Xu,Yang Yu
机构: Nanjing University
备注:9 pages, 5 figures
链接:https://arxiv.org/abs/2105.07253
摘要:经验回放广泛应用于各种深度离策略(off-policy)强化学习(RL)算法中。它存储先前收集的样本以供进一步复用。为了更好地利用这些样本,优先采样是一种很有前景的提高RL智能体性能的技术。以往基于时序差分(TD)误差的优先级方法高度启发式,且与RL的目标相背离。在这项工作中,我们从理论上分析了可以最小化RL策略遗憾的最优优先级策略。我们的理论表明,对于TD误差越大、同策略性(on-policiness)越好、纠正性反馈越多的数据,在采样过程中应赋予越高的权重。基于这一理论,我们提出了两种实用算法:RM-DisCor和RM-TCE。RM-DisCor是一种通用算法,RM-TCE是一种更高效的、依赖状态时序的变体。这两种算法都在具有挑战性的RL基准(包括MuJoCo、Atari和Meta-World)上提高了离策略RL算法的性能。
摘要:Experience replay is widely used in various deep off-policy reinforcement learning (RL) algorithms. It stores previously collected samples for further reuse. To better utilize these samples, prioritized sampling is a promising technique to improve the performance of RL agents. Previous prioritization methods based on temporal-difference (TD) error are highly heuristic and divergent from the objective of RL. In this work, we analyze the optimal prioritization strategy that can minimize the regret of RL policy theoretically. Our theory suggests that the data with higher TD error, better on-policiness and more corrective feedback should be assigned with higher weights during sampling. Based on this theory, we propose two practical algorithms, RM-DisCor and RM-TCE. RM-DisCor is a general algorithm and RM-TCE is a more efficient variant relying on the temporal ordering of states. Both algorithms improve the performance of off-policy RL algorithms in challenging RL benchmarks, including MuJoCo, Atari and Meta-World.
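摘要所讨论的“按权重优先采样”机制,可用一个简化的优先经验回放草图示意(通用做法;此处优先级仅由TD误差决定,而论文的理论还要求纳入同策略性与纠正性反馈,α、β等取值为假设):

```python
import numpy as np

class PrioritizedReplay:
    def __init__(self, alpha=0.6, seed=0):
        self.items, self.prios = [], []
        self.alpha = alpha
        self.rng = np.random.default_rng(seed)

    def add(self, transition, td_error):
        self.items.append(transition)
        # priority grows with |TD error|; alpha tempers the skew
        self.prios.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        p = np.array(self.prios) / np.sum(self.prios)
        idx = self.rng.choice(len(self.items), size=batch_size, p=p)
        # importance-sampling weights correct the non-uniform sampling (beta = 1)
        weights = (len(self.items) * p[idx]) ** -1.0
        return idx, weights / weights.max()

buf = PrioritizedReplay()
for td in [0.01, 0.01, 5.0, 0.01]:
    buf.add(("s", "a", "r", "s_next"), td)
idx, w = buf.sample(1000)
```

高TD误差的转移被远更频繁地采到,重要性权重则补偿由此引入的偏差;论文的最优优先级策略可视为把这一权重换成综合三种因素的形式。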
【69】 Composite Localization for Human Pose Estimation
标题:一种用于人体姿态估计的复合定位方法
作者:ZiFan Chen,Xin Qin,Chao Yang,Li Zhang
链接:https://arxiv.org/abs/2105.07245
摘要:由于学习目标的复杂性,现有的人体姿态估计方法面临长距离回归不准确或计算量大的问题。本文提出了一种新的人体姿态估计深度学习框架,称为复合定位,将复杂的学习目标分为两个更简单的目标:一个稀疏热图用于寻找关键点的近似位置,两个短距离偏移图用于获得最终的精确坐标。为了实现该框架,我们构造了两种复合定位网络:CLNet-ResNet和CLNet-Hourglass。我们在三个基准数据集上评估了网络,包括Leeds运动姿态数据集、MPII人体姿态数据集和COCO关键点检测数据集。实验结果表明,我们的CLNet-ResNet50以约1/2的GFLOPs比SimpleBaseline高出1.14%。我们的CLNet-Hourglass在COCO上比原始的堆叠沙漏网络高出4.45%。
摘要:The existing human pose estimation methods are confronted with inaccurate long-distance regression or high computational cost due to the complex learning objectives. This work proposes a novel deep learning framework for human pose estimation called composite localization to divide the complex learning objective into two simpler ones: a sparse heatmap to find the keypoint's approximate location and two short-distance offsetmaps to obtain its final precise coordinates. To realize the framework, we construct two types of composite localization networks: CLNet-ResNet and CLNet-Hourglass. We evaluate the networks on three benchmark datasets, including the Leeds Sports Pose dataset, the MPII Human Pose dataset, and the COCO keypoints detection dataset. The experimental results show that our CLNet-ResNet50 outperforms SimpleBaseline by 1.14% with about 1/2 GFLOPs. Our CLNet-Hourglass outperforms the original stacked-hourglass by 4.45% on COCO.
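“稀疏热图定粗略位置+短距离偏移图精修坐标”的解码步骤可示意如下(玩具数值草图,热图分辨率与步幅stride均为假设):

```python
import numpy as np

def decode_keypoint(heatmap, offset_x, offset_y, stride=4):
    """Coarse location from the heatmap argmax, refined by short-distance offsets."""
    iy, ix = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # final coordinates in input-image pixels
    x = ix * stride + offset_x[iy, ix]
    y = iy * stride + offset_y[iy, ix]
    return x, y

heat = np.zeros((8, 8))
heat[3, 5] = 1.0                            # peak cell: row 3, col 5 on the coarse grid
off_x = np.full((8, 8), 1.5)                # sub-cell refinement along x
off_y = np.full((8, 8), -0.5)               # sub-cell refinement along y
x, y = decode_keypoint(heat, off_x, off_y)
```

热图只需在低分辨率网格上分类出大致单元,偏移图负责网格内的短距离回归,从而避开直接长距离回归坐标的困难。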
【70】 AgeFlow: Conditional Age Progression and Regression with Normalizing Flows
标题:AgeFlow:带归一化流的条件年龄递进和回归
作者:Zhizhong Huang,Shouzhen Chen,Junping Zhang,Hongming Shan
机构:Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Institute of Science and Technology for Brain-inspired Intelligence and MOE Frontiers Center, for Brain Science, Fudan University, Shanghai , China
备注:IJCAI 2021
链接:https://arxiv.org/abs/2105.07239
摘要:年龄推进和回归的目的分别是合成给定人脸图像带有老化和年轻化效果的照片级真实感外观。现有的基于生成对抗网络(GANs)的方法存在以下三个主要问题:1)不稳定的训练在生成的人脸中引入了强烈的鬼影伪影;2)不成对的训练导致性别和种族等人脸属性的意外变化;3)非双射的年龄映射增加了人脸变换的不确定性。为了克服这些问题,本文提出了一个新的框架,称为AgeFlow,以整合基于流的模型和GANs两者的优点。所提出的AgeFlow包含三个部分:通过可逆神经网络将给定人脸映射到潜在空间的编码器,将源潜在向量转换为目标潜在向量的新型可逆条件翻译模块(ICTM),以及使用同一编码器网络从目标潜在向量重构生成人脸的解码器;所有部分都是可逆的,从而实现双射年龄映射。ICTM的新颖之处有两点。首先,我们提出一种属性感知的知识蒸馏,在保持其他无关属性不变的同时学习年龄推进的操纵方向,以减轻人脸属性的意外变化。其次,我们提出在潜在空间中使用GANs,以确保学习到的潜在向量与真实向量不可区分,这比传统在图像域中使用GANs容易得多。实验结果表明,在两个基准数据集上的性能优于现有的基于GANs的方法。源代码位于https://github.com/Hzzone/AgeFlow.
摘要:Age progression and regression aim to synthesize photorealistic appearance of a given face image with aging and rejuvenation effects, respectively. Existing generative adversarial networks (GANs) based methods suffer from the following three major issues: 1) unstable training introducing strong ghost artifacts in the generated faces, 2) unpaired training leading to unexpected changes in facial attributes such as genders and races, and 3) non-bijective age mappings increasing the uncertainty in the face transformation. To overcome these issues, this paper proposes a novel framework, termed AgeFlow, to integrate the advantages of both flow-based models and GANs. The proposed AgeFlow contains three parts: an encoder that maps a given face to a latent space through an invertible neural network, a novel invertible conditional translation module (ICTM) that translates the source latent vector to target one, and a decoder that reconstructs the generated face from the target latent vector using the same encoder network; all parts are invertible achieving bijective age mappings. The novelties of ICTM are two-fold. First, we propose an attribute-aware knowledge distillation to learn the manipulation direction of age progression while keeping other unrelated attributes unchanged, alleviating unexpected changes in facial attributes. Second, we propose to use GANs in the latent space to ensure the learned latent vector indistinguishable from the real ones, which is much easier than traditional use of GANs in the image domain. Experimental results demonstrate superior performance over existing GANs-based methods on two benchmarked datasets. The source code is available at https://github.com/Hzzone/AgeFlow.
【71】 Heterogeneous Causal Effect of Polysubstance Usage on Drug Overdose
标题:多物质使用对药物过量的异质性因果效应
作者:Vaishali Mahipal,Mohammad Arif Ul Alam
机构:Department of Computer Science, University of Massachusetts Lowell, USA
备注:Submitted to EMBS BHI
链接:https://arxiv.org/abs/2105.07224
摘要:在本文中,我们提出了一个估计并发用药对药物过量的异质性因果效应的系统,它由高效的协变量选择、子组选择与生成、以及异质因果效应估计组成。虽然现有方法中已提出多项关联性研究,但在并发用药与药物过量问题上,异质因果效应此前尚未被研究。我们应用我们的框架来回答一个关键问题:“苯二氮卓类与阿片类药物的同时使用是否对阿片类药物过量流行具有异质性因果效应?”使用2001年至2013年收集的Truven MarketScan理赔数据的实验显示了我们所提框架有效性的重大前景。对于阿片类药物与苯二氮卓类药物同时使用导致阿片类药物过量的相关风险,我们的高效因果推断模型估计的因果效应(19%)高于回归研究给出的估计(15%)。
摘要:In this paper, we propose a system to estimate heterogeneous concurrent drug usage effects on overdose estimation, which consists of efficient covariate selection, sub-group selection and generation, and heterogeneous causal effect estimation. Although several association studies have been proposed in the state of the art, heterogeneous causal effects have never been studied in the concurrent drug usage and drug overdose problem. We apply our framework to answer a critical question: "does concurrent usage of benzodiazepines and opioids have heterogeneous causal effects on the opioid overdose epidemic?" Experiments using Truven MarketScan claims data collected from 2001 to 2013 show significant promise for our proposed framework's efficacy. Our efficient causal inference model estimated a higher causal effect (19%) than regression studies (15%) for the risks associated with the concurrent usage of opioids and benzodiazepines on opioid overdose.
【72】 XAI Method Properties: A (Meta-)study
标题:XAI方法性质的(元)研究
作者:Gesina Schwalbe,Bettina Finzel
机构: Continental AG, Regensburg, Germany, Cognitive Systems Group, University of Bamberg, Germany
备注:37 pages, 2 figures, submitted to Data Mining and Knowledge Discovery
链接:https://arxiv.org/abs/2105.07190
摘要:迄今为止,在可解释人工智能(XAI)的研究范围内,人们已经发展出各种各样的术语、动机、方法和评价标准。在文献中可以找到许多分类法,每个分类法有不同的侧重点,但也有许多重叠之处。在本文中,我们在一项meta分析中总结了被引最多和最新的分类法,以突出XAI最新技术的基本方面。我们还呈现并补充了大量相关综述文章中的术语和概念。最后但并非最不重要的是,我们用50多个示例方法说明了来自更高层分类法的概念,并对这些方法进行了相应的分类,从而提供了XAI各方面的广泛概览,并为适用于用例以及特定于上下文的后续研究铺平了道路。
摘要:In the meantime, a wide variety of terminologies, motivations, approaches and evaluation criteria have been developed within the scope of research on explainable artificial intelligence (XAI). Many taxonomies can be found in the literature, each with a different focus, but also showing many points of overlap. In this paper, we summarize the most cited and current taxonomies in a meta-analysis in order to highlight the essential aspects of the state-of-the-art in XAI. We also present and add terminologies as well as concepts from a large number of survey articles on the topic. Last but not least, we illustrate concepts from the higher-level taxonomy with more than 50 example methods, which we categorize accordingly, thus providing a wide-ranging overview of aspects of XAI and paving the way for use case-appropriate as well as context-specific subsequent research.
【73】 Content Analysis Application in Nursing: A Synthetic Knowledge Synthesis Meta-Study
标题:内容分析在护理中的应用:综合知识综合元研究
作者:Helena Blažun Vošner,Peter Kokol,Jernej Završnik,Danica Železnik
机构: Zdravstveni dom dr. Adolfa Drolca Maribor, Ulica talcev , Maribor, Fakulteta za zdravstvene in socialne vede Slovenj Gradec, Glavni trg , Slovenj Gradec, Alma Mater Europaea, Slovenska ulica , Maribor
链接:https://arxiv.org/abs/2105.07189
摘要:理论问题:随着研究文献产出的爆炸性增长,出现了对构建知识的新方法的需求。方法:我们的meta研究采用了综合内容分析法。结果与讨论:我们的meta研究表明,内容分析在护理研究中被频繁使用,应用范围非常广泛;其使用趋势是积极的,并在全球各种研究环境中使用。我们研究中使用的综合内容分析被证明是执行知识综合的一个非常有用的工具,它以自动化活动取代了传统综合中的许多常规工作,从而使此类研究在经济上更可行、也更容易执行。
摘要:Theoretical issues: With the explosive growth in the research literature production, the need for new approaches to structure knowledge emerged. Method: Synthetic content analysis was used in our meta-study. Results and discussion: Our meta-study showed that content analysis is frequently used in nursing research in a very wide spectrum of applications. The trend of its use is positive and it is used globally in a variety of research settings. The synthetic content analysis used in our study showed to be a very helpful tool in performing knowledge synthesis, replacing many of the routine activities of conventional synthesis with automated activities, thus making such studies more economically viable and easier to perform.
【74】 Stacked Deep Multi-Scale Hierarchical Network for Fast Bokeh Effect Rendering from a Single Image
标题:基于堆叠式深度多尺度分层网络的单幅图像快速Bokeh效果绘制
作者:Saikat Dutta,Sourya Dipta Das,Nisarg A. Shah,Anil Kumar Tiwari
机构:IIT Madras, Chennai, India, Jadavpur University, Kolkata, India, IIT Jodhpur, Jodhpur, India
备注:Accepted to MAI workshop, CVPR 2021. Code and models: this https URL
链接:https://arxiv.org/abs/2105.07174
摘要:散景(Bokeh)效果是摄影中最受欢迎的艺术效果之一。通常,需要一台单反相机配合不同的光圈和快门设置,以及一定的摄影技巧才能产生这种效果。在智能手机中,则使用计算方法和附加传感器来克服物理镜头和传感器的限制,从而实现这种效果。现有方法大多利用附加传感器数据或预训练网络对场景进行精细深度估计,有时还采用人像分割预训练网络模块来分割图像中的显著物体。因此,这些网络参数众多、运行开销大,无法在中端设备上运行。本文采用端到端的深度多尺度层次网络(DMSHN)模型,对单目相机拍摄的图像直接渲染散景效果。为进一步提高该效果的感知质量,还提出了由两个DMSHN模块组成的堆叠模型。我们的模型不依赖任何用于单目深度估计或显著性检测的预训练网络模块,从而大大减小了模型规模和运行时间。在大规模EBB!数据集上,堆叠DMSHN取得了最先进的结果,处理高清图像时的运行时间比当前最先进模型减少约6倍。
摘要:The Bokeh Effect is one of the most desirable effects in photography for rendering artistic and aesthetic photos. Usually, it requires a DSLR camera with different aperture and shutter settings and certain photography skills to generate this effect. In smartphones, computational methods and additional sensors are used to overcome the physical lens and sensor limitations to achieve such effect. Most of the existing methods utilized additional sensor's data or pretrained network for fine depth estimation of the scene and sometimes use portrait segmentation pretrained network module to segment salient objects in the image. Because of these reasons, networks have many parameters, become runtime intensive and unable to run in mid-range devices. In this paper, we used an end-to-end Deep Multi-Scale Hierarchical Network (DMSHN) model for direct Bokeh effect rendering of images captured from the monocular camera. To further improve the perceptual quality of such effect, a stacked model consisting of two DMSHN modules is also proposed. Our model does not rely on any pretrained network module for Monocular Depth Estimation or Saliency Detection, thus significantly reducing the size of model and run time. Stacked DMSHN achieves state-of-the-art results on a large scale EBB! dataset with around 6x less runtime compared to the current state-of-the-art model in processing HD quality images.
【75】 Cohort Shapley value for algorithmic fairness
标题:算法公平性的队列Shapley值
作者:Masayoshi Mase,Art B. Owen,Benjamin B. Seiler
机构:Hitachi, Ltd., Stanford University
链接:https://arxiv.org/abs/2105.07168
摘要:队列Shapley值是一种植根于博弈论的无模型变量重要性方法,不使用任何未观测到且可能不存在的特征组合。我们以著名的COMPAS累犯数据为例,用它来评估算法公平性。该方法可以针对数据集中的每一个个体,确定其在多大程度上因种族等受保护属性的取值而受到不利或有利的影响。即使种族并非原始预测变量之一,且无法访问做出预测的专有算法,该方法也能做到这一点。博弈论基础使我们能够定义与逐个体定义相一致的数据集总体变量重要性。我们可以研究公平性文献中关注的多个量(包括假阳性预测)的变量重要性。
摘要:Cohort Shapley value is a model-free method of variable importance grounded in game theory that does not use any unobserved and potentially impossible feature combinations. We use it to evaluate algorithmic fairness, using the well known COMPAS recidivism data as our example. This approach allows one to identify for each individual in a data set the extent to which they were adversely or beneficially affected by their value of a protected attribute such as their race. The method can do this even if race was not one of the original predictors and even if it does not have access to a proprietary algorithm that has made the predictions. The grounding in game theory lets us define aggregate variable importance for a data set consistently with its per subject definitions. We can investigate variable importance for multiple quantities of interest in the fairness literature including false positive predictions.
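As a concrete illustration of the idea in this abstract, the sketch below computes exact cohort Shapley values for one subject on a tiny invented dataset (the features, rows, and model scores are all hypothetical, not from the COMPAS data). A coalition's value is the mean prediction over the cohort of observed rows that match the subject on those features, so no unobserved feature combinations are ever evaluated.

```python
from itertools import combinations
from math import factorial

def cohort_mean(data, preds, subject, features):
    """Mean model score over the cohort of observed rows that match `subject`
    on the given feature indices (only observed combinations are used)."""
    cohort = [p for row, p in zip(data, preds)
              if all(row[j] == data[subject][j] for j in features)]
    return sum(cohort) / len(cohort)

def cohort_shapley(data, preds, subject, n_features):
    """Exact Shapley value of each feature for one subject, where the value of
    a coalition S is the cohort mean conditioned on matching the subject on S."""
    values = []
    for i in range(n_features):
        phi, others = 0.0, [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = (factorial(len(S)) * factorial(n_features - len(S) - 1)
                          / factorial(n_features))
                phi += weight * (cohort_mean(data, preds, subject, S + (i,))
                                 - cohort_mean(data, preds, subject, S))
        values.append(phi)
    return values

# Invented toy data: feature 0 plays the role of a protected attribute.
data = [(0, 0), (0, 1), (1, 0), (1, 1)]
preds = [0.1, 0.2, 0.6, 0.9]  # hypothetical model scores
phi = cohort_shapley(data, preds, subject=3, n_features=2)
```

For subject `(1, 1)` the two attributions sum to the subject's score minus the grand mean, the usual Shapley efficiency property.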
【76】 Analyzing Images for Music Recommendation
标题:面向音乐推荐的图像分析
作者:Anant Baijal,Vivek Agarwal,Danny Hyun
备注:IEEE International Conference on Consumer Electronics (IEEE ICCE 2021)
链接:https://arxiv.org/abs/2105.07135
摘要:为图像搭配合适的音乐可以极大地丰富整体用户体验。所提出的图像分析方法对艺术作品图像和照片图像区别对待。图像自动分类采用基于深度学习的模型完成。文中还给出了一项说明性分析,展示了我们的深度模型在对艺术作品分类时能够内在地学习并利用感知相关特征。对图像与推荐音乐配对进行主观评估所得的平均意见得分(MOS)支持了我们方法的有效性。
摘要:Experiencing images with suitable music can greatly enrich the overall user experience. The proposed image analysis method treats an artwork image differently from a photograph image. Automatic image classification is performed using deep-learning based models. An illustrative analysis showcasing the ability of our deep-models to inherently learn and utilize perceptually relevant features when classifying artworks is also presented. The Mean Opinion Score (MOS) obtained from subjective assessments of the respective image and recommended music pairs supports the effectiveness of our approach.
【77】 Hardware Synthesis of State-Space Equations; Application to FPGA Implementation of Shallow and Deep Neural Networks
标题:状态空间方程的硬件综合:浅层和深层神经网络的FPGA实现
作者:Amir-Hossein Kiamarzi,Pezhman Torabi,Reza Sameni
机构:Department of Biomedical Informatics, Emory University
链接:https://arxiv.org/abs/2105.07131
摘要:目前,浅层和深层神经网络(NN)在生物医学工程、图像处理、计算机视觉、语音识别等领域有着广泛应用。许多研究人员已经开发了包括现场可编程门阵列(FPGA)在内的硬件加速器,用于实现高性能且节能的NN。然而,硬件架构的设计过程对每个NN都是特定且耗时的,因此迫切需要一种设计、实现和优化NN的系统化方法。本文提出了一种在寄存器传输级(RTL)实现状态空间模型的系统化方法,并特别关注NN的实现。所提出的设计流程基于状态空间模型的迭代特性,以及状态空间公式与有限状态机之间的类比。该方法适用于线性/非线性和时变/时不变系统。它既可用于实现本质上迭代的系统(广泛应用于信号处理、数值分析、计算机算术和控制工程等领域),也可用于可改写为等价迭代形式的系统。具有内在状态空间形式的循环神经网络(如长短时记忆(LSTM)网络)的实现是该框架的另一个主要应用。作为案例研究,文中展示了状态空间系统可用于NN(作为非线性时变动态系统)的系统化实现与优化。作者还在线提供了RTL代码生成软件,简化了任意规模NN的自动生成。
摘要:Nowadays, shallow and deep Neural Networks (NNs) have vast applications including biomedical engineering, image processing, computer vision, and speech recognition. Many researchers have developed hardware accelerators including field-programmable gate arrays (FPGAs) for implementing high-performance and energy efficient NNs. Apparently, the hardware architecture design process is specific and time-consuming for each NN. Therefore, a systematic way to design, implement and optimize NNs is highly demanded. The paper presents a systematic approach to implement state-space models in register transfer level (RTL), with special interest for NN implementation. The proposed design flow is based on the iterative nature of state-space models and the analogy between state-space formulations and finite-state machines. The method can be used in linear/nonlinear and time-varying/time-invariant systems. It can also be used to implement either intrinsically iterative systems (widely used in various domains such as signal processing, numerical analysis, computer arithmetic, and control engineering), or systems that could be rewritten in equivalent iterative forms. The implementation of recurrent NNs such as long short-term memory (LSTM) NNs, which have intrinsic state-space forms, is another major application of this framework. As a case study, it is shown that state-space systems can be used for the systematic implementation and optimization of NNs (as nonlinear and time-varying dynamic systems). An RTL code generating software is also provided online, which simplifies the automatic generation of NNs of arbitrary size.
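The iteration/FSM analogy described in this abstract can be illustrated in software (the paper targets RTL; this Python sketch with an invented one-neuron recurrence only mirrors the control structure): one state register plus one combinational update applied per "clock tick".

```python
def run_state_space(f, g, x0, inputs):
    """Iterate a (possibly nonlinear, time-varying) discrete state-space model
        x[k+1] = f(x[k], u[k], k),   y[k] = g(x[k], u[k], k)
    the way an FSM in RTL would: one state register, one combinational
    update per clock tick."""
    x, outputs = x0, []
    for k, u in enumerate(inputs):
        outputs.append(g(x, u, k))
        x = f(x, u, k)  # 'register' update at the clock edge
    return outputs, x

# Invented example: a one-neuron recurrence with output y = x, x' = relu(0.5*x + u).
relu = lambda v: max(0.0, v)
f = lambda x, u, k: relu(0.5 * x + 1.0 * u)
g = lambda x, u, k: x
ys, x_final = run_state_space(f, g, x0=0.0, inputs=[1.0, 0.0, 0.0])
```

In hardware, `f` and `g` become combinational logic and `x` a register bank; the loop is the clock.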
【78】 Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
标题:面向成本感知周期时间缩短的规范性流程监控
作者:Zahra Dasht Bozorgi,Irene Teinemaa,Marlon Dumas,Marcello La Rosa
机构: The University of Melbourne, Melbourne, Australia, University of Tartu, Tartu, Estonia
链接:https://arxiv.org/abs/2105.07111
摘要:缩短周期时间是业务流程管理领域经常关注的问题。根据流程的不同,可以触发各种干预来缩短案例的周期时间,例如在订单到交付流程中使用更快的配送服务,或主动致电客户获取缺失信息而不是被动等待。每种干预都有代价。本文解决的问题是:确定是否以及何时触发缩短时间的干预,以最大化总净收益。本文提出了一种规范性流程监控方法,使用正交随机森林模型来估计对流程中每个正在进行的案例触发缩短时间干预的因果效应。基于该因果效应估计,该方法根据用户定义的策略触发干预。我们在两个真实事件日志上对该方法进行了评估。
摘要:Reducing cycle time is a recurrent concern in the field of business process management. Depending on the process, various interventions may be triggered to reduce the cycle time of a case, for example, using a faster shipping service in an order-to-delivery process or giving a phone call to a customer to obtain missing information rather than waiting passively. Each of these interventions comes with a cost. This paper tackles the problem of determining if and when to trigger a time-reducing intervention in a way that maximizes the total net gain. The paper proposes a prescriptive process monitoring method that uses orthogonal random forest models to estimate the causal effect of triggering a time-reducing intervention for each ongoing case of a process. Based on this causal effect estimate, the method triggers interventions according to a user-defined policy. The method is evaluated on two real-life logs.
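A minimal sketch of the triggering policy described here, assuming per-case causal-effect estimates are already available (in the paper they come from orthogonal random forests; the case IDs, effect values, and cost figures below are invented): intervene exactly when the estimated net gain is positive.

```python
def net_gain(effect_hours, value_per_hour, cost):
    """Expected net gain of intervening on one case: the estimated causal
    cycle-time reduction converted to money, minus the intervention cost."""
    return effect_hours * value_per_hour - cost

def triggered_cases(cases, value_per_hour, cost):
    """User-defined policy: intervene on every ongoing case whose estimated
    net gain is positive. `cases` pairs a case id with its estimated effect."""
    return [cid for cid, effect in cases
            if net_gain(effect, value_per_hour, cost) > 0]

# Hypothetical per-case effect estimates (hours saved if we intervene now),
# e.g. produced upstream by an orthogonal random forest.
cases = [("c1", 5.0), ("c2", 0.5), ("c3", 2.0)]
chosen = triggered_cases(cases, value_per_hour=10.0, cost=15.0)
```

With these numbers only `c1` and `c3` clear the cost threshold; `c2`'s half-hour saving does not pay for the intervention.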
【79】 An Effective Baseline for Robustness to Distributional Shift
标题:一种对分布偏移具有鲁棒性的有效基线
作者:Sunil Thulasidasan,Sushil Thapa,Sayera Dhaubhadel,Gopinath Chennupati,Tanmoy Bhattacharya,Jeff Bilmes
机构:Los Alamos National Laboratory, Los Alamos, NM, USA, Dept. of Computer Science and Engineering, New Mexico Tech, Socorro, NM, USA, Dept. of Electrical & Computer Engineering, University of Washington, Seattle, USA
链接:https://arxiv.org/abs/2105.07107
摘要:当面对与训练中所见类别不同的输入时,避免做出过于自信的预测,是安全部署深度学习系统的重要要求。虽然说起来简单,但这在深度学习中是一个特别具有挑战性的问题:模型在这种情况下往往会做出过度自信的预测。在这项工作中,我们提出了一种简单但非常有效的分布外检测方法,其采用弃权原则:当遇到来自未见类别的样本时,期望的行为是放弃预测。我们的方法使用一个带有额外弃权类的网络,并在经过扩充的数据集上训练:该数据集加入了一个未经整理的集合,其中包含大量被赋予弃权类标签的分布外(OoD)样本;随后训练模型,使其学习到区分分布内与分布外样本的有效判别器。我们将这种相对简单的方法与为分布外检测及深度学习不确定性建模提出的各种更复杂的方法进行了比较,并在大量图像识别和文本分类的基准与深度架构上实证验证了其有效性,其表现往往显著优于现有方法。鉴于该方法的简单性和有效性,我们建议将其作为该领域未来工作的一个新的额外基线。
摘要:Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems. While simple to state, this has been a particularly challenging problem in deep learning, where models often end up making overconfident predictions in such situations. In this work we present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention: when encountering a sample from an unseen class, the desired behavior is to abstain from predicting. Our approach uses a network with an extra abstention class and is trained on a dataset that is augmented with an uncurated set that consists of a large number of out-of-distribution (OoD) samples that are assigned the label of the abstention class; the model is then trained to learn an effective discriminator between in and out-of-distribution samples. We compare this relatively simple approach against a wide variety of more complex methods that have been proposed both for out-of-distribution detection as well as uncertainty modeling in deep learning, and empirically demonstrate its effectiveness on a wide variety of benchmarks and deep architectures for image recognition and text classification, often outperforming existing approaches by significant margins. Given the simplicity and effectiveness of this method, we propose that this approach be used as a new additional baseline for future work in this domain.
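A toy sketch of the abstention idea, assuming invented 2-D Gaussian clusters and a plain softmax regression in place of the paper's deep networks: out-of-distribution (OoD) training samples are simply assigned the label of an extra class, and predicting that class at test time means "abstain".

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data: two in-distribution classes plus OoD samples that
# receive the label of the extra abstention class (index 2).
X_in0 = rng.normal([-3.0, 0.0], 0.3, size=(30, 2))
X_in1 = rng.normal([+3.0, 0.0], 0.3, size=(30, 2))
X_ood = rng.normal([0.0, 4.0], 0.3, size=(30, 2))  # stand-in for the uncurated OoD set
X = np.vstack([X_in0, X_in1, X_ood])
y = np.array([0] * 30 + [1] * 30 + [2] * 30)       # label 2 = abstention class

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Plain softmax regression with K+1 = 3 outputs; a linear stand-in for the
# paper's deep networks -- the abstention idea lives in the labels, not the model.
W, b = np.zeros((2, 3)), np.zeros(3)
onehot = np.eye(3)[y]
for _ in range(1000):
    G = (softmax(X @ W + b) - onehot) / len(X)  # cross-entropy gradient
    W -= 0.1 * (X.T @ G)
    b -= 0.1 * G.sum(axis=0)

def predict(x):
    """Class index; 2 means 'abstain: input looks out-of-distribution'."""
    return int(np.argmax(softmax(np.atleast_2d(np.asarray(x, float)) @ W + b)))
```

Points near the in-distribution clusters recover their class labels, while points near the OoD cluster trigger abstention.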
【80】 Verification of Image-based Neural Network Controllers Using Generative Models
标题:使用生成模型验证基于图像的神经网络控制器
作者:Sydney M. Katz,Anthony L. Corso,Christopher A. Strong,Mykel J. Kochenderfer
机构:∗Denotes equal contribution
备注:10 pages, 12 figures, presented at the 2021 AIAA Digital Avionics Systems Conference (DASC)
链接:https://arxiv.org/abs/2105.07091
摘要:神经网络通常用于处理来自基于图像的传感器的信息以产生控制动作。虽然神经网络能有效完成这项任务,但其复杂性使其输出难以验证和预测,从而限制了其在安全关键系统中的应用。为此,近年来的研究集中于结合形式化方法与可达性分析技术,以获得对神经网络控制器闭环性能的保证。然而,这些技术无法扩展到基于图像的神经网络控制器的高维复杂输入空间。在这项工作中,我们提出一种方法来应对这些挑战:训练一个生成对抗网络(GAN)将状态映射为可信的输入图像。通过将生成器网络与控制网络级联,我们得到一个输入空间低维的网络。这一洞见使我们能够使用现有的闭环验证工具,为基于图像的控制器的性能获得形式化保证。我们将该方法应用于为自主飞机滑行问题中基于图像的神经网络控制器提供安全保证,证明了该控制器能使飞机保持在跑道上并将其引向跑道中心。我们提供的保证是针对由生成器网络建模的输入图像集合而言的,因此我们给出了一个召回率指标,用于评估生成器对可信图像空间的覆盖能力。
摘要:Neural networks are often used to process information from image-based sensors to produce control actions. While they are effective for this task, the complex nature of neural networks makes their output difficult to verify and predict, limiting their use in safety-critical systems. For this reason, recent work has focused on combining techniques in formal methods and reachability analysis to obtain guarantees on the closed-loop performance of neural network controllers. However, these techniques do not scale to the high-dimensional and complicated input space of image-based neural network controllers. In this work, we propose a method to address these challenges by training a generative adversarial network (GAN) to map states to plausible input images. By concatenating the generator network with the control network, we obtain a network with a low-dimensional input space. This insight allows us to use existing closed-loop verification tools to obtain formal guarantees on the performance of image-based controllers. We apply our approach to provide safety guarantees for an image-based neural network controller for an autonomous aircraft taxi problem. We guarantee that the controller will keep the aircraft on the runway and guide the aircraft towards the center of the runway. The guarantees we provide are with respect to the set of input images modeled by our generator network, so we provide a recall metric to evaluate how well the generator captures the space of plausible images.
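The concatenation trick is easy to state in code. Below, the "generator" and "controller" are invented linear maps standing in for the trained GAN generator and neural controller; the point is only the wiring: their composition is a function of the low-dimensional state, which is what verification tools then need to cover.

```python
import numpy as np

# Invented linear stand-ins: a 'generator' rendering a 2-D aircraft state
# (e.g. crosstrack error, heading) into a flat 16-pixel 'image', and a
# 'controller' mapping the image to a steering action.
G = np.arange(32, dtype=float).reshape(2, 16) / 100.0  # generator weights
c = np.linspace(-1.0, 1.0, 16)                         # controller weights

def generator(state):          # state (2,) -> image (16,)
    return state @ G

def controller(image):         # image (16,) -> scalar action
    return float(image @ c)

def closed_loop(state):
    """Concatenated network: verification now only needs to cover the
    low-dimensional state space rather than raw image space."""
    return controller(generator(state))

action = closed_loop(np.array([0.1, -0.2]))
```

For these linear stand-ins the composition collapses to a single low-dimensional map `state @ (G @ c)`, which is exactly the kind of reduced input space the paper's verification step exploits.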
【81】 Interpretable Drug Synergy Prediction with Graph Neural Networks for Human-AI Collaboration in Healthcare
标题:基于图神经网络的可解释药物协同预测:面向医疗健康中的人机协作
作者:Zehao Dong,Heming Zhang,Yixin Chen,Fuhai Li
机构:Computer Science, Washington University in St. Louis, St. Louis, MO, USA., Institute for Informatics (I,), Washington University School of Medicine, Washington, Department of Pediatrics, Washington University School of Medicine, Washington University
链接:https://arxiv.org/abs/2105.07082
摘要:我们以归纳且可解释的方式研究肿瘤药物联合治疗产生耐药或敏感反应的分子机制。虽然深度学习算法在药物协同预测问题中得到了广泛应用,但如何构建具有生物学意义的预测模型,以研究协同作用的神秘机制(MoS)并服务于医疗系统中的人机协作,仍是一个悬而未决的问题。为应对这些挑战,我们提出了一种深度图神经网络IDSP(可解释深度信号通路),将基因-基因以及基因-药物调控关系纳入协同药物组合预测。IDSP通过多层感知器(MLP),根据基因和药物节点的关系(即信号相互作用)自动学习边的权重,并以归纳方式聚合信息。该架构通过检测重要的信号相互作用产生可解释的药物协同预测,并且当底层分子机制涉及未见过的基因或信号通路时仍可应用。我们在由46条核心癌症信号通路中的基因以及NCI ALMANAC药物组合筛选数据中的药物组合构成的信号网络上测试了IDSP。实验结果表明:1) IDSP无需额外的药物化学信息即可从底层分子机制中学习并做出预测,同时取得与当前最先进方法高度可比的性能;2) IDSP在转导(transductive)任务和归纳(inductive)任务上执行协同预测时都表现出优越的通用性和灵活性;3) IDSP能够通过检测不同细胞系的不同显著信号模式(即MoS)产生可解释的结果。
摘要:We investigate molecular mechanisms of resistant or sensitive response of cancer drug combination therapies in an inductive and interpretable manner. Though deep learning algorithms are widely used in the drug synergy prediction problem, it is still an open problem to formulate the prediction model with biological meaning to investigate the mysterious mechanisms of synergy (MoS) for the human-AI collaboration in healthcare systems. To address the challenges, we propose a deep graph neural network, IDSP (Interpretable Deep Signaling Pathways), to incorporate the gene-gene as well as gene-drug regulatory relationships in synergic drug combination predictions. IDSP automatically learns weights of edges based on the gene and drug node relations, i.e., signaling interactions, by a multi-layer perceptron (MLP) and aggregates information in an inductive manner. The proposed architecture generates interpretable drug synergy prediction by detecting important signaling interactions, and can be implemented when the underlying molecular mechanism encounters unseen genes or signaling pathways. We test IDSP on signaling networks formulated by genes from 46 core cancer signaling pathways and drug combinations from NCI ALMANAC drug combination screening data. The experimental results demonstrated that 1) IDSP can learn from the underlying molecular mechanism to make predictions without additional drug chemical information while achieving highly comparable performance with current state-of-the-art methods; 2) IDSP shows superior generality and flexibility to implement the synergy prediction task on both transductive tasks and inductive tasks; 3) IDSP can generate interpretable results by detecting different salient signaling patterns (i.e. MoS) for different cell lines.
【82】 High-Robustness, Low-Transferability Fingerprinting of Neural Networks
标题:一种高鲁棒性、低可迁移性的神经网络指纹
作者:Siyue Wang,Xiao Wang,Pin-Yu Chen,Pu Zhao,Xue Lin
机构: Northeastern University, Boston University, IBM Research
备注:ICLR 2021 Workshop on Security and Safety in Machine Learning Systems
链接:https://arxiv.org/abs/2105.07078
摘要:本文提出用特征示例(Characteristic Examples)对深度神经网络进行有效指纹识别:这些示例在模型剪枝下对基础模型保持高鲁棒性,同时对无关联模型的可迁移性很低。这是首个在生成实用指纹时同时考虑鲁棒性和可迁移性的工作,而现有方法缺乏实际假设,可能产生较高的误报率。为了在鲁棒性和可迁移性之间取得更好的平衡,我们提出三种特征示例:vanilla C-示例、RC-示例和LTRC-示例,用于从原始基础模型中提取指纹。为公平刻画鲁棒性与可迁移性之间的权衡,我们提出唯一性得分(Uniqueness Score),这一综合指标衡量鲁棒性与可迁移性之差,同时也是误报问题的一个指示。
摘要:This paper proposes Characteristic Examples for effectively fingerprinting deep neural networks, featuring high-robustness to the base model against model pruning as well as low-transferability to unassociated models. This is the first work taking both robustness and transferability into consideration for generating realistic fingerprints, whereas current methods lack practical assumptions and may incur large false positive rates. To achieve better trade-off between robustness and transferability, we propose three kinds of characteristic examples: vanilla C-examples, RC-examples, and LTRC-example, to derive fingerprints from the original base model. To fairly characterize the trade-off between robustness and transferability, we propose Uniqueness Score, a comprehensive metric that measures the difference between robustness and transferability, which also serves as an indicator to the false alarm problem.
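One hypothetical reading of the Uniqueness Score (the exact formula is in the paper; this sketch simply takes robustness minus transferability, with models as plain callables and invented fingerprint pairs):

```python
def match_rate(fingerprints, model):
    """Fraction of fingerprint inputs on which `model` reproduces the base
    model's labels (models are plain callables in this sketch)."""
    return sum(model(x) == y for x, y in fingerprints) / len(fingerprints)

def uniqueness_score(fingerprints, pruned_models, unrelated_models):
    """Hypothetical reading of the metric: robustness (mean match rate on
    pruned copies of the base model) minus transferability (mean match rate
    on unassociated models). Higher means fewer false alarms."""
    robustness = sum(match_rate(fingerprints, m) for m in pruned_models) / len(pruned_models)
    transferability = sum(match_rate(fingerprints, m) for m in unrelated_models) / len(unrelated_models)
    return robustness - transferability

# Invented fingerprints (input, base-model label) and stand-in models.
fps = [(0, 0), (1, 1), (2, 0), (3, 1)]
pruned = [lambda x: x % 2, lambda x: x % 2]  # behave like the base model
unrelated = [lambda x: 0]                    # always predict 0
score = uniqueness_score(fps, pruned, unrelated)
```

A fingerprint set that survives pruning (robustness 1.0) but half-matches an unrelated model (transferability 0.5) scores 0.5 here, flagging a residual false-alarm risk.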
【83】 Node Selection Toward Faster Convergence for Federated Learning on Non-IID Data
标题:基于非IID数据的联邦学习快速收敛节点选择
作者:Hongda Wu,Ping Wang
机构: Department of Electrical Engineering and Computer Science, Lassonde School of Engineering, York University
链接:https://arxiv.org/abs/2105.07066
摘要:联邦学习(FL)是一种分布式学习范式,它使大量资源受限的节点能够在不共享数据的情况下协作训练模型。非独立同分布(non-i.i.d.)的数据样本会引起全局目标与局部目标之间的差异,使FL模型收敛变慢。本文提出了一种最优聚合(Optimal Aggregation)算法,通过检查局部梯度与全局梯度之间的关系,识别并排除不利的局部更新,从而在每个全局轮次中找出参与节点局部更新的最优子集。随后,我们提出了一个概率节点选择框架(FedPNS),根据最优聚合的输出动态调整每个节点被选中的概率。FedPNS可以优先选择能够加快模型收敛的节点。文中说明了所提FedPNS设计的无偏性,并从理论上分析了FedPNS相对于常用的联邦平均(FedAvg)算法在收敛速度上的提升。实验结果表明,与采用随机节点选择的FedAvg相比,FedPNS能有效加快FL的收敛速度。
摘要:Federated Learning (FL) is a distributed learning paradigm that enables a large number of resource-limited nodes to collaboratively train a model without data sharing. The non-independent-and-identically-distributed (non-i.i.d.) data samples invoke discrepancy between global and local objectives, making the FL model slow to converge. In this paper, we proposed Optimal Aggregation algorithm for better aggregation, which finds out the optimal subset of local updates of participating nodes in each global round, by identifying and excluding the adverse local updates via checking the relationship between the local gradient and the global gradient. Then, we proposed a Probabilistic Node Selection framework (FedPNS) to dynamically change the probability for each node to be selected based on the output of Optimal Aggregation. FedPNS can preferentially select nodes that propel faster model convergence. The unbiasedness of the proposed FedPNS design is illustrated and the convergence rate improvement of FedPNS over the commonly adopted Federated Averaging (FedAvg) algorithm is analyzed theoretically. Experimental results demonstrate the effectiveness of FedPNS in accelerating the FL convergence rate, as compared to FedAvg with random node selection.
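A simplified sketch of the exclusion test behind Optimal Aggregation, assuming the "adverse update" check is a negative inner product between a local gradient and the (mean) global gradient; the gradient vectors below are invented.

```python
import numpy as np

def optimal_aggregation(local_grads):
    """Drop 'adverse' local updates -- those whose inner product with the
    (mean) global gradient is negative -- then re-average the rest. This is
    a simplified reading of the paper's exclusion check."""
    g_global = np.mean(local_grads, axis=0)
    kept = [g for g in local_grads if float(g @ g_global) >= 0.0]
    return np.mean(kept, axis=0), len(kept)

# Invented per-node gradients; the third points against the consensus,
# as a non-i.i.d. node's update might.
grads = [np.array([1.0, 0.0]),
         np.array([0.9, 0.1]),
         np.array([-1.0, -0.2])]
agg, n_kept = optimal_aggregation(grads)
```

The adverse third update is excluded, and the aggregate becomes the mean of the two aligned gradients; FedPNS would then raise the selection probability of the aligned nodes.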
【84】 Visual analogy: Deep learning versus compositional models
标题:视觉类比:深度学习与构图模型
作者:Nicholas Ichien,Qing Liu,Shuhao Fu,Keith J. Holyoak,Alan Yuille,Hongjing Lu
机构: Department of Psychology, Department of Statistics, University of California, Los Angeles, Los Angeles, CA, USA; Department of Computer Science, Department of Cognitive Science, Johns Hopkins University, Baltimore, MD, USA
链接:https://arxiv.org/abs/2105.07065
摘要:类比推理是否是一项必须通过将深度学习模型应用于海量推理问题、从零开始学习求解的任务?还是说,类比是通过计算类比对象的结构化表示之间的相似性来解决的?我们通过比较人类在视觉类比题上的表现与若干替代计算模型的表现来回答这个问题,这些类比题使用熟悉的三维物体(汽车及其子区域)的图像创建。人类推理者在所有问题类型上都达到了高于随机水平的准确率,但在某些条件下(例如相关子区域被遮挡时)出现了更多错误。我们将人类表现与两个直接训练来求解这些类比问题的新近深度学习模型(孪生网络和关系网络)进行了比较,也与一个评估基于部件的表示之间关系相似性的组合模型进行了比较。基于部件表示的组合模型(而非深度学习模型)产生了与人类推理者相似的定性表现。
摘要:Is analogical reasoning a task that must be learned to solve from scratch by applying deep learning models to massive numbers of reasoning problems? Or are analogies solved by computing similarities between structured representations of analogs? We address this question by comparing human performance on visual analogies created using images of familiar three-dimensional objects (cars and their subregions) with the performance of alternative computational models. Human reasoners achieved above-chance accuracy for all problem types, but made more errors in several conditions (e.g., when relevant subregions were occluded). We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations. The compositional model based on part representations, but not the deep learning models, generated qualitative performance similar to that of human reasoners.
【85】 Improving Graph Neural Networks with Simple Architecture Design
标题:用简单的结构设计改进图神经网络
作者:Sunil Kumar Maurya,Xin Liu,Tsuyoshi Murata
机构:Tokyo Institute of Technology, Tokyo, Japan, AIRC, AIST
链接:https://arxiv.org/abs/2105.07634
摘要:图神经网络通过施加基于图结构的额外约束,成为从数据中学习的有效工具。这些图通常是基于实体之间假定的内在关系创建的。近年来,架构设计有了巨大改进,推动了各种预测任务性能的提升。一般来说,这些神经架构将层深度与节点特征聚合步骤结合在一起,这使得分析不同跳数特征的重要性以及神经网络层的表达能力成为一项挑战。由于不同的图数据集在特征和类标签分布上表现出不同程度的同配性(homophily)与异配性(heterophily),在没有任何先验信息的情况下,了解哪些特征对预测任务重要就变得至关重要。在这项工作中,我们将节点特征聚合步骤与图神经网络的深度解耦,并介绍了图神经网络的几项关键设计策略。更具体地说,我们提出使用softmax作为正则化器,并作为对来自不同跳距邻居的聚合特征的"软选择器";以及在GNN层上使用"跳归一化"(Hop-Normalization)。结合这些技术,我们提出了一个简单而浅层的模型——特征选择图神经网络(FSGNN),并通过实验证明该模型优于其他最先进的GNN模型,在节点分类任务上的准确率最多提高64%。此外,分析模型学到的软选择参数,为研究特征在预测任务中的重要性提供了一种简单方法。最后,我们通过实验证明该模型可扩展到拥有数百万节点和数十亿条边的大型图。
摘要:Graph Neural Networks have emerged as a useful tool to learn on the data by applying additional constraints based on the graph structure. These graphs are often created with assumed intrinsic relations between the entities. In recent years, there have been tremendous improvements in the architecture design, pushing the performance up in various prediction tasks. In general, these neural architectures combine layer depth and node feature aggregation steps. This makes it challenging to analyze the importance of features at various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in features and class label distribution, it becomes essential to understand which features are important for the prediction tasks without any prior information. In this work, we decouple the node feature aggregation step and depth of graph neural network and introduce several key design strategies for graph neural networks. More specifically, we propose to use softmax as a regularizer and "Soft-Selector" of features aggregated from neighbors at different hop distances; and "Hop-Normalization" over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model outperforms other state of the art GNN models and achieves up to 64% improvements in accuracy on node classification tasks. Moreover, analyzing the learned soft-selection parameters of the model provides a simple way to study the importance of features in the prediction tasks. Finally, we demonstrate with experiments that the model is scalable for large graphs with millions of nodes and billions of edges.
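A numpy sketch of the two design elements named here, softmax "soft-selection" over hop-k aggregated features and "hop-normalization", on an invented two-node graph (the real FSGNN learns the hop logits; here they are fixed numbers):

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

def fsgnn_layer(A_hat, X, alpha):
    """Soft-select among hop-k feature matrices A_hat^k @ X (k = 0..len(alpha)-1)
    with softmax weights, L2-normalizing each hop's node features first
    ('hop-normalization'). alpha holds one logit per hop; FSGNN learns these,
    here they are fixed."""
    hops = [X.astype(float)]
    for _ in range(len(alpha) - 1):
        hops.append(A_hat @ hops[-1])
    w = softmax(np.asarray(alpha, float))
    out = np.zeros_like(X, dtype=float)
    for wk, H in zip(w, hops):
        norms = np.linalg.norm(H, axis=1, keepdims=True)
        out += wk * (H / np.where(norms == 0, 1.0, norms))
    return out, w

A_hat = np.array([[0.5, 0.5], [0.5, 0.5]])  # tiny normalized adjacency (2 nodes)
X = np.eye(2)                                # one-hot node features
out, w = fsgnn_layer(A_hat, X, alpha=[0.0, 0.0, 0.0])
```

Because the softmax weights form a convex combination of unit-normalized hop features, each output row stays inside the unit ball; inspecting the learned `alpha` is what gives the feature-importance reading mentioned in the abstract.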
【86】 Deep learning for detecting pulmonary tuberculosis via chest radiography: an international study across 10 countries
标题:通过胸片检测肺结核的深度学习:一项涉及10个国家的国际研究
作者:Sahar Kazemzadeh,Jin Yu,Shahar Jamshy,Rory Pilgrim,Zaid Nabulsi,Christina Chen,Neeral Beladia,Charles Lau,Scott Mayer McKinney,Thad Hughes,Atilla Kiraly,Sreenivasa Raju Kalidindi,Monde Muyoyeta,Jameson Malemela,Ting Shih,Greg S. Corrado,Lily Peng,Katherine Chou,Po-Hsuan Cameron Chen,Yun Liu,Krish Eswaran,Daniel Tse,Shravya Shetty,Shruthi Prabhakara
机构: Google Health, Palo Alto, CA, USA; Work done at Google via Advanced Clinical, Deerfield, IL, USA; Apollo Radiology International, Hyderabad, India; TB Department, Center of Infectious Disease Research in Zambia, Lusaka, Zambia
链接:https://arxiv.org/abs/2105.07540
摘要:肺结核(TB)是全球十大死因之一。尽管世卫组织建议用胸片(CXR)进行结核病筛查,但CXR判读资源的有限可得性是一大障碍。我们使用来自非洲、亚洲和欧洲9个国家的CXR训练了一个深度学习系统(DLS)来检测活动性肺结核,并利用了大规模CXR预训练、注意力池化和噪声学生(noisy student)半监督学习。评估基于:(1)一个横跨中国、印度、美国和赞比亚的综合测试集,以及(2)南非一个独立的矿工人群。鉴于世卫组织90%敏感性和70%特异性的目标,DLS的工作点被预先设定为偏向敏感性而非特异性。在综合测试集上,DLS的ROC曲线高于全部9位印度放射科医生,AUC为0.90(95%可信区间0.87-0.92)。DLS的敏感性(88%)高于印度放射科医生的平均敏感性(75%),优效性p<0.001;其特异性(79%)不劣于放射科医生(平均特异性84%),p=0.004。在HIV阳性和痰涂片阳性亚组以及南非测试集中也观察到了类似趋势。我们发现,5名美国放射科医生(结核病在美国不是地方病)比印度放射科医生(结核病在印度是地方病)更敏感、特异性更低;DLS同样不劣于美国放射科医生。在模拟实验中,与单独使用确证检测相比,将DLS用作确证检测的优先级排序工具,使每检出一例阳性病例的成本降低了40-80%。总之,我们的DLS推广到了5个国家,值得进行前瞻性评估,以在放射科医生有限的环境中协助具有成本效益的筛查工作。工作点的灵活性或许允许针对结核病流行率、人口统计学、临床资源和惯常实践模式等特定场所因素对DLS进行定制。
摘要:Tuberculosis (TB) is a top-10 cause of death worldwide. Though the WHO recommends chest radiographs (CXRs) for TB screening, the limited availability of CXR interpretation is a barrier. We trained a deep learning system (DLS) to detect active pulmonary TB using CXRs from 9 countries across Africa, Asia, and Europe, and utilized large-scale CXR pretraining, attention pooling, and noisy student semi-supervised learning. Evaluation was on (1) a combined test set spanning China, India, US, and Zambia, and (2) an independent mining population in South Africa. Given WHO targets of 90% sensitivity and 70% specificity, the DLS's operating point was prespecified to favor sensitivity over specificity. On the combined test set, the DLS's ROC curve was above all 9 India-based radiologists, with an AUC of 0.90 (95%CI 0.87-0.92). The DLS's sensitivity (88%) was higher than the India-based radiologists (75% mean sensitivity), p<0.001 for superiority; and its specificity (79%) was non-inferior to the radiologists (84% mean specificity), p=0.004. Similar trends were observed within HIV positive and sputum smear positive sub-groups, and in the South Africa test set. We found that 5 US-based radiologists (where TB isn't endemic) were more sensitive and less specific than the India-based radiologists (where TB is endemic). The DLS also remained non-inferior to the US-based radiologists. In simulations, using the DLS as a prioritization tool for confirmatory testing reduced the cost per positive case detected by 40-80% compared to using confirmatory testing alone. To conclude, our DLS generalized to 5 countries, and merits prospective evaluation to assist cost-effective screening efforts in radiologist-limited settings. Operating point flexibility may permit customization of the DLS to account for site-specific factors such as TB prevalence, demographics, clinical resources, and customary practice patterns.
【87】 Optimal control of robust team stochastic games
标题:鲁棒团队随机对策的最优控制
作者:Feng Huang,Ming Cao,Long Wang
机构: University of Groningen
备注:under review
链接:https://arxiv.org/abs/2105.07405
摘要:在随机动态环境中,团队随机博弈是研究完全合作多智能体系统顺序决策问题的一种通用范式。然而,导出的策略的最优性通常对模型参数敏感,而模型参数通常是未知的,在实际应用中需要根据噪声数据进行估计。为了降低最优策略对这些不确定参数的敏感性,本文提出了一个“鲁棒”团队随机博弈模型,参与者利用鲁棒优化方法进行决策。该模型将团队随机博弈扩展到不完全信息的情形,同时提出了鲁棒团队最优解的概念。为了寻求这样的解,我们提出了一种Gauss-Seidel修正策略迭代的学习算法,并证明了其收敛性。与鲁棒动态规划算法相比,该算法不仅具有更快的收敛速度,而且允许使用近似计算来减轻维数灾难。此外,通过将社会困境博弈模型推广到序贯稳健情景,通过数值仿真验证了该算法的有效性。
摘要:In stochastic dynamic environments, team stochastic games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to the model parameters, which are typically unknown and required to be estimated from noisy data in practice. To mitigate the sensitivity of the optimal policy to these uncertain parameters, in this paper, we propose a model of "robust" team stochastic games, where players utilize a robust optimization approach to make decisions. This model extends team stochastic games to the scenario of incomplete information and meanwhile provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a learning algorithm in the form of a Gauss-Seidel modified policy iteration and prove its convergence. This algorithm, compared with robust dynamic programming, not only possesses a faster convergence rate, but also allows for using approximation calculations to alleviate the curse of dimensionality. Moreover, some numerical simulations are presented to demonstrate the effectiveness of the algorithm by generalizing the game model of social dilemmas to sequential robust scenarios.
【88】 A brain basis of dynamical intelligence for AI and computational neuroscience
标题:人工智能和计算神经科学的动态智能的脑基础
作者:Joseph D. Monaco,Kanaka Rajan,Grace M. Hwang
机构: Department of Biomedical Engineering, Johns Hopkins University (JHU) School of Medicine, Baltimore, MD, USA; Icahn School of Medicine at Mount Sinai, New York, NY, USA; JHU Applied Physics Lab, Laurel, MD, USA; JHU Kavli Neuroscience Discovery
备注:Perspective article: 178 references, 24 pages, 3 figures, and 1 glossary box
链接:https://arxiv.org/abs/2105.07284
摘要:现代人工智能(AI)的深度神经网络尚未实现生物智能的标志性特征,包括抽象、因果学习和能量效率。虽然扩展到更大的模型为当前应用带来了性能提升,但更类脑的能力可能需要新的理论、模型和方法来设计人工学习系统。在此,我们认为这一重新审视来自大脑的洞见的机会,应当促进AI研究与理论驱动的计算神经科学(CN)之间的合作。为了论证神经计算的大脑基础,我们提出了一种关于智能的动力学观点,并由此阐述网络结构中的稀疏性、时间动力学和交互式学习等概念。特别是,我们认为,通过神经同步、嵌套振荡和灵活序列所表达的时间动力学,为读取和更新分布在长期记忆网络中的层次模型提供了丰富的计算层。此外,在AI和CN中采用以智能体为中心的范式,将加速我们对构建有用世界模型的复杂动力学与行为的理解。AI/CN理论与目标的融合将揭示大脑和工程学习系统的智能动力学原理。本文的灵感来自我们在第六届美国NIH脑计划(BRAIN Initiative)研究者年会上组织的动力学神经科学与机器学习研讨会。
摘要:The deep neural nets of modern artificial intelligence (AI) have not achieved defining features of biological intelligence, including abstraction, causal learning, and energy-efficiency. While scaling to larger models has delivered performance improvements for current applications, more brain-like capacities may demand new theories, models, and methods for designing artificial learning systems. Here, we argue that this opportunity to reassess insights from the brain should stimulate cooperation between AI research and theory-driven computational neuroscience (CN). To motivate a brain basis of neural computation, we present a dynamical view of intelligence from which we elaborate concepts of sparsity in network structure, temporal dynamics, and interactive learning. In particular, we suggest that temporal dynamics, as expressed through neural synchrony, nested oscillations, and flexible sequences, provide a rich computational layer for reading and updating hierarchical models distributed in long-term memory networks. Moreover, embracing agent-centered paradigms in AI and CN will accelerate our understanding of the complex dynamics and behaviors that build useful world models. A convergence of AI/CN theories and objectives will reveal dynamical principles of intelligence for brains and engineered learning systems. This article was inspired by our symposium on dynamical neuroscience and machine learning at the 6th Annual US/NIH BRAIN Initiative Investigators Meeting.
【89】 A Monotone Approximate Dynamic Programming Approach for the Stochastic Scheduling, Allocation, and Inventory Replenishment Problem: Applications to Drone and Electric Vehicle Battery Swap Stations
标题:随机调度、分配和库存补充问题的单调近似动态规划方法:在无人机和电动汽车电池交换站的应用
作者:Amin Asadi,Sarah Nurre Pinkley
机构:Department of Industrial Engineering, University of Arkansas, Bell Engineering, Fayetteville, AR
链接:https://arxiv.org/abs/2105.07026
摘要:电动汽车(EV)和无人机在许多领域的应用日益受到关注。然而,里程焦虑和电池退化等与电池相关的问题阻碍了其普及。电池更换站是缓解这些担忧的一种替代方案,它允许在几分钟内将耗尽的电池换成满电电池。我们考虑在显式考虑换电需求的不确定到达、电池退化和电池更换的情况下,确定电池更换站操作决策的问题。我们使用有限时域马尔可夫决策过程模型来刻画电池更换站的运营,即随机调度、分配与库存补充问题(SAIRP),它决定随时间推移何时以及对多少电池进行充电、放电和更换。对于特殊的SAIRP情形,我们给出了价值函数单调性及最优策略单调结构的理论证明。鉴于维数灾难,我们提出了一种新的单调近似动态规划(ADP)方法,利用回归对价值函数近似进行智能初始化。计算测试表明,与精确方法和其他单调ADP方法相比,新的基于回归的单调ADP方法性能更优。此外,通过这些测试,我们还得出了关于无人机换电站的策略洞见。
摘要:There is a growing interest in using electric vehicles (EVs) and drones for many applications. However, battery-oriented issues, including range anxiety and battery degradation, impede adoption. Battery swap stations are one alternative to reduce these concerns that allow the swap of depleted for full batteries in minutes. We consider the problem of deriving actions at a battery swap station when explicitly considering the uncertain arrival of swap demand, battery degradation, and replacement. We model the operations at a battery swap station using a finite horizon Markov Decision Process model for the stochastic scheduling, allocation, and inventory replenishment problem (SAIRP), which determines when and how many batteries are charged, discharged, and replaced over time. We present theoretical proofs for the monotonicity of the value function and monotone structure of an optimal policy for special SAIRP cases. Due to the curses of dimensionality, we develop a new monotone approximate dynamic programming (ADP) method, which intelligently initializes a value function approximation using regression. In computational tests, we demonstrate the superior performance of the new regression-based monotone ADP method as compared to exact methods and other monotone ADP methods. Further, with the tests, we deduce policy insights for drone swap stations.
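The monotone structure exploited by monotone ADP can be sketched in one helper: after each value-function update, project the estimate back onto nondecreasing functions (a simplified 1-D running-maximum projection; the paper's method is more elaborate and initializes the approximation by regression).

```python
def monotone_projection(values):
    """Project a 1-D value-function estimate onto nondecreasing functions via a
    running maximum. In monotone ADP, such a projection is applied after each
    update so the approximation keeps the structure proved for the optimal
    value function (a simplified 1-D stand-in for the paper's method)."""
    out, best = [], float("-inf")
    for v in values:
        best = max(best, v)
        out.append(best)
    return out

# Invented noisy value estimates over an ordered state (e.g. battery inventory).
projected = monotone_projection([1.0, 0.5, 2.0, 1.5])
```

Enforcing the proved monotone shape after every update is what lets the approximation stay sensible with far fewer observed states than exact dynamic programming would need.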