Artificial Intelligence Track

1025 views, uploaded 2021-02-22 09:31:05

The following article is from 语言学札记 (Linguistics Notes)

Artificial Intelligence (20 papers)

[1]:Privacy Information Classification: A Hybrid Approach
Authors: Jiaqi Wu, Weihua Li, Quan Bai, Takayuki Ito, Ahmed Moustafa
Comments: IJCAI 2019 Workshop. The 4th International Workshop on Smart Simulation and Modelling for Complex Systems
Link: https://arxiv.org/abs/2101.11574

Abstract: A large amount of information is published to online social networks (OSNs) every day. Individual privacy-related information is also possibly disclosed unconsciously by the end-users. Identifying privacy-related data and protecting the online social network users from privacy leakage turn out to be significant. Under such a motivation, this study aims to propose and develop a hybrid privacy classification approach to detect and classify privacy information from OSNs. The proposed hybrid approach employs both deep learning models and ontology-based models for privacy-related information extraction. Extensive experiments are conducted to validate the proposed hybrid approach, and the empirical results demonstrate its superiority in assisting online social network users against privacy leakage.

 

[2]:Evolution of artificial intelligence languages, a systematic literature review
Authors: Emmanuel Adetiba, Temitope John, Adekunle Akinrinmade, Funmilayo Moninuola, Oladipupo Akintade, Joke Badejo
Link: https://arxiv.org/abs/2101.11501

Abstract: The field of Artificial Intelligence (AI) has undoubtedly received significant attention in recent years. AI is being adopted to provide solutions to problems in fields such as medicine, engineering, education, government and several other domains. In order to analyze the state of the art of research in the field of AI, we present a systematic literature review focusing on the Evolution of AI programming languages. We followed the systematic literature review method by searching relevant databases like SCOPUS, IEEE Xplore and Google Scholar. EndNote reference manager was used to catalog the relevant extracted papers. Our search returned a total of 6565 documents, whereof 69 studies were retained. Of the 69 retained studies, 15 documents discussed LISP programming language, another 34 discussed PROLOG programming language, the remaining 20 documents were spread between Logic and Object Oriented Programming (LOOP), ARCHLOG, Epistemic Ontology Language with Constraints (EOLC), Python, C++, ADA and JAVA programming languages. This review provides information on the year of implementation, development team, capabilities, limitations and applications of each of the AI programming languages discussed. The information in this review could guide practitioners and researchers in AI to make the right choice of languages to implement their novel AI methods.

 

[3]:On formal concepts of random formal contexts
Authors: Taro Sakurai
Comments: 7 pages, 2 figures, 1 table
Link: https://arxiv.org/abs/2101.11023

Abstract: In formal concept analysis, it is well-known that the number of formal concepts can be exponential in the worst case. To analyze the average case, we introduce a probabilistic model for random formal contexts and prove that the average number of formal concepts has a superpolynomial asymptotic lower bound.
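
To make "counting formal concepts" concrete, here is a minimal brute-force sketch in Python. It is not taken from the paper: the function names are hypothetical, and the random model used here (each object-attribute pair is present independently with probability p) is only an assumption for illustration, which may differ from the paper's probabilistic model.

```python
from itertools import combinations
import random


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a binary context.

    context[g][m] is True when object g has attribute m.
    Brute force over object subsets, so only suitable for tiny contexts.
    """
    n_objects = len(context)
    n_attrs = len(context[0]) if context else 0

    def intent(objects):
        # Attributes shared by every object in `objects`.
        return frozenset(
            m for m in range(n_attrs) if all(context[g][m] for g in objects)
        )

    def extent(attrs):
        # Objects that have every attribute in `attrs`.
        return frozenset(
            g for g in range(n_objects) if all(context[g][m] for m in attrs)
        )

    concepts = set()
    for k in range(n_objects + 1):
        for objs in combinations(range(n_objects), k):
            shared = intent(objs)
            # (extent(shared), shared) is always a formal concept.
            concepts.add((extent(shared), shared))
    return concepts


def random_context(n_objects, n_attrs, p, seed=0):
    """Assumed model: each incidence is present independently with probability p."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(n_attrs)] for _ in range(n_objects)]


# Worst case: the context with context[g][m] = (g != m) has 2**n concepts.
worst = [[g != m for m in range(3)] for g in range(3)]
print(len(formal_concepts(worst)))
print(len(formal_concepts(random_context(6, 6, 0.5))))
```

The first context is the standard worst-case example, where every subset of objects is an extent; averaging the second count over many seeds gives an empirical feel for the average case the abstract refers to.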

 

[4]:An Integrated Localisation, Motion Planning and Obstacle Avoidance Algorithm in Belief Space
Authors: Antony Thomas, Fulvio Mastrogiovanni, Marco Baglietto
Comments: Accepted for publication in Intelligent Service Robotics
Link: https://arxiv.org/abs/2101.11566

Abstract: As robots are being increasingly used in close proximity to humans and objects, it is imperative that robots operate safely and efficiently under real-world conditions. Yet, the environment is seldom known perfectly. Noisy sensors and actuation errors compound to the errors introduced while estimating features of the environment. We present a novel approach (1) to incorporate these uncertainties for robot state estimation and (2) to compute the probability of collision pertaining to the estimated robot configurations. The expression for collision probability is obtained as an infinite series and we prove its convergence. An upper bound for the truncation error is also derived and the number of terms required is demonstrated by analyzing the convergence for different robot and obstacle configurations. We evaluate our approach using two simulation domains which use a roadmap-based strategy to synthesize trajectories that satisfy collision probability bounds.

 

[5]:Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly Detection
Authors: Ece Calikus, Slawomir Nowaczyk, Mohamed-Rafik Bouguelia, Onur Dikmen
Comments: Submitted to IEEE TKDE
Link: https://arxiv.org/abs/2101.11560

Abstract: In contextual anomaly detection (CAD), an object is only considered anomalous within a specific context. Most existing methods for CAD use a single context based on a set of user-specified contextual features. However, identifying the right context can be very challenging in practice, especially in datasets with a large number of attributes. Furthermore, in real-world systems, there might be multiple anomalies that occur in different contexts and, therefore, require a combination of several "useful" contexts to unveil them. In this work, we leverage active learning and ensembles to effectively detect complex contextual anomalies in situations where the true contextual and behavioral attributes are unknown. We propose a novel approach, called WisCon (Wisdom of the Contexts), that automatically creates contexts from the feature set. Our method constructs an ensemble of multiple contexts, with varying importance scores, based on the assumption that not all useful contexts are equally so. Experiments show that WisCon significantly outperforms existing baselines in different categories (i.e., active classifiers, unsupervised contextual and non-contextual anomaly detectors, and supervised classifiers) on seven datasets. Furthermore, the results support our initial hypothesis that there is no single perfect context that successfully uncovers all kinds of contextual anomalies, and leveraging the "wisdom" of multiple contexts is necessary.

 

[6]:Modelling the Impact of Scandals: the case of the 2017 French Presidential Election
Authors: Yassine Bouachrine, Carole Adam
Comments: internship report
Link: https://arxiv.org/abs/2101.11548

Abstract: This paper proposes an agent-based simulation of a presidential election, inspired by the French 2017 presidential election. The simulation is based on data extracted from polls, media coverage, and Twitter. The main contribution is to consider the impact of scandals and media bashing on the result of the election. In particular, it is shown that scandals can lead to higher abstention at the election, as voters have no relevant candidate left to vote for. The simulation is implemented in Unity 3D and is available to play online.

 

[7]:Multi-agent simulation of voter's behaviour
Authors: Albin Soutif, Carole Adam, Sylvain Bouveret
Comments: internship report
Link: https://arxiv.org/abs/2101.11538

Abstract: The goal of this paper is to simulate voters' behaviour given a voting method. Our approach uses a multi-agent simulation in order to model a voting process through many iterations, so that the voters can vote by taking into account the results of polls. Here we only tried basic rules and a single voting method, but further attempts could explore new features.

 

[8]:A Balance for Fairness: Fair Distribution Utilising Physics in Games of Characteristic Function Form
Authors: Song-Ju Kim, Taiki Takahashi, Kazuo Sano
Comments: 13 pages, 5 figures
Link: https://arxiv.org/abs/2101.11496

Abstract: In chaotic modern society, there is an increasing demand for the realization of true 'fairness'. In Greek mythology, Themis, the 'goddess of justice', has a sword in her right hand to protect society from vices, and a 'balance of judgment' in her left hand that measures good and evil. In this study, we propose a fair distribution method 'utilising physics' for the profit in games of characteristic function form. Specifically, we show that the linear programming problem for calculating the 'nucleolus' can be efficiently solved by considering it as a physical system in which gravity works. In addition to significantly reducing computational complexity, we believe that this system could have the flexibility necessary to respond to real-time changes in the parameters.

 

[9]:Formulating and solving integrated order batching and routing in multi-depot AGV-assisted mixed-shelves warehouses
Authors: Lin Xie, Hanyi Li, Laurin Luttmann
Link: https://arxiv.org/abs/2101.11473

Abstract: Different retail and e-commerce companies are facing the challenge of assembling large numbers of time-critical picking orders that include both single-line and multi-line orders. To reduce unproductive picker working time as in traditional picker-to-parts warehousing systems, different solutions are proposed in the literature and in practice. For example, in a mixed-shelves storage policy, items of the same stock keeping unit are spread over several shelves in a warehouse; or automated guided vehicles (AGVs) are used to transport the picked items from the storage area to packing stations instead of human pickers. This is the first paper to combine both solutions, creating what we call AGV-assisted mixed-shelves picking systems. We model the new integrated order batching and routing problem in such systems as an extended multi-depot vehicle routing problem with both three-index and two-commodity network flow formulations. Due to the complexity of the integrated problem, we develop a novel variable neighborhood search algorithm to solve the integrated problem more efficiently. We test our methods with different sizes of instances, and conclude that the mixed-shelves storage policy is more suitable than the usual storage policy in AGV-assisted mixed-shelves systems for both single-line and multi-line orders (saving up to 67% on driving distances for AGVs). Our variable neighborhood search algorithm provides close-to-optimal solutions within an acceptable computational time.

 

[10]:Online LDA based brain-computer interface system to aid disabled people
Authors: Apdullah Yayik, Yakup Kutlu
Comments: 13 pages, 4 figures, Natural and Engineering Sciences
Link: https://arxiv.org/abs/2101.11435

Abstract: This paper aims to develop a brain-computer interface system based on electroencephalography that can aid disabled people in daily life. The system relies on one of the most effective event-related potential waves, P300, which can be elicited by the oddball paradigm. The developed application has a basic interaction tool that enables disabled people to convey their needs to other people by selecting related objects. These objects pseudo-randomly flash in a visual interface on a computer screen. The user must focus on the related object to convey the desired need. The system can convey desired needs correctly by detecting the P300 wave in the acquired 14-channel EEG signal and classifying it with a linear discriminant analysis classifier in just 15 seconds. Experiments have been carried out on 19 volunteers to validate the developed BCI system. As a result, an accuracy rate of 90.83% is achieved in online performance.

 

[11]:Learning Abstract Representations through Lossy Compression of Multi-Modal Signals
Authors: Charles Wilmot, Jochen Triesch
Link: https://arxiv.org/abs/2101.11376

Abstract: A key competence for open-ended learning is the formation of increasingly abstract representations useful for driving complex behavior. Abstract representations ignore specific details and facilitate generalization. Here we consider the learning of abstract representations in a multi-modal setting with two or more input modalities. We treat the problem as a lossy compression problem and show that generic lossy compression of multimodal sensory input naturally extracts abstract representations that tend to strip away modality-specific details and preferentially retain information that is shared across the different modalities. Furthermore, we propose an architecture to learn abstract representations by identifying and retaining only the information that is shared across multiple modalities while discarding any modality-specific information.

 

[12]:Combat Data Shift in Few-shot Learning with Knowledge Graph
Authors: Yongchun Zhu, Fuzhen Zhuang, Xiangliang Zhang, Zhiyuan Qi, Zhiping Shi, Qing He
Comments: 10 pages, 3 figures
Link: https://arxiv.org/abs/2101.11354

Abstract: Many few-shot learning approaches have been designed under the meta-learning framework, which learns from a variety of learning tasks and generalizes to new tasks. These meta-learning approaches achieve the expected performance in the scenario where all samples are drawn from the same distributions (i.i.d. observations). However, in real-world applications, the few-shot learning paradigm often suffers from data shift, i.e., samples in different tasks, even in the same task, could be drawn from various data distributions. Most existing few-shot learning approaches are not designed with the consideration of data shift, and thus show downgraded performance when the data distribution shifts. However, it is non-trivial to address the data shift problem in few-shot learning, due to the limited number of labeled samples in each task. To address this problem, we propose a novel metric-based meta-learning framework to extract task-specific representations and task-shared representations with the help of a knowledge graph. The data shift within/between tasks can thus be combated by the combination of task-shared and task-specific representations. The proposed model is evaluated on popular benchmarks and two newly constructed challenging datasets. The evaluation results demonstrate its remarkable performance.

 

[13]:Compositional Semantics for Probabilistic Programs with Exact Conditioning
Authors: Dario Stein, Sam Staton
Comments: 16 pages, 5 figures
Link: https://arxiv.org/abs/2101.11351

Abstract: We define a probabilistic programming language for Gaussian random variables with a first-class exact conditioning construct. We give operational, denotational and equational semantics for this language, establishing convenient properties like exchangeability of conditions. Conditioning on equality of continuous random variables is nontrivial, as the exact observation may have probability zero; this is Borel's paradox. Using categorical formulations of conditional probability, we show that the good properties of our language are not particular to Gaussians, but can be derived from universal properties, thus generalizing to wider settings. We define the Cond construction, which internalizes conditioning as a morphism, providing general compositional semantics for probabilistic programming with exact conditioning.

 

[14]:FedH2L: Federated Learning with Model and Statistical Heterogeneity
Authors: Yiying Li, Wei Zhou, Huaimin Wang, Haibo Mi, Timothy M. Hospedales
Link: https://arxiv.org/abs/2101.11296

Abstract: Federated learning (FL) enables distributed participants to collectively learn a strong global model without sacrificing their individual data privacy. Mainstream FL approaches require each participant to share a common network architecture and further assume that data are sampled IID across participants. However, in real-world deployments participants may require heterogeneous network architectures; and the data distribution is almost certainly non-uniform across participants. To address these issues we introduce FedH2L, which is agnostic to the model architecture and robust to different data distributions across participants. In contrast to approaches sharing parameters or gradients, FedH2L relies on mutual distillation, exchanging only posteriors on a shared seed set between participants in a decentralized manner. This makes it extremely bandwidth efficient, model agnostic, and crucially produces models capable of performing well on the whole data distribution when learning from heterogeneous silos.

 

[15]:Modeling opinion leader's role in the diffusion of innovation
Authors: Natasa Vodopivec, Carole Adam, Jean-Pierre Chanteau
Comments: Internship report
Link: https://arxiv.org/abs/2101.11260

Abstract: The diffusion of innovations is an important topic for the consumer markets. Early research focused on how innovations spread on the level of the whole society. To get closer to real-world scenarios, agent-based models (ABM) started focusing on individual-level agents. In our work we will translate an existing ABM that investigates the role of opinion leaders in the process of diffusion of innovations to a new, more expressive platform designed for agent-based modeling, GAMA. We will do it to show that taking advantage of new features of the chosen platform should be encouraged when making models in the field of social sciences in the future, because it can be beneficial for the explanatory power of simulation results.

 

[16]:Graph Neural Network for Traffic Forecasting: A Survey
Authors: Weiwei Jiang, Jiayun Luo
Link: https://arxiv.org/abs/2101.11174

Abstract: Traffic forecasting is an important factor for the success of intelligent transportation systems. Deep learning models including convolution neural networks and recurrent neural networks have been applied in traffic forecasting problems to model the spatial and temporal dependencies. In recent years, to model the graph structures in the transportation systems as well as the contextual information, graph neural networks (GNNs) are introduced as new tools and have achieved the state-of-the-art performance in a series of traffic forecasting problems. In this survey, we review the rapidly growing body of recent research using different GNNs, e.g., graph convolutional and graph attention networks, in various traffic forecasting problems, e.g., road traffic flow and speed forecasting, passenger flow forecasting in urban rail transit systems, demand forecasting in ride-hailing platforms, etc. We also present a collection of open data and source resources for each problem, as well as future research directions. To the best of our knowledge, this paper is the first comprehensive survey that explores the application of graph neural networks for traffic forecasting problems. We have also created a public GitHub repository to update the latest papers, open data and source resources.

 

[17]:Autonomous Off-road Navigation over Extreme Terrains with Perceptually-challenging Conditions
Authors: Rohan Thakker, Nikhilesh Alatur, David D. Fan, Jesus Tordesillas, Michael Paton, Kyohei Otsu, Olivier Toupet, Ali-akbar Agha-mohammadi
Comments: 12 Pages, 7 Figures, 2020 International Symposium on Experimental Robotics (ISER 2020)
Link: https://arxiv.org/abs/2101.11110

Abstract: We propose a framework for resilient autonomous navigation in perceptually challenging unknown environments with mobility-stressing elements such as uneven surfaces with rocks and boulders, steep slopes, negative obstacles like cliffs and holes, and narrow passages. Environments are GPS-denied and perceptually-degraded with variable lighting from dark to lit and obscurants (dust, fog, smoke). Lack of prior maps and degraded communication eliminates the possibility of prior or off-board computation or operator intervention. This necessitates real-time on-board computation using noisy sensor data. To address these challenges, we propose a resilient architecture that exploits redundancy and heterogeneity in sensing modalities. Further resilience is achieved by triggering recovery behaviors upon failure. We propose a fast settling algorithm to generate robust multi-fidelity traversability estimates in real-time. The proposed approach was deployed on multiple physical systems including skid-steer and tracked robots, a high-speed RC car and legged robots, as a part of Team CoSTAR's effort to the DARPA Subterranean Challenge, where the team won 2nd and 1st place in the Tunnel and Urban Circuits, respectively.

 

[18]:Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization
Authors: Aaron Defazio, Samy Jelassi
Link: https://arxiv.org/abs/2101.11075

Abstract: We introduce MADGRAD, a novel optimization method in the family of AdaGrad adaptive gradient methods. MADGRAD shows excellent performance on deep learning optimization problems from multiple fields, including classification and image-to-image tasks in vision, and recurrent and bidirectionally-masked models in natural language processing. For each of these tasks, MADGRAD matches or outperforms both SGD and ADAM in test set performance, even on problems for which adaptive methods normally perform poorly.

 

[19]:The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors
Authors: William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals
Comments: 37 pages, initial submission, accepted at NeurIPS. arXiv admin note: substantial text overlap with arXiv:1904.10079
Link: https://arxiv.org/abs/2101.11071

Abstract: Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this direction, we propose this second iteration of the MineRL Competition. The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical, and sparse environments. To that end, participants compete under a limited environment sample-complexity budget to develop systems which solve the MineRL ObtainDiamond task in Minecraft, a sequential decision making environment requiring long-term planning, hierarchical control, and efficient exploration methods. The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment with different game textures and shaders. At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform where they are trained from scratch on a hold-out dataset-environment pair for a total of 4 days on a pre-specified hardware platform. In this follow-up iteration to the NeurIPS 2019 MineRL Competition, we implement new features to expand the scale and reach of the competition. In response to the feedback of the previous participants, we introduce a second minor track focusing on solutions without access to environment interactions of any kind except during test-time. Further we aim to prompt domain agnostic submissions by implementing several novel competition mechanics including action-space randomization and desemantization of observations and actions.

 

[20]:Logical-Combinatorial Approaches in Dynamic Recognition Problems
Authors: L. Aslanyan, V. Krasnoproshin, V. Ryazanov, H. Sahakyan
Comments: research paper
Link: https://arxiv.org/abs/2101.11066

Abstract: A pattern recognition scenario, where instead of object classification into the classes by the learning set, the algorithm aims to allocate all objects to the same, the so-called normal class, is the research objective.

 

Overlap with the CV track (10 papers)

[1]:Self-Calibrating Active Binocular Vision via Active Efficient Coding with Deep Autoencoders
Authors: Charles Wilmot, Bertram E. Shi, Jochen Triesch
Link: https://arxiv.org/abs/2101.11391

Abstract: We present a model of the self-calibration of active binocular vision comprising the simultaneous learning of visual representations, vergence, and pursuit eye movements. The model follows the principle of Active Efficient Coding (AEC), a recent extension of the classic Efficient Coding Hypothesis to active perception. In contrast to previous AEC models, the present model uses deep autoencoders to learn sensory representations. We also propose a new formulation of the intrinsic motivation signal that guides the learning of behavior. We demonstrate the performance of the model in simulations.

 

[2]:Learning task-agnostic representation via toddler-inspired learning
Authors: Kwanyoung Park, Junseok Park, Hyunseok Oh, Byoung-Tak Zhang, Youngki Lee
Link: https://arxiv.org/abs/2101.11221

Abstract: One of the inherent limitations of current AI systems, stemming from the passive learning mechanisms (e.g., supervised learning), is that they perform well on labeled datasets but cannot deduce knowledge on their own. To tackle this problem, we derive inspiration from a highly intentional learning system via action: the toddler. Inspired by the toddler's learning procedure, we design an interactive agent that can learn and store task-agnostic visual representation while exploring and interacting with objects in the virtual environment. Experimental results show that such obtained representation was expandable to various vision tasks such as image classification, object localization, and distance estimation tasks. Specifically, the proposed model achieved 100%, 75.1% accuracy and 1.62% relative error, respectively, which is noticeably better than the autoencoder-based model (99.7%, 66.1%, 1.95%), and also comparable with those of supervised models (100%, 87.3%, 0.71%).

 

[3]:Bottleneck Transformers for Visual Recognition
Authors: Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
Comments: Technical Report, 20 pages, 13 figures, 19 tables
Link: https://arxiv.org/abs/2101.11605

Abstract: We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation. By just replacing the spatial convolutions with global self-attention in the final three bottleneck blocks of a ResNet and no other changes, our approach improves upon the baselines significantly on instance segmentation and object detection while also reducing the parameters, with minimal overhead in latency. Through the design of BoTNet, we also point out how ResNet bottleneck blocks with self-attention can be viewed as Transformer blocks. Without any bells and whistles, BoTNet achieves 44.4% Mask AP and 49.7% Box AP on the COCO Instance Segmentation benchmark using the Mask R-CNN framework; surpassing the previous best published single model and single scale results of ResNeSt evaluated on the COCO validation set. Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84.7% top-1 accuracy on the ImageNet benchmark while being up to 2.33x faster in compute time than the popular EfficientNet models on TPU-v3 hardware. We hope our simple and effective approach will serve as a strong baseline for future research in self-attention models for vision.

 

[4]:The Work of Art in an Age of Mechanical Generation
Authors: Steven J. Frank
Comments: This is the author's final version; the article has been accepted for publication in Leonardo Journal
Link: https://arxiv.org/abs/2101.11587

Abstract: Can we define what it means to be "creative," and if so, can our definition drive artificial intelligence (AI) systems to feats of creativity indistinguishable from human efforts? This mixed question is considered from technological and social perspectives. Beginning with an exploration of the value we attach to authenticity in works of art, the article considers the ability of AI to detect forgeries of renowned paintings and, in so doing, somehow reveal the quiddity of a work of art. We conclude by considering whether evolving technical capability can revise traditional relationships among art, artist, and the market.

 

[5]:Detecting Deepfake Videos Using Euler Video Magnification
Authors: Rashmiranjan Das, Gaurav Negi, Alan F. Smeaton
Comments: Presented at Electronic Imaging: Media Watermarking, Security, and Forensics, 27 January 2021, 6 pages, 6 figures
Link: https://arxiv.org/abs/2101.11563

Abstract: Recent advances in artificial intelligence make it progressively hard to distinguish between genuine and counterfeit media, especially images and videos. One recent development is the rise of deepfake videos, based on manipulating videos using advanced machine learning techniques. This involves replacing the face of an individual from a source video with the face of a second person, in the destination video. This idea is becoming progressively refined as deepfakes are getting progressively seamless and simpler to compute. Combined with the outreach and speed of social media, deepfakes could easily fool individuals when depicting someone saying things that never happened and thus could persuade people into believing fictional scenarios, creating distress, and spreading fake news. In this paper, we examine a technique for possible identification of deepfake videos. We use Euler video magnification which applies spatial decomposition and temporal filtering on video data to highlight and magnify hidden features like skin pulsation and subtle motions. Our approach uses features extracted from the Euler technique to train three models to classify counterfeit and unaltered videos and compare the results with existing techniques.

 

[6]:Automatic Detection of Occulted Hard X-ray Flares Using Deep-Learning Methods
Authors: Shin-nosuke Ishikawa, Hideaki Matsumura, Yasunobu Uchiyama, Lindsay Glesener
Comments: 11 pages, 3 figures, accepted for publication in Solar Physics
Link: https://arxiv.org/abs/2101.11550

Abstract: We present a concept for a machine-learning classification of hard X-ray (HXR) emissions from solar flares observed by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), identifying flares that are either occulted by the solar limb or located on the solar disk. Although HXR observations of occulted flares are important for particle-acceleration studies, HXR data analyses for past observations were time consuming and required specialized expertise. Machine-learning techniques are promising for this situation, and we constructed a sample model to demonstrate the concept using a deep-learning technique. Input data to the model are HXR spectrograms that are easily produced from RHESSI data. The model can detect occulted flares without the need for image reconstruction or for visual inspection by experts. A technique of convolutional neural networks was used in this model by regarding the input data as images. Our model achieved a classification accuracy better than 90%, and the ability for the application of the method to either event screening or for an event alert for occulted flares was successfully demonstrated.

 

[7]:Meta Adversarial Training
Authors: Jan Hendrik Metzen, Nicole Finnie, Robin Hutmacher
Link: https://arxiv.org/abs/2101.11453

Abstract: Recently demonstrated physical-world adversarial attacks have exposed vulnerabilities in perception systems that pose severe risks for safety-critical applications such as autonomous driving. These attacks place adversarial artifacts in the physical world that indirectly cause the addition of universal perturbations to inputs of a model that can fool it in a variety of contexts. Adversarial training is the most effective defense against image-dependent adversarial attacks. However, tailoring adversarial training to universal perturbations is computationally expensive since the optimal universal perturbations depend on the model weights which change during training. We propose meta adversarial training (MAT), a novel combination of adversarial training with meta-learning, which overcomes this challenge by meta-learning universal perturbations along with model training. MAT requires little extra computation while continuously adapting a large set of perturbations to the current model. We present results for universal patch and universal perturbation attacks on image classification and traffic-light detection. MAT considerably increases robustness against universal patch attacks compared to prior work.

 

[8]:Learning Non-linear Wavelet Transformation via Normalizing Flow
Authors: Shuo-Hui Li
Comments: Main text: 7 pages, 5 figures. Supplement: 5 pages. GitHub link: this https URL
Link: https://arxiv.org/abs/2101.11306
 

Abstract: Wavelet transformation stands as a cornerstone in modern data analysis and signal processing. Its mathematical essence is an invertible transformation that discerns slow patterns from fast patterns in the frequency domain, which repeats at each level. Such an invertible transformation can be learned by a designed normalizing flow model. With a factor-out scheme resembling the wavelet downsampling mechanism, a mutually independent prior, and parameter sharing along the depth of the network, one can train normalizing flow models to factor out variables corresponding to fast patterns at different levels, thus extending linear wavelet transformations to non-linear learnable models. In this paper, a concrete way of building such flows is given. We then demonstrate the model's ability in lossless compression, progressive loading, and super-resolution (upsampling) tasks. Lastly, an analysis of the learned model in terms of low-pass/high-pass filters is given.
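
The invertible slow/fast split that the abstract describes can be illustrated with the classical linear Haar transform. This is only a minimal sketch of the fixed, linear case that the paper generalizes; it is not the learned normalizing flow, and all function names are illustrative assumptions.

```python
def haar_step(signal):
    """Split an even-length signal into slow (pair averages) and fast (pair differences) parts."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def haar_inverse_step(low, high):
    """Exactly undo haar_step: a = low + high, b = low - high."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def haar_forward(signal, levels):
    """Repeat the split on the slow part, factoring out the fast part at each level."""
    details = []
    low = list(signal)
    for _ in range(levels):
        low, high = haar_step(low)
        details.append(high)
    return low, details


def haar_inverse(low, details):
    """Rebuild the signal from the coarsest slow part and the stored fast parts."""
    for high in reversed(details):
        low = haar_inverse_step(low, high)
    return low


signal = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
coarse, details = haar_forward(signal, levels=3)
restored = haar_inverse(coarse, details)
print(coarse, details, restored)
```

Because every step is exactly invertible, no information is lost; keeping only `coarse` plus some of the `details` lists corresponds loosely to the progressive-loading idea mentioned in the abstract.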

 

[9]:TorchPRISM: Principal Image Sections Mapping, a novel method for Convolutional Neural Network features visualization
Authors: Tomasz Szandala
Comments: Very early draft; software can be found at this https URL
Link: https://arxiv.org/abs/2101.11266
 

Abstract: In this paper we introduce a tool called Principal Image Sections Mapping (PRISM), dedicated to PyTorch but easily portable to other deep learning frameworks. The software relies on Principal Component Analysis to visualize the most significant features recognized by a given Convolutional Neural Network. Moreover, it can display comparative sets of features between images processed in the same batch, so PRISM pairs well with the Explanation by Example technique.

 

[10]:Puzzle-CAM: Improved localization via matching partial and full features
Authors: Sanhyun Jo, In-Jae Yu
Link: https://arxiv.org/abs/2101.11253
 

Abstract: Weakly-supervised semantic segmentation (WSSS) is introduced to narrow the gap for semantic segmentation performance from pixel-level supervision to image-level supervision. Most advanced approaches are based on class activation maps (CAMs) to generate pseudo-labels to train the segmentation network. The main limitation of WSSS is that the process of generating pseudo-labels from CAMs that use an image classifier is mainly focused on the most discriminative parts of the objects. To address this issue, we propose Puzzle-CAM, a process that minimizes differences between the features from separate patches and the whole image. Our method consists of a puzzle module and two regularization terms to discover the most integrated region in an object. Puzzle-CAM can activate the overall region of an object using image-level supervision without requiring extra parameters. In experiments, Puzzle-CAM outperformed previous state-of-the-art methods using the same labels for supervision on the PASCAL VOC 2012 dataset. Code associated with our experiments is available at this https URL.
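
The idea of comparing maps computed on separate patches with the map computed on the whole image can be sketched on a toy grid. Everything here is an illustrative assumption: `toy_cam` is a hypothetical stand-in for a real class activation map, and the single mean-absolute-difference term below is not claimed to match the paper's regularization terms.

```python
def tile(grid):
    """Split a 2D grid with even height and width into four quadrant patches."""
    h, w = len(grid) // 2, len(grid[0]) // 2
    return [[row[c:c + w] for row in grid[r:r + h]] for r in (0, h) for c in (0, w)]


def merge(patches):
    """Inverse of tile: stitch the four quadrants back into one grid."""
    top_left, top_right, bottom_left, bottom_right = patches
    top = [a + b for a, b in zip(top_left, top_right)]
    bottom = [a + b for a, b in zip(bottom_left, bottom_right)]
    return top + bottom


def toy_cam(grid):
    """Hypothetical stand-in for a CAM: values normalized by the input's own maximum."""
    peak = max(max(row) for row in grid) or 1.0
    return [[v / peak for v in row] for row in grid]


def reconstruction_loss(full, merged):
    """Mean absolute difference between the whole-image map and the merged patch maps."""
    diffs = [abs(a - b) for fr, mr in zip(full, merged) for a, b in zip(fr, mr)]
    return sum(diffs) / len(diffs)


image = [
    [8.0, 4.0, 1.0, 1.0],
    [4.0, 2.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
    [2.0, 2.0, 2.0, 4.0],
]
full_cam = toy_cam(image)
puzzle_cam = merge([toy_cam(patch) for patch in tile(image)])
loss = reconstruction_loss(full_cam, puzzle_cam)
print(loss)
```

The whole-image map is dominated by the strongest region, while each patch is normalized on its own, so the two disagree and the loss is non-zero; training against such a difference is what pushes a model toward activating the whole object.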

 

Overlap with the NLP section (9 papers)

[1]:Using Finite-State Machines to Automatically Scan Classical Greek Hexameter
Authors: Anne-Kathrin Schumann, Christoph Beierle, Norbert Blößner
Comments: 13 pages, 5 figures
Link: https://arxiv.org/abs/2101.11437
 

Abstract: This paper presents a fully automatic approach to the scansion of Classical Greek hexameter verse. In particular, the paper describes an algorithm that uses deterministic finite-state automata and local linguistic rules to implement a targeted search for valid spondeus patterns and, in addition, a weighted finite-state transducer to correct and complete partial analyses and to reject invalid candidates. The paper also details the results of an empirical evaluation of the annotation quality resulting from this approach on hand-annotated data. It is shown that a finite-state approach provides quick and linguistically sound analyses of hexameter verses as well as an efficient formalisation of linguistic knowledge. The project code is available (see this https URL).
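
As a rough illustration of how a deterministic finite-state automaton can accept or reject metrical patterns, the sketch below checks a sequence of syllable quantities against the standard dactylic hexameter scheme: five feet that are each a dactyl (long, short, short) or a spondee (long, long), followed by a final two-syllable foot whose last syllable may be long or short. It assumes the quantities are already known and encoded as `L` and `S`; deriving them from Greek text, which is the hard part the paper addresses, is not attempted here, and the function name is hypothetical.

```python
def is_hexameter(quantities):
    """Return True if a string of 'L' (long) / 'S' (short) syllables forms a valid hexameter line."""
    foot = 0  # number of completed feet (0..6)
    pos = 0   # 0: at foot start, 1: after the initial long, 2: after long + short
    for q in quantities:
        if q not in "LS" or foot == 6:
            return False  # unknown symbol, or extra syllables after the sixth foot
        if pos == 0:
            if q != "L":
                return False  # every foot starts with a long syllable
            pos = 1
        elif pos == 1:
            if foot == 5 or q == "L":
                # final foot: second syllable is free; otherwise a spondee closes here
                foot, pos = foot + 1, 0
            else:
                pos = 2  # first short of a dactyl
        else:
            if q != "S":
                return False  # a dactyl needs a second short syllable
            foot, pos = foot + 1, 0
    return foot == 6 and pos == 0


all_dactyls = "LSS" * 5 + "LL"
all_spondees = "LL" * 6
print(is_hexameter(all_dactyls), is_hexameter(all_spondees), is_hexameter("LSS" * 6))
```

The state is just the pair `(foot, pos)`, so the automaton has a small fixed number of states and processes a line in a single left-to-right pass.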

 

[2]:Challenges Encountered in Turkish Natural Language Processing Studies
Authors: Kadir Tohma, Yakup Kutlu
Comments: 8 pages, Natural and Engineering Sciences
Link: https://arxiv.org/abs/2101.11436
 

Abstract: Natural language processing is a branch of computer science that combines artificial intelligence with linguistics. It aims to analyze a language element such as writing or speech with software and convert it into information. Considering that each language has its own grammatical rules and vocabulary diversity, the complexity of the studies in this field is understandable. For instance, Turkish is a very interesting language in many ways. Examples of this are agglutinative word structure, consonant/vowel harmony, a large number of productive derivational morphemes (practically infinite vocabulary), derivation and syntactic relations, and a complex emphasis on vocabulary and phonological rules. In this study, the interesting features of Turkish in terms of natural language processing are discussed. In addition, summary information is given about natural language processing techniques, systems and various resources developed for Turkish.

 

[3]:TSQA: Tabular Scenario Based Question Answering
Authors: Xiao Li, Yawei Sun, Gong Cheng
Comments: 9 pages, accepted to AAAI 2021
Link: https://arxiv.org/abs/2101.11429
 

Abstract: Scenario-based question answering (SQA) has attracted increasing research interest. Compared with the well-studied machine reading comprehension (MRC), SQA is a more challenging task: a scenario may contain not only a textual passage to read but also structured data like tables, i.e., tabular scenario based question answering (TSQA). AI applications of TSQA such as answering multiple-choice questions in high-school exams require synthesizing data in multiple cells and combining tables with texts and domain knowledge to infer answers. To support the study of this task, we construct GeoTSQA. This dataset contains 1k real questions contextualized by tabular scenarios in the geography domain. To solve the task, we extend state-of-the-art MRC methods with TTGen, a novel table-to-text generator. It generates sentences from variously synthesized tabular data and feeds the downstream MRC method with the most useful sentences. Its sentence ranking model fuses the information in the scenario, question, and domain knowledge. Our approach outperforms a variety of strong baseline methods on GeoTSQA.

 

[4]:Inheritance-guided Hierarchical Assignment for Clinical Automatic Diagnosis
Authors: Yichao Du, Pengfei Luo, Xudong Hong, Tong Xu, Zhe Zhang, Chao Ren, Yi Zheng, Enhong Chen
Comments: 17 pages, 5 figures, DASFAA 2021
Link: https://arxiv.org/abs/2101.11374
 

Abstract: Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making. Considering that manual diagnosis could be error-prone and time-consuming, many intelligent approaches based on clinical text mining have been proposed to perform automatic diagnosis. However, these methods may not achieve satisfactory results due to the following challenges. First, most of the diagnosis codes are rare, and the distribution is extremely unbalanced. Second, existing methods struggle to capture the correlation between diagnosis codes. Third, the lengthy clinical note leads to the excessive dispersion of key information related to codes. To tackle these challenges, we propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis. Specifically, we propose a hierarchical joint prediction strategy to address the challenge of unbalanced code distribution. Then, we utilize graph convolutional neural networks to obtain the correlation and semantic representations of medical ontology. Furthermore, we introduce multi-attention mechanisms to extract crucial information. Finally, extensive experiments on the MIMIC-III dataset clearly validate the effectiveness of our method.

 

[5]:Enquire One's Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion
Authors: Suyuchen Wang, Ruihui Zhao, Xi Chen, Yefeng Zheng, Bang Liu
Comments: 12 pages, 6 figures. To appear in The Web Conference (WWW) 2021
Link: https://arxiv.org/abs/2101.11268
 

Abstract: Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrades the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure's properties to maximize the coherence of the expanded taxonomy. HEF makes use of the taxonomy's hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing the most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy's subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing the taxonomy's coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.
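
The two metrics quoted at the end of the abstract can be written down in a few lines. This is a sketch of the standard textbook definitions only; the paper's exact evaluation protocol may differ, and the ranks below are made-up illustrative values, not results from the paper.

```python
def accuracy(ranks):
    """Fraction of queries whose correct position is ranked first."""
    return sum(1 for r in ranks if r == 1) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the correct position over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical 1-based ranks of the true parent for four new terms.
ranks = [1, 2, 4, 1]
acc = accuracy(ranks)
mrr = mean_reciprocal_rank(ranks)
print(acc, mrr)
```

Accuracy only rewards an exact top-1 hit, whereas mean reciprocal rank also gives partial credit when the correct position appears near the top of the candidate list.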

 

[6]:Joint Coreference Resolution and Character Linking for Multiparty Conversation
Authors: Jiaxin Bai, Hongming Zhang, Yangqiu Song, Kun Xu
Comments: EACL 2021
Link: https://arxiv.org/abs/2101.11204
 

Abstract: Character linking, the task of linking mentioned people in conversations to the real world, is crucial for understanding the conversations. For the efficiency of communication, humans often choose to use pronouns (e.g., "she") or normal phrases (e.g., "that girl") rather than named entities (e.g., "Rachel") in spoken language, which makes linking those mentions to real people a much more challenging task than regular entity linking. To address this challenge, we propose to incorporate the richer context from the coreference relations among different mentions to help the linking. On the other hand, considering that finding coreference clusters is itself not a trivial task and could benefit from the global character information, we propose to jointly solve these two tasks. Specifically, we propose C$^2$, the joint learning model of Coreference resolution and Character linking. The experimental results demonstrate that C$^2$ can significantly outperform previous works on both tasks. Further analyses are conducted to analyze the contribution of all modules in the proposed model and the effect of all hyper-parameters.

 

[7]:LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction
Authors: Jacob Solawetz, Stefan Larson
Comments: EACL 2021
Link: https://arxiv.org/abs/2101.11177
 

Abstract: Open Information Extraction (OIE) systems seek to compress the factual propositions of a sentence into a series of n-ary tuples. These tuples are useful for downstream tasks in natural language processing like knowledge base creation, textual entailment, and natural language understanding. However, current OIE datasets are limited in both size and diversity. We introduce a new dataset by converting the QA-SRL 2.0 dataset to a large-scale OIE dataset (LSOIE). Our LSOIE dataset is 20 times larger than the next largest human-annotated OIE dataset. We construct and evaluate several benchmark OIE models on LSOIE, providing baselines for future improvements on the task. Our LSOIE data, models, and code are made publicly available.

 

[8]:Exploring multi-task multi-lingual learning of transformer models for hate speech and offensive speech identification in social media
Authors: Sudhanshu Mishra, Shivangi Prasad, Shubhanshu Mishra
Comments: To be published in SN Computer Science at this https URL. 30 pages, 6 figures. Code available at this https URL
Link: https://arxiv.org/abs/2101.11155
 

Abstract: Hate speech has become a major content moderation issue for online social media platforms. Given the volume and velocity of online content production, it is impossible to manually moderate hate-speech-related content on any platform. In this paper we utilize a multi-task and multi-lingual approach based on recently proposed Transformer Neural Networks to solve three sub-tasks for hate speech. These sub-tasks were part of the 2019 shared task on hate speech and offensive content (HASOC) identification in Indo-European languages. We expand on our submission to that competition by utilizing multi-task models which are trained using three approaches: a) multi-task learning with separate task heads, b) back-translation, and c) multi-lingual training. Finally, we investigate the performance of various models and identify instances where the Transformer-based models perform differently and better. We show that it is possible to utilize different combined approaches to obtain models that can generalize easily on different languages and tasks, while trading off slight accuracy (in some cases) for a much reduced inference-time compute cost. We open source an updated version of our HASOC 2019 code with the new improvements at this https URL.

 

[9]:Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings
Authors: Kailash Karthik Saravanakumar, Miguel Ballesteros, Muthu Kumar Chandrasekaran, Kathleen McKeown
Comments: To appear in Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics
Link: https://arxiv.org/abs/2101.11059
 

Abstract: We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations, and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a novel adaptation of the triplet loss into a linear classification objective. We show that the use of a suitable fine-tuning objective and external knowledge in pre-trained transformer models yields significant improvements in the effectiveness of contextual embeddings for clustering. Our model achieves a new state-of-the-art on a standard stream clustering dataset of English documents.
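
The basic loop of a non-parametric streaming clustering scheme can be sketched as follows: each incoming document joins its most similar cluster or opens a new one. As a loudly stated simplification, a fixed cosine-similarity threshold stands in for the paper's learned neural classifier and its combination of sparse and dense similarities; the function names, the threshold value and the toy vectors are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def stream_cluster(docs, threshold=0.8):
    """Assign each document, in arrival order, to the best cluster or to a new one."""
    centroids, counts, labels = [], [], []
    for vec in docs:
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else None
        if best is not None and sims[best] >= threshold:
            # Join the existing cluster and update its centroid as a running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], vec)]
        else:
            # No cluster is similar enough: the number of clusters grows with the stream.
            best = len(centroids)
            centroids.append(list(vec))
            counts.append(1)
        labels.append(best)
    return labels, centroids


# Toy 2-D "document embeddings" arriving one at a time.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.1]]
labels, centroids = stream_cluster(docs)
print(labels, len(centroids))
```

The number of clusters is never fixed in advance, which is what makes the scheme non-parametric; replacing the threshold test with a trained classifier over several similarity scores is the direction the abstract describes.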

Note: the Chinese titles in the original post were machine-translated and are for reference only.
