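Daily Paper Digest: Natural Language Processing (May 19 update)

Produced by | 深度学习这件小事 (WeChat official account)

To republish, please contact the account for authorization

Natural Language Processing (May 19 update)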

[1] Reconstructing Maps from Text

Authors | Johnathan E. Avery, Robert L. Goldstone, Michael N. Jones

Link | https://arxiv.org/abs/2005.08932

[2] Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

Authors | Sam Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson

Link | https://arxiv.org/abs/2005.08866

Note | ACL 2020

[3] Grammatical gender associations outweigh topical gender bias in crosslinguistic word embeddings

Authors | Katherine McCurdy, Oguz Serbetci

Link | https://arxiv.org/abs/2005.08864

Note | Extended abstract presented at the WiNLP workshop, ACL 2017

[4] Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals

Authors | Kate McCurdy, Sharon Goldwater, Adam Lopez

Link | https://arxiv.org/abs/2005.08826

Note | To appear at ACL 2020

[5] Interaction Matching for Long-Tail Multi-Label Classification

Authors | Sean MacAvaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder

Link | https://arxiv.org/abs/2005.08805

[6] Corpus of Chinese Dynastic Histories: Gender Analysis over Two Millennia

Authors | Sergey Zinin, Yang Xu

Link | https://arxiv.org/abs/2005.08793

Note | 12th Conference on Language Resources and Evaluation (LREC 2020)

[7] Improving Named Entity Recognition in Tor Darknet with Local Distance Neighbor Feature

Authors | Mhd Wesam Al-Nabki, Francisco Jañez-Martino, Roberto A. Vasco-Carofilis, Eduardo Fidalgo, Javier Velasco-Mata

Link | https://arxiv.org/abs/2005.08746

Note | To be published at the JNIC 2020 conference

[8] The presence of occupational structure in online texts based on word embedding NLP models

Authors | Zoltán Kmetty, Julia Koltai, Tamás Rudas

Link | https://arxiv.org/abs/2005.08612

Note | Paper presented at IC2S2 2019 and RC28 summer meeting 2019 (Columbia University)

[9] Efficient Wait-k Models for Simultaneous Machine Translation

Authors | Maha Elbayad, Laurent Besacier, Jakob Verbeek

Link | https://arxiv.org/abs/2005.08595

[10] SemEval-2020 Task 5: Detecting Counterfactuals by Disambiguation

Authors | Hanna Abi Akl, Dominique Mariko, Estelle Labidurie

Link | https://arxiv.org/abs/2005.08519

[11] Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks

Authors | Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Chitta Baral

Link | https://arxiv.org/abs/2005.08516

[12] Text Classification with Few Examples using Controlled Generalization

Authors | Abhijit Mahabal, Jason Baldridge, Burcu Karagol Ayan, Vincent Perot, Dan Roth

Link | https://arxiv.org/abs/2005.08469

[13] Syntax-guided Controlled Generation of Paraphrases

Authors | Ashutosh Kumar, Kabir Ahuja, Raghuram Vadapalli, Partha Talukdar

Link | https://arxiv.org/abs/2005.08417

Note | Accepted to TACL 2020

[14] MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform

Authors | Xiang Gao, Michel Galley, Bill Dolan

Link | https://arxiv.org/abs/2005.08365

Note | Accepted at ACL 2020

[15] Cross-Lingual Word Embeddings for Turkic Languages

Authors | Elmurod Kuriyozov, Yerai Doval, Carlos Gómez-Rodríguez

Link | https://arxiv.org/abs/2005.08340

Note | Final version, published in the proceedings of LREC 2020

[16] Context-Based Quotation Recommendation

Authors | Ansel MacLaughlin, Tao Chen, Burcu Karagol Ayan, Dan Roth

Link | https://arxiv.org/abs/2005.08319

[17] TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data

Authors | Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel

Link | https://arxiv.org/abs/2005.08314

Note | To appear at ACL 2020

[18] Support-BERT: Predicting Quality of Question-Answer Pairs in MSDN using Deep Bidirectional Transformer

Authors | Bhaskar Sen, Nikhil Gopal, Xinwei Xue

Link | https://arxiv.org/abs/2005.08294

[19] LiSSS: A toy corpus of Literary Spanish Sentences Sentiment for Emotions Detection

Authors | Juan-Manuel Torres-Moreno, Luis-Gil Moreno-Jiménez

Link | https://arxiv.org/abs/2005.08223

[20] Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation

Authors | Won Ik Cho, Donghyun Kwak, Jiwon Yoon, Nam Soo Kim

Link | https://arxiv.org/abs/2005.08213

[21] Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles

Authors | Ben Eyal, Michael Elhadad

Link | https://arxiv.org/abs/2005.08206

Note | Accepted to LREC 2020

[22] How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?

Authors | Gantavya Bhatt, Hritik Bansal, Rishubh Singh, Sumeet Agarwal

Link | https://arxiv.org/abs/2005.08199

Note | To appear at ACL SRW 2020

[23] Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce

Authors | Juntao Li, Chang Liu, Jian Wang, Lidong Bing, Hongsong Li, Xiaozhong Liu, Dongyan Zhao, Rui Yan

Link | https://arxiv.org/abs/2005.08188

Note | AAAI 2020

[24] Multi-modal Automated Speech Scoring using Attention Fusion

Authors | Manraj Singh Grover, Yaman Kumar, Sumit Sarin, Payman Vafaee, Mika Hama, Rajiv Ratn Shah

Link | https://arxiv.org/abs/2005.08182

Note | Submitted to INTERSPEECH 2020

[25] IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Authors | Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti

Link | https://arxiv.org/abs/2005.08178

[26] Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages

Authors | Tyler A. Chang, Anna N. Rafferty

Link | https://arxiv.org/abs/2005.08177

Note | To appear at the 5th Workshop on Representation Learning for NLP

[27] Adversarial Training for Commonsense Inference

Authors | Lis Pereira, Xiaodong Liu, Fei Cheng, Masayuki Asahara, Ichiro Kobayashi

Link | https://arxiv.org/abs/2005.08156

Note | Accepted to the ACL 2020 RepL4NLP workshop

[28] Semi-Automating Knowledge Base Construction for Cancer Genetics

Authors | Somin Wadhwa, Kanhua Yin, Kevin S. Hughes, Byron C. Wallace

Link | https://arxiv.org/abs/2005.08146

Note | In proceedings of Automated Knowledge Base Construction (AKBC), 2020

[29] RPD: A Distance Function Between Word Embeddings

Authors | Xuhui Zhou, Zaixiang Zheng, Shujian Huang

Link | https://arxiv.org/abs/2005.08113

Note | ACL Student Research Workshop 2020

[30] Learning Probabilistic Sentence Representations from Paraphrases

Authors | Mingda Chen, Kevin Gimpel

Link | https://arxiv.org/abs/2005.08105

Note | RepL4NLP at ACL 2020, short paper

[31] Layer-Wise Cross-View Decoding for Sequence-to-Sequence Learning

Authors | Fenglin Liu, Xuancheng Ren, Guangxiang Zhao, Xu Sun

Link | https://arxiv.org/abs/2005.08081

Note | Achieves state-of-the-art BLEU scores on WMT14 EN-DE, EN-FR, and IWSLT DE-EN datasets

[32] Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

Authors | Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu

Link | https://arxiv.org/abs/2005.08056

[33] IntelliCode Compose: Code Generation Using Transformer

Authors | Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan

Link | https://arxiv.org/abs/2005.08025

[34] A Text Reassembling Approach to Natural Language Generation

Authors | Xiao Li, Kees van Deemter, Chenghua Lin

Link | https://arxiv.org/abs/2005.07988

[35] Unsupervised Embedding-based Detection of Lexical Semantic Changes

Authors | Ehsaneddin Asgari, Christoph Ringlstetter, Hinrich Schütze

Link | https://arxiv.org/abs/2005.07979

[36] Logical Inferences with Comparatives and Generalized Quantifiers

Authors | Izumi Haruta, Koji Mineshima, Daisuke Bekki

Link | https://arxiv.org/abs/2005.07954

Note | To appear in the Proceedings of the Association for Computational Linguistics: Student Research Workshop (ACL-SRW 2020)

[37] ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them

Authors | Dawid Jurkiewicz, Łukasz Borchmann, Izabela Kosmala, Filip Graliński

Link | https://arxiv.org/abs/2005.07934

[38] Sequential Sentence Matching Network for Multi-turn Response Selection in Retrieval-based Chatbots

Authors | Chao Xiong, Che Liu, Zijun Xu, Junfeng Jiang, Jieping Ye

Link | https://arxiv.org/abs/2005.07923

[39] Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection

Authors | Lei Zhong, Juan Cao, Qiang Sheng, Junbo Guo, Ziang Wang

Link | https://arxiv.org/abs/2005.07886

Note | To appear in ACL 2020 (long paper)

[40] MicroNet for Efficient Language Modeling

Authors | Zhongxia Yan, Hanrui Wang, Demi Guo, Song Han

Link | https://arxiv.org/abs/2005.07877

Note | Accepted by PMLR

[41] Neural Multi-Task Learning for Teacher Question Detection in Online Classrooms

Authors | Gale Yan Huang, Jiahao Chen, Haochen Liu, Weiping Fu, Wenbiao Ding, Jiliang Tang, Songfan Yang, Guoliang Li, Zitao Liu

Link | https://arxiv.org/abs/2005.07845

Note | The 21st International Conference on Artificial Intelligence in Education (AIED), 2020

[42] KEIS@JUST at SemEval-2020 Task 12: Identifying Multilingual Offensive Tweets Using Weighted Ensemble and Fine-Tuned BERT

Authors | Saja Khaled Tawalbeh, Mahmoud Hammad, Mohammad AL-Smadi

Link | https://arxiv.org/abs/2005.07820

Note | SemEval 2020 conference

[43] A Scientific Information Extraction Dataset for Nature Inspired Engineering

Authors | Ruben Kruiper, Julian F.V. Vincent, Jessica Chen-Burger, Marc P.Y. Desmulliez, Ioannis Konstas

Link | https://arxiv.org/abs/2005.07753

Note | Published in Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)

[44] In Layman's Terms: Semi-Open Relation Extraction from Scientific Texts

Authors | Ruben Kruiper, Julian F.V. Vincent, Jessica Chen-Burger, Marc P.Y. Desmulliez, Ioannis Konstas

Link | https://arxiv.org/abs/2005.07751

Note | To be published in ACL 2020 conference proceedings

[45] Uncovering Gender Bias in Media Coverage of Politicians with Machine Learning

Author | Susan Leavy

Link | https://arxiv.org/abs/2005.07734

Note | Digital Scholarship in Humanities Journal

[46] Critical Impact of Social Networks Infodemic on Defeating Coronavirus COVID-19 Pandemic: Twitter-Based Study and Research Directions

Authors | Azzam Mourad, Ali Srour, Haidar Harmanani, Cathia Jenainatiy, Mohamad Arafeh

Link | https://arxiv.org/abs/2005.08820

Note | 11 pages, 10 figures, Journal Article

[47] Machine learning on Big Data from Twitter to understand public reactions to COVID-19

Authors | Jia Xue, Junxiang Chen, Chen Chen, ChengDa Zheng, Tingshao Zhu

Link | https://arxiv.org/abs/2005.08817

[48] Conversational Search -- A Report from Dagstuhl Seminar 19461

Authors | Avishek Anand, Lawrence Cavedon, Matthias Hagen, Hideo Joho, Mark Sanderson, Benno Stein

Link | https://arxiv.org/abs/2005.08658

Note | Contains arXiv:2001.06910, arXiv:2001.02912

[49] Design Choices for X-vector Based Speaker Anonymization

Authors | Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi

Link | https://arxiv.org/abs/2005.08601

[50] Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation

Authors | Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Shang-Wen Li, Hung-yi Lee

Link | https://arxiv.org/abs/2005.08575

[51] Audio-visual Multi-channel Recognition of Overlapped Speech

Authors | Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng

Link | https://arxiv.org/abs/2005.08571

Note | Submitted to Interspeech 2020

[52] Robust Training of Vector Quantized Bottleneck Models

Authors | Adrian Łańcucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J.G.A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent

Link | https://arxiv.org/abs/2005.08520

Note | Published at IJCNN 2020

[53] Attention-based Transducer for Online Speech Recognition

Authors | Bin Wang, Yan Yin, Hui Lin

Link | https://arxiv.org/abs/2005.08497

Note | Submitted to Interspeech 2020

[54] An Effective End-to-End Modeling Approach for Mispronunciation Detection

Authors | Tien-Hong Lo, Shi-Yan Weng, Hsiu-Jui Chang, Berlin Chen

Link | https://arxiv.org/abs/2005.08440

Note | Submitted to Interspeech 2020

[55] The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge

Authors | Tien-Hong Lo, Fu-An Chao, Shi-Yan Weng, Berlin Chen

Link | https://arxiv.org/abs/2005.08433

Note | Submitted to Interspeech 2020 Special Session: Shared Task on Automatic Speech Recognition for Non-Native Children's Speech

[56] Content analysis of Persian/Farsi Tweets during COVID-19 pandemic in Iran using NLP

Authors | Pedram Hosseini, Poorya Hosseini, David A. Broniatowski

Link | https://arxiv.org/abs/2005.08400

[57] Vector-Quantized Autoregressive Predictive Coding

Authors | Yu-An Chung, Hao Tang, James Glass

Link | https://arxiv.org/abs/2005.08392

[58] Fixed Point Semantics for Stream Reasoning

Author | Christian Antić

Link | https://arxiv.org/abs/2005.08384

[59] Wake Word Detection with Alignment-Free Lattice-Free MMI

Authors | Yiming Wang, Hang Lv, Daniel Povey, Lei Xie, Sanjeev Khudanpur

Link | https://arxiv.org/abs/2005.08347

Note | Submitted to INTERSPEECH 2020

[60] A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer

Authors | Vladimir Iashin, Esa Rahtu

Link | https://arxiv.org/abs/2005.08271

Project page | https://v-iashin.github.io/bmt

[61] On the Combined Use of Extrinsic Semantic Resources for Medical Information Search

Authors | Mohammed Maree, Israa Noor, Khaled Rabayah, Mohammed Belkhatir, Saadat M. Alhashmi

Link | https://arxiv.org/abs/2005.08259

[62] Dual Learning: Theoretical Study and an Algorithmic Extension

Authors | Zhibing Zhao, Yingce Xia, Tao Qin, Lirong Xia, Tie-Yan Liu

Link | https://arxiv.org/abs/2005.08238

[63] #Coronavirus or #Chinesevirus?!: Understanding the negative sentiment reflected in Tweets with racist hashtags across the development of COVID-19

Authors | Xin Pei, Deval Mehta

Link | https://arxiv.org/abs/2005.08224

[64] That Sounds Familiar: an Analysis of Phonetic Representations Transfer Across Languages

Authors | Piotr Żelasko, Laureano Moro-Velázquez, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

Link | https://arxiv.org/abs/2005.08118

Note | Submitted to Interspeech 2020. For some reason, the ArXiv Latex engine rendered it in more than 4 pages

[65] Exploration of Audio Quality Assessment and Anomaly Localisation Using Attention Models

Authors | Qiang Huang, Thomas Hain

Link | https://arxiv.org/abs/2005.08053

Note | Submitted to Interspeech 2020

[66] Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory

Authors | Chunyang Wu, Yongqiang Wang, Yangyang Shi, Ching-Feng Yeh, Frank Zhang

Link | https://arxiv.org/abs/2005.08042

Note | Submitted to Interspeech 2020

[67] Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation

Authors | Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee

Link | https://arxiv.org/abs/2005.08024

Note | Submitted to Interspeech 2020

[68] AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition

Authors | Afroz Ahamad, Ankit Anand, Pranesh Bhargava

Link | https://arxiv.org/abs/2005.07973

Note | Proceedings of the 12th Language Resources and Evaluation Conference - LREC, 2020

[69] Reducing Spelling Inconsistencies in Code-Switching ASR using Contextualized CTC Loss

Authors | Burin Naowarat, Thananchai Kongthaworn, Korrawe Karunratanakul, Sheng Hui Wu, Ekapol Chuangsuwanich

Link | https://arxiv.org/abs/2005.07920

Note | 7 pages, 5 figures, submitted to INTERSPEECH 2020

[70] Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition

Authors | Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen

Link | https://arxiv.org/abs/2005.07903

[71] Oscillating Statistical Moments for Speech Polarity Detection

Authors | Thomas Drugman, Thierry Dutoit

Link | https://arxiv.org/abs/2005.07901

[72] Glottal Source Estimation using an Automatic Chirp Decomposition

Authors | Thomas Drugman, Baris Bozkurt, Thierry Dutoit

Link | https://arxiv.org/abs/2005.07897

[73] Large scale weakly and semi-supervised learning for low-resource video ASR

Authors | Kritika Singh, Vimal Manohar, Alex Xiao, Sergey Edunov, Ross Girshick, Vitaliy Liptchinsky, Christian Fuegen, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

Link | https://arxiv.org/abs/2005.07850

[74] Speaker Re-identification with Speaker Dependent Speech Enhancement

Authors | Yanpei Shi, Qiang Huang, Thomas Hain

Link | https://arxiv.org/abs/2005.07818

Note | Submitted to Interspeech 2020

[75] Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification

Authors | Yanpei Shi, Qiang Huang, Thomas Hain

Link | https://arxiv.org/abs/2005.07817

Note | Submitted to Interspeech 2020

[76] Feature Fusion Strategies for End-to-End Evaluation of Cognitive Behavior Therapy Sessions

Authors | Zhuohao Chen, Nikolaos Flemotomos, Victor Ardulov, Torrey A. Creed, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

Link | https://arxiv.org/abs/2005.07809