Google at EMNLP 2023

Google is proud to be a Diamond Sponsor of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), a premier annual NLP conference being held this week in Sentosa, Singapore. Google has a strong presence at this year’s conference, with more than 65 accepted papers and active involvement in 11 workshops and tutorials. Google is also happy to be a Major Sponsor of the Widening NLP workshop (WiNLP), which aims to highlight global representations of people, perspectives, and cultures in AI and ML. We look forward to sharing some of our extensive NLP research and expanding our partnership with the broader research community.

We hope you’ll visit the Google booth to chat with researchers who are actively pursuing the latest innovations in NLP, and check out some of the scheduled booth activities, such as demos and Q&A sessions. Visit the @GoogleAI X (Twitter) and LinkedIn accounts to find out more about the Google booth activities at EMNLP 2023.

Take a look below to learn more about the Google research being presented at EMNLP 2023.

The booth schedule is subject to change; please visit the Google booth for the latest information.

Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs
Jiefeng Chen*, Jinsung Yoon, Sayna Ebrahimi, Sercan O Arik, Tomas Pfister, Somesh Jha

A Comprehensive Evaluation of Tool-Assisted Generation Strategies
Alon Jacovi*, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva

1-PAGER: One Pass Answer Generation and Evidence Retrieval
Palak Jain, Livio Baldini Soares, Tom Kwiatkowski

MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo, Linting Xue, Michal Yarom, Ashish V. Thapliyal, Idan Szpektor, Julien Amelot, Xi Chen, Radu Soricut

SDOH-NLI: A Dataset for Inferring Social Determinants of Health from Clinical Notes
Adam D. Lelkes, Eric Loreaux*, Tal Schuster, Ming-Jun Chen, Alvin Rajkomar

Machine Reading Comprehension Using Case-based Reasoning
Dung Ngoc Thai, Dhruv Agarwal, Mudit Chaudhary, Wenlong Zhao, Rajarshi Das, Jay-Yoon Lee, Hannaneh Hajishirzi, Manzil Zaheer, Andrew McCallum

Cross-lingual Open-Retrieval Question Answering for African Languages
Odunayo Ogundepo, Tajuddeen Gwadabe, Clara E. Rivera, Jonathan H. Clark, Sebastian Ruder, David Ifeoluwa Adelani, Bonaventure F. P. Dossou, Abdou Aziz DIOP, Claytone Sikasote, Gilles HACHEME, Happy Buzaaba, Ignatius Ezeani, Rooweither Mabuya, Salomey Osei, Chris Chinenye Emezue, Albert Kahira, Shamsuddeen Hassan Muhammad, Akintunde Oladipo, Abraham Toluwase Owodunni, Atnafu Lambebo Tonja, Iyanuoluwa Shode, Akari Asai, Anuoluwapo Aremu, Ayodele Awokoya, Bernard Opoku, Chiamaka Ijeoma Chukwuneke, Christine Mwase, Clemencia Siro, Stephen Arthur, Tunde Oluwaseyi Ajayi, Verrah Akinyi Otiende, Andre Niyongabo Rubungo, Boyd Sinkala, Daniel Ajisafe, Emeka Felix Onwuegbuzia, Falalu Ibrahim Lawan, Ibrahim Said Ahmad, Jesujoba Oluwadara Alabi, CHINEDU EMMANUEL MBONU, Mofetoluwa Adeyemi, Mofya Phiri, Orevaoghene Ahia, Ruqayya Nasir Iro, Sonia Adhiambo

On Uncertainty Calibration and Selective Generation in Probabilistic Neural Summarization: A Benchmark Study
Polina Zablotskaia, Du Phan, Joshua Maynez, Shashi Narayan, Jie Ren, Jeremiah Zhe Liu

Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation
Markus Freitag, Behrooz Ghorbani*, Patrick Fernandes*

Sources of Hallucination by Large Language Models on Inference Tasks
Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman

Don’t Add, Don’t Miss: Effective Content Preserving Generation from Pre-selected Text Spans
Aviv Slobodkin, Avi Caciularu, Eran Hirsch, Ido Dagan

What Makes Chain-of-Thought Prompting Effective? A Counterfactual Study
Aman Madaan*, Katherine Hermann, Amir Yazdanbakhsh

Understanding HTML with Large Language Models
Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise
Kundan Krishna*, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu

In-Context Learning Creates Task Vectors
Roee Hendel, Mor Geva, Amir Globerson

Pre-training Without Attention
Junxiong Wang, Jing Nathan Yan, Albert Gu, Alexander M Rush

MUX-PLMs: Data Multiplexing for High-Throughput Language Models
Vishvak Murahari, Ameet Deshpande, Carlos E Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik R Narasimhan

PaRaDe: Passage Ranking Using Demonstrations with LLMs
Andrew Drozdov*, Honglei Zhuang, Zhuyun Dai, Zhen Qin, Razieh Rahimi, Xuanhui Wang, Dana Alon, Mohit Iyyer, Andrew McCallum, Donald Metzler*, Kai Hui

Long-Form Speech Translation Through Segmentation with Finite-State Decoding Constraints on Large Language Models
Arya D. McCarthy, Hao Zhang, Shankar Kumar, Felix Stahlberg, Ke Wu

Unsupervised Opinion Summarization Using Approximate Geodesics
Somnath Basu Roy Chowdhury*, Nicholas Monath, Kumar Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data
Ruoxi Sun, Sercan O. Arik, Rajarishi Sinha, Hootan Nakhost, Hanjun Dai, Pengcheng Yin, Tomas Pfister

Retrieval-Augmented Parsing for Complex Graphs by Exploiting Structure and Uncertainty
Zi Lin, Quan Yuan, Panupong Pasupat, Jeremiah Zhe Liu, Jingbo Shang

A Zero-Shot Language Agent for Computer Control with Structured Reflection
Tao Li, Gang Li, Zhiwei Deng, Bryan Wang*, Yang Li

Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried, Nicholas Tomlin, Jennifer Hu, Roma Patel, Aida Nematzadeh

Improving Classifier Robustness Through Active Generation of Pairwise Counterfactuals
Ananth Balashankar, Xuezhi Wang, Yao Qin, Ben Packer, Nithum Thain, Jilin Chen, Ed H. Chi, Alex Beutel

mmT5: Modular Multilingual Pre-training Solves Source Language Hallucinations
Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder

Scaling Laws vs Model Architectures: How Does Inductive Bias Influence Scaling?
Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, William Fedus, Jinfeng Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler

TaTA: A Multilingual Table-to-Text Dataset for African Languages
Sebastian Gehrmann, Sebastian Ruder, Vitaly Nikolaev, Jan A. Botha, Michael Chavinda, Ankur P Parikh, Clara E. Rivera

XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean Michel Amath Sarr, Xinyi Wang, John Frederick Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David Ifeoluwa Adelani, Vera Axelrod, Isaac Rayburn Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar

q2d: Turning Questions into Dialogs to Teach Models How to Search
Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb

Emergence of Abstract State Representations in Embodied Sequence Modeling
Tian Yun*, Zilai Zeng, Kunal Handa, Ashish V Thapliyal, Bo Pang, Ellie Pavlick, Chen Sun

Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Benjamin Muller*, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang

Weakly-Supervised Learning of Visual Relations in Multimodal Pre-training
Emanuele Bugliarello, Aida Nematzadeh, Lisa Anne Hendricks

How Do Languages Influence Each Other? Studying Cross-Lingual Data Sharing During LM Fine-Tuning
Rochelle Choenni, Dan Garrette, Ekaterina Shutova

CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić

IC3: Image Captioning by Committee Consensus
David Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A Ross, John Canny

The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models
Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, Shauli Ravfogel

Evaluating Large Language Models on Controlled Generation Tasks
Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Wieting, Nanyun Peng, Xuezhe Ma

Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Daniel Deutsch, George Foster, Markus Freitag

Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay*, Jason Wei*, Hyung Won Chung*, Vinh Q. Tran, David R. So*, Siamak Shakeri, Xavier Garcia, Huaixiu Steven Zheng, Jinfeng Rao, Aakanksha Chowdhery, Denny Zhou, Donald Metzler, Slav Petrov, Neil Houlsby, Quoc V. Le, Mostafa Dehghani

Data Similarity is Not Enough to Explain Language Model Performance
Gregory Yauney*, Emily Reif, David Mimno

Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar*, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, Partha Talukdar

ReTAG: Reasoning Aware Table to Analytic Text Generation
Deepanway Ghosal, Preksha Nema, Aravindan Raghuveer

GATITOS: Using a New Multilingual Lexicon for Low-Resource Machine Translation
Alex Jones*, Isaac Caswell, Ishank Saxena

Video-Helpful Multimodal Machine Translation
Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li

Symbol Tuning Improves In-Context Learning in Language Models
Jerry Wei*, Le Hou, Andrew Kyle Lampinen, Xiangning Chen*, Da Huang, Yi Tay*, Xinyun Chen, Yifeng Lu, Denny Zhou, Tengyu Ma*, Quoc V Le

“Don’t Take This Out of Context!” On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola, Xuhui Zhou, Elizabeth Clark, Maarten Sap

QAmeleon: Multilingual QA with Only 5 Examples
Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
Eugene Kharitonov, Damien Vincent, Zalán Borsos, Raphaël Marinier, Sertan Girgin, Olivier Pietquin, Matt Sharifi, Marco Tagliasacchi, Neil Zeghidour

AnyTOD: A Programmable Task-Oriented Dialog System
Jeffrey Zhao, Yuan Cao, Raghav Gupta, Harrison Lee, Abhinav Rastogi, Mingqiu Wang, Hagen Soltau, Izhak Shafran, Yonghui Wu

Selectively Answering Ambiguous Questions
Jeremy R. Cole, Michael JQ Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein

PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs (see blog post)
Rahul Goel, Waleed Ammar, Aditya Gupta, Siddharth Vashishtha, Motoki Sano, Faiz Surani*, Max Chang, HyunJeong Choe, David Greene, Chuan He, Rattima Nitisaroj, Anna Trukhina, Shachi Paul, Pararth Shah, Rushin Shah, Zhou Yu

LM vs LM: Detecting Factual Errors via Cross Examination
Roi Cohen, May Hamri, Mor Geva, Amir Globerson

A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding
Andrea Burns*, Krishna Srinivasan, Joshua Ainslie, Geoff Brown, Bryan A. Plummer, Kate Saenko, Jianmo Ni, Mandy Guo

AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Said Ahmad, Meriem Beloucif, Saif M. Mohammad, Sebastian Ruder, Oumaima Hourrane, Alipio Jorge, Pavel Brazdil, Felermino D. M. A. Ali, Davis David, Salomey Osei, Bello Shehu-Bello, Falalu Ibrahim Lawan, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Destaw Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Stephen Arthur

Optimizing Retrieval-Augmented Reader Models via Token Elimination
Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat

SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark, Shruti Rijhwani, Sebastian Gehrmann, Joshua Maynez, Roee Aharoni, Vitaly Nikolaev, Thibault Sellam, Aditya Siddhant, Dipanjan Das, Ankur P Parikh

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Joshua Ainslie, James Lee-Thorp, Michiel de Jong*, Yury Zemlyanskiy, Federico Lebron, Sumit Sanghai

CoLT5: Faster Long-Range Transformers with Conditional Computation
Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontanon, Siddhartha Brahma, Yury Zemlyanskiy, David Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai

Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting
Preethi Lahoti, Nicholas Blumm, Xiao Ma, Raghavendra Kotikalapudi, Sahitya Potluri, Qijun Tan, Hansa Srinivasan, Ben Packer, Ahmad Beirami, Alex Beutel, Jilin Chen

Universal Self-Adaptive Prompting (see blog post)
Xingchen Wan*, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister

TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor

Hierarchical Pre-training on Multimodal Electronic Health Records
Xiaochen Wang, Junyu Luo, Jiaqi Wang, Ziyi Yin, Suhan Cui, Yuan Zhong, Yaqing Wang, Fenglong Ma

NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders
Livio Baldini Soares, Daniel Gillick, Jeremy R. Cole, Tom Kwiatkowski

How Does Generative Retrieval Scale to Millions of Passages?
Ronak Pradeep*, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran

Make Every Example Count: On the Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets
Irina Bejan*, Artem Sokolov, Katja Filippova