CFP

WMS-2021: Workshop on Multilingual Search

Ljubljana, Slovenia, April 19-23, 2021

Conference website	https://multilingual-workshop.github.io/
Submission link	https://easychair.org/conferences/?conf=wms2021
Abstract registration deadline	March 7, 2021
Submission deadline	March 7, 2021
Notification deadline	March 26, 2021
Camera Ready due	April 9, 2021

Topics: machine translation, transfer learning, cross-lingual representations, query understanding

Search engines are the workhorses of the World Wide Web, returning billions of responses to billions of queries every day. With the explosion of information on the internet, every website, from social media to newsrooms to shopping portals, relies on a search engine to help users quickly and easily find information that is of interest, without the need to wade through numerous irrelevant web pages. With the advent of smart assistants like Alexa and Siri, search technology is no longer restricted to a written interface, and more and more users interact with their devices through voice and gestures. As users interact with the internet in their natural method of communication, it has become important for search engines to understand the different languages of their users. In a global setting, the challenges of multilingual data are also faced by the backend systems that support search engines, such as catalog systems, ad servers, cloud services, IoT devices and more.

English is a widely used language on the internet. Understandably then, a large proportion of the research in search technologies, such as language and query understanding systems, has taken place for the English language. These models can be adapted to different languages via transfer learning and domain adaptation. However, it is not scalable to relearn the model for each new language.

Ensuring that search works equally well in all languages poses several major challenges: How can we properly scale language and query understanding systems to languages that are significantly less widespread than English? Can we build universal query understanding models for all languages? How do we serve customers searching in languages with little or no annotated data? How can we leverage state-of-the-art deep learning research in multilingual language understanding? How can we improve the experience of users searching in a variety of languages? State-of-the-art NLP research has shown promising progress in building multilingual language understanding models with deep learning and massive amounts of data. We aim to bring together experts from across the globe to share their knowledge and experiences on how to leverage state-of-the-art science in NLP and deep learning, thus helping achieve an improved search experience in a multilingual setting.

Submission Guidelines

Authors are invited to submit papers of 4-8 pages in length. Papers should be submitted electronically in PDF format, using the ACM SIG Proceedings format, with a font size no smaller than 10pt. Submit papers through EasyChair. All submissions will be peer-reviewed in a single-blind process. All accepted papers will be presented at the workshop. In addition, accepted papers will be published in the companion proceedings of the WWW conference and the ACM Digital Library, unless the authors choose to opt out of publishing their papers. We encourage both academic and industry submissions.

All papers must be original and not simultaneously submitted to another journal or conference. This workshop will cover the challenges in providing a seamless search experience in a multilingual setting. We welcome contributions dealing with all aspects of multilingual search, including but not limited to:

Cross-lingual representations
Multilingual query understanding engines
Transfer learning, domain adaptation and label propagation techniques
Applications to multilingual web search, e-commerce search and social networks
Construction of cross-lingual knowledge bases
Backend systems such as catalogs of shopping portals, or storage and indexing of, say, new articles in different languages
Advances in Machine Translation
Challenges for IoT devices that interact with users in multiple languages
Tackling lack of behavioral data for non-dominant language queries
Matching, ranking and query understanding for cold-start and multiple languages
Zero-shot and few-shot learning
Learning from monolingual datasets
Role of uncertainty in learning multilingual embeddings
... and related areas

Committees

Organizing committee

Ashutosh Joshi, Amazon
Shailendra Agarwal, Amazon
Atul Saroop, Amazon
Vaclav Petricek, Amazon
Rahul Bhagat, Amazon

Contact

multilingual.workshop@gmail.com