Dialogue Evaluation

One of the main directions of the conference is the development and verification of methods for linguistic research and for the comparative assessment of Russian text analysis systems. The purpose of this work is to develop common principles of “evaluation”, that is, of obtaining evidence that results are effective and adequate. Such evidence can be obtained only through rigorous testing that follows well-developed methods.

A special direction of "Dialogue", called Dialogue Evaluation, involves annual comparative testing of computer analysis systems that solve particular practical problems. Testing results are discussed at the conference, and the reports of the organizers and participants are available below.

Dialogue Evaluation 2019

1. Automatic Gapping Resolution for Russian (AGRR)

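Gapping is a common type of ellipsis, illustrated by examples such as “Ей он рассказывает одно, а нам — совершенно другое” (“He tells her one thing, and us something completely different”), “Кто любит арбуз, а кто - свиной хрящик” (“Some like watermelon, and some pork cartilage”), “Дайте мне две пятерки, а я вам десятку” (“Give me two fives, and I will give you a ten”).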

The aim of this task is to tackle a non-trivial linguistic phenomenon, gapping, which occurs in coordinated structures and elides a repeated predicate, typically from the second clause. Besides the difficulty of the construction itself, the phenomenon is naturally rare, which results in a lack of training data. During the last two years gapping has received considerable attention (S. Schuster, M. Lamm, C. D. Manning 2017; K. Droganova, D. Zeman 2017; K. Droganova et al. 2018; S. Schuster, J. Nivre, C. D. Manning 2018; Nivre et al. 2018). Unfortunately, research so far has mostly relied on insufficient data, not exceeding several hundred sentences. This campaign is a pilot event: the first gapping resolution task held for Russian.

AGRR repo

Important dates

Registration closes Jan 25th 2019
Release of the Training Data Jan 26th 2019
Release of the Test Data Feb 20th 2019
Systems Submissions due 18:00 Feb 23rd 2019
Final results from organizers Mar 5th 2019


Ponomareva M., ABBYY, Moscow, Russia
Smurov I., ABBYY, Moscow, Russia
Shavrina T.O., NRU HSE, Sberbank, Moscow, Russia
Droganova K., Charles University, Prague, Czech Republic
Bogdanov A., ABBYY, Moscow, Russia

Ask your questions here: dialogueeval2019@gmail.com

2. Shared Task for Anaphora and Coreference Resolution for Russian (AnCorR)

High-quality coreference resolution plays an important role in many NLP applications. However, developing a coreference resolver for a new language requires extensive world knowledge as well as annotated resources, which are usually expensive to create.

The aim of the anaphora and coreference resolution components of an NLP system is to find all mentions in a text that refer to the same real-world entity. The first such evaluation for Russian was organized in 2014 (RU-EVAL 2014). Recent shared tasks on multilingual coreference include CORBON 2017 (where Russian was one of the languages covered) and CoNLL-2012.

The shared task is divided into a coreference resolution task and an anaphora resolution task. In the coreference resolution task, the training set has two layers. This makes it possible not only to train a system to determine whether two mentions are coreferential (coreference chains layer) but also to locate the boundaries of the mentions (mentions layer). In the coreference chains layer, each mention that belongs to a chain of length greater than one is described by a line in the following format: Mention ID→Mention Offset→Mention Length→Chain ID. In the mentions layer, each mention in a text is described by a line in the following format: Mention ID→Mention Offset→Mention Length. Mention IDs are sorted in order of appearance in the text. Mentions with equal IDs in both layers have equal offsets and lengths. In the anaphora resolution task, the training set consists of anaphoric pronouns and their antecedents.
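The two-layer format described above could be read along the following lines. This is only a minimal sketch, not the organizers' tooling: it assumes that the arrow in the format description stands for a tab separator, that all four fields are integers, and that offsets and lengths are counted in characters from the start of the text. The names Mention, parse_mentions, parse_chains and mention_text are hypothetical, and the toy text and annotation lines are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Mention(NamedTuple):
    mention_id: int
    offset: int   # assumed: character offset from the start of the text
    length: int   # assumed: length in characters


def parse_mentions(lines, sep="\t"):
    """Mentions layer: Mention ID, Mention Offset, Mention Length per line."""
    mentions = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mention_id, offset, length = (int(x) for x in line.split(sep))
        mentions[mention_id] = Mention(mention_id, offset, length)
    return mentions


def parse_chains(lines, sep="\t"):
    """Chains layer: Mention ID, Offset, Length, Chain ID per line."""
    chains = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        mention_id, offset, length, chain_id = (int(x) for x in line.split(sep))
        chains[chain_id].append(Mention(mention_id, offset, length))
    return dict(chains)


def mention_text(text, mention):
    """Cut the mention span out of the raw text."""
    return text[mention.offset:mention.offset + mention.length]


# Toy example (invented, not taken from the shared-task data).
text = "Anna said she was tired."
mentions_layer = ["1\t0\t4", "2\t10\t3"]
chains_layer = ["1\t0\t4\t1", "2\t10\t3\t1"]

mentions = parse_mentions(mentions_layer)
chains = parse_chains(chains_layer)

# Mentions with equal IDs must agree on offset and length in both layers.
for chain in chains.values():
    for m in chain:
        assert mentions[m.mention_id] == m

for chain_id, chain in chains.items():
    print(chain_id, [mention_text(text, m) for m in chain])
```

Grouping the chains layer by Chain ID gives one list of mentions per entity, while the mentions layer alone is enough to train or evaluate mention boundary detection. If the real files use a different separator or offset unit, only the sep argument and mention_text would need to change.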

Important dates

Registration closes Jan 25th 2019
Release of the Training Data Feb 8th 2019
Release of the Test Data Feb 22nd 2019
Systems Submissions due 12:00 Mar 4th 2019
Final results from organizers Mar 10th 2019


Toldova S. Ju, National Research University HSE, Moscow, Russia
Nedoluzhko A., Charles University, Prague, Czech Republic
Iomdin L. L., The Institute for Information Transmission Problems, RSUH, Moscow, Russia
Budnikov E., ABBYY, Moscow, Russia

Ask your questions here: dialogueeval2019@gmail.com

3. Morphological parsing for low-resource languages

When working with major languages, NLP researchers usually have access to large amounts of data, whether labelled or unlabelled. The ultimate objective of this project is to identify successful approaches to morphological analysis for low-resource languages. Participants may consider transfer learning approaches, using the training data for all languages covered by the shared task as well as extra data from major languages.

Participants are invited to work on morphological analysis and generation, and on morpheme segmentation.

The goals of the forum are:

  1. To facilitate and stimulate the development of corpora and linguistic tools for minor languages;
  2. To encourage better communication between field linguists and computational linguists;
  3. To find out how modern methods of morphological analysis, tagging, segmentation, and synthesis cope with sparse and highly variable training data.

Link to site

Training data provided Jan 24th 2019
Test data provided Feb 21st 2019
Teams shared their results Feb 24th 2019
Results published Mar 17th 2019
Article call for the Dialogue conference Mar 17th 2019
Results discussed at the Dialogue conference May 29th - June 1st 2019


Svetlana Toldova, National Research University HSE, Moscow, Russia
Elena Klyachko, Moscow, Russia
Karina Mishchenkova, Moscow, Russia
Olga Lyashevskaya, National Research University HSE, Moscow, Russia

Ask your questions here: dialogueeval2019@gmail.com

4. Automatic news headline generation

Within the framework of the "Dialogue" conference, systems for automatic news headline generation in Russian will be compared.

The competition has the following goals:
1) to stimulate the development of headline generation systems in particular and summarization in general for the Russian language;
2) to understand to what extent modern technologies can be successfully applied to Russian in particular and to morphologically rich languages in general.

Sample data:
еще несколько сожженных тел нашли в мексиканском штате герреро
мехико, 30 ноя риа новости, дмитрий знаменский. полиция мексиканского города чилапа в штате герреро в воскресенье обнаружила новые сожженные тела, сообщают местные власти. по информации правоохранительных органов, внутри сожженной автомашины находились пять тел. они принадлежат людям, похищенным в среду в местечке ла-хагуэй. таким образом, за неделю в чилапе были найдены 16 сожженных тел. в четверг на дороге рядом с городом были оставлены тела 11 человек, обезглавленных и сожженных. рядом была брошена записка, из которой следует, что эти люди стали жертвами выяснения отношений между двумя противоборствующими преступными группировками. штат герреро - один из наиболее опасных с точки зрения активности преступности в мексике. именно здесь в сентябре пропали 43 студента, которые, как выяснилось позднее, были похищены полицией и переданы в руки бандитов в игуале.

Competition page

Contact: community messages

To participate in the track, please fill out the form

Important dates
First newsletter and release of the training set Jan 31st 2019
Official start of the competition Feb 1st 2019
End of the competition Mar 1st 2019
Deadline for article submissions for track participants Mar 14th 2019
Publication of the track results Mar 15th 2019
Summary of results and participants’ presentations at the "Dialogue" conference May 29th - June 1st 2019

Valentin Malykh, VK
Pavel Kalaidin, VK
Ekaterina Artyomova, HSE, Sberbank
Ivan Smurov, ABBYY