The Competition on Visually Rich Document Intelligence and Understanding (VRD-IU)

The 2024 Competition on Visually Rich Document Intelligence and Understanding (VRD-IU) will be held in conjunction with the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024) in Jeju Island, Korea.

Competition date & time: 3-9 August 2024 (exact time TBA)
Competition physical location: Jeju, South Korea
Hybrid mode: TBA

Overview

The VRD-IU (Visually Rich Document Intelligence and Understanding) competition addresses the challenges posed by form-like documents, which are diverse and complex, often involve multiple stakeholders, and contain essential information that is hard to extract. The competition is based on the Form-NLU dataset, which features digital, printed, and handwritten forms, and offers two tracks for participants of different skill levels: extracting key information (Track A) and localising it within documents (Track B). Beyond advancing visually rich document understanding technology, the competition aims to draw wider interest to the field and gives participants an opportunity to contribute to more efficient information extraction and analysis methods.

Track A - Form Key Information Extraction

Participants must develop a deep learning-based retriever that extracts the target form component for a given key query. We provide human-annotated bounding box coordinates of the semantic entities in each input form document; participants are required to locate the entity that matches the input query. The evaluation metric is the F1 score, following Form-NLU Task B.
Competition Link: https://www.kaggle.com/competitions/vrd-iu2024-tracka
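
To make the Track A metric concrete, here is a minimal sketch of a micro-averaged F1 score over key queries. It is an illustration only, not the official scorer: the query identifiers (form_id, key), the entity ids, and the choice of micro-averaging are all assumptions, and the Kaggle evaluation may define F1 differently (for example, per key).

```python
from typing import Dict, Hashable, Set, Tuple

# Hypothetical query identifier: (form_id, key). Not part of the official data format.
Query = Tuple[str, str]


def micro_f1(gold: Dict[Query, Set[Hashable]],
             pred: Dict[Query, Set[Hashable]]) -> float:
    """Micro-averaged F1 over entity ids selected for each key query.

    Assumption: each query maps to a set of annotated entity ids.
    Predictions for queries that are absent from `gold` are ignored.
    """
    tp = fp = fn = 0
    for query, gold_ids in gold.items():
        pred_ids = pred.get(query, set())
        tp += len(gold_ids & pred_ids)   # correctly retrieved entities
        fp += len(pred_ids - gold_ids)   # retrieved but wrong
        fn += len(gold_ids - pred_ids)   # annotated but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data with made-up form ids, keys, and entity ids.
    gold = {("form_1", "company_name"): {3}, ("form_1", "date"): {7}}
    pred = {("form_1", "company_name"): {3}, ("form_1", "date"): {8}}
    print(micro_f1(gold, pred))
```

In the toy data, one of the two queries is answered correctly, so precision and recall are both 0.5.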

Track B - Form Key Information Localisation

Participants are encouraged to develop an end-to-end framework that predicts bounding box coordinates from the input document image based on the input key. For Track B, no ground-truth bounding boxes of form semantic entities are given; the inputs are strictly form images and key queries. The evaluation metric is the Mean Average Precision (MAP) of the predicted bounding boxes.

Competition Link: https://www.kaggle.com/competitions/vrd-iu2024-trackb
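
As a rough illustration of how a bounding-box MAP can be computed for Track B, the sketch below scores predictions with intersection over union (IoU) and a non-interpolated average precision per key. Several things here are assumptions rather than competition rules: the (x_min, y_min, x_max, y_max) box format, one ground-truth box per form and key, a confidence score per prediction, and the 0.5 IoU threshold. The official Kaggle scorer may use different thresholds or interpolation.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]
# Assumed prediction format: (form_id, confidence score, predicted box).
Prediction = Tuple[str, float, Box]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(gold: Dict[str, Box], preds: List[Prediction],
                      thr: float = 0.5) -> float:
    """Non-interpolated AP for one key: gold maps form_id -> ground-truth box."""
    if not gold:
        return 0.0
    matched = set()
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(preds, key=lambda p: p[1], reverse=True)
    for rank, (form_id, _score, box) in enumerate(ranked, start=1):
        target = gold.get(form_id)
        if target is not None and form_id not in matched and iou(target, box) >= thr:
            matched.add(form_id)
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this recall step
    return precision_sum / len(gold)


def mean_average_precision(gold_by_key: Dict[str, Dict[str, Box]],
                           preds_by_key: Dict[str, List[Prediction]],
                           thr: float = 0.5) -> float:
    """Mean of the per-key AP values."""
    aps = [average_precision(gold, preds_by_key.get(key, []), thr)
           for key, gold in gold_by_key.items()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Toy data with made-up form ids and a made-up key name.
    gold = {"form_1": (0, 0, 10, 10), "form_2": (0, 0, 10, 10)}
    preds = [("form_1", 0.9, (0, 0, 10, 10)), ("form_2", 0.8, (50, 50, 60, 60))]
    print(mean_average_precision({"company_name": gold}, {"company_name": preds}))
```

A predicted box counts as correct only if it overlaps the ground-truth box for the same form and key by at least the threshold; raising the threshold makes the metric stricter about localisation quality.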

Important Dates

Data, baseline paper & code available: 29 April, 2024
Track A Challenge Due: 17 July, 2024
Track B Challenge Due: 22 July, 2024
Announcement of Winners: 24 July, 2024
Paper Submission Due: 31 July, 2024
Competition: 05 August, 2024
Note: All deadlines are in Anywhere on Earth (AoE, UTC-12) time.

Organising Committee

Organising Committee:
- Caren Han, The University of Melbourne
- Yihao Ding, The University of Sydney
- Yan Li, The University of Sydney
- Luca Cagliero, Politecnico di Torino
- Seong-Bae Park, Kyung Hee University
- Prasenjit Mitra, The Pennsylvania State University
Advisory Committee:
- Josiah Poon, The University of Sydney
- EJ Holden, The University of Melbourne
Program Committee:
- Haiqin Yang, International Digital Economy Academy, The Chinese University of Hong Kong, China
- Paolo Garza, Politecnico di Torino, Italy
- Honghan Wu, University College London, U.K.
- Riza Batista-Navarro, University of Manchester, U.K.
- Jean Lee, The University of Sydney, Australia
- Siwen Luo, The University of Western Australia, Australia
- Changyong Zhang, The University of Science and Technology of China, China
- Roberto Navigli, Sapienza University of Rome, Italy
- Lorenzo Vaiani, Politecnico di Torino, Italy
- Davide Napolitano, Politecnico di Torino, Italy
- HeeGuen Yoon, National Information Society Agency, Korea
- So-Eon Kim, Kyung Hee University, Korea
- Nianlong Gu, University of Zurich, Switzerland
- Yingqiang Gao, University of Zurich and ETH Zurich, Switzerland
- Ali Rasekh, L3S Research Center, Germany

For any queries, send an email to caren.han@unimelb.edu.au