The Competition on Visually Rich Document Intelligence and Understanding (VRD-IU)

The 2024 Competition on Visually Rich Document Intelligence and Understanding (VRD-IU) will be held in conjunction with the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024) in Jeju Island, Korea.

  • Competition date & time: 03.08.24 - 09.08.24 TBA
  • Competition physical location: Jeju, South Korea
  • Hybrid mode: TBA


The VRD-IU (Visually Rich Document Intelligence and Understanding) competition tackles the challenges posed by form-like documents, which are diverse and complex, often involve multiple stakeholders, and contain essential information that is hard to extract. The competition is based on the Form-NLU dataset, which covers digital, printed, and handwritten forms, and offers two tracks suited to different skill levels: extracting key information (Track A) and localising it within documents (Track B). It aims to accelerate progress in visually rich document understanding and to draw more interest to the field, giving participants an opportunity to contribute to more efficient information extraction and analysis methods.

Track A - Form Key Information Extraction

Participants must develop a deep learning-based retriever that extracts the target form component for a given key query. We provide the bounding box coordinates of the human-annotated semantic entities in each input form document; participants must locate the entity that matches the input query. The evaluation metric is the F1 score, following Form-NLU Task B.
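
As a rough illustration of how an F1 score can be computed for this kind of query-based retrieval, here is a minimal Python sketch. It is not the official evaluation script: the function name, the (document id, key) query identifiers, and the micro-averaging rule are all assumptions made for this example, and the official scoring may differ.

```python
from typing import Dict, Hashable, Optional, Tuple

# (document id, key name); hypothetical identifiers used only in this sketch
Query = Tuple[str, str]


def retrieval_f1(gold: Dict[Query, Optional[Hashable]],
                 pred: Dict[Query, Optional[Hashable]]) -> float:
    """Micro-averaged F1 over key queries (an assumed scoring rule).

    A prediction is correct only if it names the same entity as the gold
    annotation. None means "no entity" for that query.
    """
    tp = fp = fn = 0
    for query, gold_entity in gold.items():
        pred_entity = pred.get(query)
        if pred_entity is not None and pred_entity == gold_entity:
            tp += 1
        else:
            if pred_entity is not None:
                fp += 1  # returned a wrong or spurious entity
            if gold_entity is not None:
                fn += 1  # missed the annotated entity
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: entity ids are indices into the provided entity list.
gold = {("d1", "name"): 3, ("d1", "date"): 5,
        ("d2", "name"): 1, ("d2", "date"): None}
pred = {("d1", "name"): 3, ("d1", "date"): 4,
        ("d2", "name"): None, ("d2", "date"): None}
score = retrieval_f1(gold, pred)
```

In this toy case one query is answered correctly, one is answered with the wrong entity, and one is left unanswered, so precision is 1/2 and recall is 1/3.
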
Competition Link:

Track B - Form Key Information Localisation

Participants are encouraged to develop an end-to-end framework that predicts bounding box coordinates from the input document image for a given input key. For Track B, no ground-truth bounding boxes of form semantic entities are given; the only inputs are form images and key queries. The evaluation metric is the Mean Average Precision (MAP) of the predicted bounding boxes.
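
For orientation, the sketch below shows one common way to score predicted boxes: intersection over union (IoU) to decide whether a prediction matches the ground truth, then average precision over predictions ranked by confidence. This is an assumption-laden illustration, not the official scorer: the IoU threshold of 0.5, the (x_min, y_min, x_max, y_max) box format, the all-point interpolation, and all function names are choices made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def average_precision(gold: Dict[str, Box],
                      preds: List[Tuple[str, float, Box]],
                      iou_threshold: float = 0.5) -> float:
    """AP for one key: preds are (query id, confidence, box) triples."""
    if not gold:
        return 0.0
    matched = set()
    hits = []
    # Walk predictions from most to least confident; each gold box can be
    # matched at most once.
    for query_id, _score, box in sorted(preds, key=lambda p: p[1], reverse=True):
        ok = (query_id in gold and query_id not in matched
              and iou(box, gold[query_id]) >= iou_threshold)
        if ok:
            matched.add(query_id)
        hits.append(ok)
    precisions, recalls, tp = [], [], 0
    for rank, ok in enumerate(hits, start=1):
        tp += int(ok)
        precisions.append(tp / rank)
        recalls.append(tp / len(gold))
    # All-point interpolation: precision becomes non-increasing in recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_key_ap: Dict[str, float]) -> float:
    """Mean of per-key AP values (averaging over keys is an assumption)."""
    return sum(per_key_ap.values()) / len(per_key_ap) if per_key_ap else 0.0


gold = {"q1": (0, 0, 10, 10), "q2": (0, 0, 10, 10)}
preds = [("q1", 0.9, (0, 0, 10, 10)),
         ("q2", 0.8, (20, 20, 30, 30)),  # misses the gold box entirely
         ("q2", 0.7, (0, 0, 10, 10))]
ap = average_precision(gold, preds)
```

Whether the competition averages over keys, over IoU thresholds, or both is not stated here, so `mean_average_precision` should be read as one possible aggregation only.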

Competition Link:

Important Dates

  • Data, baseline paper & code available: 29 April, 2024
  • Track A Challenge Due: 17 July, 2024
  • Track B Challenge Due: 22 July, 2024
  • Announcement of Winners: 24 July, 2024
  • Paper Submission Due: 31 July, 2024
  • Competition: 05 August, 2024
  • Note: All deadlines are in Anywhere on Earth (AoE, UTC-12) time.

Organising Committee

  • Organising Committee:
    • Caren Han, The University of Melbourne
    • Yihao Ding, The University of Sydney
    • Yan Li, The University of Sydney
    • Luca Cagliero, Politecnico di Torino
    • Seong-Bae Park, Kyung Hee University
    • Prasenjit Mitra, The Pennsylvania State University
  • Advisory Committee:
    • Josiah Poon, The University of Sydney
    • EJ Holden, The University of Melbourne
  • Program Committee:
    • Haiqin Yang, International Digital Economy Academy, The Chinese University of Hong Kong, China
    • Paolo Garza, Politecnico di Torino, Italy
    • Honghan Wu, University College London, U.K.
    • Riza Batista-Navarro, University of Manchester, U.K.
    • Jean Lee, The University of Sydney, Australia
    • Siwen Luo, The University of Western Australia, Australia
    • Changyong Zhang, The University of Science and Technology of China, China
    • Roberto Navigli, Sapienza University of Rome, Italy
    • Lorenzo Vaiani, Politecnico di Torino, Italy
    • Davide Napolitano, Politecnico di Torino, Italy
    • HeeGuen Yoon, National Information Society Agency, Korea
    • So-Eon Kim, Kyung Hee University, Korea
    • Nianlong Gu, University of Zurich, Switzerland
    • Yingqiang Gao, University of Zurich and ETH Zurich, Switzerland
    • Ali Rasekh, L3S Research Center, Germany

For any queries, send an email to