DBQR-QA: A Shared Task

WHAT YOU GET
400 questions, ten per conversation, with tables (pandas DataFrames) queried from our graph database.

YOUR MODEL GENERATES
Programs that answer each question, using our predefined Python functions or custom functions in any language.

WHAT WE EVALUATE
The answer your model generated (number, text, set, or table).

QUESTION 1

What is the ratio of combined revenues reported by companies in the retail industry with liabilities between 100M and 500M every year during 2017 and 2020 to those with liabilities between 300M and 1B during the same years?

QUESTION 4

If I removed the top three companies by combined deferred tax liabilities from the first group, what would be the ratio?
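
The two example questions can be sketched in pandas as follows. This is only an illustrative sketch, not the task's reference solution: the column names (`company`, `year`, `revenue`, `liabilities`, `deferred_tax_liabilities`), the helper functions, and the toy figures (in millions) are all invented here, and it does not use the task's predefined functions. It also assumes that "combined" means summed over 2017 to 2020 and that the liability bounds are inclusive.

```python
import pandas as pd

YEARS = [2017, 2018, 2019, 2020]

# Toy data invented for illustration (values in millions); the real tables
# are queried from the task's graph database.
# company: (liabilities, revenue, deferred_tax_liabilities), constant per year
profiles = {
    "A": (150, 10, 5),
    "B": (200, 20, 4),
    "C": (350, 30, 3),
    "D": (400, 40, 1),
    "E": (800, 50, 2),
    "F": (120, 60, 9),
}
df = pd.DataFrame(
    [
        {"company": c, "year": y, "liabilities": liab, "revenue": rev,
         "deferred_tax_liabilities": dtl}
        for c, (liab, rev, dtl) in profiles.items()
        for y in YEARS
    ]
)
# F leaves the 100M-500M band in 2020, so it fails the "every year" condition.
df.loc[(df["company"] == "F") & (df["year"] == 2020), "liabilities"] = 600


def companies_in_range(df, low, high):
    """Companies whose liabilities lie in [low, high] in every year of YEARS."""
    rows = df[df["year"].isin(YEARS) & df["liabilities"].between(low, high)]
    years_ok = rows.groupby("company")["year"].nunique()
    return set(years_ok[years_ok == len(YEARS)].index)


def combined(df, companies, column):
    """Per-company sum of `column` over YEARS for the given companies."""
    rows = df[df["company"].isin(companies) & df["year"].isin(YEARS)]
    return rows.groupby("company")[column].sum()


def revenue_ratio(df, first, second):
    """Combined revenue of the first group divided by that of the second."""
    return combined(df, first, "revenue").sum() / combined(df, second, "revenue").sum()


# Question 1: ratio between the two liability bands.
group_a = companies_in_range(df, 100, 500)
group_b = companies_in_range(df, 300, 1000)
ratio_q1 = revenue_ratio(df, group_a, group_b)

# Question 4: drop the top three companies by combined deferred tax
# liabilities from the first group, then recompute the ratio.
top3 = combined(df, group_a, "deferred_tax_liabilities").nlargest(3).index
ratio_q4 = revenue_ratio(df, group_a - set(top3), group_b)

print(ratio_q1, ratio_q4)
```

Note that Question 4 depends on the groups built for Question 1, which is why the sketch keeps `group_a` and `group_b` around instead of recomputing them from scratch.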

AUTOMATIC
Evaluate using our Python evaluation script offline (unlimited) or online (up to 100 times per day).

GPT-4O
Close-to-human evaluation accuracy. Use our prompt (unlimited) or evaluate online (twice a week).

MANUAL
Manual evaluation at the end of the submission period with the release of test answers.

100 × S: Simple query with specific companies
100 × C: Complex query with unspecified companies
50 × T: Reasoning steps requiring multiple tables
100 × H: Multiple-hop query
50 × I: Instruction QA

Examples and practice data are now available.
See the quick-start page for more information.

Stages

Practice: 50 questions
Training: 200 questions
Blind test: 150 questions

Workshop

@COLING 2025 in Abu Dhabi, UAE

19-20 January 2025

Important Dates

AoE time zone

Done	Practice set release	2 Sep 2024
Done	Training/test (blind) release	7 Nov 2024
Done	First GPT evaluation	13 Nov 2024
Done	Last GPT/human evaluation	23 Nov 2024
Done	Paper submission deadline	25 Nov 2024
Done	Test answers release	25 Nov 2024
Done	Notification of acceptance	5 Dec 2024
In progress	Camera-ready deadline	13 Dec 2024

Copyright © 2024-2025 R. Nararatwong et al. The images were generated by DALL·E 3.
This website has been designed using resources from Flaticon.com. See credit page for links to materials used.
This study is partially based on results obtained from project JPNP20006,
commissioned by the New Energy and Industrial Technology Development Organization (NEDO).
