Call for Data Challenge Proposals

This Call for Proposals invites individuals and industry or academic groups to submit proposals for the CODS-COMAD 2024 Data Challenge. The challenge aims to be a premier annual data science and data mining competition held as part of the annual CODS-COMAD conference. The CODS-COMAD Data Challenge is anticipated to last for three months. The winners will be announced before the CODS-COMAD 2024 conference and will receive their awards at the conference award ceremony. Winners will also get a chance to present their solutions and give a demo at the conference. Organizers of the accepted challenge will receive complimentary registration for the entire conference. The conference may also consider providing reasonable support for the infrastructure required to run the challenge.

Proposals should meet the following requirements: a novel and interesting set of problems, broad outreach to the data science and data mining community, a robust and fair competition setup, a task that is challenging yet manageable within the given time frame, and special consideration for diversity and accessibility in developing the solution.

Novel and interesting problems

Proposed problems should be novel, address real-world problems or requirements, and call for innovative solutions. They should be interesting to the data science community at large and to the Indian data science community in particular. The problem and the dataset should be relevant to the Indian scientific community and society, and should try to address problems that are particularly important in the Indian context. We are particularly looking for problems that differ from the typical machine learning and data science challenges proposed at competitive venues in recent years.

Dataset

The organizers should guarantee the availability of the data and the confidentiality of the test set. The evaluation should be statistically sound, so that submissions can be compared objectively, yet meaningful for the application at hand. If the dataset is owned by the organizers (or their university/organization), they must obtain the necessary permission to make it available for the challenge. If the dataset is publicly available, the licence terms must be examined to ensure that the dataset can be used for the competition. The conference can include a reasonable citation for the dataset on its challenge page, but no commercial branding.
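One common way to protect a confidential test set is to score submissions on a public portion during the competition and reserve a private portion for the final ranking. The sketch below illustrates that idea under stated assumptions: the function names, the secret `salt`, and the 30% public fraction are hypothetical choices for illustration, not requirements of this call.

```python
import hashlib


def leaderboard_partition(example_id, salt, public_fraction=0.3):
    """Assign one test example to the 'public' or 'private' partition.

    The assignment depends only on the example id and a secret salt held
    by the organizers, so it is reproducible but not guessable by
    participants. All names here are hypothetical.
    """
    digest = hashlib.sha256(f"{salt}:{example_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "public" if bucket < public_fraction else "private"


def split_test_ids(example_ids, salt, public_fraction=0.3):
    """Split test ids into (public, private) lists, preserving input order."""
    public, private = [], []
    for example_id in example_ids:
        if leaderboard_partition(example_id, salt, public_fraction) == "public":
            public.append(example_id)
        else:
            private.append(example_id)
    return public, private
```

Because the split is derived from a hash rather than from a stored random draw, organizers can regenerate it at any time from the ids and the salt, provided the salt itself is kept confidential.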

A challenging yet manageable task

The task should be challenging in the sense that there is enough room for improvement over the basic solutions, and novel ideas are required to succeed in the competition. The task should be manageable in about 2.5 months’ time.

Diversity and Accessibility

The dataset and solution should pay special attention to diversity and the inclusion of underrepresented minorities. Please describe your plan to enhance the diversity of participants and to encourage a diverse group of competitors to take part.

The description should make the competition accessible to most data science and data mining practitioners, including those who do not have significant prior knowledge of the specific domain or access to large or specialized computational infrastructure. The proposal should discuss how domain expertise can be factored in, or what simplifications have been made to reduce the need for it.

Proposal details

Proposals should cover all the important details such as dates, submission and evaluation of results, and describe the competition rules clearly. As a rule of thumb, prepare a proposal as close as possible to the version you would publish on the competition’s webpage.

Proposal Requirements

Please use the following template for your proposal submission:

Problem description

Describe the problem. Justify why it is important and novel. In particular, please elaborate on how your proposed problem differs from competitions held in recent years. Additionally, please discuss the broader impact of the problem. Please prepare some data samples or scenarios for your proposed problem. If you plan to include more than one track, please describe the unique value of each track.

Dataset

Describe the dataset: size, structure/schema, and example data items. State how it will be made available (direct download or access via an API, if any). Are there any privacy concerns for the released data? If the data is owned by your institution/organization, have you obtained the rights to release it for the competition?

Evaluation

Describe how you plan to evaluate the submissions. We encourage you to think about how the evaluation aligns with real-world impact. We are particularly interested in whether additional evaluation of the winning submissions can be conducted in the real world after the competition.
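An evaluation plan can also state how close results will be told apart. As one possible illustration, the sketch below uses a paired bootstrap over the hidden test set to estimate a confidence interval for the accuracy difference between two submissions. The function name, the per-example 0/1 correctness inputs, and the default settings are assumptions made for this example; a real competition would substitute its own metric.

```python
import random


def paired_bootstrap_diff(correct_a, correct_b, n_resamples=2000, alpha=0.05, seed=0):
    """Estimate the accuracy difference (A minus B) with a percentile interval.

    correct_a and correct_b are equal-length lists of 0/1 flags, one per
    test example, marking whether each submission got that example right.
    Returns (observed_difference, lower_bound, upper_bound).
    """
    if len(correct_a) != len(correct_b) or not correct_a:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping A and B paired.
        sample = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(correct_a[i] - correct_b[i] for i in sample) / n)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    observed = (sum(correct_a) - sum(correct_b)) / n
    return observed, lower, upper
```

If the resulting interval excludes zero, the gap between the two submissions is unlikely to be an artifact of which test examples happened to be sampled; if it includes zero, organizers may prefer to treat the two as tied or to apply a secondary criterion.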

Timeline

Specify the start of the competition (website setup, dataset release, leaderboard setup), the user registration deadline, the submission deadline, and the notification date. You may consider two rounds of submissions if suitable.

Number of Awards

We encourage you to think about the number of awards you would like to give away depending on the number of participants you expect to take part in the competition.

Implementation Details

Which competition infrastructure do you plan to use (e.g., Kaggle, or your own)? Is the platform you chose equally accessible to participants all over the world? Participants should also be informed that they retain all rights to the code and any other artifacts they develop during the competition. The conference retains the right to make public the presentations, demos, and any write-ups submitted as part of the competition.

Team work

Explain how the host will organize a team dedicated to the CODS-COMAD Data Challenge. For each team member, please include a list of their roles, responsibilities, and their commitment.

Solution

What type of report, presentation, and code will you require the winners to submit for the final winning solutions? How would you handle Q&A and possible revisions during the competition? To what extent have you explored this problem, and what is the baseline solution?

Publicity

How do you plan to promote the competition? What specific help do you need from the conference for publicity?

Host information

Names, affiliations, email addresses, phone numbers, and short biographies of the organizers.

Selection priority will be given to innovative datasets and problems, and proposals with a specific plan to promote a diversity of participants.

Please keep the proposal concise and strictly confidential. Submit your proposal in PDF format by the submission deadline using the CMT submission link below. Select the 'Data Challenge' track when submitting your proposal. We will contact you if any clarification is required. You will be notified by email by the notification date.

Submission Link: https://cmt3.research.microsoft.com/CODSCOMAD2024

Key Dates

June 30, 2024: Proposal submission deadline
July 15, 2024: Decision notification
September 23, 2024 (revised from September 1): Start of the competition
November 30, 2024: End of the competition
December 10, 2024: Notification to the CODS-COMAD Data Challenge winners
December 12, 2024: Formal announcement of the CODS-COMAD Data Challenge winners
January 2025: Presentation and demo by the winners and runners-up at the conference