Have you ever encountered a product listing on an e-commerce platform where the image showed a short-sleeve shirt, but the description claimed it was long-sleeved? Such discrepancies are not just frustrating for customers—they're a significant challenge for e-commerce platforms striving to maintain accurate product catalogs at scale.
Meesho is sponsoring the data challenge that addresses this critical issue in the e-commerce industry. Participants will develop models to automatically predict key product attributes from images, revolutionizing how products are cataloged and listed online.
Task: Develop a robust machine learning model that can accurately predict various product attributes (such as color, pattern, and sleeve length) solely from product images uploaded by suppliers.
Why It Matters:
At Meesho, millions of products are listed by suppliers, who add the product details themselves. Our agents then verify and correct these details before the products are finally listed. The competition data is taken from these listings on the Meesho platform. For the purpose of this competition, the scope is limited to five categories: Sarees, Women Kurties, Men Tshirts, Women Tshirts, and Women Tops and Tunics. In each category, we have 10k products for training and 3k for testing. There will also be a hidden test set of 3k products.
Dataset would look like this.
Model Assessment: We will evaluate the attribute classification model using a product dataset encompassing the same categories as the training data. The attribute identification task is structured as a multi-class classification problem at the attribute level. It's important to note that in our framework, each attribute can only be assigned a single value, making this a standard multi-class problem rather than a multi-label one.
Metrics: To ensure a fair and comprehensive evaluation, we will employ both micro and macro F1-scores at the attribute level.
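As a rough illustration of these two metrics, the sketch below computes micro and macro F1 for each attribute separately, treating every attribute as a single-label multi-class problem. It is only a minimal sketch under stated assumptions, not the official scoring script: the function name `f1_scores`, the dictionary layout, and the sample attribute values are hypothetical, and how the per-attribute scores are combined into a final score is not shown here.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for one attribute.

    Assumes single-label multi-class data: each product has exactly one
    true value and one predicted value for the attribute.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    # Macro F1: unweighted mean of the per-class F1 scores.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)

    # Micro F1: pool the counts over all classes, then compute F1 once.
    total_tp = sum(tp.values())
    denom = 2 * total_tp + sum(fp.values()) + sum(fn.values())
    micro = 2 * total_tp / denom if denom else 0.0
    return micro, macro


# Hypothetical ground truth and predictions, keyed by attribute name.
truth = {
    "sleeve_length": ["short", "long", "short", "short"],
    "color": ["red", "blue", "red", "green"],
}
preds = {
    "sleeve_length": ["short", "short", "short", "long"],
    "color": ["red", "blue", "red", "green"],
}

for attribute in truth:
    micro, macro = f1_scores(truth[attribute], preds[attribute])
    print(f"{attribute}: micro-F1={micro:.3f}, macro-F1={macro:.3f}")
```

Because each product gets exactly one value per attribute, micro F1 here equals plain accuracy, while macro F1 gives rare attribute values the same weight as common ones.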
Scoring Process: This multi-tiered evaluation approach allows us to assess model performance at various levels of granularity, from individual attributes to overall accuracy across all product categories. It provides a balanced view of the model's effectiveness in handling diverse product attributes and categories.
Please ensure all materials are submitted by the specified deadlines. Late submissions will not be accepted.
Submissions must meet the following criteria to be considered valid. Please ensure your solution complies with these guidelines to avoid disqualification.
Dataset License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Please register for this data challenge using the link below. After you register, we will send out the dataset and the submission link.
Registration form: https://forms.gle/XQvU9uWfUP7EcLAw5
Website link: https://www.meesho.io/ai/data-challenge
The competition is sponsored by Meesho