Skip to content

VALORA AI is a Multimodal Pricing Prediction Model that uses textual and visual data to make precise predictions on product prices

Notifications You must be signed in to change notification settings

LynnFernandes23/Smart-Product-Pricing-Prediction-System

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Proposed Solution: VALORA AI is a Multimodal Pricing Prediction Model that uses textual and visual data to make precise predictions on product prices. Our method utilizes strong embeddings of product titles, descriptions, and images combined with engineered numeric features with high prediction precision but low computational cost. Our main innovation is a hybrid pipeline incorporating pre-trained embeddings and a LightGBM regressor to strike a balance between performance and speed.

Amazon ML Challenge 2025 Problem Statement

Problem Statement Smart Product Pricing Challenge

In e-commerce, determining the optimal price point for products is crucial for marketplace success and customer satisfaction. Your challenge is to develop an ML solution that analyzes product details and predict the price of the product. The relationship between product attributes and pricing is complex - with factors like brand, specifications, product quantity directly influence pricing. Your task is to build a model that can analyze these product details holistically and suggest an optimal price.

Data Description: The dataset consists of the following columns:

  1. sample_id: A unique identifier for the input sample
  2. catalog_content: Text field containing title, product description and an Item Pack Quantity(IPQ) concatenated.
  3. image_link: Public URL where the product image is available for download.
  4. Example link: https://m.media-amazon.com/images/I/71XfHPR36-L.jpg
  5. To download images, use the download_images function from src/utils.py. See sample code in src/test.ipynb.
  6. price: Price of the product (Target variable - only available in training data)

Dataset Details:

  1. Training Dataset: 75k products with complete product details and prices
  2. Test Set: 75k products for final evaluation

Output Format: The output file should be a CSV with 2 columns:

  1. Sample_id: The unique identifier of the data sample. Note the ID should match the test record sample_id.
  2. Price: A float value representing the predicted price of the product.

About

VALORA AI is a Multimodal Pricing Prediction Model that uses textual and visual data to make precise predictions on product prices

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages