Automate E-commerce Product Cataloging with AI

Learn how to build an automated product cataloging system that extracts product information, generates descriptions, and classifies images using ViscribeAI's Python SDK.

The Challenge: Manual Product Cataloging

E-commerce businesses face a constant challenge: efficiently cataloging thousands of products with accurate descriptions, specifications, and categories. Manual data entry is time-consuming, error-prone, and doesn't scale. What if you could automate 90% of this process using AI?

The Solution: ViscribeAI API

ViscribeAI provides multiple endpoints that work together to create a complete product cataloging solution:

Extract Structured Data: Pull product specs, prices, and features from images
Describe Image: Generate natural language product descriptions
Classify Image: Automatically categorize products
Visual Question Answering: Answer specific questions about product features

Implementation: Step-by-Step Guide

1. Installation and Setup

First, install the ViscribeAI Python SDK:

bash

pip install viscribe

Initialize the client with your API key:

python

from viscribe import Client

# Initialize client
client = Client(api_key="your-api-key-here")
# Or set VISCRIBE_API_KEY environment variable
# client = Client()

2. Extract Product Information

Use the extract_image endpoint to pull structured data from product images. This is perfect for extracting specifications from manufacturer photos:

python

# Define the fields you want to extract
product_info = client.extract_image(
    image_url="https://example.com/laptop.jpg",
    fields=[
        {
            "name": "product_name",
            "type": "text",
            "description": "The product name or title"
        },
        {
            "name": "price",
            "type": "number",
            "description": "Product price in USD"
        },
        {
            "name": "brand",
            "type": "text",
            "description": "Product brand or manufacturer"
        },
        {
            "name": "key_features",
            "type": "array_text",
            "description": "List of main product features"
        },
        {
            "name": "color",
            "type": "text",
            "description": "Primary color of the product"
        }
    ]
)

print(product_info.extracted_data)
# Output: {
#     "product_name": "Dell XPS 15 Laptop",
#     "price": 1299.99,
#     "brand": "Dell",
#     "key_features": ["15.6 inch display", "Intel i7", "16GB RAM", "512GB SSD"],
#     "color": "Silver"
# }
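
Extracted values come from a model reading an image, so it can help to sanity-check them before they reach your catalog. The following is a minimal sketch, assuming extracted_data is a plain dictionary shaped like the output above. The validate_product helper is hypothetical and not part of the SDK.

```python
def validate_product(data):
    """Return a list of problems found in an extracted product dict."""
    problems = []

    # Required text fields must be non-empty strings.
    for field in ("product_name", "brand"):
        value = data.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    # Price must be a positive number (bool is excluded on purpose,
    # because bool is a subclass of int in Python).
    price = data.get("price")
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price <= 0:
        problems.append("price must be a positive number")

    # Features, when present, should be a list of strings.
    features = data.get("key_features", [])
    if not isinstance(features, list) or not all(isinstance(f, str) for f in features):
        problems.append("key_features must be a list of strings")

    return problems


sample = {
    "product_name": "Dell XPS 15 Laptop",
    "price": 1299.99,
    "brand": "Dell",
    "key_features": ["15.6 inch display", "Intel i7", "16GB RAM", "512GB SSD"],
    "color": "Silver",
}

print(validate_product(sample))                               # []
print(validate_product({"product_name": "", "price": -5}))    # three problems
```

Entries that come back with a non-empty problem list can be held back for a human to check rather than published directly.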

3. Generate Product Descriptions

Create engaging product descriptions automatically using the describe_image endpoint:

python

# Generate description with tags
description = client.describe_image(
    image_url="https://example.com/laptop.jpg",
    generate_tags=True
)

print(description.description)
# "A sleek silver laptop with a minimalist design, featuring a thin bezel
# display and aluminum chassis. The device appears to be a premium ultrabook
# with a modern aesthetic suitable for professionals."

print(description.tags)
# ["laptop", "computer", "electronics", "silver", "professional", "ultrabook"]
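
Generated tags are often more useful for search and filtering after light cleanup. Here is a small sketch, assuming description.tags is a list of strings as in the output above. The normalize_tags helper and its max_tags limit are illustrative choices, not SDK features.

```python
def normalize_tags(tags, max_tags=10):
    """Lowercase, trim, and de-duplicate tags while preserving order."""
    seen = set()
    cleaned = []
    for tag in tags:
        # Lowercase and collapse any run of whitespace to a single space.
        key = " ".join(tag.lower().split())
        if key and key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned[:max_tags]


raw_tags = ["Laptop", "laptop ", "Electronics", "  silver", "Ultrabook", ""]
print(normalize_tags(raw_tags))
# ['laptop', 'electronics', 'silver', 'ultrabook']
```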

4. Automatic Product Classification

Categorize products into your taxonomy using the classify_image endpoint:

python

# Classify product into categories
categories = client.classify_image(
    image_url="https://example.com/laptop.jpg",
    classes=[
        "Electronics > Computers > Laptops",
        "Electronics > Tablets",
        "Electronics > Accessories",
        "Office Supplies",
        "Gaming Equipment"
    ]
)

print(categories.classification)
# "Electronics > Computers > Laptops"
print(categories.confidence)
# 0.97
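
The confidence score makes it possible to accept only confident predictions and send the rest to a person. The sketch below assumes confidence is a float between 0 and 1, as in the example output. The route_classification helper and the 0.85 cutoff are hypothetical and should be tuned against your own catalog.

```python
REVIEW_THRESHOLD = 0.85  # assumed cutoff, not an SDK default


def route_classification(label, confidence, threshold=REVIEW_THRESHOLD):
    """Accept confident predictions; flag the rest for manual review."""
    if confidence >= threshold:
        return {"category": label, "needs_review": False}
    # Keep the model's guess as a suggestion for the reviewer.
    return {"category": None, "suggested_category": label, "needs_review": True}


accepted = route_classification("Electronics > Computers > Laptops", 0.97)
flagged = route_classification("Electronics > Tablets", 0.55)
print(accepted)
print(flagged)
```

With the real client, the two arguments would be categories.classification and categories.confidence from the call above.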

5. Answer Specific Product Questions

Use Visual Question Answering to extract specific details:

python

# Ask specific questions about the product
questions = [
    "What ports are visible on this laptop?",
    "Does this laptop have a touchscreen?",
    "What is the screen size?"
]

for question in questions:
    answer = client.ask_image(
        image_url="https://example.com/laptop.jpg",
        question=question
    )
    print(f"Q: {question}")
    print(f"A: {answer.answer}\n")

# Output:
# Q: What ports are visible on this laptop?
# A: The laptop has USB-C ports, USB-A ports, and an HDMI port visible
#
# Q: Does this laptop have a touchscreen?
# A: No, this appears to be a standard non-touch display
#
# Q: What is the screen size?
# A: This appears to be a 15-inch display based on the laptop's dimensions
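
To store these answers alongside the rest of the catalog entry, one option is to map each question to an attribute name. The sketch below is an assumption-laden illustration: collect_attributes and QUESTION_MAP are hypothetical, and a stand-in function replaces the API call so the example runs offline.

```python
def collect_attributes(ask, image_url, question_map):
    """Ask one question per attribute and return {attribute: answer}.

    `ask` is any callable taking (image_url, question) and returning a string.
    """
    return {
        attribute: ask(image_url, question)
        for attribute, question in question_map.items()
    }


QUESTION_MAP = {
    "ports": "What ports are visible on this laptop?",
    "touchscreen": "Does this laptop have a touchscreen?",
    "screen_size": "What is the screen size?",
}


# Stand-in for the real API call so the sketch runs without network access.
def fake_ask(image_url, question):
    return f"(answer to: {question})"


attributes = collect_attributes(fake_ask, "https://example.com/laptop.jpg", QUESTION_MAP)
for name, answer in attributes.items():
    print(f"{name}: {answer}")
```

With the real client, ask could be a small wrapper such as lambda url, q: client.ask_image(image_url=url, question=q).answer.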

Complete Workflow Example

Here's a complete function that processes a product image and returns all catalog information:

python

from viscribe import Client
import json

def catalog_product(image_url, categories):
    """
    Complete product cataloging workflow
    """
    client = Client()

    # 1. Extract structured product data
    product_data = client.extract_image(
        image_url=image_url,
        fields=[
            {"name": "product_name", "type": "text", "description": "Product name"},
            {"name": "brand", "type": "text", "description": "Brand name"},
            {"name": "price", "type": "number", "description": "Price in USD"},
            {"name": "key_features", "type": "array_text", "description": "Main features"},
        ]
    )

    # 2. Generate description with tags
    description = client.describe_image(
        image_url=image_url,
        generate_tags=True
    )

    # 3. Classify into categories
    classification = client.classify_image(
        image_url=image_url,
        classes=categories
    )

    # 4. Combine all information
    catalog_entry = {
        **product_data.extracted_data,
        "description": description.description,
        "tags": description.tags,
        "category": classification.classification,
        "confidence": classification.confidence,
        "image_url": image_url
    }

    return catalog_entry

# Example usage
categories = [
    "Electronics > Computers > Laptops",
    "Electronics > Computers > Desktops",
    "Electronics > Tablets",
    "Electronics > Accessories"
]

result = catalog_product(
    "https://example.com/product.jpg",
    categories
)

print(json.dumps(result, indent=2))
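
Most store platforms import products from CSV, so a natural next step is to flatten catalog entries into rows. This is a minimal sketch using only the standard library. The entries_to_csv helper, the column choice, and the "; " separator for list values are assumptions to adapt to your platform's import format.

```python
import csv
import io


def entries_to_csv(entries, fieldnames):
    """Serialize catalog entries to CSV text; list values are joined with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",   # skip keys that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for entry in entries:
        row = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in entry.items()
        }
        writer.writerow(row)
    return buffer.getvalue()


entry = {
    "product_name": "Dell XPS 15 Laptop",
    "brand": "Dell",
    "price": 1299.99,
    "key_features": ["Intel i7", "16GB RAM"],
    "category": "Electronics > Computers > Laptops",
    "confidence": 0.97,
}
columns = ["product_name", "brand", "price", "key_features", "category"]
csv_text = entries_to_csv([entry], columns)
print(csv_text)
```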

Real-World Use Cases

1. Marketplace Sellers

Automatically catalog products from supplier images when importing inventory. This can cut onboarding time per product from hours to minutes.

2. Dropshipping Businesses

Extract product information from images on manufacturer websites and rewrite it into unique listings that avoid duplicate-content issues.

3. Inventory Management Systems

Process images of warehouse items to maintain accurate inventory records with proper categorization and descriptions.

4. Product Comparison Sites

Extract specifications from multiple product images to build comparison tables automatically.

Advanced: Handling Complex Products

For products with complex nested data, use the advanced schema feature with Pydantic models:

python

from pydantic import BaseModel
from typing import Optional

class Specification(BaseModel):
    processor: str
    ram: str
    storage: str
    display: str
    # Explicit default: in Pydantic v2, Optional[str] without one is a required field
    graphics: Optional[str] = None

class Product(BaseModel):
    name: str
    brand: str
    price: float
    specifications: Specification
    warranty_years: int

# Extract with advanced schema
result = client.extract_image(
    image_url="https://example.com/laptop-specs.jpg",
    advanced_schema=Product
)

# Result is now validated and typed
product = result.extracted_data
print(f"Processor: {product.specifications.processor}")
print(f"RAM: {product.specifications.ram}")

Performance Tips

Batch Processing: Use the async client to process multiple images concurrently
Caching: Store results so the same image is never processed twice
Error Handling: Retry failed requests instead of dropping the product
Rate Limiting: Respect API limits and back off exponentially when you hit them
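
The retry, backoff, and caching tips can be combined in a few lines. This is a rough sketch rather than a production implementation: with_retries and cached_call are hypothetical helpers, the cache is a plain in-memory dictionary, and catching every Exception is a simplification. In real code you would catch only the errors your client raises for transient failures.

```python
import time


def with_retries(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt, then try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # simplification: narrow this to transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


_cache = {}  # in-memory only; swap for a database or key-value store at scale


def cached_call(image_url, fetch):
    """Return the stored result for image_url, calling fetch(image_url) only once."""
    if image_url not in _cache:
        _cache[image_url] = with_retries(lambda: fetch(image_url))
    return _cache[image_url]


# Demonstration with a function that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
print(with_retries(flaky, sleep=delays.append))  # ok
print(delays)                                    # [0.5, 1.0]
```

With the real client, fetch could be a function that calls client.describe_image(image_url=url, generate_tags=True).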

Async Processing for Scale

When processing thousands of products, use the async client for better performance:

python

import asyncio
from viscribe import AsyncClient

async def process_products(image_urls):
    client = AsyncClient()
    tasks = []

    for url in image_urls:
        task = client.describe_image(image_url=url, generate_tags=True)
        tasks.append(task)

    # Process all images concurrently
    results = await asyncio.gather(*tasks)
    return results

# Process 100 products in parallel
image_urls = ["https://example.com/product1.jpg", ...]
results = asyncio.run(process_products(image_urls))
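
Launching every request at once can run into rate limits when the list grows to thousands of images. One common pattern is to cap the number of requests in flight with asyncio.Semaphore. The sketch below assumes nothing about the SDK beyond an awaitable call per image: gather_bounded is a hypothetical helper, and a stand-in worker replaces the API call so the example runs offline.

```python
import asyncio


async def gather_bounded(items, worker, max_concurrency=10):
    """Run worker(item) for every item with at most max_concurrency in flight.

    Results come back in input order. A failed item yields its exception
    object instead of cancelling the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(
        *(run_one(item) for item in items),
        return_exceptions=True,
    )


# Stand-in worker so the sketch runs without network access.
async def fake_describe(url):
    await asyncio.sleep(0.01)
    if url.endswith("bad.jpg"):
        raise ValueError(f"could not process {url}")
    return f"description of {url}"


urls = [
    "https://example.com/a.jpg",
    "https://example.com/bad.jpg",
    "https://example.com/b.jpg",
]
batch = asyncio.run(gather_bounded(urls, fake_describe, max_concurrency=2))
for url, outcome in zip(urls, batch):
    status = "failed" if isinstance(outcome, Exception) else "ok"
    print(url, status)
```

With the async client, worker could be lambda url: client.describe_image(image_url=url, generate_tags=True). Failed items can then be retried or logged without losing the successful results.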

Measuring Success

Companies using ViscribeAI for product cataloging typically see:

80-90% reduction in manual data entry time
95%+ accuracy in extracted data
Consistent quality across thousands of products
Faster time-to-market for new products

Get Started

Ready to automate your product cataloging? Sign up at dashboard.viscribe.ai and get free credits to test the API. Check out our full documentation for more examples and best practices.
