Image generation

Storytell's Image Generation feature creates visual content from text descriptions. You write a natural language prompt, and Storytell translates it directly into an image, supporting a wide range of creative and functional uses within the Storytell platform.

Written By Mark Ku

Last updated 4 months ago

⚠️ Important: Image generation is a beta feature and is prone to errors such as misspellings in charts and graphs. Double-check generated images against the underlying data before using them.

Key capabilities and underlying models

Storytell integrates with three leading AI models for image generation:

1. DALL-E 3 (OpenAI)

Best for: General-purpose, creative, and artistic images

Key strengths:

Fast generation (produces images in seconds)
Versatile across many styles: photorealistic, illustrated, painted, abstract
Excels at vivid, dream-like, and surreal imagery
Great for marketing materials, advertisements, and social media
Handles complex, detailed prompts well
Ideal for business presentations and product visualization

Ideal use cases:

Creative and artistic content (illustrations, concept art)
Marketing and promotional materials
Quick iterations and exploring multiple creative directions
Product mockups and visualizations

2. Banana (Gemini 2.5 Flash Image)

Best for: Images with readable text and character consistency

Key strengths:

Ranked #1 on LMArena for image generation at the time of writing
High-fidelity text rendering (logos, signs, labels, diagrams)
Maintains character consistency across multiple images
Supports 10 different aspect ratios
Supports negative prompts to exclude unwanted elements

Ideal use cases:

Logos with text, book covers, and signage
Infographics and educational diagrams
Marketing materials combining imagery and typography
Brand assets requiring clear text
Character designs needing consistency across images

3. Imagen 3 (Google)

Best for: Photorealistic images

Key strengths:

Superior photorealism with lifelike details
Exceptional texture rendering (materials, fabrics, surfaces)
Precise prompt adherence (follows instructions literally)
Professional-quality realistic imagery

Ideal use cases:

Product photography
Realistic portraits of people and animals
Architectural visualization and interior design
Nature and landscape photography
Food photography
Fashion and clothing photography

Implementation details

When a user initiates an image generation request, Storytell translates the generic ImageGenerationRequest parameters into the API calls and configuration required by the chosen AI model (for example, OpenAI's DALL-E 3 or Google's Imagen or Gemini). The system then processes the model's output, which may involve fetching image data from a temporary URL or decoding Base64 data. Errors, such as a failure to generate an image or an image blocked by safety filters, are captured and reported to the user.
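
The flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not Storytell's actual implementation: only the name ImageGenerationRequest comes from this article. Its fields, the model identifiers, the payload and response shapes, and the helpers build_provider_payload, extract_image_bytes, fetch_url, and ImageGenerationError are all assumptions made for the example.

```python
import base64
from dataclasses import dataclass
from typing import Callable
from urllib.request import urlopen


@dataclass
class ImageGenerationRequest:
    # Field names are assumptions; the article names only the type.
    prompt: str
    model: str  # illustrative ids: "dall-e-3", "imagen-3", "gemini-2.5-flash-image"
    aspect_ratio: str = "1:1"


class ImageGenerationError(Exception):
    """Hypothetical error type for failures that are reported to the user."""


# DALL-E 3 takes one of three fixed pixel sizes, so each requested
# ratio is mapped to the nearest supported size.
DALLE3_SIZES = {"1:1": "1024x1024", "16:9": "1792x1024", "9:16": "1024x1792"}


def build_provider_payload(request: ImageGenerationRequest) -> dict:
    """Translate the generic request into a provider-shaped payload (shapes are illustrative)."""
    if request.model == "dall-e-3":
        if request.aspect_ratio not in DALLE3_SIZES:
            raise ImageGenerationError(f"unsupported aspect ratio: {request.aspect_ratio}")
        return {
            "provider": "openai",
            "model": request.model,
            "prompt": request.prompt,
            "size": DALLE3_SIZES[request.aspect_ratio],
        }
    if request.model in ("imagen-3", "gemini-2.5-flash-image"):
        return {
            "provider": "google",
            "model": request.model,
            "prompt": request.prompt,
            "aspect_ratio": request.aspect_ratio,
        }
    raise ImageGenerationError(f"unsupported model: {request.model}")


def fetch_url(url: str) -> bytes:
    """Download image bytes from a temporary URL."""
    with urlopen(url) as resp:
        return resp.read()


def extract_image_bytes(response: dict, fetch: Callable[[str], bytes] = fetch_url) -> bytes:
    """Turn a normalized provider response (hypothetical keys) into raw image bytes."""
    if response.get("blocked"):
        raise ImageGenerationError("image blocked by safety filters")
    if "b64_data" in response:
        return base64.b64decode(response["b64_data"])  # inline Base64 payload
    if "url" in response:
        return fetch(response["url"])  # temporary URL that must be downloaded
    raise ImageGenerationError("failed to generate image")
```

Passing the download function as a parameter keeps the output handling testable without network access. In this sketch, a blocked image and a missing image both raise the same error type, so the caller has a single place to turn failures into a user-facing message.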