Role overview
We’re building Carousely.ai, an AI platform that generates multi-slide Instagram carousels: visual storytelling posts tailored for different niches.
Current Context
Our current generation pipeline uses GPT-image-1, which delivers high-quality visuals but costs around $0.068 per image, making it expensive to scale.
Our goal is to achieve a 10× cost reduction (not 2–3×, but 10×) while maintaining 90–95% of the same visual quality.
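To make the target concrete, here is a small sketch of the arithmetic behind these figures. The GPU price in the usage line is a purely hypothetical number chosen for illustration, and the helper name `max_seconds_per_image` is ours; the calculation also ignores idle time, cold starts, and storage costs.

```python
# Cost target implied by the figures above.
CURRENT_COST_PER_IMAGE = 0.068  # USD, the GPT-image-1 figure quoted above
REDUCTION_FACTOR = 10           # required reduction

target_cost_per_image = CURRENT_COST_PER_IMAGE / REDUCTION_FACTOR
print(f"Target cost per image: ${target_cost_per_image:.4f}")  # prints $0.0068


def max_seconds_per_image(gpu_usd_per_hour: float, target_usd_per_image: float) -> float:
    """Longest time one image may occupy a GPU while still meeting the cost target.

    Assumes the GPU is billed by the hour and is fully utilized.
    """
    return 3600.0 * target_usd_per_image / gpu_usd_per_hour


# Hypothetical GPU price of $1.00/hour, used only as an example input.
budget_seconds = max_seconds_per_image(1.00, target_cost_per_image)
print(f"Time budget per image at $1.00/hour: {budget_seconds:.2f} s")
```

In other words, any self-hosted setup has to fit each image inside a time budget that shrinks in proportion to the hourly GPU price.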
We’re looking for an AI engineer experienced with open-source diffusion models to help implement a production-ready pipeline that meets these goals.
⚙️ Current Generation Pipeline
1. Input Stage
The user enters their niche (e.g., “fitness coach”) and optionally uploads a selfie.
2. Template Selection
The backend randomly selects 5 carousel templates from our internal library.
Each template includes multiple slides and text blocks.
3. Text Stage (GPT-4)
GPT-4 receives a system prompt, the niche, and all original slide texts.
It returns a structured JSON output where each slide includes:
- Rewritten, niche-specific text
- A visual edit instruction (prompt) for the next step
4. Image Stage (GPT-image-1)
For each slide, we call GPT-image-1 with:
- The original slide image
- The generated edit prompt
- The selfie (if provided)
This ensures visual consistency and niche relevance, but the cost is too high for large-scale production.
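The hand-off between the text stage and the image stage can be sketched roughly as follows. This is an illustration under assumptions, not our actual schema: the JSON keys (`slides`, `text`, `edit_prompt`), the class names, and the `passthrough_backend` placeholder are all hypothetical. The point is that the image model sits behind a single callable, so GPT-image-1 could be swapped for an open-source backend without touching the rest of the pipeline.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SlidePlan:
    """One slide of the text-stage output (field names are assumptions)."""
    text: str
    edit_prompt: str


@dataclass
class SlideJob:
    """Everything the image stage receives for one slide."""
    slide_image: bytes
    edit_prompt: str
    selfie: Optional[bytes] = None


# Any image model is wrapped as a function from a job to image bytes.
ImageBackend = Callable[[SlideJob], bytes]


def parse_text_stage(raw_json: str) -> List[SlidePlan]:
    """Parse the structured JSON returned by the text stage."""
    payload = json.loads(raw_json)
    return [SlidePlan(text=s["text"], edit_prompt=s["edit_prompt"])
            for s in payload["slides"]]


def run_image_stage(plans: List[SlidePlan], slide_images: List[bytes],
                    backend: ImageBackend,
                    selfie: Optional[bytes] = None) -> List[bytes]:
    """Call the backend once per slide, pairing each plan with its template image."""
    if len(plans) != len(slide_images):
        raise ValueError("each slide plan needs exactly one template image")
    jobs = [SlideJob(image, plan.edit_prompt, selfie)
            for plan, image in zip(plans, slide_images)]
    return [backend(job) for job in jobs]


def passthrough_backend(job: SlideJob) -> bytes:
    """Placeholder backend that returns the template unchanged (no model call)."""
    return job.slide_image


raw = ('{"slides": [{"text": "Train smarter", "edit_prompt": "gym background"},'
       ' {"text": "Eat better", "edit_prompt": "kitchen background"}]}')
plans = parse_text_stage(raw)
results = run_image_stage(plans, [b"slide-1", b"slide-2"], passthrough_backend)
```

A real backend would replace `passthrough_backend` with a call to whichever model is being evaluated; the surrounding code stays the same.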
💡 What We’re Looking For
We’re seeking an engineer who can build and optimize an alternative image generation pipeline using open-source models and fine-tuning techniques.
Promising directions include:
- Fine-tuning Stable Diffusion XL, Flux, or Playground v2.5
- Training LoRA or DreamBooth adapters for niche/persona-specific styles
- Deploying optimized models via Replicate, AWS, or custom inference endpoints
- Using ComfyUI, ControlNet, or Diffusers for template-specific edits (e.g., text overlays, background swaps)
If you have additional ideas for achieving a 10× cost reduction while maintaining high visual quality and style consistency, we’d love to explore them together.
🎯 Responsibilities
- Implement a cost-efficient image generation pipeline compatible with our JSON-based system
- Fine-tune or adapt open-source models (SDXL, Flux, etc.) for our use case, for example with LoRA adapters
- Optimize inference speed and cost via Replicate, AWS, or local GPU deployment
- Benchmark against GPT-image-1 in cost, quality, and consistency
- Deliver a reproducible proof-of-concept (POC) ready for production integration
Expected Outcomes
- Visual quality: 90–95% of GPT-image-1 benchmarks
- Average cost per image: $0.006–$0.007
- Full compatibility with our JSON-driven prompt/output structure
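As a rough illustration of how these outcomes could be checked in a benchmark, the sketch below compares a candidate pipeline against the GPT-image-1 baseline. It assumes quality can be reduced to a single score per pipeline (for example an average human rating), which this posting does not specify; the function name, default thresholds, and example scores are hypothetical.

```python
def meets_targets(candidate_quality: float, baseline_quality: float,
                  cost_per_image: float,
                  min_quality_ratio: float = 0.90,
                  max_cost_per_image: float = 0.007) -> bool:
    """Return True when both the quality and the cost targets are met.

    Quality is expressed as a ratio of the candidate's score to the
    baseline's score; how the score itself is measured is left open.
    """
    if baseline_quality <= 0:
        raise ValueError("baseline quality must be positive")
    quality_ratio = candidate_quality / baseline_quality
    return quality_ratio >= min_quality_ratio and cost_per_image <= max_cost_per_image


# Hypothetical scores on a 1-5 rating scale, for illustration only.
print(meets_targets(candidate_quality=4.2, baseline_quality=4.5, cost_per_image=0.0065))
```

The same check can be run per niche or per template to see where a cheaper model falls short.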
About Us
We’re opening a new Generative AI division within our company, where we test and launch new hypotheses every month.
We’re looking for a full-time ML engineer for ongoing collaboration.
We’ll start with this project, and if the partnership goes well, we’ll offer a long-term contract.
If you’re interested, please fill out the form. If everything looks good, we’ll invite you to a 1:1 interview.
The entire hiring process takes less than one week.