Role overview

We’re building Carousely.ai, an AI platform that generates multi-slide Instagram carousels: visual storytelling posts tailored for different niches.

Current Context

Our current generation pipeline uses GPT-image-1, which delivers high-quality visuals but costs around $0.068 per image, making it expensive to scale.

Our goal is to achieve a 10× cost reduction (not 2–3×, but 10×) while maintaining 90–95% of the same visual quality.
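
To make the target concrete, the arithmetic can be sketched in a few lines of Python. The per-image price and the 10× factor come from this posting; the slide count per carousel is a hypothetical value used only for illustration.

```python
# Figures taken from the posting.
CURRENT_COST_PER_IMAGE = 0.068  # USD per image with GPT-image-1
REDUCTION_FACTOR = 10           # desired cost reduction


def target_cost_per_image(current: float = CURRENT_COST_PER_IMAGE,
                          factor: float = REDUCTION_FACTOR) -> float:
    """Return the per-image cost implied by the reduction factor."""
    return current / factor


def carousel_cost(slides: int, cost_per_image: float) -> float:
    """Cost of one carousel. `slides` is an assumed count, not from the posting."""
    return slides * cost_per_image


if __name__ == "__main__":
    target = target_cost_per_image()
    assumed_slides = 8  # hypothetical carousel length
    print(f"target cost per image: ${target:.4f}")
    print(f"carousel today:  ${carousel_cost(assumed_slides, CURRENT_COST_PER_IMAGE):.3f}")
    print(f"carousel target: ${carousel_cost(assumed_slides, target):.3f}")
```

A 10× reduction from $0.068 gives roughly $0.0068 per image, which sits inside the $0.006–$0.007 range listed under Expected Outcomes.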

We’re looking for an AI engineer experienced with open-source diffusion models to help implement a production-ready pipeline that meets these goals.

⚙️ Current Generation Pipeline

1. Input Stage

The user enters their niche (e.g., “fitness coach”) and optionally uploads a selfie.

2. Template Selection

The backend randomly selects 5 carousel templates from our internal library.

Each template includes multiple slides and text blocks.
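
As a rough illustration of this step, random selection without repeats might look like the sketch below. The function name `pick_templates` and the string template IDs are assumptions made for the example; the actual library format is not described here.

```python
import random
from typing import List, Optional


def pick_templates(library: List[str], k: int = 5,
                   rng: Optional[random.Random] = None) -> List[str]:
    """Pick k distinct templates at random from the library.

    `library` is a hypothetical list of template IDs.
    """
    if len(library) < k:
        raise ValueError(f"need at least {k} templates, got {len(library)}")
    rng = rng or random.Random()
    # random.sample draws without replacement, so no template repeats.
    return rng.sample(library, k)


if __name__ == "__main__":
    demo_library = [f"template_{i}" for i in range(20)]
    print(pick_templates(demo_library))
```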

3. Text Stage (GPT-4)

GPT-4 receives a system prompt, the niche, and all original slide texts.

It returns a structured JSON output where each slide includes:

Rewritten, niche-specific text
A visual edit instruction (prompt) for the next step
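
One way to picture that structure is the sketch below, which parses and validates a per-slide payload. The field names `slides`, `text`, and `edit_prompt` are assumptions chosen for illustration; the real schema is not given in this posting.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SlideOutput:
    text: str         # rewritten, niche-specific text
    edit_prompt: str  # visual edit instruction for the image stage


def parse_text_stage(raw: str) -> List[SlideOutput]:
    """Parse the text-stage JSON into per-slide records (hypothetical schema)."""
    payload = json.loads(raw)
    slides = []
    for i, item in enumerate(payload["slides"]):
        missing = {"text", "edit_prompt"} - set(item)
        if missing:
            raise ValueError(f"slide {i} is missing fields: {sorted(missing)}")
        slides.append(SlideOutput(text=item["text"],
                                  edit_prompt=item["edit_prompt"]))
    return slides


if __name__ == "__main__":
    example = json.dumps({"slides": [
        {"text": "3 myths about strength training",
         "edit_prompt": "Replace the background with a bright gym interior"},
    ]})
    for slide in parse_text_stage(example):
        print(slide)
```

Validating the payload before the image stage keeps a malformed slide from triggering a paid image call.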

4. Image Stage (GPT-image-1)

For each slide, we call GPT-image-1 with:

The original slide image
The generated edit prompt
The selfie (if provided)

This ensures visual consistency and niche relevance, but the cost is too high for large-scale production.
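
The per-slide inputs listed above suggest a small backend-agnostic interface, sketched here under assumptions. `ImageJob`, `build_jobs`, and `render_carousel` are hypothetical names, and the backend is any callable that turns a job into image bytes. That callable could wrap GPT-image-1 today and an open-source model later.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ImageJob:
    slide_image: bytes             # the original slide image
    edit_prompt: str               # the generated edit prompt
    selfie: Optional[bytes] = None # the selfie, if provided


# Any function that renders one job into image bytes can serve as a backend.
ImageBackend = Callable[[ImageJob], bytes]


def build_jobs(slide_images: List[bytes], edit_prompts: List[str],
               selfie: Optional[bytes] = None) -> List[ImageJob]:
    """Pair each slide image with its edit prompt and the optional selfie."""
    if len(slide_images) != len(edit_prompts):
        raise ValueError("each slide needs exactly one edit prompt")
    return [ImageJob(img, prompt, selfie)
            for img, prompt in zip(slide_images, edit_prompts)]


def render_carousel(jobs: List[ImageJob], backend: ImageBackend) -> List[bytes]:
    """Run every job through the chosen backend, one call per slide."""
    return [backend(job) for job in jobs]


if __name__ == "__main__":
    # Stand-in backend for demonstration; it does not generate real images.
    def fake_backend(job: ImageJob) -> bytes:
        return job.slide_image + b"|" + job.edit_prompt.encode()

    jobs = build_jobs([b"slide1", b"slide2"], ["gym background", "add dumbbells"])
    print(render_carousel(jobs, fake_backend))
```

Keeping the backend behind one callable means the JSON-driven stages before it would not need to change when the image model is swapped.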

💡 What We’re Looking For

We’re seeking an engineer who can build and optimize an alternative image generation pipeline using open-source models and fine-tuning techniques.

Promising directions include:

Fine-tuning Stable Diffusion XL, Flux, or Playground v2.5
Training LoRA or DreamBooth adapters for niche/persona-specific styles
Deploying optimized models via Replicate, AWS, or custom inference endpoints
Using ComfyUI, ControlNet, or Diffusers for template-specific edits (e.g., text overlays, background swaps)

If you have additional ideas for achieving a 10× cost reduction while maintaining high visual quality and style consistency, we’d love to explore them together.

🎯 Responsibilities

Implement a cost-efficient image generation pipeline compatible with our JSON-based system
Fine-tune or adapt open-source models (SDXL / Flux / LoRA / etc.) for our use case
Optimize inference speed and cost via Replicate, AWS, or local GPU deployment
Benchmark against GPT-image-1 in cost, quality, and consistency
Deliver a reproducible proof-of-concept (POC) ready for production integration

Expected Outcomes

Visual quality: 90–95% of GPT-image-1 benchmarks
Average cost per image: $0.006–$0.007
Full compatibility with our JSON-driven prompt/output structure

About Us

We’re opening a new Generative AI division within our company, where we test and launch new hypotheses every month.

We’re looking for a full-time ML engineer for ongoing collaboration.

We’ll start with this project, and if the partnership goes well, we’ll offer a long-term contract.

If you’re interested, please fill out the form, and if everything looks good, we’ll invite you to a 1:1 interview.

The entire hiring process takes less than one week.

Tags & focus areas

Contract, AI, Machine Learning, Generative AI, ML Engineer, Fine-Tuning, Open Source, Diffusion Models, SDXL, LoRA