DreamOmni2: Multimodal Instruction-Based Editing and Generation

Bin Xia¹, Bohao Peng¹, Yuechen Zhang¹, Junjia Huang³, Jiyang Liu³, Jingyao Li¹, Haoru Tan, Sitong Wu¹, Chengyao Wang¹, Yitong Wang³, Xinglong Wu³, Bei Yu¹, Jiaya Jia¹,²

¹ The Chinese University of Hong Kong, ² The Hong Kong University of Science and Technology, ³ Bytedance

Object Replace

Image 1

Image 2

Replace the lantern in the first image with the dog in the second image.

Result

Image 1

Image 2

Replace the man in the first image with the woman in the second image.

Result

Image 1

Image 2

Replace the suit in the first image with the clothes in the second image.

Result

Image 1

Image 2

Replace the person in the first image with the person in the second image.

Result

Lighting Render

Image 1

Image 2

Make the first image has the same light condition as the second image.

Result

Image 1

Image 2

Make the first image has the same light condition as the second image.

Result

Image 1

Image 2

Make the first image has the same light condition as the second image.

Result

Image 1

Image 2

Make the first image has the same light condition as the second image.

Result

Style Transfer

Image 1

Image 2

Replace the first image have the same image style as the second image.

Result

Image 1

Image 2

Replace the first image have the same image style as the second image.

Result

Image 1

Image 2

Replace the first image have the same image style as the second image.

Result

Image 1

Image 2

Replace the first image have the same image style as the second image.

Result

Pose Imitation

Image 1

Image 2

Make the person from the first image has the same pose as person from the second image.

Result

Image 1

Image 2

Make the person from the first image has the same pose as person from the second image.

Result

Image 1

Image 2

Make the person from the first image has the same pose as person from the second image.

Result

Image 1

Image 2

Make the person from the first image has the same pose as person from the second image.

Result

Face Expression

Image 1

Image 2

Make the person in the first image have the same expression as the person in the second image.

Result

Image 1

Image 2

Make the person in the first image have the same expression as the person in the second image.

Result

Image 1

Image 2

Make the person in the first image have the same expression as the person in the second image.

Result

Image 1

Image 2

Make the person in the first image have the same expression as the person in the second image.

Result

Hair Style

Image 1

Image 2

Make the person in the first image have the same hairstyle as the person in the second image.

Result

Image 1

Image 2

Make the person in the first image have the same hairstyle as the person in the second image.

Result

Font Imitation

Image 1

Image 2

Make the words in the first image have the same font as the words in the second image.

Result

Image 1

Image 2

Make the words in the first image have the same font as the words in the second image.

Result

Image 1

Image 2

Make the words in the first image have the same font as the words in the second image.

Result

Image 1

Image 2

Make the words in the first image have the same font as the words in the second image.

Result

Pattern Imitation

Image 1

Image 2

Make the bag in the first image have the same pattern as the machine in the second image.

Result

Image 1

Image 2

Make the car in the first image have the same pattern as the mouse in the second image.

Result

Image 1

Image 2

Make the tape in the first image have the same pattern as the bag in the second image.

Result

Image 1

Image 2

Make the bottle in the first image have the same pattern as the compass in the second image.

Result

Image 1

Image 2

Make the dress in the first image have the same pattern in the second image.

Result

Image 1

Image 2

Make the T-shirt in the first image have the same pattern in the second image.

Result

Background Replace

Image 1

Image 2

Make the bag in the first image have the same pattern as the machine in the second image.

Result

Image 1

Image 2

Make the car in the first image have the same pattern as the mouse in the second image.

Result

Image 1

Image 2

Make the dress in the first image have the same pattern in the second image.

Result

Image 1

Image 2

Make the T-shirt in the first image have the same pattern in the second image.

Result

In-context Generation

Image 1

Image 2

The character from the first image is holding the item from the second picture.

Result

Image 1

Image 2

The character from the second image is holding the item from the first image.

Result

Image 1

Image 2

The logo from the first image is printed on the object from the second image.

Result

Image 1

Image 2

The man from the first image is wearing the clothes from the second image and is sitting on a sofa.

Result

Three References Generation

Image 1

Image 2

Image 3

The parrot from Image 1 is wearing the hat from Image 2 and standing on the ground, with a forest in the background. The color tone of the image is the same as in Image 3.

Result

Image 1

Image 2

Image 3

The cat from Image 1 and the dog from Image 2 are sitting side by side, with the background inside a car. The style of the image is the same as in Image 3.

Result

Image 1

Image 2

Image 3

On a fighting stage, two people are engaged in combat. Their movements are shown in Figure 3.

Result

Image 1

Image 2

Image 3

Picture 1 is hung on the wall of a bedroom. The cup in Picture 2, made of the same material as the plate in Picture 3, is placed on the table.

Result

Four References Generation

Image 1

Image 3

Image 2

Image 4

The man from Image 1 stands next to the woman from Image 2. The woman is wearing the hat from Image 4, which has the logo from Image 3 on it. The background is by the lake.

Result

Image 1

Image 3

Image 2

Image 4

The woman from image 1 and the man from image 2 are standing in front of a mountain. The dog from image 3 is standing between them. The style of the image is the same as in image 4.

Result

More Examples

Image 1

A spaceship is flying in the sky, with the sun visible in the background. The style of the image is the same as in Image 1.

Result

Image 1

A girl wearing a pink skirt and a white long-sleeve shirt, with long golden hair. She strikes the same pose as the man in the given image. The background is a field of flowers.

Result

Image 1

Generate a helicopter soaring above a city skyline at dusk. The color scheme of the helicopter is the same as that of the motorcycle.

Result

Image 1

A vibrant hot air balloon soaring over a sprawling valley. The color tone of the image is the same as in Image 1.

Result

Image 1

A woman in a cozy sweater and jeans, walking through a park during autumn. She has the same hair as the one in the given image and is holding a coffee cup in one hand, with the other hand tucked into her pocket. The background features golden fall leaves scattered on the ground, with trees in shades of red and orange, and a peaceful walking path stretching ahead.

Result

Image 1

A woman in a cozy knitted sweater and denim jeans, sitting by a fireplace, sipping tea while reading a book. Her hair is styled in loose waves, and she has a calm, content expression. Her makeup is the same as the woman in the given image.

Result

Image 1

Generate a bottle placed on a wooden kitchen counter. Its texture matches the given image. The bottle is surrounded by fresh fruit, a small bowl of herbs, and a cup of steaming tea.

Result

Image 1

On the cup, "Story" is displayed in the same font style as the reference image.

Result

Video Introduction

Compare with the Alternatives

Inputs | Ours | Kontext | Qwen-Edit | GPT-4o | Nano-Banana | OmniGen2

Image 1

Image 2

Make the person from the first image has the same pose as person from the second image.

Image 1

Image 2

Make the person in the first image have the same hairstyle as the person in the second image.

Image 1

Image 2

Make the first image has the same light condition as the second image.

Image 1

Image 2

Make the words in the first image have the same font as the words in the second image.

Contact Us

Feel free to contact Bin Xia at zjbinxia@gmail.com with any questions or for cooperation and communication.

If you find this work useful, please consider citing:

@article{Xia2025,
    author = {Bin Xia and Bohao Peng and Yuechen Zhang and Junjia Huang and Jiyang Liu and Jingyao Li and Haoru Tan and Sitong Wu and Chengyao Wang and Yitong Wang and Xinglong Wu and Bei Yu and Jiaya Jia},
    title = {DreamOmni2: Multimodal Instruction-Based Editing and Generation},
    year = {2025},
}