|
Input (240 frames)
blue headphones, closed eyes
Input (240 frames)
dark forest, holy sword
Input (120 frames)
white hair, cartoon style
|
|||||
|
Input (60 frames)
CG style
A handsome grandpa, white hair
Input
white, snow
Van Gogh style
|
||||||
|
Input
snow, scarf, cartoon style
cheongsam, scarf, cartoon style
Input Input (52 frames)
Pink, CG style
Blue
|
||||||
|
Input
cartoon style
Input (80 frames)
cartoon style
Input Input
cotton
sunflower
|
||||||
|
Input
cartoon style
White top
Input
Cartoon style
CG style
|
||||||
|
Input
marble sculpture
white ancient Greek sculpture
|
||||||
|
Input
Condition
pixar style
panda with moon
|
|||
| A hand-drawn animation of a fluffy wolf, cartoon style | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| Orange SUV in sunny snow winter | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| A cartoon spiderman in black suit, black shoes is dancing | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| A black boxer wearing black boxing gloves punches towards the camera, cartoon style | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| A white deer in the snow | |||||||
|
Input Video
Ours
FRESCO
Flatten: Error creating optical flow index
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| A white cat in pink background | |||||||
|
Input Video
Ours
FRESCO
Flatten: Error creating optical flow index
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| Cartoon style, a man, in a castle | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| A sculpture of a woman running | |||||||
|
Input Video
Ours
FRESCO
Flatten
TokenFlow
ControlVideo
Rerender
Text-to-Video-Zero
|
|||||||
| Exp1: Combining the anchor token with the warped token gives better results. | |||||
| A white deer in the snow | |||||
| Warping query only | Warping query + first kv | Warping query + warping kv | Full: warping query + [first kv, warping kv] | ||
|---|---|---|---|---|---|
| Exp2: Aligned query patches help attention aggregate features correctly. | |||||
| A white cat in pink background | |||||
| Input | Warping kv, no warping query | Warping attention-output features | Full | ||
| Exp3: ControlNet provides the source video's structure, which helps flow propagation. | |||||
| ControlNet input | Result without ControlNet input | ||||
| Single scene: our aligned-QKV flow-based attention tolerates flow errors when the scene has a single object and a simple background. | |||||
|
Input
condition input
A white cat in pink background
backward flow
backward flow occlusion mask
|
|||||
| Complex scene: the color of the shopping bag changes; the optical flow fails because of the scene change. We plan to address this in future work. | |||||
|
Input
condition input
white hair, white top and jeans, CG style
backward flow
occlusion mask
|
|||||
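
For readers unfamiliar with the "backward flow" and "occlusion mask" panels above, the following is a minimal sketch of how a backward flow field can warp a previous-frame feature map into the current frame. It is a generic illustration, not the implementation used for these results. The function name `backward_warp`, the nearest-neighbor sampling, the `fallback` argument, and the out-of-bounds occlusion test are all assumptions made for this example; real occlusion masks are typically derived from a forward-backward consistency check.

```python
from typing import List, Tuple

Grid = List[List[float]]
Flow = List[List[Tuple[float, float]]]  # per-pixel (dx, dy) offsets


def backward_warp(prev: Grid, flow: Flow, fallback: Grid) -> Tuple[Grid, List[List[bool]]]:
    """Warp `prev` into the current frame with a backward flow field.

    flow[y][x] = (dx, dy) points from current-frame position (x, y) to its
    source position (x + dx, y + dy) in the previous frame.
    Hypothetical simplification: a position counts as occluded only when its
    source falls outside the frame; it then takes the value from `fallback`.
    """
    h, w = len(prev), len(prev[0])
    warped = [[0.0] * w for _ in range(h)]
    occluded = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Nearest-neighbor lookup of the source position.
            sx, sy = round(x + dx), round(y + dy)
            if 0 <= sx < w and 0 <= sy < h:
                warped[y][x] = prev[sy][sx]
            else:
                occluded[y][x] = True
                warped[y][x] = fallback[y][x]
    return warped, occluded
```

In this sketch, the occluded positions are the ones where warped values cannot be trusted, which is where a fallback source (for example, an anchor frame) would be needed.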