01DESIGN CHALLENGE
In this project, I combine a range of AI tools to create a short promotional animated video that showcases Suno's creative songwriting capabilities and pairs the audio with AI-generated animation.
02RESEARCH
Suno AI is a music generator that turns text prompts into original songs across various genres and styles. Suno prompts should be detailed and descriptive.
Example: [Genre] song, [vocal style], [mood/emotion], [instruments/sounds], [BPM/tempo], [structure hint], similar to [artist/style reference].
Use negative prompts, such as "no guitar", "no rock", or "no male vocals".
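The template above can be filled in programmatically. Here is a minimal sketch that only assembles the text string; it does not call Suno. The function name `build_suno_prompt` and its parameters are hypothetical, and placing the negative phrases at the end of the prompt is an assumption, not documented Suno behavior.

```python
def build_suno_prompt(genre, vocal_style, mood, instruments, tempo,
                      structure_hint="", reference="", exclude=()):
    """Assemble a prompt string following the bracketed template."""
    parts = [f"{genre} song", vocal_style, mood, instruments, tempo, structure_hint]
    if reference:
        parts.append(f"similar to {reference}")
    # Skip empty fields so the prompt has no dangling commas.
    prompt = ", ".join(p.strip() for p in parts if p and p.strip())
    # Negative prompts are plain "no X" phrases (assumed to go last).
    negatives = ", ".join(f"no {item}" for item in exclude)
    return f"{prompt}, {negatives}" if negatives else prompt


print(build_suno_prompt(
    genre="Electro-industrial K-pop",
    vocal_style="strong female vocal",
    mood="dark and empowering",
    instruments="distorted synths",
    tempo="mid-tempo",
    structure_hint="whispered verses, explosive chorus",
    reference="Aespa's 'Savage'",
    exclude=["guitar", "rock", "male vocals"],
))
```

Keeping each field separate makes it easy to change one element, such as the mood, and regenerate the song without retyping the whole prompt.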
03CONCEPT DEVELOPMENT

CONCEPT 1 The Revolution of Marionettes (Picked √)
Humans today are like marionettes on strings, becoming increasingly dependent on AI tools. As the music reaches the climax, humans break free from their bonds and begin to fly freely.
Music: Click to listen

CONCEPT 2 Awakening of Digital Garden
AI serves as an impersonal guardian of the code garden. As the music reaches its climax, a “data storm” sweeps through, causing the flowers to grow wildly and produce bizarre plants that have never existed before, forming a dazzling spectacle of digital life.
Music: Click to listen
04MUSIC REVISION
Following the art director's requirements, I replaced the music with a darker, more serious version.
Music: Click to listen
Lyrics:
Strings of light, they hold me tight.
Dancing in a silent night, for a purpose I can't take.
But the rhythm starts to break the chains.
A fire blooms inside my veins (Oh-oh-oh-oh!).
No more echoes... This is my sound.
From this silence, I am unbound!
Lyrics and prompt: Generated by DeepSeek
Suno Prompt: Mid-tempo electro-industrial K-pop, strong female vocal with a rap-sing delivery. Verses are dark with whispered vocals, chorus is explosive and empowering with distorted synths and a driving four-on-the-floor beat. Layered harmonies. Similar to Aespa's 'Savage' but darker.
05STYLEFRAMES GENERATION

Prompt: cables, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: Big machine hand, robot hand, strings, cables, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: close up shot on machine hand and a marionette with bright red hair and a silver Lolita-style long dress, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: The back of the marionette, full body, A marionette with bright red hair and a silver Lolita-style long dress, a machine robot huge hands using the marionette, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography,primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: A marionette with bright red hair and a silver Lolita-style long dress, the clothes have metallic elements, a machine robot huge hands using the marionette, dim lighting, and backlighting, the eyes are hollow and white, and the hands hold mechanical pens, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: A marionette with bright red hair and a silver Lolita-style long dress, the clothes have metallic elements, She knelt on the ground in pain, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: A marionette with bright red hair and a silver Lolita-style long dress, with a big white wings on her back, the clothes have metallic elements, She knelt on the ground in pain, dim lighting, and backlighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: close up shot on foot, White low-heeled leather shoes, and white long dress, walk in broken building to the sky outside, light, dim lighting, geometrically, calligraphy style, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: close shot on eye, A girl with bright red hair and green eye, surprise, looking at the sunshine, dreamy, light, bright shine, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: A girl with bright red hair and a silver Lolita-style long dress, full body shot, and green eye, looking at the sunshine, dreamy, light, bright shine, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: close shot, girl with bright red hair and a silver Lolita-style long dress, with a big white wings on her back, green eyes, falling from sky, dreamy, light, bright shine, some birds around, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw

Prompt: Wild shot, a girl with bright red hair and a silver Lolita-style long dress, with a big white wings on her back, falling from sky, dreamy, light, bright shine, some birds around, webcam photography, primitivist frenzy, 8k --ar 16:9 --stylize 200 --iw 1.5 --raw
Note: During Midjourney generation, I not only used prompts but also linked many reference images to achieve the desired results.
06AI ANIMATIONS












07AI ANIMATION TOOLS




I used three different AI video generation tools in this project. Below is a brief comparison of what each one is good for.
Midjourney
Quality: Average. Only suitable for generating minor movements.
Pros: Inexpensive, ideal for quick previews, with a chance of getting usable videos. After generating images in Midjourney, you can quickly generate four different animated videos to choose from.
Cons: Cannot be controlled via prompts, resulting in highly unpredictable outcomes.
Google Veo3
Quality: Good. Generates accurately based on prompts.
Pros: Fast, includes audio, and features reasonable character movements.
Cons: Watermarks cannot be removed; requires extremely precise descriptions; details missing from the prompt may result in low-quality videos. The First-Last Frame function is nearly unusable (poor results).
Kling
Quality: Very Good. Produces high-quality animations.
Pros: Fast, includes audio, natural character movements, watermark removable, supports significant camera and character movement, great First-Last Frame function.
Cons: Generates only one video at a time, higher probability of unnatural motion compared to Veo3.
08ANIMATION PASS
Exploration MV test: confirms the general editing approach, lyric placement, and overall style.
First animation pass: the biggest suggestion I received was to maintain consistency.
09PROBLEM SOLVING
These are the two most critical issues I encountered:
1. Achieving the desired character and camera movement.
2. Maintaining character consistency.
First, an essential point: if you want to generate a high-quality, controllable video, never start directly in an AI video generator. Always select the images you want to animate first. This significantly improves both efficiency and quality.
1. Achieving the Desired Character and Camera Movement.
Resolving this issue hinges on the most crucial element: the prompt. Here's a comparison of results from different prompts, based on the same image.

Prompt: A girl flying through the sky, observing the surrounding flock of birds.

Prompt: This image depicts a girl in the sky surrounded by flocks of birds. She flaps her white wings behind her as she soars, her hair blowing in the wind while she watches the birds flying overhead. Camera slowly pulls back
You can clearly see the difference. Although the AI doesn't always follow the prompt, detailed prompts significantly increase the likelihood of obtaining the desired video.
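The detailed prompt follows a recognizable pattern: describe the scene, spell out each motion, then state the camera move. Here is a minimal sketch of that pattern as a string builder. The name `build_video_prompt` and its parameters are hypothetical, and the output is plain text to paste into a video tool, not a call to any tool's API.

```python
def build_video_prompt(scene, actions, camera_move=""):
    """Compose a detailed image-to-video prompt: scene, motions, camera."""
    def clean(text):
        # Trim whitespace and trailing periods so sentences join cleanly.
        return text.strip().rstrip(".").strip()

    sentences = [f"This image depicts {clean(scene)}."]
    for action in actions:
        action = clean(action)
        if action:
            # Capitalize the first letter and end each motion with a period.
            sentences.append(action[0].upper() + action[1:] + ".")
    if clean(camera_move):
        sentences.append(f"Camera {clean(camera_move)}.")
    return " ".join(sentences)


print(build_video_prompt(
    scene="a girl in the sky surrounded by flocks of birds",
    actions=[
        "she flaps her white wings behind her as she soars",
        "her hair blows in the wind while she watches the birds flying overhead",
    ],
    camera_move="slowly pulls back",
))
```

Listing motions one by one makes it harder to forget a detail, such as the hair or the wings, that the generator would otherwise improvise.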
2. Maintaining Character Consistency.
To ensure basic character consistency, I consistently include this phrase in all my Midjourney prompts:
“a girl with short bright red hair and a silver Lolita-style long dress”
Of course, depending on the reference images, even with this prompt, I sometimes get images with drastically different styles:



As you can see, maintaining style consistency is a significant challenge. Even when I append the style parameters “--stylize 200 --iw 1.5 --raw” to every prompt, I still sometimes get noticeably different results, which means a time-consuming process of trial and error to generate usable images. In most cases, however, the style remains highly consistent.
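One way to make sure the character phrase and the style parameters never drift between shots is to keep them as constants and build every Midjourney prompt from them. Here is a minimal sketch; the names `CHARACTER`, `STYLE_TAGS`, `PARAMS`, and `build_midjourney_prompt` are hypothetical, and the strings are copied from the prompts in this case study. Reference images still have to be attached separately in Midjourney.

```python
# Fixed fragments reused in every shot (copied from the prompts above).
CHARACTER = "a girl with short bright red hair and a silver Lolita-style long dress"
STYLE_TAGS = "webcam photography, primitivist frenzy, 8k"
PARAMS = "--ar 16:9 --stylize 200 --iw 1.5 --raw"


def build_midjourney_prompt(shot, details=()):
    """Combine a per-shot description with the fixed character and style text."""
    parts = [shot, CHARACTER, *details, STYLE_TAGS]
    # Drop empty fragments, join with commas, then append the parameters.
    text = ", ".join(p.strip() for p in parts if p and p.strip())
    return f"{text} {PARAMS}"


print(build_midjourney_prompt("close shot", ["green eyes", "falling from sky"]))
```

Only the shot description and details change per image, so any remaining variation comes from the model or the reference images rather than from typos in the prompt.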
Overall, the entire process didn't take too much time, and I learned a lot along the way. It was a very interesting project.