Midjourney was my first real text-to-image AI tool experience, starting back in 2022. I’ve watched it evolve as the industry swelled with competition, with new image, animation and video tools popping up almost weekly ever since. By January 2023, the tools had matured to a point where they made us sit up and take notice, as I outlined in my first AI Tools article, AI Tools Part 1: Why We Need Them.
But after years of progress and lots of testing, Midjourney has raised the bar yet again with the introduction of their new video tool, and I’m pleasantly surprised at what it can do so quickly and sensibly. Here’s what I’ve explored so far…
Midjourney Animate
Midjourney has announced its new video animation feature, and the output is quite impressive!
I’ve been using a lot of different animation and video generation tools over the past few years, as you may know if you’ve been following my AI Tools series here on ProVideo. But this is the quickest, most seamless workflow I’ve engaged with yet.
Most generators require a starting image – like a keyframe, if you will. I almost always start with an image I’ve generated in Midjourney and then go to another tool to animate it. (You can see my last article, AI Tools: Generative AI for Video & Animation Updates, for more examples of the workflow.) But now in Midjourney, you can either generate a new image as your source or start with your own photo.
First – the details and specs…
Currently, everyone with an account can access the Animate option, but only the Pro and Mega plans can use Relax mode. Videos take about 8x more time to process than images, but each round provides you with 4 variations to choose from.
Video Output Sizes & Formats
Note that the maximum resolution at the moment is 480p (832×464), and the exact dimensions vary depending on aspect ratio, of course.
You can export your video as a compressed MP4 for social use, as a larger “RAW” MP4 H.264 version (still compressed, but less so), or as an animated GIF. You can also link to the completed video’s URL, since it stays in the cloud in your account.
This is the codec data from a “RAW” file downloaded from Midjourney:
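If you want to verify those specs on your own downloads, you can read the stream metadata with ffprobe, which ships with FFmpeg. Here’s a minimal Python sketch, assuming ffprobe is installed and on your PATH; the filename is just a placeholder, not Midjourney’s actual naming:

```python
import json
import subprocess

def probe_video(path: str) -> dict:
    """Return codec, resolution and frame-rate info for the first video stream."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",  # inspect the first video stream only
        "-show_entries", "stream=codec_name,width,height,r_frame_rate",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["streams"][0]

# "midjourney_raw.mp4" is a placeholder filename.
info = probe_video("midjourney_raw.mp4")
print(f"{info['codec_name']}  {info['width']}x{info['height']}  {info['r_frame_rate']} fps")
```

On a 480p clip, you’d expect to see h264 and 832×464 (or the equivalent dimensions for your aspect ratio).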
Midjourney Video Test Drive
Of course, I had to dive in and absorb all I could with this new feature, and I spent a couple of days putting it through its paces.
Starting off, I tried some simple prompts for various news reporters to be used as B-roll. (I’d use something like this in a pinch on a screen in a shot simulating a TV newscast, for instance.) The quality is good enough at the scale it provides (480p), but it’s in no way intended for full-screen use in this initial roll-out.
My first step was to get some figures to animate. I entered short prompt descriptions for Midjourney to generate some examples. It’s funny what AI thinks about ages at times. And some of the results are just so wrong they’re HILARIOUS!
After selecting the subject I wanted for each shot, I let Midjourney decide on the motion with the Auto Animate option. Each pass provides you with four different videos to choose from, so you can branch out into many options.
I created this video to show you the selections and results for each subject.
I did the same with these other examples from ChatGPT prompts, and I explain the process for each one.
(Note: the VO says 840p when I know damn well it’s 480p! Linguistically dyslexic I guess!)
Photo to Video
Testing out the photo-to-video feature in Midjourney, I used an old image from my ’80s hair rock & roll days. So much hair product back then!
I uploaded the photo as the first frame and let Midjourney do the work from there. I extended it just once more to make roughly an 8-second clip. I exported it as an animated GIF (not JIF) since there isn’t any audio. If only I were really that cool on stage!
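If you’d rather download the MP4 and make the GIF yourself, to control the size and frame rate, FFmpeg’s palette filters give much better color than a naive conversion. Here’s a minimal sketch in the same vein as the one above, assuming ffmpeg is on your PATH; the filenames are placeholders:

```python
import subprocess

def mp4_to_gif(src: str, dst: str, fps: int = 12, width: int = 480) -> None:
    """Convert an MP4 to an animated GIF using a generated color palette."""
    graph = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"   # resample and resize
        "split[a][b];[a]palettegen[p];[b][p]paletteuse"  # build palette, then apply it
    )
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-filter_complex", graph, dst],
        check=True,
    )

# Placeholder filenames, not Midjourney's export naming.
mp4_to_gif("stage_clip.mp4", "stage_clip.gif")
```

The split/palettegen/paletteuse chain builds a palette from the clip itself, which keeps the GIF from banding badly.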
Using Midjourney for Storyboarding and Previz
Currently, I see Midjourney as a tool for creativity, helping you bring your ideas to life; not necessarily as an end product, but as a way to realize how the written word can be visualized on the screen.
This could be an amazing tool for screenwriters trying to sell a treatment, or for storyboarding scenes and shots for locations, sets, lighting and blocking.
I created a short scene completely with AI tools in just a few steps – with two different variations to show how seamlessly Midjourney responds to prompts and extensions.
I started with ChatGPT, asking for ideas for projects to do, and this was one of the results I followed.
I was happy with the resulting images, so I went with one I liked and decided to build a story around what the character ended up doing via my extended prompts.
I must say, this was one of the most satisfying and creative projects I’ve done in years, with a genuinely intuitive workflow. And it really only took a few hours from start to finish, because I had no preconceived idea of what it was going to be, and I let the AI tools be my partners: my writers, actors, sound FX, staging and camera ops. I really felt like a director of sorts.
Each render pass provides 4 different variations based on the first frame (or continues from the last pass with each extension, up to 4x). Deciding which take to use is very subjective, but that’s part of the storytelling aspect. In my case, I started with the original ChatGPT prompt and then added action directions to the end of it.
With each pass, I would add a new instruction or direction. The camera moves and angles were mostly determined by Midjourney, but those can be directed more closely as well. It doesn’t always follow instructions for action, though you can often fool it by rewording them. Sometimes, though, the mistakes actually change the story and you follow a different rabbit down the hole.
I’m including a few GIFs below showing the order of the process, the subsequent renders for each prompt instruction, and the decisions I made from those results to continue building my scene.
Prompt (with selected start image): film noir style, trench coat detective under a streetlamp in heavy rain, black and white with subtle color tint, glistening cobblestone, intense contrast, 1950s urban alley setting, moody and mysterious, he’s holding a lit cigarette and looks around like he’s waiting on someone
Prompt change/addition: he starts to cross the street while the camera follows his movement and he flicks the cigarette down on the street. (He didn’t cross the street but I went with it)
Prompt change/addition: a woman appears from the shadows on the right and runs up to him urgently. (more like a slow saunter, but it works)
Prompt change/addition: the couple kisses and embrace.
You’ll have to watch the video below to see how Variation 2 ended up!
So I needed to add sound to this short scene, including a voice-over narration in an appropriate tone and voice.
I started with ChatGPT again, and my AI script-writing partner and I came up with some good lines. (You’ll hear both versions in the video below.)
I fed the script text into ElevenLabs, using their new v3 Alpha model for more natural speech delivery, and I found a great voice that really fit the time period.
I also used ElevenLabs to produce my sound FX and music bed.
Everything mixed easily in Adobe Premiere Pro in just minutes. And here are the results…
For more detailed info about Midjourney video options and usage instructions, visit their website.