It is a law of "AI" that every project must have a name collision with Nvidia:
https://github.com/NVIDIA/cutlass
This is amazing!
I wonder if OpenAI Structured Output [1] could be leveraged to constrain generation to required JSON Schema (and then convert to XML later)...
[1] https://platform.openai.com/docs/guides/structured-outputs
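A minimal sketch of the "JSON first, XML later" idea, assuming the model is constrained to emit a generic tree of `tag` / `attrs` / `children` nodes. That node shape, the `json_to_element` helper, and the sample values are all made up for illustration; this is not a real FCPXML schema and not how cutlass works.

```python
import json
import xml.etree.ElementTree as ET


def json_to_element(node: dict) -> ET.Element:
    """Recursively turn a {"tag", "attrs", "children"} node into an XML element."""
    # XML attribute values must be strings, so numbers from JSON are stringified.
    attrs = {key: str(value) for key, value in node.get("attrs", {}).items()}
    elem = ET.Element(node["tag"], attrs)
    for child in node.get("children", []):
        elem.append(json_to_element(child))
    return elem


# Stand-in for what a schema-constrained model response might look like
# (hypothetical and heavily simplified).
model_output = json.loads("""
{
  "tag": "fcpxml",
  "attrs": {"version": "1.11"},
  "children": [
    {"tag": "resources", "children": [
      {"tag": "format", "attrs": {"id": "r1", "width": 1080, "height": 1920}}
    ]}
  ]
}
""")

xml_text = ET.tostring(json_to_element(model_output), encoding="unicode")
print(xml_text)
```

A schema can guarantee the shape of the tree, but it would not by itself guarantee that the resulting XML is something FCP accepts, so a validation step would presumably still be needed.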
omg I would never want to see the JSON file! lol the format is sooo complex. I think it needs to stay XML. But I love the idea of OpenAI just speaking this.
I'm not sure what value AI is providing here? Generating key frames seems not just possible, but easier to do within the GUI? Does the AI understand the relationship between script and video?
you can say things like "add a new cli menu option to cutlass that takes as input a list of words and make a 1080x1920 9 min video with random png from assets folder and place each word on the screen in font 400 with face color pink and outline color yellow width 15"
And Claude will write the code in a way that generates an actually valid fcpxml file.
A fun one is: "review all the tests in ./tests to get deep knowledge of what you can do with fcpxml in this repo. Add a new cli menu option called stress-test that tries to generate a 9 min video with lots and lots of stuff. Just throw every single complex thing you can think of with many many lanes, transforms, use files from ../assets and create something wild in 1080x1920"
Hey, I’ve been trying to get Claude Code to generate Final Cut XML myself. Mostly I just have a Claude.md with the FCP XML reference and some guidelines. What does this do differently?
Well, the format is incredibly complex. You have to generate the XML correctly in every last detail or FCP will crash when you import it. I learned the hard way that you need a robust validator system and not just tests alone. Before XML goes out the door you need a last line of defense to catch problems. The Go version of cutlass is pretty good at this now and the Python version is catching up.
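To make "last line of defense" concrete, here is a rough sketch of a check that runs on the final XML text right before it is written out. The two rules (every `ref` must point at an existing `id`, and time attributes must look like rational seconds such as `1001/30000s`) are assumptions picked for illustration; the real cutlass validator is not shown in this thread and presumably checks far more.

```python
import re
import xml.etree.ElementTree as ET

# Assumed time format: whole or rational seconds, e.g. "0s" or "1001/30000s".
TIME_RE = re.compile(r"^\d+(/\d+)?s$")
TIME_ATTRS = ("offset", "duration", "start")


def validate(xml_text: str) -> list[str]:
    """Return a list of problems found; an empty list means these checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    errors = []
    known_ids = {e.get("id") for e in root.iter() if e.get("id") is not None}
    for elem in root.iter():
        ref = elem.get("ref")
        if ref is not None and ref not in known_ids:
            errors.append(f"<{elem.tag}> references unknown id {ref!r}")
        for attr in TIME_ATTRS:
            value = elem.get(attr)
            if value is not None and not TIME_RE.match(value):
                errors.append(f"<{elem.tag}> has malformed {attr}={value!r}")
    return errors


good = (
    '<fcpxml><resources><asset id="r2"/></resources>'
    '<spine><asset-clip ref="r2" offset="0s" duration="1001/30000s"/></spine></fcpxml>'
)
bad = (
    '<fcpxml><resources><asset id="r2"/></resources>'
    '<spine><asset-clip ref="r9" offset="0s" duration="5 seconds"/></spine></fcpxml>'
)
print(validate(good))
print(validate(bad))
```

The point of running this on the serialized text rather than on in-memory objects is that it catches mistakes no matter which code path produced them.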
The idea here (I think) is that (1) the XML is generated by a program rather than written by hand, and (2) that program is typed.
Just by virtue of being a Go program, it enables even more sophisticated validation and automation if you want to implement it.
hehe now I feel funny working on the Python version too. It depends on my mood, yeah sometimes I want Go. But sometimes I want a little Python.
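A small sketch of what "typed" buys you, written in Python with dataclasses since cutlass also has a Python version (the same idea applies to Go structs). The `Time` and `AssetClip` classes here are hypothetical and are not taken from the cutlass codebase.

```python
from dataclasses import dataclass
from fractions import Fraction
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Time:
    """A duration in seconds, kept as an exact fraction instead of a raw string."""
    value: Fraction

    def __post_init__(self) -> None:
        if self.value < 0:
            raise ValueError("time cannot be negative")

    def __str__(self) -> str:
        if self.value.denominator == 1:
            return f"{self.value.numerator}s"
        return f"{self.value.numerator}/{self.value.denominator}s"


@dataclass
class AssetClip:
    ref: str
    offset: Time
    duration: Time

    def to_xml(self) -> ET.Element:
        # Attribute strings are produced in one place, so callers cannot
        # hand-type a malformed duration.
        return ET.Element(
            "asset-clip",
            ref=self.ref,
            offset=str(self.offset),
            duration=str(self.duration),
        )


clip = AssetClip(ref="r2", offset=Time(Fraction(0)), duration=Time(Fraction(1001, 30000)))
print(ET.tostring(clip.to_xml(), encoding="unicode"))
```

Because the only way to get a time string is through `Time`, a whole class of formatting mistakes is ruled out before any validator runs.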
Editing will remain of utmost strategic importance to automate until generation dominates the field. At that point, we'll probably generate from storyboards.
A lot of small startups are trying to automate timeline construction with VLMs. I've counted about a dozen, some with seed stage funding. If you can crack this, there's absolutely a path to a unicorn. But in the long term, generation will disrupt nonlinear video editors, because whatever high level software we build will be able to dispatch to both generation and editing tasks.
Fwiw, I also work in this space and spend a lot of time thinking about it. Prior to AI, I also spent a significant amount of time making films the old fashioned way.
Yeah I look at the amount of code I have now and it's just crazy. It would have taken me 10 years to get it this far before AI. So much of the code was generated. But I do wonder if all of FCP will be legacy soon...
Could you name a few of these startups please?
I feel like they've really changed how they present themselves, but Runway was the big one that came to mind for me: it focuses on more than simple video generation and is actually meant for filmmakers.
This is the best page I found, though maybe there's a better one:
https://runwayml.com/product
But its interface looks like a video editor: timeline, etc.
I’m curious as well.
I’ve been exploring this space for a potential project - curious to see what these startups are doing
It's video editing, not video generation at all, not even a "different kind". Really cool though.
Yeah, but further developed it could pass for editing
Creating a video through edits is generation of a video.
This is a great project that demonstrates a unique way to tackle a problem from someone with a passion for those problems.
Wow, this is fantastic. I have been searching for a tool that would let me output videos like that.
I was going to create a JavaScript-based version of FCP to essentially implement the keyframes and transitions with arbitrary videos from YouTube/Vimeo and try to queue them up before they play. Then people would come to our site or use our widget to play all the videos.
I even considered doing a Ken Burns effect and zooming/panning/cropping the videos. I wanted to have the AI do some takes, finding highlights in the video by transcribing it.
And adding the stupid text overlays etc. would be done in JavaScript so we don’t need to generate so many versions of the video for different languages or styles.
What do you think? Should we use cutlass XML as the format? Probably better to just make our own JSON, no?
Does anything like this already exist? There are a bunch of sites to edit video timelines. Maybe there is a JS lib, like impress.js is for presentations?
Oh I have just the wiki for you!
https://github.com/andrewarrow/cutlass/wiki
The format is soooo complex. But if you want every feature of FCP it's all there.