Replace background with text-to-video RunwayML GEN-2

Green-screen-style background replacement using a mask and text-to-video generation with RunwayML GEN-2


Just had my first try of Runway.

A friend recorded a funny video of me playing on a balance board in a room with a messy background. I replaced it with a dynamic wave background video in 10 minutes -- and before that, I had no experience with traditional video editing software and had never tried Runway.

I'm amazed by how efficient the video editing tool chain has become.

Process

Step 1: Annotate points for the mask

runwayml-mask-editing.png

Step 2: AI learns the mask

Here's the result of masking:

runwayml-mask-result.png

The identification of myself and the board is pretty accurate. More impressively, as shown below, it even identifies the side of the board when I fell off it.

runwayml-mask-result-2.png

Step 3: GEN-2: text-to-video

The prompt:

a big ocean wave good for surf. sunset time. cinematic, film, moody, high resolution

Here's the result:

runwayml-gen-2-text-to-video.png

Step 4: Replace background

Since the mask is identified and the main subject is extracted in Step 2, replacing the background takes just a few clicks: simply drag and drop the generated background video from the previous step below the main video track.

runwayml-mask-replace-background.png
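Conceptually, what happens in Step 4 is per-pixel alpha compositing: wherever the mask marks the subject, keep the original frame; everywhere else, show the generated background. Here is a minimal NumPy sketch of that idea, using tiny synthetic frames and a hypothetical binary mask -- not Runway's actual implementation:

```python
import numpy as np

def composite(foreground, background, mask):
    """Alpha-composite two frames: keep foreground where mask is 1,
    background where mask is 0. Frames are HxWx3 uint8; mask is HxW in [0, 1]."""
    alpha = mask[..., None]  # add a channel axis so the mask broadcasts over RGB
    out = alpha * foreground + (1.0 - alpha) * background
    return out.astype(np.uint8)

# Synthetic 4x4 example: a white "subject" frame, a blue "ocean" background,
# and a mask that places the subject in the center 2x2 region.
fg = np.full((4, 4, 3), 255, dtype=np.uint8)
bg = np.zeros((4, 4, 3), dtype=np.uint8)
bg[..., 2] = 200          # blue channel for the replacement background
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0      # the "subject" region learned in Step 2

frame = composite(fg, bg, mask)
```

In a real video pipeline this runs once per frame, with the AI-learned mask tracking the subject over time; soft mask values between 0 and 1 blend the edges smoothly.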

Thoughts

It feels like going back to the year 2000... I did not know how to write computer programs at that time. I did not know that text editing had once been as hard as typing commands (as in the editor Vim), or that organizing files had once meant typing commands (cd, mkdir, mv, ...). For my generation, text editing with Microsoft Word and file management via Windows were as natural as breathing. As a novice user, I had the same level of efficiency as people who had been professionally trained.

I did not realize how uncomfortable people on the other side might have been.

Then came 2015, when data visualization took off in journalism. A new generation of journalists without formal computer science training were able to pick up JavaScript and Python and build fancy projects.

I suddenly realized that I was on the uncomfortable side.

I love algorithm design and system design. I enjoy creating useful software, whether for money or for fun.

I was part of the evangelism effort to bring the tools we were familiar with to people in other domains. It was a fun process, but it also repeatedly reminded me of the question: what shall we do when newcomers who never learned our old crafts (e.g. algorithms) can deliver the same value as us?

Fast forward to now -- although LLMs were popularized earlier this year by ChatGPT, many other GenAI solutions have existed for longer (like Midjourney and Runway). People trained in Photoshop/Illustrator/After Effects are probably very uncomfortable now. I missed the previous decade of picking up any of those skills, yet I can now produce something at 80% of the quality.

People who were among the first to test out the new AI-based tools have probably already tackled a niche area and delivered some business value.

In the post Geek Efficiency Curve Updated, we proposed a framework to identify the work that could benefit from GenAI. The key questions are: how long does it take for everyone else to catch up, and what would be a more durable competitive edge?

Support HU, Pili by becoming a sponsor. Any amount is appreciated!