Didn't think I was going to have time for a post today but it appears I do so I'm going to carry on with the series about making music with the help of technology, including but not limited to AI. Or, in this case, making videos. Mostly by brute force.
I started a few weeks ago but I wasn't starting entirely from scratch. I've made a few videos before. Mostly for game-related stuff to support posts here. I also have many dozens of hours of camcorder footage from holidays, going pretty much back to the '90s, some of which I have occasionally bothered to edit and turn into mini-movies that about two people enjoyed watching, one of those people being me and the other being my mother. And I don't think she was all that interested. Mrs Bhagpuss wisely opted out of most of the viewings, even though she was in them.
I have also, even more occasionally, made videos for songs I liked. Other people's songs, that is. I quite enjoyed it but it seemed like a lot of work so I didn't do it often and I don't think I've done any at all for more than ten years.
In the last month I've spent more than fifty hours making three-minute videos for songs and I believe I can at least claim some improvement, even if actually being "any good" at video-making is still somewhere over the horizon. I am also a lot more motivated than ever before to keep on doing it because it turns out that, like most things, when you have an actual goal in mind, working to achieve it becomes satisfying and even fun.
As soon as I decided I was going to put all the songs I was making onto YouTube I realized there were two things I'd have to do: make a new YouTube channel and make videos for all of them. I wanted them on YouTube because that would be where I'd be most likely to watch them - and it is me I'm mostly making them for. I suffer from Reverse Imposter Syndrome, where I think everything I create is pretty damn good whether it is or not and I rarely tire of reading, listening to or watching my own work.
I also wanted them on a channel that wasn't already cluttered up with a load of other nonsense, just in case I did eventually decide to make them public. Still thinking about that.
Obviously, I didn't have to make videos at all. I could just have posted the songs with static images. If I'd done that I'd be finished already.
Millions of people do it that way. It's perfectly acceptable. As a viewer, though, I dislike it intensely. I've apologized plenty of times in music posts here for linking to songs on YouTube that have no video. It just seems rude, somehow. It's orders of magnitude more likely I'll listen to a song that also has a video than one with just a picture so I feel obligated to extend that courtesy to others, if I ever do decide to open them up to the world. Unless it's one of those bloody videos of a turntable going round and round. Those are worse than no video at all.
Once I'd decided to make the videos, it was pretty obvious they were going to be lyric videos. Since there's no actual performer, these being songs that have been brought to life by machines, obviously there weren't going to be any performances to share. I certainly wasn't going to film myself, as a sixty-six-year-old man, miming to the vocals of a twenty-something woman, not even one who only existed because I'd just made her up.
All the songs have female vocals, by the way. Well, almost all. There are a couple that don't, yet, although they may not make it out of production that way. Given my near-inability to play male characters any more and now this strongly negative reaction to hearing my lyrics sung by a male voice, I have to wonder, sometimes. Then again, I do have a fairly strong and well-established preference for the female voice over the male, especially in popular song, so maybe that's all it is: aesthetics.
I like lyric videos, anyway. I quite often prefer them to the "official" videos, which all too often have the whiff of am-dram about them, along with far too many rubber masks, animal costumes and food fights.
It's useful to be able to follow along with the lyrics as you listen, too. That's how we did it in the olden days, when we used to sit cross-legged with a gate-fold album cover open on our laps, squinting at the badly-printed words. It encourages you to sing along, something I do more often than you might imagine. Or possibly less often, if you're a regular reader of this blog and already have your own ideas about the kinds of things I might do...
With the necessity of videos established, the next question was what would they be videos of? They had to be of something. Lyric videos that consist of nothing but words scrolling across the screen do exist but in my opinion they probably shouldn't.
Luckily, I immediately realized I was sitting on the ideal resource: all those countless hours of holiday video, much of it digitized and some already tucked away on the hard drive of the very PC I was sitting at.
It would be totally unreasonable to expect anyone to watch straight extracts from my tedious home movies, especially given that what I mostly like to point the camera at when I'm away are very large, very old buildings, all of which tend to look much the same. In between all those castles and churches, though, are fragments of all kinds of things that just happened to take my fancy at the time. And, of course, a lot of weather.
Running the lyrics over a backdrop of blue skies, clouds, sparkling water and sunsets seemed like it might work. At least it would be pretty. Better still, as I mentioned in a previous post, an awful lot of my songs seem to be partly or even mostly concerned with the weather, so it might even be appropriate.
As it turned out, though, the weather condition I reference most often as a lyricist is undoubtedly rain and if there's one thing I don't generally like to film when I'm on holiday it's rain. I kind of want to come back with the impression the sun never stopped shining. Consequently, I have almost no footage of rain, other than a couple of torrential downpours and a thunderstorm or two, where things seemed spectacular enough to be worth recording for posterity.
Still, no-one said the interpretations had to be literal. That often looks labored and anyway, as I found very quickly, trying to fit pictures to words in anything other than the loosest fashion is bloody hard work. So mostly I haven't bothered with anything more than the odd, felicitous nod to what the songs might be about. Always assuming I know what that is. Which, after forty years, I quite frequently do not.
Having decided to do it, the next question was how. In the past, I've only really ever used the basic Movie Maker software that Microsoft used to include with Windows. They have, apparently, discontinued it now, although it's still on my PC so it must have still been there in Windows 10. Or maybe I had to download it separately. Anyway, I still have it.
The first three or four videos I made for my new-old songs were done entirely with Movie Maker and they're okay as they go but it became apparent pretty quickly that I'd already used up most of the possibilities and I still had more than two dozen to do. They were all going to look pretty samey unless I came up with a better idea.
That led me to start googling for alternatives, of which there are many. I wasn't planning on spending any money so I was limited to the free apps. I also dislike using free trials so that narrowed it down some more.
By far the most widely-recommended free video-editing app seemed to be ShotCut. I downloaded that and had a play around with it.
It's great for fucking up images, which is one of the main things I wanted it for. I absolutely did not want my videos to look like generic camcorder holiday snaps, that being exactly what they are, so I needed to mess them up a little, in the way everyone does when they upload stuff they shot on their phones to social media.
After some trial and error I decided my go-tos would be the filters labeled "Old Film". There are half a dozen of those and they all have their merits but I particularly like the Technicolor filter, even if it's not always extreme enough for me. I sometimes use the Saturation and Vibrance ones as well and in extreme cases the RGB Shift. That really makes it look like something I would have loved in the early seventies, when I was young and had no discernment. Not that I have all that much now...
I found the filters very easy to work with but Key Frames needed more thought. And effort. And patience. As for the subtitles, they didn't seem either flexible or intuitive. I was finding the learning curve needed to get the most out of ShotCut a little daunting so I started looking at easier options. Or maybe just some specialized apps to do specific things.
The app most widely suggested for adding titles and captions was CapCut. I installed it and found it excellent - until I tried to download what I'd done. CapCut likes to say it has a free version and it does but it also has an extremely annoying and quite clever way of getting you to buy the Pro version instead: it lets you use the Pro features for free, then asks you to pay for them as soon as you try to download the video you just made. That's really working the sunk cost levers.
Luckily, there's a workaround for that. It transpires that CapCut used to be far more generous with what it allowed free users to get away with and it's still possible to install older versions of the software that stick to those rules. You do have to be constantly on the lookout for annoying Upgrade pop-ups because once you have your nice, free 2024 edition, the last thing you want to do is accidentally swap it out for the current model. I've already had to uninstall and re-install the damn thing twice through not paying enough attention to what I was clicking on.
CapCut is very good indeed for positioning, timing and stylizing captions or titles but it is very bad at using AI to interpret what a singer is singing. Useless, in fact, although it claims otherwise.
You don't, of course, need AI to add the lyrics - you can do it perfectly well manually - but it saves a lot of time if you can get an AI to do it, especially if it also works out all the timings for you. I went shopping for one that could do both and I found one: Microsoft's very own Clipchamp.
Clipchamp has an embedded AI because of course it does - everything Microsoft do has to be AI-enabled these days. This one purports to be able to produce the full lyrics from a video and turn them into subtitles. And it sort of does.
There are two problems, other than the inevitable mishearings: firstly, when Microsoft say subtitles they mean subtitles. The app is meant for captioning podcasts or spoken-word videos, I believe, and it likes to put the words along the bottom of the screen, where such things are supposed to go.
Music videos with the lyrics running along the bottom like a TED Talk look really stupid. Ugly. Just horrible. Even Movie Maker doesn't limit you to that. Clipchamp probably doesn't either, if you know how to use it, but I haven't tried to find out because CapCut lets you put text anywhere and also make it do things like dissolve or fly off the screen. And I already knew how to use CapCut.
After a few frustrating attempts to get something half-way decent-looking out of Clipchamp, I figured out how to export the Clipchamp AI-generated lyrics in SRT format and upload the file into CapCut. Problem solved.
Well, partly. The other thing Clipchamp likes to do is skip verses, usually at the start. I have no clue why. It doesn't always do it. It doesn't even mostly do it or I wouldn't bother using it at all. Mostly it gets the whole song right but sometimes it just... doesn't. And as far as I can tell, if it decides to start transcribing only after thirty seconds once, it will always start transcribing thirty seconds in on that particular video. Even if you re-upload it under a different name. Very annoying.
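Since the dropped verses usually go missing at the start, one rough way to spot the problem early would be to look at when the first subtitle begins in the exported SRT file. Here's a minimal sketch in Python, assuming the standard SRT layout where each timing line looks like `00:00:31,500 --> 00:00:34,000`. The function names and the twenty-second threshold are made up for illustration, and a late first line might just mean a long intro, so treat it as a hint rather than a verdict.

```python
import re
from typing import Optional

# Matches an SRT timestamp such as 00:00:31,500 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def first_cue_start(srt_text: str) -> Optional[float]:
    """Return the start time in seconds of the first cue, or None if no cue is found."""
    for line in srt_text.splitlines():
        if "-->" in line:
            match = TIMESTAMP.search(line)  # first timestamp on the line is the start
            if match:
                hours, minutes, seconds, millis = (int(p) for p in match.groups())
                return hours * 3600 + minutes * 60 + seconds + millis / 1000
    return None


def starts_late(srt_text: str, threshold_seconds: float = 20.0) -> bool:
    """Hypothetical check: True if the first lyric appears after the threshold."""
    start = first_cue_start(srt_text)
    return start is not None and start > threshold_seconds
```

Reading the exported file into a string and passing it to `starts_late` would flag a transcription that only kicks in half a minute into the song.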
Nothing a bit of typing and tweaking in CapCut can't fix, though. And honestly there's a lot of that going to be needed anyway. By the time you've corrected the bits the AI misheard and changed the font and moved the positioning around and stretched some bits and shrunk others and changed the phrasing and taken out all the punctuation the AI thought ought to have been in there, you sometimes wonder if it mightn't have been easier to type the whole thing in yourself.
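The punctuation-stripping part, at least, is the kind of thing a short script could do to the SRT file before it goes into CapCut. This is a minimal sketch in Python, assuming a standard SRT layout (a cue number, a timing line containing `-->`, then the text). The function name `strip_srt_punctuation` is made up for illustration and none of this is part of the workflow described here.

```python
import re

# Punctuation to remove from lyric text. Apostrophes and hyphens are left
# alone so words like "don't" and "half-way" survive intact.
PUNCTUATION = re.compile(r"[.,;:!?\"“”…]")


def strip_srt_punctuation(srt_text: str) -> str:
    """Remove punctuation from subtitle text, leaving cue numbers and timings untouched."""
    cleaned = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        if stripped.isdigit() or "-->" in stripped:
            # Cue numbers and timing lines pass through unchanged.
            cleaned.append(line)
        else:
            cleaned.append(PUNCTUATION.sub("", line))
    return "\n".join(cleaned)
```

It only handles the punctuation. The mishearings, the phrasing and the positioning would still need doing by hand.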
So far I've made fifteen videos and I don't believe a single one of them has taken me less than three hours. Several took most of a day. For a three minute song. Actually more like two-and-a-half minutes in most cases. It's a labor-intensive process, even with all the labor-saving shortcuts.
First I have to flick through the source material to find a few seconds here, a few seconds there, tiny bits I can use. Then I have to import those into Movie Maker, line them up, change the speeds, stitch them together to get a rough cut the same length as the song. Then it's into ShotCut to rough up the rough cut some more before taking it to Clipchamp to get the lyrics. Finally it's off to CapCut to finish the whole thing off.
Usually at least one of those stages goes wrong, somehow, and has to be adjusted or even redone. Sometimes the whole thing just doesn't come together the way I was imagining and it's back to the drawing board. As yet I haven't had to completely abandon anything, just move the parts around, but even that takes a good, long time.
And even when it works, sometimes it still doesn't. Last night I spent the best part of two hours just getting the timings of a bunch of transitions exactly right in Movie Maker. When I was happy they were spot-on, I exported the project to an MP4 file and uploaded it to ShotCut only to find half the timings were a second or two adrift, as you can clearly see in the above video. Apparently it's a known bug but since the app is no longer supported by Microsoft no-one's going to fix it.
That is the first and only time it's happened so fingers crossed it won't trouble me again. Anyway, it had the serendipitous effect of making me realize I didn't have to make these things perfect, as if I even could. They're backdrops for the words and the music, not works of art in their own right. So as long as there aren't any spelling mistakes it doesn't really matter if the color change is a few seconds behind the beat, does it?
Well, probably not, but you like to do your best work at all times, don't you? And the whole process is extremely involving, addictive and enormously entertaining, so the temptation to keep at it until it's as good as it can possibly be is high. Video editing also has the merit of being a potentially useful skill, so even if it takes up a huge amount of my free time, that's arguably time well-spent. Certainly better-spent than it would have been playing video games, anyway, which is what I would be doing otherwise.
Whether I'll ever reach the point where I even feel subjectively good at making three-minute music videos is very far from certain. I think it's a safe bet I'm never going to be objectively good at it. I am, at least, better at it now than I was a month ago, though, so that's something.
I have another nineteen songs to make videos for and then, if I want to carry on after that, I'll either have to write some new ones or do ones of other people's stuff. I wouldn't rule either out. I re-tuned my guitar a couple of days ago. First time I've picked it up to do anything other than dust around it since 1994.
I mostly did it because I discovered there's an app for tuning now. And it works. And it's free. I used to really hate tuning. And I was really bad at it. Now both my guitars are in tune. And they're both completely unplayable because they both have actions like cheese-graters and my fingers need six months' hardening-up even to hold down the strings.
I'm thinking of buying a new guitar with an easier action. And an amp. Geez. What have I started?