Monday, August 12, 2024

Into The Mirror-World

I just wanted to give a brief update to last week's post about backing up a blog. Right at the very end of the editing process, after I'd finished writing the thing and was just reading it back for a final time, it began to come back to me that I might have written something on the subject once before.

I checked the insanely long tag tail at the end of the blog, a growing problem about which I really ought to do something, if only I had any idea what, and found I had indeed created a tag for "back-up". It had just a single entry, which turned out to be a post I'd written all the way back in 2012, for the first Newbie Blogger Initiative.

The NBI, as a few people will remember, was something started by Syp of Bio Break and Massively (As it was then) to encourage would-be bloggers to give it a try. The idea was to provide a platform to kick off from and also to offer advice and support to those who were just starting. To that end, Syp invited a number of established bloggers to become Mentors for the event and even though in 2012 I'd only been at the blogging game for less than a year, one of the people he invited was me.

For reasons best known to my 2012 self, I chose to make this public in a post written in the style of Damon Runyon. A good piece of advice to would-be bloggers might be not to do that sort of thing.

Despite the misplaced sense of play, I took my new responsibilities quite seriously even though, unsurprisingly, I had very little in the way of advice to offer that other, more experienced bloggers weren't already providing. Scratching around for something to say that hadn't been said already, I hit on the idea of backing up your blog, a topic that was very fresh in my mind following an incident where Blogger had decided out of the blue to remove my own blog from the platform due to "unauthorized activity".


Obviously, since I'm here talking about it now, I got my blog back although it was quite a traumatic episode at the time. I never did find out what the supposed "unauthorized activity" was and (Fingers crossed, touch wood.) there's been no repeat of the situation since, so presumably whatever I was doing that upset Google, I haven't done it again. 

At the time, though, I was acutely aware of the possibility of a recurrence so I looked around for an insurance policy, using Google Search to do it, naturally. Oh, the painful irony.

My insurance policy turned out to be what I guess we'd now call an app by the name of HTTrack, which describes itself as a "website copier" and "free software offline browser". 

The detailed description tells us that 

"It allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure."
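For anyone wondering what keeping the "relative link-structure" means in practice, here is a minimal sketch in Python. It is not HTTrack's own code, just an illustration of the idea: a link between two pages on the same site is rewritten as a path relative to the page it sits on, so it still works when the files are opened from a local folder. The function name `relative_link` and the example.blogspot.com addresses are made up for the example.

```python
import posixpath
from urllib.parse import urlsplit


def relative_link(from_url: str, to_url: str) -> str:
    """Return the relative path from one mirrored page to another.

    Assumes both URLs live on the same host; only the path part is used.
    This is an illustration of the idea, not how HTTrack implements it.
    """
    # Folder that contains the page holding the link, e.g. "/2012/05"
    from_dir = posixpath.dirname(urlsplit(from_url).path)
    # Path of the page being linked to, e.g. "/2011/12/older-post.html"
    to_path = urlsplit(to_url).path
    return posixpath.relpath(to_path, start=from_dir)


# Hypothetical URLs, purely for demonstration
page = "https://example.blogspot.com/2012/05/some-post.html"
target = "https://example.blogspot.com/2011/12/older-post.html"
print(relative_link(page, target))  # ../../2011/12/older-post.html
```

Because the rewritten link no longer mentions the host at all, the mirrored folder can be moved to another drive without any of the internal links breaking.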

I'd used HTTrack back in 2012 to mirror the whole of Inventory Full, all eight or nine months of it. When I checked last week, the files were all still there on my SSD, which I'd cloned from an older HDD when I installed it a year or so back.

On investigation it appeared I'd also made one further back-up a year or two later and then forgotten all about it. That made for an interesting curio but it wasn't going to be much help with the decade of posts that came after.

It seemed unlikely the website would still be there and even less likely that the program would still work. Almost any utility or service I used back in 2012 has either changed out of all recognition or vanished altogether. Still, I thought I might as well take a look.

As you'll see if you click through the link, the HTTrack website is still up and you can still download the software. What's more, it still works! The last update looks to have been around seven years ago but the functionality remains fully intact.

I know that for certain because I now have a complete copy of this blog including every image, link and comment, safely tucked away on an internal drive, with a back-up to the back-up on an external SSD. It runs to just under 30GB and the download and processing took close to a day and a half. I had to pause it overnight, twice, but it completed flawlessly.

According to the website "HTTrack can also update an existing mirrored site", so next time I need to do this I should only have to ask it to add whatever I've posted since. Now I just need to remember to update the mirror more often than once a decade.
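For anyone who would rather script the job than click through the program's wizard, HTTrack also has a command-line version. The sketch below only builds the command; it does not run anything. It assumes the `httrack` executable is installed and on the PATH, and it relies on two options as described in the HTTrack manual: `-O` to set the mirror folder and `--update` to refresh an existing mirror. The blog address and the folder are placeholders, and `httrack_command` is a made-up helper name.

```python
import shlex


def httrack_command(site_url: str, mirror_dir: str, update: bool = False) -> list[str]:
    """Build (but do not run) an HTTrack command line.

    site_url   -- address of the site to mirror (placeholder in the demo)
    mirror_dir -- local folder that holds the mirror and HTTrack's cache
    update     -- False for a first full mirror, True to refresh an old one
    """
    # -O tells HTTrack where to put the mirror and its log/cache files
    cmd = ["httrack", site_url, "-O", mirror_dir]
    if update:
        # --update asks HTTrack to refresh the existing mirror in that folder
        cmd.append("--update")
    return cmd


# Hypothetical values, purely for demonstration
first_run = httrack_command("https://example.blogspot.com/", "/backups/blog-mirror")
refresh = httrack_command("https://example.blogspot.com/", "/backups/blog-mirror", update=True)

# Print the commands in a form that could be pasted into a terminal
print(shlex.join(first_run))
print(shlex.join(refresh))
```

To run one of these from Python rather than pasting it into a terminal, passing the list to `subprocess.run(cmd, check=True)` should do it. Check the options against the manual for your installed version first, and expect a first run on a large blog to take many hours.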

The most interesting thing about all of this, from a personal perspective, is that, while having the entire blog available offline in its original format gives me a great sense of security and satisfaction, I find I still much prefer to browse my back pages as the PDFs I was talking about last time. 

In those, the layouts are all over the place, with great swathes of white space and gaps everywhere. The entire look and feel of the blog, on which I spend so much time and effort, is totally screwed up. Nothing is where it should be and everything looks different. I would never design a blog that looks anything like it.

But it reads like a book, not a website, and that makes going back and dipping into it much more appealing. I'm no more likely to go back through the mirror of my blog and read it for pleasure than I ever have been with the online original, which is barely at all, but I've already dotted about in the PDF, picking out pages here and there and reading them for pleasure, as if they'd been written by someone else. I'd still like to have a representative sample bound in an actual paper book some day.

The thrust of much of the conversation around blog back-ups last week revolved not around aesthetics but practicalities. The main concern seemed to be for moving or restoring the blog as a fully-functioning online entity, particularly in respect of keeping the HTML and not having to re-do it all from some other format. I don't know if having the entire thing stored, as-is, offline makes that any easier but I'd be interested to know, if anyone with the technical expertise can elucidate.

It does certainly cover the whole "saving my genius for posterity" aspect, which also came up as a worry, though. For that, HTTrack has you covered. 

Whether posterity will care enough to be glad you made the effort is another question but since the option exists, you might as well take it as not. I have and I recommend it to you all.

If nothing else, you'll have something to look at when the internet goes out.

 

 

Passive-Aggressive Notes On AI Used In This Post

Given current levels of paranoia regarding AI, I thought I might re-institute my lapsed policy of foot-noting any non-human creativity or assistance used in the making of each post. I did kinda think we were over anyone caring about it but it seems not. Plus I always enjoy doing these notes and I kinda miss them.

It is, as I'm sure won't surprise anyone who's had to do it, quite a challenge coming up with illustrations for a blog post about the technicalities of blogging. Screen grabs of UIs and clips from websites are a) almost certainly more legally grey than AI images and b) really dull to look at.

What I like to do instead is to cut and paste a line or a phrase out of the text of the blog itself and hand that off to one of the image generators to see what comes out. The results vary wildly depending on which model is taking a shot at it. Sometimes I get something I can use on the first try, other times it takes a while. 

For the picture at the head of this post I used SDXL Lightning on NightCafe. I left it at the default settings except for changing the format to 4:3 for the second attempt. 

The prompt I used was "download a World Wide Web site from the Internet to a local directory, building recursively all directories".

I was very curious to see what the AI would make of something so dry and non-visual but I ought to have guessed. Both images generated were glamorized visions of cybernauts hacking away in their techno-dens. I imagine there's a lot of stuff like that in the corpus.

I liked the first image but it was the wrong shape and on closer examination the hacker turned out to have three hands. Or maybe she has a three-fingered associate hiding under her desk. I've included it here, so you can take your best guess.

I changed the format and ran the identical prompt again to get the image I used, which does actually look vaguely like me at a certain age. I certainly had that hair-cut once. And those glasses. And that's my nose, give or take. Why he's sitting at that angle to the squodgy keyboards I have no idea but I'm guessing it's like that post-Tarantino fad for holding guns sideways. It just looks cool.

At least he has the right number of arms anyway. And fingers for that matter. Good job SDXL. Have a cyber-cookie!

3 comments:

  1. Web site / blog backup is an 'evergreen' topic. I like that you touched on it over a decade ago, and the approach of 'site crawling' to generate a backup is interesting.

    Thanks to the discussion this #blaugust I too started revisiting my backup strategy. I'm missing the 'offsite storage' part. I first tried Updraft's S3 vault thingy, but my backup storage requirements are trending towards $200+ a year. So more thinking there to come.

    As for the AI image generation for post 'enhancement': personally, I think that's a perfectly valid, reasonable, and fair usage of AI. Of course, I do it myself from time to time, so I'm biased ;) I actually prefer the three handed cyber hacker: I find those weird AI glitches somewhat endearing.

    1. I've been through a cycle of finding the warped, weird anatomy in AI images highly amusing to finding it annoying to feeling nostalgic about it now it happens so rarely. It was actually quite a surprise to see that third hand appear. It was probably because I used one of the cheaper models. You do get what you pay for, although in my case it's not real money - I have well over a thousand free credits saved on that site just from the ones they give away daily.

  2. You already know about this, yeah—?

    https://web.archive.org/web/20240000000000*/http://bhagpuss.blogspot.com/

    — 7rlsy

