All 8 entries tagged Imagemagick


December 05, 2019

Scripting discovery of random YouTube videos

(I just discovered this post as an unpublished draft from about three years ago. No idea why I didn't publish it. The method described in it still works. All the code is bash script. There's a comment in the last bit of code, "needs more refining", which I leave as an exercise for those so inclined.)

Do you now, or have you ever wanted to, script discovery of random YouTube videos? I did recently and couldn't find anything useful online. So I made up my own method.

If you're thinking YouTube videos are identified by 11 character strings so you can generate a random 11 character string and use that, you're not technically wrong, but it's not the way to go about it. As a test I generated 1000 and none of them were valid. This isn't at all surprising given how many possible values those 11 characters provide. In my observation, each character can be a lowercase or uppercase letter, a number, or a -. That's 63 possible characters. A calculator tells me that 63^11 is roughly 62050608388552830000. (If you want to say that out loud, say "sixty-two quintillion", then mumble a bit.) (Underscores turn up in ids as well, which makes it 64 possible characters and the number bigger still.) What works better is searching YouTube for a short random string and picking one of the video ids in the results at random:

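function getVideoID  {
   local id="" query="";
   while [ -z "${id}" ];do
      # search for a random 4 character string, then pick one of the video ids in the results at random
      query=$(tr -dc 'A-Za-z0-9-' < /dev/urandom | head -c4);
      id=$(curl -s "https://www.youtube.com/results?search_query=${query}" | grep -o 'watch?v=[a-zA-Z0-9_-]\{11\}' | sort -u | shuf -n 1);
   done
   # strip the leading watch?v= to leave just the id
   echo "${id/watch?v=/}";
}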

That gets you a valid id, such as dQw4w9WgXcQ. If you discover videos entirely at random some of what you find will be NSFW. Really. It will be. The method I use to filter out NSFW content uses youtube-dl:

function getVideoUrl  {
   local url="";
   url=$(./youtube-dl --age-limit 0 --get-url "${1}");
   echo "${url}";
}

videoID=$(getVideoID);

videoUrl=$(getVideoUrl "${videoID}");

If ${videoUrl} is not zero length then, in my experience at least, the video is SFW and its value is a URL for the raw video, which could be used as input for ffmpeg or whatever. (To emphasise, it is *my experience* that this method filters out NSFW content.) If you just want to download the whole video, youtube-dl can do that for you. (youtube-dl will find the highest quality version of the video by default. You may want to change that depending on your available bandwidth or what you intend to do with the video.)
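A minimal sketch of putting the two functions together, assuming getVideoID and getVideoUrl are defined as above. The name findSafeVideo and the cap on the number of attempts are my own additions for illustration, not part of the original script.

```shell
# keep drawing random ids until youtube-dl hands back a url for one of them
# maxTries (default 10) is an arbitrary cap so the loop can't run forever
function findSafeVideo {
   local maxTries="${1:-10}" id="" url="" i;
   for (( i = 0; i < maxTries; i++ ));do
      id=$(getVideoID);
      url=$(getVideoUrl "${id}");
      if [ -n "${url}" ];then
         # first line is the id, whatever follows is the output of youtube-dl
         printf '%s\n%s\n' "${id}" "${url}";
         return 0;
      fi
   done
   return 1;
}
```

Called as findSafeVideo 5 it gives up after five ids that youtube-dl returned nothing for, and the return value says whether anything was found.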

Some videos on YouTube have a video component that is just a static image. E.g. someone's ripped an album and then combined the audio with the album cover art to create something that can be uploaded to YouTube. Such videos are visually uninteresting and maybe you want to identify those videos and discard them rather than use them in whatever it is you're doing that involves random YouTube videos. I did, so I worked out a way of doing that too. The method I've used is to generate a bunch of images from the video, then compare them in a way which gets a value that represents how much the images differ. If that value is less than a certain value, discard the video. I've used GraphicsMagick for comparing the images. ImageMagick can be used too but is slower. (The less powerful your hardware, the bigger the speed difference is. ImageMagick output is slightly different to GraphicsMagick so you can't just remove the "gm"; the awk and cut arguments would need changing.) To extract the images you obviously first have to download the video. In the code below the downloaded video is theVideo.mp4

# generate an image at 2 second intervals
ffmpeg -loglevel fatal -i theVideo.mp4 -vf fps=1/2 -y foo__%02d.jpg

if [ $? -eq 0 ];then

  # get an integer value that represents how different the images all are to each other
  v=$(gm compare -metric MAE foo__*.jpg null:-  | grep Total | awk '{print $2}' | cut -d . -f 2);

  # **** needs more refining: 018 and 019 are OK, 010 is not OK, maybe test 3rd char too
  if [ -n "${v}" ] && [ "${v:0:2}" != "00" ] && [ "${v:0:2}" != "01" ];then
    # the video isn't a static image
    # do whatever it is you want to do with it
    :
  fi

fi

I arrived at discarding videos where the first two characters of v are 00 or 01 after calculating v for a bunch of videos.
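One way of tackling the "needs more refining" note might be to compare the first three digits of v as a number rather than testing characters one at a time. This is only a sketch: the function name is made up, and the threshold of 18 is a guess taken from the 018/019/010 values in the note, not something that has been tested against real videos.

```shell
# v is the digit string after the decimal point, as produced by the cut above
# returns 0 (true) if the first three digits are at or above the threshold
function looksLikeMovingVideo {
   local v="${1}" threshold="${2:-18}";
   # need at least three leading digits to compare
   [[ "${v}" =~ ^[0-9]{3} ]] || return 1;
   # 10# forces base 10 so leading zeros aren't treated as octal
   (( 10#${v:0:3} >= threshold ));
}
```

So looksLikeMovingVideo "${v}" would replace the three string tests in the if statement, and a different threshold can be passed as a second argument.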


October 05, 2014

Matlab Log File 'Art'

Sometimes I like to create an image from a dataset, just because. (Previously http://blogs.warwick.ac.uk/mikewillis/entry/useless_visualisation_of/) Also a few weeks ago I was looking at Matlab log files a lot (http://blogs.warwick.ac.uk/mikewillis/entry/fun_with_flexlm/). And thus, this:

26th July 2014. Solarized colours.

(Click to embiggen.)

It's generated from Matlab license check outs for a single day. The image consists of 60 concentric circles split in to 24 segments. Each circle represents one minute, the innermost circle being 0 and the outermost 59 minutes past the hour. Each segment represents one hour. Midnight is where 12 would be on a clock face, noon is where 6 would be. A coloured segment indicates that a license was checked out during that hour. The distance from the centre represents the minutes past the hour when the license was checked out. Each segment is drawn with an opacity of 25%. The brighter the segment the more licenses checked out, though the brightness tops out at four licenses. (Finer graduations would mean that segments representing a single license would be really faint.) For example, the image below shows from inner to outer:

  • 1 license checked out at N minutes past midnight then being checked in sometime between 01:00 and 02:00
  • 2 licenses being checked out at N+1 minutes past 01:00 with one checked in again during the same hour and no check in time being found for the other license.
  • 4 licenses being checked out N+2 minutes past 02:00 then checked back in again sometime in the same hour (not necessarily at the same time).

Matlab log file art example image.
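The layout described above boils down to a bit of arithmetic: 24 segments of 15 degrees each and 60 rings of equal width. The following is a sketch of that arithmetic only, not the script that made the images. The function name is hypothetical, and it assumes angles are measured clockwise from the 3 o'clock position, so midnight at the top of the clock face is 270 degrees.

```shell
# print "startAngle endAngle innerRadius outerRadius" for a check out at hour:minute
# size is the width of the square image in pixels (default 10000)
function segmentGeometry {
   # 10# forces base 10 so values like 08 aren't rejected as bad octal
   local hour=$(( 10#${1} )) minute=$(( 10#${2} )) size="${3:-10000}";
   local ringWidth=$(( size / 2 / 60 ));
   # 15 degrees per hour, with midnight at the top of the clock face
   local startAngle=$(( (270 + hour * 15) % 360 ));
   echo "${startAngle} $(( startAngle + 15 )) $(( minute * ringWidth )) $(( (minute + 1) * ringWidth ))";
}
```

For example, segmentGeometry 06 30 gives 0 15 2490 2573, which is the segment starting at 3 o'clock, half way out from the centre.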

There is an element of doubt around tracking when a given license is checked in again. The Matlab license server log does not allocate any sort of identifier to a license check out so it's impossible to definitively identify when it was checked in again. I have taken the check in time to be the first time that a check in by the user@host combination occurs after they checked out a license. A user checking out multiple licenses from a single host could make that assumption incorrect.

The colours are the accent colours of the solarized palette http://ethanschoonover.com/solarized

Here's an image from the same data using colours used by Pirelli to denote the different compounds of their Formula 1 tyres.

26th July 2014. Pirelli Formula 1 tyre compounds colours.

This is with the colours of the RAF roundel. (Things which are round…)

26th July 2014. RAF roundel colours.

I was going to try doing one with University colours. Then I discovered that the Corporate Identity part of the University website, which used to provide details of a colour palette for use in things University related, currently only provides details for a single shade of blue.

The images are generated using a bash script and ImageMagick. The script draws up to 5000 segments at a time. Initially it drew one at a time but it's a lot quicker drawing multiple segments at the same time. 5000 seemed like a nice number that didn't trip that error you get when bash command arguments are too long. It's nowhere near 5000 times quicker to draw segments 5000 at a time. This is due, to some degree I don't care enough to work out even roughly, to the temporary images being stored as mpc (http://www.imagemagick.org/Usage/files/#mpc) on tmpfs, thus minimising I/O overhead. (I (mis)use /dev/shm for this sort of thing since it's already there and usually has enough space.) Images are generated at 10000x10000 then shrunk. This is done to remove small unwanted artefacts which sometimes show up between adjacent segments in the same circle. Like this

Matlab log file art artefact example.

As that example shows, they don't appear consistently and I'm not sure why they do. I can't make them not occur without leaving gaps. If the images are generated at 1000x1000 the artefacts show up. If the images are generated at 10000x10000 the artefacts show up, but conveniently this detail is lost when the image is shrunk to 1000x1000.
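The drawing of segments in batches mentioned earlier could be done along these lines. It's a sketch of the batching only, under the assumption that there is one segment description per line on standard input. inBatches is a made up name, and the command it runs (drawSegments in the usage note) is hypothetical and would have to build the actual ImageMagick command from its arguments.

```shell
# read lines from stdin and run the given command once per batch of lines
# usage: inBatches <batch size> <command> [fixed args...]
function inBatches {
   local size="${1}";
   shift;
   local -a batch=();
   local line;
   while IFS= read -r line;do
      batch+=("${line}");
      if (( ${#batch[@]} >= size ));then
         "$@" "${batch[@]}";
         batch=();
      fi
   done
   # whatever is left over forms a final, smaller batch
   if (( ${#batch[@]} > 0 ));then
      "$@" "${batch[@]}";
   fi
}
```

Used as inBatches 5000 drawSegments < segments.txt, drawSegments would be called with up to 5000 segment descriptions as arguments each time.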

Other examples which I find less aesthetically pleasing than the one linked here can be seen at http://blogs.warwick.ac.uk/mikewillis/gallery/matlab_log_file_art/


January 13, 2013

Useless visualisation of data – logins.

Sometimes I like to see if I can turn a bunch of data in to some sort of image. Not an image that's in any way useful for comprehending the data though. Just because.

Each square is one of 100000 instances of someone logging in to a computer (click to embiggen):

100000 logins

The position on the x axis represents the hour and minute at which the login occurred. Position on y axis is the seconds. The colour of each square is a function of the date, month and the IP address of the computer.

Each square is added to the image individually, placed over the top of whatever is already there. The opacity of each square is 50% and the blending method is ImageMagick's ModulusAdd. (For the very first run I used 'Plus' and was briefly puzzled by the result being all white :) )
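A rough illustration of why 'Plus' ends up white when lots of squares are stacked, while ModulusAdd doesn't. It works on single 8 bit channel values and ignores the 50% opacity, so it's a simplification of what ImageMagick does rather than a reproduction of it. The function names are made up, and it assumes Plus adds and clips at the maximum while ModulusAdd adds and wraps round.

```shell
# add two channel values (0-255), clipping at 255 as a 'Plus' style blend would
function composePlus {
   local sum=$(( ${1} + ${2} ));
   echo $(( sum > 255 ? 255 : sum ));
}

# add two channel values (0-255), wrapping round past 255 as a 'ModulusAdd' style blend would
function composeModulusAdd {
   echo $(( (${1} + ${2}) % 256 ));
}
```

composePlus 200 100 gives 255 and stays there however much more is added, whereas composeModulusAdd 200 100 gives 44.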


March 14, 2010

Image gallery thing using ImageMagick polaroid operation

Writing about web page http://go.warwick.ac.uk/mikewillis/polaroidgallery/gallery.html

I used to tinker with HTML/CSS/Javascript a lot. I've done very little of this over the last [indeterminate period of time]. Then a few days ago whilst tinkering with ImageMagick I had an idea for a potentially pleasing method of presenting photos on a web page. The somewhat rough implementation of the idea is linked above.

All photos are taken from the University Media Library. Display of the large versions of the photos is handled by Lightbox2 (as seems to be very much the fashion these days). Bash script used to generate images and to generate the HTML.

Bad things about the gallery:

  • Because (alpha) transparency is required the thumbnails are png rather than jpeg. This means the file sizes of the thumbnails are rather large at ~60KB each including the overlay image that is used when the mouse is over the image to change the border colour. This isn't that big a problem I guess given the speeds of most people's Internet connection these days, but the part of my mind that remembers when 56k dial up was the norm keeps telling me that each image would take a minimum of ~12 seconds to load over such a connection. There are various programmes that will reduce the file size of a png a bit, e.g. optipng (in Ubuntu and openSUSE repos), and the most effective one I found, pngout. I could only get a reduction of a few percent, though some people claim to have got higher. I guess it depends on the png. I'd already told ImageMagick to strip out profiles when creating them.
  • The area of the page that needs to be clicked to display a particular image does not exactly correspond with the visible outline of the image. The area that needs to be clicked for any given image is rectangular because of course the images themselves are rectangular, but the use of transparency means they don't look rectangular. This could be solved using an imagemap, but it would only work if you collated all the pngs in to one big image and then mapped that. Otherwise parts of an image's map would be obscured by the image to the right. Collating all the images in to one big one would be a bad idea for various reasons (e.g. load time, lack of adaptation to different window widths).


Notable/Nice things about the gallery:

  • The images used to cause the border colour to change when the mouse is over a thumbnail are not merely the same image but with a different border colour. The image is generated twice, once with the white border, once with the other colour. The second image then has most of the actual photo replaced with a transparent rectangle. (Example) This reduces the file size by roughly a third.
  • The colour the border is changed to for a given image is determined by reducing the colours in the image to just one then inverting that colour. I found this gives a more pleasing effect than using the same colour for each photo. It also avoids the need to expend effort on finding a single colour that looks generally pleasing for all photos.
  • Because each photo is an individual element, they automatically flow to fill the available space. (Try resizing the browser window whilst viewing the page.)
  • Because the thumbnail images are pngs with transparency, you can stick any background you want behind them.
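The "inverting that colour" step in the list above is simple enough to sketch in plain bash. This assumes the single colour has already been obtained as a six digit hex value (getting it out of ImageMagick isn't shown), and the function name is made up for illustration.

```shell
# invert an RRGGBB hex colour, with or without a leading #
function invertHexColour {
   local hex="${1#\#}";
   # subtract each channel from 255; 16# tells bash the digits are hexadecimal
   printf '#%02x%02x%02x\n' $(( 255 - 16#${hex:0:2} )) $(( 255 - 16#${hex:2:2} )) $(( 255 - 16#${hex:4:2} ));
}
```

invertHexColour '#336699' prints #cc9966.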



March 01, 2010

Staring at the

Writing about web page http://sohowww.nascom.nasa.gov/data/realtime-images.html

SOHO image wallpaper script grabs one or all of the latest SOHO images once an hour and sets it/them as your GNOME wallpaper. I was going to make it work for Mac OS X as well but by the time I'd found out how to set the relevant properties from a script I couldn't be bothered. (Mac OS X doesn't provide a way to set the background colour or the placement (introduced in Snow Leopard) from a script. You have to alter the plist file and then restart the Dock as described here.)

Example screenshots:

$ ./soho_image_wallpaper

SOHO image wallpaper (all)

$ ./soho_image_wallpaper blue zoom

SOHO image wallpaper (blue zoom)


December 03, 2009

University Advent Calendar as your (though maybe not your) wallpaper.

Writing about web page http://www.warwick.ac.uk/adventcalendar/

For no good reason other than because the idea got stuck in my head, a script (advent wallpaper script) which grabs an image of the University advent calendar, does some stuff to it, sets it as your wallpaper and then adds a launcher to the panel that when clicked shows what's 'behind' today's door, with the icon for the corresponding door as its icon. Assuming you use GNOME for your desktop environment that is. Which statistically speaking, you probably don't.

Example screengrab (to click is to make larger):

Uni Advent Calendar as wallpaper



Requires ImageMagick, gnome-web-photo, curl and Epiphany (the web browser not the holiday). Call from a cron-job or login script or something to automatically update wallpaper should you feel the desire to do so.


November 24, 2009

Fun with xkcd, bash, ImageMagick and GNOME

A script (xkcdannotated) which takes an xkcd comic then annotates it with the title and alt text like so:

xkcd.com - Silent Hammer

Invoked with no arguments it uses the most recent comic. It can also be told to use a specific comic or to use a random comic. Tested on Linux and Mac OS X.

A script (xkcdwallpaper) which calls xkcdannotated then sets the result as your GNOME wallpaper. Arguments as per xkcdannotated. Can also be told to overlay the comic on top of another image and set the result as the wallpaper.


November 16, 2009

Generating Freedesktop.org spec compliant thumbnails

A lot of Linux software that needs to generate thumbnails of an image uses the freedesktop.org thumbnail specification. This is good as it means if one application has already made a thumbnail of an image then another application can make use of it rather than generating its own. I recently found myself looking for a way to generate such thumbnails.

The impetus for this was gnome-appearance-properties. This is the application which allows you to do stuff like change your wallpaper. Sadly as the number of wallpapers available increases its usability decreases to some extent. The reason for this is that the user interface is frozen until all the wallpaper thumbnails are displayed. This in itself isn't too bad, unless you have thousands of wallpapers, but if it's combined with a lack of pre-generated thumbnails the user interface is frozen until all the thumbnails have been generated. This is annoying because if there are a few hundred wallpapers available thumbnail generation can take over thirty seconds even on a decent spec machine. Thirty seconds during which the user is left looking at an unresponsive interface. There is a long-standing bug report regarding this, that I can't currently locate, which makes the very sensible suggestion that the thumbnails should be loaded asynchronously. Hopefully at some point someone will implement that, but in the meantime I found myself wondering whether it was possible to script the generation of thumbnails in advance.

My first thought was ImageMagick and a bash script because I'm already familiar with those. As it turns out ImageMagick comes very, very close to being able to generate such thumbnails using the -thumbnail option. I say close, because whilst it inserts both the MTime and URI information required by the freedesktop.org spec, it generates the URI incorrectly by inserting one too many slashes at the start. It creates

$ convert /usr/share/pixmaps/backgrounds/cosmos/earthrise.jpg -thumbnail 128x foo.png
$ identify -verbose foo.png | grep Thumb::URI
Thumb::URI: file:////usr/share/pixmaps/backgrounds/cosmos/earthrise.jpg

when it needs to be

 Thumb::URI: file:///usr/share/pixmaps/backgrounds/cosmos/earthrise.jpg

At time of writing this is actually fixed, but only in the svn version. If you have 6.5.7-8 or later then you should find it generates the URI properly. If you have an older version you can create the thumbnails like this:

#!/bin/bash
# makethumb - script to generate thumbnails to freedesktop.org spec
# *** Assumes GNU coreutils. ***
file="$1"
saveto=~/.thumbnails/normal
tagfile="/tmp/$(basename "$0")_tags"
mkdir -p "$saveto"
# the thumbnail's file name is the md5sum of the original file's URI
thumbname=$(echo -n "file://$file" | md5sum | cut -d " " -f 1)
mtime=$(date +%s -r "$file")
# tags which get put in front of the MIFF data so they end up in the png
echo "Thumb::URI={file://${file}}" >"$tagfile"
echo "Thumb::MTime={${mtime}}" >>"$tagfile"
convert -resize 128x -strip +profile "*" "$file" MIFF:- | cat "$tagfile" - | convert MIFF:- "PNG:${saveto}/${thumbname}.png"
rm -f "$tagfile"

$ makethumb /usr/share/pixmaps/backgrounds/cosmos/earthrise.jpg

Generating thumbnails this way is quite slow though. Using

$ find  /usr/share/pixmaps/backgrounds/ -type f -exec ~/makethumb {} \;

332 thumbnails took around 1:15 in my tests, though obviously this will vary depending on the spec of the machine. I tried using a variant of the script which generated thumbnails for all the files in a given directory, so only invoking the script once instead of multiple times. There was no significant difference in speed between the two methods though.
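Since the thumbnail's file name is just the md5sum of the file's URI, it's easy to work out where a thumbnail would be saved without generating it, e.g. to skip files which already have one. A sketch, using the same naming as makethumb above. The function names are made up, and like makethumb it doesn't escape characters such as spaces in the URI, which a stricter reading of the spec would require.

```shell
# print the path at which makethumb would save the thumbnail for the given file
function thumbPathFor {
   local file="${1}" hash="";
   hash=$(printf '%s' "file://${file}" | md5sum | cut -d ' ' -f 1);
   echo "${HOME}/.thumbnails/normal/${hash}.png";
}

# succeeds if a thumbnail for the given file already exists
function hasThumb {
   [ -f "$(thumbPathFor "${1}")" ];
}
```

Something like hasThumb "${f}" || ~/makethumb "${f}" in a loop over the files would then only generate the thumbnails that are missing. It doesn't check whether an existing thumbnail is out of date.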

So I started looking for some way to use GNOME's thumbnail generation capabilities. The only example I could find of doing this used Python GTK bindings and was incomplete. I've only ever cobbled together one python script before, (that was also to use GTK bindings), but I managed to put together this

#!/usr/bin/python
# makethumb.py - script to generate thumbnails using GTK bindings

import gnome.ui
import gnomevfs
import time
import sys
import os

file=sys.argv[1]

uri=gnomevfs.get_uri_from_local_path(file)
mime=gnomevfs.get_mime_type(file)
mtime=int(time.strftime("%s",time.localtime(os.path.getmtime(file))))
thumbFactory = gnome.ui.ThumbnailFactory(gnome.ui.THUMBNAIL_SIZE_NORMAL)
if thumbFactory.can_thumbnail(uri ,mime, 0):
    thumbnail=thumbFactory.generate_thumbnail(uri, mime)
    if thumbnail != None:
        thumbFactory.save_thumbnail(thumbnail, uri, mtime)

Using that to generate thumbnails in the same manner shown above for the bash script was about ten seconds faster. However after some experimentation I put together this (updated 15/8/11 to include suggestions from comment 2):

#!/usr/bin/python
# makethumbs.py - generates thumbnails for all files in a directory

import gnome.ui
import gnomevfs
import time
import os

dir="/usr/share/pixmaps/backgrounds/"

thumbFactory = gnome.ui.ThumbnailFactory(gnome.ui.THUMBNAIL_SIZE_NORMAL)

for subdir, dirs, files in os.walk(dir):
    for file in files:
        path = os.path.join(subdir, file)
        uri = gnomevfs.get_uri_from_local_path(path)
        mime=gnomevfs.get_mime_type(subdir+"/"+file)
        mtime = int(os.path.getmtime(path))
        print uri
        print mtime
        if thumbFactory.can_thumbnail(uri ,mime, 0):
            thumbnail=thumbFactory.generate_thumbnail(uri, mime)
            if thumbnail is not None:
                thumbFactory.save_thumbnail(thumbnail, uri, mtime)

I found that generates 332 thumbnails in around 9 seconds. A massive difference to repeatedly invoking the makethumb.py script. I expect there are people who could provide a detailed explanation of why it's so much faster. I am not one of them.

It's also interesting to note that I've found this script generates thumbnails around three times faster than gnome-appearance-properties creates them. Why that is I have no idea. The thumbs that result are not identical. The thumbnails generated by the Python script have the width and height of the original image embedded in them whilst the ones generated by gnome-appearance-properties do not. The ones generated by gnome-appearance-properties have a very slightly larger file size and the Channel Statistics embedded in the thumbnails are different too. However both sets of thumbnails say they were generated by GNOME::ThumbnailFactory.

Interesting as all this is, (to me anyway, if it's not to you then why did you read this far?), it's all about generating thumbnails on a per-user basis. What if it was possible to have a system wide cache of pre-generated thumbnails? E.g. you install a bunch of wallpapers and along with them you can install thumbnails that will be used rather than each user generating their own. The freedesktop.org thumbnail spec does cover this. So I tried creating such thumbnails. gnome-appearance-properties ignored them. When I say ignored them, I don't mean it looked at them and didn't use them, the output of strace indicates that it doesn't even look to see if they exist. Which is a shame.

