All entries for Thursday 09 February 2006
Performance issues
It's not been a fun day. Partly as a result of the massive code changes we've made to blogs recently, we've had some nasty performance problems today.
Hopefully the bugs with the release and the performance issues are now resolved. Unfortunately my task today was made harder by some very bad timing. At around the same time, we had 4 search engines indexing blogs and a huge number of image requests coming from external sites. This made diagnosing the performance problems all the more difficult. As a temporary measure, I've had to block the search engine crawlers and any requests for our images that come from non-Warwick pages.
What this means is that until we lift this restriction (which will hopefully be very soon), you won't be able to see images from your galleries embedded in non-Warwick pages and our latest content won't show up in the search engines quite so quickly.
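To make the restriction concrete, here is a minimal sketch of a referrer check of the kind described above. It is an illustration only, not the rule actually deployed: the function name, the `warwick.ac.uk` suffix and the decision to let through requests with no Referer header are all assumptions.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: pages on this domain may embed gallery images.
ALLOWED_SUFFIX = "warwick.ac.uk"

def is_warwick_referer(referer, allow_empty=True):
    """Decide from the Referer header whether an image request is served."""
    if not referer:
        # Assumption: the header is often missing (direct visits, some
        # proxies), so an empty value is allowed unless told otherwise.
        return allow_empty
    host = (urlparse(referer).hostname or "").lower()
    # Match the domain itself or any subdomain, but not look-alike hosts.
    return host == ALLOWED_SUFFIX or host.endswith("." + ALLOWED_SUFFIX)

print(is_warwick_referer("http://blogs.warwick.ac.uk/someone/entry/"))
print(is_warwick_referer("http://www.myspace.com/profile"))
```

Matching on `"." + ALLOWED_SUFFIX` rather than the bare suffix avoids accepting a host that merely ends in the same letters.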
To give an idea of the size of this problem, here are some stats:
- 160,000 requests for user uploaded images a day
- Only 6% are requested from on campus!
- Only 60% of images are embedded in Warwick pages; the other 40% are linked to from other websites
- Our top external referrers are myspace.com and Google images
- Top 5 image searches in the first few days of Feb: kate beckinsale, arctic monkeys, evolution motorcycle trousers, hell, mafia
- 30% of all image requests are for images from just 10 blogs
- The top 2 individual images alone account for 7% of all image hits (we have almost 80,000 images)
- In the first week or so of Feb we served out 1.3m images, which is 33GB of data
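
As a rough back-of-the-envelope check, the figures in the list can be turned into averages. This is only a sketch: it assumes decimal gigabytes, and it applies the 7% share of the top 2 images to the daily request count, which the stats do not state directly.

```python
# Figures quoted in the list above.
requests_per_day = 160_000
images_served = 1_300_000
bytes_served = 33 * 10**9      # assumption: 1 GB = 10^9 bytes
top_two_share_pct = 7          # assumption: share holds on a typical day

avg_requests_per_second = requests_per_day / (24 * 60 * 60)
avg_bytes_per_image = bytes_served / images_served
top_two_hits_per_day = requests_per_day * top_two_share_pct / 100
days_covered = images_served / requests_per_day

print(f"{avg_requests_per_second:.2f} requests/second on average")
print(f"{avg_bytes_per_image / 1000:.1f} kB per image on average")
print(f"{top_two_hits_per_day:.0f} hits/day for the top 2 images")
print(f"{days_covered:.1f} days to serve 1.3m images at that rate")
```

That works out to roughly 1.85 requests a second and about 25 kB per image on average, with something like 11,200 hits a day going to just two images. Averages hide the peaks, so the load at busy times will be higher than this suggests.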
Now then, as you can imagine, that's a fair few hits. The problem I have with it is that the performance for our staff and students is degraded because of a massive number of external requests. I like the fact that Warwick Blogs rates highly in Google…but with that ranking comes a lot of unwanted traffic.
Obviously we'll try to resolve the performance issues and allow these requests to start flowing again. The problem is that we are not just serving images statically: we are also doing single sign-on checks, permission checks and on-the-fly resizing of images. These are all problems that can be fixed and optimised, but it just goes to show that with systems like this, you never can tell where the bottlenecks will be until you hit them.
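
One possible optimisation, sketched here purely as an illustration and not as the fix actually applied, is to cache resized images so that the expensive step runs once per image and size rather than on every request. Since a handful of images account for a large share of hits, even a small cache could absorb a lot of that work. The function names, the cache size and the stand-in resize step are all hypothetical.

```python
from functools import lru_cache

def expensive_resize(image_id, width):
    """Stand-in for real image resizing; returns a label, not pixels."""
    return f"{image_id}@{width}px"

@lru_cache(maxsize=1024)  # hypothetical size; tune to available memory
def cached_resize(image_id, width):
    # Runs expensive_resize only on a cache miss for this (id, width) pair.
    return expensive_resize(image_id, width)

def serve_image(user_can_view, image_id, width):
    # The permission check still happens on every request;
    # only the resize result is reused.
    if not user_can_view:
        return None
    return cached_resize(image_id, width)

serve_image(True, "img1", 200)
serve_image(True, "img1", 200)
print(cached_resize.cache_info())
```

`cache_info()` reports hits and misses, which gives a quick way to see whether the cache is earning its keep before committing to the approach.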