All 8 entries tagged Computer


March 02, 2008

Bazaar Sprint

Canonical and the Bazaar guys have been kind enough to invite me to the Bazaar sprint happening in London this week. So I head off tomorrow morning, and should be getting back Saturday afternoon.

I should still be contactable by phone, email and/or IRC if I’m needed especially urgently for anything.


February 28, 2008

Lazyweb: mplayer's OGG support is not good

So I’ve been watching some of the talks from Linux.Conf.Au (to tide me over until the FOSDEM talks become available), which are all in the Ogg/Theora format.

My video player of choice is mplayer. However, its Ogg/Theora support seems to be awful0 (skipping backward often breaks playback, and skipping forward occasionally does too). Lazyweb, my question is this: is there some way to fix this, or is it a problem upstream?

[Footnote 0: Not good, Meg, not good.]


Free Software vs. Open Hardware

Writing about web page http://rowetel.com/ucasterisk/

I watched the video of a talk from Linux.Conf.Au yesterday, “How To Build An Embedded Asterisk IP-PBX”.

Within this, David Rowe talks about how he became interested in starting such a project, how it was realised, and what the future plans are (all of which was very interesting). The IP04 is the primary product produced thus far, and it is based entirely on open hardware (much of it designed by Rowe himself). What was most interesting for me was the motivation David gave for going with open hardware: he wants to drive the market price of VoIP hardware down.

Coming from someone who was talking a lot about liking FOSS (though using the O more than the F), this seems like an unusually capitalist argument. The economic argument for it is obvious: if I can design my hardware for free (by using the open hardware designs) then I can still make a decent profit while massively undercutting any of my competitors.

From the limited results seen so far (production of the open hardware is still being ramped up), this model works for hardware. So why is it that we don’t see the same results with Free Software? Is it because the economic model for open hardware is massively different from that for Free Software? I don’t believe so.

I believe it is because the markets in which the vast majority of Free Software competes are much broader than the market in which the IP04 and its forthcoming friends compete. The open hardware, in this case, has a very specific purpose: it is meant to connect phone calls (and, in fact, Asterisk, on which it is based, is one of the more successful Free Software projects in commercial terms). Free Software, however, rarely strives merely to replace proprietary software but instead tries to improve on it.

Improvement obviously requires change. Once the Free Software has changed from what it was originally intended to replace, it is no longer a direct competitor. It may fulfil all of the functions that are really important to certain applications of it (normally those that the developers, be they paid or otherwise, are most interested in) but inevitably supports some use cases of the original in a worse manner0.

And, of course, a lot of Free Software was never written to replace proprietary software (e.g. Rhythmbox was intended to be a media player, not necessarily a direct replacement for Windows Media Centre), which means it has even less common ground to compete on. In fact, projects that started like this often require a complete paradigm shift, which means that the differing parties end up arguing at complete cross-purposes.

I’m not sure how to conclude this post, other than to suggest that Free Software projects that aim to replace a proprietary project tend to do better, within traditionally proprietary markets, than those that attempt to truly innovate. How does this reflect on what projects individuals choose to start and what projects companies who are competing in those markets choose to contribute to?

[Footnote 0: This, naturally, leads to the problems with benchmarking competing software products: each camp chooses the 10% of its project which is unique and better than the other’s, and spends its time trying to convince people that that’s what’s really important.]


September 15, 2007

A Reasonable Blog War?!

Writing about web page http://blog.reindel.com/2007/09/13/i-will-never-support-the-semantic-web/

Through the programming reddit I found Brian Reindel’s post about the Semantic Web. The very first comment on it is from James Simmons, letting Brian know that he had written a blog post in response. Brian then responded in James’ comments.

I’m not hugely interested in the Semantic Web, but it’s a refreshing change to see a reasoned debate about an issue, as opposed to the mud-slinging matches that are so often found online.


September 14, 2007

Multi-Machine Parallel Python Benchmarks

Follow-up to Benchmarking Parallel Python Against Jython Threading (Benchmarks Take 3) from The Utovsky Bolshevik Show

Having claimed in a previous post that Parallel Python's ability to use the processing power of more than a single machine would work in its favour even when compared to the times for Jython threading, I thought I should probably look at some results to see if this is the case.

As before, the benchmark being used is to sum all the primes below each multiple of 10000 between 100000 and 1000000.  The code examples can be found at http://oddbloke.uwcs.co.uk/parallel_benchmarks/
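
For reference, the unit of work being timed looks roughly like this (my own reconstruction of the prime-summing job, not necessarily the exact code linked above):

import math

def isprime(n):
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

def sum_primes(n):
    """Sum all primes below n; this is what each worker is handed."""
    return sum([x for x in range(2, n) if isprime(x)])

# The benchmark calls sum_primes once for each of
# n = 100000, 110000, ..., 990000 (90 jobs in total).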

The Jython script uses Tim Lesher's cookbook recipe for a thread pool.  The Parallel Python script uses a slightly tweaked version of one of the examples on the Parallel Python site.

The two machines over which this is being tested are the University of Warwick Computing Society's servers, Backus and Codd, with Codd being used as the master server.  Both these machines have two CPUs.

The setup for the slave machine really is as easy as:

$ ./ppserver.py -p 35000 -w 2 
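
For completeness, the master side (Codd, here) only needs to be told where the remote server lives when the job server is created. A minimal sketch, assuming the slave (Backus) is reachable as backus on the port used above:

import pp

# The remote ppserver started with the command above, running on the slave.
ppservers = ("backus:35000",)

# Two local workers, plus whatever workers the remote server provides.
job_server = pp.Server(ncpus=2, ppservers=ppservers)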

Once this was set up, I proceeded to test the Jython and Parallel Python scripts.  Disappointingly, the Jython script used more memory than I have available on my ulimit'ed account when running more than a single thread, so I have approximated those figures based on my previous results.


                                1 Worker     2 Workers    3 Workers    4 Workers    8 Workers
Jython Threading                289s         ~150s        N/A          N/A          N/A
Parallel CPython (1 machine)    660s         352s         353s         351s         N/A
Parallel CPython (2 machines)   N/A          185s         180s         183s         188s


Looking solely at the numbers for Parallel Python, it seems that the speedup gained by using a second machine is significant.  It should be noted that Parallel Python defaulted to 2 workers regardless of whether or not it had the second machine available, so the automatic detection code is obviously sub-optimal.  It's trivial to override, though, so this wasn't a problem.
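
For anyone wanting to do the same, the worker count can be set explicitly rather than autodetected; a quick sketch:

import pp

# Ask for a specific number of local workers instead of relying on autodetection.
job_server = pp.Server(ncpus=4, ppservers=("backus:35000",))

# ...or change it on an existing job server:
job_server.set_ncpus(2)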

When this is compared to Jython's threading it doesn't look as significant, but when we consider Jython's raw arithmetic speed and the fact that Parallel Python can continue to scale across further machines, Parallel Python begins to look better and better.  It should also be noted that, unsurprisingly, Jython uses considerably more memory than CPython does.

EDIT: As pointed out in the comments, Jesse Noller has also started looking into benchmarking this sort of stuff.


September 11, 2007

Benchmarking Parallel Python Against Jython Threading (Benchmarks Take 3)

Follow-up to Benchmarking Parallel Python Against Threading from The Utovsky Bolshevik Show

Having had it pointed out to me that benchmarking against CPython threading is pointless, I am now going to do what I should have done originally (third time's the charm, right?) and benchmark Parallel CPython against Jython threading (which uses real Java threads), in the hope that I will fail less at producing something useful.

Each of these results is the time it takes to sum the prime numbers below each multiple of 10000 between 100000 and 1000000 (i.e. perform the operation 90 times on numbers increasing by 10000 each time). 

I'm reusing the Parallel Python results from previously.

I decided to use Tim Lesher's cookbook recipe to test threads, as I already have a script which doesn't require a great deal of rewriting to make it Jython-compatible (Jython supports roughly Python 2.2).
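
I won't reproduce the recipe here, but the general shape of the thread-pool approach is something like the following sketch (plain Python 2 standard library, not Tim Lesher's actual code):

import Queue
import threading

def pool_map(func, inputs, num_workers):
    """Apply func to each input using num_workers threads; results in input order."""
    tasks = Queue.Queue()
    results = {}
    for i in range(len(inputs)):
        tasks.put(i)

    def worker():
        # Each thread pulls indices off the queue until it is drained.
        while True:
            try:
                i = tasks.get_nowait()
            except Queue.Empty:
                return
            results[i] = func(inputs[i])

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(inputs))]

# e.g. pool_map(sum_primes, range(100000, 1000000, 10000), 4)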

Now, the results:


                    1 Worker     2 Workers    4 Workers
Vanilla CPython     1195s        N/A          N/A
Parallel CPython    1153s        601s         582s
Jython Threads      442s         241s         254s

As can be seen here, Jython threads beat Parallel CPython by a wide margin.  This does not, however, take into account the fact that Parallel Python can use several machines at once, which Jython threading obviously cannot do.

What's interesting to note is that Parallel CPython on one worker is roughly the same as standard GIL'd CPython (slightly faster, in fact, in this case).  So if you need to write and deploy CPython as opposed to Jython, there's no performance cost in writing parallelisable code with Parallel Python, regardless of the end user's hardware (as PP, by default, spawns a number of workers equal to the number of available CPUs).

These statistics were taken on an IBM Thinkpad T60 with a Core Duo T2400 running Ubuntu Feisty GNU/Linux (using the standard packages where available) using the scripts found under http://oddbloke.uwcs.co.uk/parallel_benchmarks/ . 

Hopefully these are useful statistics and conclusions, as opposed to my previous efforts to produce such. :) 


Benchmarking Parallel Python Against Threading

Follow-up to Benchmarking Parallel Python from The Utovsky Bolshevik Show

Having had it pointed out to me by a couple of people that my last benchmarking post is fairly useless without a comparison to threading, I now have such a comparison.  The numbers for PP are those used in the last blog post.

For threads I initially tried using Christopher Arndt's threadpool module to make my life easier.  I've included these results in the table below and, looking at them, you can see why I thought I had to find a different way of testing threads.

I decided to use Tim Lesher's cookbook recipe to retest threads.

The function used by all the methods is identical, so this should just be a measure of their performance.

Without further ado, the results: 


                    1 Worker     2 Workers    4 Workers
Parallel Python     1153s        601s         582s
threadpool          1176s        1246s        1254s
Cookbook Recipe     1175s        1238s        1362s


Obviously these results don't reflect brilliantly on threads.  What I did notice is that only Parallel Python used more than one of my processors, which I presume is GIL-related.

Either Parallel Python is an excellent improvement over threads, or I'm doing something stupid regarding threads.  If the latter, please let me know and I'll run the benchmarks again.


Benchmarking Parallel Python

Writing about web page http://www.artima.com/weblogs/viewpost.jsp?thread=214303

This post is Bruce Eckel’s follow-up to his previous post which covered, among other things, concurrency within Python. Basically, CPython has the Global Interpreter Lock (GIL) which makes life very awkward for those wanting to run Python on more than one processor.

Anyhow, in this post Bruce points to Parallel Python as an add-on module which is a potential solution. I had a look at this and thought it was pretty cool. However, bearing in mind Guido van Rossum’s post about the performance implications of removing the GIL the last time it was attempted, I thought I’d benchmark it to see whether it actually does provide a speed-up.

The following stats are for calculating the sum of primes below every multiple of 10000 between 10^5 and 10^6 (including the lower bound and excluding the upper). The first set uses only one working thread0 on my Core Duo laptop and the second set uses two (as I have two processors).

It should be noted that the code snippet being used is provided as an example on the Parallel Python website and so is probably one of their most optimal cases. Regardless, I think the numbers are helpful.
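
From memory, the shape of that example is roughly the following (a sketch rather than a verbatim copy, reusing the isprime and sum_primes helpers sketched earlier on this page):

import pp

job_server = pp.Server()  # by default, one worker per detected CPU

# One job per multiple of 10000 in [100000, 1000000) -- 90 jobs in total.
jobs = [(n, job_server.submit(sum_primes, (n,), (isprime,), ("math",)))
        for n in range(100000, 1000000, 10000)]

# Calling a job object blocks until that job has finished and returns its result.
for n, job in jobs:
    print "Sum of primes below %d is %d" % (n, job())

# Optional: print Parallel Python's own execution statistics.
job_server.print_stats()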

One Processor

Real Time Taken: 1153.53128409 s
Number of jobs: 90
Total Job Time: 1153.53128409 s
Time/Job: 12.816742 s

Two Processors

Real Time Taken: 601.201694012 s
Number of jobs: 90
Total Job Time: 1180.9738 s
Time/Job: 13.121931 s

It can be seen that running two worker threads increases the total CPU time used by around 30 seconds, but the fact that two processors are being used gives an overall speed-up factor of about 1.92 (1153.53 s / 601.20 s), which is pretty impressive.

[Footnote 0: I’m not sure of the internals, so I don’t know if it is technically a thread. Regardless, only one calculation will happen at a time.]

