All 9 entries tagged Etech


March 16, 2005

Creating a new web service at Google

Nelson Minar

The Google AdWords API

  • AdWords: Campaign management done via a web app. Advertisers select keywords applicable to their ad
  • Hierarchical data model. advertisers->campaigns->keywords
  • API goals – allow developers to integrate with the platform
  • 3rd party companies springing up to tweak keywords for maximum efficiency, or make alternative UIs
  • Smart companies integrating their back-office systems with their ad campaigns e.g. when stock runs out, pause the ads
  • Features: Campaign management, reporting functions, traffic estimator
  • Technologies: SOAP/WSDL over SSL. Quota system; multiple authentication mechanisms (proxying / remote management)
  • consultancy and toolkit vendors are starting to spring up.

  • Uses SOAP 1.1 + the WS-I basic profile
  • objective is to make the integration as simple as possible for a WSDL-enabled application. For a good platform an API call should be 2 lines (make a proxy, call the method)
  • uses Document/literal soap rather than RPC-oriented: D/L is closer to ReST/Atom – it's just passing documents about
  • Doc/lit soap requires good xml—>native object bindings. Poor binding is a frequent cause of interop problems
  • Reality: Interop is still hard; WSDL support varies by toolkit; doc/lit support likewise
  • Good platforms: .NET, Java (Axis). OK: C++ (gSOAP), Perl (SOAP::Lite). Not good: Python (SOAPpy, ZSI), PHP

interop hazards

  • nested complex objects
  • polymorphic objects
  • optional fields
  • overloaded methods
  • xsi:type: Since clients keep getting them wrong it's easier to just not bother
  • WS-* – only Sun and MS support it
  • doc/lit support is weak in scripting languages
  • or you could just parse the XML yourself.

Why not just use ReST?

  • Easy to use
  • tinkerable
  • high ReST – use the HTTP verbs to build the app, use meaningful URL path, use XML only as a document (payload), use HTTP headers for metadata
  • Nelson treats POST as update (c.f. Ben yesterday who considered it to be create)
  • lack of support for PUT/DELETE from browser – poorly tested in caches
  • limited standardisation for error codes
  • browsers can't cope with URLs more than 1000 chars
  • you've got to do your own databindings – no WSDL

bottom line

  • For complex data the XML is what matters and it doesn't make much difference if it's doc/lit soap or ReST
  • for read-mostly apps, ReST is best
  • need better tooling

Lessons learned
– good things:

  • doc/lit
  • stateless design
  • developer reference guide
  • developer tokens
  • interop testing
  • private beta period
  • batch-oriented methods – specify an array of IDs and get back multiple XML entities. big speedups. Makes error semantics harder, and messages larger
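
To make the batch idea concrete, here is a minimal sketch of a batch lookup that reports failures per item instead of failing the whole call. Everything in it (`KEYWORDS`, `get_keywords`, `BatchResult`) is a hypothetical stand-in, not the AdWords API; it only illustrates why error semantics get harder once one call covers many IDs.

```python
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    """Outcome of one batch call: successes and failures, both keyed by ID."""
    found: dict = field(default_factory=dict)
    errors: dict = field(default_factory=dict)


# Hypothetical in-memory stand-in for the server-side keyword store.
KEYWORDS = {101: "running shoes", 102: "trail shoes", 104: "hiking boots"}


def get_keywords(ids):
    """Look up many IDs in one call instead of one round trip per ID."""
    result = BatchResult()
    for keyword_id in ids:
        if keyword_id in KEYWORDS:
            result.found[keyword_id] = KEYWORDS[keyword_id]
        else:
            # A bad ID is reported per item rather than failing the whole batch.
            result.errors[keyword_id] = "unknown id"
    return result


if __name__ == "__main__":
    outcome = get_keywords([101, 103, 104])
    print(outcome.found)
    print(outcome.errors)
```

The caller now has to inspect two maps rather than a single success/failure flag, which is the price of the speedup.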

– bad things:

  • doc/lit switch was expensive
  • lack of a common/clear data model
  • dates / TZs are weird – SOAP dates are GMT but Google works on PST
  • no gzip encoding
  • quota confusion / anxiety
  • no sandbox
  • SSL – hard to sniff; XML dumps aren't publishable because they contain plain-text passwords; slow. Note to self: we should use a 1-way hash or something for our APIs.
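
One way the "1-way hash or something" could look is sketched below. It is my own assumption, not anything Google described: instead of putting the password in the message, the client sends a keyed hash (HMAC-SHA256) of the request, so a captured XML dump shows a signature rather than the secret. The function names and the message layout are made up for illustration.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, path: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over the request, keyed by the shared secret."""
    message = method.encode() + b"\n" + path.encode() + b"\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, path: str, body: bytes,
                   signature: str) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    expected = sign_request(secret, method, path, body)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"not-a-real-secret"  # hypothetical shared secret
    sig = sign_request(secret, "POST", "/api/campaigns", b"<campaign/>")
    print(verify_request(secret, "POST", "/api/campaigns", b"<campaign/>", sig))
```

Note this is a keyed hash rather than a plain one-way hash of the password; a plain hash sent on every request would simply become the new password.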

  • Make sure your SOAP is well validated and clean: test interop. Distributing a client library is worthwhile
  • need good developer support – docs, samples, FAQ, debugging instructions, community

March 15, 2005

Creating applications without software (or at least code!)

Adam Gross –

  • how do you develop an application that's going to be used as a platform?
  • How you develop an application is totally dependent upon the technology in use at the time.
  • Moore's law means that stack sizes increase exponentially
  • The desire for more abstraction is what drives the increasing stack size
  • compare and contrast: procedural language dev: c/c++—>vm langs (java/.net)—>scripting langs vs. declarative: html – does less, but does it well, and has a much lower cost of entry
  • how will we spend the next helping of Moore's law?

– more abstraction

– more separation of definition and deployment of the app

– more utility computing (this won't come without changing the development model – grid won't work)

  • what will it look like?

– new stack

– declarative app dev with some scripting

– focussed on specific app types

  • Where's utility computing really happening now? Google / eBay / Amazon etc – big providers with wide APIs. Sforce APIs are 20% of salesforce's web requests, and APIs account for 40% of eBay listings. This is real web services, happening now.
  • next step – on demand app dev – inject your own business rules into an app provider.
  • is an example of an app/service where end users configure more-or-less the whole application through a GUI. If your application is some kind of CRM-ish thing then salesforce can be customized to host it regardless of your particular data.

Tangible Computing

Mat Jones / Chris Heathcote – Nokia

  • Ubiquitous computing is here
  • But the interfaces can't support the interactions we need to have with our computing devices
  • WIMP affordances aren't good enough

We need to play to our strengths: what have we got?

  • we are situated
  • we are embodied
  • we have opposable thumbs
  • we can touch

  • Interfaces should have a real effect; real is tangible
  • Dance Dance Revolution is the cutting edge of tangible computing
  • Principles: Cognitive economy (don't allow people to do the wrong thing); social legibility (If I see you do it, I can copy you) – extelligence (c.f. The Design of Everyday Things)
  • Use attention wisely – glanceability: important information isn't in a window, it bubbles up. Direct combination – choose the objects and let the system infer the appropriate actions
  • Tangible tiredness – tangible computing is more physical.
  • what's out there now? Tablet computers; musical instruments like audiopad and jazz mutant; smart furniture – drift table; sensitive objects (microphones on a flat surface can 'know' where you tap); cameras – digital pens, eyetoy, augmented reality; passive information display – make information more available; smart objects – barcodes / ids everywhere; haptics / force feedback everywhere.

Ambient devices - pre-attention cognition - process information without having to think about it

  • NFC - Near Field Communications – touch technology
  • Touch phone to computer – phone knows to sync itself
  • NFC reader/writer hardware – tags – have small amounts of info in them


Touched a phone to a tag (on an ID badge), it flashed & read the info of the tag. Tag is about the size of a 50p piece. But this is nothing new – just like the card readers on our doors. Once the phone has picked the tag data up though it could transfer it to another phone just by touching it. Tags can be written to as well as read from.

  • cool new output devices – dotdotdot for phone displays, palmorb, airport express
  • need programmability – more apis, more I/O

Day 2 keynotes

Session 1

Rael Dornfest – O'Reilly

  • The etech focus is "small things loosely coupled"
  • enabling the trend for 'mass amateurization' and the DIY-IT ethos – commoditisation of hardware and of knowledge via the lazyweb
  • Remixing vs. hacking: Remixing is more conversational

Remix the…

  • web – view source, find out how it works, make it better
  • music – rip/mix/burn
  • TV - tivo
  • Network – prevalent wifi
  • movies – BitTorrent, Netflix, video on demand
  • data – webscraping —>xml apis —>emerging standards
  • text – does blogging remix journalism?
  • syndication – rss/atom
  • Bookshelf – Project Gutenberg, Amazon search-in-book, etc.
  • IT - lots and lots of specialists. Hacks become frameworks become foundations
The reasonable man adapts himself to the conditions that surround him… The unreasonable man adapts surrounding conditions to himself… All progress depends on the unreasonable man.

George Bernard Shaw

Tim O'Reilly – Internet application design patterns

  • Architect your system to be used as a component of a larger system – at any scale
  • Release early and often, be greedy for feedback
  • Perpetual Beta – keep releasing new stuff continually. Instrument your application so that you know how people are using your new features

  • Users add value to shared data. Use the 'network effect by default' principle – make participation the default. aggregating user data as a side effect of using the system. c.f. flickr: the defaults are always 'public'
  • exploit the long tail – look for a niche that was formerly too small to exploit
  • think about software above the level of a single device; design apps. from the ground up to be multi-platform.
  • Social networking: Architect your application to share the social fabric underlying your app, rather than inventing/constructing a new one.
  • Think about packet size: What's the smallest chunk of data/transaction that defines your application? Build your business model to make your living from the smallest atomic unit.

Stewart Butterfield – Flickr

Flickr is built on its own APIs
An open API helps with

  • trust
  • utility
  • discipline
  • credibility
  • creativity
  • community

– and causes problems with

  • scalability
  • ops problems
  • other people's bugs
  • privacy
  • copyright
  • support costs
  • business risks

note to self: FlickrFox Firefox extension looks interesting.

Brendan Eich – Mozilla

– some stuff about Mozilla extension schemes. The most interesting bit was the suggestion that there will soon be a XULRunner executable that just takes an arbitrary jar file of XUL and chrome and runs it – Firefox and Thunderbird could be two such jar files.

Danny Hillis – Applied Minds

Applied Minds do cross-discipline stuff with hardware and software. He showed some pretty cool demos of walking robots, and a map-visualisation table which displayed a map on its surface that you could manipulate with your hands. An even more jaw-dropping version 2 used a table with a mouldable surface that changed its shape to the profile of the map, i.e. it raised up where the mountains were and dropped down into the valleys.

Jeff Bezos – Amazon

Jeff was demoing A9's new vertical search capability: they've set up an XML format by which a search engine can describe its interface, and an RSS extension (3 new fields: result count, results per page, current page) for search engines to return results in RSS. Search engines which do this can be plugged into A9 by end users. Should put a bit of a cat amongst the RDN pigeons.
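
For a feel of what those extra fields look like to a client, here is a rough sketch that reads paging information out of an RSS feed. The element names (`totalResults`, `startIndex`, `itemsPerPage`) and the namespace URI follow my recollection of the OpenSearch RSS 1.0 draft and should be treated as assumptions; the sample feed is invented. The draft carries a start index rather than a page number, so the "current page" is derived here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the OpenSearch RSS extension; check the spec.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearchrss/1.0/"

# Invented sample response from a hypothetical search engine.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">
  <channel>
    <title>Example search: etech</title>
    <openSearch:totalResults>42</openSearch:totalResults>
    <openSearch:startIndex>11</openSearch:startIndex>
    <openSearch:itemsPerPage>10</openSearch:itemsPerPage>
    <item><title>First hit on this page</title></item>
  </channel>
</rss>"""


def read_paging(feed_xml):
    """Extract result count, page size and current page from a results feed."""
    channel = ET.fromstring(feed_xml).find("channel")

    def number(name):
        return int(channel.findtext("{%s}%s" % (OPENSEARCH_NS, name)))

    total = number("totalResults")
    start = number("startIndex")        # 1-based index of the first result
    per_page = number("itemsPerPage")
    page = (start - 1) // per_page + 1  # derive the page from the start index
    return {"total": total, "per_page": per_page, "page": page}


if __name__ == "__main__":
    print(read_paging(SAMPLE_FEED))
```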

Session 2:

Rick Rashid MS Research – Unconventional Inventions and cross-discipline serendipity

  • A TB can store every conversation in your life, or a picture a minute for your life, or a year's video
  • sensecam – 'black box' for a human being. Fisheye camera and audio recorder. Smart image stabilisation to optimise when to take photos. Uses sensors to detect environment changes (light levels, movement) to trigger photos. Potential use for memory-loss patients, for reflective practice, or even for tourism – V2 sensecam devices (smaller, lighter, better) in mfg now
  • Surface computing – short-throw projection + computer vision allows you to use any surface as an interface.
  • Touchlight – projects 3d onto holographic film & uses a pair of cameras to capture interaction, allowing you to manipulate objects
  • Using the same techniques that have been used for immunobiology to try and attack spamming – then applying the same principles back to HIV vaccine research

Gary Flake – Yahoo Research

  • Y!Q – a search tool – on-the-fly search results. Floats a popup window with results over the top, applying the context from the originating page (e.g. read an article about sports, search for tickets, get results for the game you were reading about)
  • working on machine learning, collective intelligence, scientific computing, text mining
  • Yahoo research – less product-oriented ( is basically betas)
  • Tech Buzz game : Aggregation over population weighted by performance – an adaptive voting mechanism that rewards good performance with more votes.
  • Joint R&D project between yahoo and O'Reilly. Alpha-geeks can make predictions on emerging tech. trends, driven by search volumes

Peter Norvig – Google Labs

  • Google: gone from indexing information in web pages -> video -> books -> desktops. Interaction with user is getting more and more sophisticated
  • google suggest was done by 1 engineer in his spare time.
  • google maps – almost client-side quality in a web browser
  • google personalized search
  • google sets – enter search terms and construct a set including all the terms

George Dyson

[a fascinating historical interlude about von Neumann and the origins of the first computer]

Kevin Keely – AT&T Labs

  • New sorts of spam: spit (over internet telephony); text spam; spim (over instant messaging); Skype spam…
  • Patching isn't working well
  • AT&T is the transport network of choice for hackers because of its reliability :-) This has incentivized AT&T to do something about the volume of crap on its networks. Later this year they're intending to clean up their networks at the edge, to the point where they won't have a corporate firewall anymore.

March 14, 2005

Building Apps with RSS, Atom, and the Atom API

Ben Hammersley (sporting his UtiliKilt)

  • RSS / RDF / Atom are syndication formats. RSS has lots of different formats, largely due to its politicized development process
  • Atom: is currently only at v 0.5; is changing quite fast.
  • Atom mandates a lot more stuff than RSS
  • RSS 2.0 great for machine readable lists. RSS1 good for super-complex interlinked document mining. Atom learns from the experience of RSS1 & 2 and sits somewhere between them.
  • [ a diversion about the philosophical origins of the word 'atom' ensues …]
  • 5 atomic facts about a document: who (created it), when (it was created), where (it is) , what (it's called), what (it contains). An atom document must state those 5 facts about a document, because if you don't capture that up front you can never get it back with certainty. note to self: how does atom deal with modifications?
  • Key concept: Resource - the document + all the data about it. Representation – some view (e.g. html) of the document in an application.
  • 2 types of document: an entry and a feed. An entry is the resource in XML - the content and all the metadata.
  • Constructs – extension points in the atom schema. 6 types: text, person, date, link, category, identity, service. Ben asserts that these 6 are sufficient.
  • I wonder what we could do with an atom entry for every sitebuilder page? It would be a simple alternate rendition
  • Ben Hammersley fancies himself to be the Wittgenstein of web syndication ( © John Dale 2005 :-)) Where are his bees?
  • Feeds - a collection of documents (entries) plus its own metadata. A feed is a query over resources. note to self: support for projections?
  • Feeds of Feeds are supported, since a feed is also a resource. (an entry can describe a feed)
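
As a rough illustration of the "5 atomic facts", the sketch below builds a single entry document with the standard library. Two loud assumptions: it uses the element names and namespace of the later Atom 1.0 format rather than the 0.x draft discussed in the session, and the mapping of facts to elements (author, updated, id/link, title, content) is my own reading. `build_entry` and the example values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace; an assumption, since the talk covered an earlier draft.
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry(author, updated, url, title, text):
    """Serialise one entry carrying the who / when / where / what / what."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element("{%s}entry" % ATOM_NS)

    def add(parent, tag, value=None, **attrs):
        child = ET.SubElement(parent, "{%s}%s" % (ATOM_NS, tag), attrs)
        child.text = value
        return child

    add(add(entry, "author"), "name", author)  # who created it
    add(entry, "updated", updated)             # when
    add(entry, "id", url)                      # where (identity)
    add(entry, "link", href=url)               # where (location)
    add(entry, "title", title)                 # what it's called
    add(entry, "content", text, type="text")   # what it contains
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    print(build_entry("Ben", "2005-03-14T12:00:00Z",
                      "http://example.org/entries/1", "Hello", "First post"))
```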

Atom API

  • History: BloggerAPI, Metaweblog API (XML-RPC / SOAP)
  • Atom uses REST
  • [ a brief interlude about REST ] Ben's interpretation of PUT/POST is not what I understood: He interprets POST=create, PUT=replace; I would have said PUT=create, POST=update. Must investigate (doubtless I am wrong!)
  • note to self: could we define individual warwickgroups as atom entries?
  • An atom API call is an atom entry document sent over HTTP with the appropriate verb (method)
  • atom endpoints: 4 per system; postURI (one per system), editURI (one per resource), FeedURI (one per query), ResourcePostURI (one per system)
  • Adding a link rel="" element to a page makes the PostURI discoverable by atom API-aware clients. similarly for service.edit (edit an entry), service.feed (get feed) etc.
  • Features: inherits from lower-level protocols e.g. internationalisation (use XML), authentication, caching (use HTTP), encryption (use SSL). Keeps the spec small; means clients must be multi-protocol aware.
  • Versions: Each version is its own resource (or you could create each diff as its own resource), and you use a link element with an app-specific tag to indicate that each resource in a version history is related to the same 'document'.
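
To pin down the POST/PUT question for myself, here is a toy in-memory store that follows Ben's reading: POST to the collection creates a new entry and the server picks its URI, PUT to an entry's URI replaces what is stored there. The class, the `/entries/N` URIs and the choice to return 404 for a PUT to an unknown URI are assumptions for the sketch (HTTP itself also lets PUT create a resource at a client-chosen URI).

```python
import itertools


class EntryStore:
    """Toy collection: POST creates, PUT replaces, DELETE removes."""

    def __init__(self):
        self._entries = {}
        self._ids = itertools.count(1)

    def post(self, document):
        """POST to the collection: the server assigns the new entry's URI."""
        uri = "/entries/%d" % next(self._ids)
        self._entries[uri] = document
        return 201, uri

    def put(self, uri, document):
        """PUT to an entry URI: replace the whole stored representation."""
        if uri not in self._entries:
            return 404, uri
        self._entries[uri] = document
        return 200, uri

    def get(self, uri):
        if uri not in self._entries:
            return 404, None
        return 200, self._entries[uri]

    def delete(self, uri):
        if self._entries.pop(uri, None) is None:
            return 404, uri
        return 204, uri


if __name__ == "__main__":
    store = EntryStore()
    status, uri = store.post("<entry>draft</entry>")
    print(status, uri)
    print(store.put(uri, "<entry>final</entry>"))
    print(store.get(uri))
```

Repeating the same PUT leaves the store unchanged, whereas repeating the POST creates a second entry, which is the practical difference between the two verbs.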

part 2

  • Documentcentrism – input—>content stored as atom entries—>view
  • inputs: atom api, file creation, other interfaces ; output: html, XML, RSS …
  • Ben demos his simplest-possible atom CMS, which does HTML and RSS representations of queries over atom resources. In use at the Guardian/Observer, for building their media blog.
  • Using apache + http content negotiation is a neat way of having 1 URI serve multiple different renditions of content.
  • Atom doesn't specify how to do locking of resources – defers to the application. Not obvious to me how you would communicate to the server that you wished to lock a resource, though it's easy enough to communicate back a lock failure (HTTP 503 resource unavailable or something). Maybe you could represent locks as entries in their own right?
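
The content negotiation point can be sketched without Apache: the server looks at the client's `Accept` header and picks one of several renditions stored behind the same URI. This is a simplified illustration, not how Apache implements it; it ignores partial wildcards like `text/*`, and `RENDITIONS` and the function names are made up.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    choices = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type:
            choices.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(choices, key=lambda choice: -choice[1])


# Hypothetical renditions of one resource, all served from the same URI.
RENDITIONS = {
    "text/html": "<h1>Hello</h1>",
    "application/atom+xml": "<entry><title>Hello</title></entry>",
}


def negotiate(accept_header, renditions=RENDITIONS, default="text/html"):
    """Pick the best rendition the client accepts, or None (HTTP 406)."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "explicitly not acceptable"
        if media_type in renditions:
            return media_type, renditions[media_type]
        if media_type == "*/*":
            return default, renditions[default]
    return None


if __name__ == "__main__":
    print(negotiate("application/atom+xml;q=0.9, text/html;q=0.5"))
    print(negotiate("image/png"))
```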

Sitting around

So, I'm killing time before lunch: Noting some cool things bundles – group your tags! Go to the settings/tags page but replace 'tags' with 'bundle' in-page categorisation. Kind of hard to describe, but way cool. There's a flash demo here

Etech back-channels

Where else is stuff happening?

irc: / #etech.


Web Services Mash-up

Alan Taylor, Cal Henderson (Flickr), Eric Benson

AT :

  • Amazon light – glues amazon to yahoo, google, gmail, blogger, libraries (using Jon Udell's LibraryLookup OPAC integration). Start with a book then use the other services to do stuff with/about that book.
  • Need to be very aware of the Terms of Service for API use. Providers may change this without regard to your requirements!
  • 'small w' web services – alpha/beta code; not enterprise-level. Not necessarily SOAP/XML-RPC – could be anything.
  • What's available?

* Amazon, Google, Technorati, Flickr

* ebay, paypal, o'R. Safari, GigaBlast (RSS-based web search engine).

* RSS feeds from news/media corps/orgs.

* screenscrapes / include files.

  • Use with care – people don't always build these interfaces with 3rd party application development in mind (esp. RSS feeds / screen-scrapes which are optimised for readers)
  • Mash-em-up: Look for shared keys from tags/metadata to join different sources. e.g. google categories->Flickr tags
  • Adding your own information/content is safer/easier than trying to join 2 third-party services.
  • most services require a dev. token to use
  • Flickr API
  • Amazon

* Amazon are working on a 'remote shopping cart' – keep your own look and feel while the user builds their basket, then dumps you into amazon for checkout

* XSL transforms – can reskin amazon based on a set of XSL stylesheets

  • amazon own Alexa & IMDB, though the APIs for those two are semi-private
  • Google: Search, cache, spellcheck. Advanced Search queries make for a fairly powerful API frontend

* SOAP only; 1000 requests/day max

  • Technorati: Query, keywords, tags, pinging (info technorati dev wiki)
  • Considerations: Terms of use, attribution/credit/licensing; act conservatively with others' resources; handle failure gracefully when you have a long pipeline/chain of requests.
  • Cache: You must cache your calls if you want to be even remotely nice
  • Talk to the data owners; they'll find out sooner or later anyway and it's better to be talking to them from the start.

Real-world examples

  • Mappr
  • Dropcash – fundraising tool. Uses typekey for authentication and paypal for payment
  • – what books are people blogging about ?

CH: (is a good speaker and very droll.)

  • Flickr: The centre of a big distributed DB. Only the UI is photo-centric
  • Once you provide the API, people build new ways to get data in and out
  • You do need to store data that people care about
  • everything happens over HTTP (except the odd bit of SMTP)
  • Flickr does SOAP/XML-RPC/REST
  • REST is the coolest at the moment. Way simple. 90% of Flickr users use REST.
  • Flickr use a wrapper for their REST responses to indicate status (why?)
  • Page-scraping (HTML-over-HTTP) – volatile, bandwidth-greedy
  • Being transport-agnostic is a good thing – once you've defined your domain, it's easy to offer multiple formats.
  • Beware of 'shitty coders' – people who scrape the site and pull in large numbers of pages, especially API abusers e.g. the Windows Flickr screensaver that checks for new photos every 2 seconds. With 100 users that's 50 hits/second.
  • provide client bindings/libs which do caching – naive programmers will use the bindings
  • Cache on the host too.
  • Monitor closely
  • Authentication: Flickr use query params for authentication – sends passwords over clear text (lame-o)
  • Http BASIC is next step up from that; HTTPS means you can keep using query params since they're encrypted on the wire.
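
The "client bindings which do caching" advice might look something like the sketch below: a thin wrapper that remembers each answer for a while, so a screensaver polling every 2 seconds only reaches the server once per expiry window. `CachingClient` and the `fetch` callable are hypothetical, not Flickr's actual client library.

```python
import time


class CachingClient:
    """Wrap a fetch function so repeated calls reuse a recent answer."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable that really hits the remote API
        self._ttl = ttl_seconds  # how long an answer stays fresh
        self._clock = clock      # injectable so tests need not sleep
        self._cache = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: no network call
        value = self._fetch(key)
        self._cache[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    calls = []

    def fake_fetch(key):
        calls.append(key)
        return "photos for " + key

    client = CachingClient(fake_fetch, ttl_seconds=60)
    for _ in range(30):          # 30 polls, as a screensaver might do
        client.get("recent")
    print(len(calls))            # number of requests that reached the server
```

With a 60-second expiry, the 100-user screensaver case drops from 50 hits/second to roughly 100 hits/minute, under 2 hits/second.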

Learning points from Flickr:

  • Be Open – don't hide any of your data
  • Be protocol agnostic
  • Be careful of abuse – use dev. tokens so you can block abusers easily
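
A sketch of how dev tokens make abuse manageable: every request carries a token, the server counts requests per token per day, and a misbehaving token can be blocked outright. The `TokenGate` class is hypothetical; the default of 1000 requests/day is borrowed from the Google API limit noted earlier, not from Flickr.

```python
from collections import defaultdict


class TokenGate:
    """Per-token daily quota plus a block list for known abusers."""

    def __init__(self, daily_limit=1000):
        self._limit = daily_limit
        self._used = defaultdict(int)  # (token, day) -> requests so far
        self._blocked = set()

    def block(self, token):
        """Shut off one developer's token without affecting anyone else."""
        self._blocked.add(token)

    def allow(self, token, day):
        """Return True and count the request if this token may proceed."""
        if token in self._blocked:
            return False
        if self._used[(token, day)] >= self._limit:
            return False
        self._used[(token, day)] += 1
        return True


if __name__ == "__main__":
    gate = TokenGate(daily_limit=2)
    print([gate.allow("dev-123", "2005-03-14") for _ in range(3)])
    gate.block("dev-123")
    print(gate.allow("dev-123", "2005-03-15"))
```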


EB:

  • Robot co-op make 43 things – creating communities based on shared wishes.
  • trying to enable machines to talk about humans e.g. Flickr is just computers talking to more computers, but it's a community because they're talking about things people care about
  • 43 things read API: people, goals, tags, entries, cities, teams
  • write api: add/update/remove goals, write entries
  • I wonder if we could make a Warwick43Things ?
  • Most people aren't worrying too much about how clean their REST APIs are – query strings are OK! If the data is interesting enough, users will learn the API


– How can you make money off this stuff? [A] You don't make money off web apis, you make a service that people want to pay for, and then provide the APIs to attract more people. e.g. Amazon / Ebay uploader / storefront tools

– What will happen with paying for higher-volume access? [A] it's on the way, but it's happening on a one-off basis. I wonder if we should talk to Flickr re. galleries in WB ?

  • Flickr color wheel works by periodically scraping all new images off Flickr, caching them locally, and scanning them for color balance.

I'm at ETech

W00T. I'm in sunny* San Diego for the O'Reilly Emerging Technology 2005 conference. Hopefully, I should learn all kinds of new and cool tricks to do with HTTP, REST interfaces, RDF, RSS/Atom, and all that jazz. Or maybe I'll just get to spend a week somewhere warm with palm trees. Either way it's a win.

Right now, though, I'm just wondering how it is that apple can get away with spoiling the otherwise-flawless Powerbook Ti with a battery life that is nothing more than tEh SUX0r. 2 hours indeed. Bah.

* Actually not that sunny. But it's 67F, dry, with sunny spells, which is a fair bit better than Coventry right now.
