This is the (sadly) rather unstructured transcript of an IRC conversation on the topic of web-loggers and their itty bitty protocols and file formats. After finishing this rant, I hadn't the energy to format it into prose, but I'm egotistical enough to want to expose it to the web anyway. Sorry.
(Also, apologies for physical formatting in this page. It Just Looks Better That Way.)
chris | jesus! |
chris | libxml2 is 3.7MB of source code! |
chris | how complicated can recognising bloody angle brackets be? |
chris | how can it be that understanding a simple "Atom" feed requires installing two bazillion fucking perl modules? and what the fuck is SAX, and why should I want anything to do with it? |
J | has studiously avoided becoming infected with answers to any of these questions. |
J | But 3.7MB takes the piss. |
chris | and what, i mean what the FUCK, is "X-WSSE authentication"? |
chris | oh for fuck's sake. fucking "Atom" has invented A WHOLE NEW AUTHENTICATION SCHEME! it's not as if there are two in HTTP already! no, those are no good, because those DON'T HAVE A FUCKING X IN THEIR FUCKING NAMES. |
chris | fucktards |
J | "Ex-wussy authentication"? |
J | WTF is "Atom"? |
chris | it's a replacement for RSS |
chris | RSS, a simple format for news headlines |
chris | of which there are now SEVEN MUTUALLY INCOMPATIBLE VERSIONS |
chris | (or is it nine?) |
|
|
meteobot | NOW: wind 8 knots (force 3, gentle breeze) from W → |
|
|
meteobot | NOW: wind 2 knots (force 1, light air) from SW ↗ |
|
|
chris | The simple task of producing a library which parses "Atom" has so far required the installation of 20 archives full of perl modules and one gigantic library from GNOME |
chris | allegedly this standard is supposed to be better than RSS (it would be hard to be worse...) but it's not looking promising so far |
meteobot | NOW: wind 8 knots (force 3, gentle breeze) from W → |
J | Mein gott. |
|
|
chris | oh, i give up |
chris | it turns out that atom sucks just as much as rss (probably more) :( |
J | RSS wins solely on the number of MB of extraneous source not required, by the sound of things. |
chris | well, i can't actually remember how much other crap i had to install to make rss work |
J | had to install a fair amount of XML and XSLT related libraries, IIRC. |
chris | actually the problem with both of them is that they're not so much protocols as manifestos for new societal norms, just like XML is. |
J | wonders what the critical differences between societal norms and protocols are in practice |
chris | ok, my comment was (intended to be) a bit soundbiteish |
J | (mine was more a genuine question than a refutation of yours) |
|
|
chris | the protocols that actually work -- smtp and http are basically the only long-surviving examples on the internet, I think, perhaps IRC too -- are characterised by the strict-send/loose-receive principles |
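(An aside, not part of the conversation: strict-send/loose-receive can be sketched with a toy "Name: value" header line. This is only an illustration under assumptions. The functions `format_header` and `parse_header` are hypothetical names, and the rules here are much simpler than what real SMTP or HTTP software has to do.)

```python
from typing import Optional, Tuple


def format_header(name: str, value: str) -> str:
    """Strict send: emit exactly one canonical form, or refuse outright."""
    if not name or any(c in name for c in " \t\r\n:"):
        raise ValueError(f"bad header name: {name!r}")
    if "\r" in value or "\n" in value:
        raise ValueError("header value must not contain line breaks")
    return f"{name}: {value.strip()}\r\n"


def parse_header(line: str) -> Optional[Tuple[str, str]]:
    """Loose receive: tolerate bare LF, stray whitespace and odd case."""
    name, sep, value = line.rstrip("\r\n").partition(":")
    if not sep or not name.strip():
        return None  # unusable line: ignore it rather than fall over
    return name.strip().lower(), value.strip()


if __name__ == "__main__":
    print(repr(format_header("Content-Type", "text/plain")))
    print(parse_header("content-TYPE :  text/plain \n"))
```

The sender has one way to say a thing and raises an error rather than emit anything else. The receiver accepts the sloppy variants it is likely to meet and quietly drops what it cannot use.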
|
|
J | ah |
chris | XML and the blogger-weenie protocols are characterised by the statement "if you don't obey the protocol then the rightful wrath of people who actually give a fuck about XML will descend on you *and that will show you!*" |
chris | this argument was made frequently in the early days of HTML, and look where that got us. |
chris | "those who do not learn from history are doomed to repeat it" |
chris | in more depth, the blogging weenie protocols are crippled at birth by the fact that the people who write blogging software are, as a class, completely unaware of their own limitations |
chris | so rather than getting something really simple (list of dates, titles and URLs, separated by whitespace) we instead get some complicated mound of crap which exists in seven mutually-incompatible versions. (see mark pilgrim for the gory details.) |
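(Another aside: the "really simple" format is not specified anywhere in this conversation, so the following is a guess at what it might look like. It assumes one entry per line, with an ISO date as the first whitespace-separated field, the URL as the last, and the title as everything in between. The names `Headline` and `parse_headlines` are made up for the sketch.)

```python
from datetime import date
from typing import List, NamedTuple


class Headline(NamedTuple):
    day: date
    title: str
    url: str


def parse_headlines(text: str) -> List[Headline]:
    """Parse lines of the assumed form: DATE TITLE WORDS... URL."""
    headlines = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything too short to be an entry.
        if len(fields) < 3:
            continue
        try:
            day = date.fromisoformat(fields[0])
        except ValueError:
            continue  # first field is not a date we understand; skip the line
        # Title is every field between the date and the URL.
        headlines.append(Headline(day, " ".join(fields[1:-1]), fields[-1]))
    return headlines


if __name__ == "__main__":
    sample = (
        "2004-03-01 Atom considered harmful http://example.com/atom\n"
        "2004-03-02 RSS http://example.com/rss\n"
    )
    for h in parse_headlines(sample):
        print(h.day, h.title, h.url)
```

The whole parser fits in a screenful and needs nothing beyond the standard library. The cost of this layout is that a URL may not contain whitespace and runs of spaces in a title collapse to one.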
J | Interestingly, the successful protocols are a layer or two removed from the problematic ones. SMTP, HTTP, IRC are all essentially simple - the information passed over them can be complex, but they all treat it as a blob of no great significance. |
chris | (yeah, that's another place where the successful protocols go right and the others go wrong. and it applies in more detail too. the hairy bits of http/smtp that Don't Quite Work are the ones where the servers have to look inside the box -- e.g. anything involving character sets, content-types, languages, etc.) |
J | It could be that the problems with XML, HTTP, RSS and friends are merely reflections of the complexity (and complete lack of "this is the right answer"ness property) of the data they are supposed to deal with. |
meteobot | NOW: wind 0 knots (calm) |
chris | (probably. but that doesn't explain the failure to start with limitations in mind and build robust protocols. instead we have this enormous scaffolding of XML, which is supposed to solve all of these problems but doesn't, and then a crowd of users who can't get the simplest thing right.) |
chris | the rss community even disagree over how html-in-rss should be encoded! |
chris | (i think in practice one interpretation is now settled, but even so!) |
J | Very true, twicely. |
chris | anyway, it's very annoying (and slightly mystifying). |
meteobot | NOW: wind 6 knots (force 2, light breeze) from W → |
chris | worse still, it's pretty clear that for lots of the sites i try to read via rss, nobody actually uses the rss feed -- often it will turn out to be weeks or even months out of date, *and nobody has noticed*. |
chris | lots of people rely on third-party scrapers to scrape the content on their sites, and these just stop working |
chris | the other problem is that that idiot dave winer "owns" rss in some complicated territorial sense (for him, it's the equivalent of pissing on the web so that it smells of him for ever after) and so the blogger people won't use it |
chris | of course, it's not true that quality of writing is influenced by quality of software, so there are some sites i'd like to read via my headlines aggregator which are blogger sites. |
chris | so i need atom support. |
chris | but now i go and look at some of these sites in more detail, and it turns out that the atom feeds have EXACTLY THE SAME PROBLEMS THE THIRD-PARTY SCRAPED RSS FEEDS HAVE -- they're out of date, or have only a subset of the content, or (in one case) seem to have content completely unrelated to what actually appears on the site |
chris | (i suppose one should file this under "reasons not to use a shared hosting facility") |
chris | anyway, it's all a total trainwreck |
|
|
J | got sucked in to dull material linked from Dave Winer's website, and escaped only just in time. |
chris | * J got sucked in to dull material linked from Dave Winer's website <-- oh god, it may be too late! |
J | is pretty certain he managed to pull out before becoming infected. |
chris | dull material linked from Dave Winer's website <-- as opposed to what kind of material linked from Dave Winer's website? |
J | apologises for the tautology. |
And to finish with a question: should I add links to the above?
Copyright (c) 2004 Chris Lightfoot. All rights reserved.