Archive for the ‘dead 20’ tag
You are reading a blog - Innovation in Software - no longer under active maintenance. These pages are kept here for archive purposes. If you wish to find out more about Vagueware please read our current website which will include links to the new blogs when live.
A Manifesto for Tags
Ask Skeptic’s Mom doesn’t get tags at all. I don’t blame her. I think it’s time for a… Manifesto. Or at least a way of trying to explain tagging to somebody who is new to this way of thinking.
In the beginning there was the librarian. Wise, unnoticed by members of the opposite sex and mostly under-paid, they did their Good Work. However, come the end of the 19th century there were many books that needed to be organised, and no agreed-upon way of organising them. Enter our hero for this introduction, Melvil Dewey, who invented the Dewey Decimal Classification system that we all know and love from our fine public libraries today.
The DDC (as it’s known to its friends) is a specific form of something we like to call a Taxonomy, or if we’re feeling particularly philosophical, an Ontology. The purpose is simple: how can we take a large pile of books (or indeed any resource) and place them into some order such that when we need to recover an item we can do so easily? Further, would it not be easier for all items of a similar nature to be grouped together in the same place?
Without such a system, the public libraries would be even more chaotic, noisy and party-like than they are even today. We would be clambering over the beer kegs asking the jocks if they knew where that volume on architecture up to circa 300AD was. Do you think they’d know? They wouldn’t answer, as they do now, “Dude, 722 – don’t you even know your DDC codes, you doofus?!” whilst whipping us with a towel. No, they would merely look confused.
On meandering over to section 722, we would surely find the book we were looking for, but behold! We also find lots of other books on the same subject, that we didn’t even know we were looking for in the first place! We are in one place looking at the entire majesty of the resources the local library has managed to put before us in section 722 – all 2 books! OK, so public libraries are under-funded, but that’s not the point. Without DDC or some other taxonomy to replace it, the two books would never be near each other; we might find one, but never the other.
We know this is one reason the classification of resources is very important. It helps us not only find information we’re looking for, but it also helps group those items together to make ‘neighbours’ easily found.
Great, so what do we need tags for?
Well, the problem with any taxonomy or ontology is scope. How do you create a taxonomy that is so large it can take absolutely anything humanity can come up with? The short answer is, generalise – don’t try and be too specific. If you have a book “How to Hunt & Cook Pigeons”, then that probably belongs in either 799 (hunting), 598 (birds) or 641 (food & drink). But wait, you have one book – which one does it go in? Or do we create a new category specifically about cooking hunted birds?
This is the problem with taxonomies – at some level you need them to be very basic so that they can be easily understood and referenced. However, too general and over-arching and you find that some things need to fit into multiple categories, or you need to create whole new categories every time something comes along that doesn’t quite fit. What’s more it can all be quite subjective: is shooting pigeons and cooking them hunting, or is it sport, or survivalism or is it just cruel and barbaric and not the sort of thing you should have in public libraries? Who decides?
The problem gets harder when you try and get information out so you can find the book later. You now need to navigate through possibly three different branches of the hierarchy to discover what you’re looking for, each time having to make choices down branches that might be wrong.
What’s more, whilst it’s nice to find neighbours, what if I want to find neighbours that don’t match the way the taxonomy was organised? What if I want a collection of books on cooking birds of all types, not just pigeons? I might need to go to several different places. What if I’m just interested in pigeons in general? Do I miss out by looking in the birds section by not knowing there is a book I might be interested in over in the hunting or cookery section?
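To make the single-shelf problem concrete, here is a minimal sketch. The three codes are the ones mentioned above; the other two titles, the `SHELVES` layout and the `browse` helper are made up for illustration, and filing the pigeon book under 641 is an arbitrary choice.

```python
# A strict taxonomy gives each book exactly one shelf code.
# Filing the pigeon book under 641 is an arbitrary choice for this sketch;
# the other two titles are invented.
SHELVES = {
    "799": ["Deer Stalking"],               # hunting
    "598": ["Garden Birds"],                # birds
    "641": ["How to Hunt & Cook Pigeons"],  # food & drink
}


def browse(code):
    """Return the titles shelved under one classification code."""
    return SHELVES.get(code, [])


# Somebody browsing birds or hunting never sees the pigeon book.
for code in ("598", "799", "641"):
    print(code, browse(code))
```

Whichever shelf the librarian picks, two of the three readers who might want the book will not find it by browsing.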
It gets worse when you realise that you could be dealing with not just a few thousand items in a library, but the entire sum of human knowledge. Every document, photo, film, sound recording, computer program and physical object. Imagine trying to classify and then later find everything related to pigeons, cooking and hunting in that lot.
Enter our new much-hyped, but little-understood hero: tagging.
The purpose of tagging is to replace taxonomies. We want to do this for lots of reasons, including:
- We don’t want to have to worry about where we put stuff into the system. We want to mark the item up without having to spend an hour – or decade – debating which part of the taxonomy it belongs in.
- We want to know it can be easily retrieved by those who may be interested in finding it at a later date.
- We want to be able to easily find ‘neighbours’ even if they belong in a traditionally unrelated taxonomy.
Let’s look at the lifetime of a tagged object, our now familiar book on cooking pigeons. We have a book that we’re going to enter into our database with tags. We decide just a few tags will be sufficient:
cookery, birds, pigeon, hunting, book
If it’s a digital book, we would attach the file to this ‘record’ now, or we might just point to a shelf location if that’s where it belongs. Note, we have not referred to any taxonomy here, we’ve just put the data into the system, and we’ve now moved onto entering our book on architecture in the 2nd century. No debates, no discussions, no new classifications needed for a quirky book. It’s just been put in the system.
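As a rough sketch of what "just putting it in the system" might look like, assume a simple in-memory store. The names `catalogue`, `tag_index` and `add_item` are hypothetical, as are the title and tags given to the architecture book.

```python
from collections import defaultdict

# Hypothetical in-memory catalogue: item title -> set of tags.
catalogue = {}
# Inverted index: tag -> set of item titles carrying that tag.
tag_index = defaultdict(set)


def add_item(title, tags):
    """Record an item under every tag it carries; no category is chosen."""
    catalogue[title] = set(tags)
    for tag in tags:
        tag_index[tag].add(title)


# The tags for the pigeon book are the ones listed above.
add_item("How to Hunt & Cook Pigeons",
         ["cookery", "birds", "pigeon", "hunting", "book"])
# Title and tags for the architecture book are assumptions for the sketch.
add_item("Architecture in the 2nd Century",
         ["architecture", "history", "book"])
```

Adding an item is one call, and nothing about the second book depends on where the first one went.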
Now comes the important bit: getting the data back out.
Our first custodian stumbles in, scratching his beard, and thinks about doing some shooting for dinner tonight. He walks to the console and types in ‘cookery’ and ‘hunting’ as tags to search for. He gets hits for hundreds of books, and he notices our book on pigeons. He selects it, looks around, and now asks for ‘cookery’ and ‘pigeon’ to swap the classification he’s looking for to see if there are any other useful guides in this library. Vegetarians the world over will be pleased to hear that there aren’t, but when he de-selects ‘cookery’ and has just ‘pigeon’ on screen, he is reminded that this is but a mere mortal beast worthy of his mercy thanks to the billions of pictures of cute pigeons he is exposed to.
Our second custodian is thinking of doing some game hunting this weekend, but has no idea what he might do with his catch. He selects ‘birds’ and ‘hunting’. Again, a selection of titles comes along, including this title he might otherwise have missed.
Our third custodian is an animal-rights protestor. She means business. On pushing our first custodian out of the way, she searches for ‘hunting’ and is bewildered by her choices until she notices in very small letters in the tag cloud ‘pigeon’. Mortified, she then discovers the option for ‘cookery’ and decides to create a list of all books tagged ‘cookery’ and ‘birds’ to include in her letter to the chief librarian.
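The three searches above can be sketched in a few lines. This is an illustration under assumptions, not how any particular library system works: only the pigeon book comes from this post, and the other titles, `CATALOGUE`, `search` and `related_tags` are invented for the example.

```python
from collections import Counter

# Hypothetical catalogue: title -> tags. Only the pigeon book is from the post.
CATALOGUE = {
    "How to Hunt & Cook Pigeons": {"cookery", "birds", "pigeon", "hunting", "book"},
    "Game Cookery for Beginners": {"cookery", "hunting", "book"},
    "Roasting Poultry": {"cookery", "birds", "book"},
    "Pigeons of the City": {"pigeon", "birds", "photo"},
}


def search(*tags):
    """Return the titles that carry every requested tag."""
    wanted = set(tags)
    return sorted(title for title, item_tags in CATALOGUE.items()
                  if wanted <= item_tags)


def related_tags(*tags):
    """Count the other tags found on matching items, as a tag cloud might."""
    wanted = set(tags)
    counts = Counter()
    for title in search(*tags):
        counts.update(CATALOGUE[title] - wanted)
    return counts


print(search("cookery", "hunting"))  # first custodian
print(search("birds", "hunting"))    # second custodian
print(related_tags("hunting"))       # what the third custodian sees in the cloud
print(search("cookery", "birds"))    # her list for the chief librarian
```

Each custodian starts from a different pair of tags and still lands on the same book, which is the whole point.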
The point, you see, is retrieval. We don’t all think in terms of a taxonomy. Creators will create things that don’t fit into a category, and people will be able to take advantage of finding items and their neighbours that don’t belong together in the scheme devised by Dewey (or any other scheme for that matter). This is particularly hard in taxonomies because ‘neighbours’ are really subjective based on the context of the person doing the looking, not the Dewey view that they are objectively determined at the point of classification.
This is particularly useful when you consider that if all content on the Internet is tagged (or maybe metadata ‘keyword’ tagged), we can create search engines that are both powerful and far more user-friendly than any taxonomy.
Google effectively does this for us. Google in fact beat the ‘taxonomy’ that Yahoo and others were pushing before Google arrived through ‘keyword search’ (which is just another way of saying ‘tag search’) – but we don’t realise or recognise it.
The fact we’re now bringing it into the foreground of the application in the Web 2.0 age should be something users rejoice over, rather than reject. Imagine how they would feel however if Google only allowed one word in its search box and if you wanted a particular cookery web page you had to select ‘cookery’ and then browse through all of them until you found the one you wanted.
The downside, of course, is that most current interfaces for tags only allow the selection of one tag at a time. Very few allow us to find intersections of tags. It is no good being able to find all books tagged ‘pigeon’ OR all items tagged ‘cookery’ – we need to find the cross-over. Current web applications have reached the first stage, but it is when they reach the next, and all content on the net has been tagged, that we will truly understand their power.
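The difference between OR and the cross-over is simply the difference between a set union and a set intersection. A minimal sketch, assuming a made-up index (`TAG_INDEX`, `any_of` and `all_of` are hypothetical names, and two of the three titles are invented):

```python
# Hypothetical inverted index: tag -> set of item titles.
TAG_INDEX = {
    "pigeon": {"How to Hunt & Cook Pigeons", "Pigeons of the City"},
    "cookery": {"How to Hunt & Cook Pigeons", "Roasting Poultry"},
}


def any_of(*tags):
    """OR search: items carrying at least one of the tags."""
    return set().union(*(TAG_INDEX.get(tag, set()) for tag in tags))


def all_of(*tags):
    """AND search: items carrying every one of the tags."""
    sets = [TAG_INDEX.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


print(sorted(any_of("pigeon", "cookery")))  # everything about either subject
print(sorted(all_of("pigeon", "cookery")))  # only the cross-over
```

With every extra tag the OR result can only grow and the AND result can only shrink, which is why a one-tag-at-a-time interface leaves the reader to do the narrowing by hand.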
Sunday Link round-up – 10th September 2006
Even Sunday is a work day at Vagueware HQ more often than not. Instead of trying to impart some lengthy wisdom/nonsense though, each Sunday I plan to do a quick round of the blogs and mailing lists and find things that should make the evening before Monday morning a bit more interesting. If you don’t get to this before Monday morning, well, it’s better than work – go make another coffee, and tell the boss you’re catching up on e-mail.
Vlogging the VRML of Web 2.0 – I’m becoming increasingly enamoured with Dead 2.0 even though I don’t agree with their central premise that this is a bubble like the last one. However, I agree that video blogs need to mature to the point of competing with normal TV for people to be prepared to give up their time to them. I can scan read 100 blogs and pick out what I need to be aware of, what I want to come back to later, what I can ignore, all in about 30 minutes a day – I can’t do that with video. I still have a TV in the corner of the room (I owe a business partner £50 for losing a bet by not getting rid of it actually) but I only ever really watch The Daily Show, Newsnight and the odd thing on FilmFour. I don’t think I can add several hours of video to my daily digest of media, even if it does involve attractive women leaning forward in an oh-too-unsubtle-pose – I’m too busy fomenting digital revolution.
Creative Commons making life on-board a warship more tolerable – great story, because it confirms what I’ve always suspected: expose somebody to culture and they’ll ignore the fact reading isn’t meant to be ‘cool’ and gulp it down. Stories like this just help me confirm that if I can help boost the economy around CC/PD content I’ll not only build a successful business, I’ll be helping people express themselves, connect, and do amazing things. I’m still looking for legal help, by the way. Story via the wonderful people at The Open Rights Group
MythTV beats MCE in Review – One of the very first business ideas I had was to build a device that could sit under a TV, connected to a high-speed Internet connection and download video to a hard disk to watch at the user’s pleasure, thereby meaning you had every TV programme, film, radio show, ever, all accessed for little cost in your home. Thing is, I had that idea and a design when I was 14 (that was 1992 when the commercial Internet was months old in the UK) and no money. These convergence boxes are heading down that route, and I’m still thinking about how to plug MythTV into a distribution network to offer something better than the commercial nonsense we’re about to get shoved down our throats. In the meantime if you want a great DVR, I know where you can get the best cards in town at the best prices for the job if you want to build your own. However if you contact the man behind the facade – in exchange for portraits of the Queen signed by Mervyn King, naturally – he may even build you your dream home-media device. He knows his stuff.
RailsConf Europe sold out – in truth, I would have loved to have made this. Rails is my worklife right now, and in months to come I hope to actually contribute to the core code itself via several means. The price however, was just way too high. At £475 on the early-bird for registration, it was just too rich pickings for me. Never mind – maybe I’ll think about what I can present at next year’s and try and get in for free…. ;-) Hope the guys have a blast in London, and I’ll be checking the blogs to see what goes on.
The Ultimate Blog Post – this will be all over the ‘blogosphere’ in a matter of hours and if you’re new to the whole blogging scene, it’s terribly in-joke-ish. But funny all the same.
A-Z of making money from a blog – whilst we’re talking about blogging, here’s the real reason people are loving the blog thing. TechCrunch reportedly makes $60,000 a month in ad revenue and they’re not the only ones – Steve Pavlina has for some time been making ‘five figures a month’ from his blog. Of course it helps if you settle on a topic people are interested in, there is a high CPC on your ads and you can actually write, but I expect this to blossom into a bubble – or maybe even a real sustainable business model – proper in 2007. I’ve run over a dozen blogs in the last 5 years, and I reckon total revenue from AdSense was less than $20, but don’t listen to me. Be inspired, learn how to express yourself and get going.
Why Joel’s Business isn’t like yours – it’s easy when you start a software company – trust me – to think that the way to go is to follow in the path of other successful software companies. Fog Creek, 37signals, whoever. Not the way to do it – you have to find your own voice, your own way of doing it. My way? Don’t hire for as long as possible; don’t hire employees, ask peers to come and have fun with you; cashflow is king; the nicest office is bed; customers matter more than you. Maybe I’m wrong, but it feels right for me right now, so until I’m proven wrong….
Top 10 ways anyone can guarantee an angry workplace – I swear I’ve worked at places that hired people who considered those 10 steps their personal raison d’etre. I fouled a few of them myself in the past.
Shuffling cards one-handed – little known fact about me #9273: I used to be able to do some really, really cool (and dodgy) things with a deck of cards. I sometimes practised for days, weeks, months and read up on books written by old card sharks. This is a simple cut that can get you started, but there’s an easier way to do this my Dad showed me when I was little that shoves the bottom half up straight under the top half using the index finger running alongside the underneath of the top half of the deck. Also, if you want to do this seriously, look at exercises piano players do to stretch their tendons, and consider being kinder to your hands by moving to a Dvorak keyboard. Yes, I know I take this stuff way too seriously. Via lifehacker

