Kagtum & Rails Rumble

September 10th, 2007

This weekend just gone, I attempted to compete in Railsrumble 2007 with an application I call ‘Kagtum’. The idea of Rails Rumble is to take 48 hours to build a Rails app from scratch in a competitive scenario. At around 7:30pm BST last night I realised I wasn’t going to finish the app and so I asked to be withdrawn, as I explained:

Maybe it was the fact I decided to try and compete on my own, and therefore didn’t have the advantage of a team. Maybe it was the fact it took half the first day just to get a working application stack up on linode. Maybe it was the Saturday afternoon spent in the company of friends rather than coding. Maybe the idea was too ambitious.

Maybe I just wasn’t good enough.

Whatever it was, I didn’t get enough of my idea implemented on the weekend of the 8th and 9th September 2007, that I wanted other people to look at it. I failed. There is no app here.

The ideas I played with in those 48 hours though, intrigue me, and they will be worked on over the coming weeks and months. The end goal is going to either be very interesting, or an exercise in futility. If you want to find out which, you can keep an eye on the blog and I’ll be making announcements there.

I will be judging, and I look forward to seeing other apps, so good luck. Until next year…

The first day did kill me - linode was under heavy load (not surprisingly, with over 100 teams trying to get their application stacks set up) and the guidance we had been given by way of a screencast was inaccurate in places. Top tip: when you’re in a hurry leave the rdoc behind and always pass “-d” to gem install.

Anyway, I still hope to judge - and I’d advise anybody with an interest in innovation to look out for the announcement that you can sign up for judging and take a look at the apps that were finished - but I thought I’d talk about Kagtum a little bit here, because the core is almost done and I’m confident I can get a working app out of the door soon. I’m also tempted to open source it.

It all started about 2 years ago when I was left distinctly underwhelmed by Wikinews. The problems with wikinews are many and pretty obvious to anybody who spends a few minutes digging.

The primary problem to my mind is that they’re using a piece of software designed to build an encyclopedia to build a news website which means all articles are given equal footing. It seems reasonable that they should be given equal footing, until you realise that unlike an encyclopedia, not all news items are equal. A world-famous opera singer dying is not equal to a drunken brawl in my local town centre, and neither are equal to the Iraqi PM losing the confidence of the Iraqi citizenship.

However, the core idea - news written by, and for, everybody is a great idea. I’ve spent the last two years playing with lots of ideas in my head and watched emerging developments in the online news and journalism scene before I came up with an answer: quite simply it comes down to targeting relevance.

If I am in Manchester UK, there are stories that are local to me I’m interested in that somebody in New York doesn’t want to see. Likewise, there are stories happening on the other side of the planet which are important to me because they have an impact on me, or because they are in an area I have an interest in. The “perfect” news website will know this, and present just the articles I need to see. Ideally, I also don’t want to be bogged down with partisan and opinionated pieces - I want impartial, simple, Economist-news-page-style articles that give me the leader and then show me what is out there being written about it.

Thus was born the concept of Kagtum - the phrase “kag tum” means “to bring news” in Sumerian, the script of which is the oldest written language currently known to mankind. Kagtum will be a wiki news site that helps target articles based on relevance to you and your life. Relevance is everything.

The idea is quite simple, but the algorithm needs some polish before I can roll it out: we create a news story perhaps based on a report in MSM, or perhaps as a first-hand eyewitness account, that points to online sources if available. We then attach to that story some “impact profiles” based on location or a tags.

For example, a story happening on my street (say, planning permission for a new development) would have a geocoded location and an impact radius of the local neighbourhood. A story happening in 10 Downing Street would also be tagged with that location but could have an impact on the whole of the UK. Suppose the latter story was a policy announcement on Iraq - we’d add Iraq as a location impact as well.

I then login and give my location as my postcode or street, which is geocoded, and the software knows that the story in my street is relevant to me, so is the story in Downing Street. It knows therefore, what is relevant to each and every user and displays the appropriate stories.

Let’s suppose however I have no interest in Iraq. We can tag stories and users can also add tags to their profile that they’re very interested in or very disinterested in. If I said I wanted all stories marked “Iraq” to be pushed down the queue then its relevance to me would be lowered - it might still appear, just not as prominently.

In theory then, when I log in to kagtum, I would see stories about technology, politics and cricket, particularly with stories about my local neighbourhood (stories about technology in my neighbourhood would be even more prominent), whilst my friend who doesn’t care about anything but beer and football will see something perfectly tailored for his interests.

It may also be the case that there are multiple profiles for each user (home vs. work) and that a user can add multiple locations - where they live, where they work, where they grew up, where their parents live - and sees a mixture of stories about places they care about.

The biggest problem I had this weekend was developing the specifics of knowing which stories to show to each user. The problem isn’t hard algorithmically, but providing a technique that doesn’t harm performance and can scale to more than a few hundred users online at a time is proving a little tricky using standard ActiveRecord associations and using the methods baked into GeoKit by default.

There is also an issue of what we mean by “radius”. Saying “this story is important to everybody within 5 miles” is simple enough, but what if I say “to everybody within Greater Manchester”? I somehow need to know if a given longitude and latitude is within that district or not. The Radii Problem (as I came to call it whilst muttering to myself) is important and it’s difficult. I discovered it as I added a story in Washington that was important for the whole of the US - if I added a simple radius of 3,500 miles (to take in California) it of course also covered a huge chunk of Canada, Middle America, the Caribbean, the whole of the North Atlantic (including Ireland!) and most of the North Pole. For a story about domestic US politics, this is obviously needlessly “grabby”.

I have ideas on how to solve this problem, but they’re going to take a few weeks of playing with datasets from the UN and other agencies to be able to get them working smoothly.

There are other aspects I have planned for the site around developing narratives and helping individuals become kagtum journalists, but I’ll keep discussion of those for after the roll-out of both kagtum and the new vagueware.com.

I’ve turned comments on for this article, so if you have thoughts, ideas or suggestions, please leave them.