A notch above a monkey

Reading sources

Google Reader was redesigned lately and I’ve been annoyed ever since. I have had the dubious privilege of cutting and changing product features people loved in pursuit of different, higher goals, so I try to be understanding when others do the same. I mostly found clumsy workarounds for the removed features, but I do wish I could at least still trust that the list of unread items actually contains all of them. On a positive note, I can save some electricity now, because copious amounts of “helpfully” white whitespace illuminate this room so brightly that you wouldn’t sit naked in front of your computer even with the lights turned off. That is, if you are the sort of person who likes doing that but stops short of flashing your neighbors.

I still strongly dislike the changes, but I continue using Google Reader, because crack-heads don’t give up dope just because it was cut too thinly. I cherish my list of reading sources and, like a gardener, I have been cultivating it over the years, because I believe these sources make me better informed than I would be if I relied only on links shared by others. This may be elitist, but it is also true.

We are biased when choosing friends and the communities we belong to. At the very least we enjoy life more when surrounded by like-minded people, which is really a lighter shade of groupthink. What we share tells a story as much about who we are as about the thing being shared. Even when we are not self-censoring or trying to project an image, we are still horribly bad at evaluating what influences us and how. Sharing everything, as this idiotic article suggests, doesn’t fix this [1]. It’s still content from the same people, only more of it.

Then there are social news sites, which are in essence news organizations with a bigger editorial board. Their focus might not be the same and their world view may be less obvious (or not) than that of traditional boards, but the end result really isn’t all that different. I don’t dread waking up in a world without Apple as much as I do in one without fish, but it is not articles about all things piscatorial that keep popping up on a regular basis.

This doesn’t make socially filtered news useless, just limited and best suited for finding out what is popular at this moment. It should be a side dish, not the whole diet. Getting some of your information from social sources may improve your diet, but relying only on them is just stupid. I wouldn’t fret so much if I didn’t worry about development trends, the latest Reader changes being one example of them.

Reader had two methods of sharing. The obvious one was the Share button, which was adequately replaced with sharing to Google+ circles. The other one, which was the one I actually relied on, was to create public feeds for articles marked with certain tags. The most important difference is that in the first case you grouped by intended audience and in the second by actual content [2]. Instead of following me, you could just follow my selection on a particular topic, which in most cases would probably be closer to what you want.

By itself, stripping a feature like that doesn’t mean much. However, when I also consider the other changes, such as the aforementioned abundance of whitespace, the removal of “Note to reader” and the new reading-unfriendly theme, it’s easy to come to the conclusion that all roads now lead to Google+. Reader’s role is at best to feed its younger brother with stuff to socialize around.

It would be wrong to attribute these changes simply to competition with Facebook, since they are part of a larger trend towards social curation. I find this trend a normal consequence of a web ecosystem where most product innovation happens in VC-funded startups. How a company is funded has always been part of its DNA, and the economics of today’s VC environment almost demand quick, high growth from companies that will probably be acquired at some point (and let’s be honest, who REALLY believes most news experiments won’t be?). It’s not impossible to achieve this with a sources-based product, but it’s certainly harder and less obvious than creating another twist on social news.

If my first and main point was a personal appeal to also seek insight in your own, personally picked sources, then my second is to question whether shaping the web, and the world with it, should really be left only to industry and academia. It really doesn’t have to be this way.

  1. A browser’s history is a great place to see just how much of what we visit is unimportant, unrepresentative and often unsharable. The small friction necessary for a deliberate act of sharing is actually a feature that forces at least a modicum of reflection on the content’s share-worthiness.
  2. Feeds enable that and are one of the crucial building blocks for what I started to call social software for introverts. It is software which is better when used by many, but is good even when you are its only user. Instapaper would be a perfect example of such an application and Facebook is a counter-example.

My OpenData hackday experience

I attended the OpenData hackday last weekend, where we tried to create a website inspired by theyworkforyou.com. We didn’t quite achieve our goal, but a lot was done by everyone and we had fun. At least I did, and I am certain we will release the first version soon (before the end of October, mark my words).

Not releasing our website still bothered me and I’ve been thinking about it since, contemplating what we did right and what I did wrong. I hope this rumination can be of use to others who may be thinking about organizing or attending such an event. If you have already done so and have some tips to share, or simply disagree with something I said, then please leave a comment at the bottom.

Planning

It is important to finish something. Having something at least a bit useful or fun gives everyone involved a sense of achievement and provides better motivation for further work. To avoid the problem of too many ideas started and none finished, we suggested working only on ideas that had at least 3 volunteers. We also created a wiki where everyone could add their idea and volunteer for projects they liked.

I think that strongly suggesting at least 3 volunteers per project idea had the effect that we ended up with just one idea. That was not our intention, nor necessarily a problem, but it’s worth keeping in mind. We all tend to take the path of least resistance when not really committed to an idea, and joining is easier than finding an idea AND people to help.

However if you do come up with an idea, then here are a few things you might keep in mind:

Self-sustaining projects win. Even with the best intentions you will probably run out of time for your project eventually. So projects that can be finished, or that can be run by a community with few development and administration resources, are less likely to end in failure.

Strip it to bare essentials, but keep a list of things you’d like to add if you had more time or help. I knew that we couldn’t build theyworkforyou in a day, but my planned version was still too grandiose. I should have stopped only when removing anything more would have made it useless. Another reason for this is that I also often overestimate my free time, which matters for projects that will continue after the hackday.

Check if you have the data you need. First check for completeness, since missing parts might significantly affect the feasibility of your project (especially if they are needed for a stripped-down version). If you plan to scrape data from a public source (instead of using an API or something you already have), then scrape it well before the event. The site may not be available when you need it or, as was our experience, it can be frustratingly slow. Perception of speed is relative, but downloads can always go slower than even the pessimist in you would make you believe. I learned on hackday that I had been planning to use data that did not exist.
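One way to follow this advice is to keep a local copy of every page you download, so that on the day itself nothing depends on the source being up or fast. What follows is only a minimal sketch in Python, not what we used; the names fetch_live and fetch_cached and the cache layout are made up for illustration.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def fetch_live(url: str) -> bytes:
    """Download a page from the network (the slow, unreliable part)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_cached(url: str, cache_dir, fetch=fetch_live) -> bytes:
    """Return the page body, downloading it only if no local copy exists.

    Hypothetical helper: pages are stored under cache_dir in files named
    after the SHA-256 hash of their URL.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if path.exists():
        # Already scraped before the event: no network access needed.
        return path.read_bytes()
    data = fetch(url)
    path.write_bytes(data)
    return data
```

Run it over the full list of URLs a few days earlier and bring the cache directory with you; parsing can then be rerun as often as needed without touching the original site.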

Check the quality of your data, don’t only glance at it. Spend some time getting to know it and think about how you will clean it up. I knew our data was in an atrocious state, but I was still wildly optimistic. Luckily Tomaž is a seasoned data wrangler who can deal with crap input.

Have a detailed TODO list. Not only will it give you a better picture of the scope, it also makes it easier to divide work among unexpected volunteers. Mark what is necessary and what would be nice to have. Also pay attention to the skills needed for finishing each task. Gašper put a lot of effort into our to-do list and I didn’t. We could certainly have achieved more if it had been clearer what needed to be done and if I had delegated better.

Have a “roadmap”. Not every task can be done in parallel, so identify and mark task dependencies. It may also be quicker if multiple developers extract different information from the same piece of data than for everyone to wait on one person to extract it all.
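Marked dependencies can also be turned into a work plan mechanically. Here is a small sketch using graphlib from Python’s standard library (3.9 or later); the task names are invented for illustration and are not our actual to-do list.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "scrape data": set(),
    "set up hosting": set(),
    "clean data": {"scrape data"},
    "build website": {"clean data"},
    "deploy": {"build website", "set up hosting"},
}


def plan_rounds(deps):
    """Group tasks into rounds; tasks in one round can be done in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    rounds = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        rounds.append(ready)
        sorter.done(*ready)  # pretend the whole round has been finished
    return rounds


for number, tasks in enumerate(plan_rounds(dependencies), start=1):
    print(number, ", ".join(tasks))
```

Tasks that land in the first round are the ones to hand out as soon as people arrive, and anything that blocks many later rounds is a showstopper worth giving to someone who will stay all day.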

Prepare hosting beforehand. If you plan to host a website, then set up and test at least the essential parts before the event. The day passes too quickly even if you don’t waste time setting up the environment.

Ditto for the development environment. How many people do you think will or can work on the project(s)? Are they all familiar with the tools you intend to use? Do you have adequate resources?

I expected time constraints to prevent us from auditing the code for exploits, so we opted for Bitbucket, which gives you private repositories for free. But we had to work around its limit of 5 developers per repository and its choice of Mercurial as DVCS. Having the repositories private turned out to be unimportant.

So pick tools most are comfortable with and set up your own server (before the event of course) if publicly available options might not be good enough. You want everyone to start contributing as soon and as easily as possible.

Running event

Our hackday was as open as it gets. We used Eventbrite for registration, but didn’t enforce it so everyone who came was welcome no matter how long they stayed. It’s fantastic when people show up offering free help and it would be plain crazy to turn them away. There is always something they can do no matter what their skill set.

Here are a few tips I wish I had followed:

Lead. Self-organization is great, but the likelihood of misunderstandings and duplication of effort quickly increases with the size of a group. If you have dependencies in your project roadmap, then it’s also important that showstopper tasks are done before those that rely on them. That’s why someone usually has to lead development, and if a project doesn’t have that someone, you either have to find that person or be that person.

Talk to everyone. Find out what they are doing or what they would like to do. Find out also how long they intend to stay, since that might limit the choice of suitable tasks. Talking is pretty much the only reliable way to find these things out.

Even better might be to have short group meetings throughout the day for everyone to catch up, and I regret I didn’t think of this then. This, however, doesn’t absolve you from talking, especially if contributors are free to come and go whenever they please, as they might not have been around the last time you were catching up.

Delegate. As much as possible to as many as possible. Don’t play the hero trying to do too much. You can always pick another task later, or help someone with theirs, if you finish too soon.

In fact it’s probably better to help less experienced members with their tasks than to carry on with whatever you are doing. Having a healthy contributing community is what makes or breaks voluntary projects, so nurturing one is even more important than finishing the first version. This goes even more so for projects that can’t ever be really finished.

Morning after

Hackday is over and everyone has gone for a beer. Hopefully everyone is happy with the results, but ambitious projects almost by definition can’t be finished in a day. The next morning it was amazing to see people continue hacking where they had left off the day before.

Not every hacking community is as engaged as ours. However, we all have more or less the same needs. It’s nice to hear our work was appreciated and even better that it mattered. We like to help if we are not alone and if it is clear how. That is why I found the email Gašper sent a few days later a pitch-perfect post-hackday message. He first thanked everyone involved, then listed what each of us and all of us together had achieved, and ended with what he intends to do next.

So, this is it. A long, but not exhaustive run through my experience of our last hackday which also took way longer to write than I expected. It’s time to go back to hacking!

Missing refer(r)ers

I was talking to my wife today about why an “Unknown Source” is becoming the most common source of visits on Flickr. My guess is that it is almost certainly because the Referer header field is missing from page requests and Flickr relies only on the presence of this piece of data to discern sources. This might have made sense when browsers ruled the web, but we no longer live in that time.

Simply put, a referrer field can be added to a request only when the request was triggered from a source which itself has a URI. Every web page has one, since its address is also a URI, but most programs don’t.

That’s why referrers are missing not only from visitors using paranoid browsers or security software that strips them out of requests, but also when a page is visited from a link embedded in an email or instant message.

Requests without referrers used to represent a small enough subset of visits that most of us didn’t care. I first noticed this changing when Twitter clients became popular, and now there are tons of apps behaving like this. With XMLHttpRequest Level 2 even web developers will get a choice of making anonymous requests which won’t include potentially private data like referrers.

That’s nice. I have certainly visited places which I wouldn’t want to share with strangers. But I suspect this is not the common case and most of the time we don’t mind revealing which page or application sent us there.

Nobody can precisely predict the future, so specifications can never be perfect. But I am always amazed, when reading core web specs like HTTP’s, how insightful their authors were. That most programs don’t have a URI does not mean that they couldn’t. Practically anything can and definitely more things should.

In principle it is possible for each program to have its own address, which could be used as the referrer when a better option doesn’t exist. I suspect this would be against the spirit of the specification, but likely not against its text.

However, better options often do exist. Let’s take Twitter as a popular example. Every link that triggers a visit from Twitter was included in at least one tweet and every tweet is also published on the web. It might not be accessible to everyone, but it does have an address. I see no reason why that address could not be used as the referrer by Twitter clients.

It is the originating content or its address that interests me, but really, almost anything truthful beats an unknown source. Just knowing the name of the service (like Twitter) or the app used (e.g. Tweetdeck) would be better than nothing. By the way, there is another header field that could provide such insight: User-Agent. Sadly it is notoriously unreliable and often missing as well, but that’s another long story.
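To make the fallback concrete, here is a rough sketch in Python of how a stats page could label a visit: use Referer when it is present, settle for the first product token of User-Agent when it is not, and only then give up. I have no idea how Flickr actually does it; the function name describe_source and the labels it returns are my own invention.

```python
from urllib.parse import urlsplit


def describe_source(headers: dict) -> str:
    """Label where a visit came from, based only on its request headers."""
    # Header field names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    referer = normalized.get("referer", "")
    if referer:
        host = urlsplit(referer).hostname
        if host:
            return f"referrer: {host}"

    # No usable referrer: the client's self-reported name is better than
    # nothing, even though it can be missing or simply untrue.
    user_agent = normalized.get("user-agent", "")
    if user_agent:
        product = user_agent.split()[0].split("/")[0]
        return f"client: {product}"

    return "unknown source"
```

A request carrying a Referer that points at a tweet would be reported under twitter.com, while one from a desktop client that sends only a User-Agent would at least be attributed to that client by name.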

So, to come back to Flickr: I can’t really tell how much Flickr can or could know. I suspect more than it tells, but I would be astonished if they were not mostly in the dark too. The web was largely built by people too busy (and often too lazy) not to cut corners. Meaning: learning and doing as little as possible to get something to work, and sometimes we pay the price for our ignorance. But it’s a small price compared to waiting for Xanadu.