Building a reverse HTTP proxy in Node.js (and what I learned from Node Knockout)

Posted by sh1mmer on Aug 31, 2010 in General, Node.js, Web Technology

Help 3>2>1 win Node.js KO!

I’m fortunate enough to live in San Francisco, which meant that when Node Knockout rolled around I spent it hacking in Joyent’s HQ on the 20th floor of a downtown building.

I’ve been at Yahoo! for a while and I’m used to hacking in 24 hour competitions (hack days) but this time we had 48 hours. It turns out that makes a massive difference in both good and bad ways. Firstly, in a 24 hour project the obvious way to get that little extra boost is to code all night. This is not possible in a 48 hour contest, at least if you want to write code that works. However, because we all planned to go somewhere to sleep, I don’t think our team gelled as well as some 24 hour teams do; there wasn’t the same kind of urgency throughout. This really manifested itself in how the team interacted with the outside world. It’s much harder to shut out the world for a weekend than it is for 24 hours. The two of us in relationships left hacking for hours at a time to see our significant others. While this isn’t a bad thing in general, I don’t think it helped us hit a fully feature-complete product.

I am, however, really happy with what we built. We aimed for something unreasonably ambitious and I feel like we managed only to achieve something damn good. Which is fine by me.

What we built was a reverse HTTP proxy; an HTTP router. What’s different about our project is the real-time reporting and the ability to create routes and access rules on the fly in JavaScript. Let me break each of those down a little bit more.

When you visit a route, for example demo2.ko-3-2-1.no.de, the proxy will round-robin your connection to all of the upstream hosts associated with that route. In the case of demo2 that means sports.yahoo.com and yahoo.com. In the simplest case you now have an HTTP load balancer. Since Node.js is significantly faster than most web servers it would be feasible for many people to simply insert our proxy in front of their existing infrastructure.
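
The round-robin step can be sketched in a few lines. This is a minimal illustration, not the code we shipped: the `routes` table and the `createRoundRobin` helper are hypothetical names, and only the demo2 hosts come from the example above.

```javascript
// Hypothetical route table: each route maps to its list of upstream hosts.
const routes = {
  'demo2.ko-3-2-1.no.de': ['sports.yahoo.com', 'yahoo.com'],
};

// Returns a function that hands out the upstream hosts in turn,
// wrapping back to the first host once the list is exhausted.
function createRoundRobin(upstreams) {
  let index = 0;
  return function next() {
    const host = upstreams[index];
    index = (index + 1) % upstreams.length;
    return host;
  };
}

const pickUpstream = createRoundRobin(routes['demo2.ko-3-2-1.no.de']);
console.log(pickUpstream()); // sports.yahoo.com
console.log(pickUpstream()); // yahoo.com
console.log(pickUpstream()); // sports.yahoo.com
```

Each incoming connection would call `pickUpstream()` once and forward the request to whichever host comes back.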

Node Stats

What would this get them other than load balancing? One of the other features we added was a rich API including a web-socket which streams the connection data. This can be seen on the admin interface we mocked up to demonstrate what the proxy is doing. An HTML5 canvas based graph is connected to the web-socket and streams the events as they happen. We also load the details of each request into a YUI3 datatable so you can see exactly what is being requested.

I also made a screencast showing the interface in action.

The final and least obvious feature (because we didn’t have time to write an interface for it) is the ability to add dynamic rules via the API. This is a really powerful and useful feature. Let me give an example:


This function, when attached to the clientReq event in the router, will “hard block” any requests from IP addresses in the array. An interesting characteristic is that, because this is attached to the event which accepts the initial request from the client, we can kill the connection off before we even connect to the upstream host. Not only do we not respond with an HTTP response (we simply terminate the TCP connection), we also tell the router not to process any more rules by calling the functions router.stopPropergation(event) and router.preventDefault(event).
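
For a feel of how such a rule fits together, here is a rough sketch. It is not the proxy's actual code: `createRouter` is a stand-in written just for this illustration, the event fields (`remoteAddress`, `socket`) are assumptions, and the IP addresses are placeholders from the documentation ranges. Only the clientReq event name and the two router function names come from the description above.

```javascript
// Stand-in router, for illustration only; the real proxy's API may differ.
function createRouter() {
  const rules = { clientReq: [] };
  const router = {
    on(eventName, rule) {
      rules[eventName].push(rule);
    },
    // Function names are spelled as they appear in the post.
    stopPropergation(event) {
      event.stopped = true;
    },
    preventDefault(event) {
      event.defaultPrevented = true;
    },
    // Runs each rule in order until one asks the router to stop.
    dispatch(eventName, event) {
      for (const rule of rules[eventName]) {
        rule(event, router);
        if (event.stopped) break;
      }
      return event;
    },
  };
  return router;
}

// Placeholder addresses, not real offenders.
const blockedIps = ['203.0.113.7', '198.51.100.23'];

function hardBlock(event, router) {
  if (blockedIps.includes(event.remoteAddress)) {
    event.socket.destroy();          // drop the TCP connection, no HTTP response
    router.stopPropergation(event);  // skip any remaining rules
    router.preventDefault(event);    // never contact the upstream host
  }
}

const router = createRouter();
router.on('clientReq', hardBlock);
```

The important part is the order of events: the rule runs before any upstream connection exists, so a blocked client costs almost nothing.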

It is also easy to implement a soft block which actually returns an HTTP 503 status instead of closing the TCP stream, or to use more complex lookup rules than searching an array. This is definitely a part of the admin interface we will be expanding.
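
A soft block might look something like the sketch below. It assumes the rule is handed Node's usual request and response objects, which may not match how the proxy wires things up; `softBlock` and `softBlocked` are hypothetical names, and the address is a placeholder.

```javascript
// Hypothetical block list; a Set gives constant-time lookups,
// which scales better than scanning an array.
const softBlocked = new Set(['203.0.113.7']);

// Answers a blocked client with a 503 instead of closing the socket.
// Returns true when the request was handled here.
function softBlock(req, res) {
  if (!softBlocked.has(req.socket.remoteAddress)) {
    return false;
  }
  res.writeHead(503, { 'Content-Type': 'text/plain' });
  res.end('Service Unavailable\n');
  return true;
}
```

Because the client gets a proper HTTP response, a well-behaved caller can tell it was refused rather than guessing at a network failure.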

And that brings me to my final point about competing. 48 hours is big enough that you can bite off something challenging, and fail. Not in the sense that we didn’t build something great, I think we did, but it’s not complete. I have a list of things I’m chomping at the bit to add to this. By challenging ourselves with a tough project we’ve set ourselves up for success in the future (and hopefully in the project) because this is a very useful system that I now have some momentum to really build into something that will be used every day.

Don’t just take my word for it, find an excuse to challenge yourself too.


Amazon should use the Kindle to crowd source proof reading

Posted by sh1mmer on Dec 31, 2009 in General

The title says it all, but let me expand. The problem with crowd sourcing stuff is generally finding enough interested people and giving them an easy way to contribute to the larger problem. It’s unfortunate that unless the way to contribute is really easy people tend not to be that interested after all.

The Kindle is, at least to me, a bloody excellent device for reading books. However, I get really annoyed when the books I’ve just bought have some really glaring typos in them. I assume these mistakes stem from some shitty OCR the publishers paid someone (in the cheapest outsourcing market they could find) to do for them because they couldn’t possibly be expected to generate e-books in any vaguely modern way, like using computers and such.

An example of this is the Kindle version of Crooked Little Vein by Warren Ellis. Chapter 24 starts with the line:

Aob picked us up outside our hotel, wearing his Same Old Bob face, not a hint of his earlier breakdown.

What really irks me is that the A of “Aob” is a drop cap. Yes, that’s correct. However the hell HarperCollins transcribed this they managed to keep the correct capitalisation of his Same Old Bob face, and entirely fuck up the incredibly obvious drop cap at the beginning of the line. I assume, after having read a few, that publishers are happy to take my $9.99 for Kindle books, but not to pay someone to read them even one fucking time.

Anyway to get back to the point. I’d love to be able to annotate these fuck-ups with the handy keyboard on the Kindle. Since most of them are nothing more than simple typographical or scanning errors (A instead of B) it would be easy to show the most common ones to the publishers in a way that they could keep the Kindle books up-to-date with the latest fixes. I’d love to see Amazon, or Barnes and Noble with the Nook, push the technology a bit further and actually allow me to improve the books I read for the next guy, or the next time.


Your Users’ Mental Models

Posted by sh1mmer on Dec 15, 2009 in General

This is a reprint of a short essay I wrote for the excellent Designing Social Interfaces (on Amazon) by Christian Crumlish and Erin Malone.

One of the things I like about computers is their ability to create magic. They provide abilities that no-one thought possible and make them a reality. Yet, for many people this is also the biggest source of complaint about computers.

When you drive a car you probably don’t understand the thermodynamics of expanding chains of combusting hydrocarbons happening under the hood. Perhaps you understand the concept that gas expands in the engine block, pushing pistons in sequence which make the car go. But, even if you don’t, you can still understand that there is a direct correlation between the accelerator and the car moving forward. Of course, most interfaces are not quite this simple, even in cars. If the car won’t move, you assess what might have happened. And lo, you’ve left the parking brake on! With this error dealt with you are free to go about your driving.

Obviously I’m not going to ask you to model your user interfaces after cars. However, it is interesting that while cars contain significant amounts of complexity (complexity you and I almost certainly don’t fully understand) we can still functionally use them and recover the situation when things go wrong. This is because the sequence of events which makes the car work has formed a mental model in our heads. The car goes forward only when: it contains fuel, the engine is on, you are not applying the brakes, and you are pressing the accelerator. Since we have this model of how the car works we are able to troubleshoot when it doesn’t behave as we expected.

What is significant about the models we create is how functional they are. They aren’t based on the combustion of hydrocarbons or lateral torque. Heck, if there is serious engine trouble that is still a black box to me, but I know I can call AAA to tow me to a garage. And this, dear friends, is the crux of it: you need to design interfaces that let people recover from their mistakes. The problem you face as the designers of magical boxes rather than cars, however, is that users do not have the same robust mental model of computers that they have for cars. When things go wrong, and they certainly will, your users are lost in a sea of uncertainty.

So how do we solve this dilly of a pickle? Let’s start with what we know. Users must have a mental model of computers, otherwise they wouldn’t be able to use them at all. However, the scope of this mental model covers, perhaps, user interface widgets and probably some landmark- or list-based navigation. The problem, the thing that makes computers different from cars, is that computers interact differently based on context or conditions outside of our control. Much of this context may not be understood by the user, or may have never been explained. Cars are pretty old technology; children learn about them in school. By the time we first learn to drive a car, we are expected to have a basic understanding of how it works, however generalized that model is. The same is not true for computers. Computer users are often actively discouraged from learning the underlying principles of what they are doing, and told to focus on the specifics of the interface.

A great example of how this leads to the breakdown of users’ mental models is interaction with the Web. The Web is probably one of the least benign environments for a user on their computer and yet it is arguably the most successful computing platform. Using the Web there are numerous contextual or circumstantial errors that can occur; however, the majority of users have no mental model with which to understand and recover from them. We looked at 4 possible causes of the gas pedal not accelerating a car, and yet a web page failing to load can have upwards of a dozen causes. Since users lack a mental model the best plan of action is to try and self-diagnose the error and educate the user. The distinction is important.

While it may seem sufficient to tell the user something went wrong and what they can do next, they are going to get into the same state again with the same confusion. Instead if there was a problem with the DNS tell them so, and help them understand what DNS is. Maybe you have to use an analogy of a phone book for web site numbers for their computer to dial, maybe you can convey the information more straight up. However you do it, don’t just let your users keep failing and becoming frustrated, give them a mental model that will last them a lifetime as a satisfied customer.
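
As a rough sketch of what “diagnose and educate” could look like in code, the snippet below maps a few low-level network error codes to explanations that teach the underlying concept. The codes are the system error codes Node.js reports for failed connections; the `explainFailure` helper and all of the wording are illustrative assumptions, not taken from any real product.

```javascript
// Each entry pairs a diagnosis with a short piece of teaching.
const explanations = new Map([
  ['ENOTFOUND',
    'We could not look up this site’s address. Sites are found through DNS, ' +
    'which works like a phone book that turns names into numbers your computer can dial.'],
  ['ECONNREFUSED',
    'We found the site’s address, but the computer at that address refused the connection. ' +
    'The site itself is probably switched off or misconfigured.'],
  ['ETIMEDOUT',
    'We found the site’s address, but nothing answered in time. ' +
    'Either the site or the network between you and it is overloaded or down.'],
]);

// Falls back to a generic message only when no diagnosis is available.
function explainFailure(code) {
  return explanations.get(code) ||
    'Something went wrong that we could not diagnose. Please try again in a moment.';
}

console.log(explainFailure('ENOTFOUND'));
```

The generic message is the last resort rather than the default, which is the whole point: a user who has been told what DNS is once will recognise the problem the next time.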

Image courtesy of mvallius on Flickr.


iPhone world data congress

Posted by sh1mmer on Nov 6, 2009 in General

I’m in Berlin right now, and I’m really, really missing the always-on Internet I enjoy in the US. Normally in the US I have a 3G card and my iPhone with me, both of which have essentially unlimited data plans.

Now I get that this isn’t necessarily achievable in many countries. Not everywhere has unlimited cheap data; however, what I would appreciate is not paying $20/MB. This is so expensive it’s essentially pointless.

How about this for a plan? When I’m roaming in Germany I can pay the German iPhone plan’s rates for data per diem. So over here they pay €50/mo for their iPhone plan which includes 300 MB of data. Let’s assume €30/mo of that is the data plan. So when I’m visiting Germany I would pay €1/day to get 10 MB of data access.
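
The arithmetic is just a pro-rata split of the monthly plan. A quick sketch, assuming a 30-day month (the `perDiem` helper is made up for illustration):

```javascript
// Splits a monthly data plan into an equivalent daily price and allowance.
function perDiem(monthlyPrice, monthlyAllowanceMb, daysInMonth = 30) {
  return {
    pricePerDay: monthlyPrice / daysInMonth,
    mbPerDay: monthlyAllowanceMb / daysInMonth,
  };
}

// German plan: assume EUR 30 of the monthly fee buys 300 MB of data.
console.log(perDiem(30, 300)); // { pricePerDay: 1, mbPerDay: 10 }
```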

The money should be kept by T-Mobile who do the iPhone in Germany. On the other hand when my friends visit the US they should get unlimited data for $1/day (unlimited data plan in US is $30/mo). Apple should really start enforcing this amongst their exclusive carriers since without data the iPhone is essentially crippled. I’m also not suggesting this should affect voice calls.

Voice calls are still getting routed via the US, but that shouldn’t be true of internet so I don’t see the big deal.


Direct TV remote control code for Toshiba Regza

Posted by sh1mmer on Jan 21, 2009 in General

It’s a small thing but maybe it’ll help someone. After fighting with my DirectTV remote for a while I finally reprogrammed it. The right code for my 40″ Toshiba Regza was 11256. You can program the remote using this handy guide from Yahoo! Answers I found.


Ideas are hyper-inflated

Posted by sh1mmer on Jan 18, 2009 in General

Illiad posted a cartoon today that really solidified a lot of the character of my favourite tech news site Hacker News.

User Friendly Cartoon

While he is being pretty flippant (as usual) I think this analogy is actually an excellent one. Comparing ideas to Z$ fits the notion that the intrinsic value of ideas is very much subject to a half-life. Like currency in hyperinflation, the value of an idea quickly decreases the longer it goes without being spent. In the case of ideas, being “spent” is of course being turned into an implementation.

In the developed world this cycle is somewhat staunched by the patent system, which allows a company or individual to stake a monopoly on a particularly valuable idea (or more specifically a suggested implementation of it) and delay the decay of its value. However, this system has taken on something of a cold-war-esque feel as more companies choose not to implement but only to enforce their patent rights (often acquired).

In contrast, companies like FedEx have been openly sharing their business processes with university students. They see their ability to implement their processes, operating knowledge and attitude as the drivers of their success. As such they have no fear of opening up their business to others, knowing it’s their implementation that makes them successful rather than just their raw ideas.

Essentially what it boils down to is the Hacker News attitude: build stuff, don’t just talk about it.


A Haven for User Data

Posted by sh1mmer on Dec 20, 2008 in Geek Culture, Web Technology

Last month my friend Suw twittered something that surprised me a little.
what is it sites closing at the moment? IWantSandy, then Pownce and now Ficlets
Obviously, I knew about Pownce and I remember reading about Twitter buying Values of N, so I get why they shut down I Want Sandy.

I headed over to GetSatisfaction to look at IWantSandy’s product page. What a wash of anger and sadness. The closure topic has hundreds of replies and 95 :( faces and 17 :) faces most of which were Rael’s and he doesn’t really count. Rael has made an export available for I Want Sandy but it hasn’t been up there for too long. In fact, it closes today. Something that a lot of people objected to.

This isn’t totally dissimilar to the way we closed Yahoo Photos (not that I was involved in that project). I think it’s sad that something can close in the space of a month and after that period user data is lost. The idea occurred to me that while archive.org is great for public data it sucks ass for private data. The idea of the data haven was born.

Imagine this. You sign up for the newest shiniest start-up service. When you sign up you have the option to guarantee your data will be preserved by somedatahaven.org. If that service goes belly up they can pass your data and login credentials to somedatahaven.org, who will allow you to log in and export your data. If an independent organisation can take on the role of guaranteeing the data availability of a number of services that you sign up to then it’ll be a huge step forward for data portability. This would be especially true if the data could be syndicated as easily transformable open standards to be accepted by other services.

So, I want to build this service. However, I’m a busy man. I might build it anyway, because I’m an engineer with twitchy coding fingers, but I’d really like half a dozen or so people that would want to sign up to such a service so I can work with some real customers and support their needs while building. If you are interested email me.

So as I Want Sandy shuts down for good today, I hope we can create a better solution for the future.


Explaining Flow to non-geeks

Posted by sh1mmer on Dec 4, 2008 in General

So I ended up having a little tiff with my wonderful wife Rosemarie this evening. I love her but I was a bit grumpy because I was trying to get some coding done. I don’t like working late, but sometimes needs must. I realised the main problem was that she just didn’t understand why the little distractions matter. To her I was being a princess, and a drama queen. I can understand how my slight peevishness came off really badly when you don’t have an understanding of Flow. Without understanding what Flow is it would be easy to think I was snapping at her.

So, honey, I love you and this is to try and explain.

Book Pages by dwyman on Flickr

Imagine reading a book. It should be a real page turner, something you are completely glued to. You know the characters, you feel their pain. Then right at the crux of the story, something interrupts you. It’s a little annoying, but it’s ok. You fluff your cushion, stretch your legs and start to read again.
But no! Something interrupts you again. That’s ok. I’m just going to ignore it. But it won’t be ignored. You read the same sentence you just read for the third time. Now frustration moves to annoyance.

This is flow: the concept is that you need a certain amount of time to be able to do an activity well. Sure, you can read at the drop of a hat, but it takes a minute or so to start understanding what you are reading. If you want that total immersion of a good book it takes a bit longer, but boy, the feeling is better.

Programming is hard. It’s harder than reading. In fact programming is harder than most things I know because it’s basically continuous problem solving. So programming takes longer to get into than reading, even deep reading. Some studies have shown it can take 30 minutes to achieve Flow in programming. So for a programmer the experience of small distractions can be that much worse than for a reader.

Boulders by cherieb on Flickr

To put it visually, imagine an easy task (such as reading) is like a pebble and a hard task (such as programming) is like a boulder. It’s easy to push the pebble down a hill; it keeps rolling until something stops it. But it’s pretty easy to get started again, because hey, it’s only a pebble. Programming is like a boulder: you sit and lever the boulder with a stick for 20 minutes to get it moving much at all. The last thing you want is something that stops it part way down. Sure, it’s a little easier to get it going again half way down the hill, but it sure would have been nice if it hadn’t stopped at all.

So while I shouldn’t have been a jerk, I hope this explains a little bit why I was and maybe next time I’m working a little late I’ll find some space where my crotchety programmer angst isn’t going to make you feel bad.


Writing about REST; An anecdote

Posted by sh1mmer on Nov 29, 2008 in General

So I’ve finally knuckled down after a lot of planning and started writing the book I’ve been meaning to write about RESTful Web Services. This is still a really interesting topic and I think as things shift to the cloud it will only get more so.

Obviously if you are going to talk about REST you have to talk about Roy Fielding who coined the term REST in his PhD thesis. While I was at ApacheCon this year I had a chance to catch up with Roy. He was a little bit reticent about another person writing a REST book. He has been, understandably, frustrated with the widespread misuse of the term. Following his recent blogging has been insightful, and often amusing. Roy is not prone to pandering to people who don’t take the time to properly understand his work, although he was nice enough to discuss the topic in great detail in the blog comments.

Roy is something of a hero of mine; with his work on URIs, HTTP and his thesis he’s set the scene for a great deal of Web technology. So, time for my anecdote, such as it is. At ApacheCon one of the sponsors had brought some arcade machines including a Star Wars Arcade machine. Throughout the conference people would stop by and hammer out a few free games. I played a bit but never made the high scores table. There was a very good reason for this: the high scores table started at a million points (way past my ken). The interesting bit though is that pretty much every name on the high scores table was “Roy”. Looks like Dr Fielding has other talents than network systems architecture.

Back to the book, I’m so glad I’m around so many people that inspire me. Right now I’ve been talking a lot with Seth Fitzsimmons who works at Yahoo’s Brickhouse. There is some awesome stuff Seth’s been working on that will be around soon. It’s really great working with someone who is such a trooper for churning stuff out. I’ve also really enjoyed my occasional discussions with Jeff Lindsay who’s been talking a lot about “web hooks”. I’m not sure I’m convinced yet, but Jeff talks a good game and knows his stuff. It’s definitely good brain food. So look out for some sample chapters soon. If you are interested please email me.


What is the Open in an Open Standard or Specification?

Posted by sh1mmer on Nov 19, 2008 in General

Open is the buzzword of the moment. It was interesting to talk about “Open standards” at ApacheCon ’08. I sat down with Whurley, David Recordon and the W3C’s Ted Guild to discuss what we could do to improve a number of issues.

One of the things that really struck me is that, like most conversations about “Open”, it means different things to different people. Across the various standards bodies the term is used to mean drastically different things. In order to reset expectations I’d like to see a common set of terms we use to talk about this stuff. Interestingly, David said that’s why the Open Web Foundation (OWF) chose to create “Specifications” rather than “Standards”. By avoiding the existing term they escape the implication that their specifications are mandatory or industrially definitive.

Looking at the way standards/specification organisations use the word “Open” I see 4 key behaviours:

  • Open Working Groups
  • Open Consultations
  • Open Availability
  • Free (as in beer)

Open Working Groups

Open Working Groups are those that are available to anyone who wants to participate. The exact barriers to entry vary across different standards organisations. A group like WHATWG sets absolutely no barrier to entry. W3C only allow member organisations to contribute to their specifications. However, membership is open as long as you make a financial contribution (related to the size of your organisation) and agree to the member charter. These terms include release of intellectual property rights and some other clauses. The W3C also allow invited experts from non-member organisations to contribute (without paying a fee). Such people are normally industry experts who are invited to share their subject knowledge.

Some groups also publish public minutes of their meetings with attendance, topics, conclusions and other details about the inner workings of their standards/specifications organisation.

These approaches can be contrasted with a group like the British Standards Institution (BSI), which selects a group of people designated as subject experts to write a standard. The membership of such a group is not expected to change throughout the process of writing a standard. They also do not share their internal workings outside BSI until they publish a standard.

This, of course, means we know what open isn’t in this context, but we still haven’t defined exactly what open is, or at least given the grades of open participation specific names.

Open Consultations

While working groups vary in their openness, many of them also do open consultations. By this I mean they publish drafts of their standard and invite commentary. The W3C is notorious for this as they publish many stages of draft, from regular “Working Drafts” to “Candidate Recommendations”, one of the late stages before final publication. By soliciting participation in this way organisations are allowing open input on their work periodically rather than constantly, as in the case of open working group participation.

This approach can reduce the amount of time taken to create a standard/specification by allowing a core team to develop a model to be critiqued by the larger community rather than trying to find rolling consensus on all issues.

Open Availability

The ability to gain access to a standard is not guaranteed. Some standards are restricted to organisations in certain industries, such as telecoms, or to members of certain industry groups. Open standards are freely available to read or implement.

Free

Some standards are open but cost money, such as many of those published by the BSI. The money raised from selling the standard pays for the cost of developing it. The majority of Internet and Web specifications and standards are free, the cost being borne by the corporations which sponsor the development for their own use.

Terms to describe standards and specifications

There is also an interesting distinction between standards and specifications. This was something David was keen to point out on behalf of the Open Web Foundation. The OWF builds specifications, not standards, because they are focused on rapid problem solving. By creating a consensus around a technical issue, a specification if you will, they enable standards bodies to take that existing work and use it as the basis for standardisation amongst their members. I think this is the key distinction between a standard and a specification.

Member organisations contribute to standards bodies in order to help shape the direction of standards they know they actively want to use. This means that standards bodies often have large numbers of people engaged in the work on a standard. This can make the process unwieldy and highly political. Groups like the OWF on the other hand write specifications, usually with a small group of highly motivated people at the core of each specification. Since the goal is technical consensus there is less politics and a sensible specification can often be reached quickly.

While the only difference between a specification document and a standard document is wording, the difference in process can be huge.

Pulling it all together

By outlining the different usage of Open I hope I have given everyone an idea of the concepts. As such I now want to tie those concepts back into a set of terms people can use to understand Open in the context of standards and specifications.

Working Groups

Open Working Group
A working group in which anyone can participate regardless of company affiliation or other status. The meeting records (such as they are) are Open and freely available.
Member Working Group
A working group which is limited to member organisations of the standards/specification body. Meeting records may be Openly available or restricted to the Member organisations.
Private Working Group
The working group is invitation only and minutes and other records are restricted to member organisations or the working group.

Membership

Open Membership
Any individual or organisation can join the specification/standards body to contribute. Agreement to a common membership agreement may be required (to release IP for example, or set a code of conduct).
Nominal Fee Membership
Membership is open to any organisation/individual but a nominal fee may be required in addition to a membership agreement.
Closed Membership
Membership is limited to an invitation list of participants. This may be a limit to members of a particular industry (such as telecoms) or it may be a private consortium of companies.

Participation with the community

Open Consultations
The working group will periodically release a version of its standard/specification for public review and comment. Each comment received will be publicly addressed.
Member Consultation
The working group will periodically release a version of its standard/specification for review and comment by members of the standards/specification body. Each comment received will be addressed to the members.
Private Consultation
Specific experts will periodically be asked to review and comment on a version of the standard/specification and their comments will be reviewed by the working group.

Availability

Open Availability
The final version of the specification/standard is freely available at no cost.
Nominal Fee Availability
The final version of the specification/standard is available to anyone at a nominal cost (the value of nominal may vary from industry to industry).
Member Availability
Available to members of a specification/standards body. There may be a charge.

Examples

Open Web Foundation

The Open Web Foundation have been creating a buzz in the blogosphere recently so they can be our first example. The OWF:

  • Publish Specifications
  • Use Open Working Groups
  • Use Open Consultation
  • Provide Open Availability

World Wide Web Consortium (W3C)

The W3C are one of the leading providers of Web Standards, such as HTML, XML and CSS. They have been criticised in some quarters for taking too long to publish new and updated versions of standards. Their profile is different to that of the OWF. The W3C:

  • Publish Standards
  • Use Member Working Groups
  • Use Open Consultation
  • Provide Open Availability

British Standards Institution

The BSI produces a lot of industrial standards such as the “kite mark” which governs the safety of children’s toys and other items in Britain. They have also produced a number of technology standards such as PAS78 which helps organisations purchase accessible Web sites. They mostly have a very different model to the OWF and the W3C. My understanding is that BSI:

  • Publish Standards
  • Use Member Working Groups
  • Use Member Consultation
  • Provide Nominal Fee Availability
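
One way to see why a shared vocabulary helps is to write the three profiles down as plain data and compare them. The sketch below only restates the lists above; the field names and the `differences` helper are made up for illustration.

```javascript
// The three example bodies, described with the terms defined earlier.
const profiles = {
  OWF: { publishes: 'Specifications', workingGroup: 'Open', consultation: 'Open', availability: 'Open' },
  W3C: { publishes: 'Standards', workingGroup: 'Member', consultation: 'Open', availability: 'Open' },
  BSI: { publishes: 'Standards', workingGroup: 'Member', consultation: 'Member', availability: 'Nominal Fee' },
};

// Lists the aspects on which two bodies differ.
function differences(a, b) {
  return Object.keys(profiles[a]).filter(
    (aspect) => profiles[a][aspect] !== profiles[b][aspect]
  );
}

console.log(differences('OWF', 'W3C')); // [ 'publishes', 'workingGroup' ]
console.log(differences('W3C', 'BSI')); // [ 'consultation', 'availability' ]
```

Once every body is described on the same four axes, “how open is it?” stops being a matter of opinion and becomes a simple comparison.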

Conclusions

I think that this terminology makes it clearer and easier to describe a number of existing standards bodies. By using the same terminology people and organisations can have better expectations of the various standards/specification organisations.


Copyright © 2013 Kid666 Blog All rights reserved. Base theme by Laptop Geek.