
Building a reverse HTTP proxy in Node.js (and what I learned from Node Knockout)

Posted by sh1mmer on Aug 31, 2010 in General, Node.js, Web Technology

Help 3>2>1 win Node.js KO!

I’m fortunate enough to live in San Francisco, which meant that when Node Knockout rolled around I spent it hacking in Joyent’s HQ on the 20th floor of a downtown building.

I’ve been at Yahoo! for a while and I’m used to hacking in 24 hour competitions (hack days), but this time we had 48 hours. It turns out that makes a massive difference in both good and bad ways. In a 24 hour project the obvious way to get that little extra boost is to code all night. That isn’t possible in a 48 hour contest, at least not if you want to write code that works. However, because we all planned to go somewhere to sleep, I don’t think our team gelled as well as some 24 hour teams do; there wasn’t the same kind of urgency throughout. This really showed in how the team interacted with the outside world. It’s much harder to shut out the world for a weekend than it is for 24 hours. The two of us in relationships left for hours at a time to see our significant others. While this isn’t a bad thing in general, I don’t think it helped us ship a fully feature-complete product.

I am, however, really happy with what we built. We aimed for something unreasonably ambitious and I feel like we managed only to achieve something damn good. Which is fine by me.

What we built was a reverse HTTP proxy; an HTTP router. What’s different about our project is the real-time reporting and the ability to create routes and access rules on the fly in JavaScript. Let me break each of those down a little bit more.

When you visit a route, for example demo2.ko-3-2-1.no.de, the proxy will round-robin your connection across the upstream hosts associated with that route. In the case of demo2 that means sports.yahoo.com and yahoo.com. In the simplest case you now have an HTTP load balancer. Since Node.js is significantly faster than most web servers, it would be feasible for many people to simply insert our proxy in front of their existing infrastructure.
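To make the round-robin idea concrete, here is a minimal sketch of how a route could cycle through its upstream hosts. It is not the code from our proxy; the `routes` table, the `counters` object and the `pickUpstream` function are names made up for illustration.

```javascript
// Hypothetical route table: route hostname -> list of upstream hosts.
const routes = {
  'demo2.ko-3-2-1.no.de': ['sports.yahoo.com', 'yahoo.com'],
};

// One counter per route remembers whose turn is next.
const counters = {};

// Return the upstream host for this request, cycling through the list in order.
function pickUpstream(routeHost) {
  const upstreams = routes[routeHost];
  if (!upstreams || upstreams.length === 0) {
    return null; // unknown route: nothing to proxy to
  }
  const turn = counters[routeHost] || 0;
  counters[routeHost] = (turn + 1) % upstreams.length;
  return upstreams[turn];
}
```

Each call for the same route hands back the next host in the list and wraps around at the end, which is all a basic load balancer needs.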


What would this get them other than load balancing? One of the other features we added was a rich API including a web-socket which streams the connection data. This can be seen on the admin interface we mocked up to demonstrate what the proxy is doing. An HTML5 canvas based graph is connected to the web-socket and streams the events as they happen. We also load the details of each request into a YUI3 datatable so you can see exactly what is being requested.

I also made a screencast showing the interface in action.

The final and least obvious feature (because we didn’t have time to write an interface for it) is the ability to add dynamic rules via the API. This is a really powerful and useful feature. Let me give an example:


This function, when attached to the clientReq event in the router, will “hard block” any requests from IP addresses in the array. An interesting characteristic is that, because this is attached to the event which accepts the initial request from the client, we can kill the connection off before we even connect to the upstream host. Not only do we not send an HTTP response (we simply terminate the TCP connection), we also tell the router not to process any more rules by calling the functions router.stopPropergation(event) and router.preventDefault(event).
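The snippet that originally sat here isn’t reproduced on this page, so what follows is only a rough sketch of a hard-block rule as described, not the original code. The shape of the event (an object carrying the incoming request as `event.req`), the `makeHardBlock` helper and the example addresses are all assumptions; the two router calls are spelled the way they are named above.

```javascript
// Hypothetical list of blocked client addresses.
const blockedIps = ['10.0.0.66', '192.0.2.13'];

// Build a rule for the router's clientReq event. The shape of `event`
// (an object carrying the incoming request as `event.req`) is an assumption.
function makeHardBlock(router, ips) {
  return function hardBlock(event) {
    const socket = event.req.socket;
    if (ips.indexOf(socket.remoteAddress) === -1) {
      return; // not blocked: let the router carry on as normal
    }
    socket.destroy();               // drop the TCP connection, no HTTP response
    router.stopPropergation(event); // skip any remaining rules
    router.preventDefault(event);   // skip the default proxying
  };
}
```

Attaching it would look something like `router.on('clientReq', makeHardBlock(router, blockedIps))`, assuming the router exposes an `on` method for its events.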

It is also easy to implement a soft block which actually returns an HTTP 503 status instead of closing the TCP stream, or to use more complex lookup rules than searching an array. This is definitely a part of the admin interface we will be expanding.
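A soft block might look roughly like this. Again this is a sketch under assumptions rather than our actual code: it assumes the event carries both the request and the response (`event.req` and `event.res`), and `makeSoftBlock` and `isBlocked` are names invented for the example. Passing the lookup in as a function is what leaves room for rules more complex than an array.

```javascript
// Soft-block rule: answer with HTTP 503 instead of closing the socket.
// `isBlocked` is any predicate on the client address, so the lookup can be
// a Set, a range check, or anything else that returns true or false.
function makeSoftBlock(router, isBlocked) {
  return function softBlock(event) {
    if (!isBlocked(event.req.socket.remoteAddress)) {
      return; // not blocked: let the router carry on as normal
    }
    event.res.writeHead(503, { 'Content-Type': 'text/plain' });
    event.res.end('Service Unavailable\n');
    router.stopPropergation(event); // skip any remaining rules
    router.preventDefault(event);   // skip the default proxying
  };
}

// Example lookup backed by a Set rather than an array (addresses are made up).
const softBlocked = new Set(['10.0.0.66']);
const isSoftBlocked = function (ip) { return softBlocked.has(ip); };
```

The client gets a proper HTTP answer here, so a well-behaved one can back off and retry later instead of just seeing a dropped connection.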

And that brings me to my final point about competing. 48 hours is big enough that you can bite off something challenging, and fail. Not in the sense that we didn’t build something great, I think we did, but it’s not complete. I have a list of things I’m chomping at the bit to add to this. By challenging ourselves with a tough project we’ve set ourselves up for success in the future (and hopefully in the project) because this is a very useful system that I now have some momentum to really build into something that will be used every day.

Don’t just take my word for it, find an excuse to challenge yourself too.


A Haven for User Data

Posted by sh1mmer on Dec 20, 2008 in Geek Culture, Web Technology

Last month my friend Suw twittered something that surprised me a little.
“what is it sites closing at the moment? IWantSandy, then Pownce and now Ficlets”
Obviously, I knew about Pownce and I remember reading about Twitter buying Values of N, so I get why they shut down I Want Sandy.

I headed over to GetSatisfaction to look at IWantSandy’s product page. What a wash of anger and sadness. The closure topic has hundreds of replies, with 95 :( faces and 17 :) faces, most of which were Rael’s, and he doesn’t really count. Rael has made an export available for I Want Sandy, but it hasn’t been up there for long. In fact, it closes today, something that a lot of people objected to.

This isn’t totally dissimilar to the way we closed Yahoo Photos (not that I was involved in that project). I think it’s sad that something can close in the space of a month and after that period user data is lost. The idea occurred to me that while archive.org is great for public data it sucks ass for private data. The idea of the data haven was born.

Imagine this. You sign up for the newest, shiniest start-up service. When you sign up you have the option to guarantee your data will be preserved by somedatahaven.org. If that service goes belly up they can pass your data and login credentials to somedatahaven.org, who will allow you to log in and export your data. If an independent organisation can take on the role of guaranteeing the data availability of a number of services that you sign up to, then it’ll be a huge step forward for data portability. This would be especially true if the data could be syndicated as easily transformable open standards to be accepted by other services.

So, I want to build this service. However, I’m a busy man. I might build it anyway, because I’m an engineer with twitchy coding fingers, but I’d really like half a dozen or so people that would want to sign up to such a service so I can work with some real customers and support their needs while building. If you are interested email me.

So as I Want Sandy shuts down for good today, I hope we can create a better solution for the future.

 

Yahoo, the identity middle-man?

Posted by sh1mmer on Aug 23, 2008 in Geek Culture, Web Technology

I had an interesting thought about the direction of some of Yahoo’s projects. It seems like a lot of the stuff Yahoo is working on is about helping people to aggregate and manage their data. Two of the most obvious public examples are Fire Eagle and Open ID.

What I think is particularly interesting is that neither of these things is a product or application in itself. Neither of them tries to control what you do with your data and in fact they will happily let you use the information anywhere on the web that supports it.

I think Yahoo is actually creating an interesting market here. I don’t think I’m going out on a limb to say Yahoo is a brand perceived as safe and family friendly. By providing tools to let average people safeguard and manage important parts of their identity Yahoo is creating a new trend in middleware.

There has been some discussion recently about how hard it is to manage something once you have put it online. Conceptually Yahoo is positioning itself to become your identity provider for the Web. Pick a brand you trust and let them act as a middleman between you and everyone else. Sites can put data in and sites can take data out but only if your middleman lets them.

Many people who support Open ID have stated an aim like this. Open ID allows SSO but it can also facilitate “attribute exchange”. This is where the Open ID provider passes the relying party¹ a number of pieces of information (attributes) about the user logging in, assuming they say it’s ok. Right now Yahoo’s Open ID provider service allows users to pick from a number of IDs they can use, from their Yahoo username or their Flickr username to a random anonymous one. There is nothing that would stop Yahoo allowing you to associate a unique profile with each of these IDs. There is already a certain amount of evidence to show that the youth of today do this kind of segmenting by hand and manage multiple online profiles.

I’d be interested in seeing what techniques can be applied to this information management. If you’ve ever seen the Facebook application TOS, they are pretty hardcore. You are allowed to store virtually nothing about the user in your own database. This is because Facebook are aware of the simple truth that once you share information you can’t unshare it. Look at the music industry: the properties of bits are not the same as those of physical objects. As such the only real protection you can offer the user is legal.

That said, obviously most people are happy to share information with many sites they use and let them store it. I’d like to see a much better way to represent the TOS so that a user could effectively review it before they share information. This is a topic I discussed, yet again, at Leeds Barcamp. I want to see terms of service use a number of creative commons style attributes. Any additional terms they required would then be easy to identify and read. If you were using Open ID to sign in, it would be easy to define what you were happy to accept from a site and what you weren’t. Your Open ID provider could then easily flag any discrepancies to you before you log in or sign up.
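As a rough sketch of how that flagging could work, imagine each site’s terms boiled down to a list of short attributes, and your provider holding the set you have already said you’re happy with. No such standard exists as far as I know; the attribute names and the `flagDiscrepancies` function below are invented for illustration.

```javascript
// Terms the user has told their Open ID provider they are happy to accept.
// All attribute names here are invented for illustration.
const acceptedTerms = new Set(['store-email', 'store-display-name']);

// Return the terms a site asks for that the user has not pre-approved.
function flagDiscrepancies(siteTerms, accepted) {
  return siteTerms.filter(function (term) {
    return !accepted.has(term);
  });
}

// A hypothetical site that wants to store your email and share your data onward.
const siteTerms = ['store-email', 'share-with-third-parties'];
const flagged = flagDiscrepancies(siteTerms, acceptedTerms);
```

Here only the third-party sharing term would be flagged, so that is the only thing you would have to read before deciding whether to sign up.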

Despite all of this I am not suggesting that Yahoo should own this potential market. However, I think they are being extremely progressive in it. I’d love to see providers competing for consumers’ love. And, of course, since it’s all about being Open it’s not like anyone would stop you swapping providers. Not at least if they were sensible. I read a quote by a Sun exec (that I can’t seem to find) about making it easy for customers to leave, because they are much more likely to stay if they come back.

¹The site that lets you login with Open ID

 

TipJoy; Creators Vs. Value conduits

Posted by sh1mmer on May 29, 2008 in Web Technology

So TipJoy has been getting some press lately, which is good because I like it. While Clay Shirky apparently says micropayments are dead, I really like anything I can do immediately (in a very GTD way). TipJoy makes it easy for me to share a small amount of love in a very instant gratification kind of a way.

I also like that, in very startup style, they are extremely responsive. I sent them a couple of points of feedback yesterday and by the time the sun started to rise over the American West Coast I had an answer and they had added something to the site. You can now look at who I’ve tipped and get it as a feed. They are also working on an API which hopefully I’ll get access to soon.

One thing I think would be interesting is to see if they can look at better granularity of claims against URLs. On some systems authors get ownership of a subdirectory rather than the whole domain. When I tip that URL do I want the tip to go to the author of the site or the owner of the domain? On a shared blog, if I tip a story I like, am I tipping the blogger or the site? They shouldn’t have to resolve all these issues with some large unwieldy way of specifying who you were tipping. But it is interesting to think about a URL or domain not necessarily being an indicator of the creator of value but instead the identifier of the conduit of value.


Lifecycle Messaging

Posted by sh1mmer on May 13, 2008 in Web Technology

I’ve been reading YCombinator’s Hacker News a lot recently. It had a great link today to a post about Lifecycle Messaging by Josh Kopelman.

Josh points out some really good things, and it all comes back to one of the current calls to arms at Y!: relevancy. It really is the holy grail. If you can send someone an actually relevant reminder at a relevant time (when they were about to forget) then you are going to get a much better total click through rate than by bulk mailing a low relevancy message to more people.

Relevancy doesn’t have to be about mind reading. People won’t mind your best guess if it really is that. We deal with “nearly-noise” all day long. We don’t get upset when our co-workers ask us if we would like a drink, because the offer is relevant and timely (they are going to fetch a drink and so the return will be immediate). On the other hand, watching people who don’t want a free paper outside the tube station is a dramatically different experience from watching those who do.

man handing out free london lite papers - photo by bowbrick

The key is knowing enough of what people want to make any offers seem like polite courtesy rather than blanket bombing. People will like you for polite reminders (as long as you don’t nag). This is something Josh really hit the nail on the head with: if you contact people just as they are about to forget, it’s a reminder and they don’t feel upset. If you hit them too early in the curve, it’s a nag.


TinyDB; an actually useful app built on AppEngine

Posted by sh1mmer on May 11, 2008 in Web Technology

TinyDB is a new micro database app that allows you to easily get and set data at a URL and then access it again. It supports JSON and XML formats. I really like the concept a lot. The ability to throw up a simple datastruct or two in a very light way is definitely a good thing. The creators say they built it because they wanted to attach data to tinyurls, which makes a lot of sense to me.

While they point out that the TinyDB entries aren’t secure, I’m mostly curious to see if they plan to let people alter the data after it’s been posted or if it will remain locked in time forever. Whatever small concerns I have, it’s a great project. I also think it’s really cool that they’ve used AppEngine to create it. I’ve been wondering how successful AppEngine would be, but with stuff like this coming out I’m looking forward to seeing the big internet providers compete to host the many apps of the future.


Copyright © 2013 Kid666 Blog All rights reserved. Base theme by Laptop Geek.