Saturday, November 9, 2013

Hype, thy name is "Cloud"

To be honest, I'm tired of the word "cloud". The hyperbole about cloud computing has now reached the stratosphere. When a speaker mentions cloud computing, the first question in my mind is "what now?" I lose interest in what the speaker has to say, because I have heard it all before! What is worse, the cloud computing pundits are so wrapped up in themselves that they don't seem to notice the "audience fatigue". Maybe they don't need an audience; maybe they only want to pat each other on the back.

What is cloud computing? The simplest definition is "a bunch of computers available for rent, on demand, over the internet". There are a few intricate features built on top of that, but this definition suffices for all practical purposes.

Cloud computing is indeed useful under certain circumstances, but what riles me is that it is being prescribed as the single solution to every problem. I have come up with a few scenarios where it can be put to good use:

- For small companies (or departments) working on short-term IT projects, cloud computing can provide the temporary hardware (and software) for development and testing purposes. Previously, these companies had to buy hardware (especially for web applications), rack and stack it, install the requisite software and only then use the servers for testing. Today they can just rent the servers from one of the providers, use them for the duration of their project (usually a few months) and then shut them down. They can even shut down their servers during nights and weekends, when no one is working, to save some more money. This will definitely make small companies more productive and free up their time to concentrate on the problem at hand. They can always include the cost of renting the servers in the contract with their clients. This cost is usually very small compared to the overall cost of project execution, so the clients would gladly agree.

- For 24x7 companies to augment their existing infrastructure. The keywords here are 24x7 and augment. Let me explain. The 24x7 companies are businesses that need to run their servers around the clock, e.g., Netflix. These companies should have a base level of infrastructure, i.e., servers running in their own data centers. During nights, weekends, and holidays the number of people accessing Netflix increases. At those times, they can temporarily rent servers from the providers and join them to their network. This augments their capacity and enables them to service the increased demand. Once the demand goes down, they can shut down the rented servers.
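To put a rough number on the "shut them down during nights and weekends" idea from the first scenario, here is a minimal Python sketch. It assumes the provider bills purely by the hour a server is running, and the 10 hours a day, 5 days a week schedule is a made-up example, not a measurement.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def savings_fraction(hours_per_day: float, days_per_week: float) -> float:
    """Fraction of the always-on rental bill saved by running servers
    only during working hours (assumes pure per-hour billing)."""
    used = hours_per_day * days_per_week
    if not 0 <= used <= HOURS_PER_WEEK:
        raise ValueError("working hours must fit inside one week")
    return 1 - used / HOURS_PER_WEEK


if __name__ == "__main__":
    # Hypothetical team that works 10 hours a day, 5 days a week.
    print(f"Saved: {savings_fraction(10, 5):.0%} of the always-on bill")
```

Storage and other charges that keep accruing while a server is stopped are ignored here, so treat the result as an upper bound.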

What is cloud computing not good for? It is not good for 24x7 companies that run their base infrastructure on the cloud providers. Netflix, for example, unfortunately rents all of its computers from a cloud provider. They do not have a base infrastructure of their own. Renting the computers is way more expensive than buying and installing the servers and paying for the electricity, bandwidth and other utilities in the data center. Read Jeff Atwood's calculation here: The cloud computing prices are falling, and one day it might be cost-effective to rent all the computers all the time, but we haven't reached that situation yet.

Remember, cloud computing does not eliminate the need for the system administrators and engineers who install and configure the servers and maintain the systems. Companies would still have to employ the same number of people. There are no savings there. Hence the 24x7 companies should always have their own base infrastructure in their own data centers, and use cloud providers only to augment their capacity during times of increased demand.
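The hybrid argument can be sketched as a back-of-the-envelope calculation in Python. Everything below is an assumption for illustration: the function names, the 730-hour month, and every price and server count are placeholders, not real provider prices or Netflix figures. Staff costs are left out because, as noted above, they are the same in both models.

```python
HOURS_PER_MONTH = 730  # approximate average, assumed for illustration


def burst_cost(base_servers, peak_servers, peak_hours, rent_per_hour):
    """Monthly cost of renting only the extra servers needed at peak."""
    return (peak_servers - base_servers) * peak_hours * rent_per_hour


def all_rented_cost(base_servers, peak_servers, peak_hours, rent_per_hour):
    """Everything rented: the base runs all month, plus the peak burst."""
    base = base_servers * HOURS_PER_MONTH * rent_per_hour
    return base + burst_cost(base_servers, peak_servers, peak_hours, rent_per_hour)


def hybrid_cost(base_servers, peak_servers, peak_hours, rent_per_hour,
                owned_per_month):
    """Base servers owned (amortized hardware, power, bandwidth per server
    per month); only the peak burst is rented."""
    base = base_servers * owned_per_month
    return base + burst_cost(base_servers, peak_servers, peak_hours, rent_per_hour)


if __name__ == "__main__":
    # Placeholder inputs: plug in your own quotes before drawing conclusions.
    args = dict(base_servers=10, peak_servers=15, peak_hours=200,
                rent_per_hour=0.50)
    print("all rented:", all_rented_cost(**args))
    print("hybrid:    ", hybrid_cost(owned_per_month=150.0, **args))
```

Which model wins depends entirely on the rental rate versus the owned cost per server; the sketch only makes the comparison explicit.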

- 24x7 small companies and startups: The same cost argument applies even more to smaller companies and startups running server applications. The cost of renting servers takes the lion's share of the operational expenses in smaller companies. Hence smaller companies should host their own servers, and slowly move to the hybrid model described above in the Netflix example.

Comments are welcome. Please let me know your thoughts.

Friday, September 7, 2012

Apache contributes to reduction of Consumer Privacy in Do Not Track (DNT) debate

First things first, the definition of Do Not Track (aka DNT): Let's say that you went to a retailer's web site and searched for an LCD television. A few minutes later, let's say that you are on a different web site. Have you noticed that "LCD TV" advertisements magically come up on that site? This is because you are being tracked across web sites! What you do in one place is now visible to many other web sites, and they can tailor their offerings to your taste. This is called "personalization". It happens without the user's consent. To prevent this, a Do Not Track (DNT) option is available in most web browsers today. Once the user sets it, the browser tells the web site(s) that the user doesn't want to be tracked. The beauty of the Web is that just as users have a choice, so do the web sites, in that they are free to ignore it! There is no law forcing web sites to obey the users' DNT choice. (The official standards are published by the Tracking Protection Working Group.) Only a few web sites honor this user choice.

For the technically oriented, this is an HTTP header named DNT, sent by the browser when it accesses a web site. If DNT = 1, the user has opted out of tracking, i.e., doesn't want to be tracked. If DNT = 0, the user has opted in, i.e., agrees to be tracked. If the header is not sent at all, the user hasn't expressed a preference. The default behavior of the browsers is to not send the header. From this we can see that a user who expresses no preference gets the same privacy outcome as a user who opts in: the web sites will track the user in both cases. They will refrain from tracking only if the user has opted out. Of course, this conforms to the standards published by the Tracking Protection Working Group.
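The three cases above can be written down as a minimal Python sketch. The function names and the `honors_dnt` flag are mine, purely for illustration, and treating an unrecognized header value as "no preference" is an assumption.

```python
from typing import Mapping


def dnt_preference(headers: Mapping[str, str]) -> str:
    """Classify a request's DNT header as described above."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    value = normalized.get("dnt")
    if value == "1":
        return "opt-out"        # user does not want to be tracked
    if value == "0":
        return "opt-in"         # user agrees to be tracked
    return "no-preference"      # header absent (or unrecognized: assumption)


def site_will_track(headers: Mapping[str, str], honors_dnt: bool = True) -> bool:
    """A site that honors DNT refrains from tracking only on an explicit opt-out."""
    if not honors_dnt:
        return True             # sites are free to ignore the header
    return dnt_preference(headers) != "opt-out"


if __name__ == "__main__":
    for request in ({"DNT": "1"}, {"DNT": "0"}, {}):
        print(request, dnt_preference(request), site_will_track(request))
```

Note how the empty request and the `DNT: 0` request end up with the same tracking outcome, which is exactly the point made above.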

Now comes Microsoft with their release of Windows 8 and Internet Explorer 10 (IE10). They did an awesome thing. They turned on DNT (DNT=1), i.e., opt-out, by default. The user has the option of turning off DNT (i.e., opting in) as part of Windows 8 Setup. Technically, this is a violation of the published standard, because as per the standard, the browser is supposed to remain neutral and not send any header. But we have already seen that remaining neutral is not actually neutral; it is equivalent to opting in!

Let's digress. Have you ever received in the mail a 10-page booklet explaining the privacy policy of your credit card company? The privacy policy would be printed in a 0.1 font size, and if you manage to read it, you will find something akin to this: "we will share your information with our business partners and affiliated companies for business purposes". No one will tell you what these business purposes are, but the behavior of all these financial institutions is "opt-in" by default. This is wrong. The behavior should be "opt-out" by default, just like IE10 from Microsoft. Today, you have to specifically send them a signed letter in the mail, asking them not to share your information. Most of us don't do it, and hence our data is very easily discoverable. If a business has enough money, say a few tens of thousands of dollars, it can buy the data of the entire US consumer population. And all this is legal. Believe me folks, this is true.

Then comes Roy Fielding, scientist par excellence. I looked at Mr. Fielding's bio, and I respect him for what he is. He is one of the architects of the HTTP protocol, one of the founders of the Apache Web Server (aka HTTP Server) project and one of the proponents of the DNT standard itself! But guess what, just like many of the luminaries, he also has a holier-than-thou attitude. He has come up with a patch for the Apache Web Server (note: Apache is the most widely used web server in the world) that will ignore the DNT option if the browser is IE10. He wants to do this because Microsoft has violated the DNT standard by not being neutral. His argument is that the DNT option does not protect anyone's privacy unless the web sites respect it (as I said at the beginning of this post). That is correct, but spending time and energy on a software patch to defeat one particular browser's setting? I call this crazy! He is probably one of the many people who hate Microsoft for no reason. IE10's default setting of opt-out is a small step in the direction of increasing consumer privacy and we should all support it (even though many web sites don't respect that option). The right course of action for Mr. Fielding (and his esteemed colleagues at the W3C) would be to change the DNT standard so that the default setting is opt-out. Do what is good for the consumers; don't let that chip on your shoulder get in the way.
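For the curious, the effect of such a patch can be illustrated in a few lines of Python. This is not the actual patch, only a sketch of the behavior described above; the function name is hypothetical, and recognizing IE10 by the "MSIE 10.0" token in the User-Agent header is my assumption.

```python
from typing import Dict, Mapping


def drop_dnt_for_ie10(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the request headers with DNT removed when the
    request appears to come from IE10 (illustrative sketch only)."""
    # Header names are case-insensitive, so compare in lower case.
    user_agent = next(
        (value for name, value in headers.items() if name.lower() == "user-agent"),
        "",
    )
    if "MSIE 10.0" in user_agent:  # assumed IE10 marker
        return {n: v for n, v in headers.items() if n.lower() != "dnt"}
    return dict(headers)


if __name__ == "__main__":
    ie10 = {"User-Agent": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2)",
            "DNT": "1"}
    print(drop_dnt_for_ie10(ie10))  # the opt-out never reaches the application
```

Once the header is removed, the application behind the server sees the request as "no preference", so the user's opt-out silently turns into tracking as usual.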

The argument from Internet advertisers and the web sites is this: they are providing a service free of charge (most of our email, photo storage, blogs, etc. are free). Hence they are entitled to track the user's behavior, sell it and make money off of it. OK, I agree that a business should make a profit, but the users should have the option of protecting their privacy and paying for the services if they so wish. Not giving the users a choice, ignoring the users' choice or disabling the users' choice by creating ingenious software patches is reprehensible.

I implore the Apache Foundation to reject Mr. Fielding's patch. I implore Microsoft to not budge and to continue with the current setting of opt-out as the default.

Thursday, May 17, 2012

The morphing of Facebook

I don't use Facebook often, because of privacy concerns, but whenever I use it, I find that Facebook is slowly morphing into a group or family discussion forum. I say this because nowadays I find that Facebook contains (or shows) only the posts and photos of my closest family members and friends.

I'm sure Facebook has a complex algorithm to figure out what (whose) status updates to show when I log in. I'm assuming it is calculated from the things I "liked" and the posts I "responded" to. If the algorithm is right, then it gives us a startling conclusion: after we fervently added a gazillion friends and the novelty died down, we are capable of interacting only with family members and a few friends on a daily basis. And status updates and photos from only those people appear when we log in to Facebook.

Our family (the extended family, including my cousins, nephews, in-laws, etc.) has always shared information through email, using Yahoo Groups. Now I get all that family information when I log in to Facebook. Of course, this conclusion applies only to my demographic: the middle-aged, middle-class, middle-income voter!

To prevent this automatic coalescing into a small group, I see that many people click the "like" button on almost all of the status updates, hoping that this will trick the Facebook algorithm into showing more variety on their home page when they log in.

At least for me personally, Facebook has replaced Yahoo Groups. Is Facebook any more useful than this? I don't think so, but only time will tell; maybe it will morph into something else in the future.

NodeJS vs IIS : IIS is faster at dishing out static HTML

I wanted to check whether NodeJS would be the right technology for one of my upcoming projects, so I did a rudimentary benchmark of NodeJS against IIS on Windows for dishing out static HTML. IIS does come out ahead: it is about 2.5 times faster than NodeJS on Windows.

Details of my benchmark can be found here, in one of my answers on Stack Overflow:

Updated: Tomcat appears to be the fastest server dishing out STATIC HTML on WINDOWS. Tomcat is about 3 times faster than IIS in responding to the same request.

Updated (5/18/2012): Previously I had 100,000 total requests with 10,000 concurrent requests. I increased it to 1,000,000 total requests and 100,000 concurrent requests. IIS comes out as the screaming winner, with NodeJS faring the worst. I have tabulated the results below:

Friday, October 28, 2011

Will the Desktop ever be dead?

Though the pundits have been proclaiming the death of the Desktop for quite a few years now (Virtualization), the desktop has continued to survive and shows no sign of weakness; if anything, it is gaining strength. If a quad-core processor with 8 GB of RAM, a top-of-the-line video chip and a 1920 x 1080 HD screen can be made available in a 5-pound laptop, why would people not use it? Why would people abandon such a rich user interface and awesome processing power? It would be naive to expect people to give it up, be it Linux or Windows.

The topic "Death of the desktop" has been given a new lease of life, thanks to some of the cloud computing gurus, who predict that after "everything" moves to the cloud, the user will need only a browser, and hence will not need a powerful desktop. They seem to confuse the browser and the desktop.

If the browser becomes the all-in-one program, where the user edits all documents and presentations, works with audio and video files, writes code and performs myriad other tasks which are today performed by separate applications, then that browser would be a humongous amalgamation of all the applications and would require all the processing power in the world to run. Performing all tasks inside a browser simply means that the user has only one application to deal with, but it doesn't mean the need for processing power will go down.

The question "will the desktop ever be dead" is itself silly. It's like asking "will the computer ever be dead". All this debate about the desktop is simply fueled by unjustified hatred of Microsoft. In an effort to unseat Microsoft from its dominant position, more and more features are being crammed into the browser, so that it becomes the de facto operating system. Many people, including those who work in the technology industry, have the mistaken notion that if software is accessible on the internet (e.g., Microsoft Office 365 or Google Docs), then it must be running on the server (or the cloud!). No, such software is written in JavaScript, downloaded by the browser and executed on the local machine. The software is skinny, but so are the features! In the future, when software downloaded from the internet has the same features as desktop-installed software, the browser will have bulked up and will be as heavy as the operating system.

The "death of the desktop" philosophy runs counter to the evolution of the tech industry and to consumer behavior. Hardware capacity (CPUs, storage, network) has increased manifold over the years. So have the software's complexity and the users' hunger to do more and more things sitting on the couch.

Of course, the hardware will change its form factor. The processing power of the mainframes became available in the desktops, which is now available in a laptop or a netbook, and very soon it will be available in the pads and the slates. It doesn't mean that the Desktop is dead; it just means those devices are the new Desktops. Our future is filled with increasingly powerful devices of all shapes and sizes. If you find evidence to the contrary, let me know.