Computing is getting smaller

Computing is getting smaller.  Ten years ago, when we planned an enterprise system, we planned in terms of how many physical boxes we would need to serve x users.  For example, we would say that to serve 1,000 simultaneous users we would need one application server box, one database server box, and two load-balanced presentation servers.  So if our client needed 5,000 simultaneous users, we knew we were talking about five times as many boxes (e.g. 20 boxes).  This was a significant aspect of solution design, and it was very much a limiting factor.  It meant that estimates had to account for the time and resources needed for integration (installing an OS, application libraries, and applications on one box is time consuming; now imagine doing it on 20 boxes).  You had to factor the cost of the hardware into your solution.  That’s why shops back then usually kept large IT infrastructure resources on hand.  To create an enterprise solution you had to maintain, all year round, a shop that you would only use for a small portion of your development (there was little outsourcing of IT back then).  And I am not even talking about something at the scale of a Google or an Amazon.  Medium-sized applications (5,000 to 100,000 simultaneous users) needed these resources; otherwise you could not create a solution quickly.

The result of all this was that a typical development effort had a full complement of developers as well as a sizeable on-site complement of hardware and hardware staff.  It also meant that any discussion of a new solution required hardware folks in the room.  Capacity planning was very much their domain: typically you would spec out a list of boxes and an expected load per box, and the hardware folks would take that and come back with a hardware requirements list.  Computing back then was a big effort, because even before you started there would be several servers sitting around with staff supporting them.  During development even more hardware and staff would be added, and once you were done there would typically be still more hardware and staff.

Today we still need infrastructure to support our solutions, but the effort is much smaller.  I can seriously say that I can plan and implement a solution on the scale of a medium-sized solution from ten years ago without consulting any hardware folks and without adding a single physical box to my infrastructure.  In fact, depending on my market goals, I may be able to do it without initially spending a single dime.

What has caused this sea change?  A few things.  The first is undoubtedly virtualization.  I can remember the first time a sysadmin and I worked on deploying virtualization software at a site where I was a team lead.  It was at my request, because I was frankly getting pissed off at how long it took for me to get hardware just to test out new software solutions.  Eventually at that site we got a huge Dell server with 16 gigs of goodness that ran at least four virtual servers.  That sysadmin may not have realized it, but the moment we got that box my reliance on him to deploy new solutions became zero.  With a few mouse clicks I could configure and deploy a new box and size it to whatever my application needed.  I would tell any sysadmin worried about job security: don’t give a person like me a virtual server to play with, because I will likely never need your help with anything ever again.

The second thing is cloud computing.  You can actually think of cloud computing as an obvious extension of virtualization, but there is a way in which cloud computing has achieved a sort of critical mass.  With virtualization you basically end up with a bland server that still has to be configured for your application’s needs.  You still have to make that server provide all the services your application requires.  If you want a cron-type process running in the background and it’s a Windows box, you will have to create a scheduled task or a Windows Service.  If you want access to a SQL-like data access layer, you will have to install a database server and then set up authentication and whatever database objects you need.  The point is that a virtualized server still needs to be configured for your application, and that requires additional software and configuration before you get to the point of focusing on your domain problem.  Even with brilliant development tools like Ruby you still have to deploy Apache.

The cloud changes that.  Cloud platforms start out by providing all the services and abstraction layers you could ever imagine for creating an application.  Out of the box you begin by working only on your domain problem.  There is no need to build an authentication layer, a job monitoring layer, or a message queue layer; all of these facilities are included.  In fact, the only piece of software you will spend much time with is a development tool for whatever language the cloud supports: if it’s the Google cloud, for example, you will be using Eclipse and writing either Python or Java.  Clouds also abstract away hardware.  Hardware becomes an abstract concept that is only referred to in terms of CPU cycles or concurrent users, and as a developer you think about it only when absolutely necessary: when your users need more performance from an already deployed application.
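As a concrete illustration, here is a minimal sketch in the style of the Google App Engine Python SDK of that era.  The handler and model names are hypothetical, and exact import paths vary a little between SDK versions; the point is simply that authentication, storage, and background jobs arrive as built-in services rather than as software you install and configure yourself:

    # A hedged sketch, not production code: model and handler names are made up,
    # and the task queue import lived under google.appengine.api.labs in some
    # early SDK versions.
    from google.appengine.api import taskqueue, users
    from google.appengine.ext import db, webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class Order(db.Model):
        # Datastore model: no database server to install, no schema to provision.
        customer = db.StringProperty()
        created = db.DateTimeProperty(auto_now_add=True)

    class OrderHandler(webapp.RequestHandler):
        def post(self):
            # Built-in authentication layer: no directory server or auth library.
            user = users.get_current_user()
            if not user:
                self.redirect(users.create_login_url(self.request.uri))
                return
            order = Order(customer=user.nickname())
            order.put()
            # Built-in task queue: no cron job, Windows Service, or message broker.
            taskqueue.add(url='/tasks/fulfill', params={'order': str(order.key())})
            self.response.out.write('order received')

    application = webapp.WSGIApplication([('/order', OrderHandler)])

    if __name__ == '__main__':
        run_wsgi_app(application)

Nothing in that sketch required me to stand up a server, install a database, or wire together a scheduler; I went straight from the domain model to a deployable application.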

The third thing that has changed can best be described as the growth of XRX and of the web as a platform.  This is also a good way to look at the components that make up a domain-model-based solution; in fact, it is a natural way to approach a problem in domain terms.  I won’t get into a discussion of why I think a domain-model-centric view is the natural way to look at problems, but I will say that it is exactly the basic premise of all the cloud infrastructures I have experience with: get the developer thinking about their problem domain, then constructing data models and processes on those data models.  Ten years ago, if I was planning an application, so much time was spent specifying hardware, bandwidth, application libraries, and so on that we usually did not get to writing anything domain-specific until towards the end of the effort, and by then any notion of the problem domain was heavily shaped by all those hardware and library constraints.  Today the problem domain can pretty much be your only consideration.  Hardware specifications and application library constraints are no longer a limiting factor.  For a developer it is very freeing, because we spend our time doing what we should be doing: problem solving.

Why do I say computing is getting smaller?  Because the effort to create something like Facebook can literally happen at my dinner table, with no more than myself and a few other developers involved.  We don’t need a dedicated server team.  We don’t need to spend our capacity-planning effort speccing servers.  Yes, there are still giant servers in a data center somewhere, but I don’t have to think about them until my app has been up and running in the cloud long enough to need additional capacity, which by then I will hopefully have enough revenue to pay for.

Are you really Open Source?

I’m a bit groggy and out of it from pulling an all-nighter to build a piece of software I needed for a client proposal.  I won’t name any names, but this “open source” product was not easy to build.  First off, when I followed a link that said “download” I expected to see one or more gzipped tarballs.  Well, this product had none.  In fact, all I could do was use the Firefox download manager to pull several folders off the project’s SVN website.

Secondly, those files were marked as Debian packages, and since I was running Ubuntu 9.10 I thought I was cooking with gas.  The downloads took about 20 minutes, and I then spent a week pulling my hair out trying to deploy them, with no success, before accepting that the Debian packages were hopelessly broken and that not even rebuilding them from source had any hope of fixing them.  I went through the torture of downgrading my Perl install to 5.8.8 because I assumed there was a library issue.  The core issue, it turns out, is that the dependencies of the Debian packages were farked.  In fact, even after adding the developer’s Debian repository to my Synaptic source list I discovered just how broken the packages were: the two main packages had circular dependencies.  Unfortunately, no package manager on earth can resolve circular dependencies, since such a thing is illogical.
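For the curious, here is a simplified model (plain Python, hypothetical package names) of why a cycle is fatal: an installer ultimately has to compute some install order, and when two packages each require the other to be installed first, no such order exists.

    # Toy dependency resolver: a topological sort over a dependency map.
    # Real package managers (apt, dpkg) are far more sophisticated, but the
    # underlying problem with a cycle is the same: no valid install order exists.
    def install_order(deps):
        """deps maps each package name to the packages it depends on."""
        order, resolved, visiting = [], set(), set()

        def visit(pkg):
            if pkg in resolved:
                return
            if pkg in visiting:
                raise RuntimeError("circular dependency involving %r" % pkg)
            visiting.add(pkg)
            for dep in deps.get(pkg, ()):
                visit(dep)
            visiting.discard(pkg)
            resolved.add(pkg)
            order.append(pkg)

        for pkg in deps:
            visit(pkg)
        return order

    # Two hypothetical "main packages" that each depend on the other:
    print(install_order({"app-core": ["app-web"], "app-web": ["app-core"]}))
    # -> RuntimeError: circular dependency involving 'app-core'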

Thirdly, after giving up on the debs (and somewhat messing up my Perl installation with a polluted deploy of Perl 5.8.8 that I now have to uninstall), I moved on to CPAN.  This is where I discovered how truly devious the developers were.  I could understand if someone had simply not updated their Debian and tarball deliverables, but for the CPAN deployment to be broken actually takes some effort from the developers themselves to ensure that the package cannot be installed.  What makes this worse is that, just to get to the point of attempting the CPAN deployment, I had to check out the application’s trunk via SVN, which is not documented anywhere in the developer’s documentation.  I am sorry, but this developer is giving a big middle finger to the Open Source community by not fixing whatever distribution mechanism they are using for their software.  I appreciate the hard work they have certainly put into this product, but if they are going to deliberately make it hard to find and use their software (none of my problems were accidents or coincidences; the automated build and deploy mechanisms have been deliberately broken), they might as well not have bothered to declare it “Open Source.”  Open source is not just a phrase; it is about a certain respect that developers have for themselves and for the greater user community, a respect grounded in the faith that others will benefit more from the fruits of their labor than from restricting use to a chosen few.  Not showing this respect is dishonest and not in keeping with the spirit of the Open Source community.

Seriously, I would have found it easier to download and build the Windows 7 source code than this application.  That is a shameful indictment of this developer and their business practices.  It is no surprise that they offer paid hosting for the software (I’m guessing the Debian packages broke on the same day they announced the hosting plans).

After much hand-wringing, the software finally builds under Ubuntu 9.10.  I have not yet had the luck of deploying it into Apache, but I don’t expect many problems with that (though I do expect to have to hack through their setup scripts to make sure files get copied properly).  I had to spend the whole night chasing missed dependencies in CPAN (I even had to reverse engineer a few build scripts).  I was considering some sort of partnership with the developer, but now I don’t respect them enough to do so.  Instead, I will hack on their software myself.  Of course, I’ll make sure a working Debian package for whatever changes I make is available on my site.

Goodbye, and thanks for the fish!!!