Tuesday, May 11, 2010

Idea: Cost Accounting using Chef

Since Chef describes your infrastructure as a set of resources, I think it is possible to go beyond the basic bandwidth + CPU/hours for cost accounting.

Suppose you have a set of heterogeneous web sites, each built from a different class of recipes:

  • Site A: static files

  • Site B: static files

  • Site C: PHP-FCGI (WordPress)

  • Site D: Python (Trac)

  • Site E: Ruby on Rails 2

  • Site F: Ruby on Rails 3


In our hypothetical infrastructure, Sites A and B are served from a single Nginx server. Sites C and D are also served from that same Nginx server, using proxies.

Sites E and F each run as a stand-alone web app on their own host; however, they share the same master MySQL database (though their schemas are separate).

Now, how do you do the cost accounting for that? If we were to use basic bandwidth + CPU/hours, the best we can do is determine what each host costs us (four hosts, in this case). This assumes you are using EC2 or Rackspace CS.

On the other hand, we know from the Chef repo describing this whole setup that Sites A, B, C, and D all run on Host 1 and split its cost. We can weight Sites C and D higher, since they are not serving static files and therefore don't take advantage of Linux sendfile.

Sites E and F have Hosts 2 and 3, respectively. However, each would also carry a share of the cost of Host 4, which is used as the MySQL master.
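To make the arithmetic concrete, here is a minimal sketch of that split in Python. Everything in it is an assumption for illustration: the monthly host costs are invented, the 2:1 weighting of dynamic over static sites is arbitrary, and the even split of the MySQL master between Sites E and F is just the simplest choice. The names `HOST_COST`, `USAGE`, and `allocate` are hypothetical and not part of Chef.

```python
# Hypothetical monthly cost per host; the figures are invented.
HOST_COST = {"host1": 120.0, "host2": 80.0, "host3": 80.0, "host4": 120.0}

# Which sites use each host, with a relative weight per site.
# Assumed weights: dynamic sites (C, D) count double a static site (A, B),
# and Sites E and F split the MySQL master (host4) evenly.
USAGE = {
    "host1": {"A": 1, "B": 1, "C": 2, "D": 2},
    "host2": {"E": 1},
    "host3": {"F": 1},
    "host4": {"E": 1, "F": 1},
}


def allocate(host_cost, usage):
    """Split each host's cost across its sites in proportion to their weights."""
    totals = {}
    for host, weights in usage.items():
        weight_sum = sum(weights.values())
        for site, weight in weights.items():
            share = host_cost[host] * weight / weight_sum
            totals[site] = totals.get(site, 0.0) + share
    return totals


site_cost = allocate(HOST_COST, USAGE)
for site in sorted(site_cost):
    print(site, round(site_cost[site], 2))
```

Because every host's cost is divided entirely among its sites, the per-site totals always add back up to the total bill, whatever weights you pick.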

I'm not exactly sure how this would be implemented. We could query this node by node, but better yet, if we had a higher-level description of each of those sites, we could calculate the cost structure around that.
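One way the per-node query could feed a site-level view is sketched below in Python. It assumes each node carries a list of the sites it serves. The `sites` key and the `NODES` list are hypothetical stand-ins for whatever node data you would actually pull from the Chef server, and this is not a Chef API.

```python
from collections import defaultdict

# Hypothetical node records; in practice these might come from a node search.
# The "sites" attribute is an assumed convention, not something Chef defines.
NODES = [
    {"name": "host1", "sites": ["A", "B", "C", "D"]},
    {"name": "host2", "sites": ["E"]},
    {"name": "host3", "sites": ["F"]},
    {"name": "host4", "sites": ["E", "F"]},  # shared MySQL master
]


def hosts_by_site(nodes):
    """Invert the node -> sites data into a site -> hosts index."""
    index = defaultdict(list)
    for node in nodes:
        for site in node["sites"]:
            index[site].append(node["name"])
    return dict(index)


site_hosts = hosts_by_site(NODES)
for site in sorted(site_hosts):
    print(site, site_hosts[site])
```

With an index like this, the site becomes the unit you do the accounting on, and the hosts are just line items underneath it.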

Ultimately, this doesn't matter as much if your site runs a single app exclusively. But a large company with many different subsystems, such as Flickr or Facebook, would want to track the logical cost of each subsystem against the revenue that subsystem brings in.
