Friday, May 14, 2010

Adventures with chef-server and Gentoo, part 7

Continued from Part 6

Earlier this week, I integrated veszig's Portage code into my set of cookbooks. I've also merged veszig's keywords, use, etc. provider into a separate module. Along the way, I cleaned up some things (my fascination with Scheme is showing through in the code) and made it default to EIX if available and fall back to emerge --search if not. I monkey-patched it into the Portage Provider, and may submit it as a real patch upstream to Opscode.
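The EIX-or-emerge fallback can be sketched roughly as below. This is a minimal illustration in plain Ruby, not the code that went into the cookbook: the helper names executable_on_path? and portage_search_command are made up for this example, and the real provider may well pass extra flags to either tool.

```ruby
# Hypothetical helpers for illustration; not the actual cookbook code.

# True if an executable file called `name` exists in one of the PATH directories.
def executable_on_path?(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Build the argv for a package search: eix when it is installed,
# emerge --search otherwise.
def portage_search_command(package, eix_available = executable_on_path?('eix'))
  if eix_available
    ['eix', package]
  else
    ['emerge', '--search', package]
  end
end
```

Passing eix_available explicitly makes the choice easy to exercise on a box that has neither tool installed.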

Some gotchas I ran into in the past few days:

  1. Chef 0.8.10 - knife does not properly dump metadata.json, resulting in a rebuild every time I upload a cookbook.

  2. Chef 0.8.14 - This release fixes the bug in 0.8.10 but introduces another one: it does not properly set the file mode. (More precisely, it looks like it creates the file in a read-only mode and then does not set the mode as specified, or something.)

  3. Chef 0.8.16 - This release fixed the mode bug. However, it doesn't work with json 1.4.3.


Why do I know this? I was working on the Improved Ghetto DNS recipe. It figures out what the private IP is and saves that back into the server index. jtimberman had said you did this with node.save (huh...), then suggested dropping it into a ruby_block so it is saved at execution time rather than compile time. When I thought about it though, I decided to do this at compile time instead and put guards in to keep it from saving the information. This stuff touches host names, the fundamental stuff that Chef assumes you've already figured out. There is (unfortunately) a lag between the time a private IP gets reported and when it gets indexed by the server index, so we essentially have an ad hoc eventual convergence over multiple runs of chef-client. Yes. That's why it is Ghetto DNS.
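As a rough sketch of what such a guard might look like, here is the idea in plain Ruby. FakeNode is a stand-in for Chef's node object (only [], []= and save are modeled), and the :private_ip attribute name and the record_private_ip helper are assumptions for this example, not what the recipe actually uses. I am also assuming the guard means "skip the save when nothing has changed".

```ruby
# FakeNode stands in for Chef's node; it only counts how often save is called.
class FakeNode
  attr_reader :save_count

  def initialize(attrs = {})
    @attrs = attrs
    @save_count = 0
  end

  def [](key)
    @attrs[key]
  end

  def []=(key, value)
    @attrs[key] = value
  end

  def save
    @save_count += 1
  end
end

# Record the private IP and save the node only when the value is usable
# and differs from what is already stored (hypothetical attribute name).
def record_private_ip(node, detected_ip)
  return false if detected_ip.nil? || detected_ip.empty?
  return false if node[:private_ip] == detected_ip

  node[:private_ip] = detected_ip
  node.save
  true
end
```

With a guard like this, repeated chef-client runs only hit the server when the detected address actually changes.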

When Ghetto DNS drops the new /etc/hosts file, chef-client continues on its merry way. Unless the file ends up with mode 0600, of course, in which case chef-client doesn't know where chef-server is. Oops. Chef 0.8.16 fixed that. Too bad there is no chef-overlay that pulls that in, so I ended up updating with rubygems.

I am probably going to create Chef gem upgrade recipes at some point. However, I've come to the end of the iteration, so I'll need to work on a different project. Maybe I'll update more Gentoo stuff in my off hours, at least until the next Chef iteration comes up.

As it is, I've gotten a lot of monit recipes up. One gotcha, at least for 0.8.10, was that nested definitions do not properly pass params, so you had to rebind variables locally within the block. As it is, I created several monit macros:
monit

Basic monit macro.

Example:
monit 'my_nginx_site' do
  cookbook 'my_site'
  source 'nginx.monit.erb'
  variables(:listen_ips => listen)
end

monit_service

Monit a background service

monit_service 'chef-solr-indexer' do
  process :pid_file => '/var/run/chef/solr-indexer.pid',
          :timeout_before_restart => '30'
end

monit_net_service

Monit a service listening on an IP port

monit_net_service 'couchdb' do
  process :listen_ips => [%w(127.0.0.1 5984)],
          :pid_file => '/var/run/couchdb/couchdb.pid',
          :timeout_before_restart => '30'
end

monit_http_service

Monit an HTTP service, checking the HTTP protocol

monit_http_service 'chef-server-webui' do
  process :listen_ips => [[nil, '4040']],
          :pid_file => '/var/run/chef/server-webui.4040.pid',
          :timeout_before_restart => '30'
end


monit_service, monit_net_service, and monit_http_service all take the following arguments:
process

  • :start_cmd - override the start command

  • :stop_cmd - override the stop command

  • :pid_file - override the pid file

  • :timeout_before_restart - override the timeout

cookbook

Specify the cookbook to pull the source template from

source

Specify the source template to use

variables

Specify the variables to pass to the template

enable

If true, then create the monit configuration file.
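To illustrate how the process overrides could be layered over per-service defaults, here is a small plain-Ruby sketch. It is not the definition's real code: the monit_process_options helper and every default value in it (init script paths, pid file location, timeout) are assumptions made up for this example. Only the option names come from the list above, plus :listen_ips from the examples.

```ruby
# Hypothetical helper: merge caller overrides over per-service defaults.
# All default values below are illustrative assumptions.
def monit_process_options(name, overrides = {})
  defaults = {
    :start_cmd => "/etc/init.d/#{name} start",
    :stop_cmd => "/etc/init.d/#{name} stop",
    :pid_file => "/var/run/#{name}.pid",
    :timeout_before_restart => '30',
    :listen_ips => []
  }

  # Reject misspelled keys instead of silently ignoring them.
  unknown = overrides.keys - defaults.keys
  unless unknown.empty?
    raise ArgumentError, "unknown process option(s): #{unknown.inspect}"
  end

  defaults.merge(overrides)
end
```

Rejecting unknown keys is a design choice for the sketch: a misspelled option would otherwise be dropped without any warning and the default would quietly win.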


For complete source code, see it here.

With these, I created chef::monit_server, chef::monit_server_webui, chef::monit_solr, chef::monit_solr_indexer, couchdb::monit_server, and added monit to gentoo::portage_rsync_server and gentoo::portage_binhost_server. Weirdly enough, the one that gave me the most trouble is rsyncd, since it does not behave very well as a daemon. I still have not figured out how to monit rabbitmq, because the stock rabbitmq-server does not drop a pid file. (rabbitmq-multi does, however.)

It was loads of fun randomly killing off processes and watching them come back up.

Finally, Seth at MaxMedia told me about CloudKick at the Atlanta Ruby User's Group meetup. It is too much for me, but I suppose if I were doing something that I needed to show investors or enterprise customers that "We Can Scale", that would be it.
