by K Lars Lohn (noreply@blogger.com) at May 21, 2013 03:32 PM
by K Lars Lohn (noreply@blogger.com) at May 20, 2013 05:10 PM
by K Lars Lohn (noreply@blogger.com) at April 24, 2013 08:52 PM
by K Lars Lohn (noreply@blogger.com) at April 14, 2013 03:08 PM
Talk @ http://svc2baltics.com/ in Vilnius, Lithuania, with 1000+ students interested in starting up.


Q&A in UAE w/entrepreneurs and @davemcclure at MAKE.
As I mentioned in my previous blog post, trying out Ganeti can be cumbersome, so I went out and created a platform for testing it out using Vagrant. Now I have a PDF guide that you can use to walk through some of the basic steps of using Ganeti, including testing a fail-over scenario. It's an updated version of a guide I wrote for OSCON last year. Give it a try and let me know what you think!
I totally forgot to blog about this!
Remember how I curate a collection of fail pets across the Interwebs? Sean Rintel is a researcher at the University of Queensland in Australia and has put some thought into the UX implications of whimsical error messages, published in his article: The Evolution of Fail Pets: Strategic Whimsy and Brand Awareness in Error Messages in UX Magazine.
In his article, Rintel credits me with coining the term "fail pet".
Attentive readers may also notice that Mozilla's strategy of (rightly) attributing Adobe Flash's crashes to Flash itself by putting a "sad brick" in place worked remarkably well: Rintel (just like most users, I am sure) assumes this message comes from Adobe, not Mozilla:

Thanks, Sean, for the mention, and I hope you all enjoy his article.
Note: This is a cross-post of an article I published on the Mozilla Webdev blog this week.
During the course of this week, a number of high-profile websites (like LinkedIn and last.fm) have disclosed possible password leaks from their databases. The suspected leaks put huge amounts of important, private user data at risk.
What's common to both these cases is the weak security they employed to "safekeep" their users' login credentials. In the case of LinkedIn, it is alleged that an unsalted SHA-1 hash was used; in the case of last.fm, the technology used is, allegedly, an even worse, unsalted MD5 hash.
Neither of the two technologies follows any sort of modern industry standard and, if they were in fact used by these companies in this fashion, their use exhibits a gross disregard for the protection of user data. Let's take a look at the most obvious mistakes our protagonists made here, and then we'll discuss the password hashing standards that Mozilla web projects routinely apply in order to mitigate these risks.
This one's easy: Nobody should store plain-text passwords in a database. If you do, and someone steals the data through any sort of security hole, they've got all your users' plain-text passwords. (That a bunch of companies still do that should make you scream and run the other way whenever you encounter it.) Our two protagonists above know that too, so they remembered that they read something about hashing somewhere at some point. "Hey, this makes our passwords look different! I am sure it's secure! Let's do it!"
Smart mathematicians came up with something called a hashing function or "one-way function" H: password -> H(password). MD5 and SHA-1 mentioned above are examples of those. The idea is that you give this function an input (the password), and it gives you back a "hash value". It is easy to calculate this hash value when you have the original input, but prohibitively hard to do the opposite. So we create the hash value of all passwords, and only store that. If someone steals the database, they will only have the hashes, not the passwords. And because those are hard or impossible to calculate from the hashes, the stolen data is useless.
"Great!" But wait, there's a catch. For starters, people pick poor passwords. Write this one in stone, as it'll be true as long as passwords exist. So a smart attacker can start with a copy of Merriam-Webster, throw in a few numbers here and there, calculate the hashes for all those words (remember, it's easy and fast) and start comparing those hashes against the database they just stole. Because your password was "cheesecake1", they just guessed it. Whoops! To add insult to injury, they just guessed everyone's password who also used the same phrase, because the hashes for the same password are the same for every user.
Worse yet, you can actually buy(!) precomputed lists of straight hashes (called Rainbow Tables) for alphanumeric passwords up to about 10 characters in length. Thought "FhTsfdl31a" was a safe password? Think again.
This attack is called an offline dictionary attack and is well-known to the security community.
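To make the attack concrete, here is a minimal sketch in Python. The user names, the tiny word list, and the suffixes are made up for illustration; a real attacker would use a far larger dictionary. It shows the two weaknesses described above: the lookup table is computed once and reused for every user, and identical passwords produce identical hashes.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Unsalted SHA-1 of a password, hex-encoded."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


# Hypothetical "stolen" table: user name -> unsalted hash.
stolen = {
    "alice": sha1_hex("cheesecake1"),
    "bob": sha1_hex("cheesecake1"),
    "carol": sha1_hex("FhTsfdl31a"),
}


def build_lookup(words, suffixes=("", "1", "2", "123")):
    """Precompute hash -> candidate password once, reusable for all users."""
    return {sha1_hex(word + s): word + s for word in words for s in suffixes}


def crack(stolen_hashes, lookup):
    """Return every user whose hash appears in the precomputed table."""
    return {user: lookup[h] for user, h in stolen_hashes.items() if h in lookup}


lookup = build_lookup(["cheesecake", "password", "monkey"])
cracked = crack(stolen, lookup)
print(cracked)
```

Both users who picked the dictionary-based password fall at once. The third password is not in this toy word list, which says nothing about how it would fare against a large precomputed table.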
The standard way to deal with this is by adding a per-user salt. That's a long, random string added to the password at hashing time: H: password -> H(password + salt). You then store salt and hash in the database, making the hash different for every user, even if they happen to use the same password. In addition, the smart attacker cannot pre-compute the hashes anymore, because they don't know your salt. So after stealing the data, they'll have to try every possible password for every possible user, using each user's personal salt value.
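A per-user salt can be sketched in a few lines of Python. This is an illustration only: SHA-256 and the 16-byte salt length are my assumptions, and the hash is still a fast general-purpose one, so do not treat this as a recommended storage scheme.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_with_salt(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, str]:
    """Return (salt, H(password + salt)); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # assumed salt length
    digest = hashlib.sha256(password.encode("utf-8") + salt).hexdigest()
    return salt, digest


def verify(password: str, salt: bytes, expected: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users with the same password end up with different stored hashes.
salt_a, hash_a = hash_with_salt("cheesecake1")
salt_b, hash_b = hash_with_salt("cheesecake1")
print(hash_a != hash_b, verify("cheesecake1", salt_a, hash_a))
```

Because each row carries its own salt, a table precomputed without that salt no longer matches anything, and a guess has to be hashed again for every user.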
Great! I mean it, if you use this method, you're already scores better than our protagonists.
But alas, there's another catch: Generic hash functions like MD5 and SHA-1 are built to be fast. And because computers keep getting faster, millions of hashes can be calculated very very quickly, making a brute-force attack even of salted passwords more and more feasible.
So here's what we do at Mozilla: Our WebApp Security team performed some research and set forth a set of secure coding guidelines (they are public, go check them out, I'll wait). These guidelines suggest the use of HMAC + bcrypt as a reasonably secure password storage method.
The hashing function has two steps. First, the password is hashed with an algorithm called HMAC, together with a local salt: H: password -> HMAC(password, local_salt). The local salt is a random value that is stored only on the server, never in the database. Why is this good? If an attacker steals one of our password databases, they would also need to separately attack one of our web servers and gain file access in order to discover this local salt value. If they don't manage to pull off two successful attacks, their stolen data is largely useless.
As a second step, this hashed value (or strengthened password, as some call it) is then hashed again with a slow hashing function called bcrypt. The key point here is slow. Unlike general-purpose hash functions, bcrypt intentionally takes a relatively long time to be calculated. Unless an attacker has millions of years to spend, they won't be able to try out a whole lot of passwords after they steal a password database. Plus, bcrypt hashes are also salted, so no two bcrypt hashes of the same password look the same.
So the whole function looks like: H: password -> bcrypt(HMAC(password, local_salt), bcrypt_salt).
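The two-step shape can be sketched with the Python standard library alone. Note the substitution: bcrypt is not in the standard library, so this sketch uses PBKDF2 (hashlib.pbkdf2_hmac) as a stand-in slow, salted function. The local salt value, the SHA-512 digest for the HMAC step, and the iteration count are all assumptions made for illustration; this is not the django-sha2 implementation.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Hypothetical local salt: lives in server-side configuration, never in the database.
LOCAL_SALT = b"hypothetical-secret-kept-on-the-web-server-only"
ITERATIONS = 100_000  # assumed work factor for the stand-in slow function


def strengthen(password: str) -> bytes:
    """Step 1: HMAC(password, local_salt)."""
    return hmac.new(LOCAL_SALT, password.encode("utf-8"), hashlib.sha512).digest()


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Step 2: run the strengthened value through a slow, salted function.

    PBKDF2 stands in for bcrypt here; the per-password salt plays the
    role of bcrypt_salt.
    """
    if salt is None:
        salt = secrets.token_bytes(16)
    slow = hashlib.pbkdf2_hmac("sha256", strengthen(password), salt, ITERATIONS)
    return salt, slow


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute both steps and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("cheesecake1")
print(check_password("cheesecake1", salt, stored))
```

Only the salt and the final value would go into the database. Without LOCAL_SALT, an attacker holding those two columns cannot even begin the slow step for a guessed password.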
We wrote a reference implementation for this for Django: django-sha2. Like all Mozilla projects, it is open source, and you are more than welcome to study, use, and contribute to it!
Funny you should mention it. Mozilla Persona (née BrowserID) is a new way for people to log in. Persona is the password specialist, and takes the burden/risk away from sites for having to worry about passwords altogether. Read more about Mozilla Persona.
Make no mistake: just like everybody else, we're not invincible at Mozilla. But because we actually take our users' data seriously, we take precautions like this to mitigate the effects of an attack, even in the unfortunate event of a successful security breach in one of our systems.
If you're responsible for user data, so should you.
If you'd like to discuss this post, please leave a comment at the Mozilla Webdev blog. Thanks!

Silicon Valley comes to Beijing and Hong Kong! Awesome crew.
In my home network, I use IPv4 addresses out of the 10.x.y.z/8 private IP block. After AT&T U-Verse contacted me multiple times to make me reconfigure my network so they can establish a large-scale NAT and give me a private IP address rather than a public one (this might be material for a whole separate post), I reluctantly switched ISPs and now have Comcast. I did, however, keep AT&T for television. Now, U-Verse is an IPTV provider, so I had to put the two services (Internet and IPTV) onto the same wire, which as it turned out was not as easy as it sounds.
tl;dr: This is a "war story" more than a crisp tutorial. If you really just want to see the ebtables rules I ended up using, scroll all the way to the end.
IPTV uses IP Multicast, a technology that allows a single data stream to be sent to a number of devices at the same time. If your AT&T-provided router is the centerpiece of your network, this works well: The router is intelligent enough to determine which one or more receivers (and on what LAN port) want to receive the data stream, and it only sends data to that device (and on that wire).
Multicast, the way it is supposed to work: The source server (red) sending the same stream to multiple, but not all, receivers (green).
Turns out, my dd-wrt-powered Cisco E2000 router is--out of the box--not that intelligent and, like most consumer devices, will turn such multicast packets simply into broadcast packets. That means, it takes the incoming data stream and delivers it to all attached ports and devices. On a wired network, that's sad, but not too big a deal: Other computers and devices will see these packets, determine they are not addressed to them, and drop the packets automatically.
Once your wifi becomes involved, this is a much bigger problem: The IPTV stream's unwanted packets easily saturate the wifi capacity and keep any wifi device from doing its job while it is busy discarding packets. This goes so far as to make it entirely impossible to even connect to the wireless network anymore. Besides: Massive, bogus wireless traffic empties device batteries and fills up the (limited and shared) frequency spectrum for no useful reason.
Suddenly, everyone gets the (encrypted) data stream. Whoops.
One solution for this is to install only managed switches that support IGMP Snooping and thus limit multicast traffic to the relevant ports. I wasn't too keen on buying a bunch of really expensive new hardware, though.
In comes ebtables, part of netfilter (the Linux kernel-level firewall package). First I wrote a simple rule intended to keep all multicast packets (no matter their source) from exiting on the wireless device (eth1, in this case).
ebtables -A FORWARD -o eth1 -d Multicast -j DROP
This works in principle, but has some ugly drawbacks:
-d Multicast translates into a destination address pattern that also covers (intentional) broadcast packets (i.e., every broadcast packet is a multicast packet, but not vice versa). These things are important and power DHCP, SMB networking, Bonjour, ... . With a rule like this, none of these services will work anymore on the wifi you were trying to protect.
-o eth1 keeps us from flooding the wifi, but will do nothing to keep the needless packets sent to wired devices in check. While we're in the business of filtering packets, might as well do that too.

So let's create a new VLAN in the dd-wrt settings that only contains the incoming port (here: W) and the IPTV receiver's port (here: 1). We bridge it to the same network, because the incoming port is not only the source of IPTV, but also our connection to the Internet, so the remaining ports need to be able to connect to it still.
dd-wrt vlan settings
Then we tweak our filters:
ebtables -A FORWARD -d Broadcast -j ACCEPT
ebtables -A FORWARD -p ipv4 --ip-src ! 10.0.0.0/24 -o ! vlan1 -d Multicast -j DROP
This first accepts all broadcast packets (which it would do by default anyway, if it weren't for our multicast rule). Any other multicast packets are then dropped if their output device is not vlan1 and their source IP address is not local.
With this modified rule, we make sure that any internal applications can still function properly, while we tightly restrict where external multicast packets flow.
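The decision the two rules make can be modeled in a few lines of Python. This is a simplified sketch, not how ebtables works internally: it assumes IPv4 frames only, assumes the FORWARD chain's default policy is ACCEPT, and the sample addresses are invented. It also shows why a broadcast frame counts as multicast: the multicast ("group") bit is the least-significant bit of the first octet of the destination MAC address, and ff:ff:ff:ff:ff:ff has it set.

```python
import ipaddress

BROADCAST = "ff:ff:ff:ff:ff:ff"
LOCAL_NET = ipaddress.ip_network("10.0.0.0/24")  # matches --ip-src ! 10.0.0.0/24


def is_multicast(mac: str) -> bool:
    """True if the group bit (lowest bit of the first octet) is set."""
    return (int(mac.split(":")[0], 16) & 1) == 1


def decide(dst_mac: str, src_ip: str, out_if: str) -> str:
    """Walk the two FORWARD rules in order for one IPv4 frame."""
    # Rule 1: -d Broadcast -j ACCEPT
    if dst_mac.lower() == BROADCAST:
        return "ACCEPT"
    # Rule 2: non-local source, not leaving via vlan1, multicast destination -> DROP
    if (
        is_multicast(dst_mac)
        and ipaddress.ip_address(src_ip) not in LOCAL_NET
        and out_if != "vlan1"
    ):
        return "DROP"
    return "ACCEPT"  # assumed default policy


print(decide("01:00:5e:01:02:03", "203.0.113.9", "eth1"))   # external stream towards wifi
print(decide("01:00:5e:01:02:03", "203.0.113.9", "vlan1"))  # external stream towards the IPTV VLAN
```

Local multicast, such as Bonjour traffic from a 10.0.0.x host, falls through both rules and is still forwarded everywhere.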
That was easy, wasn't it!
Some illustrations courtesy of Wikipedia.
Ganeti is a very powerful tool but often times people have to look for spare hardware to try it out easily. I also wanted to have a way to easily test new features of Ganeti Web Manager (GWM) and Ganeti Instance Image without requiring additional hardware. While I do have the convenience of having access to hardware at the OSU Open Source Lab to do my testing, I’d rather not depend on that always. Sometimes I like trying new and crazier things and I’d rather not break a test cluster all the time. So I decided to see if I could use Vagrant as a tool to create a Ganeti test environment on my own workstation and laptop.
This all started last year while I was preparing for my OSCON tutorial on Ganeti and was manually creating VirtualBox VMs to deploy Ganeti nodes for the tutorial. It worked well, but soon after I gave the tutorial I discovered Vagrant and decided to adapt my OSCON tutorial to use it. It's a bit like the movie Inception of course, but I was able to successfully get Ganeti working with Ubuntu and KVM (technically just qemu) and mostly functional VMs inside of the nodes. I was also able to quickly create a three-node cluster to test failover with GWM and many facets of the webapp.
The vagrant setup I have has two parts:
The puppet module I wrote is very basic and isn’t really intended for production use. I plan to re-factor it in the coming months into a completely modular, production-ready set of modules. The node boxes are currently running Ubuntu 11.10 (I’ve been having some minor issues getting 12.04 to work), and the internal VMs you can deploy are based on the CirrOS Tiny OS. I also created several branches in the vagrant-ganeti repo for testing various versions of Ganeti, which has helped the GWM team implement better support for 2.5 in the upcoming release.
To get started using Ganeti with Vagrant, you can do the following:
git clone git://github.com/ramereth/vagrant-ganeti.git
cd vagrant-ganeti
git submodule update --init
gem install vagrant
vagrant up node1
vagrant ssh node1
gnt-cluster verify
Moving forward I plan to implement the following:
Please check out the README for more instructions on how to use the Vagrant+Ganeti setup. If you have any feature requests please don’t hesitate to create an issue on the github repo.
At Tag1 Consulting we do a lot of work on increasing web site performance, especially around Drupal sites. One of the common tools we use is memcached combined with the Drupal Memcache module. In Drupal, there are a number of different caches which are stored in the (typically MySQL) database by default. This is good for performance as it cuts down on potentially large/slow SQL queries and PHP execution needed to display content on a site. The Drupal Memcache module allows you to configure some or all of those caches to be stored in memcached instead of MySQL. Typically these cache gets/puts in memcache are much faster than they would be in MySQL, and at the same time this decreases the work load on the database server.

This is all great for performance, but it involves setting up an additional service (memcached) as well as adding a PHP extension in order to communicate with memcached. I've seen a number of guides on how to install these things on Fedora or CentOS, but many of them are outdated or give instructions which I wouldn't suggest, such as building things from source, installing with the 'pecl' command (not great on a package-based system), or using various external yum repositories (some of which don't mix well with the standard repos). What follows is my suggested method for installing these needed dependencies in order to use memcached with Drupal, though the same process should be valid for any other PHP script using memcache.
For the Drupal Memcache module, either the PECL memcache or PECL memcached (note the 'd'!) extensions can be used. While PECL memcached is newer and has some additional features, PECL memcache (no 'd'!) tends to be better tested and supported, at least for the Drupal Memcache module. Yes, the PECL extension names are HORRIBLE and very confusing to newcomers! I almost always use the PECL memcache extension because I've had some strange behavior in the past using the memcached extension; likely those problems are fixed now, but it's become a habit and personal preference to use the memcache extension.
The first step is to get memcached installed and configured. CentOS 5 and 6 both include memcached in the base package repo, as do all recent Fedora releases. Installing memcached is simply a matter of:
# yum install memcached
Generally, unless you really know what you're doing, the only configuration option you'll need to change is the amount of memory to allocate to memcached. The default is 64MB. That may be enough for small sites, but for larger sites you will likely be using multiple gigabytes. It's hard to recommend a standard size to use as it will vary by a large amount based on the site. If you have a "big" site, I'd say start at 512MB or 1GB; if you have a smaller site you might leave the default, or just bump it to 512MB anyway if you have plenty of RAM on the server. Once it's running, you can watch the memory usage and look for evictions (removal of a cache item once the cache is full) to see if you might want to increase the memory allocation.
On all Fedora / CentOS memcached packages, the configuration file is stored in /etc/sysconfig/memcached. By default, it looks like this:
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS=""
To increase the memory allocation, adjust the CACHESIZE setting to the number of MB you want memcached to use.
If you are running memcached locally on your web server (and only have one web server), then I strongly recommend you also add an option for memcached to listen only on your loopback interface (localhost). Whether or not you make that change, please consider locking down the memcached port(s) with a firewall. In order to listen only on the 127.0.0.1 interface, you can change the OPTIONS line to the following:
OPTIONS="-l 127.0.0.1"
See the memcached man page for more info on that or any other settings.
Once you have installed memcached and updated the configuration, you can start it up and configure it to start on boot:
# service memcached start
# chkconfig memcached on
If you are on Fedora and using PHP from the base repo in the distribution, then installation of the PECL extension is easy. Just use yum to install whichever PECL extension you choose:
# yum install php-pecl-memcache
Or
# yum install php-pecl-memcached
CentOS and RHEL can be a bit more complicated, especially on EL5 which ships with PHP 5.1.x, which is too old for most people. Here are the options I'd suggest for EL5:
# yum install php52-pecl-memcache
# yum install php53u-pecl-memcache
EL6 ships with PHP 5.3, though it is an older version than is available for EL6 at IUS. If you are using the OS-provided PHP package, then you can install the PECL memcache extension from the base OS repo. If you want the PECL memcached extension, it is not in the base OS repo, but is available in EPEL. See the instructions linked from the CentOS 5 section above if you need to enable the EPEL repo.
# yum install php-pecl-memcache
Or, enable EPEL and then run:
# yum install php-pecl-memcached
As with EL5, some people running EL6 will also want the latest PHP packages and can get them from the IUS repositories. If you are running PHP from IUS under EL6, then you can install the PECL memcache extension with:
# yum install php53u-pecl-memcache
Similar to EL5, the IUS repo for EL6 does not include the PECL memcached module.
If you are using the PECL memcache extension and will be using the clustering option of the Drupal Memcache module, which utilizes multiple memcached instances, then it is important to set the hash strategy to "consistent" in the memcache extension configuration. Edit /etc/php.d/memcache.ini and set (or un-comment) the following line:
memcache.hash_strategy=consistent
If you are using the PECL memcached module, this configuration is done at the application level (e.g. in your Drupal settings.php).
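Why a "consistent" strategy matters with several memcached instances can be shown with a toy comparison in Python. This is a sketch of the general idea only, not the algorithm either PECL extension actually uses; the server names, the MD5-based hash, and the 100 points per server are assumptions picked for illustration. It counts how many keys change servers when a fourth instance is added, first with naive modulo placement and then with a hash ring.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable integer hash of a string (MD5 is used only for spreading keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_pick(key: str, servers: list) -> str:
    """Naive placement: hash modulo the number of servers."""
    return servers[h(key) % len(servers)]


class Ring:
    """Toy consistent-hash ring with several points per server."""

    def __init__(self, servers, points_per_server=100):
        self._points = sorted(
            (h(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._hashes = [point for point, _ in self._points]

    def pick(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._hashes, h(key)) % len(self._points)
        return self._points[idx][1]


keys = [f"cache-key-{n}" for n in range(2000)]
old = ["mc1:11211", "mc2:11211", "mc3:11211"]  # hypothetical instances
new = old + ["mc4:11211"]

moved_modulo = sum(modulo_pick(k, old) != modulo_pick(k, new) for k in keys)
old_ring, new_ring = Ring(old), Ring(new)
moved_ring = [k for k in keys if old_ring.pick(k) != new_ring.pick(k)]
print(f"modulo: {moved_modulo} keys moved, ring: {len(moved_ring)} keys moved")
```

With modulo placement most keys land on a different server after the change, which in practice means a mostly cold cache. On the ring, only the keys taken over by the new instance move.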
Once you've installed the PECL memcache (or memcached) extension, you will need to reload httpd in order for PHP to see the new extension. You'll also need to reload httpd whenever you change the memcache.ini configuration file.
# service httpd reload
If you have SELinux enabled (you should!), I have an older blog post with instructions on configuring SELinux for Drupal.
That's it, you're now good to go with PHP and memcache!
As I've used cfengine less and less recently, the packages in Fedora and EPEL have been a bit neglected. At one point someone stepped up to update them, but then nothing ever came of it. I've finally updated the packages to the latest upstream version as of this writing (3.3.0) in Fedora 16, Fedora 17, Fedora Devel, and EPEL 6. They should be pushed to the updates-testing repos for each of those releases soon if not already there. There are some package changes since the last 3.x release, so any testing people can do would be appreciated.
I've uploaded EL6 and F17 RPMs here for reference: http://sheltren.com/downloads/cfengine/testing/
Note that these are quite different from the upstream-provided RPMs which simply dump everything in /var/cfengine. The good news here is I've actually provided a source RPM for those that need to tweak the build. Also, I hit some configure errors when attempting to build on EL5 which I haven't worked out yet -- looks like an upstream bug with the configure script to me, so there are no EL5 packages at the moment.
If anyone is willing to co-maintain these in Fedora and/or EPEL with me, please let me know.
I see a lot of people coming by #centos and similar channels asking for help when they’re experiencing a problem with their Linux system. It amazes me how many people describe their problem, and then say something along the lines of, “and I disabled SELinux...”. Most of the time SELinux has nothing to do with the problem, and if SELinux is the cause of the problem, why would you throw out the extra security by disabling it completely rather than configuring it to work with your application? This may have made sense in the Fedora 3 days when SELinux settings and tools weren’t quite as fleshed out, but the tools and the default SELinux policy have come a long way since then, and it’s very worthwhile to spend a little time to understand how to configure SELinux instead of reflexively disabling it. In this post, I’m going to describe some useful tools for SELinux and walk through how to configure SELinux to work when setting up a Drupal web site using a local memcached server and a remote MySQL database server -- a pretty common setup for sites which receive a fair amount of traffic.
This is by no means a comprehensive guide to SELinux; there are many of those already!
http://wiki.centos.org/HowTos/SELinux
http://fedoraproject.org/wiki/SELinux/Understanding
http://fedoraproject.org/wiki/SELinux/Troubleshooting
If you’re in a hurry to figure out how to configure SELinux for this particular type of setup, on CentOS 6, you should be able to use the following two commands to get things working with SELinux:
# setsebool -P httpd_can_network_connect_db 1
# setsebool -P httpd_can_network_memcache 1
Note that if you have files existing somewhere on your server and you move them to the webroot rather than untar them there directly, you may end up with SELinux file contexts set incorrectly on them, which will likely prevent apache from reading those files. If you are having a related problem, you’ll see something like this in your /var/log/audit/audit.log:
type=AVC msg=audit(1324359816.779:66): avc: denied { getattr } for pid=3872 comm="httpd" path="/var/www/html/index.php" dev=dm-0 ino=549169 scontext=root:system_r:httpd_t:s0 tcontext=root:object_r:user_home_t:s0 tclass=file
You can solve this by resetting the webroot to its default file context using the restorecon command:
# restorecon -rv /var/www/html
I’m going to start with a CentOS 6 system configured with SELinux in targeted mode, which is the default configuration. I’m going to be using httpd, memcached, and PHP from the CentOS base repos, though the configuration wouldn’t change if you were to use the IUS PHP packages. MySQL will be running on a remote server which gives improved performance, but means a bit of additional SELinux configuration to allow httpd to talk to a remote MySQL server. I’ll be using Drupal 7 in this example, though this should apply to Drupal 6 as well without any changes.
Here we will set up some prerequisites for the website. If you already have a website set up, you can skip this section.
We will be using tools such as audit2allow, which is part of the policycoreutils-python package. I believe this is typically installed by default, but if you did a minimal install you may not have it.
# yum install policycoreutils-python
Install the needed apache httpd, php, and memcached packages:
# yum install php php-pecl-apc php-mbstring php-mysql php-pecl-memcache php-gd php-xml httpd memcached
Start up memcached. The CentOS 6 default configuration for memcached only listens on 127.0.0.1, which is great for our testing purposes. The default of 64M of RAM may not be enough for a production server, but for this test it will be plenty. We’ll just start up the service without changing any configuration values:
# service memcached start
Start up httpd. You may have already configured apache for your needs; if not, the default config should be enough for the site we’ll be testing.
# service httpd start
If you are using a firewall, then you need to allow at least port 80 through so that you can access the website -- I won’t get into that configuration here.
Install Drupal. I’ll be using the latest Drupal 7 version (7.9 as of this writing). Direct link: http://ftp.drupal.org/files/projects/drupal-7.9.tar.gz
Download the tarball, and expand it to the apache web root. I also use the --strip-components=1 argument to strip off the top level directory, otherwise it would expand into /var/www/html/drupal-7.9/
# tar zxf drupal-7.9.tar.gz -C /var/www/html --strip-components=1
Also, we need to get the Drupal site ready for install by creating a settings.php file writable by apache, and also create a default files directory which apache can write to.
# cd /var/www/html/sites/default/
# cp default.settings.php settings.php
# chgrp apache settings.php && chmod 660 settings.php
# install -d -m 775 -g apache files
Set up a database and database user on your MySQL server for Drupal. This would be something like this:
mysql> CREATE DATABASE drupal;
mysql> GRANT ALL ON drupal.* TO drupal_rw@web-server-ip-here IDENTIFIED BY 'somepassword';
Test this out by using the mysql command line tool on the web host, passing your database server's hostname to -h.
# mysql -u drupal_rw -p -h your-db-hostname-here drupal
That should connect you to the remote MySQL server. Be sure that is working before you proceed.
If you visit your new Drupal site at http://your-hostname-here, you’ll be presented with the Drupal installation page. Click ahead a few times, set up your DB info on the Database Configuration page -- you need to expand “Advanced Options” to get to the hostname field since it assumes localhost. When you click the button to proceed, you’ll probably get an unexpected error that it can’t connect to your database -- this is SELinux doing its best to protect you!
So what just happened? We know the database was set up properly to allow access from the remote web host, but Drupal is complaining that it can’t connect. First, you can look in /var/log/audit/audit.log, which is where SELinux will log access denials. If you grep for ‘httpd’ in the log, you’ll see something like the following:
# grep httpd /var/log/audit/audit.log
type=AVC msg=audit(1322708342.967:16804): avc: denied { name_connect } for pid=2724 comm="httpd" dest=3306 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket
That is telling you, in SELinux gibberish language, that the httpd process was denied access to connect to a remote MySQL port. For a better explanation of the denial and some potential fixes, we can use the ‘audit2why’ utility:
# grep httpd /var/log/audit/audit.log | audit2why
type=AVC msg=audit(1322708342.967:16804): avc: denied { name_connect } for pid=2724 comm="httpd" dest=3306 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket
Was caused by:
One of the following booleans was set incorrectly.
Description:
Allow HTTPD scripts and modules to connect to the network using TCP.
Allow access by executing:
# setsebool -P httpd_can_network_connect 1
Description:
Allow HTTPD scripts and modules to connect to databases over the network.
Allow access by executing:
# setsebool -P httpd_can_network_connect_db 1
audit2why will analyze the denial message you give it and potentially explain ways to correct it if it is something you would like to allow. In this case, there are two built-in SELinux boolean settings that could be enabled for this to work. One of them, httpd_can_network_connect, will allow httpd to connect to anything on the network. This might be useful in some cases, but is not very specific. The better option in this case is to enable httpd_can_network_connect_db, which limits httpd-generated network connections to only database traffic. Run the following command to enable that setting:
# setsebool -P httpd_can_network_connect_db 1
It will take a few seconds and not output anything. Once that completes, go back to the Drupal install page, verify the database connection info, and click on the button to continue. Now it should connect to the database successfully and proceed through the installation. Once it finishes, you can disable apache write access to the settings.php file:
# chmod 640 /var/www/html/sites/default/settings.php
Then fill out the rest of the information to complete the installation.
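If you find the raw denial lines hard to read, a few lines of Python can pull out the interesting fields. This is only a convenience sketch built around the log line shown earlier in this post; the regular expressions are my own and are not part of any SELinux tool, and audit2why remains the better way to find a fix.

```python
import re

# Matches the "avc: denied { ... } for key=value ..." part of an AVC line.
AVC_RE = re.compile(r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)")


def parse_avc(line: str):
    """Return a dict of the fields in an AVC denial line, or None if it is not one."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    fields = {"perms": match.group("perms").split()}
    # The remainder is a run of key=value pairs; values may be double-quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', match.group("rest")):
        fields[key] = value.strip('"')
    return fields


sample = (
    'type=AVC msg=audit(1322708342.967:16804): avc: denied { name_connect } '
    'for pid=2724 comm="httpd" dest=3306 '
    'scontext=unconfined_u:system_r:httpd_t:s0 '
    'tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket'
)
denial = parse_avc(sample)
source_type = denial["scontext"].split(":")[2]
target_type = denial["tcontext"].split(":")[2]
print(denial["comm"], denial["perms"], source_type, "->", target_type)
```

The third part of each context is the SELinux type, which is usually the piece you care about: here a process of type httpd_t was denied name_connect on a port of type mysqld_port_t.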
Now we want to set up Drupal to use memcached instead of storing cache information in MySQL. You’ll need to download and install the Drupal memcache module available here: http://drupal.org/project/memcache
Install that into your Drupal installation, and add the appropriate entries into settings.php. For this site, I did that with the following:
# mkdir /var/www/html/sites/default/modules
# tar zxf memcache-7.x-1.0-rc2.tar.gz -C /var/www/html/sites/default/modules
Then edit settings.php and add the following two lines:
$conf['cache_backends'][] = 'sites/default/modules/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
Now if you reload your site in your web browser, you’ll likely see a bunch of memcache errors -- just what you wanted! I bet it’s SELinux at it again! Check out /var/log/audit/audit.log again and you’ll see something like:
type=AVC msg=audit(1322710172.987:16882): avc: denied { name_connect } for pid=2721 comm="httpd" dest=11211 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:memcache_port_t:s0 tclass=tcp_socket
That’s very similar to the last message, but this one is for a memcache port. What does audit2why have to say?
# grep -m 1 memcache /var/log/audit/audit.log | audit2why
type=AVC msg=audit(1322710172.796:16830): avc: denied { name_connect } for pid=2721 comm="httpd" dest=11211 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:memcache_port_t:s0 tclass=tcp_socket
Was caused by:
One of the following booleans was set incorrectly.
Description:
Allow httpd to act as a relay
Allow access by executing:
# setsebool -P httpd_can_network_relay 1
Description:
Allow httpd to connect to memcache server
Allow access by executing:
# setsebool -P httpd_can_network_memcache 1
Description:
Allow HTTPD scripts and modules to connect to the network using TCP.
Allow access by executing:
# setsebool -P httpd_can_network_connect 1
Again, audit2why gives us a number of options to fix this. The best bet is to go with the smallest and most precise change for our needs. In this case there’s another perfect fit: httpd_can_network_memcache. Enable that boolean with the following command:
# setsebool -P httpd_can_network_memcache 1
Success! Now httpd can talk to memcache. Reload your site a couple of times and you should no longer see any memcache errors. You can be sure that Drupal is caching in memcache by connecting to the memcache CLI (telnet localhost 11211) and typing ‘stats’. You should see some number greater than 0 for ‘get_hits’ and for ‘bytes’.
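If you would rather script that check than type into telnet, here is a minimal Python sketch. It assumes memcached is listening on localhost port 11211, as in this post, and that it answers ‘stats’ with lines of the form "STAT name value" followed by "END". The function names are my own and are not part of Drupal or the memcache module.

```python
import socket


def parse_stats(raw):
    """Turn the text reply to memcached's 'stats' command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(None, 2)
        # Only "STAT <name> <value>" lines carry data; "END" is skipped.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="localhost", port=11211, timeout=2.0):
    """Send 'stats' to memcached and read until the END marker."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_stats(data.decode("ascii", "replace"))


def looks_cached(stats):
    """True when both counters named in the post are greater than zero."""
    return int(stats.get("get_hits", 0)) > 0 and int(stats.get("bytes", 0)) > 0
```

Calling looks_cached(fetch_stats()) after a few page reloads should come back True if Drupal is really using memcached.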
Now we’ve used a couple of SELinux booleans to allow httpd to connect to memcached and MySQL. You can see a full list of booleans which you can control by using the command ‘getsebool -a’. They are basically a preset way for you to allow/deny certain pre-defined access controls.
As I mentioned briefly in the ‘TL;DR’ section, another common problem people experience is with file contexts. If you follow my instructions exactly, you won’t have this problem because we untar the Drupal files directly into the webroot, so they will inherit the default file context for /var/www/html. If, however, you were to untar the files in your home directory, and then use ‘mv’ or ‘cp’ to place them in /var/www/html, they will maintain the user_home_t context which apache won’t be able to read by default. If this is happening to you, you will see the file denials logged in /var/log/audit/audit.log -- something like this:
type=AVC msg=audit(1324359816.779:66): avc: denied { getattr } for pid=3872 comm="httpd" path="/var/www/html/index.php" dev=dm-0 ino=549169 scontext=root:system_r:httpd_t:s0 tcontext=root:object_r:user_home_t:s0 tclass=file
The solution in this case is to use restorecon to reset the file contexts back to normal:
# restorecon -rv /var/www/html
Update: It was noted that I should also mention another tool for debugging audit messages, 'sealert'. This is provided in the setroubleshoot-server package and will also read in the audit log, similar to what I described with audit2why.
# sealert -a /var/log/audit/audit.log
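When you are staring at a long audit log, it can help to pull the interesting fields out of each denial before reaching for audit2why or sealert. The following is a minimal Python sketch that does this for the two denials quoted in this post. The parse_avc helper is my own illustration and not part of any SELinux tool, and it assumes the single-line key=value layout shown above.

```python
import re

# The two denials quoted earlier in this post.
MEMCACHE_DENIAL = (
    'type=AVC msg=audit(1322710172.987:16882): avc: denied { name_connect } '
    'for pid=2721 comm="httpd" dest=11211 '
    'scontext=unconfined_u:system_r:httpd_t:s0 '
    'tcontext=system_u:object_r:memcache_port_t:s0 tclass=tcp_socket'
)
FILE_DENIAL = (
    'type=AVC msg=audit(1324359816.779:66): avc: denied { getattr } '
    'for pid=3872 comm="httpd" path="/var/www/html/index.php" dev=dm-0 '
    'ino=549169 scontext=root:system_r:httpd_t:s0 '
    'tcontext=root:object_r:user_home_t:s0 tclass=file'
)


def parse_avc(line):
    """Return the interesting fields of one AVC denial line as a dict."""
    fields = {}
    # The denied permissions sit between the braces after "denied".
    perms = re.search(r"denied\s+\{([^}]*)\}", line)
    fields["permissions"] = perms.group(1).split() if perms else []
    # Everything else is key=value; values may be double-quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line):
        fields[key] = value.strip('"')
    # The SELinux type is the third colon-separated part of a context.
    for ctx in ("scontext", "tcontext"):
        if ctx in fields:
            fields[ctx + "_type"] = fields[ctx].split(":")[2]
    return fields
```

For the second denial, parse_avc(FILE_DENIAL)["tcontext_type"] comes back as user_home_t, which is the hint that the files were moved in from a home directory and need restorecon.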
A family of tourists, getting ready to watch the sun set on the Pacific coast. I love silhouette photos like this: It's fun to see the different characters with their body shapes and postures.
The CentOS Continuous Release repository (“CR”) was first introduced for CentOS 5.6, and currently exists for both CentOS 5 and CentOS 6. The CR repo is intended to provide package updates that have been released upstream (in RHEL) for the next point release, but that CentOS has not yet officially released because of delays around building, testing, and seeding mirrors for a new point release. For example, once Red Hat releases RHEL 5.8, CentOS will include package updates from 5.8 base and updates in the CentOS 5.7 CR repo until CentOS is able to complete the release of CentOS 5.8. For admins, this means less time without important security updates and the ability to be on the latest packages released in the latest RHEL point release.
What’s included in CR and how might it affect your current CentOS installs? At this point, the CR repo is used only for package updates which are part of the next upstream point release. For example, for CentOS 5.7, once Red Hat releases RHEL 5.8, the CR repo will contain updates from the upstream base and updates repos. When a new update for RHEL 5.8 is released, it will be built in the CentOS build system, go through a relatively minimal amount of QA by the CentOS QA team, and then be pushed to the CentOS 5.7 CR repo. This process will continue until CentOS releases its own 5.8 release. Once CentOS releases 5.8, the CR repo will be cleared out until Red Hat releases the next (5.9) point release.
The CR repo is not enabled by default, so it is up to a system administrator to enable it if desired. That means, by default, you won’t see packages added to the CR repo. Installing the repo is very easy as it’s now part of the CentOS extras repository which is enabled by default. To enable CR, you simply have to:
yum install centos-release-cr
If you don’t have CentOS Extras enabled, you can browse into the extras/ directory for the release of CentOS you’re currently running and download and install the centos-release-cr package by hand, or manually create a centos-cr.repo in /etc/yum.repos.d/
In my opinion, unless you have an internal process for testing/pushing updates, you should absolutely be using the CR repo. Even if you do have your own local processes for updates, I would consider the CR repo to be part of CentOS updates for all intents and purposes, and pull your updates from there for testing/release. The packages in the CR repo can fix known security issues, and without the CR repo you won’t have access to those fixes until the next CentOS point release -- and that can sometimes take longer than we’d like!
In a recent post to the CentOS Developers list, Karanbir Singh proposed moving the CR repo into the main release for 6.x. What this would mean is for CentOS 6.x and onward, we would see the base OS and ISO directories be updated for each point release, but in general, updates would be pushed to a central 6/ directory, basically incorporating CR into what is currently considered updates/.
This proposal is different from the current CR setup in that it incorporates CR into the release by default, and puts less reliance on the old point release model. This will help ensure that people are always running the latest security updates as well as take a bit of pressure off of CentOS developers and QA team when trying to build, test, and release the next point release. If the package updates are already released and in use, point releases become less important (though still useful for new installs).
Incorporating CR more into the main release doesn’t mean that point releases will go away completely. They will still include updated base packages and ISO images, typically with installer bug fixes and/or new and updated drivers. In general, I see this as a good move: it means more people will be getting security updates by default instead of waiting during the time lapse between upstream RHEL releases and the time it takes for CentOS to rebuild, test, and release that point release. Having those packages available by default is great, especially for those admins who don’t pay close attention and wouldn’t otherwise enable the CR repo. It should be noted that at this point, the incorporation of CR into the main release is only being discussed for CentOS 6.x onward and won’t change anything in the 5.x releases where people will still need to manually opt-in to the CR packages.
References:
http://wiki.centos.org/AdditionalResources/Repositories/CR
http://lists.centos.org/mailman/listinfo/centos-cr-announce
http://lists.centos.org/pipermail/centos-devel/2011-November/008268.html
We’ve just released version 0.7 of Ganeti Web Manager. Ganeti Web Manager is a Django-based web application that allows administrators and clients access to their Ganeti clusters. It includes a permissions and quota system that allows administrators to grant access to both clusters and virtual machines. It also includes user groups for structuring access to organizations.
This is the fourth release of Ganeti Web Manager and it contains numerous new features. It also includes various bug fixes and speed optimizations. Here is the full CHANGELOG, or read on for the highlights.
Ganeti Web Manager now has full Xen support. Prior versions could display Xen instances, but now you can create and edit them too. This is an important addition because Xen is a widely used and mature project. Now with full hardware virtualization in Linux 3.0, Xen will continue to be an important technology for virtualization. This was our most often requested feature and we’re glad to have fulfilled it.
Thanks to a large community contribution, internationalization support was added for nearly all aspects of the interface. Users can switch between their default language and any other. Currently only a Greek translation is available, but we’d like to see many more languages. If you can read and write another language this is a great opportunity for you to get involved. We’re using Transifex to coordinate people who want to help translate.
Administrators of larger clusters can now find objects more easily with our search interface. It includes an Ajax auto-complete feature, along with detailed results.
We’ve also added contextual links wherever we could. This included ensuring breadcrumbs were properly formatted on each page. Object Permissions and Object Log were updated to ensure navigating between those screens and Ganeti Web Manager is seamless.
There are now import tools for Nodes. These work the same as for instances. The cache updater has also been reworked to support both Nodes and Instances. It’s now a twisted plugin with modest speed improvements due to Ganeti requests happening asynchronously.
We’ve sought out places where we performed extra or inefficient database queries. We identified numerous places where database interaction could be reduced and pages returned faster. This is an ongoing process. We’ll continue to optimize and improve the responsiveness as we find areas of the project we can improve.
Numerous bugs were fixed in both the user interface and the backend. Notably, the instance creation interface has had several bugs corrected.
We’re building several modules along with Ganeti Web Manager. The following projects have new releases coinciding with Ganeti Web Manager 0.7:
Lance Albertson and I will be speaking about Ganeti & Ganeti Web Manager at several conferences this summer. Catch us at the following events:
Five OSUOSL co-workers and I recently finished a road trip to Google I/O 2011. We took two cars on an 11 hour drive through scenic southern Oregon and northern California. We learned more about Android and other technologies shaping the web. It was also a great opportunity to spend time with each other outside the office.
Monday night we joined about 30 Google Summer of Code mentors for dinner and drinks hosted by the Google Open Source Programs Office. We’re always grateful for events that bring together friends old and new. One developer nervously sat down at our table, professing that he didn’t know anyone. We might not work on the same project, but we’re all part of the open source community.
The highlight of the conference was the double announcement of the Android Open Accessory program and Android @ Home. Both open up Android to integration with third party devices. These features coupled with near field communications (NFC) stand to dramatically change how we use our mobile devices to interact with the world around us. This is not a new idea. X10 home automation has existed since 1975. Zigbee and Z-wave are more modern protocols, but have also been available for years. The difference here is 100 million Android users and a half million Arduino hackers.
As Phillip Torrone wrote on the Makezine Blog, “There really isn’t an easier way to get analog sensor data or control a motor easier and faster than with an Arduino — and that’s a biggie, especially if you’re a phone and want to do this.”
It won’t be a short road. We still have obstacles such as higher costs. A representative from Lighting Science I spoke to at their I/O booth quoted Android@Home enabled LED lights at $30 per bulb. Android and Arduino might be the right combination of market penetration, eager hackers, and solid platforms for a more integrated environment.
My favorite session was How To NFC. NFC (near field communication) is similar to RFID except it only works within a few centimeters. Newer Android phones can send and receive NFC messages any time except when the phone is sleeping. NFC chips can also be embedded in paper, like the stickers that came in our I/O badges. An NFC enabled app can share data such as a URL, or launch a multiplayer game with your friend. It makes complex tasks as simple as “touch the phone here”. Android is even smart enough to launch an app required for an NFC message, or send you to the market to install the app you need. Only the Nexus S supports NFC now, but this feature is so compelling that others will support it soon too.
The other technical sessions were very useful too, whether you were interested in Android, Chrome, or other Google technologies. The speakers were knowledgeable on the subject areas they spoke on. I attended mostly Android talks, and it was great hearing from the people who wrote the APIs we’re trying to use. The sessions were all filmed and are worth watching online.
One of the best features of Ganeti is its ability to grow linearly by adding new servers easily. We recently purchased a new server to expand our ever growing production cluster and needed to rebalance the cluster. Adding the node and expanding the cluster consisted of the following steps:
For simplicity’s sake I’ll cover the last three steps.
Assuming you’re using a secondary network, this is how you would add your node:
gnt-node add -s <secondary ip> newnode
Now let’s check and make sure Ganeti is happy:
gnt-cluster verify
If all is well, continue on; otherwise, try to resolve any issues that Ganeti is complaining about.
Make sure you install ganeti-htools on all your nodes before continuing. It requires Haskell, so just be aware of that requirement. Let’s see what htools wants to do first:
hbal -m ganeti.example.org
Loaded 5 nodes, 73 instances
Group size 5 nodes, 73 instances
Selected node group: default
Initial check done: 0 bad nodes, 0 bad instances.
Initial score: 41.00076094
Trying to minimize the CV...
    1. openmrs.osuosl.org g1.osuosl.bak:g2.osuosl.bak => g5.osuosl.bak:g1.osuosl.bak 38.85990831 a=r:g5.osuosl.bak f
    2. stagingvm.drupal.org g3.osuosl.bak:g1.osuosl.bak => g5.osuosl.bak:g3.osuosl.bak 36.69303985 a=r:g5.osuosl.bak f
    3. scratchvm.drupal.org g2.osuosl.bak:g4.osuosl.bak => g5.osuosl.bak:g2.osuosl.bak 34.61266967 a=r:g5.osuosl.bak f
<snip>
   28. crisiscommons1.osuosl.org g3.osuosl.bak:g1.osuosl.bak => g3.osuosl.bak:g5.osuosl.bak 4.93089388 a=r:g5.osuosl.bak
   29. crisiscommons-web.osuosl.org g2.osuosl.bak:g1.osuosl.bak => g1.osuosl.bak:g5.osuosl.bak 4.57788814 a=f r:g5.osuosl.bak
   30. aqsis2.osuosl.org g1.osuosl.bak:g3.osuosl.bak => g1.osuosl.bak:g5.osuosl.bak 4.57312216 a=r:g5.osuosl.bak
Cluster score improved from 41.00076094 to 4.57312216
Solution length=30
I’ve shortened the actual output for the sake of this blog post. Htools automatically calculates which virtual machines to move, and how, using the least number of operations. In most of these moves, a VM is either simply migrated; migrated with its secondary storage replaced; or migrated, has its secondary storage replaced, and is migrated again. In our environment we needed to move 30 VMs around out of the total 70 VMs that are hosted on the cluster.
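If you want to review a proposed solution before acting on it, the numbered move lines are regular enough to parse. Here is a minimal Python sketch, assuming hbal prints one move per line in the layout shown in this post: step number, instance, old primary:secondary, new primary:secondary, resulting score, and an action list. The parse_moves helper and its field names are my own and are not part of htools.

```python
import re

# Header and solution lines copied from the hbal run in this post.
SAMPLE = """\
Initial score: 41.00076094
Trying to minimize the CV...
    1. openmrs.osuosl.org g1.osuosl.bak:g2.osuosl.bak => g5.osuosl.bak:g1.osuosl.bak 38.85990831 a=r:g5.osuosl.bak f
   29. crisiscommons-web.osuosl.org g2.osuosl.bak:g1.osuosl.bak => g1.osuosl.bak:g5.osuosl.bak 4.57788814 a=f r:g5.osuosl.bak
   30. aqsis2.osuosl.org g1.osuosl.bak:g3.osuosl.bak => g1.osuosl.bak:g5.osuosl.bak 4.57312216 a=r:g5.osuosl.bak
Cluster score improved from 41.00076094 to 4.57312216
"""

MOVE = re.compile(
    r"^\s*(?P<step>\d+)\.\s+(?P<instance>\S+)\s+"
    r"(?P<old_primary>[^\s:]+):(?P<old_secondary>[^\s:]+)\s+=>\s+"
    r"(?P<new_primary>[^\s:]+):(?P<new_secondary>[^\s:]+)\s+"
    r"(?P<score>[\d.]+)\s+a=(?P<actions>.+?)\s*$"
)


def parse_moves(text):
    """Return one dict per numbered move line; other lines are ignored."""
    moves = []
    for line in text.splitlines():
        match = MOVE.match(line)
        if not match:
            continue
        move = match.groupdict()
        move["step"] = int(move["step"])
        move["score"] = float(move["score"])
        # In the command listing in this post, "r:<node>" corresponds to a
        # replace-disks onto <node> and "f" to a migrate.
        move["actions"] = move["actions"].split()
        moves.append(move)
    return moves
```

From the parsed list it is easy to count how many instances land on the new node, or to spot moves that need two operations instead of one.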
Now let’s see what commands we actually would need to run:
hbal -C -m ganeti.example.org
Commands to run to reach the above solution:
  echo jobset 1, 1 jobs
    echo job 1/1
    gnt-instance replace-disks -n g5.osuosl.bak openmrs.osuosl.org
    gnt-instance migrate -f openmrs.osuosl.org
  echo jobset 2, 1 jobs
    echo job 2/1
    gnt-instance replace-disks -n g5.osuosl.bak stagingvm.drupal.org
    gnt-instance migrate -f stagingvm.drupal.org
  echo jobset 3, 1 jobs
    echo job 3/1
    gnt-instance replace-disks -n g5.osuosl.bak scratchvm.drupal.org
    gnt-instance migrate -f scratchvm.drupal.org
<snip>
  echo jobset 28, 1 jobs
    echo job 28/1
    gnt-instance replace-disks -n g5.osuosl.bak crisiscommons1.osuosl.org
  echo jobset 29, 1 jobs
    echo job 29/1
    gnt-instance migrate -f crisiscommons-web.osuosl.org
    gnt-instance replace-disks -n g5.osuosl.bak crisiscommons-web.osuosl.org
  echo jobset 30, 1 jobs
    echo job 30/1
    gnt-instance replace-disks -n g5.osuosl.bak aqsis2.osuosl.org
Here you can see the commands it wants you to execute. Now you can either put these all in a script and run them, split them up, or just run them one by one. In our case I ran them one by one just to be sure we didn’t run into any issues. I had a couple of VMs not migrate properly, but those were easily fixed. I split this up into a three-day migration, running ten jobs a day.
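Splitting the work into daily batches is easy to script. This is a minimal Python sketch that groups the gnt-instance commands by their "echo jobset" markers and then chunks the jobsets ten at a time, as I did by hand. It assumes the one-command-per-line layout of the hbal -C listing; split_jobsets and batches are hypothetical helper names, and nothing here runs any command.

```python
# A few lines copied from the hbal -C listing in this post.
SAMPLE = """\
echo jobset 1, 1 jobs
echo job 1/1
gnt-instance replace-disks -n g5.osuosl.bak openmrs.osuosl.org
gnt-instance migrate -f openmrs.osuosl.org
echo jobset 28, 1 jobs
echo job 28/1
gnt-instance replace-disks -n g5.osuosl.bak crisiscommons1.osuosl.org
"""


def split_jobsets(script_text):
    """Group gnt-instance commands by the 'echo jobset' line before them."""
    jobsets = []
    for line in script_text.splitlines():
        line = line.strip()
        if line.startswith("echo jobset"):
            jobsets.append([])
        elif line.startswith("gnt-instance") and jobsets:
            jobsets[-1].append(line)
    return jobsets


def batches(items, size):
    """Chunk a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    for day, batch in enumerate(batches(split_jobsets(SAMPLE), 10), start=1):
        print("day", day)
        for jobset in batch:
            for command in jobset:
                print("  " + command)
```

Keeping each jobset together matters because the replace-disks and migrate steps for one VM are meant to run in the order hbal lists them.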
The length of time that it takes to move each VM depends on the following factors:
Most of our VMs ranged from 10G to 40G in size and on average took around 10-15 minutes to complete each move. Additionally, make sure you read the man page for hbal to see all the various features and options you can tweak. For example, you could tell hbal to just run all the commands for you, which might be handy for automated rebalancing.
Overall the rebalancing of our cluster went without a hitch outside of a few minor issues. Ganeti made it really easy to expand our cluster with minimal to zero downtime for our hosted projects.
Along with the rest of the OSU Open Source Lab crew (including students), I was invited to the grand opening of Facebook’s new datacenter yesterday in Prineville, Oregon. We were lucky enough to get a private tour by Facebook’s Senior Open Source Manager, David Recordon. I was very impressed with the facility on many levels.
I was glad I was able to get a close look at their Open Compute servers and racks in person. They were quite impressive. One triplet rack can hold ninety 1.5U servers, which can add up quickly. We’re hoping to get one or two of these racks at the OSL. I hope they fit, as those triplet racks were rather tall!
Here’s a look at a bank of their web & memcached servers. You can find the memcached servers with the large banks of RAM in the front of them (72G in each server). The web servers were running the Intel Open Compute boards while the memcached servers were using AMD. The blue LEDs on the servers cost Facebook an extra $0.05 per unit compared to green LEDs.
The hot aisle is shown here and was amazingly quiet. Actually, the whole room was fairly quiet, which is strange compared to our datacenter. It’s because of the design of the Open Compute servers and the fact that they are using negative/positive airflow in the whole facility to push cold/hot air.
They had a lot of generators behind the building, each easily the size of a bus. You can see their substation in the background. Also note the camera in the foreground; they were everywhere, not to mention the security, because of Greenpeace.
The whole trip was amazing and I was just blown away by the sheer scale. Facebook is planning on building another facility next to this one within the next year. I was really happy that all of the OSL students were able to attend the trip as well, as they rarely get a chance to see something like this.
We missed seeing Mark Zuckerberg by minutes, unfortunately. We had a three hour drive back and it was around 8:10PM when we left, and he showed up at 8:15PM. Damnit!
If you would like to see more of the pictures I took, please check out my album below.
Thanks David for inviting us!
I just returned from PyCon 2011, the largest annual gathering of Python users and contributors. The conference was full of energy and I came home with my head stuffed full of new ideas and Python skills. Hilary Mason best described my feelings about PyCon in her opening keynote: “I’m glad I’m in a room where list comprehensions receive spontaneous applause”.
The talks came in many flavors: hands-on tutorials, sessions, a poster session, and open space discussions. Topics included dev-ops, deployment, scalability, concurrency, large scale data processing, science, and much much more. There was a great deal to learn for both novice and experienced programmers alike. Most sessions taught useful skills like:
But some sessions were just fun, mind blowing examples of what you could do with Python:
It was difficult to choose which talks to see during most time-slots. There were just too many great topics to choose from, so it’s fortunate that the session videos are already online. Many thanks to the PyCon team for being so prompt.
It’s an exciting time for Python developers whether you are just entering the workforce, or looking for something new and exciting. Part of the exhibit hall was dedicated to startups looking for new employees, but every other exhibitor was looking for employees, too. There is definitely an employer out there to match your individual passions, and I’m glad to know that my students will have many great choices after graduation.
This was my first PyCon and I did not know many other attendees, so I planned to spend a good deal of time meeting other people. Among the 1380 attendees walking the halls and attending sessions were Python Core Developers, authors of your favorite libraries, keynote speakers, and even Guido. While this could seem intimidating, we all came to PyCon to learn from each other and collaborate. Everyone was welcoming and happy to share knowledge and great conversations.
Six days of talking to random people resulted in many awesome “a ha” moments. Topics spanned programming, technology, science, and art. The ideas I shared in these talks were as valuable as the formal sessions I attended. The best part was making so many new friends. Looking forward to a great PyCon with you all next year in Santa Clara!
We’ve released Ganeti Web Manager 0.6. Ganeti Web Manager is a Django-based web application that allows administrators and clients access to their Ganeti clusters. It includes a permissions and quota system that allows administrators to grant access to both clusters and virtual machines. It also includes user groups for structuring access to organizations.
This release comes after a short development cycle, with the goal of fixing critical bugs and providing important core features. Check out the full change log, or read on for some highlights:
Ganeti Web Manager 0.6 includes multiple improvements to the virtual machine detail view. We’ve added the complete list of virtual machine properties. The layout has been updated to group properties into relevant sections, as well as make it more readable.
The following new controls were added for virtual machines:
Ganeti Web Manager 0.6 also features improvements to the virtual machine deployment process. It now detects and recovers from Ganeti errors better than before. If a create job fails without Ganeti deploying the virtual machine, you can edit the settings and re-submit the job. All other failures will let you continue to the virtual machine detail view where you can use the provided admin tools to repair the virtual machine.
Nodes are now cached by Ganeti Web Manager. This allows views using node data to be displayed faster. We now also provide Node views that allow an admin to issue commands on a node such as migration and changing the node role. The node detail view also provides information from the perspective of a node including used resources and which virtual machines are deployed on it.
Ganeti Web Manager 0.6 now provides a log of actions performed on every object. This will allow admins to see the history of every action taken on a VirtualMachine, Node, or Cluster. It also shows every action a user account has performed. The log is intended to aid auditing and troubleshooting.
Logging is provided by the newly branded Django Object Log app. It is a reusable app that can log generic messages. Each message can define its own rich format, including contextual links to the related objects. Object Log will be developed in parallel with Ganeti Web Manager and future projects by the OSUOSL.
Nicely put together collection of console apps. I want to check out hnb.
Tutorial giving an overview of the simple and well designed mysql API.
Semi automated tool for testing web applications.