DHCP failover with ISC DHCP

November 22, 2009

I recently set up DHCP with failover for a roughly 4,000-node network. While this isn’t the largest network on earth, it’s big enough to warrant some careful thought before implementing. Unfortunately, I found the documentation to be lacking, outdated, and in some cases flat-out wrong (!), which may cause a less reckless person to shy away from implementation.

Requirements

Here’s what I had. The ISC DHCP version is important because the DHCP failover protocol is a moving target:

  • ISC DHCP 3.1.1 (make sure the version numbers match on both servers)
  • Two DHCP servers running Debian Lenny

Configuration

Primary

key primaryserver {
        algorithm hmac-md5;
        secret "examplesecret";
};

omapi-key primaryserver;
omapi-port 7911;

failover peer "example" {
        primary;
        address 192.0.2.1;
        port 519;
        peer address 192.0.2.2;
        peer port 519;
        max-response-delay 30;
        max-unacked-updates 10;
        load balance max seconds 3;
        mclt 300;
        split 128;
};

In most cases, what I have here should work just fine for you, with the exception of the mclt. This value should be tweaked based on your personal setup. Read section 5.14 of the draft specification implemented by DHCP 3.1.1 if you want more information on choosing the MCLT, but dhcpd.conf(5) has this useful summary:

          The  mclt statement defines the Maximum Client Lead Time.   It
          must be specified on the primary, and may not be specified  on
          the  secondary.   This is the length of time for which a lease
          may be renewed by either failover peer without contacting  the
          other.    The longer you set this, the longer it will take for
          the running server to recover IP addresses after  moving  into
          PARTNER-DOWN  state.    The  shorter you set it, the more load
          your servers will experience when they are not  communicating.
          A  value  of  something  like 3600 is probably reasonable, but
          again bear in mind that we have no real operational experience
          with this.

I set my MCLT value to be fairly short (5 minutes) because the servers are underutilized as it is, and I can easily handle the potential higher load. Since we are somewhat strapped for addresses, a short MCLT will allow me more flexibility in moving leases around in failover scenarios, or putting a server into partner-down state, which I will talk a little bit about later.

Secondary

key secondaryserver {
    algorithm hmac-md5;
    secret "examplesecret";
};

omapi-key secondaryserver;
omapi-port 7911;

failover peer "example" {
    secondary;
    address 192.0.2.2;
    port 519;
    peer address 192.0.2.1;
    peer port 519;
    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;
};

Please note that beyond the keys and such, the only difference in the failover peer section is that the mclt and split lines are missing from the secondary’s configuration. They only need to be configured on the primary, and will be communicated to the secondary when the servers start up. Also note that I used the same port on both peers unlike the man page’s example. This is cleaner to me and is perfectly legal.
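To make the primary/secondary difference concrete, here is a minimal Python sketch that renders both failover peer blocks from one set of values, so that mclt and split can only ever land on the primary. The function name failover_block and the constants are hypothetical helpers for illustration; the statement values and the block layout are simply copied from the configuration above.

```python
# Statements that appear on both peers in the configuration above.
SHARED = [
    ("max-response-delay", 30),
    ("max-unacked-updates", 10),
    ("load balance max seconds", 3),
]
# Statements that belong on the primary only.
PRIMARY_ONLY = [("mclt", 300), ("split", 128)]


def failover_block(name, role, address, peer_address, port=519):
    """Render a failover peer block for the given role (hypothetical helper)."""
    if role not in ("primary", "secondary"):
        raise ValueError("role must be 'primary' or 'secondary'")
    statements = [
        (role, None),
        ("address", address),
        ("port", port),
        ("peer address", peer_address),
        ("peer port", port),
    ] + SHARED
    if role == "primary":
        statements += PRIMARY_ONLY
    body = [
        f"    {key};" if value is None else f"    {key} {value};"
        for key, value in statements
    ]
    return "\n".join([f'failover peer "{name}" {{'] + body + ["};"])


if __name__ == "__main__":
    print(failover_block("example", "primary", "192.0.2.1", "192.0.2.2"))
    print()
    print(failover_block("example", "secondary", "192.0.2.2", "192.0.2.1"))
```

Generating both blocks from one source is only a convenience. The point is that the two files should differ in role, addresses, and the two primary-only statements, and nothing else.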

Key generation

In order to generate the keys you see in the configuration file above, you’ll need to use dnssec-keygen.

I used the following command on the primary server:

  dnssec-keygen -a HMAC-MD5 -b 512 -n HOST primaryserver

and this one on the secondary:

  dnssec-keygen -a HMAC-MD5 -b 512 -n HOST secondaryserver

The only difference between the two is the name of the key. The fun part comes next. Inside the generated .key file that you just created (there will be a file ending in .key and one in .private) the very last column is your secret, and it’s going to look something like:

FNUM+qEBnDXmWf18wjTvt77Cy/IcJw==

Use this in place of examplesecret. I don’t think it matters where the two generated files are kept; as far as I can tell, we are just abusing dnssec-keygen to generate a properly formatted key. I kept them around anyway on each of my servers in case I need them later.
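If the goal really is just a random base64 string, the same kind of value can be produced without dnssec-keygen. The following Python sketch is an alternative under that assumption, not what was used for this setup; make_secret is a hypothetical name, and you should confirm that your dhcpd accepts the result before relying on it.

```python
import base64
import secrets


def make_secret(bits=512):
    """Return `bits` of random data, base64-encoded (hypothetical helper)."""
    if bits <= 0 or bits % 8:
        raise ValueError("bits must be a positive multiple of 8")
    return base64.b64encode(secrets.token_bytes(bits // 8)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value in place of examplesecret.
    print(make_secret())
```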

Operation

Start up your servers. There will be a short period of time where they will not serve anything while they share lease information with each other. This is a one-time startup cost, and should last under 30 seconds. Unfortunately, some configuration errors (such as putting hostnames in the address and peer address statements) will result in a hard-to-see error in the logs, followed by the two servers stuck in the recover state and not serving leases to anyone. If your servers are recovering for longer than a minute or two, abort the failover configuration and try to figure out what you did wrong.
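Since the hostname mistake is easy to make and hard to spot in the logs, a quick pre-flight check can catch it before the servers are started. This is a minimal Python sketch, assuming the address and peer address statements are each written on their own line as in the configuration above; the function name and the regular expression are my own and are not part of ISC DHCP.

```python
import ipaddress
import re

# Matches "address X;" and "peer address X;" at the start of a line.
ADDRESS_RE = re.compile(r"^\s*(?:peer\s+)?address\s+([^;\s]+)\s*;", re.MULTILINE)


def non_literal_addresses(conf_text):
    """Return every address value that is not a literal IP address."""
    bad = []
    for match in ADDRESS_RE.finditer(conf_text):
        value = match.group(1)
        try:
            ipaddress.ip_address(value)
        except ValueError:
            bad.append(value)
    return bad


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for value in non_literal_addresses(handle.read()):
            print("not an IP literal:", value)
```

Run it against dhcpd.conf on each server. An empty result only rules out this one mistake; it does not validate the rest of the file.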

Long-term maintenance (partner-down)

If you are going to be doing maintenance on one of the servers for a long period of time and you want the other server to serve all of your leases, you will need to put the server you want to keep online in the partner-down state, which says that the other server is not coming back online anytime soon and you should go ahead and serve all leases. When the second server does come back online, reintegration will still be possible, but it will take a little longer.

When you are ready to do this, shut down the server you are taking offline. If you don’t do this, the servers will immediately reconnect once you put one into partner-down. This behavior is nice later on because when you are ready for reintegration, you just boot up the downed server and everything happens automatically. You can then put the server that will be up into the partner-down state using the following steps:

Open omshell:

  omshell

Enter the key name and secret from your dhcpd.conf (so open it up in a place you can easily copy/paste from). omshell’s key command takes the key name followed by the secret:

  key YOUR_KEY_NAME "YOUR SECRET HERE"

Now connect. The default host and port will work if you’re logged in to the DHCP server itself; otherwise you’ll need to specify them before connecting:

  connect

Now open the failover-state object, which is how we manipulate the server’s failover state:

  new failover-state

Next, set the name of the failover-state object to open. The name is the same as the one in the failover peer declaration in dhcpd.conf:

  set name = "YOUR failover peer BLOCK NAME HERE"

Now you can open the object, and your state will be listed:

  open

Now change local-state to 4, which is partner-down. Please note that in ISC DHCP 3.1.1 this state’s integer value is wrong in the man page. Consult a newer version’s man page for the correct table; I used the one from DHCP 4.1:

  set local-state = 4

Lastly, update the running server configuration:

  update

You should see messages in the logs indicating that the server has moved to the partner-down state.
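If you expect to do this more than once, the command sequence is easy to generate so that nothing gets mistyped under pressure. The Python sketch below only builds the text of the omshell session described above. It assumes that omshell's key command takes the key name followed by the secret, and that omshell will read commands from standard input; the function name is hypothetical.

```python
PARTNER_DOWN = 4  # local-state value for partner-down, as discussed above


def partner_down_script(key_name, secret, peer_name):
    """Return the omshell commands that put the local server in partner-down."""
    commands = [
        f'key {key_name} "{secret}"',
        "connect",
        "new failover-state",
        f'set name = "{peer_name}"',
        "open",
        f"set local-state = {PARTNER_DOWN}",
        "update",
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # Example values taken from the primary's configuration in this post.
    print(partner_down_script("primaryserver", "examplesecret", "example"), end="")
```

Review the printed commands before feeding them to omshell, and remember to shut down the other server first.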

Please let me know if this helped you out at all, or if I left something important out. One of the reasons I’m writing this post is so that more people can figure out how to set up failover because of the awful state of existing documentation, so I am very interested in making this as easy-to-follow as possible as well as including rationale and pointers to other solid documentation wherever possible.

Simple CouchDB: Feed Caching

February 25, 2009

For those of you who haven’t heard of CouchDB, it is a document-oriented database system that, unlike MySQL or PostgreSQL, aims to solve certain problems differently and often more elegantly than its RDBMS counterparts. I won’t go into any more detail than that.

In designing and mocking up a site that consumes RSS/Atom feeds, I wanted to come up with a very simple solution for deployment that didn’t involve caching in the same way I had been caching some other data objects. I didn’t want to have the user trigger a cache replacement, for a few reasons that I’ll touch on below. First, a few solutions you might come up with:

Just pull the feed when the page gets hit.

This is not really an appropriate solution unless the feed you are consuming is inside your organization or something to that effect. I think this realization is obvious to most people, but it’s certainly the easiest way of doing things, so I can’t help but wonder how many sites try this.

Pull the feed once and cache it for some length of time.

This works better than the first solution. I think it’s passable, but it still suffers from some weaknesses that are dealbreakers for many people. First of all, what if the target site goes down? Aside from the usual error-handling code, you’ll need to make sure you don’t stick failed results in your cache. One nasty issue that might bite you here: if the cache TTL expires while the target site is down, the resulting cache miss triggers a fetch that fails, and suddenly all of your other visitors miss against the cache too. Depending on your backend, you might not even be able to serve stale results, because they were invalidated when the TTL expired.

Pull the feed periodically and fetch it from your local store.

This is the situation for which I’m employing CouchDB.

What’d I do?

  1. Wrote a short script that can be run from cron to fetch the feeds I want. This ended up being 60 lines with comments and error handling, without any CouchDB libraries.
  2. Used _show to set up a “view” whereby my client could simply request the feed from CouchDB in exactly the same way as it would request it from the actual URI. Not even simple JSON decoding is needed by the client; just point your feed-consuming library at the URI. I can’t stress enough how cool this is.

Because URIs for feeds are unique, each of my feeds could even use its URI as the id in the database! Perfect. When a remote feed goes down, the cron fetcher skips updating that feed and the feed simply doesn’t get refreshed for the length of the downtime. No extra error handling needed beyond the usual. The implementation ends up being dead simple, fast, and (because of CouchDB) scalable if you need it.
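The fetcher described above might look something like the following Python sketch. It is not the original script: the database URL, the "body" field name, and all function names are assumptions for illustration. It talks to CouchDB over plain HTTP with the standard library, uses the feed URI as the document id, and leaves the stored copy untouched when a fetch fails.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

COUCH_DB = "http://127.0.0.1:5984/feeds"  # assumed database URL


def doc_url(feed_uri):
    # The feed URI is the document id, so it must be percent-encoded in the path.
    return COUCH_DB + "/" + urllib.parse.quote(feed_uri, safe="")


def fetch_feed(feed_uri, timeout=10):
    """Return the feed body as text, or None if the remote site is unavailable."""
    try:
        with urllib.request.urlopen(feed_uri, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return None


def current_rev(feed_uri):
    """Return the stored document's revision, or None if it does not exist yet."""
    try:
        with urllib.request.urlopen(doc_url(feed_uri)) as resp:
            return json.load(resp).get("_rev")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def save_feed(feed_uri, body):
    """Create or update the document that holds this feed."""
    doc = {"body": body}
    rev = current_rev(feed_uri)
    if rev:
        doc["_rev"] = rev  # CouchDB needs the current revision to update
    req = urllib.request.Request(
        doc_url(feed_uri),
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req).close()


def refresh(feed_uris, fetch=fetch_feed, save=save_feed):
    """Refresh every feed that can be fetched; skip the ones that are down."""
    updated = []
    for uri in feed_uris:
        body = fetch(uri)
        if body is None:
            continue  # keep serving the previously stored copy
        save(uri, body)
        updated.append(uri)
    return updated
```

Passing fetch and save as parameters keeps the skip-on-failure logic testable without a network or a running CouchDB. A cron job would just call refresh with the list of feed URIs.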

PHP tip: LimitIterator

January 25, 2009

If you’ve ever found yourself writing the anti-pattern:

$i = 0;
foreach ($iterable as $element) {
    if ($i >= 5)
        break;

    $element->doStuff();
    $i++;
}

you might be able to make use of PHP’s SPL (LimitIterator needs an Iterator, so wrap a plain array in ArrayIterator first):

foreach (new LimitIterator($iterable, 0, 5) as $element) {
    $element->doStuff();
}

Introduction

I’ve found that the documentation for getting this working is lacking, probably because not all that many people have to do it. Most related links would lead you to believe the process involves Makefile hacks, library shuffling, and any number of difficult-for-the-next-guy-to-reproduce (might be you!) steps. My main goal in writing this is to make it as simple as possible, but no simpler. Additionally, I’d like as much as possible to explain why I chose to do something along with how I chose to do it.

As the post title implies, this is Debian-specific, though I’ve used several other distributions and I see no reason why this post can’t be fairly simply adapted to them. Leave me a comment if these instructions are useful to you and especially if you were able to get it working with another distribution.

What You’ll Need

The Oracle Instant Client is distributed here as a binary package. Navigate to the Platform Downloads section and pick out your architecture. I used the Basic Lite because I do not need translations, but you could just as easily use the Basic package if you do. You will also need the SDK package if you want anything else to be able to use the Instant Client. I do, because I want to set up Perl and PHP to use the Instant Client. Additionally, I need the SQL*Plus package to do simple administrative tasks for our department such as password changes. This may not be something you need. If not, you can safely leave it out. Download the RPMs, and you’re on your way.

One neat trick if you’re like me and don’t want to set up an Oracle account is to use a service called BugMeNot to use an account that someone else has set up. profile.oracle.com seems to have the accounts that worked for me. If you like the service, don’t ruin it for everyone else by doing stupid things with the account. If you want to create an account, you can easily do that instead.

Installation

Now that you have the RPMs you need, we’ll convert them into packages that Debian can use. This will make it easier to do package management. Use a utility called alien to convert the RPMs into .deb packages that you can then install using dpkg. If you don’t have alien, apt-get install alien.


# Replace the filename with the RPM you want to convert.
# You can specify multiple RPMs with one command if you have them.
alien oracle-instantclient-foo.rpm

Now that you have the packages:


# Do this for all of your packages.
dpkg -i oracle-instantclient-foo.deb

and now they’re installed!

Configuration

The next step is to configure Oracle’s tnsnames.ora, which tells Oracle a little bit about the database(s) you’ll be connecting to. This will be site-specific. Before I go any farther, I’d like to define something I’ll use from here on out: ORACLE_HOME. This allows me to refer to that, wherever it may be on your system. By default, my Debian systems (both Etch and Lenny) put the Instant Client in /usr/lib/oracle/{version}/client (64-bit systems will have client64 instead of client). Anywhere you see this in my post, replace ORACLE_HOME with your local path to the Instant Client. That said, you will now put your tnsnames.ora file into ORACLE_HOME/network/admin/tnsnames.ora. Ask your local DBA for help with this part if you don’t know what I’m talking about.
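For scripts that need to locate the Instant Client, the path convention just described can be captured in a couple of lines. This Python sketch assumes the default Debian layout mentioned above and treats only x86_64/amd64 as 64-bit; the function names and the example version string are placeholders, not anything Oracle ships.

```python
import platform


def oracle_home(version, machine=None):
    """Return the default Instant Client path for a version (hypothetical helper)."""
    machine = machine or platform.machine()
    client = "client64" if machine in ("x86_64", "amd64") else "client"
    return f"/usr/lib/oracle/{version}/{client}"


def tnsnames_path(home):
    """Return where tnsnames.ora goes inside ORACLE_HOME."""
    return home + "/network/admin/tnsnames.ora"


if __name__ == "__main__":
    # "VERSION" is a placeholder; use the directory name actually installed.
    home = oracle_home("VERSION")
    print(home)
    print(tnsnames_path(home))
```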

Essentially, you’re done with the required parts! Easy, wasn’t it? Now, if you want SQL*Plus to work, you’ll probably need to tell the linker where to find the Oracle libraries, since they aren’t on the linker path by default. On a Debian system, the linker looks in /etc/ld.so.conf.d for files that it can source. To make my local changes easy to reproduce, I made a new file called oracle.conf and placed it in /etc/ld.so.conf.d. All this file contains is:


ORACLE_HOME/lib

Then, run ldconfig with no arguments to get the linker to source the files, and you’re set. Now, you’re ready to test it! Simply type


ORACLE_HOME/bin/sqlplus

and you should be able to go about your normal Oracle administration. Have fun!

When I tried this for the first time, I got some complaints about libaio. If you see similar complaints after trying to run sqlplus, try:


apt-get install libaio1

to install the asynchronous I/O libraries.

Finishing Touches

The Oracle Instant Client is an interesting beast since it requires an ORACLE_HOME environment variable to operate. My database connection libraries inject that appropriate environment variable when an Oracle database is connected to. If you don’t do something like this, your connections will fail. Optionally, you can do other things such as add this to Apache’s environment so it will always be available, or you can place this environment variable in the environment of users that run certain scripts.
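As an illustration of injecting the variable from code, here is a small Python sketch that builds an environment with ORACLE_HOME set before launching anything that uses the Instant Client. The function name is hypothetical and the path in the example is a placeholder, so substitute your own ORACLE_HOME.

```python
import os


def with_oracle_env(oracle_home, environ=None):
    """Return a copy of the environment with ORACLE_HOME set."""
    env = dict(os.environ if environ is None else environ)
    env["ORACLE_HOME"] = oracle_home
    return env


if __name__ == "__main__":
    import subprocess

    # Placeholder path; replace it with your real ORACLE_HOME.
    home = "/usr/lib/oracle/VERSION/client"
    subprocess.run([home + "/bin/sqlplus", "-V"], env=with_oracle_env(home))
```

Returning a copy rather than modifying os.environ keeps the change scoped to the processes that actually need it.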

Optional: PHP Integration

We’ll be installing the OCI8 module for PHP. You’ll need to make yourself root for this. You’ll also need to have installed the SDK mentioned earlier, and probably have gcc and make installed.


cd /usr/local/src
pecl download oci8
zcat oci8-version.numbers.here.tgz | tar xvf -
cd oci8-version.numbers.here
# For the phpize step, you might have to install php5-dev
# if it doesn't already exist.
phpize
# Remember, replace ORACLE_HOME with your ORACLE_HOME!
./configure --with-oci8=instantclient,ORACLE_HOME/lib
make
make install

Next, you’ll need to make sure your newly installed PHP module gets loaded when Apache starts up. You do this by adding a file to /etc/php5/conf.d called oci8.ini (or anything, really) that contains the line:


extension=oci8.so

Restart Apache and you should be good to go.

Optional: Perl Integration

You’ll need to make yourself root. You might also need the gcc and make packages if you don’t already have them on your system. To avoid confusion with the ORACLE_HOME placeholder I’ve used so far, I spell out the full path below; the ORACLE_HOME that appears in these commands is literally the environment variable you need to set.


# The following two lines are needed by the installer so that it builds
# the module against the correct Oracle libraries. This is also useful
# if you need to build the module against different versions.
# Remember to replace {version} with the version you want to
# build against.
export ORACLE_HOME=/usr/lib/oracle/{version}/client
export LD_LIBRARY_PATH=${ORACLE_HOME}/lib
# The following line does the same thing as a normal CPAN install
# (e.g. cpan DBD::Oracle) except it does not require the tests to
# pass, because they won't pass (the database credentials the
# test uses are always invalid). Check the output of the test runner
# though, because you want the first test (loading the module) to
# pass, even if the rest fail.
perl -MCPAN -e "CPAN::Shell->force(qw(install DBD::Oracle));"

Conclusion

This should get you well on your way. If any of this looks confusing, it’s probably just because I wrote a lot. Setting up everything described in this post should take only 5-10 minutes if executed reasonably quickly.

Please leave comments if this was helpful to you or if you have places where I can explain myself better or am just flat out wrong. Good luck!

I ran across a post recently about using flags in your UI as a way for a user to select the language of the text (and possibly UI direction). While I agree with the post, I think the main point that drives me to use language names written in their native language deserves more emphasis: think about what a flag represents as opposed to what a language name represents.

A national flag represents a nation (country). A language name represents a language.

Take a look at YouTube or Google News. YouTube best illustrates this point. Take a minute and visit the homepage, and check out the country and language selections at the top. Both flags and language names are used! Google News takes a similar approach, except without the use of flags at all, instead opting to use the name of the country. This illustrates the point I think should be clearer. When I click on the flag of Germany in YouTube’s UI, I now get content primarily from Germany, but my UI is still in English. When I click on Deutsch, I get a German user interface. I can also have a German user interface with content from or relevant to France. I don’t think there’s anything necessarily wrong or confusing with switching the user’s interface to the language most likely to be used in the country that the content is primarily from, but that really depends on your users. Of course, so does everything, and if your users want flags to represent languages, that’s what they’ll get, and nobody on the internet is going to convince them otherwise :)

Summary of my personal uses:

Use the German flag when the content is German in origin or relevant to Germany
Use Deutsch when the user interface will be switched to that language

Consider also that not everyone is able to see flags, and if your site is accessible you will need to have the text explaining the flag anyway. Doing things as described above keeps things very simple, easy for users to understand, and makes clear the distinction between geographic content relevance and content language.

My goal is to make my home network as simple as possible, but not to use IPv6 exclusively. That said, wherever possible I have enabled and preferred IPv6 to shake out any issues and to see where things can be improved. I try to mimic a “realistic” dual stack environment because that is the most useful balance to me so that I can continue to get things done while automatically preferring IPv6 wherever possible.

Here is a simple ASCII diagram of my physical network:

[ LAN ] -- [ WAP/switch ] -- [ OpenBSD 4.3 ] -- { Internet }

The end result for those that don’t want to read the whole thing is an extremely stable and functional LAN that supports IPv6-enabled devices easily and automatically without denying anything to IPv4-only hosts. Windows Vista clients, for example, can simply plug in or associate with my WAP and have IPv6 connectivity with zero configuration. I realize this is relatively trivial (and hopefully my explanation is as trivial) but I feel like this is important: I commonly hear that IPv6 is very difficult to use or difficult to set up. While there are some things you need to know to set up an IPv6 network (as with IPv4), there is (or rather, should be) absolutely nothing you need to know as a client in a properly configured dual-stack environment. When a user decides to go to freebsd.org they should need to do a little sleuthing to figure out that most, if not all, of their network communication just took place over IPv6 ;)

I will go through the steps I took (neatly sidestepping the mistakes I made…) to set this up as well as posting any relevant configuration files I have. I’ll try to keep this segmented into easily visible sections, because I don’t like splitting this kind of thing up into multiple blog posts. Additionally, before I begin, I am going to use example IPv6 addresses within the RFC 3849 documentation-use-only IPv6 prefix, and example IPv4 addresses from TEST-NET described in RFC 3330 instead of my own. The IPv6 documentation prefix is 2001:db8::/32 and IPv4 TEST-NET is 192.0.2.0/24. I will use my private addresses (in 10.0.0.0/8 address space) as they actually are configured on my home network. What does this mean for you? Quoting from RFC 3849, “[a]ddresses within this block should not appear on the public Internet,” so don’t expect any of these addresses to work for you without altering them!
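If you adapt the examples that follow, it is easy to leave a documentation address in a live configuration by accident. This short Python sketch checks whether an address falls inside either of the two documentation prefixes named above; the function name is my own.

```python
import ipaddress

# The documentation-only prefixes mentioned above.
DOCUMENTATION_NETS = [
    ipaddress.ip_network("2001:db8::/32"),
    ipaddress.ip_network("192.0.2.0/24"),
]


def is_documentation_address(address):
    """Return True if the address is inside a documentation-only prefix."""
    ip = ipaddress.ip_address(address)
    return any(
        ip.version == net.version and ip in net for net in DOCUMENTATION_NETS
    )


if __name__ == "__main__":
    for candidate in ("2001:db8:1f04:4c9::1", "192.0.2.44", "10.0.0.1"):
        print(candidate, is_documentation_address(candidate))
```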

Step 1: Check configuration

Since this post is about OpenBSD, I’m using OpenBSD as an example. First of all, there are some sysctl options to be sure you have set to allow you to forward packets and be a well-behaved router.

/etc/sysctl.conf

# 1=Permit forwarding (routing) of IPv4 packets
net.inet.ip.forwarding=1

# 1=Permit forwarding (routing) of IPv6 packets
net.inet6.ip6.forwarding=1

# 1=Permit IPv6 autoconf (forwarding must be 0)
net.inet6.ip6.accept_rtadv=0

Next, make sure you have a pf.conf that you are content with, because in a minute you will (hopefully) become connected via IPv6. Here is a barebones pf.conf which is a literal copy and paste from the pf.conf currently on my OpenBSD box. If you’re thinking about copying and pasting this, please make sure it matches your security policies. I like my pf.conf to be more liberal than many people, so if you don’t understand what this does I would recommend man 5 pf.conf.

/etc/pf.conf

# Macros
ext_if="rl0"
int_if="xl0"

# Tables

# Options
set block-policy return
set skip on lo

# Normalization
scrub in
scrub out

# Queuing

# Translation
nat on $ext_if inet from ! ($ext_if) -> ($ext_if)

# Filtering
pass in
pass out

Make sure pf is enabled:

pfctl -e

and that your ruleset is loaded:

pfctl -f /etc/pf.conf

where /etc/pf.conf is the location of your pf.conf (this is default).

Step 2: Get connected

If you’re in the USA, statistically you probably do not have native IPv6 connectivity. This is a little unfortunate, but thankfully there are organizations who are willing to allow us to use their services to tunnel IPv6 over the existing IPv4 network to get to their point of presence, and from there the traffic can travel over the IPv6 internet. While this is not ideal, this will have to do for most of us. If you have native IPv6 connectivity, you’re probably laughing at me :)

I used Hurricane Electric as my tunnel broker. Once I was signed up, I asked for a /64. One nice thing about Hurricane Electric (and this might be true for other tunnel brokers as well, I have no idea) is that they provide customized configurations for nearly any operating system.

I’ll explain more about this later, but I’d like to show the information Hurricane Electric gave me here (using the example prefixes instead of mine) so you can tell how to apply this to your own tunnel (and I’m sure other tunnel brokers do it similarly). Hopefully the explanation afterward will give these values some meaning if they don’t already make sense to you:

Server IPv4 address:    192.0.2.74
Server IPv6 address:    2001:db8:1f04:4c9::1/64
Client IPv4 address:    192.0.2.44
Client IPv6 address:    2001:db8:1f04:4c9::2/64
Routed /64:  	        2001:db8:1f05:4c9::/64

Server marks Hurricane Electric’s IPv4 and IPv6 tunnel endpoints, and Client marks my IPv4 and IPv6 tunnel endpoints.
Routed /64 is the subnet I am assigned. Note that it differs slightly (1f05 instead of 1f04) from the tunnel endpoints.

Since I’m setting up my OpenBSD 4.3 box as one of the endpoints of the tunnel, I selected OpenBSD and had them generate a configuration based on the assigned addresses. Here was the configuration generated for me (note the line continuation):

ifconfig gif0 tunnel 192.0.2.44 192.0.2.74
ifconfig gif0 inet6 alias 2001:db8:1f04:4c9::2 \
                          2001:db8:1f04:4c9::1 prefixlen 128
route -n add -inet6 default 2001:db8:1f04:4c9::1

You could happily copy and paste this, but chances are if you’ve read this far you’re going to want to know what exactly this is doing. I know I did, as I don’t like to simply copy and paste configuration from other people without knowing what it does first. The following is kind of verbose, but it might help some people to understand better what the above commands mean.

The first line says “ifconfig, I want to operate on a generic tunneling interface (gif0, man 4 gif for more information) to create a tunnel from the IPv4 address assigned to me by my ISP (192.0.2.44) to another IPv4 address somewhere else on the internet (192.0.2.74).” The tunnel concept is no more complex than thinking of a virtual tube that connects two points. While the internet may route the physical packets between the two endpoints 30 hops around the world, as far as the logic is concerned, the two endpoints are directly connected. You can think of this tunnel as the underlying “road” or transport on top of which our IPv6 packets will travel to get to a place where they can be routed natively, as we (I) do not yet have native IPv6 connectivity to the internet.

The next line says “ifconfig, I want to operate on gif0 again, this time specifying things about IPv6 (inet6). I would like to create a new address for this interface as opposed to altering any existing addresses (alias), and I would like this address to be 2001:db8:1f04:4c9::2. Since we’re talking about a tunnel, I want the other end (Hurricane Electric’s end) of my tunnel to be 2001:db8:1f04:4c9::1. Finally, since I’m dealing with just a single host address on a point-to-point link, I will use prefixlen 128.”

Dead simple so far, right? The last line says “I’d like to adjust my routing tables (route -n, don’t worry about the -n for now but you can read the man page for route(8) if you’re curious) to add an IPv6 (-inet6) route which will be my default route (if my machine doesn’t know exactly where to send packets, they go here), and I want them to head toward the “far end” of my tunnel which is 2001:db8:1f04:4c9::1, where hopefully they’ll eventually be routed and arrive at their destination.” Note here that we specify the “far end” of the tunnel. We want our packets to go through the tunnel to the other end where they’ll be picked up by Hurricane Electric, not simply go to our end of the tunnel and stop short of their destination.

gif(4) is a neat little pseudo-interface that can encapsulate any combination of IPv6 or IPv4 packets based on how it is configured. Now that we’ve set up a gif(4) interface (gif0 above), it will see that, since the tunnel is set up via IPv4 (the first line above), IPv6 packets traveling through it need to be encapsulated inside IPv4 packets so they can be routed through the IPv4 internet. Once they reach Hurricane Electric at the other end, Hurricane Electric’s endpoint is set up to unpack (decapsulate) the packets and route them over the native IPv6 internet to their destination. The reverse happens exactly as you’d expect: IPv6 packets encapsulated in IPv4 packets coming in from the Hurricane Electric tunnel to our gif(4) interface get decapsulated and shuffled across our LAN to their destination as IPv6 packets.
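To make the encapsulation idea less abstract, here is a simplified Python sketch of what wrapping an IPv6 packet in an IPv4 packet looks like on the wire. It is an illustration only, not how gif(4) is implemented: the kernel does this for you, and the sketch ignores fragmentation, options, and TTL handling. It relies on IP protocol number 41, which identifies an encapsulated IPv6 payload in the IPv4 header; the function names are my own.

```python
import ipaddress
import struct

IPPROTO_IPV6 = 41  # IPv4 protocol number for an encapsulated IPv6 packet


def checksum(header):
    """Standard IPv4 header checksum (ones' complement of the 16-bit sum)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(ipv6_packet, src, dst):
    """Prefix an IPv6 packet with a minimal 20-byte IPv4 header."""
    src_bytes = ipaddress.IPv4Address(src).packed
    dst_bytes = ipaddress.IPv4Address(dst).packed

    def header(csum):
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45,                   # version 4, header length 5 words
            0,                      # type of service
            20 + len(ipv6_packet),  # total length
            0,                      # identification
            0,                      # flags and fragment offset
            64,                     # time to live
            IPPROTO_IPV6,
            csum,
            src_bytes,
            dst_bytes,
        )

    return header(checksum(header(0))) + ipv6_packet


def decapsulate(ipv4_packet):
    """Strip the IPv4 header and return the inner IPv6 packet."""
    if ipv4_packet[9] != IPPROTO_IPV6:
        raise ValueError("not an IPv6-in-IPv4 packet")
    header_length = (ipv4_packet[0] & 0x0F) * 4
    return ipv4_packet[header_length:]
```

With the tunnel endpoints from this post, encapsulate(packet, "192.0.2.44", "192.0.2.74") is the outbound direction, and decapsulate is what the far end does before routing the inner packet natively.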

At this point, you’ll want to assign your own IPv6 addresses to your interfaces so that you can access them via IPv6. Hurricane Electric assigned me the 2001:db8:1f05:4c9::/64 subnet as shown above, and you must assign these addresses out of the available allocated space. Working from our example, then, these are some sample valid addresses:

2001:db8:1f05:4c9::10
2001:db8:1f05:4c9::dead:beef
2001:db8:1f05:4c9::420

You can use ifconfig inet6 alias for this. For completeness, an example of assigning an address to one of your interfaces might be:

ifconfig xl0 inet6 alias 2001:db8:1f05:4c9::10 prefixlen 64
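A common slip is to assign LAN addresses out of the tunnel's /64 (1f04) instead of the routed /64 (1f05). The following Python sketch checks candidate addresses against the routed prefix from the example above; the function name is hypothetical.

```python
import ipaddress

ROUTED_PREFIX = ipaddress.ip_network("2001:db8:1f05:4c9::/64")
TUNNEL_PREFIX = ipaddress.ip_network("2001:db8:1f04:4c9::/64")


def usable_on_lan(address):
    """Return True if the address comes out of the routed /64."""
    return ipaddress.ip_address(address) in ROUTED_PREFIX


if __name__ == "__main__":
    for candidate in (
        "2001:db8:1f05:4c9::10",
        "2001:db8:1f05:4c9::dead:beef",
        "2001:db8:1f04:4c9::2",  # tunnel endpoint, not a LAN address
    ):
        print(candidate, usable_on_lan(candidate))
```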

Step 3: Advertise

Now that we’re ready to go and you’ve verified that you can do something like:

ping6 ipv6.google.com

and you get replies, you can move on to telling other IPv6 capable hosts on your network about your connectivity, and how they can get some. OpenBSD ships with rtadvd (router advertisement daemon) which we will use for exactly this purpose.

Again, the configuration file first:

/etc/rtadvd.conf

xl0:\
        :addr="2001:db8:1f05:4c9::":prefixlen#64:raflags#64:

It might look like noise at first, so I’ll break it down. man 5 rtadvd.conf will be useful for more details.

Each field in this configuration file is separated by a : character. The first line starts off with an interface that rtadvd is going to advertise on. You may notice that xl0 is my internal interface from my pf.conf. This is because I want rtadvd to advertise the information that follows on my LAN. The backslash and whitespace that follows is simply to make it easy to track things in a large file; they are completely optional.

The next section is addr="2001:db8:1f05:4c9::". This gives the address prefix to advertise to hosts. With IPv6, you advertise a prefix of some length and the hosts then fill in the rest themselves. Therefore I am advertising the prefix for the network you saw above.

The next section is prefixlen#64. You may notice that string values are distinguished from their corresponding identifiers with = and numeric values are distinguished with #. This prefixlen section tells hosts how long the prefix that I’m advertising is. As the address 2001:db8:1f05:4c9:: expands to 2001:db8:1f05:4c9:0000:0000:0000:0000, I have to say which part of that I’m advertising, and which part is left up to the host to choose for itself. This says I’m advertising the first 64 bits of the address (The first 4 colon-delimited sections), leaving the host receiving this advertisement to deduce that it can pick the other 64 bits for itself.

The last section is perhaps the least well-understood. This field raflags#64 stands for router advertisement flags, and they carry, you guessed it, flags about the nature of the router advertisement. There are two flags we are interested in. They are documented in rtadvd.conf with the following:

raflags
        (num) Flags field in router advertisement message header.  Bit 7
        (0x80) means Managed address configuration flag bit, and Bit 6
        (0x40) means Other stateful configuration flag bit.  The default
        value is 0.

I will simplify this slightly to make it as easy as possible to understand at first. If you want the details or the authoritative source, refer to RFC 4861, page 19.

The M flag says that the host will need to get addresses via DHCPv6. In other words, it tells the host that it shouldn’t pick its own identifier (remember those last 64 bits above?), because the network policy is to ask a central location (Managed, see?) for an address first. This will likely trigger DHCPv6 in hosts that support it.

The O flag says that the host may obtain Other information from a central location as appropriate, also using DHCPv6. In other words, if you’d like to make available DNS servers, time servers, etc via DHCP, you’ll want this flag turned on so that hosts ask you about them. Note that this is separate from the address configuration. You may have (and I do indeed do it this way) the O flag set while the M flag is not set, indicating that hosts can pick their own addresses but if they want other neat information they should ask. Note that this is a little more flexible than DHCP available for IPv4, and allows for better separation of network management if you don’t want the “all-or-nothing” approach that DHCP for IPv4 offers.

The value is 64 for raflags, which is the decimal value (and I personally think the man page is confusing in this regard) of the hexadecimal value 0x40, meaning that I have the O flag set, but the M flag remains unset. This is because, in order for users to feel like they have connectivity out of the box, they will need DNS services, and I will provide them with a DNS server address to use (via DHCPv6) as I will show in a moment, so the host needs to know that it can ask for it.
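If the decimal-versus-hexadecimal business is still confusing, here is a minimal Python sketch of the bit arithmetic. The two masks come straight from the man page excerpt above; the function names are my own invention for illustration and are not part of rtadvd.

```python
# Bit masks taken from the rtadvd.conf(5) excerpt quoted above.
RA_FLAG_MANAGED = 0x80  # M flag: hosts should get addresses via DHCPv6
RA_FLAG_OTHER = 0x40    # O flag: hosts may get other settings via DHCPv6


def decode_raflags(value):
    """Report which of the M and O flags are set in a raflags number."""
    return {
        "managed": bool(value & RA_FLAG_MANAGED),
        "other": bool(value & RA_FLAG_OTHER),
    }


def encode_raflags(managed=False, other=False):
    """Build the decimal number to write after 'raflags#'."""
    value = 0
    if managed:
        value |= RA_FLAG_MANAGED
    if other:
        value |= RA_FLAG_OTHER
    return value
```

With these helpers, decode_raflags(64) reports only the O flag as set, and encode_raflags(managed=True, other=True) gives 192, which is what you would write if you wanted both flags.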

Once you’ve got everything set up like you want it, start the server with

/usr/sbin/rtadvd xl0

where xl0 is the interface you want rtadvd to operate on. xl0 is my internal interface.
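Before trusting the advertisement, it can help to double-check the prefix arithmetic from the rtadvd.conf walkthrough. The following is a small sketch using Python's standard ipaddress module; the helper names are hypothetical, and the addresses are the example ones used throughout this post.

```python
import ipaddress


def split_prefix(prefix, prefixlen):
    """Return the fully expanded prefix and the number of host-chosen bits."""
    network = ipaddress.IPv6Network(f"{prefix}/{prefixlen}")
    return network.network_address.exploded, 128 - prefixlen


def in_advertised_prefix(address, prefix, prefixlen):
    """Check whether an address falls inside the advertised prefix."""
    network = ipaddress.IPv6Network(f"{prefix}/{prefixlen}")
    return ipaddress.IPv6Address(address) in network
```

For the configuration above, split_prefix("2001:db8:1f05:4c9::", 64) shows the expanded address and that 64 bits are left for the host to fill in. Note that Python pads every group to four hex digits, so the expansion looks slightly different from the one I wrote by hand.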

Step 4: DNS

I’d like to be able to resolve DNS over IPv6 for machines that support it, and it required a little tweaking on my part to get it working like I wanted it to.

First, I ran rndc-confgen to generate a key to use to communicate with the running DNS server, and did the appropriate things with it. Take a look at the man page for rndc-confgen; I won’t go into the details, but you’ll need to substitute your own secret below (for YOUR_OWN_SECRET_HERE) if you choose to use my configuration file.

/var/named/etc/named.conf (partial)

key "rndc-key" {
        algorithm hmac-md5;
        secret "YOUR_OWN_SECRET_HERE";
};

acl clients {
        10.1.1.0/24;
        2001:db8:1f05:4c9::/64;
        127.0.0.0/8;
        ::1/128;
};

controls {
        inet 127.0.0.1 port 953
                allow { 127.0.0.1; } keys { "rndc-key"; };
};

options {
        listen-on    { any; };
        listen-on-v6 { any; };

        empty-zones-enable yes;

        allow-recursion { clients; };
};

logging {
        category lame-servers { null; };
};

This tells BIND to listen on all of my interfaces but only recursively resolve queries from my local IPv4 and IPv6 networks, which I’ve gone over above. I’ve also made some other changes to the default shipped configuration, such as allowing version queries. If you’re unhappy with my security policies, make sure you modify this file to match yours before putting it into production. With this in place, I simply started the server by executing

/usr/sbin/named

Check /var/log/daemon to make sure everything started properly.
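To reason about which clients the clients ACL in named.conf will let recurse, here is a rough Python sketch that mimics the matching for plain prefixes only. It is not how BIND implements address match lists (those also support negation, keys, and nested ACLs), and the names CLIENTS_ACL and recursion_allowed are mine, purely for illustration.

```python
import ipaddress

# The same four entries as the "clients" ACL in named.conf above.
CLIENTS_ACL = [
    "10.1.1.0/24",
    "2001:db8:1f05:4c9::/64",
    "127.0.0.0/8",
    "::1/128",
]


def recursion_allowed(client, acl=CLIENTS_ACL):
    """Return True if the client address falls inside any ACL prefix."""
    addr = ipaddress.ip_address(client)
    for entry in acl:
        net = ipaddress.ip_network(entry)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False
```

A LAN host such as 10.1.1.25 or 2001:db8:1f05:4c9::abcd passes, while an outside address such as 192.0.2.1 does not.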

Step 5: DHCPv6 extras

This part isn’t quite as standard on OpenBSD, yet. I decided to go with WIDE-DHCPv6 for no particular technical reason, but it is simple to build and configure.

Once I unpacked the software, I changed into its directory and did

./configure && make && sudo make install

which builds it and installs the software to /usr/local. If you need/want it somewhere else, you can use the standard configure options to alter the prefixes and some other things. My OpenBSD box is a Pentium III running at 1GHz, and it takes a very small amount of time (2 minutes, if that) to configure, build, and install.

Even easier than installing this software is configuring it (in my case at least). I created a file called /usr/local/etc/dhcp6s.conf to configure the server, and the file looks like this:

/usr/local/etc/dhcp6s.conf

option domain-name-servers 2001:db8:1f05:4c9::10;

which simply tells the DHCPv6 server to hand out the IPv6 address 2001:db8:1f05:4c9::10 as the primary IPv6-accessible DNS server. Windows Vista clients, for example, if given one or more IPv6 DNS servers, prefer the IPv6 DNS servers over the IPv4 DNS servers.

You can now start the daemon with

/usr/local/sbin/dhcp6s xl0

substituting the interface you would like it to listen on for xl0 (my internal interface), and your own path to the server for the one above if you installed it to a different location.

Step 6: Finalize

Now that we’ve set it all up, let’s make our configuration persistent across reboots.

I used /etc/rc.local to start WIDE-DHCPv6’s dhcp6s on boot.

/etc/rc.local

echo -n 'starting local daemons:'

# Add your local startup actions here.

echo -n ' dhcp6s'

/usr/local/sbin/dhcp6s xl0

echo '.'

My /etc/rc.conf.local looks like this (you may not need all of these):

/etc/rc.conf.local

dhcpd_flags=""
named_flags=""
ntpd_flags="-s"
rtadvd_flags="xl0"

pf=YES

and I have three hostname.if files:

/etc/hostname.gif0

tunnel 192.0.2.44 192.0.2.74
inet6 alias 2001:db8:1f04:4c9::2 128
dest 2001:db8:1f04:4c9::1

!route -n add -inet6 default 2001:db8:1f04:4c9::1

/etc/hostname.xl0

inet 10.1.1.1 255.255.255.0 10.1.1.255
inet6 alias 2001:db8:1f05:4c9::10 64

/etc/hostname.rl0

dhcp NONE NONE NONE

You may need to change some things. For example, I obtain my rl0 IPv4 address via DHCP, so the first line of my hostname.rl0 contains the right incantation to obtain the address that way.

It was also pointed out to me that, in a setup like mine, assigning an IPv6 address to this external interface may cause routing confusion (and in fact did in my case!).

Conclusion

I hope this at least gives you a head start when it comes to setting up a home network on OpenBSD. This isn’t necessarily intended as a guide, more as a way for me to document my thought process as I set up my network. That said, I have written it with readers looking for ideas of their own in mind, so I would appreciate comments about places where you think this can be improved. Chances are I’ve made a mistake in my thinking or have given out bad information, and I’d appreciate corrections to that effect even more.

By now, many people are aware that the RIAA has been going after people (specifically university students) they believe are violating the copyrights of their member companies. Other people have written articles on specifically how the RIAA (or more realistically, companies the RIAA hire) do this, so if you haven’t done so, I’d recommend reading about it.

Many universities have policies, whether written or unwritten, that dictate some sort of action against students when emails are received requesting that infringing content be removed from the computers serving that content.

In most cases, universities do a variation of the following:

  1. Look up the student’s information based on the IP address(es) listed in the email
  2. Disable the student’s internet access
  3. Follow up with the student in some way (require that some document be signed before internet access is restored, ask that they meet with a university employee or judicial officer, etc.)
  4. Restore the student’s internet access

One of the many issues that I have with this process is that in almost all cases the email is never verified, nor is it verifiable. Most of the time there is a persona certificate sent along with this email, but the email itself is not digitally signed. Two different emails sent to two separate institutions contained such persona certificates that hashed to the same value. Therefore, if somebody were to spoof such an email, attaching that certificate to the email would make that email as authentic as any emails supposedly sent from the RIAA.

The problem here is that institutions are taking action against students without even attempting to verify the authenticity of the emails they receive. Universities claim that they want to avoid potential problems, and so they are complying with the text of these emails. What happens, then, when students realize their university is taking action against them from what is essentially an anonymous threat? What happens when spoofed emails result in action taken against students?

Who’s to say this isn’t happening now?

bind(2) failed on Linux?

December 10, 2007

I’ll start this off with an example. How many of you running a Linux variant have seen the following in your logs?

sshd[6238]: Server listening on :: port 22.
sshd[6238]: error: Bind to port 22 on 0.0.0.0

When I first discovered this I had a couple theories about why this might be happening, but all I really knew for sure was that I couldn’t reproduce the problem on an OpenBSD or FreeBSD system. Soon enough, I came across numerous blog and list posts with sledgehammer answers such as “disable IPv6 support in the kernel!” I didn’t know what was going on at the time, but I certainly wasn’t satisfied by this answer. I went looking for behavior specific to Linux and came across this excerpt from Linux’s ipv6(7):

IPv4 connections can be handled with the v6 API by using the v4-mapped-on-v6 address type; thus a program only needs only to support this API type to support both protocols. This is handled transparently by the address handling functions in libc.

IPv4 and IPv6 share the local port space. When you get an IPv4 connection or packet to a IPv6 socket its source address will be mapped to v6 and it will be mapped to v6.

That’s more like the information I’m looking for. From what I’ve gathered, this v4-mapped addressing behavior is supposed to make it simpler for applications to be configured during the IPv4 to IPv6 transition (you only need to specify one Listen directive in Apache, for example).

I personally dislike the idea of the v4-mapped-on-v6 address type. Put simply, when I create an INET6 socket, I expect that I’m going to do communication over that protocol. I wouldn’t expect to do IPX communication on an INET socket, so why should I expect to do INET communication on an INET6 socket? Furthermore, if I have an application like sshd and I give it the -6 parameter, my expectation is that it will no longer use IPv4 for communication. It will, however, do IPv4 communication via v4-mapped addresses on certain systems, and this may cause problems for administrators who believe their system is only listening for certain types of communication when in reality it is listening for others as well.

I believe if you’re going to write software that listens for both IPv4 and IPv6 communication, you should be writing AF-independent software, and not have to rely on something that may make your life more difficult in the future. You’re going to have to touch the code anyway to update it (if you even have to), so you might as well get in there and do it right. The downside of the v4-mapped behavior being the default is that those who wish to write portable AF-independent code (at least using the method linked above; if there is a better way I’d like to know about it) will need to explicitly turn on the socket option IPV6_V6ONLY if the system supports it.
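To make the idea concrete, here is a minimal sketch of an AF-independent listener in Python rather than C. It asks getaddrinfo for every passive address, opens one socket per address family, and turns on IPV6_V6ONLY for the INET6 ones so that the separate INET socket can bind to the same port. The function name open_listeners is hypothetical, and this illustrates the general technique, not the OpenSSH code.

```python
import socket


def open_listeners(port, host=None):
    """Open one listening socket per address family returned by getaddrinfo."""
    listeners = []
    infos = socket.getaddrinfo(
        host, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_PASSIVE
    )
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError:
            continue  # this address family is not supported on this host
        try:
            if family == socket.AF_INET6:
                # Keep this socket IPv6-only so it never accepts
                # v4-mapped connections and never claims the IPv4 port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(sockaddr)
            sock.listen(5)
        except OSError:
            sock.close()
            continue
        listeners.append(sock)
    return listeners
```

The design choice is the same one argued for above: each socket speaks exactly one protocol, so what the daemon listens on matches what the administrator asked for.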

If you don’t want this v4-mapped behavior at all in Linux, the following should turn it off globally, but will reset on the next reboot unless you make it permanent (see your distribution documentation for details):

sysctl -w net.ipv6.bindv6only=1
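To see which default a given Linux host is currently using, the setting can be read back from /proc. The sketch below assumes the usual /proc/sys/net/ipv6/bindv6only path; the function name is mine, and it returns None on systems where that file does not exist.

```python
from pathlib import Path

# Where Linux exposes the net.ipv6.bindv6only sysctl.
BINDV6ONLY_PATH = "/proc/sys/net/ipv6/bindv6only"


def v4_mapped_enabled_by_default(path=BINDV6ONLY_PATH):
    """True if INET6 sockets accept v4-mapped traffic by default,
    False if bindv6only is set, None if the setting cannot be read."""
    try:
        text = Path(path).read_text()
    except OSError:
        return None
    return text.strip() == "0"
```

Keep in mind this is only the system-wide default; an application that sets IPV6_V6ONLY on its own sockets overrides it either way.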

In any case, I prefer the default behavior of the BSDs, which, at the time of this writing, disable v4-mapped addressing by default. You can always turn it on using the inverse of the socket option method described above.

I’ve submitted a patch to the Portable OpenSSH project. I fully expect to get flamed to dust because of something I’ve overlooked, but as far as I can tell setting the socket option in OpenSSH as the patch does should bring Portable OpenSSH on systems using v4-mapped addresses more in line with the behavior of OpenSSH on OpenBSD.

Update: The patch has been accepted by the Portable OpenSSH project.

At work I manage FreeBSD servers mostly running web applications. One of the databases we need to be able to talk to is an Oracle database, so all I need to be able to do is connect, perform queries, and retrieve results. We need to be able to do this from PHP.

On FreeBSD, one obvious option is to use oracle8-client, which is a binary port of Oracle’s client libraries for Linux. The PHP OCI8 extension port has this as a dependency, so with no configuration aside from building PHP with OCI8 support, you now have oracle8-client on your system.

Everything was incredibly simple to install, except for one problem. Now, with the OCI8 extension enabled in php.ini, the following occurs:


user@host:~> php hello.php
Hello world!
Segmentation fault
user@host:~>

The contents of hello.php don’t matter, but now all PHP scripts run from the command line always segfault on exit. This includes pear and other CLI utilities written in PHP. Using gdb I was able to trace unloading of the OCI8 library to a call in Zend/zend_API.c (DL_UNLOAD) which calls dlclose(3) on a FreeBSD system. After doing some reading, I can only assume that the Oracle library is doing something incorrectly such as registering a function to be called at exit via atexit(3). In that case, we would call dlclose(3) after the function(s) are registered, but then when we exit and those registered functions get called, we get a segmentation fault as we’ve unmapped the memory they live in! While that may not be precisely what is happening in this case, as far as I can tell I’m dealing with a bug in the Oracle library.

We were able to run like this (in production too) for quite some time before I found a solution, because everything in the script could run before the segfault occurred. However, the return code was always nonzero (so scripts that properly checked return codes from CLI PHP scripts had to be modified), the segfaults spammed the kernel logs, and the whole thing generally annoyed me and our developers. OCI8 connections made through httpd and mod_php were unaffected, as the PHP module and extensions get loaded into memory but are not unloaded during the normal course of execution.
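For anyone wondering what "properly checking return codes" looks like when the child dies from a signal, here is a small Python sketch. It is not the modification we actually made to our scripts; it only shows how a caller on a POSIX system can tell a crash at exit apart from an ordinary failure. The helper names are hypothetical.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short description.

    On POSIX, subprocess reports death by signal N as -N.
    """
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exit status %d" % returncode


def run_and_describe(argv):
    """Run a command and return (returncode, description)."""
    completed = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return completed.returncode, describe_exit(completed.returncode)
```

A wrapper like this could log "killed by SIGSEGV" separately from a script that genuinely failed, rather than treating every nonzero status the same way.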

My patch is not elegant but works fine:


--- Zend/zend_API.c.orig Wed Sep 12 12:40:21 2007
+++ Zend/zend_API.c Wed Sep 12 12:43:45 2007
@@ -1912,9 +1912,11 @@
 
 #if HAVE_LIBDL || defined(HAVE_MACH_O_DYLD_H)
 #if !(defined(NETWARE) && defined(APACHE_1_BUILD))
+/*
 	if (module->handle) {
 		DL_UNLOAD(module->handle);
 	}
+*/
 #endif
 #endif
 }

which essentially stops PHP from calling dlclose(3) on modules when it is exiting. As far as I have tested, this works just fine because PHP is exiting anyway, but I would not use this patch if dlclose(3) were called during normal runtime without exiting immediately afterwards. To apply this patch the way I did, save the above text in a file called patch-Zend::zend_API.c, place it in /usr/ports/lang/php5/files, and rebuild PHP by executing portupgrade -f php5. Ports will automatically patch zend_API.c appropriately, and you should be good to go. The system I tested all of this on:

FreeBSD 6.2-RELEASE-p6
PHP 5.2.3
oracle8-client-0.1.1_1

There are some other instructions out there but I have found no need for the instant client, nor can I see how the instant client is even used by PHP in the example given there. That means you don’t need the Linux compatibility module loaded into your FreeBSD kernel, a Linux base system, or any of Oracle’s instant client packages (which require registration to download anyway).

At work, most of us use procmail to conveniently filter our mail into folders to keep a semblance of organization. I don’t know how it took me so long to find this, because I know I’ve tried to search for it before. Lo and behold, it’s located (maybe newly?) on the Tips and Tricks page. The great feature I’m talking about is the simple ability for Thunderbird to check all of my folders for new mail automatically, rather than only my Inbox. I know about folder subscriptions, but the problem with those is that I have several profiles, I’d have to set them up in each one, and I’d have to redo them every time I wiped my profile and started fresh.

Now that I’ve figured out how to do this, I’m pretty set on Thunderbird as my email client of choice.