Monday, June 4, 2007

Where is mkmf?

I was recently trying to install Ruby's SQLite adapter to make my development easier (e.g. no fussing with SQL servers). I did a quick Google search with no obvious results so I figured it couldn't be that hard and hopped over to rubyforge.org to get the gem. I found sqlite3-ruby, which fit the bill, and started installing it.
sudo gem install sqlite3-ruby
Select which gem to install for your platform (i486-linux)
1. sqlite3-ruby 1.2.1 (mswin32)
2. sqlite3-ruby 1.2.1 (ruby)
3. sqlite3-ruby 1.2.0 (mswin32)
4. sqlite3-ruby 1.2.0 (ruby)
5. Skip this gem
6. Cancel installation
> 2
Building native extensions. This could take a while...
ERROR: While executing gem ... (Gem::Installer::ExtensionBuildError)
ERROR: Failed to build gem native extension.

ruby extconf.rb install sqlite3-ruby
extconf.rb:1:in `require': no such file to load -- mkmf (LoadError)
from extconf.rb:1

What the frack? A little experimenting and searching led me to this post, which explained that mkmf is part of the ruby1.8-dev package. A quick apt-get install and I was good to go.
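
For anyone who wants to check before the gem build blows up, here is a minimal sketch of a pre-flight test. The function name has_mkmf is made up for illustration, and ruby1.8-dev is simply the package named above; other Ruby versions and distributions will call it something else.

```shell
#!/bin/sh

# Return success if the given Ruby interpreter (default: ruby) can load mkmf.
# has_mkmf is a hypothetical helper name, not part of Ruby or RubyGems.
has_mkmf() {
    "${1:-ruby}" -e 'require "mkmf"' >/dev/null 2>&1
}

if has_mkmf; then
    echo "mkmf is available; native gem extensions should be able to build"
else
    # ruby1.8-dev is the package from this post; adjust for your system.
    echo "mkmf not found; try: sudo apt-get install ruby1.8-dev"
fi
```

If the check passes, the original sudo gem install sqlite3-ruby should get past the extconf.rb step.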

Friday, May 25, 2007

Creating Solaris Link Aggregations

Now, I'm no expert when it comes to Solaris; in fact I know very little, so take what I say with a few barrels of salt.

I was recently setting up a new Solaris box and one thing we knew we wanted to do was link aggregation. For a file server this is pretty handy when some of the heavy-access nodes are on the same switch.

I had the box up and DHCP gave the first interface an address. I used this to log in and give an IP address to another link that we wouldn't be using in the aggregation. I then logged in on that spare link, took down the first link, and unplumbed it.

# ifconfig e1000g1 down unplumb

Then I created the aggregation of three links with a key of 1.

# dladm create-aggr -d e1000g1 -d e1000g2 -d e1000g3 1
dladm: create operation failed: Device busy (invalid interface name)

Well, that's not quite what I wanted. After searching around a little and fiddling with dladm commands I was about to reboot and see if that would help when I struck on an idea.

# ps -e | grep dh
42 ? 0:00 dhcpagen
# kill 42
# ps -e | grep dh
# dladm create-aggr -d e1000g1 -d e1000g2 -d e1000g3 1
# ifconfig aggr1 plumb up
# dladm show-aggr
key: 1 (0x0001) policy: L4 address: 0:14:4f:21:11:d1 (auto)
device address speed duplex link state
e1000g1 0:14:4f:21:11:d1 1000 Mbps full up attached
e1000g2 0:14:4f:21:11:d2 1000 Mbps full up attached
e1000g3 0:14:4f:21:11:d3 1000 Mbps full up attached
# ifconfig aggr1
aggr1: flags=1000843 mtu 1500 index 12
inet 0.0.0.0 netmask ff000000
ether 0:14:4f:21:11:d1


Ah-ha! The DHCP agent was keeping the interface busy. Lesson of the day: if you configure an interface with DHCP once, don't forget to stop the DHCP agent before you try to do anything else with that interface.
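
The manual ps and grep check can be turned into a small guard. This is only a sketch: the function names are invented for illustration, and it assumes ps -e output shaped like the listing above, with the PID in the first column and the (truncated) command name in the last.

```shell
#!/bin/sh

# Read `ps -e` style output on stdin and print the PID of every process
# whose command name starts with "dhcpagen" (ps truncates the name).
# Both function names here are hypothetical helpers.
dhcpagent_pids() {
    awk '$NF ~ /^dhcpagen/ { print $1 }'
}

# Succeed only when no DHCP agent appears in the process listing on stdin.
no_dhcpagent() {
    [ -z "$(dhcpagent_pids)" ]
}

# Example use, not run here:
#   ps -e | no_dhcpagent && dladm create-aggr -d e1000g1 -d e1000g2 -d e1000g3 1
```

The guard only reports; it leaves the decision to kill the agent to you.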

Thursday, May 24, 2007

Simple Port Knocking

A while ago I had the idea of creating a stealth-host on our network to do our network monitoring and perhaps host syslogs. It was a low priority project, and one that I'm interested in more for the exploration than the practicality. A great example of this is my desire to hide the SSH port behind a port-knocking system. This week I dredged up an old desktop, installed Debian and went to work.

A few months ago I came across a very brief how-to on using IPTables to implement port knocking. After I finished the project I found a second how-to which explains the system in much greater detail.

I wrote up the original script with a few comments and now I source it from my main IPTables script. Warning: If you configure firewall scripts remotely be prepared for your own mistakes. When working with firewall scripts remotely it is a good idea to put in a cron job to flush the rules every 15 minutes so you don't lock yourself out. You can use iptables -F to flush all the rules. Note: There are some lines that are being auto-wrapped due to the narrowness of Blogger. Take care to edit them out if you cut and paste.


# Path to iptables
IPT=/sbin/iptables

# Port on which ssh listens
sshport=678

# Ports to knock to in order from 1 to 3
port1=14159
port2=26535
port3=8979

# Timeout
timeout=10

# Create new chains because we can't do two recent commands in one rule.
$IPT -N knock1
$IPT -N knock2
$IPT -N knock3

# Populate our custom chains.
# These chains basically just remove the host from the last table
# and add it to the next. This ensures each host must knock in
# order.
# Move from portknock1 to portknock2
$IPT -A knock1 -m recent --name portknock1 --remove
$IPT -A knock1 -m recent --name portknock2 --set -j DROP

# Move from portknock2 to portknock3
$IPT -A knock2 -m recent --name portknock2 --remove
$IPT -A knock2 -m recent --name portknock3 --set -j DROP

# Accept the packet and remove the entry from portknock3
$IPT -A knock3 -m recent --name portknock3 --remove -j ACCEPT

# These are the actual filter rules.
# This is the initial rule. Adds the host to portknock1
$IPT -A INPUT -m state --state NEW -p tcp --dport $port1 -m recent --name portknock1 --set -j DROP

# These two rules move from portknock1 to portknock2 and from
# portknock2 to portknock3 respectively when their associated
# port is knocked.
$IPT -A INPUT -m state --state NEW -p tcp --dport $port2 -m recent --rcheck --name portknock1 --seconds $timeout -j knock1
$IPT -A INPUT -m state --state NEW -p tcp --dport $port3 -m recent --rcheck --name portknock2 --seconds $timeout -j knock2

# This is the final check. When the packet is received to $sshport
# it must be in portknock3 to be accepted.
$IPT -A INPUT -m state --state NEW -p tcp --dport $sshport -m recent --rcheck --name portknock3 --seconds $timeout -j knock3

# These are anti-scanning rules: a hit on a port adjacent to a knock
# port removes the host from one of the lists. The best scanners
# (e.g. nmap) scan in a random order, though.
# $(( )) is the portable arithmetic form; $[ ] is an obsolete bashism.
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port1 - 1)) -m recent --name portknock3 --remove -j DROP
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port1 + 1)) -m recent --name portknock3 --remove -j DROP
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port2 - 1)) -m recent --name portknock2 --remove -j DROP
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port2 + 1)) -m recent --name portknock2 --remove -j DROP
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port3 - 1)) -m recent --name portknock1 --remove -j DROP
$IPT -A INPUT -m state --state NEW -m tcp -p tcp --dport $((port3 + 1)) -m recent --name portknock1 --remove -j DROP



Once this is set up you will have to send a SYN packet (the first part of a TCP connection) to $port1, $port2, and $port3, in that order and within $timeout seconds of each other. This will then open up $sshport for $timeout seconds (10 with the settings above). It's your responsibility to have a daemon listening on that port.

Here's a quick script for connecting to an SSH port protected by the previous script.


#!/bin/sh

sshhost=myhost
sshport=678
port1=14159
port2=26535
port3=8979

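# Knock on each port in order. Each nc finishes (after at most one
# second) before the next starts, so the knocks cannot arrive out of
# order and ssh is not attempted until all three have been sent.
for PORT in $port1 $port2 $port3
do
    nc -w 1 -z "$sshhost" "$PORT"
done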

ssh -p $sshport $sshhost
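
Coming back to the warning about locking yourself out: iptables -F removes rules but leaves the chain policies alone, so if your main script sets a default policy of DROP, a bare flush will not let you back in. Here is a minimal sketch of a reset script that a cron job could run while you are experimenting. The function name reset_firewall is made up, and the IPT override exists only so the function can be dry-run.

```shell
#!/bin/sh

# Path to iptables; overridable so the function can be dry-run.
IPT=${IPT:-/sbin/iptables}

# Open the filter table back up: permissive policies first, then flush
# all rules and delete user-defined chains such as knock1..knock3.
# reset_firewall is a hypothetical name chosen for this sketch.
reset_firewall() {
    $IPT -P INPUT ACCEPT
    $IPT -P FORWARD ACCEPT
    $IPT -P OUTPUT ACCEPT
    $IPT -F
    $IPT -X
}

# Call reset_firewall at the end of the file when installing this as a
# standalone script for cron to run.
```

Saved somewhere like /usr/local/sbin/fw-reset (a hypothetical path) with the call added, it could be scheduled every 15 minutes with a crontab entry of the form */15 * * * * /usr/local/sbin/fw-reset. Remember to remove the cron job once the firewall script is known to be good.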

Wednesday, May 23, 2007

Looking for Linux Administrators... on Windows

When I was last hunting for a job I posted my resume on sites like CareerBuilder and Monster. The strangest part was that I got more email about openings after I landed a job than before. However, I found out that more than a few firms were farming those sites for email addresses and I was getting placed on a few job lists. I removed myself from most of them, but I keep some of the job-spam around for the amusement value.

I always find it amusing when I get messages stating, "we have reviewed your resume and you are perfect for this position. ... Senior Java Developer ... 4+ years experience developing with Java in an enterprise environment." I can tell they actually care about me because I have listed no Java experience; I only mention it in passing in a word salad that helps me get through the auto-filters.

I got one I found particularly interesting today. It's from my friend at James Moore and Associates, from whom I get an update on new positions at least once a month. Usually they're way off my expertise and experience, but this one was actually on target.
"We have three exciting, fast-growing pre-IPO clients seeking Linux System Administrators."
Now Linux System Administration is something I actually do! I like to keep an eye on what's out there and what skills seem to be in demand. I was actually interested until I got to the following:
"If any of these sound interesting to you, please send me your resume as a Word attachment ASAP and indicate which position most interests you."
So let me get this straight: you want a Linux system administrator to send you a resume in a format that will likely only display properly if written in Windows? Word-formatted resumes are the most irksome thing about job hunting, in my opinion. I can deal with the unemployment, I can deal with interviews and HR people, but why do these people insist on using a format whose display is highly dependent on the system it is viewed on? Some web-based submission forms won't accept anything but .doc files at all. Couldn't they at least accept PDFs, a format designed to be created and read across platforms with all the formatting preserved as the author intended?

Here I think we run into the fundamental problem of having HR folks searching for technical experts and hackers. These are two entirely different mindsets, and if the HR people can't understand the people they're looking for they have no hope of distinguishing the good from the bad. I believe that's why they depend so much on keywords. As Paul Graham explained in his hyperbolic manner, "All the computer people use Macs or Linux now." And you'll find any developer worth his salt can pick up the latest technology in a few days.

I like to look at it as a reverse-screening process. Corporations that don't care enough about their employees to be sensitive and personal in the hiring process are probably not places I'd like to work.

Wednesday, May 16, 2007

Getting it wrong the first time...

I often feel that the first time I implement anything--be it a file server, web application, or pencil organization scheme--it won't be done right. The first time is a learning experience where you hack away at things till they work. The second time you just do it right. I've met some sys admins who believe that if it eventually works, great. I prefer to have a nice clean system that's set up right from the start instead of having spare packages sitting around or config files with comments chronicling my mistakes, not to mention the mistakes that aren't commented out. When a system is hacked together like that there's a greater chance for emergent behavior due to strange interactions.

Now, this just means I have to spend some more time on the implementation, and I think the stability of the system is worth it. However, the one large problem is when the users come to expect some functionality I had in place only temporarily. Often this is temporary logins while I get a centralized login system set up, or SSH tunnels while I'm setting up the VPN. The users have come to expect it, so they complain when it doesn't work. I could always take the hard line and be a good BOFH, but that really isn't my style. So when Joe decides he must have X11 forwarding through a tunnel for his scripts instead of using the VPN, I advise him to use the VPN and then just let him keep his firewall account.

The moral here is to support only the best-in-class solution from the start. Putting in bypasses while you're setting things up may cause users to become attached to something you don't intend to last. Remind them of their childhood puppy. Johnny, all good things must come to an end.

Thursday, May 10, 2007

Samba isn't anti-social, it just has its own groups

I just spent at least 15 minutes trying to debug a simple problem with Samba. The issue is that I had a share set up like the following:
[test]
comment = Test Share
path = /volumes/local/test
writable = yes
public = no
browsable = no
force group = users
valid users = drew
It looks pretty standard, but there's a gotcha. I listed the shares and it didn't show up, as I planned, so I tried to connect to it directly and got the cryptic
tree connect failed: NT_STATUS_NO_SUCH_GROUP
Now I find that odd since I am a member of a group which exists and is named users.

Apparently there are some issues with this version of Samba on stand-alone servers. If you look in your logs you'll see things like this:
[2007/05/10 15:43:40, 0] auth/auth_util.c:create_builtin_administrators(785)
create_builtin_administrators: Failed to create Administrators
[2007/05/10 15:43:40, 0] auth/auth_util.c:create_builtin_users(751)
create_builtin_users: Failed to create Users
Apparently Samba has its own special groups for Users and Administrators. Since it can't create them properly it doesn't have the built-ins and thinks the group doesn't exist. Using any other group name works fine.

The moral: Don't use "users" or "administrators" as groups in Samba if you want to refer to the system groups. Hope this helps someone.
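
If you have a lot of shares, a quick grep can point out the ones likely to trip over this. It is a rough sketch based only on what I saw with this Samba version on a stand-alone server; the function name is invented, and it only matches the plain "force group = users" form shown above.

```shell
#!/bin/sh

# Print, with line numbers, every "force group" setting in the given
# smb.conf-style file that names users or administrators.
# risky_force_groups is a hypothetical helper name.
risky_force_groups() {
    grep -inE '^[[:space:]]*force group[[:space:]]*=[[:space:]]*(users|administrators)[[:space:]]*$' "$1"
}

# Example use, not run here (the path is an assumption about your install):
#   risky_force_groups /etc/samba/smb.conf
```

For the share above it would report the force group = users line; pointing that setting at a differently named system group made the error go away for me.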

Monday, May 7, 2007

Backing Up Web 2.0

So Alice is happily managing her administrative duties: a few purchase orders, some emails, updating the conference room schedule, watching a few YouTube videos, checking out the links her friends send her on del.icio.us, just the usual. She's typing away, getting things done, tapping her fingers to the music in her headphones as the pages load, and suddenly she realizes they aren't. Something has gone down in the network. It could be the connection, it could be the server, it could be the service; the ultimate result is she is no longer working.

I was just bitten by this when del.icio.us wasn't responding. Suddenly I can't look up that how-to I was working from last week.

Networked applications are great. They let us all communicate and collaborate better, they centralize things, and we're all more productive. However, the problem is that when network applications fail they tend to fail catastrophically. If the network is down, no one can work.

The problem really started to arise when we started getting always-on internet connections with good bandwidth. Before our networks were reliable and fast, things were designed for intermittent communication. Things like FTP and CVS are designed to create a local copy for you to work on and then upload to the server when it's convenient. Now the bandwidth is high enough that we can just keep our documents on the server all the time, be they file shares or Google Spreadsheets. In the old model, if the network went down everyone could keep working on their local copies. Now when the network goes down no one can work.

For this reason it makes a lot of sense to have backups, both of data and procedures, for when your web applications and servers fail. Who do you talk to to submit that purchase order? Who will keep track of changes to a collaborative document? How will you get your files from one system to another? Some suggestions:
  • Designate an individual to be in charge of each process, such as documents or purchasing or code check-ins, so that there is a central point of contact when the system stops.
  • For web-forms and submissions make sure you have paper backups and that people know where they are and have a supply on hand to last at least a day.
  • Use technologies where users keep a local cache such as AFS and SVN instead of having all files reside on the server.
  • Never entirely replace something with a networked counterpart. Leave the previous version around so people can roll back to something familiar when the new system fails.
  • Make sure you have out-of-band communication with your systems. This is usually a keyboard and monitor or serial connection to the system. If you get in the habit of SSHing into everything you'll have a lot of trouble fixing it when you accidentally assign an IP address of 192.168.1. Yes, that's three octets, and yes, I've done that when trying to remotely roll-over a firewall while out traveling.
Fortunately, most of these outages are brief, but it is important to remember that nothing has 100% uptime. To mis-quote Fight Club: On a long enough timeline, the uptime of any system approaches 0%.