open source and hacky stuff

Tuesday, October 25, 2011

Getting wifi to work on an Asus Eee PC 1015PEM

This Asus has a Broadcom BCM4313 wifi card. The Linux kernel that ships with Slackware 13.37 includes an open-source driver for this card. Unfortunately, it does not ship the firmware the card needs, so the driver is useless on its own. If you try to download and install Broadcom's own driver instead, it crashes the machine (even if you blacklist every other module).

The solution is to download and install the firmware according to the driver's README (which you can find at /usr/src/linux-2.6.37.6/drivers/staging/brcm80211/README). Once the firmware files are copied to /lib/firmware/brcm/ you need to make symlinks named "bcm43xx-0.fw" and "bcm43xx_hdr-0.fw" that point to the files whose names most closely match those.
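Here's a rough sketch of the symlink step in Python. It assumes the firmware files carry a version suffix after the base name (the names in the demo, like "bcm43xx-0-610-809-0.fw", are only examples, so check what your archive actually contains), and link_firmware is a made-up helper name, not anything from the driver's README.

```python
import os
import tempfile
from pathlib import Path

# Generic link name -> glob for the versioned file it should point at.
# The "-0-<version>.fw" naming is an assumption; adjust to your files.
LINKS = (
    ("bcm43xx-0.fw", "bcm43xx-0-*.fw"),
    ("bcm43xx_hdr-0.fw", "bcm43xx_hdr-0-*.fw"),
)


def link_firmware(fw_dir):
    """Create the two generic symlinks next to the versioned firmware files."""
    fw_dir = Path(fw_dir)
    created = {}
    for link_name, pattern in LINKS:
        matches = sorted(fw_dir.glob(pattern))
        if not matches:
            raise FileNotFoundError(f"no file matching {pattern} in {fw_dir}")
        target = matches[-1]  # last match in sorted order
        link = fw_dir / link_name
        if link.is_symlink() or link.exists():
            link.unlink()  # replace a stale link from an earlier attempt
        link.symlink_to(target.name)  # relative link inside the same directory
        created[link_name] = target.name
    return created


if __name__ == "__main__":
    # Dry run in a scratch directory instead of /lib/firmware/brcm/.
    with tempfile.TemporaryDirectory() as scratch:
        for name in ("bcm43xx-0-610-809-0.fw", "bcm43xx_hdr-0-610-809-0.fw"):
            (Path(scratch) / name).touch()
        for link_name, target in link_firmware(scratch).items():
            print(link_name, "->", os.readlink(os.path.join(scratch, link_name)))
```

On the real machine you would run link_firmware("/lib/firmware/brcm") as root.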

Of course the git repository that has the files is not working, so you have to pull them from somewhere else. You can get the archive from Debian here: http://packages.debian.org/sid/firmware-brcm80211 (download the source package's .tar.gz file and extract it).

Once you set up the firmware, just reboot and the machine should attempt to load the brcm80211 module and the right firmware automatically. Don't use the "wl" driver from Broadcom, as it will crash the machine. Add "wl", "b43" and "ssb" to the /etc/modprobe.d/blacklist.conf file just in case the system tries to load those.

UPDATE

Apparently, that stuff doesn't work the way it should. You have to upgrade to the latest 2.6 kernel (2.6.39.4 as of this writing) and load the 'brcmsmac' driver, as this is the new driver used by the BCM4313 on the latest 2.6 kernels. Blacklist all the other drivers first ('wl', 'brcm80211', 'b43', 'ssb', 'b43legacy', 'bcma'). I'm not sure if this is because the latest firmware is incompatible with older driver versions, but anything other than the latest kernel and the brcmsmac driver with the latest firmware kept crashing my machine. What a pain in the ass.
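For the blacklisting step, here's a small Python sketch that adds the missing "blacklist <module>" lines to whatever is already in the blacklist file without duplicating entries. merge_blacklist is a made-up helper, and I'm spelling the legacy module "b43legacy" because that's the name I know it by in the kernel tree; treat both as assumptions.

```python
# Modules to keep away from the BCM4313 so that only brcmsmac binds to it.
CONFLICTING_MODULES = ["wl", "brcm80211", "b43", "ssb", "b43legacy", "bcma"]


def merge_blacklist(existing_text, modules):
    """Return existing_text plus a 'blacklist X' line for each missing module."""
    present = set()
    for line in existing_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            present.add(parts[1])
    missing = [name for name in modules if name not in present]
    text = existing_text
    if text and not text.endswith("\n"):
        text += "\n"  # make sure new entries start on their own line
    return text + "".join(f"blacklist {name}\n" for name in missing)


if __name__ == "__main__":
    # Print the result instead of writing /etc/modprobe.d/blacklist.conf.
    print(merge_blacklist("blacklist wl\n", CONFLICTING_MODULES), end="")
```

Write the returned text back to /etc/modprobe.d/blacklist.conf as root once it looks right.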

Monday, October 24, 2011

pNFS is in Linux 3.x!

I totally missed it, but pNFS is officially in Linux 3.0 and beyond. If you need a simple, stable, parallel network filesystem that is included with vanilla Linux kernels, now you have it. Any NFS 4.1 compatible client should be able to use servers set up with pNFS.

Here are the docs I've found on it so far:

http://wiki.linux-nfs.org/wiki/index.php/PNFS_Setup_Instructions
http://wiki.linux-nfs.org/wiki/index.php/Configuring_pNFS/spnfsd
http://wiki.linux-nfs.org/wiki/index.php/PNFS_Block_Server_Setup_Instructions
http://wiki.linux-nfs.org/wiki/index.php/Fedora_pNFS_Client_Setup

Thursday, October 20, 2011

most new startup companies are stupid

Let me go down the list of some kinds of startup companies from Start-Up 100 and why they're stupid.

Advertising and Marketing
Right off the bat, a bunch of useless bullshit. I don't ever WANT to see an advertisement or marketing. I want to be able to find that shit if I'm looking for it, but I don't want any of it to just show up somewhere.

Audio and Media
More stupid "web 2.0" websites based around music and other crap. Internet radio has existed for well over a decade. I don't need another place to not find the music I want to hear. (Pandora sucks, Spotify sucks, Grooveshark sucks... it all sucks. I'll turn on Shoutcast or Last.FM Radio if I want random music that I kind of like)

Education, Recruitment and Jobs
First of all, if you didn't get a normal education, some web 2.0 shell of a company probably isn't going to educate you any better. We have Dice and Monster, and people who are competent at what they do will network and find jobs in person like normal.

Enterprise: Security, Storage, Collaboration, Databases
Finally, startups intended for technology. Too bad all of them suck. Most people I've seen who try to develop startups have not worked very long in tech, so they design or implement poorly, and if they survive it's from sheer luck. Most of these solutions are crap or unnecessary.

Finance, Payments and Ecommerce
Again, I'm pretty sure all the big contenders have already been created. It'd be interesting if they actually had a new way to deal with finance or ecommerce, but most of it's been done and there's not a lot of room for innovation.

Gaming, Virtual Worlds
Ok, here's something that actually has promise. Make a stupid game which is addicting and make a billion dollars like Rovio.

Social Networking and Collaboration
JUST. LET. IT. DIE.
Social networking is a fad. You know what the original social network was? AOL. Just let the shit die. God I hate social networks.

Travel and Transport
I guess there are still a few niche/boutique businesses you could start in this space. But if it's another "how to look up cheap flights" website, please just stop.

What I would like to see more of are startups that are intent on bettering mankind, or fixing a common problem, or pioneering a new technology (a *real* new technology, not just a new shitty website or NoSQL garbage tool nobody wants). Medical device startups are really cool. Startups that develop technology for the 3rd world are cool. I'm still waiting on somebody to build a company that just services new companies, giving them turn-key solutions to build new networks and support them. I'll go work for them.

Saturday, October 8, 2011

note to self for change management system

if a hack like extra privs is applied to a system so a dev can fix some issue in production, there should be a system in place to automatically revoke the privs after a given time. or let you specify a date/time range during which the privs are granted, so you can say "during maintenance window sunday 5am-9am developer Steve gets weblogic sudo access". as a matter of principle, all changes should be allowed to have date ranges applied to control when the changes happen. if a change has a start time but no end time, assume it stays in place indefinitely. if start time and end time are the same, the change is only applied once.
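a rough python sketch of those three rules (window, open-ended, one-shot). ScheduledChange and next_action are made-up names, and "applied" stands in for whatever state the real change management system would track; nothing here is tied to an actual tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ScheduledChange:
    description: str
    start: datetime
    end: Optional[datetime] = None  # None means "indefinitely"

    @property
    def one_shot(self):
        # Same start and end: apply once, never revoke.
        return self.end is not None and self.end == self.start

    def in_window(self, now):
        if now < self.start:
            return False
        return self.end is None or now < self.end


def next_action(change, now, applied):
    """Return 'apply', 'revoke' or 'none' for the current time and state."""
    if change.one_shot:
        return "apply" if now >= change.start and not applied else "none"
    if change.in_window(now):
        return "none" if applied else "apply"
    return "revoke" if applied else "none"


if __name__ == "__main__":
    window = ScheduledChange(
        "steve gets weblogic sudo access",
        start=datetime(2011, 10, 9, 5, 0),  # sunday 5am
        end=datetime(2011, 10, 9, 9, 0),    # sunday 9am
    )
    print(next_action(window, datetime(2011, 10, 9, 6, 0), applied=False))
    print(next_action(window, datetime(2011, 10, 9, 9, 0), applied=True))
```

a scheduler would call next_action for every change on each tick and act on anything that isn't 'none'.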

all account access should have defined end dates (for example, contractor steve has a 6 month contract, so all his access should have an expiry time set). before access is revoked, email alerts should go out at 2 weeks, 1 week, 3 days and 1 day before expiry so somebody knows before his access goes out the window. most configuration should not have expiry times, because if it is in config management it's assumed to be there indefinitely, but for quick hacks that we know we don't want around for long we can set expiry times and get alerts before they expire.
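the alert schedule is simple enough to sketch in python. alert_times and due_alerts are made-up names, and actually sending the email is left out; this only works out when the alerts fall.

```python
from datetime import datetime, timedelta

# How long before expiry each alert goes out.
ALERT_LEAD_TIMES = (
    timedelta(weeks=2),
    timedelta(weeks=1),
    timedelta(days=3),
    timedelta(days=1),
)


def alert_times(expires_at):
    """Return the times at which an expiry warning should be sent."""
    return [expires_at - lead for lead in ALERT_LEAD_TIMES]


def due_alerts(expires_at, last_checked, now):
    """Return alert times that fell after last_checked, up to and including now."""
    return [t for t in alert_times(expires_at) if last_checked < t <= now]


if __name__ == "__main__":
    contract_end = datetime(2012, 4, 8)
    for when in alert_times(contract_end):
        print("warn on", when.date())
```

run due_alerts from a periodic job with the time of the previous run as last_checked, so a missed run still catches up on its alerts.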

Monday, September 26, 2011

dumb network policies and systems practices

"jump server". just the term itself conjures up an image of "getting around" security or the network. it's a HACK. unless there's a big problem with your network, you should be able to allow access directly to the server you need to get to. connecting to one host just to connect to another host is absurd.

the only potential benefit is that you're essentially forcing all network communication through one protocol (which can subsequently be circumvented on the jump server, depending) and (again, depending on the jump server) authenticating twice.

the bad things? it's incredibly slow to transfer files. functionality with other protocols breaks. and you're circumventing the firewalls and network security: once you tunnel to the jump box it becomes much harder to determine who is connecting to where on the far side. attacks on the internal network get much more interesting too, since anyone who escalates privs on the jump box can piggyback on any connection another user is making from it. on top of that, you're forcing a new layer of complication onto your users so doing their job becomes more of a hassle, which almost by definition inspires people to break good convention for the sake of convenience. and it's a waste of resources.

systems guys, don't jerk your users around. if there's a way you can get something done quicker, do it. for example: resizing logical partitions in a VM guest.

if your user wants 10GB more added to their work partition, get a procedure in place so you can do it live. rebooting the server should not be necessary for most admin tasks on a unix host.

don't believe me? read this blog post explaining how to extend an LVM volume while the box is still up. hey, now i don't have to wait a day to keep doing my work on a server which was allocated way fewer resources than it should have had!
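for the record, a live resize usually boils down to two commands. this python sketch just builds and runs them; it assumes an ext3/ext4 filesystem on an LVM logical volume with free space left in the volume group, and /dev/vg0/work is a made-up device path. other filesystems need a different grow tool.

```python
import subprocess


def grow_commands(lv_path, extra_gib):
    """Build the commands to grow a logical volume and its filesystem online."""
    if extra_gib <= 0:
        raise ValueError("extra_gib must be positive")
    return [
        ["lvextend", "--size", f"+{extra_gib}G", lv_path],  # grow the volume
        ["resize2fs", lv_path],  # grow ext3/ext4 to fill the volume
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        runner(command, check=True)


if __name__ == "__main__":
    # Print only; call run_all(...) as root to actually do it.
    for command in grow_commands("/dev/vg0/work", 10):
        print(" ".join(command))
```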

Monday, September 19, 2011

secure kickstarting of new linux servers

PXE is not secure. Not only does it rely on broadcast requests to find a PXE server, it uses UDP and TFTP to serve files, neither of which offers any authentication or encryption. It also can't be used on WANs and typically requires admin infrastructure and VLANs set up wherever the boxes will be installed, so lots of admin overhead is required.

To get around these problems and provide a secure mechanism for remote install you should use a pre-built Linux image on CD-ROM or floppy disk. Most servers today come with one or the other, and both provide enough space to include a kernel and a tiny compressed initrd with barebones networking tools.

The kernel should obviously be the newest vanilla kernel possible, patched to include any relevant hardware support. The more vanilla the better as you can quickly pick up the newest released kernel and build it without needing to modify vendor-specific patches.

The initrd should probably be based on an LZMA-compressed mini filesystem or cpio archive. Usually something custom like squashfs works the best. You'll need to build busybox and the dropbear ssh client in a uClibc buildroot environment to make it as small as possible. Bundle an ssh key for the admin server along with DNS and IP information for the server (in case DNS resolution fails, it can try the last known good IP address(es)). It should try indefinitely to get an IP address, and once it gets one it should make an SSH connection. Once SSH connection is established it should download install scripts and execute them. All downloads happen through the secure SSH tunnel.
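The connect logic might look something like this. It's written in Python only to show the control flow; in a real busybox initrd it would be a shell script around the dropbear client. All the function names are made up, and the resolver, connect and sleep functions are passed in as parameters so the loop can be exercised without a network.

```python
import socket
import time


def resolve_host(hostname):
    """Look the admin server up in DNS and return its addresses."""
    infos = socket.getaddrinfo(hostname, 22, proto=socket.IPPROTO_TCP)
    addresses = []
    for info in infos:
        ip = info[4][0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def candidate_addresses(hostname, last_known_good, resolve=resolve_host):
    """DNS results first, then the addresses baked into the image."""
    addresses = []
    try:
        addresses.extend(resolve(hostname))
    except OSError:
        pass  # DNS is down or blocked: fall through to the cached IPs
    for ip in last_known_good:
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect_with_retry(hostname, last_known_good, port=22, delay=10.0,
                       resolve=resolve_host,
                       connect=socket.create_connection,
                       sleep=time.sleep):
    """Keep trying every candidate address until one accepts a connection."""
    while True:
        for ip in candidate_addresses(hostname, last_known_good, resolve):
            try:
                return connect((ip, port), 5.0)
            except OSError:
                continue  # unreachable: try the next address
        sleep(delay)  # nothing answered: wait and start over
```

The returned socket only proves the server is reachable; the SSH session and the download of the install scripts would take over from there.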

The initrd should also include proxytunnel and potentially openvpn or another UDP-based SSL tunnel so it can fall back to trying to connect through HTTP or UDP port 53. You may need cntlm to work with NTLM proxies, as proxytunnel's support does not seem to work for all versions of NTLM. Also, rsync should probably be included, or downloaded immediately once a network connection is established. Big apps can always be downloaded to a tmpfs partition later once the SSH connection is established. The boot image should also use a bootloader that can pass custom arguments to the initrd at boot time, for example to specify proxy or IP settings. It should probably support Web Proxy Autodiscovery Protocol for corporate environments.

If you have room, you may also want to bundle grub with the initrd. This can help you recover a system if it fails and it will allow you to install grub over whatever's currently on the hard drive. A good grub configuration to install would include options to boot from hard drive, CDROM or floppy, so as long as you have remote console you can reboot the box and select the install image from grub at boot time without needing to change BIOS settings.

Finally, each client image should be modified before burning/imaging to have custom ssh keys. You want each boot image to be able to be revoked from the admin server's login list, in case a boot image/disk is compromised or stolen. Granted, you're not giving this thing any more rights than access to your kickstart file tree, and that shouldn't have anything super confidential on it anyway. To customize the ssh keys per image you can have a script which generates new images, creates keys for each one and renames the image file to something specific to the machine. Match up the specific piece of hardware with this unique image file in your network's inventory database.
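A sketch of the per-image key handling in Python. It assumes OpenSSH's ssh-keygen is available on the box that builds the images (converting the key for dropbear is left out), and the "kickstart-<machine id>" comment convention and both function names are made up for illustration. The comment on each key is what lets you revoke a single image later.

```python
def keygen_command(machine_id, key_dir="keys"):
    """Build the ssh-keygen call that creates a passphrase-less key for one image."""
    return [
        "ssh-keygen", "-q",
        "-t", "rsa", "-b", "4096",
        "-N", "",                          # no passphrase: the image boots unattended
        "-C", f"kickstart-{machine_id}",   # comment ties the key to the machine
        "-f", f"{key_dir}/{machine_id}",
    ]


def revoke_image(authorized_keys_text, machine_id):
    """Drop the key for one image from the admin server's authorized_keys text."""
    comment = f"kickstart-{machine_id}"
    kept = [
        line for line in authorized_keys_text.splitlines()
        if not line.strip() or line.split()[-1] != comment
    ]
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    print(" ".join(repr(arg) for arg in keygen_command("rack4-u12")))
```

The image build script would run the command, copy the private key into the initrd, append the public key to the kickstart account's authorized_keys, and record the machine id against the hardware in the inventory database.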

The kickstart server should have IP addresses dedicated only for the kickstart clients. An HTTP reverse proxy should be installed for ports 80 and 443 so that the client can use proxytunnel to connect to SSH through HTTP proxies. Optionally an openvpn daemon should be enabled on port 53 in case a client's firewall has an open outbound port 53. Each kickstart server's SSH daemon (listening only on the kickstart IPs) should use the same host keys so they can be copied to the initrd and you won't have to worry about the server host keys changing per box, messing up the initial connection from the clients. You could manage all the kickstart host keys independently but why complicate matters further?

In the end what you have is a client install CD or floppy which can boot up on a network, connect securely to a remote server, configure itself and follow setup instructions given by the remote server. It can even deliver detailed information about the host on boot-up so it can then download instructions specific to the machine type. When the machine boots up for the first time, select the CDROM or floppy and run it; as it boots it can install grub on the hard drive so you never even have to change the BIOS boot order. I recommend having the initrd grub config default to boot from hard disk; you can always manually select the remote install function and it's safer to default to booting from hard disk if the CD or floppy will stay in the machine.

Yes, this is a lot more maintenance than a simple PXE server. This is not intended for use in all environments. But if you need a truly secure, remote-accessible machine kickstarting solution, this one will do the job across all kinds of network types.

P.S. You can replace ssh with a minimal HTTPS client and a copy of the server's certificate on the initrd, so you don't have to rely on CAs. I personally don't trust 3rd party certificate authorities (as more and more evidence shows that states can snoop on SSL traffic without problems).

Tuesday, August 23, 2011

A tip for handling long downtime

So you push out a piece of code and it eats your live database. The site is broken. You need to take it down to repair the database. So you're going to keep your site down for how long? 30 minutes? 6 hours?

If you're trying to "fix" a database and you're keeping your site down until it's done, get a read-only copy of an old snapshot of the database + site code up. Put up a banner on all pages saying the site is under emergency maintenance so parts of the site are temporarily disabled.

This way your users get to continue using at least the read-only parts of the site and not all of your traffic goes out the window. Keep this in mind when developing the site too; not being able to update a hit counter in the database for a specific page should be a soft error, for example.
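To make the hit counter example concrete, here's a minimal Python sketch. render_page, load_page and bump_hit_counter are made-up names standing in for your own site code, and catching Exception is only for the demo; in real code you would catch your database driver's error class.

```python
import logging

log = logging.getLogger("site")


def render_page(page_id, load_page, bump_hit_counter):
    """Serve a page even when the database cannot take writes."""
    try:
        bump_hit_counter(page_id)  # a write: fails against a read-only snapshot
    except Exception as exc:  # narrow this to your DB driver's error class
        log.warning("hit counter skipped for %s: %s", page_id, exc)
    return load_page(page_id)  # a read: still works in read-only mode


if __name__ == "__main__":
    def read_only_counter(page_id):
        raise RuntimeError("database is read-only")

    print(render_page("home", lambda page_id: f"<h1>{page_id}</h1>", read_only_counter))
```

The page still renders; the only thing lost while the site is in read-only mode is the count.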

If you don't have a place to host this temporary database + site code, think about having such a place. Secondary/failover hosts would work at a time like this, or maybe your single host(s) need more capacity.