Thursday, April 8, 2010

how to make a product everyone will buy

  1. Make a hardware platform only you control and manufacture. Also make sure it looks very pretty and is reliable.
  2. Make an operating system that's user-friendly, simple, and very pretty with some killer apps and an easy dev environment. But make sure it only works on your hardware.
  3. Make software to provide all the personal needs one has with a computer and make it all tie together seamlessly. Make sure it's user-friendly, simple, and very pretty.
  4. Make accessory products which provide for things people want day to day, make it work with your hardware and software, and tie it all together seamlessly. Also, make sure it's very pretty and reliable.


Now release anything and make sure it ties together with all the previous products seamlessly, is user-friendly, simple, reliable, and very pretty. It doesn't even matter if it has a purpose or is redundant: people will buy it. It helps if you have the world's greatest PR/hype machine and if you can make people believe they're superior to someone else by owning these products. Above all, make sure it is always very pretty. In this vein the product is like a luxury car: completely impractical and unnecessary, but people pay a premium for something that looks fancy and probably doesn't provide any benefit over a cheaper, less pretty device.

Monday, March 15, 2010

Better Security [tips]

You've got network intrusion detection and stateful firewalls. Your kernels are patched as far as they can go for exploit prevention. You're using OpenSSH. That's awesome. Now why is it that someone can still penetrate your precious servers so easily?

When you begin to secure something (anything, really - buildings, documents, servers) you have to consider everything. Each factor which could possibly be targeted in an attack could be used with any other factor to increase the likelihood of a successful compromise. So each factor has to be looked at in conjunction with every other factor. Yes, this is usually incredibly tedious and mind-bogglingly complex. To help mitigate this you can design preventative measures around each possible attack vector. In other words, add security to everything.

In the example above there are loads of attack vectors just waiting to be leveraged. One example is OpenSSH. A lot of people just use it in its default form and never add any security to it. This will lead to an exploit. If you allow password entry to an OpenSSH server, just assume it's been compromised. It's so easy to observe a password being typed, or to intercept it somewhere else, that it's laughable. Not to mention people hiding passwords under their keyboards or on their monitors! No, a password-protected SSH key is the minimum you should use to allow access to a server. The "something you know, something you have" style of two-factor authentication is far more secure than a single factor. I should stress that this is only true when properly implemented, as bad two-factor can be even less secure than strong one-factor. For more on authentication factors read this and take note of the ways different factors can be exploited (don't rely on just biometrics!).
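Generating that baseline credential is a one-liner; here's a sketch (the passphrase, key size, and paths are illustrative, and this assumes the stock OpenSSH ssh-keygen is on your box):

```shell
# Create a passphrase-protected key: something you have (the key file)
# plus something you know (the passphrase).
tmp=$(mktemp -d)
ssh-keygen -q -t rsa -b 4096 -N 'correct horse battery staple' -f "$tmp/id_rsa"
ls "$tmp"   # id_rsa (private, encrypted) and id_rsa.pub (public)
```

Put id_rsa.pub in the server's authorized_keys and you can turn password authentication off entirely.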

In newer versions of OpenSSH there are even more methods to harden the authentication process, such as certificate authorities and key revocation lists. It's also a good idea to disable root logins, keep an explicit list of users allowed to authenticate, disable old deprecated protocols, ciphers, and algorithms, and explicitly drop any connection with conflicting host keys. You should even consider the libraries used by the application - were they built with buffer overflow protection? Is PAM enabled? One need only look around to see that the underlying systems of your very critical remote administration software could be rife with potential exploits. For every one exploit known there are probably ten unknown ones waiting to be found.
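Much of that boils down to a few sshd_config directives. A sketch (the group name and file paths are assumptions; TrustedUserCAKeys and RevokedKeys need a newer OpenSSH):

```
Protocol 2
PermitRootLogin no
PasswordAuthentication no
ChallengeResponseAuthentication no
AllowGroups ssh-users
TrustedUserCAKeys /etc/ssh/user_ca.pub
RevokedKeys /etc/ssh/revoked_keys
```

Run `sshd -t` after editing to catch syntax errors before you lock yourself out.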

Now think about what you have access to on your own system. Consider for a moment what would happen if an attacker used the same methods you do to gain access. Would it be difficult or easy for them? If it's easy for you to access the system, it may be for them. Try to make it more difficult even for yourself to gain access and a potential attacker will have a hell of a time trying to leverage something you've left unguarded. Make your firewalls incredibly verbose and restrictive; you'd be amazed how little can be done to a system when an attacker doesn't know exactly how to use it. Require multiple levels of logins before root can be obtained, and try to minimize any need to get to the root account.
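As a sketch of "verbose and restrictive," here's a default-deny ruleset in iptables-restore format (the port, source range, and log prefix are illustrative assumptions):

```
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# only the management network may reach SSH
-A INPUT -p tcp --dport 22 -s 10.0.0.0/8 -j ACCEPT
# log everything else before the default DROP eats it
-A INPUT -j LOG --log-prefix "FW-DROP: "
COMMIT
```

Load it with `iptables-restore` and watch the log; an attacker who can't even complete a TCP handshake has very little to work with.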

Make all of your services run as unprivileged users. Write scripts, to be executed via sudo, that take no options, clean their environments, and handle the tasks you'd otherwise need root to perform. Any admin should be able to perform basic administrative tasks as a non-root user. Make all services controllable by an "admin" group, with each service having its own unique user to minimize attacks from one service to the next. Most services can be configured to start up, bind to a privileged port, and drop to an unprivileged user, but for those that cannot there are methods (SELinux, etc) to work around restrictions in an application or system.
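A minimal sketch of such a wrapper - "myapp", the PATH, and the install path in the sudoers line below are all hypothetical:

```shell
# restart-myapp: takes no arguments and scrubs its environment, so sudo
# can't be used to smuggle options or environment variables into root's lap.
restart_myapp() {
    # refuse all arguments
    [ "$#" -eq 0 ] || { echo "usage: restart-myapp" >&2; return 2; }
    # run the privileged action under a clean, fixed environment
    env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin sh -c 'echo "myapp restarted"'
}
restart_myapp
```

In sudoers you'd then grant exactly that one script, e.g. `%admin ALL=(root) NOPASSWD: /usr/local/sbin/restart-myapp`, instead of a blanket root shell.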

Configuration should also be applied by a non-root user. A good way to think of configuration management is "let the service manage itself." Create your base model or template of configuration management scripts, then create service-specific configuration that can be run by the user of that service. That way you don't need to worry about an attacker pilfering your configuration management and applying rules to all the machines as root. You can also create more fine-grained controls over which admin (or group) can configure which service. You don't need to worry about a "trusted" user compromising your whole network if you only explicitly grant them access to the things they need to manage.

In fact, consider time-based access control for your entire network. You should expire SSH keys and user access for different services around the same time you expire old passwords. This will force you to improve the method users have to request access and hopefully increase productivity and responsiveness in this area of support. Just don't fall into the trap of allowing anything anyone asks for. Make it easy to get their manager to sign off on a request so at least there's some accountability; you can only benefit in terms of security if somebody thinks they might get fired for granting access willy-nilly.
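Newer OpenSSH makes the expiry part easy: certificates can carry a validity window, so access lapses on its own. A sketch (the key types, the user name, and the 52-week window are illustrative assumptions):

```shell
# Sign a user's key with a CA; the resulting cert is valid for 52 weeks.
set -e
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$tmp/ca"       # the certificate authority key
ssh-keygen -q -t ed25519 -N '' -f "$tmp/alice"    # a user's key
ssh-keygen -q -s "$tmp/ca" -I alice -n alice -V +52w "$tmp/alice.pub"
ssh-keygen -L -f "$tmp/alice-cert.pub"            # shows the Valid: from/to window
```

Point the server's TrustedUserCAKeys at the CA public key and there's no per-user authorized_keys cleanup to forget; when the cert expires, so does the access.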

Thursday, February 25, 2010

using facebook to live-phish people indirectly

(05:38:23) Friend On Facebook: hey
(05:38:24) Friend On Facebook: hey
(05:38:25) Friend On Facebook: are you there
(06:43:28) Friend On Facebook: hry
(06:43:36) Friend On Facebook: how are you
(09:02:00) Me: morning
(09:02:54) Friend On Facebook: How are you
(09:03:06) Me: i'm good thanks, you?
(09:03:22) Friend On Facebook: I'm in a mess
(09:05:35) Me: in a mess?
(09:05:51) Friend On Facebook: yeah
(09:06:54) Friend On Facebook: i'm stranded in London,England and need help flying back home
(09:07:03) Friend On Facebook: Got mugged at a Gun point last night
(09:07:18) Friend On Facebook: all cash,credit card and cell phone were stolen
(09:09:50) Me: holy crap
(09:10:18) Friend On Facebook: thank God i still have my life and passport
(09:10:30) Me: that sux
(09:11:07) Friend On Facebook: Return flight leaves in few hours but having troubles sorting out the hotel bills
(09:12:29) Friend On Facebook: Need you to loan me some few $$ to pay off the hotel bills and also get a cab to the airport
(09:12:42) Friend On Facebook: I promise to refund it back tomorrow
(09:15:21) Friend On Facebook: are you there
(09:21:06) Friend On Facebook: are you still there
(09:25:34) Me: nice try ;)
(09:26:03) Friend On Facebook: ok


I confirmed via text message that this wasn't sent by the actual friend, who had no idea it was going on. In fact, someone the friend knows already fell victim to this phisher because they believed they were talking to the real person and had already sent money. (I wouldn't have sent him any money anyway because I don't know him THAT well, but it's still scary.)

Lesson learned: Do not trust the internet.

Friday, February 12, 2010

hacking the samsung PN-58B860 firmware

Here's a brief example of reverse-engineering and a bad implementation of encryption.

My friend bought a 58" Plasma TV. He mentioned something about browsing youtube on it. This brief convo led my curiosity to their firmware download page. After downloading the self-extracting Windows executable (yay for ZIP file compatibility!) and unzipping it, I found a couple of files.
pwillis@bobdobbs ~/Downloads/bar/T-CHE7AUSC/ :( ls -l
total 80
-rwxr-xr-x 1 pwillis pwillis 11695 2009-09-14 20:18 MicomCtrl*
-rwxr-xr-x 1 pwillis pwillis 19431 2009-09-14 20:18 crc*
-rwxr-xr-x 1 pwillis pwillis 21057 2009-09-14 20:18 ddcmp*
drwxr-xr-x 2 pwillis pwillis 4096 2010-02-12 14:09 image/
-rw-r--r-- 1 pwillis pwillis 7738 2009-09-14 20:18 run.sh.enc
pwillis@bobdobbs ~/Downloads/bar/T-CHE7AUSC/ :) ls -l image/
total 162080
-rw-r--r-- 1 pwillis pwillis 2048 2010-02-12 13:28 appdata-sample
-rw-r--r-- 1 pwillis pwillis 35663880 2010-02-12 13:54 appdata.fuck
-rw-r--r-- 1 pwillis pwillis 35663872 2009-09-14 20:18 appdata.img.enc
-rwxr-xr-x 1 pwillis pwillis 1573 2010-02-12 13:49 decrypt.pl*
-rwxr-xr-x 1 pwillis pwillis 1573 2010-02-12 13:41 decrypt.pl~*
-rw-r--r-- 1 pwillis pwillis 47304710 2010-02-12 13:50 exe.fuck
-rw-r--r-- 1 pwillis pwillis 47304704 2009-09-14 20:18 exe.img.enc
-rw-r--r-- 1 pwillis pwillis 18 2009-09-14 20:18 info.txt
-rw-r--r-- 1 pwillis pwillis 47 2009-09-14 20:18 validinfo.txt
-rw-r--r-- 1 pwillis pwillis 44 2009-09-14 20:18 version_info.txt

`file` tells us that MicomCtrl, crc, and ddcmp are ELF 32-bit LSB ARM executables. I ignore these: they probably don't serve a major function, and since they're plain old unencrypted files they can be reverse-engineered with a debugger and standard development tools without much trouble.

We can see that there's obviously a shell script and two 'img' files, which are probably filesystem images, all encrypted. The question then becomes: how are they encrypted, and how can we decrypt them? I start by opening up the files. The shell script appears to have a normal script-style structure, with multiple lines (sometimes repeating exactly) separated by newlines. Since it has a 'normal'-looking structure, I can already guess that whatever the encryption method is, it isn't very good. Good encryption should give you no idea of what the data is or its form, and should leave no apparent patterns.

When I open up one of the image files it seems pretty much like random garbage, as expected. I don't expect to find much in them, but I run them through the unix `strings` command anyway. All of a sudden, long runs of the same ASCII characters tumble out:

"CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7AUSCT-CHE7"

The first thing I think is: Holy Shit. I know immediately what I'm looking at. From my early days experimenting with XOR encryption I learned two simple principles: one, never repeat the key you're encrypting with, and two, never allow NULL characters in your data! The second applies because XORing anything against NULL (all 0 bits) just outputs the encryption key. As most people familiar with binary files know, they are absolutely rife with null characters, often in series and in large blocks. What I was looking at above was a huge block of the encryption key repeating over and over.

But could it really be that easy? I wrote a quick Perl script to test it.
#!/usr/bin/perl
# decrypt.pl - pwn a stupid firmware encryption
# Copyright (C) 2009 Peter Willis
#
# So here's the deal. I noticed a repeating pattern in the encrypted filesystem
# image of the firmware for this TV. One ASCII string repeating over and over,
# and only partly in other places. From experience with basic XOR encryption you
# may know that you can "encrypt" data by simply taking chunks of unencrypted
# data the same length as your encryption key and using an xor operation. The
# problem is, if there are any 'nulls' in your input data your key is going to
# be shown to the world in the output. My theory is this is what happened here.

use strict;

# This is the string that might be the encryption key. We don't know if this is
# the correct order of the key, only that these characters repeat themselves.
#my $possiblekey = "HE7AUSCT-";
#
# However, by looking at the input again, we know the repeating string is 10 chars
# long. Counting the number of bytes from the beginning of the file to this string
# 10 at a time we can assume the real string's order is:
my $possiblekey = "T-CHE7AUSC";

die "Usage: $0 FILE\nDecrypts FILE with a possible key \"$possiblekey\"\n" unless @ARGV;

$|=1;
open(FILE, "<$ARGV[0]") || die "Error: $!\n";
# Make sure we read an amount of bytes divisible by the length of the key or
# we would mess up our xors
while ( sysread(FILE, my $buffer, length($possiblekey) * 100) ) {
    for ( my $i = 0; $i < length($buffer); $i += length($possiblekey) ) {
        my $chunk = substr($buffer, $i, length($possiblekey));
        print $chunk ^ $possiblekey;
    }
}
close(FILE);

And we run it on the script to test the theory:
pwillis@bobdobbs ~/Downloads/bar/T-CHE7AUSC/ :) ./decrypt.pl run.sh.enc
#!/bin/sh

PROJECT_TAG=`cat /.info`

WRITE_IMAGE()
{
if [ -e $2 ] ; then
echo "==================================="
echo "$1 erase & extract & download!!"
echo "==================================="
$ROOT_DIR/ddcmp -d -i $2 -o $3
sync
echo "===============DONE================"
elif [ -e $2.enc ] ; then
echo "==================================="
echo "$1 erase & extract & download!![Enc]"
echo "==================================="
$ROOT_DIR/ddcmp -e $PROJECT_TAG -i $2.enc -o $3
sync
echo "===============DONE================"
fi
}

As you can see, the script decoded beautifully with this key on the first try. The other two encrypted files also decode fine. It turns out "exe.img.enc" is a FAT filesystem image with an x86 boot sector (obviously for 'dd'ing to some storage device on the TV). The "appdata.img.enc" file is a Squashfs filesystem.

It took about 20 minutes for me to download and decode this supposedly-encrypted firmware image. This is the lesson: use a real tool for encryption. Do not think you know how to do it yourself. And don't waste your time trying to obfuscate a filesystem image from me; I'll just crack open the TV and dump the flash ROM.
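For contrast, here's roughly what "a real tool" looks like - a sketch using openssl with AES-256 and a proper key-derivation function (the passphrase and file names are placeholders; -pbkdf2 needs OpenSSL 1.1.1+):

```shell
set -e
tmp=$(mktemp -d)
echo 'secret firmware bits' > "$tmp/plain"
# encrypt: real cipher, salted, passphrase stretched through PBKDF2
openssl enc -aes-256-cbc -pbkdf2 -salt -pass pass:example -in "$tmp/plain" -out "$tmp/enc"
# decrypt and verify the round trip
openssl enc -d -aes-256-cbc -pbkdf2 -pass pass:example -in "$tmp/enc" -out "$tmp/dec"
cmp "$tmp/plain" "$tmp/dec" && echo OK
```

Run `strings` on that ciphertext and you get nothing: no repeating key, no structure.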

For extra fun: note the name of the directory the executable was originally extracted into. :-)

Friday, January 29, 2010

why you should not trim your system install

I happen to think it's *mostly* pointless to trim the install of a system's packages. When I install a system - be it a desktop, server, development machine, etc. - I install all available packages for that distro. A lot of people disagree with me. They usually say:

  • "All those packages take up space!"

    Go buy a hard drive made this decade. And while you're at it, stop partitioning your kicks with 2GB /usr partitions and 500MB /tmp partitions. If your disk is full it's full; there's no benefit in letting it fill up sooner rather than later. Your filesystem should have been created with at least a 1% reserve for root only, which will let you log in and fix the issue (unless you're running filesystem-writing apps as root; you're not, right?) - not to mention the system monitors you use should warn you before the disk fills up.

  • "But it's a security risk!"

    Do you really think your system is more secure because it lacks some binary files? While you're spending time trimming your package list, you're forgetting the basics of system security like firewalling, disabling services, checking the filesystem for overly-permissive files/directories, setuids, etc. Just because you didn't install that setuid kppp doesn't mean there isn't a hole somewhere else on your system. Do a proper audit of your system once everything is installed. This will eliminate typical system attacks and you'll be secure enough to handle exploits in userland apps.

  • "It takes extra time to update all those packages!"

    Is your network that slow? Even if you upgraded all of KDE or Gnome it shouldn't take but a couple minutes to download the updated packages. Of course you were a good admin and you have a kickstart repository on the LAN of each machine (or accessible a hop or two away) so the bandwidth should be immaterial.

  • "Yum/apt will take care of the extra packages if you need to install something later."

    Oh boy! Let's talk YUM, shall we? First of all it's one of the shittiest pieces of vendor-approved package managing/updating software ever. Read the source if you dare (and if you can). The only thing worse than its code is having to troubleshoot YUM when it doesn't do what you want. Let's go down the checklist:

    1. Run `yum clean all`
    2. Check that the package's --requires exist in packages in the repo
    3. Check that the 'meta' arch of the package matches the arch of the machine
    4. Make sure there isn't a duplicate package with a different arch in the repo
    5. Make sure there isn't a package with a similar name but higher epoch in the repo
    6. Make sure the name is the same
    7. Make sure the version is higher and has the same exact format as any other package with the same name
    8. Make sure the metadata in the repo is up to date, and re-gen it just to be sure
    9. Do a `yum clean all` again
    10. Sacrifice a goat to the Yum maintainers
    11. Rename your first born to 'Yellowdog'
    12. Etc


    Usually someone pushing a bad package or a dependency of a package that used to work will be what breaks Yum. It'll go unnoticed until you really really need that package and its dependencies installed. Then you'll spend hours (and sometimes days) trying to get it installed and fix whatever was broken with rpm/Yum. Whereas if you had installed everything right after your kick, the package would just be there, ready for use. You should only use something newer than what came with your kick if you really really need it.

    Of course experience teaches us the folly of trusting any update to an rpm. Whenever you push a new package you must test it on the host it'll be installed on. The package itself may not install correctly via Yum (though using just RPM would probably work), or there could be some other problem with the contents of the package that you'd only know by running the programs contained in the package on the target host. Because we do this, we don't need Yum to browbeat us every time the RPM (or something else) isn't 100% to its liking. If you just install packages en-masse and test them you can skip the whole process of troubleshooting Yum and skip right to troubleshooting the package itself on the host it's intended for, which we'd be doing anyway with Yum.
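Circling back to the security bullet above: the "proper audit" pass (setuid binaries, world-writable files) is mostly a couple of find commands. A sketch, run here against a throwaway directory with planted offenders:

```shell
# Plant a fake setuid binary and a world-writable file, then hunt for them.
tmp=$(mktemp -d)
touch "$tmp/suid-demo" && chmod 4755 "$tmp/suid-demo"
touch "$tmp/ww-demo"   && chmod 666  "$tmp/ww-demo"
find "$tmp" -type f -perm -4000      # setuid files
find "$tmp" -type f -perm -0002      # world-writable files
```

Pointed at / on a real box (and diffed against a known-good baseline), this catches the overly-permissive files and setuids regardless of which packages you did or didn't install.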

For a VPS or some other disk-and-bandwidth-limited host, it's obvious that trimming packages will save you on both of your limited resources. But on a normal network with multiple hosts and plenty of storage, I wouldn't spend a lot of time tweaking my kickstart package list.

Friday, January 22, 2010

hype of the century

There's never going to be an end to the ridiculous hyperbole surrounding new, expensive, fashionable technology. It never matters if the thing they're hyping is actually good. I think this report sums up exactly the kind of situation I see all the time from the masses of ignorant media fiends.

However, there does seem to be a kind of peak that is hard to reach again. I'd like to tell you all about the biggest peak to date: the iPhone. Seems you can't talk about the iPhone in a negative context without someone insisting that, no matter what, I have to agree the iPhone "changed the world". WELL THEN. Let's take a look at this entrancing device and see whether that assertion holds.

First of all, I don't see any marketing blitzes in Rwanda or Haiti or the North Pole for this shiny hunk of metal and silicon. In fact, when it was first released the device was only accessible by those with a considerable amount of money and a specific geographic location and mobile service provider (AT&T). Over the years they've released the iPhone officially in other territories and it's possible to purchase one unlocked for other carriers. <edit> You can also now buy the iPhone locked for dozens of carriers around the world. But the device is not universal to all carriers and nations. </edit> This kind of access does not change the world. Maybe they meant to say it "changed the PART OF THE world WHERE I LIVE AND MY ACCESS TO MOBILE CONTENT ON ONE CARRIER". That may be true enough, but that's not what they actually say.

Did it change the industry? Perhaps... The idea of a manufacturer/producer of a phone dictating the terms of how the phone operates, and even taking a cut of the subscription profits, was certainly a hybrid business model. But did it change the industry? Thus far, no other vendor has accomplished such a feat. With the release of the Nexus One from Google we see an unlocked phone provided on multiple carriers whose operating system is solely controlled by Google. This is probably only the second time I know of that a carrier's dictatorial domination of a device has been stripped away. But the iPhone was released three years ago. It took three years to begin to change the industry? What took so long? You can't say Apple's original hybrid business model "changed the industry" because the industry remained the same - only one schmuck corporation (AT&T) went along with the idea of total vendor lock-in.

If anything, the iPhone has influenced the way the industry builds out its services. I'm willing to bet that the volume of data traffic is starting to surpass the volume of voice traffic. Text messages already replace most short conversations, and one day perhaps the voice channels will all be replaced by a single digital link, with VoIP connecting users to providers. There's no reason why in the future you couldn't pick a different long-distance carrier than your "mobile ILEC", similar to how land-line calls have been routed for the past however many decades. It's obvious AT&T can't keep up with the current flow of data, however, and other carriers must see the need to expand their capacity.

Nobody rational or educated would argue that the iPhone ushered in a new era of smart phones. Smart phones had all of the features of the iPhone and more for years. Granted, those phones were usually high-priced unlocked devices more for the early adopters with green falling out of their pockets. The iPhone itself wasn't exactly cheap, starting out at $399 (and $499 for the 16GB version) plus a 2-year contract. OUCH. (To contrast that, the extremely capable Nokia N95 was around $500 at release time in the same year - unlocked with no contract). The phone even lacked basic features like Bluetooth profiles for input devices (or virtually any other useful profile, including that for wireless stereo headsets). It couldn't send picture messages, it couldn't copy-and-paste... Other than the multitouch interface it wasn't revolutionary in terms of technical gadgetry. You could get more done with a brick-style phone on any carrier than you could with the iPhone.

The one thing you could say was a game-changer was the App Store. Apps for phones are nothing new. Ever since Java became the "operating system" for phones around the turn of the millennium, people have been making custom apps and selling them for big bucks world-wide. But there was never an easy way to just look for, pay for, and install any given app. The App Store made them accessible to any user at all times. This in turn brought in more developers, and with the fast processor and moderately fast bandwidth, many apps were created to bring new kinds of content to the device. There were a couple of other "app store"-style websites around, but nothing that tied directly into a phone. Google's Android followed suit with its own app store, and Microsoft is just now starting to get into the game.

In the end, we now have an industry saturated with look-alike devices, many of which provide more features and functionality than the iPhone itself. But they will never surpass the iPhone in terms of sales or user base. And the reason comes directly from Apple's ubiquitous business model of total lock-in. Control the hardware + control the software = control of the users. At this point, any device that tries to come in and "shake up the game" will be nothing but a distraction for the uninformed random user who stumbles into a carrier's brick-and-mortar store to be told what to buy. Most Web 2.0-savvy users will be looking for a phone that supports the "apps" of a particular service or web site, and the only two options today seem to be iPhone or Android. Some people still make Blackberry apps, but that will probably become a niche catering to business and corporate users rather than the general public. So an iPhone pretender will always be just that, and never as successful - until it has its own App Store that can compete.

If you're quite done drinking my Haterade, I'll admit that the iPhone is nice. At this point it's the cheapest possible smart phone that provides such a feature set, support from developers, and an incomparable user base. But world-changing? Ask anyone today who doesn't have an iPhone if their world seems different since 2007. They'll probably tell you about how the economy is fucked and the banks are running our government, and thank god we have a new President (or holy jesus we're all fucked because of the new President). But if you ask them to list the top 10 things that have changed their world since then, the iPhone won't be among them. Because most people don't give a shit. Their mobile phone needs have been met. And other than general curiosity in high-tech gadgetry, they won't have the need to buy into something else.

Monday, January 11, 2010

safe subversion backup with rsync

Let's say you have a subversion repository and you want to keep a backup on a remote host. Doing a "cp -R" is unsafe as is mentioned here, so your two safe methods of copying a subversion repo are 'svnadmin hotcopy' and 'svnadmin dump'. The former only makes a local copy, but the latter creates a single dumpfile on stdout, which is the most flexible method (though it does not grab repo configs).

The simple method to back up the repo from one host to another would be the following:
  • ssh user@remote-host "svnadmin dump REPOSITORY" | svnadmin load REPOSITORY

That would create an identical copy of remote-host's repository on the local host. However, for big subversion repositories this could take a lot of time and bandwidth. Here is the form to use disk space to save time and bandwidth:
  • ssh user@remote-host "svnadmin dump REPOSITORY > REPO.dump" && rsync -azP user@remote-host:REPO.dump REPO.dump && svnadmin load REPOSITORY < REPO.dump

This makes a dump on the remote host and rsyncs only the differences in the dump file to the local host. Note that the dump file is uncompressed and rather large, so if you have lots of spare cycles you can pipe the output of 'svnadmin dump' into 'lzma -c' and do the opposite for 'svnadmin load'. (The rsync '-z' flag uses gzip compression, but lzma will save you much more space and thus possibly more time)

edit: lol, I just realized compressing before rsync is probably pointless, other than reducing the size on local disk. Also, 'svnadmin hotcopy' is probably just as good as (if not better than) dumping to a local file and piping from one host to another - and it saves the config.