Linux Archive

Apache and file downloads larger than 2GB

I have an unsophisticated backup routine on one of my servers that just tars and gzips the /var/www/html directory; an office server then downloads the archive from a protected directory.

Unfortunately it stopped working a couple of weeks ago, so I went to find out what the problem was. After much tweaking and testing, followed by a couple of hours of Googling, it turned out that it wasn’t an Apache problem but a Linux kernel problem. Certain kernels won’t allow files larger than 2GB to be downloaded through Apache.
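For anyone with a similar routine, here is a minimal Python sketch of the backup step with a size check bolted on. It is not the script I actually run: the function names are made up for illustration, and the 2 GiB threshold assumes the limit is the usual signed 32-bit file offset boundary, which I haven’t confirmed for this particular kernel.

```python
import os
import tarfile

# 2 GiB: the point where a signed 32-bit file offset overflows.
# Assumption: this is the boundary behind the failed downloads.
TWO_GIB = 2**31


def make_backup(source_dir, archive_path):
    """Write source_dir into a gzip-compressed tar archive and return its size in bytes."""
    # Store paths relative to the directory name rather than as absolute paths.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level)
    return os.path.getsize(archive_path)


def exceeds_limit(size_bytes, limit=TWO_GIB):
    """Return True when a file of size_bytes has reached the 2 GiB boundary."""
    return size_bytes >= limit
```

Calling `exceeds_limit(make_backup("/var/www/html", "backup.tar.gz"))` after each run would have flagged the problem on the day the archive grew too big, rather than a couple of weeks later. The archive name here is just a placeholder.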

Since I wasn’t about to start messing around with a kernel upgrade on a production server, I went for downloading the file by FTP instead. I can’t remember where I found this out, but if it was your site, leave a comment and I’ll credit you (there was more info there than here).

Linux vs Mac vs Windows quick comparison

Check this out! Hilarious…

Apache Log Analysis

I’ve just had to do some Apache log analysis for the various Find Locally domains. As all the sites run off one server, the log files contain data from all the domains. This makes it a little difficult to get any meaningful information without analysing each domain one at a time, but before getting to that, I had to get all the data into a format I could do something with.

After downloading the 140 MB of log files, I used M$ Access to import the delimited files and then used the MySQL ODBC Connector to export the tables into a MySQL database on my laptop (it’s a Core Duo and pretty much the fastest machine I have).

I then decided to get rid of the bot traffic from the 5 million hits so I would be left with something usable. As Apache only stores the IP address of the remote machine, I sent PHP on a 12-hour mission to use gethostbyaddr() to look up and store all the remote hostnames. This made it a lot easier to filter out known search engines and the like.
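The original job was done in PHP, but the idea can be sketched in Python along these lines. Treat it as a rough illustration rather than my actual script: the function names are invented, and the list of crawler hostname suffixes is only a hypothetical starting point that you would extend from your own logs.

```python
import socket

# Hypothetical list of crawler hostname suffixes; extend it to suit your own logs.
BOT_SUFFIXES = (".googlebot.com", ".search.msn.com", ".crawl.yahoo.net")


def resolve_host(ip, lookup=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None when the lookup fails."""
    try:
        return lookup(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return None


def is_bot(hostname, suffixes=BOT_SUFFIXES):
    """Return True when hostname ends with one of the known crawler suffixes."""
    return hostname is not None and hostname.lower().endswith(suffixes)


def split_hits(ips, lookup=socket.gethostbyaddr):
    """Resolve each distinct IP once and return (human_ips, bot_ips)."""
    cache = {}
    humans, bots = [], []
    for ip in ips:
        if ip not in cache:
            cache[ip] = resolve_host(ip, lookup)
        (bots if is_bot(cache[ip]) else humans).append(ip)
    return humans, bots
```

Caching each address is the main design choice here: with millions of hits but far fewer distinct visitors, looking each IP up only once cuts the number of DNS queries considerably. The `lookup` parameter is there so the resolver can be swapped for a fake one when testing.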

With all the junk data gone and the rest sitting in a decent database, it was easy to analyse the remaining logs. As our domain names are all in the same format, PHP made it easy to dump out a few tables of results showing exactly what I need. All I need now is to turn it into a set of automated scripts so I don’t have to do it by hand next month.

New 1and1 Root Server and PHP Sessions

Phew, it took me three goes to get that posted! After moving to the new Root Server I was having problems with some of the scripts that automatically create files somewhere in the document root.

To get this working I just decided to run Apache under the same user as the owner of the document root. That caused a new problem: session variables were no longer being saved! A quick chown to the new user on the temp directory used to store the session files sorted that.
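If you want to catch this before the sessions start vanishing, a check along the following lines would do it. This is a minimal Python sketch, not something I ran at the time; the function name is made up, and you would pass in whatever directory your PHP session.save_path points to along with the numeric user ID Apache now runs as.

```python
import os
import stat


def session_dir_usable(session_dir, uid):
    """Return True when session_dir is owned by uid and that owner may write to it."""
    info = os.stat(session_dir)
    owner_can_write = bool(info.st_mode & stat.S_IWUSR)
    return info.st_uid == uid and owner_can_write
```

If it returns False for the Apache user, the chown described above is the likely fix.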

Build Your Own Linux Cluster: Stage 1

Cluster Update: I’ve gone ahead and started building my Linux cluster to act as my way-over-the-top web server. I grabbed 6 identical Athlon-based systems that I wasn’t using and ripped out the hard disks (I only need the drive in the master node). These were then all networked, and I went searching for tutorials on how to set it all up.

After a bit of searching I found this tutorial, which is quite good. I used Fedora Core 3 since I already had the disks on my desk, but it works the same. The only bit I found confusing was where to add the library lines; after more searching, it turns out they go in the file /etc/ld.so.conf. I’ve booted it all up with the master and one slave node and I’m about to do a boot with all 5 slaves. After that I might do some testing to see what kind of performance I’m getting out of it.

Build your own Linux Cluster

I was reading an article in a PC mag entitled "Build your own supercomputer", but as it turns out it was mainly supercomputer history followed by talk of basic multi-core PCs that are more powerful than a $10 million Cray from the early 90s. The end of the article, though, was a lot more interesting. It was about using bog-standard PCs to create a Linux cluster to do some serious number crunching.
 
All you really need is a bunch of old PCs, a switch or two and the software to run it all. The PCs can be pretty much anything as long as they have a CPU, RAM and a network card. The next step is to make sure they are linked up to the switch or switches to optimise the communication speeds between the different boxes. After that it’s just a case of installing the software (all of which is free to download since it’s open source) and configuring the entire system.

 
I also want mine to be able to run with one machine on a UPS, set up so that if one or all of the others go down (due to either power failure or hardware failure) it will keep running as a system, and when the other systems come back online they will just boot and join the cluster again. I’ve got a lot more Googling to do to find out how to do all this and whether all of it is possible, but I’ll post more here when it gets done.

Bwa ha ha: Take that Windoze sysadmins!

After all these years of having nice wizards and configuration menus to help you set up your servers, Microsoft is releasing a server version of Windows that comes with no GUI, just a command line. Time to join us Linux sysadmins in doing it the good old fashioned way.
 
Trust me, it’s more fun this way anyway and you learn more about how things work. Plus the non-technical people around you think you’re doing something really complicated!

Nearly ready for the server move

After weeks of planning, massive reconfigurations, new backup scripts and file moves, I think I’m finally ready to move servers!
 
I’ve just got to figure out a night when we should be fairly quiet to do the move so the sites aren’t down too long. I’ve got it all planned out and should be able to do it in stages so that I don’t have to do a series of all-nighters. Plus it is the holiday season, so the disruption to the public should be minimal.
 

Just got to wait until Tuesday for a test script to run successfully and then I can go ahead and do it!

Printing problems? That’ll be the power cuts

We’ve got a system, used for most of our online projects, that allows companies to sign up for a service and then automatically creates a PDF invoice to send them. The invoice is then downloaded by the office server and printed out once an hour.
 
We were going through the last day’s invoices and noticed that there was a big gap in invoice numbers. I checked the database and found that a load hadn’t been printed. This was probably because the power had been off: the download server was running but the print server was still turned off, so the print jobs were just disappearing into nowhere.
 
Luckily we have another networked printer that I can use, and if the download server can’t contact it directly, it will keep the job in its print queue until the printer comes back up again. Sorted. Just got to make sure we don’t run out of toner or nothing’ll get printed until we get more.

Phew! Server moves are fun!

Well, I’ve got around to setting up the development servers in the office, distributing the development files and databases across them and setting up the backup routines. I’ve also changed all the files to take the server move into account, so all I need to do now is set up the new servers, move across and test.

 
The move and testing are pretty simple since it’ll just require taking down the database at a certain point, moving the files across and bringing it live again. As long as I do this at stupid o’clock, nobody should lose any info or even notice.
 
The main problem is the DNS shift. If I was moving to another host then this would take 24 hours and result in serious downtime, but since the same DNS servers are being used and there is just an IP address change, it should go through a lot quicker. I’ll still do it at night to minimise disruption though. The only problem is that changing the IP address a domain points to means going through a two-minute process in the control panel, and we’ve got about 4,000 domains! I think I might need a sleeping bag and a lot of coffee that night!