Linux Sysadmin Blog

Cloud Computing Scenarios for Database Servers

We’ve been investigating the possibilities of using cloud computing for our clients. Amazon EC2 in particular has the potential to be really effective in offering flexible, pay-as-you-go computing. From my own perspective I have been looking at how to use cloud computing in combination with MySQL, and I must say that I’m a bit sceptical about the effectiveness of cloud computing in replacing the primary database server. First off, there does not seem to be much performance data for this type of installation. Can a cloud server really offer the I/O performance necessary to replace a dedicated database server? And even if the performance is equal, what is the main advantage? Web sites are usually scaled by adding more servers, but that approach only works for database servers when clusters are used. So in what other scenarios does cloud computing give us an edge?

Temporary reporting servers

Create a one-time copy of an existing production database server to run specific heavy reports. This is ideal for monthly reports since the server only needs to be up and running for several hours per month.

Backup database server

This is a backup solution where the server is only allocated once there is a problem with the primary server, which makes a lot of sense because the client only pays for the server once it is used. One downside to this scenario is that the server has to be created and loaded with the latest backup, which will result in a decent amount of downtime, but at least all of this can be automated. A bigger problem is the loss of all data written since the latest backup. For our high availability sites we have a standby database server replicating all changes from the master so we can switch over at a moment’s notice without losing any data.

Migrations

Performing a migration or a system upgrade usually brings some downtime. Promoting a standby system to primary leaves that system as a single point of failure, so it makes sense to create a temporary standby of the standby.

Development branches and testing environments

For development branches we usually only need an extra database for a short amount of time although, truth be told, those databases are generally not very large, so we tend to put them on the same development database server anyway. The same is true for testing and QA. These activities usually occur in cycles, which makes them very attractive targets for cloud-based servers.

Alternative data center

Yes, it happened to us once that our data center went offline due to a very heavy attack. Instead of finding another data center for these eventualities it could be useful to have cloud-based backup servers defined. However, this requires the extra effort of keeping these instances up to date. Additionally, DNS caching will stop the switch from being instantaneous. A geographical load balancing solution would be the answer to that, but at that point the cost of preparing for this eventuality has to be compared to the loss due to downtime.

Cisco ASA 5505: Active/Standby Failover Configuration

The ASA 5505 is the smallest (and cheapest) of the current Cisco hardware security appliances. Still, with the proper software license (Security Plus, for example) we can use the ASA5505 to set up rather complex solutions. This post shows how to set up a pair of ASA5505s in a failover configuration, a solution that can be very useful in a small office where we want high availability and can’t tolerate a failure of our frontend firewall.

Prerequisites

Before even starting, let’s check that our ASA5505s are running the appropriate software license. For example, the sh ver command will output something like this:

sh ver
...
Licensed features for this platform:
Maximum Physical Interfaces  : 8
VLANs                        : 20, DMZ Unrestricted
Inside Hosts                 : Unlimited
Failover                     : Active/Standby
VPN-DES                      : Enabled
VPN-3DES-AES                 : Enabled
VPN Peers                    : 25
WebVPN Peers                 : 2
Dual ISPs                    : Enabled
VLAN Trunk Ports             : 8
AnyConnect for Mobile        : Disabled
AnyConnect for Linksys phone : Disabled
Advanced Endpoint Assessment : Disabled
UC Proxy Sessions            : 2
This platform has an ASA 5505 Security Plus license.

Look at the Failover feature: it should read “Active/Standby”. If it reads Disabled, you will have to order and install a software license upgrade from Cisco in order to be able to use the ASAs in failover.
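When several units need checking, the license test can be scripted against saved output. The sketch below is an illustration only: it assumes the sh ver output has already been captured to a text file by some other means, and the function name asa_failover_mode is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "Failover" licensed feature
# from "show version" output supplied on standard input.
asa_failover_mode() {
    # Find the feature line, e.g. "Failover      : Active/Standby",
    # drop everything up to the colon, trim trailing blanks, print it.
    sed -n '/^[[:space:]]*Failover[[:space:]]*:/{s/^[^:]*:[[:space:]]*//;s/[[:space:]]*$//;p;}' |
        head -n 1
}
```

For example, asa_failover_mode < asa1-show-version.txt (a hypothetical file name) would print Active/Standby for the output shown above.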

Cisco (as always) has very complex documentation on how to achieve this. It is hard to digest, as they try to cover all possible devices on the same page (even the obsolete PIX 500); on top of that, the ASA5505 has some particularities compared with the rest of the ASA 5500 range, and these are not very clearly explained. Hopefully this post will be more useful and simpler to follow.

First we need to understand some limitations of our devices. The ASA5505 can only perform Active/Standby failover, not Active/Active; if you need that, you will have to look at a higher-range device. It can also only perform LAN-based failover (as opposed to the old PIXes, which can use cable-based failover), and it doesn’t support Stateful Failover, meaning all active connections will be lost after a failover event. Finally, both units must have the same hardware and software configuration and the proper license, and must run in the same mode (single or multiple, transparent or routed).

Configuring the Primary Unit

For each of the IPs assigned to the interfaces of the ASA we need to allocate a secondary IP from the same network range; this will be used as the IP of the standby unit, while the main IPs will always be used by the primary (active) unit and will normally be used by the clients (as default gateways, for example). The first step is to configure the active and standby IP addresses for each data interface. The Cisco documentation is confusing here: it is not clear that on the ASA5505 this is done for each of the VLANs in use, not for the physical interfaces:

conf t
(config)#interface Vlan1
(config-if)#ip address active_addr netmask standby standby_addr

For example:

(config-if)#ip address 192.168.0.1 255.255.255.0 standby 192.168.0.2

Once we have defined all standby IPs we can move forward. We also need to define one interface that will be used for failover. You can either cross-connect this between the two ASAs or use a switch with a dedicated VLAN for it. The latter is preferred as it will more accurately detect if one ASA is down. Again, the documentation is not clear on how to do this on the ASA5505: it talks about physical interfaces, while on the ASA5505 we have to use VLANs.

The trick is to create a new VLAN without assigning any IP to the VLAN interface:

interface Vlan32
description LAN Failover Interface
no shutdown

The IP will be assigned by the failover commands. Finally, enable failover:

failover
failover lan unit primary
failover lan interface failover Vlan32
failover interface ip failover 192.168.255.1 255.255.255.0 standby 192.168.255.2

(Use an otherwise unused IP range for the failover IPs.)

Save the running config:

copy running-config startup-config

Configuring the Secondary Unit

The configuration of the secondary, standby unit is very simple as it needs only the failover interface configuration.  The secondary unit requires these commands to initially communicate with the primary unit, and get its configuration from the active unit.

As with the main ASA, we first have to define the VLAN that will be used for failover:

interface Vlan32
description LAN Failover Interface
no shutdown

Next we just have to enable failover and set this unit as secondary:

failover
failover lan unit secondary
failover lan interface failover Vlan32
failover interface ip failover 192.168.255.1 255.255.255.0 standby 192.168.255.2

After this, the active unit sends the configuration in running memory to the standby unit. As the configuration synchronizes, the messages “Beginning configuration replication: Sending to mate” and “End Configuration Replication to mate” appear on the active unit console.

Verifying the Failover Configuration

The command show failover can be used to show the status of the failover operation; the output on the active device will look similar to:

sh failover
Failover On
Failover unit Primary
Failover LAN Interface: failover Vlan32 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 5 of 250 maximum
Version: Ours 8.0(4), Mate 8.0(4)
Last Failover at: 02:28:31 CST Jan 23 2009
    This host: Primary - Active
        Active time: 2166923 (sec)
        slot 0: ASA5505 hw/sw rev (1.0/8.0(4)) status (Up Sys)
            Interface inside (10.10.10.1): Normal
            Interface outside (192.168.0.1): Normal
        slot 1: empty
    Other host: Secondary - Standby Ready
        Active time: 378 (sec)
        slot 0: ASA5505 hw/sw rev (1.0/8.0(4)) status (Up Sys)
            Interface inside (10.10.10.2): Normal
            Interface outside (192.168.0.2): Normal
        slot 1: empty
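For routine monitoring, the two per-unit status lines are usually the part of this output that matters. As a rough sketch, assuming the show failover output has been saved to a file or piped in from some collection tool, a shell function could report whether one unit is Active and the other is Standby Ready. The function name asa_failover_healthy is hypothetical, and the check only looks at the state strings shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical check: succeed only if "show failover" output on stdin shows
# one unit as Active and the other as Standby Ready.
asa_failover_healthy() {
    # Keep only the "This host:" / "Other host:" status lines.
    states=$(grep -E '^[[:space:]]*(This|Other) host:' || true)
    printf '%s\n' "$states" | grep -q -e '- Active' &&
        printf '%s\n' "$states" | grep -q -e '- Standby Ready'
}
```

It returns a zero exit status for the sample output above and a non-zero status if either state is missing, so it can be used directly in an if statement or a cron job.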

Finally, you will probably want to test the failover functionality and maybe tune the failover triggers, but that is a topic for a future post. I hope you found this post useful and that it helped explain the steps needed to configure Active/Standby failover on the ASA5505.

HowTo Recover From a Corrupt Rpm Database

Red Hat based systems (RHEL/Fedora/CentOS) use rpm/yum to install and upgrade packages. If the rpm database gets corrupted, it breaks rpm and every tool that relies on it, such as yum. We have seen this show up either as errors on the console or, most frequently, as commands that simply hang without performing any rpm-related operation (even simple ones like rpm -qa or yum update). You will have to kill the hung processes and then rebuild your rpm database. This post shows how to rebuild the database and bring your rpm command back to life.

First we need to back up our existing database (just in case); it is located in /var/lib/rpm and, depending on your distribution, the content of the folder might look like this:

/var/lib/rpm
-rw-r--r--  1 rpm  rpm   5464064 Feb  9 04:17 Basenames
-rw-r--r--  1 rpm  rpm     12288 Feb  3 05:03 Conflictname
-rw-r--r--  1 root root        0 Aug 25 02:48 __db.000
-rw-r--r--  1 root root    24576 Aug 14 04:02 __db.001
-rw-r--r--  1 root root  1318912 Aug 14 04:02 __db.002
-rw-r--r--  1 root root   450560 Aug 14 04:02 __db.003
-rw-r--r--  1 rpm  rpm   1982464 Feb  9 04:17 Dirnames
-rw-r--r--  1 rpm  rpm   5259264 Feb  9 04:17 Filemd5s
-rw-r--r--  1 rpm  rpm     24576 Feb  9 04:17 Group
-rw-r--r--  1 rpm  rpm     28672 Feb  9 04:17 Installtid
-rw-r--r--  1 rpm  rpm     45056 Feb  9 04:17 Name
-rw-r--r--  1 rpm  rpm  31707136 Feb  9 04:17 Packages
-rw-r--r--  1 rpm  rpm    339968 Feb  9 04:17 Providename
-rw-r--r--  1 rpm  rpm    106496 Feb  9 04:17 Provideversion
-rw-r--r--  1 rpm  rpm     12288 Dec  2 08:45 Pubkeys
-rw-r--r--  1 rpm  rpm    253952 Feb  9 04:17 Requirename
-rw-r--r--  1 rpm  rpm    167936 Feb  9 04:17 Requireversion
-rw-r--r--  1 rpm  rpm     86016 Feb  9 04:17 Sha1header
-rw-r--r--  1 rpm  rpm     49152 Feb  9 04:17 Sigmd5
-rw-r--r--  1 rpm  rpm     12288 Jan 15 21:09 Triggername

Let’s back up the whole /var/lib/rpm folder to preserve its original state:

cd /var/lib
tar czvf rpmdb.tar.gz rpm

Finally, let’s rebuild the rpm database:

cd /var/lib/rpm
rm -f __db*
rpm --rebuilddb -vv
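If this has to be done on more than one machine, the backup and rebuild steps can be wrapped in a small script so that the rebuild never runs without a backup. The following is only a sketch under a few assumptions: the function names and the timestamped archive name are invented for this example, and it uses rpm's --dbpath option to point at the database directory instead of changing into it.

```shell
#!/bin/sh
# Sketch only: back up an rpm database directory, then rebuild it.
# Function names and the archive naming scheme are assumptions.

# backup_rpmdb DBDIR DESTDIR
# Archive DBDIR into DESTDIR and print the path of the archive.
backup_rpmdb() {
    dbdir=$1
    destdir=$2
    archive="$destdir/rpmdb-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C keeps paths relative, like "cd /var/lib; tar czvf rpmdb.tar.gz rpm".
    tar czf "$archive" -C "$(dirname "$dbdir")" "$(basename "$dbdir")" || return 1
    echo "$archive"
}

# rebuild_rpmdb DBDIR
# Remove the stale __db.* files and rebuild the database indexes.
rebuild_rpmdb() {
    dbdir=$1
    rm -f "$dbdir"/__db*
    rpm --dbpath "$dbdir" --rebuilddb -vv
}
```

Run as root, it could be called as: backup_rpmdb /var/lib/rpm /root && rebuild_rpmdb /var/lib/rpm. The && means the rebuild is skipped if the backup fails.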

After this, your rpm commands should be working again (just try a yum update or an rpm query like rpm -qa php). If this does not fix it, check the error messages from the commands above or in your system logs.

Upgrade to Java SE 6 Update 12 on Fedora 10

After our ASA units were updated to the latest version of ASDM, my Java client would no longer connect to ASDM. An upgrade to the latest version of Java was in order. Since the Fedora yum repository does not yet offer the latest version of Java, I downloaded the latest rpm variant of the JDK from http://java.sun.com/javase/downloads/index.jsp

The install steps are:

Grant executable permission to the installer file

chmod +x jdk-6u12-linux-i586-rpm.bin

Run installer file

./jdk-6u12-linux-i586-rpm.bin

Rename symbolic links pointing to old java programs

cd /etc/alternatives
mv java java_old
mv javaws javaws_old
mv keytool keytool_old

Create new symbolic links

cd /etc/alternatives
ln -s /usr/java/latest/bin/java java
ln -s /usr/java/latest/bin/javaws javaws
ln -s /usr/java/latest/bin/keytool keytool
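The rename and relink steps above follow the same pattern for each tool, so they can be expressed as one loop. This is a minimal sketch, not a replacement for the distribution's own tooling: the function name switch_java_links is made up, and both directories are passed as arguments so the logic can be tried on a scratch directory before touching /etc/alternatives.

```shell
#!/bin/sh
# Sketch: repoint the java, javaws and keytool links in one pass.
# switch_java_links ALTDIR JAVABIN
#   ALTDIR   directory holding the links (/etc/alternatives in this post)
#   JAVABIN  directory with the new binaries (/usr/java/latest/bin in this post)
switch_java_links() {
    altdir=$1
    javabin=$2
    for tool in java javaws keytool; do
        # Keep the old link, if any, under a per-tool name so that no
        # backup overwrites another one.
        if [ -e "$altdir/$tool" ] || [ -L "$altdir/$tool" ]; then
            mv "$altdir/$tool" "$altdir/${tool}_old" || return 1
        fi
        ln -s "$javabin/$tool" "$altdir/$tool" || return 1
    done
}
```

With the paths used in this post the call would be: switch_java_links /etc/alternatives /usr/java/latest/bin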

Verify that new java version is installed

javaws
Java(TM) Web Start 1.6.0_12 
Usage: javaws [run-options]   
  javaws [control-options]

Extending the Slow Query Log

Andy recently posted some very good links to videos on how to optimize your web site. Although I spend more time optimizing the database, you always have to go where the performance is actually lost. For MySQL the place to check for performance issues is the slow query log, which I have mentioned in earlier posts. The limitation of this log is that a query has to take at least one second to appear in it. This skips over queries that each take less than a second but are executed thousands, if not millions, of times. Taken together, these queries might have just as much of a performance impact as queries that last several seconds each.

The article below shows how to patch the slow query log to track queries that last less than a second. Obviously you don’t want to have this running continually in production because the amount of logging would be enormous, but in test environments or for a limited time in production this can be very useful. Be prepared to analyse some huge amounts of data though.

http://www.mysqlperformanceblog.com/2006/09/06/slow-query-log-analyzes-tools/
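To get a first impression of such a log, the entries can be grouped by statement and ranked by total time. The sketch below is a rough illustration and not one of the tools from the linked article: the function name summarize_slow_log is made up, it assumes the usual "# Query_time:" header line before each entry, it only looks at the first line of each statement, and it folds every run of digits to N so that queries differing only in numeric values are counted together.

```shell
#!/bin/sh
# Sketch: rank statements in a MySQL slow query log by total query time.
# Output columns (tab separated): total seconds, executions, statement.
summarize_slow_log() {
    LC_ALL=C awk '
        # Remember the time reported for the entry that follows.
        /^# Query_time:/ { qt = $3; pending = 1; next }
        # Skip other header lines and bookkeeping statements.
        /^#/ || /^SET timestamp=/ || /^use / { next }
        # First statement line after a Query_time header.
        pending {
            q = $0
            gsub(/[0-9]+/, "N", q)   # fold numeric values
            count[q]++
            total[q] += qt
            pending = 0
        }
        END {
            for (q in count)
                printf "%.6f\t%d\t%s\n", total[q], count[q], q
        }
    ' "$@" | LC_ALL=C sort -rn
}
```

Called as summarize_slow_log /path/to/slow.log (a placeholder path), it lists the statements with the largest accumulated time first, which is where frequently executed sub-second queries become visible.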

Cisco ASA 5505 ASDM Error: Unconnected Sockets Not Implemented

If you run a version of the Java JRE newer than v6 Update 10 (the latest at this time is v6 Update 12) and see this error when trying to connect to a Cisco ASA ASDM interface:

ASDM is unable to continue loading. Click OK to exit from ASDM
Unconnected sockets not implemented.

then you are probably running an older version of the ASDM software (6.1.5, released on 09-OCT-2008, and older ones have this issue) and you need to upgrade in order to fix it. Version 6.1.5.51 (the latest one available at this time) or any newer version will work as expected. Cisco released this version on 16-DEC-2008 to fix this issue.

The upgrade is simple and you can use my step-by-step guide for this; just keep in mind that you will have to reboot to activate the upgrade. After this, your ASDM should be working again.

AdBard - Don’t Die!

So a couple of weeks ago I touted the AdBard folks and their FLOSS-oriented ad network system. Today we received the following email from them. What is worse is that the ads have already stopped appearing on the site.

It looks like they will be teaming up with the Free Software Foundation.

This email details your current earnings from your participation in the Ad Bard Network.  We are also excited to announce major changes to our network, including general improvements and the direct participation of the Free Software Foundation.  However, our planned changes require that we temporarily suspend the entire network for the month of February.  As a member you will be receiving payment for your outstanding earnings balances, and then if you elect to participate in our newly structured network you will be required to sign up again.  We apologize for the inconvenience of this, but hope that it helps achieve the end goal of increasing the earnings of member websites and improving the desirability of the network for advertisers.

Statistics:

Linux System Admin Blog average ad impressions:
Hourly:           nn
Daily:           nnn
Monthly:       nnnnn

Outstanding earnings: $nnn

Due to the upgrade in process, please remove the JavaScript snippet from your website at this time.  No further advertisements will be displayed through this snippet, and before the end of February 2009 the handling for this javascript will be disabled and could result in an error on your website.  If you will require more than 2 weeks to remove the snippet, please send us an email and we will work with you as necessary.  A new snippet will be provided for the new website.

We will be issuing payments for all outstanding earnings through PayPal or via a check.  If your payment information has changed, please respond to this email with updated details.  Please be sure to include your Ad Bard username in your email.

If you have converted earnings into unused coupons, please reply to this email with details so that we are sure to properly credit you back.

Details about our enhanced network will be posted to http://adbard.net/ over the upcoming month.  You will also be receiving an update via email when it is possible to sign up for the new network.  A few of the planned changes include a limited number of advertising slots, the ability to participate in approving which FLOSS-appropriate advertisements are accepted, and improvements to our payment algorithms. The Free Software Foundation is actively advising us in this effort, and will help campaign for the new network once it goes live.

Thank you for your patience and participation in our evolving network. We hope that you like the changes that will be happening this month, and that you will continue to participate.

Cheers, -Jeremy

– Jeremy Andrews 877-875-8824 x100 Tag1 Consulting, Inc.

Web Site Slow? To Improve Performance - Sys Admins May Be of Little Help - Call the Designer

Here are two videos by Steve Souders, former Chief Performance Yahoo, currently at Google. He determined that 95% of the wait on loading the Yahoo page comes after the initial Apache response is sent to the browser. Doing more research, he determined that roughly an 80/20 split holds for most popular web sites. It has enlightened me as to where to look to determine why web sites are slow.

Although this video has been around for a while, it’s a must for system admins who bang their heads trying to squeeze performance out of the 10 to 20 percent of the user wait time they are responsible for.

Steve’s Performance golden rule:

80% to 90% of the end-user response time is spent on the front end.  Start there.

Here is the older Yahoo video 

Steve Souders: “High Performance Web Sites: 14 Rules for Faster Pages” @ Yahoo! Video

Followed by the more recent Google talk

Check out the YSlow Firefox plug-in as well as the IBM Page Detailer; it’s the product he uses to map out the front-end work of the browser. Both are great tools to start thinking about why your website is slow or why pages are loading slowly.

Enforce SSL for Google Services

Most Google services are now available over encrypted SSL connections. Google Apps now offers the option to enforce SSL for most of its services. Here is the overview:

Email - Yes.
Calendar - Yes.
Docs - Yes.
Sites - Yes.
Chat - Yes. SSL supports Chat in Gmail. The Google Talk Client is always over a secure connection (TLS).
Video - Not available.
Start Page - Not available. This includes start page gadgets for email, chat, calendar, and docs account.

To enable SSL enforcement in Google Apps services, log in as an administrative user for your Google Apps hosted domain, click on the Domain Settings tab and check the SSL checkbox.

Google Cutting Back on Apps Services.

The good days of having a customized start page for Google Apps are over. As of December 1st 2008 Google has ceased offering the Start Page service to all new Google Apps customers. This isn’t the only cutback Google has made: only 100 e-mail accounts are now available with the free Standard version of Google Apps. I’m predicting that soon Google will make the Documents service available only to new paid customers of the Google Apps suite.

Better stock up on those free and wonderful Google hosted services while they are still available.