Category Archives: Technical How-to

Cloning Linux on VMware

When you clone a Linux virtual machine on VMware, or deploy one from a template, you need to take a few extra steps on the cloned machine to make it work. This applies specifically to Red Hat based distributions such as CentOS. The obvious settings to change are the IP address and hostname, but changing those is not enough: the network interface configuration also needs to be updated.

When you clone a Linux machine, the hardware (MAC) address of the NIC changes, which is correct: the cloned machine should never have the same MAC address as the source. However, the new MAC address is assigned to eth1, not eth0. The eth0 entry still carries the MAC address of the source machine, although it is commented out in udev’s persistent network rules file, so it is not active.

If you clone a Linux machine and find that the network does not work, it is probably because you assigned the new IP address to eth0, which is not active. You can use eth1 and assign the new IP address to that interface. However, I usually prefer to keep eth0 so the configuration stays clean and simple. You can easily switch back to eth0 by editing the file /etc/udev/rules.d/70-persistent-net.rules. Find the lines that start with SUBSYSTEM, remove or comment out the line for eth1, uncomment the line for eth0, and replace the ATTR{address} value for eth0 with the MAC address from eth1. Here’s a sample edited file:

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:60:66:88:00:02", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x100f (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:60:66:88:00:02", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
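
If you clone machines often, the same edit can be scripted. Here is a minimal sketch, assuming the clone's rules file contains an active (uncommented) eth1 rule that carries the new MAC address. It takes a slightly different route to the same result: it drops the stale eth0 rule and renames the eth1 rule to eth0. The function name fix_persistent_net is my own, not part of udev, and a backup copy is written next to the file.

```shell
# Sketch only: fix_persistent_net is a hypothetical helper, not a udev tool.
# Usage: fix_persistent_net /etc/udev/rules.d/70-persistent-net.rules
fix_persistent_net() {
    rules_file=$1

    # The active eth1 rule holds the MAC address VMware gave the clone.
    new_mac=$(sed -n '/^SUBSYSTEM==.*NAME="eth1"/s/.*ATTR{address}=="\([^"]*\)".*/\1/p' "$rules_file" | head -n 1)
    if [ -z "$new_mac" ]; then
        echo "no active eth1 rule found in $rules_file" >&2
        return 1
    fi

    # Keep a backup, then rewrite the file from it.
    cp "$rules_file" "$rules_file.bak"

    # Delete every eth0 rule (commented or not), then rename eth1 to eth0.
    sed -e '/NAME="eth0"/d' \
        -e '/^SUBSYSTEM==.*NAME="eth1"/s/NAME="eth1"/NAME="eth0"/' \
        "$rules_file.bak" > "$rules_file"

    # Print the MAC so it can be reused for HWADDR in ifcfg-eth0.
    echo "$new_mac"
}
```

Run it as root against the real rules file only after trying it on a copy. The printed MAC address is the value that HWADDR in ifcfg-eth0 has to match.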

Now edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file to make sure that the DEVICE is eth0, the BOOTPROTO is static, and the HWADDR matches the ATTR{address} for eth0 in the 70-persistent-net.rules file.
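
The ifcfg-eth0 edit can be sketched the same way. This assumes GNU sed (for the -i option) and a simple KEY=value file like the ones CentOS generates. The helper name set_ifcfg_hwaddr is hypothetical, and it leaves BOOTPROTO and the IP settings for you to check by hand.

```shell
# Sketch only: set_ifcfg_hwaddr is a hypothetical helper.
# Usage: set_ifcfg_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 00:60:66:88:00:02
set_ifcfg_hwaddr() {
    ifcfg=$1
    mac=$2

    if grep -q '^HWADDR=' "$ifcfg"; then
        # Replace the MAC address inherited from the source machine.
        sed -i "s/^HWADDR=.*/HWADDR=$mac/" "$ifcfg"
    else
        # No HWADDR line yet, so append one.
        echo "HWADDR=$mac" >> "$ifcfg"
    fi

    # Make sure the file describes eth0.
    sed -i 's/^DEVICE=.*/DEVICE=eth0/' "$ifcfg"
}
```

The MAC address passed in should be the same one used in ATTR{address} for eth0 in 70-persistent-net.rules.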

Restart the network by issuing the command “service network restart” or you can reboot the system.

NetApp Fpolicy Tool to Block W32/XDocCrypt.a Malware

There is a virus going around called W32/XDocCrypt.a that is wreaking havoc on Excel and Word files located on the network. The virus renames files to .scr.

If you do not have the latest cure and your files are stored on NetApp filers, you can prevent the virus from infecting your files by using the fpolicy tool on NetApp. The McAfee vscan for NetApp storage does not work very well.

On the NetApp filers, verify that fpolicy is enabled by issuing this command:

options fpolicy

If it’s not enabled, enable it:

options fpolicy.enable on

Then run the following commands:

fpolicy create scrblocker screen
fpolicy ext inc set scrblocker scr
fpolicy monitor set scrblocker -p cifs create,rename
fpolicy options scrblocker required on
fpolicy enable scrblocker -f

If you are using vfilers, run the above commands on the vfiler. Also, do not specify any volume, because that does not work.

The fpolicy tool can also be a great tool for blocking unwanted files, such as mp3 files, on your filers. For more information on fpolicy, go to this website.
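
To reuse the scrblocker recipe for another extension such as mp3, a small shell helper on your workstation can print the same five commands with a different policy name and extension. This is just a sketch: fpolicy_block_cmds and the policy name mp3blocker are names I made up, and the function only prints text. It does not talk to the filer.

```shell
# Sketch only: fpolicy_block_cmds is a hypothetical helper that prints
# the fpolicy command sequence used above for a given policy and extension.
# Usage: fpolicy_block_cmds mp3blocker mp3
fpolicy_block_cmds() {
    policy=$1   # policy name of your choice, e.g. mp3blocker
    ext=$2      # file extension to block, without the dot

    printf 'fpolicy create %s screen\n' "$policy"
    printf 'fpolicy ext inc set %s %s\n' "$policy" "$ext"
    printf 'fpolicy monitor set %s -p cifs create,rename\n' "$policy"
    printf 'fpolicy options %s required on\n' "$policy"
    printf 'fpolicy enable %s -f\n' "$policy"
}
```

Review the printed commands, then paste them into the filer (or vfiler) console.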


Internal Web Analytics

There are a lot of tools out there that can analyze web traffic for your site. Leading the pack is Google Analytics. But what if you want statistics for your internal website, and you don’t necessarily want to send this information to an external provider such as Google? Enter Piwik. Piwik is very much like Google Analytics but can be installed on your internal network. The best part is that it’s free.

Since Piwik is a downloadable tool, you need a machine running a web server and MySQL. You can install it on your existing web server or on a separate one. I installed it on a separate CentOS machine. I found the installation very easy. In fact, you just unzip a file and put the extracted files in a web directory. The rest of the installation is done via the browser. If a component is missing on your server (in my case, the PDO extension), the installer will tell you how to install it. Pretty neat.

After installing the server, you just need to put a small JavaScript snippet on the pages you want to track. That’s it. Piwik will start gathering statistics for your site.

I also evaluated Splunk and its companion app, Splunk App for Web Intelligence, but I found that it is not ready for prime time. There are still bugs, which is no surprise given that it is still in beta. When I evaluated it, it wasn’t even able to get usable information from Apache logs.

I’ve been using AWStats to extract statistics for internal websites for years. It has been very dependable, but its results are sometimes inaccurate. The open source Piwik web analytics tool provides accurate statistics and is the best tool I’ve used so far.

Performing Maintenance Tasks on VMware Hosts

There are times when you need to perform hardware maintenance (such as adding a new network interface card [NIC]) on VMware hosts, or a host simply disconnects from vCenter. In those cases, the only way to perform the maintenance is to shut down or reboot the host. To minimize disruption, here’s the procedure I use:

  1. Run the vSphere client on your workstation. Do not use the vSphere client on one of the servers, because that server might be a virtual machine (VM) that will go down.
  2. Using the vSphere client, connect to the VMware host, *not* the vCenter server.
  3. Log in as user root.
  4. Shut down all the VMs by right-clicking each VM and selecting Power, Shutdown Guest. This is faster than logging in to each machine using RDP and shutting it down. The VMware Tools have to be up to date, though, or else the Shutdown Guest option will be grayed out. If Shutdown Guest is grayed out, you need to log in to the VM to shut it down. Performing “Power Off” on the VM should be the last resort.
  5. Once all the VMs are powered down, right-click the VMware host and select Enter Maintenance Mode.
  6. Go to the console of the VMware host, and press Alt-F1 to get the login prompt (Alt-F11 returns to the status screen).
  7. Log in as root.
  8. Issue the command “shutdown -h now” to power down the host. If you just want to reboot, issue the command “shutdown -r now”.
  9. Wait until the machine is powered off.
  10. Perform maintenance.
  11. Power on the VMware host. Watch the screen for any problems. VMware’s equivalent of the Windows blue screen is the purple screen; if you see one, something is very wrong.
  12. When the VMware host is all booted up, go back to your workstation, and connect using vSphere client to the VMware host.
  13. Right-click the VMware host first, and select “Exit Maintenance Mode”.
  14. Power on all the VMs.
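
If you prefer the command line for step 4, the guest shutdowns can be sketched as a shell loop run on the host itself. This assumes the vim-cmd tool is available on the host (on classic ESX service consoles it is typically installed as vmware-vim-cmd, which you can pass as the first argument). The function names vm_ids and shutdown_all_guests are my own, and the ID parsing is deliberately simple: it takes any line whose first column is a number.

```shell
# Sketch only: vm_ids and shutdown_all_guests are hypothetical helpers.

# Read "vim-cmd vmsvc/getallvms" output on stdin and print the numeric
# Vmid column, skipping the header line.
vm_ids() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }'
}

# Ask every registered VM for a clean guest shutdown.
# Usage: shutdown_all_guests [vim-cmd | vmware-vim-cmd]
shutdown_all_guests() {
    vimcmd=${1:-vim-cmd}
    "$vimcmd" vmsvc/getallvms | vm_ids | while read -r id; do
        # power.shutdown relies on VMware Tools, like Shutdown Guest in the GUI.
        "$vimcmd" vmsvc/power.shutdown "$id" </dev/null ||
            echo "VM $id: guest shutdown failed, check VMware Tools" >&2
    done
}
```

Once the VMs are down, "vim-cmd hostsvc/maintenance_mode_enter" and "vim-cmd hostsvc/maintenance_mode_exit" correspond to steps 5 and 13.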

If there are multiple VMware hosts, and vMotion is licensed and enabled (i.e. Enterprise license), you can vMotion the VMs to the other hosts and then perform the maintenance. When the host is back up, you can vMotion the VMs back to it and do the same maintenance on the other hosts.

Reinstalling a Node on a Scyld Beowulf cluster

This writeup describes how to restore a node to the cluster after its hard disk has been wiped out due to a hardware error.

I was prompted to write these instructions because one of the nodes in our cluster failed. After the hardware was replaced, I tried to put the node back into the cluster but was not able to. I tried following the instructions, to no avail. I also posted a message to the Scyld Beowulf mailing list, but I did not get any response.

Anyway, I was trying to add the node back to the cluster. Using beosetup, the new MAC address was registered as node 0. I tried to partition the disk using the beofdisk tool, then I restarted the node. Here’s the output:

# beofdisk -w -n 0

Disk /dev/hda: 4865 cylinders, 255 heads, 63 sectors/track
Old situation:
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/hda1 * 0+ 0 1- 8001 89 Unknown
/dev/hda2 1 516 516 4144770 82 Linux swap
/dev/hda3 517 4864 4348 34925310 83 Linux
/dev/hda4 0 - 0 0 0 Empty
New situation:
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/hda1 * 63 16064 16002 89 Unknown
/dev/hda2 16065 8305604 8289540 82 Linux swap
/dev/hda3 8305605 78156224 69850620 83 Linux
/dev/hda4 0 - 0 0 Empty
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd (1) to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
The partition table on node 0 has been modified.
You must reboot each affected node for changes to take effect.

# beoboot-install 0 /dev/hda
Creating boot images...
Installing beoboot on partition 1 of /dev/hda.
mke2fs 1.32 (09-Nov-2002)
/dev/hda1: 11/2000 files (0.0% non-contiguous), 268/8001 blocks
Done

rcp: /boot/boot.b: No such file or directory
Failed to copy boot.b to node 0:/tmp/.beoboot-install.mnt

After rebooting, it came out with an ERROR state on the BeoSetup window. Here’s the log:

node_up: Initializing cluster node 0 at Wed Mar 9 15:44:55 EST 2005.
node_up: Setting system clock from the master.
node_up: Configuring loopback interface.
node_up: Loading device support modules for kernel version 2.4.27-294r0048.Scyldsmp.
setup_fs: Configuring node filesystems using /etc/beowulf/fstab...
setup_fs: Checking /dev/hda2 (type=swap)...
chkswap: /dev/hda2: Unable to find swap-space signature
setup_fs: FSCK failure. (OK for RAM disks)
setup_fs: Mounting /dev/hda2 on swap (type=swap; options=defaults)
swapon: /dev/hda2: Invalid argument
setup_fs: Failed to mount /dev/hda2 on swap (fatal).

So, to solve this problem, you have to perform two extra steps before rebooting the node. After executing beoboot-install, use bpsh to run mke2fs -j on the data partitions and mkswap on the swap partition, such as:

# bpsh 0 mke2fs -j /dev/hda3
# bpsh 0 mkswap /dev/hda2
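
Putting the whole sequence together, here is a minimal sketch of the four commands in order, using mke2fs (the formatter whose banner appears in the beoboot-install output above). The function name reinstall_node and the DRY_RUN switch are my own additions. By default it only prints the commands so you can review them, and it runs them when DRY_RUN=0 is set.

```shell
# Sketch only: reinstall_node is a hypothetical wrapper around the
# commands shown in this post. Run it on the master node as root.
# Usage: reinstall_node 0 /dev/hda /dev/hda3 /dev/hda2
reinstall_node() {
    node=$1        # node number, e.g. 0
    disk=$2        # whole disk on the node, e.g. /dev/hda
    data_part=$3   # data partition, e.g. /dev/hda3
    swap_part=$4   # swap partition, e.g. /dev/hda2

    # With DRY_RUN unset or set to 1, print each command instead of running it.
    if [ "${DRY_RUN:-1}" = 1 ]; then run=echo; else run=; fi

    $run beofdisk -w -n "$node"                 # write the partition table
    $run beoboot-install "$node" "$disk"        # install the boot image
    $run bpsh "$node" mke2fs -j "$data_part"    # create the ext3 data filesystem
    $run bpsh "$node" mkswap "$swap_part"       # write the swap signature
}
```

Reboot the node after the last command finishes.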