GBiB: Deploying on VMware ESXi

From genomewiki
Revision as of 15:32, 3 August 2015 by David Trudgian

At UT Southwestern, the BioHPC high performance computing group provides services for a range of users, many of whom use the public UCSC Genome Browser regularly. Because of firewall restrictions on campus, it is not straightforward to set up a track hub that is accessible from the public browser site. Some data is also private and cannot leave the University. To allow use of the UCSC browser on campus, we deployed GBIB on a VMware ESXi host. This is simpler than configuring and maintaining a stand-alone mirror installation.

GBIB is distributed as a VirtualBox VM, which cannot be directly imported into ESXi. Although VirtualBox can export virtual machines in the standard OVF/OVA format, the exported appliance will not import into ESXi without manual edits to its configuration (detailed below). GBIB itself also needs some further configuration changes to allow access as a shared resource, since it is designed to be run on and accessed from a single machine.

Obtain and convert the GBIB image into an OVF appliance

You will need a computer with VirtualBox 4.x installed to convert the GBIB image to OVF for import into ESXi. We used a machine running Red Hat Enterprise Linux 6.6 with VirtualBox 4.3.30.

  • Extract the image from the .zip file.
  • Start VirtualBox and open the browserbox.vbox virtual machine definition using the Machine->Add menu option. Do not start the VM. If the VM is started the dynamic virtual disk will grow, increasing time and space needed for conversion.
  • Export the browserbox machine in OVF format. Choose File->Export Appliance from the VirtualBox menu.
  • Make sure to change the output filename so that it ends with an .ovf extension. We need to edit the configuration before import into ESXi, so we do not want the disk images and configuration combined into a single .ova archive.
  • Choose OVF 1.0 format.

The export process took approximately 4 minutes on our system. It creates one .ovf configuration file and two .vmdk virtual disk images.
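
The same export can probably also be scripted with VirtualBox's VBoxManage command-line tool rather than the GUI. This is not part of the procedure we followed; the sketch below only assembles the two commands (registervm, then export with the --ovf10 option) and refuses an output name that does not end in .ovf. The function names and the paths passed to them are hypothetical, and the VM name "browserbox" is assumed to match the name shown in VirtualBox after the machine is added.

```python
import subprocess
from pathlib import Path


def build_export_commands(vbox_file, ovf_out, vm_name="browserbox"):
    """Return the VBoxManage commands that register the VM and export it as OVF 1.0.

    vbox_file: path to the extracted browserbox.vbox definition (assumed location).
    ovf_out:   output file; it must end in .ovf so that the configuration and
               disk images are written as separate files, not one .ova archive.
    """
    ovf_out = Path(ovf_out)
    if ovf_out.suffix.lower() != ".ovf":
        raise ValueError("output file name must end in .ovf, got %s" % ovf_out.name)
    return [
        # Register the machine definition without starting the VM.
        ["VBoxManage", "registervm", str(Path(vbox_file).resolve())],
        # Export the registered machine in OVF 1.0 format.
        ["VBoxManage", "export", vm_name, "--output", str(ovf_out), "--ovf10"],
    ]


def run_export(vbox_file, ovf_out):
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_export_commands(vbox_file, ovf_out):
        subprocess.run(cmd, check=True)
```

Calling run_export("/data/gbib/browserbox.vbox", "/data/gbib-export/browserbox.ovf") (hypothetical paths) would be the scripted equivalent of the menu steps above. Check the option names against the VBoxManage help output of your VirtualBox version before relying on it.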

Edit the OVF for VMware ESXi compatibility

If you attempt to import the OVF directly into ESXi using the vSphere client, you will receive various error messages. ESXi does not recognise some of the virtual hardware configured in the exported appliance. The configuration file 'browserbox.ovf' must be edited as follows:

  • Change the value of the vssd:VirtualSystemType entry to vmx-7:
 <vssd:VirtualSystemType>vmx-7</vssd:VirtualSystemType>
  • Change the virtual SATA controller to SCSI. Find the <Item> entry describing a 'SATA Controller' and modify it so that it describes an lsilogic SCSI controller, as below. Do not change the InstanceID of the controller; keep the existing value.
 <Item>
  <rasd:Address>1</rasd:Address>
  <rasd:Caption>scsiController0</rasd:Caption>
  <rasd:Description>SCSI Controller</rasd:Description>
  <rasd:InstanceID>5</rasd:InstanceID>
  <rasd:ResourceSubType>lsilogic</rasd:ResourceSubType>
  <rasd:ResourceType>6</rasd:ResourceType>
 </Item>

The OVF file contains other references to SATA and other VirtualBox-specific configuration. However, the VMware import seems to ignore these.
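
If you need to repeat the conversion (for example, for each new GBIB release), the two edits can be applied with a small script instead of by hand. The following is a minimal sketch, not the method we used: it works on the OVF as plain text, assumes the file uses the vssd: and rasd: prefixes and plain <Item> tags shown above, and assumes the controller entry contains the text 'SATA Controller'. The function names are made up for illustration. The controller's instance ID element and any fields not listed above are left untouched.

```python
import re

# Field values taken from the SCSI controller entry shown above.
SCSI_FIELDS = (
    ("Address", "1"),
    ("Caption", "scsiController0"),
    ("Description", "SCSI Controller"),
    ("ResourceSubType", "lsilogic"),
    ("ResourceType", "6"),
)


def _set_field(item, tag, value):
    """Replace the text of one <rasd:TAG> element inside an <Item> block."""
    pattern = r"(<rasd:%s>)[^<]*(</rasd:%s>)" % (tag, tag)
    return re.sub(pattern, lambda m: m.group(1) + value + m.group(2), item)


def patch_ovf(text):
    """Return OVF text with the system type and SATA controller rewritten."""
    # Edit 1: set the virtual system type to vmx-7.
    text = re.sub(
        r"(<vssd:VirtualSystemType>)[^<]*(</vssd:VirtualSystemType>)",
        lambda m: m.group(1) + "vmx-7" + m.group(2),
        text,
    )

    # Edit 2: rewrite only the <Item> block that describes the SATA controller.
    def fix_item(match):
        item = match.group(0)
        if "SATA Controller" not in item:
            return item  # leave every other hardware item as exported
        for tag, value in SCSI_FIELDS:
            item = _set_field(item, tag, value)
        return item

    return re.sub(r"<Item>.*?</Item>", fix_item, text, flags=re.DOTALL)


def patch_file(src, dst):
    """Read src, patch it, and write the result to dst (keep src as a backup)."""
    with open(src, encoding="utf-8") as handle:
        patched = patch_ovf(handle.read())
    with open(dst, "w", encoding="utf-8") as handle:
        handle.write(patched)
```

Writing to a separate output file keeps the original export available for comparison if the import still fails. Note that the .ovf refers to the .vmdk files by name, so the patched file should stay in the same directory as the disk images.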

Import the OVF appliance into ESXi

Using the VMware vSphere client application, you can now import the machine to your ESXi server or cluster.

  • Choose 'File->Deploy OVF Template' from the menu.
  • Select the OVF file that you edited above.

At the 'Disk Format' step you will be given the option of importing disk images in thick-provisioned or thin-provisioned format. The GBIB VirtualBox appliance uses a 2TB dynamic (thin provisioned) image for data, so choose 'Thin Provision' unless you really wish to allocate over 2TB of space to the ESXi VM.

The appliance network adapter will map to a VM Network configured in ESXi. We will reconfigure networking later, so that the appliance uses a static IP address that can be associated with a DNS name for easy access by users.

The import itself will take some time, depending on the speed of your connection to the ESXi server.

Modifying VM virtual hardware configuration and booting

Before booting the virtual machine you should review and modify its settings to suit your expected usage. We allocated 8 virtual CPUs and 4GB RAM, which gives ample performance for several users on our systems. We also re-assigned the virtual network to a VMware vSwitch linked to 10GbE hardware.

Boot the VM and connect to it using the vSphere client console. GBIB is set to auto-login as the browser user, and will attempt to update. However, networking is not configured correctly.

Configuring a static IP address

By default, the GBIB appliance expects to be running under VirtualBox on the same computer that will be used to access it. In our situation we want the appliance to use a static IP address on the network, so that it is accessible from other machines. At the console, use vi or nano to edit the file /etc/network/interfaces and set your static IP address, gateway and DNS servers, e.g.:

# The primary network interface
auto eth0
iface eth0 inet static
address 10.0.54.38
netmask 255.255.255.0
gateway 10.0.54.254
dns-nameservers 10.0.236.1 10.0.237.1
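
A wrong gateway or netmask leaves the VM unreachable, and the only way back in is the vSphere console, so it can be worth checking the values before writing them. The sketch below is an optional helper, not part of GBIB: it uses Python's standard ipaddress module to confirm that the gateway lies inside the subnet implied by the address and netmask, then prints a stanza in the same layout as the example above. The function name is made up for illustration, and the addresses are the example values from this page.

```python
import ipaddress


def render_interfaces(address, netmask, gateway, nameservers, iface="eth0"):
    """Validate the settings and return an /etc/network/interfaces stanza."""
    # Derive the subnet from address + netmask, e.g. 10.0.54.0/24.
    network = ipaddress.ip_interface("%s/%s" % (address, netmask)).network
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError("gateway %s is not inside %s" % (gateway, network))
    for server in nameservers:
        ipaddress.ip_address(server)  # raises ValueError if malformed
    lines = [
        "# The primary network interface",
        "auto %s" % iface,
        "iface %s inet static" % iface,
        "address %s" % address,
        "netmask %s" % netmask,
        "gateway %s" % gateway,
        "dns-nameservers %s" % " ".join(nameservers),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_interfaces("10.0.54.38", "255.255.255.0", "10.0.54.254",
                            ["10.0.236.1", "10.0.237.1"]), end="")
```

The check only catches inconsistent values; it cannot tell whether the address is actually free on your network.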

If you have a hostname registered in DNS for the IP address used, set it in /etc/hostname and add the IP address and hostname pairing to /etc/hosts.

Restart the VM - it should now have network access, and you should be able to browse to the static IP address or associated DNS name to access the Genome Browser.

Note that to function correctly GBIB needs outbound access to the UCSC servers. See the GBIB documentation for the ports and hostnames that require firewall exceptions.
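
Once the firewall exceptions are in place, outbound access can be tested from the VM with a few TCP connection attempts. The sketch below is only an illustration: the hostnames and ports in ASSUMED_ENDPOINTS are our assumption of what GBIB typically contacts (MySQL, HTTP and rsync at UCSC) and must be checked against the current UCSC documentation. The function name is hypothetical.

```python
import socket

# ASSUMPTION: endpoints GBIB is believed to contact; confirm the hostnames
# and ports against the UCSC GBIB documentation before relying on this list.
ASSUMED_ENDPOINTS = [
    ("genome-mysql.soe.ucsc.edu", 3306),  # MySQL
    ("hgdownload.soe.ucsc.edu", 80),      # HTTP
    ("hgdownload.soe.ucsc.edu", 873),     # rsync
]


def check_endpoints(endpoints, timeout=5.0, connect=socket.create_connection):
    """Try a TCP connection to each (host, port) pair.

    Returns a dict mapping each pair to None on success, or to the error
    message if the connection failed (DNS failure, timeout, refusal).
    """
    results = {}
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout)
        except OSError as exc:
            results[(host, port)] = str(exc)
        else:
            conn.close()
            results[(host, port)] = None
    return results


def report(results):
    """Print one line per endpoint."""
    for (host, port), error in sorted(results.items()):
        status = "ok" if error is None else "FAILED: %s" % error
        print("%s:%d %s" % (host, port, status))
```

Running report(check_endpoints(ASSUMED_ENDPOINTS)) at the console shows which connections the firewall still blocks. A successful TCP connection only shows that the port is reachable, not that the service behind it is working.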

Further Steps

At this point GBIB should be available for use, hosted on your ESXi systems. Access for maintenance etc. is via the vSphere client console. Depending on your needs you might want to:

  • Install VMware Tools, enabling use of paravirtualized networking drivers, access to performance statistics etc.
  • Enable mirroring of commonly used tracks.
  • Create an administrative user other than the standard 'browser' user.
  • Reset the browser and root passwords.
  • Restore the default /etc/ssh/sshd_config configuration to allow remote administration via ssh.
  • Disable auto login for the browser user by editing /etc/init/tty1.conf with reference to tty2.conf.

These steps depend on your ESXi setup, and administrative / security needs. They are likely to differ greatly between sites so are not documented here.