
Troubleshooting EMC VNX Multipathing

Note
FEATURE PREVIEW: The multipathing feature is not yet complete and may change or be removed in future releases. It is included in this release so that users can try it out and provide feedback.
The Eucalyptus EMC VNX Multipathing feature requires the following to function properly:
  • Properly installed and configured Linux Device Mapper Multipathing software on both the Storage Controller and Node Controller hosts.
  • Correctly configured iSCSI path system property and related STORAGE_INTERFACES parameters in the “eucalyptus.conf” configuration file for both SC and NC.
Prerequisites for Troubleshooting Typical Multipathing Failures
Before you start diagnosing the problems with multipathing, make sure you set the proper logging level on both SC and NC machines, so that you can get detailed failure logs. To do that:
  • Set the “cloud.euca_log_level” system property to “DEBUG”
  • Uncomment the “LOGLEVEL=DEBUG” entry in the “eucalyptus.conf” file on the NC, and then restart the NC service
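The NC-side step above can be sketched as a small shell helper. This is only an illustration under assumptions: “enable_nc_debug_log” is a hypothetical helper name, and the “euca-modify-property” invocation, the “/etc/eucalyptus/eucalyptus.conf” path, and the “eucalyptus-nc” service name shown in the usage comments are assumed defaults that may differ on your installation.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the LOGLEVEL=DEBUG entry in a eucalyptus.conf file.
# Assumes the commented entry looks like "#LOGLEVEL=DEBUG" or "#LOGLEVEL="DEBUG"",
# optionally with spaces around the "#".
enable_nc_debug_log() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Strip the leading "#" from the LOGLEVEL=DEBUG line; leave other lines untouched.
    sed 's/^[[:space:]]*#[[:space:]]*\(LOGLEVEL="\{0,1\}DEBUG"\{0,1\}\)/\1/' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"   # overwrite contents, keeping the file's permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example usage (run as root; command, path, and service name are assumptions):
#   euca-modify-property -p cloud.euca_log_level=DEBUG
#   enable_nc_debug_log /etc/eucalyptus/eucalyptus.conf
#   service eucalyptus-nc restart
```

Writing through a temporary file rather than editing in place avoids depending on a particular “sed -i” variant.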
General Troubleshooting Techniques for Multipathing Failures
The following are general tips to help diagnose multipathing problems:
  • Make sure you turn on the DEBUG log level for both SC and NC so that you can get detailed information from the logs.
  • Eucalyptus calls some external Perl scripts to perform the actual iSCSI connect/disconnect operations. These scripts are:
    • /usr/share/eucalyptus/connect_iscsitarget.pl
    • /usr/share/eucalyptus/disconnect_iscsitarget.pl
    • /usr/share/eucalyptus/get_iscsitarget.pl
    The STDERR output of these scripts is logged; you can add debug code to print information to STDERR to see what happens during connection or disconnection operations.
  • The iscsiadm open-iscsi initiator command line tool can help you get the current status of all the iSCSI connections in the system. For example:
    iscsiadm -m session -P 3
  • Use the multipath command line tool to see multipathing status. For example:
    multipath -ll -v 3
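When comparing the multipath status against the number of paths you configured, a quick count of path lines can help. The following is a minimal sketch, not part of Eucalyptus: “count_multipath_paths” is a hypothetical helper, and it assumes that each path line in the “multipath -ll” output contains a host:channel:target:lun tuple followed by an sdX device name. The output layout can vary between multipath-tools versions, so adjust the pattern if needed.

```shell
#!/bin/sh
# Hypothetical helper: count path lines in "multipath -ll" output read from stdin.
# Assumes each path line contains a host:channel:target:lun tuple followed by an
# sdX device name, for example "3:0:0:0 sdb 8:16 active ready running".
count_multipath_paths() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true" keeps
    # the helper's exit status at 0 in that case.
    grep -cE '[0-9]+:[0-9]+:[0-9]+:[0-9]+[[:space:]]+sd[a-z]+' || true
}

# Example usage on an NC host (run as root):
#   multipath -ll | count_multipath_paths
```

If the count is lower than the number of paths configured for the host, the checks under “Not all paths are connected” apply.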
Cannot attach volumes
This can occur for a number of reasons. To diagnose this, try some of the following:
  • Make sure you can attach a volume without using multipathing.
  • Check your SAN-related system properties to see if you have set the correct values.
  • Use a single path for the NC; for example, set “PARTITION.storage.ncpaths” to something like “192.168.25.182”. If you specify an iface in your path, like “iface0:192.168.25.182”, also make sure you have “iface0” defined with “STORAGE_INTERFACES” in the “eucalyptus.conf” configuration file on the NC.
  • If you have no problem attaching a volume with a single path, the failure may be due to an incorrect state of the Linux device mapper multipathing tool. Check that the “multipathd” service is running on the NC hosts and that “/etc/multipath.conf” is installed and configured properly (for example, copy the example configuration provided by Eucalyptus). Remember to set “user_friendly_names” to “yes” in “/etc/multipath.conf”. If you changed “/etc/multipath.conf” previously, try restarting “multipathd” and/or reloading the file. Run “multipath -ll” on the NC host and check that it returns reasonable output without any errors.
  • Check that the “PARTITION.storage.ncpaths” system property entries are correct. A typo can cause volume attach failures.
  • Make sure that the networking configuration is correct for the NC hosts. If you set the paths without specific ifaces, check that you can connect to each IP in the path using the default network interface; otherwise, check each path’s connectivity using the specified network interface.
  • Check network connectivity with all of the configured paths.
  • Check the “nc.log” log file for the string “connect_iscsitarget”. Examine the return results, especially the “stderr” output.
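The “user_friendly_names” check mentioned in the list above can be scripted. This is a minimal sketch under assumptions: “has_user_friendly_names” is a hypothetical helper, it only looks for an uncommented “user_friendly_names yes” line anywhere in the file, and the “service” commands in the usage comments assume a SysV-style init system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given multipath.conf file contains an
# uncommented "user_friendly_names yes" line (quotes around "yes" are allowed).
# Lines starting with "#" are ignored because the pattern is anchored at line start.
has_user_friendly_names() {
    grep -Eq '^[[:space:]]*user_friendly_names[[:space:]]+"?yes"?[[:space:]]*$' "$1"
}

# Example usage on an NC host (run as root; service command is an assumption):
#   has_user_friendly_names /etc/multipath.conf \
#       || echo "user_friendly_names is not set to yes"
#   service multipathd status
#   multipath -ll
```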
Not all paths are connected
Sometimes when you run “multipath -ll” on NC hosts after attaching a volume, you find that the multipath device does not have all of the paths connected. In this case, the problem could be due to one of the following:
  • One of the “PARTITION.storage.ncpaths” entries contains a mistake. If a path specified in the system property is wrong, that path cannot be connected. Make sure all paths are specified correctly.
  • The missing paths are not valid networking paths, or have networking issues. For example, if you omit the iface part of a path, can the destination (the IP part of the path) be reached via the default network interface? If you specified the iface, is it defined in the “eucalyptus.conf” configuration file, and can the destination be reached through that network interface?
  • If the paths specified are all valid, but some of them do not have connectivity, try to ping each of the specified paths on the NC hosts to check for connectivity. If there are connectivity issues, contact your network administrator.
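The per-path connectivity check described above can be sketched as follows. The helper names “split_paths” and “check_paths” are hypothetical, and the sketch assumes that the paths value is a comma-separated list of IPv4 entries of the form “IP” or “IFACE:IP”. If the iface in your paths is a label defined through “STORAGE_INTERFACES” rather than a real interface name, substitute the real interface name (for example, “eth1”) before running the ping check.

```shell
#!/bin/sh
# Hypothetical helper: split a paths value into one "iface ip" pair per line.
# Assumes comma-separated entries of the form "IP" or "IFACE:IP" (IPv4 only);
# entries without an iface are printed with "-" in the iface column.
split_paths() {
    printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        case "$entry" in
            *:*) printf '%s %s\n' "${entry%%:*}" "${entry#*:}" ;;
            *)   printf '%s %s\n' "-" "$entry" ;;
        esac
    done
}

# Ping each path once, binding to the named interface when one is given.
check_paths() {
    split_paths "$1" | while read -r iface ip; do
        if [ "$iface" = "-" ]; then
            if ping -c 1 -W 2 "$ip" >/dev/null 2>&1 </dev/null; then
                echo "OK   $ip"
            else
                echo "FAIL $ip"
            fi
        else
            if ping -c 1 -W 2 -I "$iface" "$ip" >/dev/null 2>&1 </dev/null; then
                echo "OK   $iface:$ip"
            else
                echo "FAIL $iface:$ip"
            fi
        fi
    done
}

# Example usage on an NC host (addresses are illustrative):
#   check_paths "eth1:192.168.25.182,192.168.25.183"
```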
Snapshotting failed
The Eucalyptus Storage Controller needs to attach a volume on the machine it runs on so that it can upload the snapshot to Walrus during an EC2 snapshot call. To help ensure maximum reliability for snapshotting, you should use multipathing for the SC host; this is configured with the “PARTITION.storage.scpaths” system property. When multipathing is enabled for the SC and you see a snapshot failure, multipathing may be the cause. The techniques for diagnosing SC multipathing failures are similar to those used for NC multipathing failures. For SC multipathing failures, the logs are in “/var/log/eucalyptus/cloud-*.log”, not “nc.log”, because the iSCSI connect scripts are invoked by Java code.
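To locate the relevant log entries on either host type, a simple search for the connect script name is usually enough. The following is a sketch only: “find_connect_lines” is a hypothetical helper, and the directory used for “nc.log” in the usage comments is an assumption based on the SC log location given above.

```shell
#!/bin/sh
# Hypothetical helper: print every line that mentions the iSCSI connect script,
# prefixed with file name and line number, from the given log files.
# Note: the pattern also matches "disconnect_iscsitarget" lines.
find_connect_lines() {
    # "|| true" keeps the exit status at 0 when no line matches.
    grep -Hn 'connect_iscsitarget' "$@" || true
}

# Example usage:
#   On the SC:  find_connect_lines /var/log/eucalyptus/cloud-*.log
#   On an NC:   find_connect_lines /var/log/eucalyptus/nc.log   # directory assumed
```

In the matching entries, look at the return results and the stderr output, as described for the NC case.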