Monthly Archives: November 2007

  • Xen: Administration commands

    Xen Daemons

    For the Xen server to function properly, the following daemons must be running on the host server.

    xend : This is the Xen server control daemon. This daemon must be running to start and manage the virtual machines. The administrative interface of xend is “xm”. The xend daemon can be customized using the configuration file /etc/xen/xend-config.sxp.
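
    For example, on a Red Hat-style host (the init script name is an assumption; adjust for your distribution), a change to the configuration file takes effect after the daemon is restarted:

    # vi /etc/xen/xend-config.sxp    # adjust settings such as dom0-min-mem or xend-relocation-server
    # service xend restart           # restart xend so the new settings take effect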

    xendomains : This daemon starts the guests automatically when the host server boots up. The Xen guest config file has to be placed in /etc/xen/auto/ for xendomains to pick it up and boot the guest at server boot.
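
    For example, assuming the guest's config file is /etc/xen/guest06, a symlink is usually enough to mark it for autostart, as long as the xendomains service itself is enabled:

    # ln -s /etc/xen/guest06 /etc/xen/auto/guest06    # xendomains will boot this guest at startup
    # chkconfig xendomains on                         # ensure the service runs at boot (Red Hat-style)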

    libvirtd: This daemon allows administrators to access the hypervisor and the virtual machines using management tools such as “virsh” and “virt-install”.
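
    For example, once libvirtd is running, the same guests can be listed through libvirt (the xen:/// URI is the usual connection string for a local Xen host):

    # service libvirtd status        # confirm the daemon is running
    # virsh -c xen:/// list --all    # list running and shut-off guests via libvirt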

    Here are some more xm and virsh commands for creating, managing, and troubleshooting Xen virtual machines.

    List the guest domains: The command “xm list” displays information about all running domains. The “--long” option displays full information on the running guests.

    [[email protected] ~]# xm list
    Name                                ID   Mem  VCPUs    State   Time(s)
    Domain-0                            0   2048     8     r-----  45645.8
    guest04                             4   4096     2     -b----    343.4
    guest05                             5   4096     2     -b----    896.4
    guest06                             3   4096     2     r-----   2244.1
    [[email protected] ~]#

    In the output of “xm list”:

    • Name – represents the domU guest VM name
    • ID – represents the domain ID
    • Mem – represents the amount of memory allotted to the guest domain (in MB)
    • VCPUs – represents the number of virtual CPUs assigned to a domain
    • State – represents the running state of the guest OS.
      • r – running. The domain is currently running on a CPU.
      • b – blocked. The domain is not running; it is waiting on I/O or sleeping because it has nothing to do.
      • p – paused. The domain has been paused with the “xm pause” command.
      • c – crashed. The domain has crashed and terminated abruptly.
      • d – dying. The domain is in the process of shutting down or crashing.
    • Time – represents the total run time of the domain as accounted by the Xen server.

    Starting a guest domain: The command “xm create” is used to start up a Xen guest. This command creates a guest based on the configuration file stored in /etc/xen.

    [[email protected] ~]# xm create guest06
    Using config file “/etc/xen/guest06”.
    Started domain guest06
    [[email protected] ~]#

    If the Xen configuration file is not stored in /etc/xen, the file name with its full path should be specified in the “xm create” command.
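
    For example, assuming the guest config had instead been saved as /xen/configs/guest06.cfg (a hypothetical path):

    # xm create /xen/configs/guest06.cfg       # start the guest from a config outside /etc/xen
    # xm create -c /xen/configs/guest06.cfg    # same, but also attach to the guest console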

    Shutdown a guest: The command “xm shutdown” is used to shut down a Xen guest gracefully.

    [[email protected] ~]# xm shutdown guest06

    [[email protected] ~]# xm list
    Name                                ID   Mem  VCPUs    State   Time(s)
    Domain-0                            0   2048     8     r-----  45645.8
    guest04                             4   4096     2     -b----    343.4
    guest05                             5   4096     2     -b----    896.4
    [[email protected] ~]#

    Rebooting a guest: The command “xm reboot” is used to reboot a Xen domU guest.

    Terminate a guest: The command “xm destroy” is used to immediately terminate a Xen domU guest virtual machine; it is the equivalent of pulling the power, so use it only when a graceful shutdown is not possible.
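
    For example, using the same guest06 from the earlier listings:

    # xm reboot guest06     # graceful reboot of the guest
    # xm destroy guest06    # hard power-off of the guest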

    Status monitoring

    • xm uptime       – Displays uptime for a domain
    • xm top          – Monitors a host and its domains in real time
    • xm list         – Displays domain information
    • xm info         – Displays host information
    • xm vcpu-list    – Lists domain virtual processors
    • xm network-list – Lists virtual network interfaces for a domain
    • virsh nodeinfo  – Gives basic information about the node
    • virsh vcpuinfo  – Displays information about a domain’s virtual CPUs
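
    For example, to get a quick picture of host capacity and a guest's CPU placement (guest06 as before; output omitted):

    # xm info                  # host details, including total and free memory
    # virsh nodeinfo           # CPU and memory summary of the host
    # xm vcpu-list guest06     # shows which physical CPUs the guest's VCPUs run on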

    Troubleshooting

    • xm console     – Attaches to a domain console
    • xm dump-core   – Dumps the core of a specific domain
    • xm dmesg       – Reads and/or clears the Xen hypervisor’s message buffer
    • xm log         – Displays the xend log
    • virsh dominfo  – Returns basic information about a domain
    • virsh dumpxml  – Dumps a domain’s information as XML
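
    For example, dumping a guest's definition as XML is handy for backing it up or re-creating it later (the output file name is arbitrary):

    # virsh dumpxml guest06 > /root/guest06.xml    # save the domain definition as XML
    # xm console guest06                           # attach to the guest console (Ctrl-] to detach)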

    Xen Performance tuning

    • xm mem-max  – Sets the maximum amount of memory for a domain
    • xm mem-set  – Sets the current memory allocation for a domain (ballooning)
    • xm vcpu-set – Sets the number of active virtual processors for a domain
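
    For example, using illustrative values (mem-set takes the new size in MB and cannot exceed the domain's mem-max):

    # xm mem-max guest06 6144    # allow the guest to grow to 6 GB
    # xm mem-set guest06 2048    # balloon the guest down to 2 GB immediately
    # xm vcpu-set guest06 4      # give the guest 4 active virtual CPUs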

    Other commands

    • xm rename   – Renames a domain
    • xm sysrq      – Sends a system request to a domain
    • xm block-list – Lists virtual block devices for a domain
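
    For example:

    # xm block-list guest06    # list the guest's virtual block devices
    # xm sysrq guest06 s       # send SysRq "s" (emergency sync) to the guest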
  • Linux performance tuning – vm.swappiness

    The Linux kernel has an improved memory subsystem, with which administrators now have a simple interface to fine-tune the swapping behavior of the kernel. The kernel tunable parameter vm.swappiness (/proc/sys/vm/swappiness) can be used to define how aggressively memory pages are swapped to disk.
    Linux moves memory pages that have not been accessed for some time to swap space even if there is enough free memory available. By changing the percentage in /proc/sys/vm/swappiness you can control the swapping behavior, depending on the system configuration.

    A high swappiness value means that the kernel will be more apt to unmap mapped pages. A low swappiness value means the opposite, the kernel will be less apt to unmap mapped pages. In other words, the higher the vm.swappiness value, the more the system will swap.

    vm.swappiness takes a value between 0 and 100 to change the balance between swapping applications and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out; in other cases, whether a swapout occurs depends on how much application memory is in use and how poorly the cache is doing at finding and releasing inactive items.
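
    The current value can be checked directly from /proc (60 is the usual default on most distributions):

    # cat /proc/sys/vm/swappiness
    60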

    Systems with memory constraints that run batch jobs (processes that sleep for long periods) might benefit from an aggressive swapping behavior.

    To change the swapping behavior, use either echo or sysctl:

    # sysctl -w vm.swappiness=90
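
    The echo form does the same thing, and adding the parameter to /etc/sysctl.conf makes the change persistent across reboots:

    # echo 90 > /proc/sys/vm/swappiness                  # takes effect immediately, lost at reboot
    # echo "vm.swappiness = 90" >> /etc/sysctl.conf      # applied automatically at boot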

    Tuning the Linux memory subsystem is a tough task that requires constant monitoring to ensure that changes do not negatively affect other components in the server. If you do choose to modify the virtual memory parameters (in /proc/sys/vm), change only one parameter at a time and monitor how the server performs.

  • Netapp – how to identify disk speed

    Command: storage show disk -a (or) vol status -r

    Filer> storage show disk -a

    Disk:             2b.18
    Shelf:            1
    Bay:              2
    Serial:           XYZASDF
    Vendor:           NETAPP
    Model:            X266_MTOMC320PAD
    Rev:              R5VV
    RPM:              5400
    WWN:              x:200:xxxx:1817d2

     

    Filer> vol status -r

          RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
          --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
          dparity   2a.49   2a    3   1   FC:B   -  ATA   5400 274400/561971200  274540/562258784
          parity    2b.26   2b    1   10  FC:B   -  ATA   5400 274400/561971200  274540/562258784

          RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
          --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
          dparity   3a.18   3a    1   2   FC:A   -  FCAL 10000 68000/139264000   69536/142410400
          parity    2a.50   2a    3   2   FC:A   -  FCAL 10000 68000/139264000   69536/142410400

          RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
          --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
          dparity   2a.64   2a    4   0   FC:A   -  FCAL 15000 136000/278528000  137104/280790184
          parity    2a.80   2a    5   0   FC:A   -  FCAL 15000 136000/278528000  137104/280790184
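
    If rsh (or ssh) access to the filer is enabled from an admin host, the disk speed can be pulled out quickly with grep on that host (the host and filer names below are placeholders):

    adminhost$ rsh filer1 "storage show disk -a" | grep -E "Disk:|RPM:"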
         

  • Netapp performance troubleshooting – nfs_hist

    This command displays the latency details of each NFS call as a histogram. Zero the counters with the -z option before you start debugging; you must be at advanced privilege to use the command. With nfs_hist we can find out whether any particular NFS operation is taking longer than expected to respond: it measures the response time the filer takes to handle each type of NFS call.

    The key concept is that if there is an NFS performance problem on a filer, you can tell whether it is a network problem or a filer problem by checking how long the filer takes to respond to calls. If the responses are quick, the network should be investigated; if the responses are slow, concentrate on the filer rather than the network.
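
    A typical session switches to advanced privilege and zeroes the counters before the test workload, then re-reads the histogram as shown in the output below:

    Filer> priv set advanced
    Filer*> nfs_hist -z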

    Filer*> nfs_hist

    v3 getattr: 48342 (blocking requests) - millisecond units

            0        1        2        3        4        5        6        7
         2883    14076     1577     1274     1705     2324     2973     3634
          <16      <24      <32      <40      <48      <56      <64   UNUSED
        15739     1418      267      147       90       93       59        0
         <128     <192     <256     <320     <384     <448     <512   UNUSED
           73        6        2        2        0        0        0        0
        <1024    <1536    <2048    <2560    <3072    <3584    <4096   UNUSED
            0        0        0        0        0        0        0        0
        <8192   <12288   <16384   <20480   <24576   <28672   <32768   UNUSED
            0        0        0        0        0        0        0        0
       <65536   <98304  <131072  <163840  <196608  <229376  <262144  >262144
            0        0        0        0        0        0        0        0
            
            
    As you can see, there are a lot of calls here; that is because this filer has handled roughly 48342 getattr ops since its last reboot, or since the last time nfs_hist -z was run. nfs_hist -z zeroes all the counters so we get a point of reference for this type of call, and we can zero the counters at any time and watch the number of calls change as a test goes on.

    What we see here is that the NFS v3 operation getattr has 48342 operations; that is the first line. The second line shows the buckets 0 through 7, a breakdown of the number of calls handled in that many milliseconds: 2883 were handled in under 1 ms (the 0 bucket), 14076 in 1 ms, and so on up to 3634 handled in 7 ms. Beyond 7 ms the buckets widen: everything that took between 8 ms and 16 ms is grouped together (15739 calls), and 17 ms to 24 ms had 1418 calls. My rule of thumb is that you will not notice a degradation in performance until you are above the 8192 ms bucket.

    Use diag mode to show more nfs_hist statistics.
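
    For example (diag mode exposes additional counters; switch back to admin privilege when you are done):

    Filer> priv set diag
    Filer*> nfs_hist
    Filer*> priv set admin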