

Troubleshooting
Product: ConnectXf
Version: All
Applies to: Administrators
Level: Advanced



Diagnosing performance bottlenecks in a NAS access stack

Overview

In an HA deployment of Connect Xf, the architecture involves multiple load-balanced stateless hosts (bare metal or virtual) accessing a common shared mail store. This is possible only via a NAS.

By design, these hosts access the NAS over a network, which adds up to nine points of possible bottlenecking (labelled A to I below) if the stack is not tuned and synchronized following best-practice guidelines, or if the load is not really understood.

The NAS Access Stack explained

[Diagram: the NAS Access Stack, components A to I]

Component | Label | Layer | Description
Connect Xf | A | Application Services | The mailing application and the services that access the storage at high speed, viz. SMTP (the MTA accepting and queueing mail for delivery), POP/IMAP (which let users access their mailboxes via any mobile, desktop, or web client), and other services such as quota management.
Linux | B | NFS Client | The NFS (Network File System) client component on Linux, responsible for connecting to the NFS server (NAS) over TCP/IP. The storage becomes available as a mount point and can be accessed like any other file system; applications can manipulate shared files as if they were stored locally.
Linux | C | Operating System | The Linux operating system, which hosts the NFS client and the applications.
VM Infra | D | VM Network | The shared network capacity provided to the specific virtual machine on which the storage is mounted. This is critical, since access to the storage is via the network; any bottlenecking here will cause a performance drop.
VM Infra | E | VM Host Network | The network cards and software on the virtual machines' host, shared amongst the VMs running on that host. Again a very critical component, since all access to the NAS happens via the network; any bottlenecking here will cause a performance drop.
VM Infra | F | VM Host CPU/RAM | The CPU and RAM of the virtual machines' host, shared amongst the VMs running on that host.
Network Infra | G | Dedicated LAN | We recommend a dedicated Gbps or higher-speed network between the hosts and the NAS device, to reduce the chance of another system on the network creating a bottleneck at this layer and causing a side-effect impact on the performance of the storage channel. Any bottleneck in the network between the hosts and the storage can adversely impact I/O performance.
NAS appliance | H | NAS Network | The network cards and software on the NAS appliance, shared amongst the hosts accessing the device over NFS and other supported protocols. Again a very critical component, since all access to the NAS happens via the network; any bottlenecking here will cause a performance drop.
NAS appliance | I | NAS | The internal device: the disks, their caches, timeouts, and other parameters that determine whether I/O access to the storage devices is optimal and fast.


As the diagram shows, the application is a set of services using the infrastructure on which it is hosted. For I/O, the primary activities are storing mail received via SMTP and serving mail for consumption by users via POP/IMAP: a combination of writes and reads.

Connect Xf uses standard open source components for the MTA and mail access, viz. qmail SMTP, Postfix, Courier IMAP, and qmail POP, which access the mail store. These components have been around for decades and are well tested in large production environments (not just at Mithi customers but elsewhere as well). Our experience with them, and the data from the servers, indicates that when there is a bottleneck these services are at the receiving end: they are the effect, not the cause.

At most, they can get very busy serving a sudden burst of requests, handling a DDoS attack, delivering bulk mail, and so on. During these periods they may make more demands on the underlying infrastructure.

But the infrastructure should certainly be able to cater to these requests. We typically recommend sizing the solution for this with headroom. Refer here for more information on choosing a NAS.

While it is no doubt a complex stack with an interplay of many components and services, the diagnosis can be simplified by working through the stack systematically, layer by layer.

Diagnosing bottlenecks in the NAS Access stack

We suggest a layer-by-layer diagnosis of the stack to determine the location and cause of the bottleneck. Our experience with most of our large customers' deployments indicates that in almost all cases the bottleneck occurs at point B or at points H/I.

The table below provides a layer-by-layer view of the parameters that indicate the health of each layer. The impact of each parameter is explained, and the acceptable/expected values or output at each stage is documented.

Note: Our (Mithi's) visibility ends at points A and B; we are unable to see below these to diagnose the network, VM, and NAS areas. If you are maintaining an HA site with such an environment, we recommend that you fill in the tables below with help from your VM, storage, and network teams, and keep the sheet handy at all times to quickly diagnose problems in this stack.

Also, ideally the diagnosis should be done from all connected hosts, but we have found that you can gain quick visibility by looking at the troubled hosts. Whether you do this for all hosts or only a few is your call and depends on the situation. You can simply copy this table into Excel or any spreadsheet and work on it.

Each entry below lists the key parameter, the command to run, the acceptable values/output, and a description or resources. For each entry, record the observed value, a status (G/R), and your conclusion.

Host1: Connect Xf (Application Services)
Queue size
Command: /var/qmail/bin/qmail-qstat
Acceptable: < 1000
Notes: A growing queue typically means deferred mail, indicating an I/O bottleneck (assuming the logs do not indicate software errors such as permission problems).
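To see whether the queue is growing or draining rather than taking a single reading, a minimal sketch such as the following can sample the queue size periodically (the 60-second interval and the sample count are arbitrary choices):

    #!/bin/sh
    # Sample the qmail queue size every 60 seconds, 20 times,
    # to see whether the queue is growing or draining.
    for i in $(seq 1 20); do
        date
        /var/qmail/bin/qmail-qstat
        sleep 60
    done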
SMTP concurrency
Command: tail -f /var/log/messages | grep -i status
Acceptable: < 40
Notes: High concurrency indicates a DDoS attack or slow SMTP processing, typically an effect of slower queueing (assuming mails are being processed successfully with no errors, but very slowly).
POP concurrency
Command: tail -f /var/log/maillog | grep -i status
Acceptable: < 10
Notes: High concurrency indicates an inability to access and serve mail to the client in good time, typically an effect of degraded I/O (assuming mails are being popped with no errors, but very slowly).
Local mail delivery concurrency
Command: cat /var/qmail/control/concurrencylocal
Acceptable: < 40
Notes: Determines how many files will be written to the NAS concurrently, and hence the IOPS.
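concurrencylocal is a plain-text file holding a single number, so inspecting and adjusting it is straightforward; a sketch, assuming the qmail services are restarted afterwards for the change to take effect (the value 20 is illustrative):

    # Check the current limit on concurrent local deliveries
    cat /var/qmail/control/concurrencylocal
    # Lower it to reduce parallel writes to the NAS (restart qmail afterwards)
    echo 20 > /var/qmail/control/concurrencylocal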
IMAP nolocks setting
Command: grep USELOCKS /usr/lib/courier-imap/etc/imapd
Acceptable: IMAP_USELOCKS=0 (locking should be off)
iptables NFS ports
Command: vi /mithi/mcs/modules/mithi-bl/conf/server/mithi-system.fw.conf.sh
Expected configuration:

    SUNRPC_PORT=111
    SUNRPC_DIRECTION=in_and_out
    SUNRPC_PROTOCOL=tcp_and_udp
    ConfigPort $SUNRPC_PORT $SUNRPC_DIRECTION $SUNRPC_PROTOCOL

    NFS_PORT=2049
    NFS_DIRECTION=in_and_out
    NFS_PROTOCOL=tcp_and_udp
    ConfigPort $NFS_PORT $NFS_DIRECTION $NFS_PROTOCOL

    MOUNTD_PORT=<Mountd port from the rpcinfo command>
    MOUNTD_DIRECTION=in_and_out
    MOUNTD_PROTOCOL=tcp_and_udp
    ConfigPort $MOUNTD_PORT $MOUNTD_DIRECTION $MOUNTD_PROTOCOL

    LOCKD_PORT=<nfslockmgr port from the rpcinfo command>
    LOCKD_DIRECTION=in_and_out
    LOCKD_PROTOCOL=tcp_and_udp
    ConfigPort $LOCKD_PORT $LOCKD_DIRECTION $LOCKD_PROTOCOL
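To fill in the mountd and nlockmgr ports above, a minimal sketch that pulls them out of the rpcinfo output (in rpcinfo -p output the fifth column is the service name, the third is the protocol, and the fourth is the port):

    # List the ports registered for mountd and nlockmgr
    rpcinfo -p | awk '$5 == "mountd" || $5 == "nlockmgr" { print $5, $3, $4 }' | sort -u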

Clean mail store agent configuration
Command: /mithi/mcs/bin/listagents.sh | grep -i agent_clean
Acceptable: You should not find any instance of this running.
Command: ps -elf | grep -i agent_clean
Acceptable: Typically this should be running only in off-peak hours.
Host1: Linux (NFS Client). Resource: File:NFSping.tar.gz

NFS connections open/wait
Command: netstat -n | grep <NAS server IP>
Notes: A high number of WAIT states indicates processes that have completed their tasks but have not yet released their connections to the NFS server.
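A minimal sketch to summarise the connection states towards the NAS at a glance (substitute the NAS IP placeholder):

    # Count connections to the NAS by TCP state (ESTABLISHED, TIME_WAIT, ...)
    netstat -n | grep <NAS server IP> | awk '{ print $6 }' | sort | uniq -c | sort -rn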
NFS I/O response times
Command: nfsiostat <param>
Notes: A high number indicates a very slow I/O response, which points to a bottleneck from the NFS client downwards.
Mount parameters
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: Verify the mount parameters with the NAS vendor; each vendor may have a recommended configuration for performance.
NFS version 4
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: NFS v4 has better performance and configuration options than v3.
Use TCP
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: UDP is fast, but since we have a high write load, TCP is preferred.
NFS client cache
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: This is a trickier optimisation. Make sure this is definitely the problem before spending too much time on it; the default values are usually fine for most situations.
noatime, nodiratime
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr,noatime,nodiratime 0 0
Notes: Add these options to the mount command to reduce write traffic to the server.
Avoid async writes
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: Though async writes improve performance significantly, in case of a crash there is a possibility of data loss, with no way to know what was lost.
Read and write block size = MTU, or a multiple of the MTU
Command: mount | grep vers=4
Expected: rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr 0 0
Notes: The MTU is the size of a network packet. Each I/O request is divided into packets and handed to the server; if the MTU and the read/write block sizes do not match, or produce too many packets, network traffic increases.
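Pulling the mount-option checks above together, a sketch of what the corresponding /etc/fstab entry could look like (the NAS hostname, export path, and mount point are illustrative placeholders; confirm the exact option set with your NAS vendor):

    # Hypothetical /etc/fstab entry for the shared mail store
    nas01:/export/mailstore  /data  nfs  rw,bg,vers=4,tcp,timeo=600,rsize=32768,wsize=32768,hard,intr,noatime,nodiratime  0 0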
Host1: Operating System

Are all of the relevant daemons running?
Command: rpcinfo -p
Expected output (ports will vary):

    program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100021    1   udp  32775  nlockmgr
    100021    3   udp  32775  nlockmgr
    100021    4   udp  32775  nlockmgr
    100021    1   tcp  32768  nlockmgr
    100021    3   tcp  32768  nlockmgr
    100021    4   tcp  32768  nlockmgr
    100024    1   udp  32776  status
    100024    1   tcp  32769  status
    100011    1   udp    671  rquotad
    100011    2   udp    671  rquotad
    100011    1   tcp    690  rquotad
    100011    2   tcp    690  rquotad
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100005    1   udp    693  mountd
    100005    1   tcp    708  mountd
    100005    2   udp    693  mountd
    100005    2   tcp    708  mountd
    100005    3   udp    693  mountd
    100005    3   tcp    708  mountd
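A minimal sketch to confirm that each of the required RPC services in the listing above is registered:

    # Report whether each required RPC service is registered with the portmapper
    for svc in portmapper nlockmgr status rquotad nfs mountd; do
        rpcinfo -p | grep -qw "$svc" && echo "$svc: OK" || echo "$svc: MISSING"
    done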
Time taken by the df command
Command: time df -h
Acceptable: < 10 ms
Notes: If there is a problem with the NAS mount, the df command takes a long time to respond.
Number of NFS mounts
Command: mount
Acceptable: < 8
Notes: This is an OS-dependent parameter.
iowait from sar
Command: sar
Acceptable: < 10
Notes: A high number indicates processes waiting for storage, and points to a problem in the NFS client or downwards.
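A usage sketch for sampling iowait live (5-second intervals, 12 samples; the %iowait column is the one to watch):

    # Report CPU utilisation every 5 seconds, 12 times; watch the %iowait column
    sar -u 5 12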
IOPS
Command: iostat -zd <nfs device name>
Acceptable: < 500 for 5000 users
Notes: IOPS is the number of read and write requests to the NAS; this determines the sizing of the NAS server.
Disk write test
Command: time dd if=/dev/zero of=/data/testfile bs=16k count=16384
Acceptable: 3-5 sec
Notes: Storage access test (assuming the NAS is mounted on /data). Should be within the threshold; a slow result indicates a problem in the NFS client or downwards.
Disk read test
Command: time dd if=/data/testfile of=/dev/null bs=16k
Acceptable: < 1 sec
Notes: Storage access test (assuming the NAS is mounted on /data). Should be within the threshold; a slow result indicates a problem in the NFS client or downwards.
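For reference, the write test above creates a 256 MB file (16 KB x 16384 blocks); a sketch that runs both tests back to back and removes the test file afterwards (assumes the NAS is mounted on /data with at least 256 MB free):

    # Write 256 MB to the NAS mount, read it back, then clean up
    time dd if=/dev/zero of=/data/testfile bs=16k count=16384
    time dd if=/data/testfile of=/dev/null bs=16k
    rm -f /data/testfile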
RPC count
Resource: Rpc-health
Time in sync with the NAS device
Command: date
Hosts file / DNS
Acceptable: The hosts file should reference either no DNS or a valid DNS.
Notes: If the DNS is incorrect or not responding, the network response is considerably degraded.
Tuning profile (tuned-adm)
Host1: Other Infra Components

VM Infra: VM Network

MTU
Command: contact the VM admin
Notes: If the MTU is not the same as on the other network devices, network traffic will be fragmented. To achieve the best throughput, you need to experiment and discover the best values for your setup. It is possible to change the MTU of many network cards; if your clients are on a separate subnet (e.g. for a Beowulf cluster), it may be safe to configure all of the network cards to use a high MTU. This should be done in very-high-bandwidth environments.
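A minimal sketch for checking the MTU from the Linux host side (eth0 is a placeholder interface name; the payload size 8972 assumes jumbo frames, i.e. an MTU of 9000 minus 28 bytes of IP/ICMP headers):

    # Show the MTU configured on the interface
    ip link show eth0 | grep -o 'mtu [0-9]*'
    # Verify the path MTU to the NAS without fragmentation
    ping -M do -s 8972 -c 3 <NAS server IP>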

VM Infra: VM Host Network

MTU
Command: contact the VM admin
Notes: If the MTU is not the same as on the other network devices, network traffic will be fragmented.
VM Infra: VM Host CPU/RAM

Command: contact the VM admin
Notes: The host should not be loaded due to load on another VM.
Network Infra: Dedicated LAN

Router/switch MTU
Command: contact the network admin
Check: ifconfig should show three IPs on three separate NIC interfaces.
Notes: If the MTU is not the same as on the other network devices, network traffic will be fragmented. Confirm with the network team.

MAC-IP cache
Command: contact the network admin
Notes: This needs to be inspected when the IP or NIC card of the server or the NAS is changed; the router or intelligent switches may have cached the old IP-to-MAC mapping, which may affect network flow.
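From the host side, a minimal sketch to check the locally cached MAC for the NAS after an IP or NIC change (the caches on the switches and routers themselves remain the network team's domain):

    # Show the MAC address currently cached for the NAS IP on this host
    arp -n | grep <NAS server IP>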
NAS Appliance: NAS Network

MTU
Notes: If the MTU is not the same as on the other network devices, network traffic will be fragmented.

NAS Appliance: NAS

Read and write access for NAS client IPs
Notes: The NAS may have an access control list governing which clients may access it.

Date and time
Notes: If the date and time of the NAS do not match those of the server (i.e. the NAS is behind the server), messages delivered to the NAS will become visible only after that time difference has elapsed.

Setting to allow user permissions of mailjol:mailjol
Notes: Required for a Windows-based NAS.

Mapping of special characters
Notes: Required for a Windows-based NAS. The message file names contain special characters, such as ':', that are not valid in Windows file names; they need to be mapped.
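For context, a typical maildir-style message file name looks like the illustrative example below; the ':2,' flag separator is the part a Windows file system cannot store natively, hence the mapping requirement:

    # Illustrative maildir message file name (the ':2,S' suffix carries the flags)
    1577836800.M123456P7890.host1,S=4096:2,S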

Tuning for IOPS (number or type of disks for the required IOPS, or size of NVRAM)
Notes: The IOPS are determined by the number of disks in a volume (data is accessed in parallel from different disks), the type and speed of the disks, and the cache maintained in the NVRAM's non-volatile memory. To be tuned by the NAS vendor.
