Troubleshooting
Product ConnectXf
Version All
Applies to Administrators
Level Advanced




Mail queue increasing on the server

This topic explains how to resolve the problem of mail getting stuck in the queue on the server. The resolution depends on the symptoms.

Important

This topic applies only to qmail. Postfix queues are covered in a separate topic.

Symptom 1: Slow local mail delivery

Users report slow local mail delivery, that is, mail from remote users to local users, or from local users to local users, is delivered late.

Possible causes

  1. Disk is full.
  2. Spam attack has occurred on the server.
  3. Queue corruption

Diagnosis

1. Check if disk is full:

  df -h


2. Confirm that the problem really exists by checking the queue sizes with the following command:

/var/qmail/bin/qmHandle -s

The following message is displayed:

Messages in local queue: 0

Messages in remote queue: 0

A local queue of 1000 or more messages confirms the problem.
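The queue check above can be scripted. The following is a minimal sketch, assuming that `qmHandle -s` prints the two "Messages in ... queue: N" lines shown above. The function name `queue_count` and the `THRESHOLD` variable are illustrative and not part of the product.

```shell
#!/bin/sh
# queue_count QUEUE: read `qmHandle -s` output on stdin and print the
# message count for QUEUE ("local" or "remote").
# Assumes the "Messages in <queue> queue: N" format shown in this topic.
queue_count() {
    awk -v q="$1" -F': *' '$1 == ("Messages in " q " queue") { print $2 + 0 }'
}

THRESHOLD=1000   # backlog size that this topic treats as a real problem

# Run the check only where qmHandle is installed.
if [ -x /var/qmail/bin/qmHandle ]; then
    n=$(/var/qmail/bin/qmHandle -s | queue_count local)
    if [ "${n:-0}" -ge "$THRESHOLD" ]; then
        echo "local queue backlog: $n messages"
    else
        echo "local queue looks normal: ${n:-0} messages"
    fi
fi
```

Pass `remote` instead of `local` to `queue_count` to apply the same check to the remote queue.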

Solution

1. If the disk is full, free up disk space by deleting some data.

2. If a spam attack has occurred on your server, refer to the following links for the solution:

 http://xf.wiki.mithi.com/index.php/How_to_delete_mails_from_Postfix_Queue
 http://xf.wiki.mithi.com/index.php/Administration:Queue#Cleaning_the_Queue

3. If the queue is corrupt, refer to the following link for the solution:

 http://xf.wiki.mithi.com/index.php/How_to_replace_a_corrupt_queue_with_an_empty_queue


Symptom 2: Slow remote mail delivery

Users report slow remote mail delivery, that is, mail from local users to remote users is delivered late. Remote users can be at branch offices or on external domains such as Gmail. Local mail delivery is not affected.

Possible causes

  1. Disk is full
  2. DNS is not responding
  3. Network connectivity problem
  4. Some recipient server(s) have blocked mail from our server
  5. Queue corruption

Diagnosis

  1. Check if the disk is full:
  df -h
  2. Check whether DNS is responding:
  host -a google.com

The output should indicate a good response time, less than 25 milliseconds.

3. Confirm that the problem really exists by checking the queue sizes with the following command:

/var/qmail/bin/qmHandle -s

The following message is displayed:

Messages in local queue: 0

Messages in remote queue: 0

A remote queue of 1000 or more messages confirms the problem.

Solution

1. If the disk is full, free up disk space by deleting some data.

2. DNS is not responding: Refer to the following link:

 http://xf.wiki.mithi.com/index.php/Administration:DNS#Troubleshooting

3. Network connectivity problem: Run the following command:

     ping 8.8.8.8

The output should show 0% packet loss.

4. Some recipient server(s) have blocked mail from our server

   - Find the number of remote mails in the queue per domain:
       /var/qmail/bin/qmHandle -R | grep "To:" | cut -d '@' -f 2 | sort | uniq -c
   For example:
       1 gmail.com
  

If many remote mails for one domain are stuck, check the maillog to confirm why delivery to this domain is failing.

   Until the remote domain accepts our mail, create a postfix shunting queue for that domain.

5. Queue corruption: Refer to the solution provided under Symptom 1.
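The per-domain count in step 4 can be extended to list only the domains that hold a large share of the queue. The following is a minimal sketch built on the same pipeline. The function name `stuck_domains` and the threshold of 50 in the usage comment are illustrative assumptions.

```shell
#!/bin/sh
# stuck_domains MIN: read "To:" lines on stdin, count queued messages per
# recipient domain, and print "domain count" for every domain that holds
# at least MIN messages, largest first.
stuck_domains() {
    grep 'To:' | cut -d '@' -f 2 | tr -d '>\r ' | sort | uniq -c | sort -rn |
        awk -v min="$1" '$1 + 0 >= min + 0 { print $2, $1 }'
}

# Usage on the mail server (50 is an arbitrary example threshold):
#   /var/qmail/bin/qmHandle -R | stuck_domains 50
```

A domain that appears at the top of this list is the first candidate to look up in the maillog.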

Symptom 3: Slow local and remote mail delivery

Both local and remote mail delivery are slow. Confirm whether the problem really exists.

Check the queue sizes using the following command:

/var/qmail/bin/qmHandle -s

The following message is displayed:

Messages in local queue: 0

Messages in remote queue: 0

A backlog of 1000 or more messages in both queues confirms the problem.

Possible causes

  1. Disk full
  2. Queue corruption
  3. Spam attack
  4. DNS is not responding

Diagnosis

  1. Check if the disk is full:
  df -h
  The /mailstore and / partitions should not be more than 98% full. If they are, delete some data to create space.
  2. Check whether DNS is responding:
  host -a google.com

The output should indicate a good response time, less than 25 milliseconds.
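The disk check in this diagnosis can be scripted against the 98% limit. The following is a minimal sketch that uses `df -P` (the POSIX output format, one line per file system) so that the columns are predictable. The function name `full_partitions` is illustrative.

```shell
#!/bin/sh
# full_partitions LIMIT: read `df -P` output on stdin and print the mount
# point and usage of every file system that is more than LIMIT percent full.
full_partitions() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 > limit + 0) print $6, use "%"
    }'
}

# Check the two partitions named in this topic; skip any that do not exist.
for fs in / /mailstore; do
    if [ -d "$fs" ]; then
        df -P "$fs" | full_partitions 98
    fi
done
```

No output means that neither partition is above the limit.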


Solution

1. If the disk is full, free up disk space by deleting some data.

2. Queue corruption

   - Refer to the solution provided under Symptom 1.

3. If a spam attack has occurred on your server, refer to the following links for the solution:

 http://xf.wiki.mithi.com/index.php/How_to_delete_mails_from_Postfix_Queue
 http://xf.wiki.mithi.com/index.php/Administration:Queue#Cleaning_the_Queue

4. DNS is not responding

To detect the faulty DNS server, check each entry in /etc/resolv.conf individually using the host command.

        host -a google.com <IP-ADDRESS-OF-DNS-SERVER>

The output should indicate a good response time, less than 25 milliseconds.
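The per-server check above can be looped over every entry in /etc/resolv.conf. The following is a minimal sketch, assuming that the verbose output of `host -a` includes a "Received ... in N ms" line, which is where the response time is read from. The function names and the 5 second wait (`-W 5`) are illustrative assumptions.

```shell
#!/bin/sh
# list_nameservers FILE: print the IP address of each "nameserver" entry.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# check_dns_servers [FILE]: query each DNS server listed in FILE
# (default /etc/resolv.conf) and show the line that reports how long
# the reply took. google.com is the test name used in this topic.
check_dns_servers() {
    for ns in $(list_nameservers "${1:-/etc/resolv.conf}"); do
        echo "== $ns"
        host -W 5 -a google.com "$ns" | grep 'Received' || echo "no reply from $ns"
    done
}

# Usage on the mail server:
#   check_dns_servers
```

A server that prints "no reply", or a time well above 25 ms, is the likely faulty entry.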

Further diagnostic of DNS.

If many remote mails for one domain are stuck, check the maillog to confirm why delivery to this domain is failing. Until the remote domain accepts our mail, create a postfix shunting queue for that domain.

Symptom 4: Local mail deferred in queue

Diagnosis

  • Search /var/log/maillog for deferred mail:
grep -i "deferred" /var/log/maillog
  • Inspecting /var/log/maillog for one such stuck mail shows the following error: DuplicateMailMDC::_lock:_No_locks_available,_Error_executing_duplicate_mail_check
Aug  9 12:39:40 FE1XF qmail: 1376032180.707091 delivery 507026: failure: Details::_Error_executing_atleast_one_desired_MDC["Status"="Failure"]["Sender"="ankur.dokania@adityabirla.com"]
["Recipient"="abhijeet.nadgouda@adityabirla.com"]["Date"="Fri,_2_Aug_2013_11:39:38_+0530"]["MessageID"="1675100266.7790.1375423778503.JavaMail.root@127.0.0.1"]
["Subject"="Leave_Rules_-_Clarification_needed"]["SizeinKB"="3"]["AttachmentCount"="0"]["AttachmentList"=""]
["ReturnCode"="111"]["ProcessingTime"="0.680"]["ProcessedSteps"="
{ForwardToMailBoxMDC::_Info:_mailbox_location_'HOMS=127.0.0.1'_is_local,_skipping_forwarding.,_}
{::_Recipient:_abhijeet.nadgouda@adityabirla.com_mailsystem_is_connectxf,_}
{MailFiltersMDC::_,_MailFilter_executed_successfully,_}
{DuplicateMailMDC::_lock:_No_locks_available,_Error_executing_duplicate_mail_check,_}"]
  • The above error indicates that the duplicate mail check of the MDC (Mail delivery system) cannot write to the duplicate mail database because a file lock failed. For more details, search /var/log/messages for statd and lockd entries:
grep -i -e statd -e lockd /var/log/messages
Aug  7 15:13:07 FE1XF rpc.statd[2928]: No canonical hostname found for 10.1.0.117
Aug  7 15:13:07 FE1XF rpc.statd[2928]: STAT_FAIL to FE1XF for SM_MON of 10.1.0.117
Aug  7 01:16:06 FE1XF kernel: lockd: cannot monitor 10.1.0.117

Cause

No canonical hostname found for 10.1.0.117

Lock errors are typically observed in shared storage environments such as a NAS. The above diagnosis indicates that no hostname is defined for the NAS IP (storage IP).

Resolution

Add an entry for the NAS storage device to /etc/hosts on the servers. The host name can be anything.

vi /etc/hosts
Add the following entry:
<IP of NAS device> <Name of NAS device>
For example:
10.1.0.117  MyNASServer
10.1.0.117  MyNASServer
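
To confirm that the entry is in place, the IP addresses reported by rpc.statd can be compared against /etc/hosts. The following is a minimal sketch, assuming the "No canonical hostname found for" log format shown in the diagnosis. The function names are illustrative.

```shell
#!/bin/sh
# unresolved_ips: read syslog lines on stdin and print each distinct IP that
# rpc.statd reported with "No canonical hostname found for <IP>".
unresolved_ips() {
    sed -n 's/.*No canonical hostname found for \([0-9.]*\).*/\1/p' | sort -u
}

# has_hosts_entry IP [FILE]: succeed if FILE (default /etc/hosts) has an
# uncommented line whose first field is IP.
has_hosts_entry() {
    awk -v ip="$1" '$1 == ip { found = 1 } END { exit !found }' "${2:-/etc/hosts}"
}

# report_missing LOG [HOSTS]: list the IPs from LOG that still lack an entry.
report_missing() {
    for ip in $(unresolved_ips < "$1"); do
        has_hosts_entry "$ip" "${2:-/etc/hosts}" || echo "missing hosts entry: $ip"
    done
}

# Usage on the mail server:
#   report_missing /var/log/messages
```

No output means that every reported IP already has an entry in /etc/hosts.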

References

http://dfwarden.blogspot.in/2011/02/psa-nfs-locking-in-rhel6-needs-reverse.html

Symptom 5: Mails are stuck in the queue: /mailstore directory has read-write-execute permissions (777)

Diagnosis

1: Check messages in queue.

[root@claimsmail ~]# /var/qmail/bin/qmail-qstat
messages in queue: 5239
messages in queue but not yet preprocessed: 0

2: Check local mails in the queue.

/var/qmail/bin/qmail-qread 

3: Verify that the services are running.

/mithi/mcs/bin/checkservices.sh

4: Check for deferrals in the maillog.

grep -i defer /var/log/maillog

If the maillog contains errors like the following, the /mailstore directory has been given full permissions, which qmail does not allow.

Sep 26 00:00:09 claimsmail qmail: 1348597809.158661 delivery 58357: deferral: Uh-oh:_home_directory_is_writable._(#4.7.0)/
Sep 26 00:00:09 claimsmail qmail: 1348597809.158668 delivery 58358: deferral: Uh-oh:_home_directory_is_writable._(#4.7.0)/
Sep 26 00:00:09 claimsmail qmail: 1348597809.158675 delivery 58359: deferral: Uh-oh:_home_directory_is_writable._(#4.7.0)/
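
When the maillog contains many deferrals, a summary by reason shows quickly whether this error is the dominant one. The following is a minimal sketch, assuming the "deferral: <reason>" log format shown above. The function name `deferral_reasons` is illustrative.

```shell
#!/bin/sh
# deferral_reasons: read maillog lines on stdin and print each distinct
# qmail deferral reason with the number of times it occurs, most frequent
# first.
deferral_reasons() {
    sed -n 's/.*deferral: //p' | sort | uniq -c | sort -rn
}

# Usage on the mail server:
#   deferral_reasons < /var/log/maillog
```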

Cause

If a folder in the /mailstore directory has full permissions (777), that is, read-write-execute for everyone, qmail keeps all the messages in the queue and does not deliver them.
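
The affected folders can be listed before any permissions are changed. The following is a minimal sketch, assuming that the write bit for "other" users is what qmail objects to, as it is in mode 777. The function name `world_writable_dirs` is illustrative.

```shell
#!/bin/sh
# world_writable_dirs DIR: print every directory at or below DIR that is
# writable by "other" users (for example mode 777).
world_writable_dirs() {
    find "$1" -type d -perm -0002 -print
}

# Usage on the mail server:
#   world_writable_dirs /mailstore
```

Listing the folders first shows how widespread the problem is and gives a record of what was changed.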

Solution

Change the permissions of the folders in mailstore to 755 (read-write-execute for the owner, read-execute for everyone else).

1. Go to the mailstore folder.

cd /mailstore

2. Set the permissions on all folders in the mailstore folder.

chmod -R 755 *

3. Alarm the queue so that qmail retries delivery of the queued messages.

/mithi/mcs/bin/qmail--queue.alarm.sh

4. Check the queue.

/var/qmail/bin/qmail-qstat