Troubleshooting
Product ConnectXf
Version 3.10
Applies to Administrators
Level Advanced




High load average on server - services not responding or responding slowly

Symptoms

  • The top command shows the following:
    • The load average is high.
    • Swap and memory are fully used.
    • The idle time is zero.
    • The I/O wait is very high.
  • Stopping the SMTP, POP, Spam, and other services does not reduce the load on the server.
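
The load and memory symptoms can also be captured without the interactive top screen. The following is a minimal sketch, assuming a Linux server where /proc/loadavg and /proc/meminfo are readable. The function name load_summary and its optional directory argument are hypothetical and exist only for illustration and testing. It does not report I/O wait, which still has to be read from top.

```shell
#!/bin/sh
# Print a one-shot summary of load, memory and swap.
# $1 is a hypothetical override for the /proc directory (defaults to /proc).
load_summary() {
    proc_dir="${1:-/proc}"
    # The first three fields of loadavg are the 1-, 5- and 15-minute load averages.
    awk '{ printf "load average: %s %s %s\n", $1, $2, $3 }' "$proc_dir/loadavg"
    # All meminfo values used here are reported in kB.
    awk '
        $1 == "MemTotal:"  { mt = $2 }
        $1 == "MemFree:"   { mf = $2 }
        $1 == "SwapTotal:" { st = $2 }
        $1 == "SwapFree:"  { sf = $2 }
        END {
            printf "memory free: %d of %d kB\n", mf, mt
            printf "swap used: %d of %d kB\n", st - sf, st
        }
    ' "$proc_dir/meminfo"
}
```

Calling load_summary with no argument reads the live values. Swap used close to the swap total together with very little free memory matches the symptoms listed above.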

Cause

  • Some process is taking up all the memory.
  • The resulting swapping increases the I/O wait and reduces the idle time to zero.
  • Many processes are stuck in the wait state, which increases the load average.
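
To see which processes are stuck waiting, the process list can be filtered for the uninterruptible sleep state, shown as D by ps and usually meaning a wait on I/O. This is a sketch that assumes the procps version of ps found on most Linux systems. The function names filter_io_wait and list_io_wait are hypothetical.

```shell
#!/bin/sh
# Keep only rows whose first column (process state) is exactly D.
# This also drops the ps header row.
filter_io_wait() {
    awk '$1 == "D"'
}

# List state, PID and command name of every process currently in state D.
list_io_wait() {
    ps -eo state,pid,comm | filter_io_wait
}
```

Running list_io_wait a few times in a row shows whether the same processes stay blocked. On Linux, processes in this state count toward the load average even though they use no CPU.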

Solution

  • Find the process that is taking up all the memory as follows:
    • Run the top command.
    • With Caps Lock off, press Shift+M to sort the process list by memory usage, with the largest consumer at the top.
    • Note or copy the PID of that process.
    • Exit the top command by pressing Ctrl+C (or q).
    • Run cat /proc/<PID>/cmdline. This shows the process and its arguments. The arguments are separated by NUL bytes, so they may appear run together.
    • If the process is non-critical, such as a backup or a periodic sync, kill it using the command kill -9 <PID>
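
The same lookup can be done without the interactive top screen. This is a minimal sketch, assuming the procps version of ps (for the --sort option) and a readable /proc. The function names top_memory and show_cmdline, and the optional directory argument, are hypothetical.

```shell
#!/bin/sh
# Show the header plus the five processes with the largest resident memory (RSS, in kB).
top_memory() {
    ps -eo pid,rss,comm --sort=-rss | head -n 6
}

# Print the command line of a PID, turning the NUL separators into spaces.
# $2 is a hypothetical override for the /proc directory (defaults to /proc).
show_cmdline() {
    pid="$1"
    proc_dir="${2:-/proc}"
    tr '\0' ' ' < "$proc_dir/$pid/cmdline"
    echo
}
```

Run top_memory, take the PID from the first data row, and pass it to show_cmdline to confirm what the process is before deciding whether it is safe to kill.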

Known causes

Some of the known reasons for increased load average are given below:

Cause: In a DRBD setup, a hardware problem on the DR server can make the load on the primary server very high.
Solution:
  • Disconnect DRBD and check whether the problem is resolved:
    drbdadm disconnect all
  • Get a hardware audit done on the DR server.

Cause: A lot of POP processes are waiting on I/O.
Solution: See Users_cannot_login_to_their_accounts_using_the_POP_service

Cause: Hardware problem
Solution: Most of the time, when the load increases abnormally, it is due to a hardware problem. It is highly recommended to get a hardware audit done in such cases.

Cause: NAS problem

Cause: LDAP problems