Nagios load spike every 7 hours
I have a Nagios XI server monitoring 631 services on 63 hosts. Every seven hours the load on the server spikes to around 20 and then gradually falls back to near 0.
There are no cron jobs running every 7 hours.
The server has 8 cores and 2GB RAM. RAM is not the issue: about 1GB is still free during the spikes, and raising it to 4GB makes no difference. The server was also migrated to a new host a week or so ago with no changes.
We also have scheduled downtime on 17 of the monitored hosts, so they are only monitored 6am-6pm Mon-Fri; this seems to make no difference to the load spikes.
Most checks are done on Windows servers, using check_wmi_plus.
During load spikes, I tend to see 5-8 instances of check_wmi_plus.pl using 2-3% CPU, and a handful of httpd processes using the same, but nothing stands out as using a lot of CPU. Those processes also turn over quickly, so they are not hung or taking an unusually long time. The Service Check Execution Time in the Nagios XI Performance Monitor tends to peak at ~5.5s, with averages around 1s.
Can anyone suggest a possible cause, or how I can further troubleshoot this?
nagios
edited Mar 23 '15 at 1:37 by masegaloeh
asked Dec 3 '12 at 22:22 by daryl_graham
Since you say it isn't a cron job, perhaps it is Nagios itself. I'd look at the Nagios log to see if it is restarting every 7 hours. If you are retaining state and have very slow disk I/O, the load would spike. During the high-load period, run
iotop -oP
to see if there is a process doing excessive I/O.
– Mark Wagner
Dec 4 '12 at 1:14
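One way to check the log for periodic restarts is to pull out the timestamps of the startup lines and look at the gaps between them. The sketch below assumes the usual `[epoch] message` layout of Nagios log lines and that startup lines contain the text `starting...`; the marker string and the sample lines are assumptions for illustration, so adjust them to what your log actually contains.

```python
import re

# Assumed line layout: "[<unix epoch>] <message>".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")

def restart_gaps_hours(lines, marker="starting..."):
    """Return the gaps (in hours) between consecutive log lines containing marker."""
    stamps = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and marker in m.group(2):
            stamps.append(int(m.group(1)))
    return [(b - a) / 3600 for a, b in zip(stamps, stamps[1:])]

# Hypothetical sample lines, not taken from a real log.
sample = [
    "[1354500000] Nagios 3.4.1 starting... (PID=1234)",
    "[1354500100] SERVICE ALERT: host1;CPU;OK;HARD;1;ok",
    "[1354525200] Nagios 3.4.1 starting... (PID=5678)",
]
print(restart_gaps_hours(sample))
```

To run it against a real log, pass the open file object instead of `sample`. Gaps that cluster around 7 would point at restarts; an empty list would suggest looking elsewhere.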
You might want to see if you can spread out the scheduling for those Windows servers, i.e. running server1 at 1/7 hours, the second at 2/7, and so on, basically running each check on a different hour.
– Danie
Dec 4 '12 at 7:53
3 Answers
A high load does NOT necessarily mean that you are using a lot of CPU. The load figure only gives the number of processes that, at a snapshot in time, are ready to run and waiting for CPU time (on Linux it also counts processes in uninterruptible sleep, typically waiting on I/O); it does not say how much CPU they use.
Nagios spins off a lot of processes rapidly, depending on how you have set its monitoring schedules, and at times this will cause a spike as it starts many processes as fast as possible, even though they might not need much CPU or might go immediately into a sleep/wait state.
BTW, disabling NOTIFICATIONS in Nagios does not stop it from continuing to monitor a given host or service.
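To see this distinction on the server itself, it may help to log the run-queue counts next to the load averages during a spike and compare them with the CPU usage reported by top. The following is a minimal sketch that assumes a Linux host exposing `/proc/loadavg`; the function names and the sample line are made up for illustration.

```python
def parse_loadavg(text):
    """Split a /proc/loadavg line into its averages and run-queue counts."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "load1": float(one),
        "load5": float(five),
        "load15": float(fifteen),
        "runnable": int(runnable),  # scheduling entities runnable right now
        "total": int(total),        # scheduling entities that currently exist
        "last_pid": int(last_pid),  # PID of the most recently created process
    }

def read_loadavg(path="/proc/loadavg"):
    """Read and parse the live values (Linux only)."""
    with open(path) as fh:
        return parse_loadavg(fh.read())

# Sample line for illustration; the values are invented.
print(parse_loadavg("19.84 7.31 2.95 23/412 30771"))
```

Calling `read_loadavg()` every few seconds around a spike would show whether a burst of runnable processes coincides with the jump in the 1-minute average.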
edited Dec 4 '12 at 17:33
answered Dec 3 '12 at 22:30 by mdpc
Lower the RHEL/CentOS default prefork settings in the default /etc/httpd/conf/httpd.conf to something more realistic.
Use tools like apachebuddy.pl and apachetuner.sh to do the math on memory per forked process. Allow more memory for the other processes on the system (mysql/postgresql/php) and reduce MaxClients and MaxRequestsPerChild.
I experienced this after upgrading from 2012R2.9 to 2014R1.1. I'm not sure whether the latest version of XI 2014 requires more resources for the web frontend.
This morning, after lowering my settings, I noticed my load spikes are smaller, and navigating the interface with the browser's forward and back buttons no longer gives me the grey unhappy-face screen. Does this weirdness in the interface seem similar?
One last item I'm looking at now is which RHEL modules in this default httpd.conf file are actually required. I see no sense in loading default modules that aren't needed. This server is a PROD enterprise server at my place of business with thousands of checks, so it needs to be solid.
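The memory math those tools do can be approximated by hand. This is only a rough sketch of the idea, not how apachebuddy.pl or apachetuner.sh are actually implemented: divide the memory left over after other services by the typical resident size of one httpd child. The function name and all the numbers are assumptions for illustration.

```python
def estimate_max_clients(total_mb, reserved_mb, per_child_mb):
    """Largest number of httpd children that fit in memory not reserved for other services."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_mb - reserved_mb
    return max(available // per_child_mb, 0)

# Illustrative numbers only: 2048 MB total, 1024 MB kept for MySQL/PHP/Nagios,
# and roughly 40 MB resident per httpd child.
print(estimate_max_clients(2048, 1024, 40))  # 25
```

Measure the per-child figure on your own server (for example from the RSS column in ps) before settling on a value.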
UPDATE:
Run
# service mysqld stop
# sh /usr/local/nagiosxi/scripts/repair_databases.sh
# service mysqld start
or optimize the tables while online via
# mysql -u root -p
mysql> use nagios;
List your tables
mysql> show tables;
then
mysql> optimize table $TABLENAME;
mysql> optimize table $TABLENAME;
mysql> optimize table $TABLENAME;
...
mysql> use nagiosql;
List your tables
mysql> show tables;
then
mysql> optimize table $TABLENAME;
mysql> optimize table $TABLENAME;
mysql> optimize table $TABLENAME;
...
Do this for all tables.
If you can stop the service for a couple of minutes, do it via the Nagios XI script. If you can't until a later time, do it online, but expect the interface to be a bit slow until queries are re-run. It may also be beneficial to flush your query cache:
mysql> FLUSH QUERY CACHE;
http://assets.nagios.com/downloads/nagiosxi/docs/Repairing_The_Nagios_XI_Database.pdf
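Typing one statement per table gets tedious, so the list can be generated instead. The sketch below only builds the SQL text to paste into the mysql prompt; it does not connect to the database. The helper name and the table names are placeholders; use the names that `show tables;` reports.

```python
def optimize_statements(database, tables):
    """Build a USE statement followed by one OPTIMIZE TABLE per table name."""
    def quote(name):
        # Backtick-quote identifiers, doubling any embedded backtick.
        return "`" + name.replace("`", "``") + "`"

    stmts = ["USE {};".format(quote(database))]
    stmts += ["OPTIMIZE TABLE {};".format(quote(t)) for t in tables]
    return stmts

# Placeholder table names, not real Nagios XI tables.
for line in optimize_statements("nagios", ["table_a", "table_b"]):
    print(line)
```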
edited Mar 23 '15 at 1:36 by masegaloeh
answered Jul 10 '14 at 13:51 by user3258557
This is due to how the kernel calculates load. See the source:
https://github.com/torvalds/linux/blob/master/include/linux/sched/loadavg.h
and you will find something like this: #define LOAD_FREQ (5*HZ+1)
LOAD_FREQ is the interval at which the kernel samples the load. Note the extra tick: with HZ=1000 the interval is 5.001s rather than 5s, a shift of 0.001s per sample. So it takes 5 * 1000 samples of 5.001s each, i.e. 25005 seconds, for the sampling point to drift back to a multiple of 5 seconds. 25005 / 3600 is around 7 hours.
So I bet the system forks short tasks periodically, and they just get "caught" by the kernel's sampling every 7 hours.
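The arithmetic can be checked with a few lines of Python. This sketch assumes the periodic tasks fire on a multiple of 5 seconds and uses HZ=1000 as in the reasoning above; HZ is a kernel build option, so a second value is included only to show how the result would change. The function names are made up for illustration.

```python
from fractions import Fraction

def load_freq_ticks(hz):
    """LOAD_FREQ as defined in the kernel header: 5*HZ + 1 ticks."""
    return 5 * hz + 1

def realignment_seconds(hz):
    """Seconds until the sampling point has slipped by a full 5 s."""
    interval = Fraction(load_freq_ticks(hz), hz)  # seconds between samples
    drift = interval - 5                          # slip per sample, in seconds
    samples = 5 / drift                           # samples needed to slip 5 s
    return samples * interval

for hz in (1000, 250):  # example HZ values, not read from the running kernel
    secs = realignment_seconds(hz)
    print(hz, float(secs), round(float(secs) / 3600, 2))
```

With HZ=1000 this gives 25005 seconds, matching the figure above. A kernel built with a different HZ would realign on a different period, which is one way to test the theory.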
answered 2 mins ago by dennis.s (new contributor)