Monitoring pacemaker

How do you monitor whether Pacemaker is still working, i.e. that all nodes are online and none is in standby, offline, or down?

Monitoring the services isn't the problem; that can be done directly. But I'm still not sure whether I should monitor the status of the CRM and, if so, how to do it.

asked May 7 '12 at 6:40

Comradin

There's some curses-based management command. I'd check to see what options are available on that command, if it'll just return with an exit code, etc., or at least parseable text. I assume you want to see nodes that are online/idle/whatever.

– cjc
May 7 '12 at 10:22

exchange.nagios.org/directory/Plugins/… or write a Nagios plugin to parse crm_mon -1 results.

– quanta
Jun 30 '12 at 5:12
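
To make the "parse crm_mon -1" suggestion concrete, here is a minimal sketch of such a Nagios-style check. It is not the plugin from the linked directory. The line shapes it looks for ("Online: [ ... ]", "OFFLINE: [ ... ]", "Node <name>: standby") are assumptions about the text output, which differs between Pacemaker versions, and the names `evaluate` and `run_check` are made up for illustration.

```python
import re
import subprocess

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def _names(line):
    """Return the node names listed between '[' and ']' on a crm_mon line."""
    match = re.search(r"\[(.*?)\]", line)
    return match.group(1).split() if match else []


def evaluate(crm_mon_text):
    """Map `crm_mon -1` text to a (nagios_code, message) pair.

    Assumed line shapes (they differ between Pacemaker versions):
      Online: [ node1 node2 ]
      OFFLINE: [ node3 ]
      Node node2: standby
    """
    online, offline, standby = [], [], []
    for raw in crm_mon_text.splitlines():
        line = raw.strip()
        if line.startswith("Online:"):
            online += _names(line)
        elif line.startswith("OFFLINE:"):
            offline += _names(line)
        elif line.startswith("Node ") and "standby" in line.lower():
            # "Node node2: standby" -> "node2"
            standby.append(line.split()[1].rstrip(":"))
    if offline:
        return CRITICAL, "CRITICAL: offline nodes: " + ", ".join(offline)
    if standby:
        return WARNING, "WARNING: standby nodes: " + ", ".join(standby)
    if not online:
        return UNKNOWN, "UNKNOWN: no node status found in crm_mon output"
    return OK, "OK: online nodes: " + ", ".join(online)


def run_check():
    """Run `crm_mon -1` once and evaluate its output."""
    try:
        result = subprocess.run(["crm_mon", "-1"], capture_output=True,
                                text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return UNKNOWN, "UNKNOWN: could not run crm_mon: %s" % exc
    if result.returncode != 0:
        return CRITICAL, "CRITICAL: crm_mon exited with %d" % result.returncode
    return evaluate(result.stdout)
```

As a plugin, you would print the message from `run_check()` and exit with the returned code. Running the check on every node should also surface a split-brain, since each partition would then report the other side as offline. Check the patterns against the output of your own Pacemaker version before relying on it.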

monitoring crm pacemaker

1 Answer
By default, if the CRM has a hissy-fit you'll know about it because the machine reboots. We run a Nagios check at work that does all sorts of checks on the Pacemaker config in general (making sure is-managed-default isn't false, that no resources have a non-zero failcount, all that kind of thing). I don't know where we got it from, but presumably it's floating around the 'tubes somewhere.

answered May 7 '12 at 10:15

womble♦
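
The check mentioned in the answer isn't available here, so the following is only a rough sketch of the failcount part, not the check womble runs. It assumes the text comes from `crm_mon -1 -f` (the `-f` flag adds fail counts) and that failures appear as lines containing `fail-count=N` after a `resource:` prefix; that layout, the function name `failed_resources`, and the resource names in the example are assumptions.

```python
import re

# Assumed line shape from `crm_mon -1 -f` (varies between versions):
#    p_mysql: migration-threshold=1000000 fail-count=3 last-failure='...'
FAILCOUNT_RE = re.compile(r"^\s*\*?\s*(\S+?):\s.*\bfail-count=(\w+)")


def failed_resources(crm_mon_text):
    """Return {resource: fail_count} for every non-zero fail count found.

    The count is kept as a string because it can also be reported as
    INFINITY. A resource failing on several nodes keeps only the last
    value seen, which is enough to raise an alert.
    """
    failures = {}
    for line in crm_mon_text.splitlines():
        match = FAILCOUNT_RE.match(line)
        if match and match.group(2) != "0":
            failures[match.group(1)] = match.group(2)
    return failures
```

An empty result would map to OK in a Nagios check and anything else to WARNING or CRITICAL, depending on how strict you want to be.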

Our service provider runs MySQL master-master nodes, with basic Heartbeat failover. We're on the mailing list for Heartbeat messages, but we've also set up a Nagios check that looks at the MAC for the HA IP and the MAC for the standby master's IP. If they match, then we've missed an email and the IP has floated to the standby.

– cjc
May 7 '12 at 10:24

To be blunt, if you care which machine on a cluster a service is running, ur doin it rong.

– womble♦
May 7 '12 at 10:39

We were running long-running queries on the secondary master. It was a long-ago cost-saving decision to do that, instead of a proper reporting slave.

– cjc
May 7 '12 at 10:42

You'd be better off adding a couple of lines of code to whatever runs the reporting queries to detect the failure and try again.

– womble♦
May 7 '12 at 10:47
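
As an illustration of the "detect the failure and try again" idea, here is a small sketch of a retry wrapper. Nothing in the thread says what runs the reporting queries, so `run_with_retry`, its parameters, and the choice of `ConnectionError` as the retryable error are all assumptions; a real database driver would raise its own exception type.

```python
import time


def run_with_retry(run_query, attempts=3, delay_seconds=5.0,
                   retry_on=(ConnectionError,)):
    """Call run_query(), retrying when it raises one of the retry_on errors.

    run_query is any zero-argument callable that opens its own connection,
    so a retry after a failover reconnects to whichever node now holds the
    service address.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)  # give the failover time to finish
    raise last_error
```

With this kind of wrapper the report job does not need to know which node it is talking to, which is the point of the comment above.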

@womble, I'm not concerned with the services. I'm just interested in whether Pacemaker thinks all nodes are still fine, e.g. whether a node is in standby or offline or, in the worst case, a split-brain has happened.

– Comradin
May 7 '12 at 14:03
