Mailing List Archive: Problem doing failover on Multi State instance, Failed Master is still considered as Master

Nov 12, 2008, 3:53 AM

Post #1 of 9

Hello,

I am testing my config again, and I can't get a failover from a failed
Master to the Slave to work (the Slave doesn't become the Master).

I disconnected the network cable on the Master (node2). The Slave
(node1) detected that node2 is down, but it hasn't promoted the
DRBD instances to Master state. Why? Can you see any bad constraint?

These are my constraints:
<constraints>

<rsc_order id="mail-drbd-promote-then-mail-group"
first="Mail-drbd" first-action="promote" then="Mail" then-action="start"/>
<rsc_order id="montaxe-mail-then-anti-spam" first="Montaxe-mail"
first-action="start" then="Anti-Spam" then-action="start"/>
<rsc_order id="anti-spam-then-amavisd-new" first="Anti-Spam"
first-action="start" then="Amavisd-New" then-action="start"/>
<rsc_order id="amavisd-new-then-exim4" first="Amavisd-New"
first-action="start" then="Exim4" then-action="start"/>
<rsc_order id="exim4-then-courier-authdaemon" first="Exim4"
first-action="start" then="Courier-Authdaemon" then-action="start"/>
<rsc_order id="courier-authdaemon-then-courier-pop"
first="Courier-Authdaemon" first-action="start" then="Courier-POP3"
then-action="start"/>
<rsc_order id="courier-pop-then-IPaddr-Mail1"
first="Courier-POP3" first-action="start" then="IPaddr-Mail1"
then-action="start"/>
<rsc_order id="courier-pop-then-IPaddr-Mail2"
first="Courier-POP3" first-action="start" then="IPaddr-Mail2"
then-action="start"/>

<rsc_order id="samba-drbd-promote-then-samba-group"
first="Samba-drbd" first-action="promote" then="Samba" then-action="start"/>
<rsc_order id="samba-FileSystem-then-samba-service"
first="Montaxe-samba" first-action="start" then="Samba-Service"
then-action="start"/>
<rsc_order id="samba-service-then-IPaddr-Samba"
first="Samba-Service" first-action="start" then="IPaddr-Samba"
then-action="start"/>

<rsc_colocation id="mail_drbrd_rule" rsc="Mail"
with-rsc="Mail-drbd" with-rsc-role="Master" score="INFINITY"/>
<rsc_colocation id="samba_drbrd_rule" rsc="Samba"
with-rsc="Samba-drbd" with-rsc-role="Master" score="INFINITY"/>

<rsc_location id="mail-connectivity" rsc="Mail-drbd">
<rule id="mail-pingd-prefer-rule"
score-attribute="pingd" role="Master">
<expression id="mail-pingd-prefer"
attribute="pingd" operation="defined"/>
</rule>
</rsc_location>

<rsc_location id="samba-connectivity" rsc="Samba-drbd">
<rule id="samba-pingd-exclude-rule" score="-INFINITY" >
<expression id="samba-pingd-exclude"
attribute="pingd" operation="lt" value="2000"/>
</rule>
</rsc_location>

<rsc_location id="mail-primary-node" rsc="Mail-drbd">
<rule id="mail-preferred-primary-node" score="5000"
role="Master">
<expression attribute="#uname"
id="expression-mail-primary-node" operation="eq" value="node2"/>
</rule>
</rsc_location>

<rsc_location id="mail-secondary-node" rsc="Mail-drbd">
<rule id="mail-preferred-secondary-node" score="1000"
role="Master">
<expression attribute="#uname"
id="expression-mail-secondary-node" operation="eq" value="node1"/>
</rule>
</rsc_location>

<rsc_location id="samba-primary-node" rsc="Samba-drbd">
<rule id="samba-preferred-primary-node" score="INFINITY"
role="Master">
<expression attribute="#uname"
id="expression-samba-primary-node" operation="eq" value="node2"/>
</rule>
</rsc_location>
</constraints>

Thank you!

_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

beekhof at gmail

Nov 12, 2008, 6:52 AM

Post #2 of 9

On Wed, Nov 12, 2008 at 12:53, Adrian Chapela
<achapela.rexistros@gmail.com> wrote:
> Hello,
>
> I am testing my config again, and I can't do a failover between a failed
> Master to the Slave (Slave didn't become the Master)
>
> I disconnected the network cable on the Master (node2). The Slave (node1)
> have detected that node2 is down, but it hasn't promoted the DRBD instances
> to Master state. Why ?

How many nodes? OpenAIS or Heartbeat?
What does the current CIB look like (including status)?


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

achapela.rexistros at gmail

Nov 12, 2008, 7:04 AM

Post #3 of 9

Andrew Beekhof wrote:
> On Wed, Nov 12, 2008 at 12:53, Adrian Chapela
> <achapela.rexistros@gmail.com> wrote:
>
>> Hello,
>>
>> I am testing my config again, and I can't do a failover between a failed
>> Master to the Slave (Slave didn't become the Master)
>>
>> I disconnected the network cable on the Master (node2). The Slave (node1)
>> have detected that node2 is down, but it hasn't promoted the DRBD instances
>> to Master state. Why ?
>>
>
> How many nodes? OpenAIS or Heartbeat?
>
Two nodes with Heartbeat.
> What does the current CIB look like (including status)?
>
I have attached the CIB to this mail.

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

beekhof at gmail

Nov 13, 2008, 5:21 AM

Post #4 of 9

On Nov 12, 2008, at 4:04 PM, Adrian Chapela wrote:

>>> I am testing my config again, and I can't do a failover between a
>>> failed
>>> Master to the Slave (Slave didn't become the Master)
>>>
>>> I disconnected the network cable on the Master (node2). The Slave
>>> (node1)
>>> have detected that node2 is down, but it hasn't promoted the DRBD
>>> instances
>>> to Master state. Why ?

Based on the CIB you attached, I can see that the PE wants to promote
it, but is waiting for stonith to complete.
But you don't have any stonith resources defined so it will wait
forever.
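
A minimal sketch of the cluster option involved, assuming the Pacemaker 1.0
option name stonith-enabled; the ids are hypothetical. With this option set to
true and no stonith resource in the resources section, fencing of the lost
node can never be confirmed, so the promote stays blocked. Setting it to false
lets the promote go ahead, but with DRBD that risks two Masters if the other
node is in fact still running, so treat it as something for a test setup only:

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Assumption: test cluster only. "false" tells the cluster not to wait
         for fencing before recovering resources from a lost node. -->
    <nvpair id="option-stonith-enabled" name="stonith-enabled" value="false"/>
  </cluster_property_set>
</crm_config>
```

For production use, the safer route is to leave the option at true and define
a real stonith resource for each node.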


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

achapela.rexistros at gmail

Nov 13, 2008, 7:01 AM

Post #5 of 9

Andrew Beekhof wrote:
>
> On Nov 12, 2008, at 4:04 PM, Adrian Chapela wrote:
>
>>>> I am testing my config again, and I can't do a failover between a
>>>> failed
>>>> Master to the Slave (Slave didn't become the Master)
>>>>
>>>> I disconnected the network cable on the Master (node2). The Slave
>>>> (node1)
>>>> have detected that node2 is down, but it hasn't promoted the DRBD
>>>> instances
>>>> to Master state. Why ?
>
> Based on the CIB you attached, I can see that the PE wants to promote
> it, but is waiting for stonith to complete.
> But you don't have any stonith resources defined so it will wait forever.
OK, I had thought of this as a possibility.

Now I have a different question: how can I set up stonith with only two nodes?

Thank you!

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

beekhof at gmail

Nov 13, 2008, 7:33 AM

Post #6 of 9

On Nov 13, 2008, at 4:01 PM, Adrian Chapela wrote:

> Andrew Beekhof wrote:
>>
>> On Nov 12, 2008, at 4:04 PM, Adrian Chapela wrote:
>>
>>>>> I am testing my config again, and I can't do a failover between
>>>>> a failed
>>>>> Master to the Slave (Slave didn't become the Master)
>>>>>
>>>>> I disconnected the network cable on the Master (node2). The
>>>>> Slave (node1)
>>>>> have detected that node2 is down, but it hasn't promoted the
>>>>> DRBD instances
>>>>> to Master state. Why ?
>>
>> Based on the CIB you attached, I can see that the PE wants to
>> promote it, but is waiting for stonith to complete.
>> But you don't have any stonith resources defined so it will wait
>> forever.
> OK, I thinked in this as a possibility.
>
> Now, my question is other.. how can I have a stonith with only two
> nodes ?

The number of nodes isn't really important; what matters more is the size of
your budget :-)
Network power switches aren't all cheap.

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

r.bhatia at ipax

Nov 13, 2008, 8:00 AM

Post #7 of 9

Adrian Chapela wrote:

> Now, my question is other.. how can I have a stonith with only two nodes ?

stonith is not directly related to the number of nodes you have.
A configuration similar to [1] is working for me.

Though I do not know whether instance_attributes has been moved to
meta_attributes.

Maybe Andrew can explain that, as "Configuration Explained"
does not include stonith yet.
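
For illustration, a sketch of what a rack PDU stonith resource might look like
in Pacemaker 1.0 syntax. This is not the content of [1]. The ids, the PDU
address and the SNMP community are hypothetical, and the parameter names
(hostlist, pduip, community) are what I recall for the external/rackpdu
plugin, so check them with "stonith -t external/rackpdu -n" before relying on
them. As far as I know, agent parameters still go in instance_attributes,
while meta_attributes holds resource options such as target-role:

```xml
<clone id="Fencing-clone">
  <primitive id="Fencing-rackpdu" class="stonith" type="external/rackpdu">
    <instance_attributes id="Fencing-rackpdu-params">
      <!-- Assumption: node names as used in the constraints of this thread -->
      <nvpair id="Fencing-rackpdu-hostlist" name="hostlist" value="node1 node2"/>
      <!-- Hypothetical PDU address and SNMP write community -->
      <nvpair id="Fencing-rackpdu-pduip" name="pduip" value="192.0.2.10"/>
      <nvpair id="Fencing-rackpdu-community" name="community" value="private"/>
    </instance_attributes>
  </primitive>
</clone>
```

The clone is there so that whichever node survives has a running instance of
the device and can power off the other one.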

cheers,
raoul

[1] http://it-consultant.su/download/rackpdu.xml
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia@ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office@ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

achapela.rexistros at gmail

Nov 13, 2008, 8:12 AM

Post #8 of 9

Raoul Bhatia [IPAX] wrote:
> Adrian Chapela wrote:
>
>
>> Now, my question is other.. how can I have a stonith with only two nodes ?
>>
I know the number of nodes is not a problem, but I don't want to depend on
additional hardware (a rack PDU). Still, I don't know of another way to take
down a node that has no network connectivity.

> stonith is not directly related to the amount of nodes you have.
> the configuration similar to [1] is working for me.
>
> thou i do not know if instance_attributes has been shifted to
> meta_attributes.
>
> maybe andrew can explain that, as the configuration explained
> does not include stonith yet.
>
Could you recommend a good rack PDU? And a cheap one ;) ?
> cheers,
> raoul
>
> [1] http://it-consultant.su/download/rackpdu.xml
>


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master

r.bhatia at ipax

Nov 13, 2008, 9:38 AM

Post #9 of 9

Adrian Chapela wrote:
> Raoul Bhatia [IPAX] wrote:
>> Adrian Chapela wrote:
>>
>>
>>> Now, my question is other.. how can I have a stonith with only two
>>> nodes ?
>>>
> I know the number of nodes is not a problem but I don't want depend of
> another hardware (RACKPDU), but I don't know another solution to take a
> node down without network connectivity.

You can try using servers with a remote management card such as an iLO 2
(HP servers). But I think it's much harder to guarantee that such a
complex thing always works as expected.
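
A rough sketch of how such a card could be configured as a stonith device,
assuming the external/riloe plugin. The ids, address and credentials are
hypothetical, and the parameter names are from memory, so verify them with
"stonith -t external/riloe -n". Each node needs its own device; only the one
that fences node2 is shown:

```xml
<primitive id="Fencing-ilo-node2" class="stonith" type="external/riloe">
  <instance_attributes id="Fencing-ilo-node2-params">
    <!-- The node this iLO card can power off -->
    <nvpair id="Fencing-ilo-node2-hostlist" name="hostlist" value="node2"/>
    <!-- Hypothetical iLO address and login -->
    <nvpair id="Fencing-ilo-node2-host" name="ilo_hostname" value="192.0.2.22"/>
    <nvpair id="Fencing-ilo-node2-user" name="ilo_user" value="fenceuser"/>
    <nvpair id="Fencing-ilo-node2-pass" name="ilo_password" value="secret"/>
  </instance_attributes>
</primitive>

<!-- Keep the device off the node it is meant to fence -->
<rsc_location id="Fencing-ilo-node2-not-on-node2" rsc="Fencing-ilo-node2"
              node="node2" score="-INFINITY"/>
```

Note that if the card draws power from the server itself, a complete power
loss on that server also takes the fencing device away.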

>> stonith is not directly related to the amount of nodes you have.
>> the configuration similar to [1] is working for me.
>>
> Could you recommend me a good rackpdu ? and a cheap pdu ;) ?

We're using the APC ones. I once compiled a list of recommendations
from NANOG:
* APC
* http://www.apanet.pl/
* http://www.webpowerswitch.com/
* Emerson
* http://www.tripplite.com/
* http://www.baytech.net/
* Avocent/Cyclades
* http://www.audionics.co.uk/products/emu.htm
* http://web2.raritan.adxstudio.com/products/power-management/Dominion-PX/DPCR20A-32/
* http://www.racksolutions.co.uk/

cheers,
raoul