Mailing List Archive

Problem doing failover on Multi State instance, Failed Master is still considered as Master
Hello,

I am testing my config again, and I can't get a failover from a failed
Master to the Slave (the Slave doesn't become the Master).

I disconnected the network cable on the Master (node2). The Slave
(node1) has detected that node2 is down, but it hasn't promoted the
DRBD instances to the Master state. Why? Can you see any bad constraint?

These are my constraints:
<constraints>

<rsc_order id="mail-drbd-promote-then-mail-group"
first="Mail-drbd" first-action="promote" then="Mail" then-action="start"/>
<rsc_order id="montaxe-mail-then-anti-spam" first="Montaxe-mail"
first-action="start" then="Anti-Spam" then-action="start"/>
<rsc_order id="anti-spam-then-amavisd-new" first="Anti-Spam"
first-action="start" then="Amavisd-New" then-action="start"/>
<rsc_order id="amavisd-new-then-exim4" first="Amavisd-New"
first-action="start" then="Exim4" then-action="start"/>
<rsc_order id="exim4-then-courier-authdaemon" first="Exim4"
first-action="start" then="Courier-Authdaemon" then-action="start"/>
<rsc_order id="courier-authdaemon-then-courier-pop"
first="Courier-Authdaemon" first-action="start" then="Courier-POP3"
then-action="start"/>
<rsc_order id="courier-pop-then-IPaddr-Mail1"
first="Courier-POP3" first-action="start" then="IPaddr-Mail1"
then-action="start"/>
<rsc_order id="courier-pop-then-IPaddr-Mail2"
first="Courier-POP3" first-action="start" then="IPaddr-Mail2"
then-action="start"/>


<rsc_order id="samba-drbd-promote-then-samba-group"
first="Samba-drbd" first-action="promote" then="Samba" then-action="start"/>
<rsc_order id="samba-FileSystem-then-samba-service"
first="Montaxe-samba" first-action="start" then="Samba-Service"
then-action="start"/>
<rsc_order id="samba-service-then-IPaddr-Samba"
first="Samba-Service" first-action="start" then="IPaddr-Samba"
then-action="start"/>


<rsc_colocation id="mail_drbrd_rule" rsc="Mail"
with-rsc="Mail-drbd" with-rsc-role="Master" score="INFINITY"/>
<rsc_colocation id="samba_drbrd_rule" rsc="Samba"
with-rsc="Samba-drbd" with-rsc-role="Master" score="INFINITY"/>


<rsc_location id="mail-connectivity" rsc="Mail-drbd">
<rule id="mail-pingd-prefer-rule"
score-attribute="pingd" role="Master">
<expression id="mail-pingd-prefer"
attribute="pingd" operation="defined"/>
</rule>
</rsc_location>

<rsc_location id="samba-connectivity" rsc="Samba-drbd">
<rule id="samba-pingd-exclude-rule" score="-INFINITY" >
<expression id="samba-pingd-exclude"
attribute="pingd" operation="lt" value="2000"/>
</rule>
</rsc_location>


<rsc_location id="mail-primary-node" rsc="Mail-drbd">
<rule id="mail-preferred-primary-node" score="5000"
role="Master">
<expression attribute="#uname"
id="expression-mail-primary-node" operation="eq" value="node2"/>
</rule>
</rsc_location>

<rsc_location id="mail-secondary-node" rsc="Mail-drbd">
<rule id="mail-preferred-secondary-node" score="1000"
role="Master">
<expression attribute="#uname"
id="expression-mail-secondary-node" operation="eq" value="node1"/>
</rule>
</rsc_location>


<rsc_location id="samba-primary-node" rsc="Samba-drbd">
<rule id="samba-preferred-primary-node" score="INFINITY"
role="Master">
<expression attribute="#uname"
id="expression-samba-primary-node" operation="eq" value="node2"/>
</rule>
</rsc_location>
</constraints>

Thank you!

_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker
Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
On Wed, Nov 12, 2008 at 12:53, Adrian Chapela
<achapela.rexistros@gmail.com> wrote:
> Hello,
>
> I am testing my config again, and I can't get a failover from a failed
> Master to the Slave (the Slave doesn't become the Master).
>
> I disconnected the network cable on the Master (node2). The Slave (node1)
> has detected that node2 is down, but it hasn't promoted the DRBD instances
> to the Master state. Why?

How many nodes? OpenAIS or Heartbeat?
What does the current CIB look like (including status)?


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
Andrew Beekhof wrote:
> How many nodes? OpenAIS or Heartbeat?
>
Two nodes with Heartbeat.
> What does the current CIB look like (including status)?
>
I have attached the CIB to this mail.
Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
On Nov 12, 2008, at 4:04 PM, Adrian Chapela wrote:

>>> I am testing my config again, and I can't get a failover from a failed
>>> Master to the Slave (the Slave doesn't become the Master).
>>>
>>> I disconnected the network cable on the Master (node2). The Slave (node1)
>>> has detected that node2 is down, but it hasn't promoted the DRBD instances
>>> to the Master state. Why?

Based on the CIB you attached, I can see that the PE (policy engine) wants to
promote it, but it is waiting for stonith to complete.
However, you don't have any stonith resources defined, so it will wait
forever.
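
The waiting described above is tied to the stonith-enabled cluster option. As a minimal sketch of where that option lives in the CIB, assuming Pacemaker 1.0 syntax: the ids below are made up for illustration, and the attached CIB may set the option differently or simply rely on its default.

```xml
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Hypothetical ids. With fencing enabled, resources from an unreachable
         node are only recovered once that node has been fenced. -->
    <nvpair id="option-stonith-enabled" name="stonith-enabled" value="true"/>
  </cluster_property_set>
</crm_config>
```

With this option enabled and no stonith resource configured, the fencing of node2 can never complete, which matches the symptom reported in this thread.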

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
Andrew Beekhof wrote:
> Based on the CIB you attached, I can see that the PE wants to promote
> it, but is waiting for stonith to complete.
> But you don't have any stonith resources defined so it will wait forever.
OK, I had considered that as a possibility.

Now I have a different question: how can I set up stonith with only two nodes?

Thank you!
Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
On Nov 13, 2008, at 4:01 PM, Adrian Chapela wrote:

> OK, I had considered that as a possibility.
>
> Now I have a different question: how can I set up stonith with only two
> nodes?

The number of nodes isn't really important; the size of your
budget matters more :-)
Network power switches aren't all cheap.
Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
Adrian Chapela wrote:

> Now I have a different question: how can I set up stonith with only two nodes?

Stonith is not directly related to the number of nodes you have.
A configuration similar to [1] is working for me.

Though I do not know whether instance_attributes has been moved to
meta_attributes.

Maybe Andrew can explain that, as "Configuration Explained"
does not include stonith yet.
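
As a rough sketch of what such a stonith resource might look like in Pacemaker 1.0 syntax: plugin parameters are ordinary resource parameters, so they would be expected to stay in instance_attributes, while meta_attributes holds resource options such as target-role. Everything below is illustrative rather than taken from [1]. The ids, the PDU address and the SNMP community are placeholders, and the parameter names (hostlist, pduip, community) are assumed for the external/rackpdu plugin; running `stonith -t external/rackpdu -n` on the installed system should list the names the plugin really expects.

```xml
<primitive id="pdu-fence" class="stonith" type="external/rackpdu">
  <!-- Plugin parameters (assumed names, placeholder values) -->
  <instance_attributes id="pdu-fence-params">
    <nvpair id="pdu-fence-hostlist" name="hostlist" value="node1,node2"/>
    <nvpair id="pdu-fence-pduip" name="pduip" value="192.0.2.10"/>
    <nvpair id="pdu-fence-community" name="community" value="private"/>
  </instance_attributes>
</primitive>
```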

cheers,
raoul

[1] http://it-consultant.su/download/rackpdu.xml
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia@ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office@ipax.at
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________

Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
Raoul Bhatia [IPAX] wrote:
> Adrian Chapela wrote:
>
>> Now I have a different question: how can I set up stonith with only two nodes?
>>
I know the number of nodes is not the problem, but I don't want to depend on
additional hardware (a rack PDU). However, I don't know of another way to take a
node down when it has no network connectivity.

> Stonith is not directly related to the number of nodes you have.
> A configuration similar to [1] is working for me.
>
> Though I do not know whether instance_attributes has been moved to
> meta_attributes.
>
> Maybe Andrew can explain that, as "Configuration Explained"
> does not include stonith yet.
>
Could you recommend a good rack PDU? And a cheap one ;)
> cheers,
> raoul
>
> [1] http://it-consultant.su/download/rackpdu.xml
>


Re: Problem doing failover on Multi State instance, Failed Master is still considered as Master
Adrian Chapela wrote:
> Raoul Bhatia [IPAX] wrote:
>> Adrian Chapela wrote:
>>
>>> Now I have a different question: how can I set up stonith with only two
>>> nodes?
>>>
> I know the number of nodes is not the problem, but I don't want to depend on
> additional hardware (a rack PDU). However, I don't know of another way to take a
> node down when it has no network connectivity.

You can try using servers with a remote management card such as an iLO 2
(HP servers), but I think it's much harder to guarantee that such a
complex thing always works as expected.

>> Stonith is not directly related to the number of nodes you have.
>> A configuration similar to [1] is working for me.
>>
> Could you recommend a good rack PDU? And a cheap one ;)

We're using the APC ones. I once compiled a list of recommendations
from NANOG:
* APC
* http://www.apanet.pl/
* http://www.webpowerswitch.com/
* Emerson
* http://www.tripplite.com/
* http://www.baytech.net/
* Avocent/Cyclades
* http://www.audionics.co.uk/products/emu.htm
* http://web2.raritan.adxstudio.com/products/power-management/Dominion-PX/DPCR20A-32/
* http://www.racksolutions.co.uk/

cheers,
raoul