Mailing List Archive

resource not restarted due to score value
Hi,

In our two-node cluster we have a clone resource, StorGr1, and two primitive resources,
DummyVM1 and DummyVM2.
StorGr1 should be started before DummyVM1 and DummyVM2 due to order constraints.
The StorGr1 clone was started on both cluster nodes, goat1 and sheep1.
DummyVM1 and DummyVM2 were both started on node goat1.

Then we stopped StorGr1 on node goat1. Due to the order constraints, we expected DummyVM1
and DummyVM2 to be restarted on the second node, sheep1.
However, only DummyVM2 was restarted on sheep1;
DummyVM1 was stopped and remained in the stopped state:

Clone Set: StorGr1-clone [StorGr1]
    Started: [ sheep1 ]
    Stopped: [ StorGr1:1 ]
DummyVM1        (ocf::pacemaker:Dummy): Stopped
DummyVM2        (ocf::pacemaker:Dummy): Started sheep1

The difference between them: DummyVM1 has a higher allocation score for goat1, while
DummyVM2 has a higher allocation score for sheep1.
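(For reference, the allocation scores mentioned here can be read from the live CIB with ptest, as is done later in this thread; the grep filter is only an illustration:)

ptest -sL | grep -E 'DummyVM|StorGr1'    # show allocation scores for these resources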

How can we achieve a restart of the primitive resources independently of the
allocation score value?
Do we need other or additional constraints?

Best regards,
Armin Haussecker

Extract from CIB:
primitive DummyVM1 ocf:pacemaker:Dummy \
        op monitor interval="60s" timeout="60s" \
        op start on-fail="restart" interval="0" \
        op stop on-fail="ignore" interval="0" \
        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
primitive DummyVM2 ocf:pacemaker:Dummy \
        op monitor interval="60s" timeout="60s" \
        op start on-fail="restart" interval="0" \
        op stop on-fail="ignore" interval="0" \
        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
primitive StorGr1 ocf:heartbeat:Dummy \
        op monitor on-fail="restart" interval="60s" \
        op start on-fail="restart" interval="0" \
        op stop on-fail="ignore" interval="0" \
        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
clone StorGr1-clone StorGr1 \
        meta target-role="Started" interleave="true" ordered="true"

location score-DummyVM1 DummyVM1 400: goat1
location score-DummyVM2 DummyVM2 400: sheep1

order start-DummyVM1-after-StorGr1-clone inf: StorGr1-clone DummyVM1
order start-DummyVM2-after-StorGr1-clone inf: StorGr1-clone DummyVM2
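(For reference, if additional constraints were to be tried, the usual companion to an order constraint against a clone is a colocation constraint, which ties each VM to a node where a clone instance is actually running. A sketch in crm shell syntax, with invented constraint names; the replies below suggest it should not actually be required:)

colocation DummyVM1-with-StorGr1-clone inf: DummyVM1 StorGr1-clone
colocation DummyVM2-with-StorGr1-clone inf: DummyVM2 StorGr1-clone

(With a mandatory colocation like this, the 400-point location preference alone should no longer be able to keep a VM on a node whose StorGr1 instance is stopped.)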

Re: resource not restarted due to score value
On Fri, Feb 4, 2011 at 12:02 PM, Haussecker, Armin
<armin.haussecker@ts.fujitsu.com> wrote:
> How can we achieve a restart of the primitive resources independently of the
> allocation score value?
> Do we need other or additional constraints?

Shouldn't need to.
Please attach the result of cibadmin -Ql when the cluster is in this state.

Also some indication of what version you're running would be helpful.

Re: resource not restarted due to score value
Hi,

we are running SLES 11 SP1 with pacemaker 1.1.2-0.7.1 and corosync 1.2.6-0.2.2.

Attached please find:
cibadmin -Ql   before stopping StorGr1 on node goat1   (diag.before)
cibadmin -Ql   after stopping StorGr1 on node goat1    (diag.after)
crm_mon        after stopping StorGr1 on node goat1    (diag.crm_mon)
ptest -sL      after stopping StorGr1 on node goat1    (diag.ptest)
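(For reference, the attached files correspond roughly to output collected along these lines; the one-shot flag for crm_mon and the redirection targets are assumptions that simply mirror the attachment names:)

cibadmin -Ql > diag.before     # full live CIB before stopping StorGr1 on goat1
# ... stop StorGr1 on goat1 ...
cibadmin -Ql > diag.after      # full live CIB after the stop
crm_mon -1   > diag.crm_mon    # one-shot snapshot of the cluster status
ptest -sL    > diag.ptest      # allocation scores computed from the live CIB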

Regards,
Armin Haussecker


-----Original Message-----
From: linux-ha-bounces@lists.linux-ha.org [mailto:linux-ha-bounces@lists.linux-ha.org] On Behalf Of Andrew Beekhof
Sent: Monday, February 07, 2011 8:46 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] resource not restarted due to score value

On Fri, Feb 4, 2011 at 12:02 PM, Haussecker, Armin
<armin.haussecker@ts.fujitsu.com> wrote:
> Hi,
>
> in our 2-node-cluster we have a clone resource StorGr1 and two primitive resources
> DummyVM1 and DummyVM2.
> StorGr1 should be started before DummyVM1 and DummyVM2 due to order constraints.
> StorGr1 clone was started on both cluster nodes goat1 and sheep1.
> DummyVM1 and DummyVM2 were both started on node goat1.
>
> Then we stopped StorGr1 on node goat1. We expected a restart of DummyVM1 and
> DummyVM2 on the second node sheep1 due to the order constraints.
> But only DummyVM2 was restarted on the second node sheep1.
> DummyVM1 was stopped and remained in the stopped state:
>
> Clone Set: StorGr1-clone [StorGr1]
>     Started: [ sheep1 ]
>     Stopped: [ StorGr1:1 ]
> DummyVM1        (ocf::pacemaker:Dummy): Stopped
> DummyVM2        (ocf::pacemaker:Dummy): Started sheep1
>
> Difference: DummyVM1 has a higher allocation score value for goat1 and
> DummyVM2 has a higher allocation score value for sheep1.
>
> How can we achieve a restart of the primitive resources independently of the
> allocation score value ?
> Do we need other or additional constraints ?

Shouldn't need to.
Please attach the result of cibadmin -Ql when the cluster is in this state.

Also some indication of what version you're running would be helpful.

>
> Best regards,
> Armin Haussecker
>
> Extract from CIB:
> primitive DummyVM1 ocf:pacemaker:Dummy \
>        op monitor interval="60s" timeout="60s" \
>        op start on-fail="restart" interval="0" \
>        op stop on-fail="ignore" interval="0" \
>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
> primitive DummyVM2 ocf:pacemaker:Dummy \
>        op monitor interval="60s" timeout="60s" \
>        op start on-fail="restart" interval="0" \
>        op stop on-fail="ignore" interval="0" \
>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
> primitive StorGr1 ocf:heartbeat:Dummy \
>        op monitor on-fail="restart" interval="60s" \
>        op start on-fail="restart" interval="0" \
>        op stop on-fail="ignore" interval="0" \
>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
> clone StorGr1-clone StorGr1 \
>        meta target-role="Started" interleave="true" ordered="true"
>
> location score-DummyVM1 DummyVM1 400: goat1
> location score-DummyVM2 DummyVM2 400: sheep1
>
> order start-DummyVM1-after-StorGr1-clone inf: StorGr1-clone DummyVM1
> order start-DummyVM2-after-StorGr1-clone inf: StorGr1-clone DummyVM2
>
>
>
>
>
>
>
>
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Re: resource not restarted due to score value
Happily this appears to be fixed in 1.1.5 (which I believe should be
available for SLES "soon").
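(A quick way to check whether a given node already carries that release, for example on SLES:)

rpm -q pacemaker corosync    # installed package versions
crm_mon --version            # version reported by the cluster tools themselves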

On Mon, Feb 7, 2011 at 9:17 AM, Haussecker, Armin
<armin.haussecker@ts.fujitsu.com> wrote:
> we are running SLES 11 SP1 with pacemaker 1.1.2-0.7.1 and corosync 1.2.6-0.2.2.
>
> Attached please find
> cibadmin -Ql before stopping StorGr1 on node goat1 (diag.before)
> cibadmin -Ql after stopping StorGr1 on node goat1  (diag.after)
> crm_mon      after stopping StorGr1 on node goat1  (diag.crm_mon)
> ptest -sL    after stopping StorGr1 on node goat1  (diag.ptest)
>
> Regards,
> Armin Haussecker
>
>
> -----Original Message-----
> From: linux-ha-bounces@lists.linux-ha.org [mailto:linux-ha-bounces@lists.linux-ha.org] On Behalf Of Andrew Beekhof
> Sent: Monday, February 07, 2011 8:46 AM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] resource not restarted due to score value
>
> On Fri, Feb 4, 2011 at 12:02 PM, Haussecker, Armin
> <armin.haussecker@ts.fujitsu.com> wrote:
>> Hi,
>>
>> in our 2-node-cluster we have a clone resource StorGr1 and two primitive resources
>> DummyVM1 and DummyVM2.
>> StorGr1 should be started before DummyVM1 and DummyVM2 due to order constraints.
>> StorGr1 clone was started on both cluster nodes goat1 and sheep1.
>> DummyVM1 and DummyVM2 were both started on node goat1.
>>
>> Then we stopped StorGr1 on node goat1. We expected a restart of DummyVM1 and
>> DummyVM2 on the second node sheep1 due to the order constraints.
>> But only DummyVM2 was restarted on the second node sheep1.
>> DummyVM1 was stopped and remained in the stopped state:
>>
>> Clone Set: StorGr1-clone [StorGr1]
>>     Started: [ sheep1 ]
>>     Stopped: [ StorGr1:1 ]
>> DummyVM1        (ocf::pacemaker:Dummy): Stopped
>> DummyVM2        (ocf::pacemaker:Dummy): Started sheep1
>>
>> Difference: DummyVM1 has a higher allocation score value for goat1 and
>> DummyVM2 has a higher allocation score value for sheep1.
>>
>> How can we achieve a restart of the primitive resources independently of the
>> allocation score value ?
>> Do we need other or additional constraints ?
>
> Shouldn't need to.
> Please attach the result of cibadmin -Ql when the cluster is in this state.
>
> Also some indication of what version you're running would be helpful.
>
>>
>> Best regards,
>> Armin Haussecker
>>
>> Extract from CIB:
>> primitive DummyVM1 ocf:pacemaker:Dummy \
>>        op monitor interval="60s" timeout="60s" \
>>        op start on-fail="restart" interval="0" \
>>        op stop on-fail="ignore" interval="0" \
>>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
>> primitive DummyVM2 ocf:pacemaker:Dummy \
>>        op monitor interval="60s" timeout="60s" \
>>        op start on-fail="restart" interval="0" \
>>        op stop on-fail="ignore" interval="0" \
>>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
>> primitive StorGr1 ocf:heartbeat:Dummy \
>>        op monitor on-fail="restart" interval="60s" \
>>        op start on-fail="restart" interval="0" \
>>        op stop on-fail="ignore" interval="0" \
>>        meta is-managed="true" resource-stickiness="1000" migration-threshold="2"
>> clone StorGr1-clone StorGr1 \
>>        meta target-role="Started" interleave="true" ordered="true"
>>
>> location score-DummyVM1 DummyVM1 400: goat1
>> location score-DummyVM2 DummyVM2 400: sheep1
>>
>> order start-DummyVM1-after-StorGr1-clone inf: StorGr1-clone DummyVM1
>> order start-DummyVM2-after-StorGr1-clone inf: StorGr1-clone DummyVM2
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Linux-HA mailing list
>> Linux-HA@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems