Mailing List Archive

Re: [corosync] Unable to join cluster from a newly-installed centos 6.2 node
On 03/02/2012 05:29 PM, Diego Lima wrote:
> Hello,
>
> I've recently installed Corosync on two CentOS 6.2 machines. One is
> working fine but on the other machine I've been unable to connect to
> the cluster. On the logs I can see this whenever I start
> corosync+pacemaker:
>
> Mar 2 21:33:16 no2 corosync[15924]: [MAIN ] Corosync Cluster
> Engine ('1.4.1'): started and ready to provide service.
> Mar 2 21:33:16 no2 corosync[15924]: [MAIN ] Corosync built-in
> features: nss dbus rdma snmp
> Mar 2 21:33:16 no2 corosync[15924]: [MAIN ] Successfully read main
> configuration file '/etc/corosync/corosync.conf'.
> Mar 2 21:33:16 no2 corosync[15924]: [TOTEM ] Initializing transport
> (UDP/IP Multicast).
> Mar 2 21:33:16 no2 corosync[15924]: [TOTEM ] Initializing
> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Mar 2 21:33:16 no2 corosync[15924]: [TOTEM ] The network interface
> [172.16.100.2] is now up.
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> process_ais_conf: Reading configure
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_init: Local handle: 4730966301143465987 for logging
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_next: Processing additional logging options...
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Found 'off' for option: debug
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Found 'no' for option: to_logfile
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Found 'yes' for option: to_syslog
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Found 'daemon' for option: syslog_facility
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_init: Local handle: 7739444317642555396 for quorum
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_next: No additional configuration supplied for: quorum
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> No default for option: provider
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_init: Local handle: 5650605097994944517 for service
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> config_find_next: Processing additional service options...
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Found '0' for option: ver
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Defaulting to 'pcmk' for option: clustername
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Defaulting to 'no' for option: use_logd
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: get_config_opt:
> Defaulting to 'no' for option: use_mgmtd
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: pcmk_startup:
> CRM: Initialized
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] Logging: Initialized
> pcmk_startup
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: pcmk_startup:
> Maximum core file size is: 18446744073709551615
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: pcmk_startup: Service: 10
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: pcmk_startup:
> Local hostname: no2.informidia.int
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> pcmk_update_nodeid: Local node id: 40112300
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: update_member:
> Creating entry for node 40112300 born on 0
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: update_member:
> 0x766520 Node 40112300 now known as no2.informidia.int (was: (null))
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: update_member:
> Node no2.informidia.int now has 1 quorum votes (was 0)
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: update_member:
> Node 40112300/no2.informidia.int is now: member
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15930 for process stonith-ng
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15931 for process cib
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15932 for process lrmd
> Mar 2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
> Added signal handler for signal 15
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: Invoked:
> /usr/lib64/heartbeat/stonithd
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: crm_log_init_worker:
> Changed active directory to /var/lib/heartbeat/cores/root
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info:
> G_main_add_SignalHandler: Added signal handler for signal 17
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: get_cluster_type:
> Cluster type is: 'openais'
> Mar 2 21:33:16 no2 stonith-ng: [15930]: notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info:
> init_ais_connection_classic: Creating connection to our Corosync
> plugin
> Mar 2 21:33:16 no2 cib: [15931]: info: crm_log_init_worker: Changed
> active directory to /var/lib/heartbeat/cores/hacluster
> Mar 2 21:33:16 no2 cib: [15931]: info: G_main_add_TriggerHandler:
> Added signal manual handler
> Mar 2 21:33:16 no2 cib: [15931]: info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> Mar 2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15933 for process attrd
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info:
> init_ais_connection_classic: AIS connection established
> Mar 2 21:33:16 no2 cib: [15931]: info: retrieveCib: Reading cluster
> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
> /var/lib/heartbeat/crm/cib.xml.sig)
> Mar 2 21:33:16 no2 lrmd: [15932]: info: enabling coredumps
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15934 for process pengine
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: get_ais_nodeid: Server
> details: id=40112300 uname=no2.informidia.int cname=pcmk
> Mar 2 21:33:16 no2 cib: [15931]: info: validate_with_relaxng:
> Creating RNG parser context
> Mar 2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
> Added signal handler for signal 10
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: spawn_child:
> Forked child 15935 for process crmd
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info:
> init_ais_connection_once: Connection to 'classic openais (with
> plugin)': established
> Mar 2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
> Added signal handler for signal 12
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> Pacemaker Cluster Manager 1.1.6
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: crm_new_peer: Node
> no2.informidia.int now has id: 40112300
> Mar 2 21:33:16 no2 lrmd: [15932]: info: Started.
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync extended virtual synchrony service
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: crm_new_peer: Node
> 40112300 is now known as no2.informidia.int
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync configuration service
> Mar 2 21:33:16 no2 stonith-ng: [15930]: info: main: Starting
> stonith-ng mainloop
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync cluster closed process group service v1.01
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync cluster config database access v1.01
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync profile loading service
> Mar 2 21:33:16 no2 corosync[15924]: [SERV ] Service engine loaded:
> corosync cluster quorum service v0.1
> Mar 2 21:33:16 no2 corosync[15924]: [MAIN ] Compatibility mode set
> to whitetank. Using V1 and V2 of the synchronization engine.
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] notice:
> pcmk_peer_update: Transitional membership event on ring 24: memb=0,
> new=0, lost=0
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] notice:
> pcmk_peer_update: Stable membership event on ring 24: memb=1, new=1,
> lost=0
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: NEW: no2.informidia.int 40112300
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: MEMB: no2.informidia.int 40112300
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: update_member:
> Node no2.informidia.int now has process list:
> 00000000000000000000000000111312 (1118994)
> Mar 2 21:33:16 no2 corosync[15924]: [TOTEM ] A processor joined or
> left the membership and a new membership was formed.
> Mar 2 21:33:16 no2 corosync[15924]: [CPG ] chosen downlist:
> sender r(0) ip(172.16.100.2) ; members(old:0 left:0)
> Mar 2 21:33:16 no2 corosync[15924]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Mar 2 21:33:16 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Recorded connection 0x775b60 for stonith-ng/15930
> Mar 2 21:33:16 no2 cib: [15931]: info: startCib: CIB Initialization
> completed successfully
> Mar 2 21:33:16 no2 cib: [15931]: info: get_cluster_type: Cluster type
> is: 'openais'
> Mar 2 21:33:16 no2 cib: [15931]: notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Mar 2 21:33:16 no2 cib: [15931]: info: init_ais_connection_classic:
> Creating connection to our Corosync plugin
> Mar 2 21:33:17 no2 cib: [15931]: info: init_ais_connection_classic:
> AIS connection established
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Recorded connection 0x7715b0 for cib/15931
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Sending membership update 24 to cib
> Mar 2 21:33:17 no2 cib: [15931]: info: get_ais_nodeid: Server
> details: id=40112300 uname=no2.informidia.int cname=pcmk
> Mar 2 21:33:17 no2 cib: [15931]: info: init_ais_connection_once:
> Connection to 'classic openais (with plugin)': established
> Mar 2 21:33:17 no2 cib: [15931]: info: crm_new_peer: Node
> no2.informidia.int now has id: 40112300
> Mar 2 21:33:17 no2 cib: [15931]: info: crm_new_peer: Node 40112300 is
> now known as no2.informidia.int
> Mar 2 21:33:17 no2 cib: [15931]: info: cib_init: Starting cib mainloop
> Mar 2 21:33:17 no2 cib: [15931]: info: ais_dispatch_message:
> Membership 24: quorum still lost
> Mar 2 21:33:17 no2 cib: [15931]: info: crm_update_peer: Node
> no2.informidia.int: id=40112300 state=member (new) addr=r(0)
> ip(172.16.100.2) (new) votes=1 (new) born=0 seen=24
> proc=00000000000000000000000000111312 (new)
> Mar 2 21:33:17 no2 crmd: [15935]: info: Invoked: /usr/lib64/heartbeat/crmd
> Mar 2 21:33:17 no2 attrd: [15933]: info: Invoked: /usr/lib64/heartbeat/attrd
> Mar 2 21:33:17 no2 pengine: [15934]: info: Invoked:
> /usr/lib64/heartbeat/pengine
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_log_init_worker: Changed
> active directory to /var/lib/heartbeat/cores/hacluster
> Mar 2 21:33:17 no2 crmd: [15935]: info: main: CRM Hg Version:
> a02c0f19a00c1eb2527ad38f146ebc0834814558
> Mar 2 21:33:17 no2 attrd: [15933]: notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Mar 2 21:33:17 no2 crmd: [15935]: info: crmd_init: Starting crmd
> Mar 2 21:33:17 no2 crmd: [15935]: info: G_main_add_SignalHandler:
> Added signal handler for signal 17
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Recorded connection 0x77a380 for attrd/15933
> Mar 2 21:33:17 no2 attrd: [15933]: notice: main: Starting mainloop...
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] notice:
> pcmk_peer_update: Transitional membership event on ring 28: memb=1,
> new=0, lost=0
> Mar 2 21:33:17 no2 crmd: [15935]: info: do_cib_control: CIB
> connection established
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: memb: no2.informidia.int 40112300
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] notice:
> pcmk_peer_update: Stable membership event on ring 28: memb=2, new=1,
> lost=0
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> Creating entry for node 23335084 born on 28
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> Node 23335084/unknown is now: member
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: NEW: .pending. 23335084
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: MEMB: .pending. 23335084
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> pcmk_peer_update: MEMB: no2.informidia.int 40112300
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> send_member_notification: Sending membership update 28 to 1 children
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> 0x766520 Node 40112300 ((null)) born on: 28
> Mar 2 21:33:17 no2 crmd: [15935]: info: get_cluster_type: Cluster
> type is: 'openais'
> Mar 2 21:33:17 no2 crmd: [15935]: notice: crm_cluster_connect:
> Connecting to cluster infrastructure: classic openais (with plugin)
> Mar 2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_classic:
> Creating connection to our Corosync plugin
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 corosync[15924]: [TOTEM ] A processor joined or
> left the membership and a new membership was formed.
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> 0x77a1b0 Node 23335084 (no1.informidia.int) born on: 16
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> 0x77a1b0 Node 23335084 now known as no1.informidia.int (was: (null))
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> Node no1.informidia.int now has process list:
> 00000000000000000000000000111312 (1118994)
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: update_member:
> Node no1.informidia.int now has 1 quorum votes (was 0)
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info:
> send_member_notification: Sending membership update 28 to 1 children
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] WARN:
> route_ais_message: Sending message to local.crmd failed: ipc delivery
> failed (rc=-2)
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_classic:
> AIS connection established
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Recorded connection 0x77ebf0 for crmd/15935
> Mar 2 21:33:17 no2 corosync[15924]: [pcmk ] info: pcmk_ipc:
> Sending membership update 28 to crmd
> Mar 2 21:33:17 no2 crmd: [15935]: info: get_ais_nodeid: Server
> details: id=40112300 uname=no2.informidia.int cname=pcmk
> Mar 2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_once:
> Connection to 'classic openais (with plugin)': established
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node
> no2.informidia.int now has id: 40112300
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node 40112300
> is now known as no2.informidia.int
> Mar 2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
> no2.informidia.int is now unknown
> Mar 2 21:33:17 no2 crmd: [15935]: info: do_ha_control: Connected to the cluster
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 corosync[15924]: [CPG ] chosen downlist:
> sender r(0) ip(172.16.100.1) ; members(old:1 left:0)
> Mar 2 21:33:17 no2 crmd: [15935]: info: do_started: Delaying start,
> no membership data (0000000000100000)
> Mar 2 21:33:17 no2 crmd: [15935]: info: crmd_init: Starting crmd's mainloop
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 corosync[15924]: [MAIN ] Completed service
> synchronization, ready to provide service.
> Mar 2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
> Shutdown escalation occurs after: 1200000ms
> Mar 2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
> Checking for expired actions every 900000ms
> Mar 2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
> Sending expected-votes=2 to corosync
> Mar 2 21:33:17 no2 crmd: [15935]: notice: ais_dispatch_message:
> Membership 28: quorum acquired
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node
> no1.informidia.int now has id: 23335084
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node 23335084
> is now known as no1.informidia.int
> Mar 2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
> no1.informidia.int is now unknown
> Mar 2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
> no1.informidia.int is now member (was unknown)
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_update_peer: Node
> no1.informidia.int: id=23335084 state=member (new) addr=r(0)
> ip(172.16.100.1) votes=1 born=16 seen=28
> proc=00000000000000000000000000111312
> Mar 2 21:33:17 no2 crmd: [15935]: notice: crmd_peer_update: Status
> update: Client no2.informidia.int/crmd now has status [online]
> (DC=<null>)
> Mar 2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
> no2.informidia.int is now member (was unknown)
> Mar 2 21:33:17 no2 crmd: [15935]: info: crm_update_peer: Node
> no2.informidia.int: id=40112300 state=member (new) addr=r(0)
> ip(172.16.100.2) (new) votes=1 (new) born=28 seen=28
> proc=00000000000000000000000000111312 (new)
> Mar 2 21:33:17 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:17 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:17 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
> Mar 2 21:33:17 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:17 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:17 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 crmd: [15935]: info: do_started: The local CRM is
> operational
> Mar 2 21:33:17 no2 crmd: [15935]: info: do_state_transition: State
> transition S_STARTING -> S_PENDING [ input=I_PENDING
> cause=C_FSA_INTERNAL origin=do_started ]
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
> header contains an error: 16777216
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
> compressed=0, size=1392508928, total=-1560281088)
> Mar 2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
> min=592, total=-1560281088, size=1392508928, bz2_size=1
> Mar 2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
> Mar 2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
> Mar 2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
> Mar 2 21:33:18 no2 crmd: [15935]: info: te_connect_stonith:
> Attempting connection to fencing daemon...
> Mar 2 21:33:19 no2 crmd: [15935]: info: te_connect_stonith: Connected
> Mar 2 21:33:19 no2 crmd: [15935]: WARN: check_message_sanity: Message
> with no size
> Mar 2 21:33:19 no2 crmd: [15935]: ERROR: check_message_sanity:
> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
> compressed=1280, size=14592, total=0)
> Mar 2 21:33:19 no2 crmd: [15935]: ERROR: ais_dispatch_message:
> Invalid message (id=0, dest=<all>:unknown,
> from=local:unknown.1678814208): min=592, total=0, size=0,
> bz2_size=14592
>
> I've tried to erase the CIB, remove the machine from the cluster and
> even uninstall and re-install corosync and openais (and its deps) but
> the error persists.
>

This looks like a recently discovered IPC error with how messages are
defined inside pacemaker. I believe andrew has a fix for that. CCing
pacemaker list for that topic.

Regards
-steve

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [corosync] Unable to join cluster from a newly-installed centos 6.2 node [ In reply to ]
On Sat, Mar 3, 2012 at 2:53 PM, Steven Dake <sdake@redhat.com> wrote:
> On 03/02/2012 05:29 PM, Diego Lima wrote:
>> Hello,
>>
>> I've recently installed Corosync on two CentOS 6.2 machines. One is
>> working fine but on the other machine I've been unable to connect to
>> the cluster. On the logs I can see this whenever I start
>> corosync+pacemaker:
>>
>> Mar  2 21:33:16 no2 corosync[15924]:   [MAIN  ] Corosync Cluster
>> Engine ('1.4.1'): started and ready to provide service.
>> Mar  2 21:33:16 no2 corosync[15924]:   [MAIN  ] Corosync built-in
>> features: nss dbus rdma snmp
>> Mar  2 21:33:16 no2 corosync[15924]:   [MAIN  ] Successfully read main
>> configuration file '/etc/corosync/corosync.conf'.
>> Mar  2 21:33:16 no2 corosync[15924]:   [TOTEM ] Initializing transport
>> (UDP/IP Multicast).
>> Mar  2 21:33:16 no2 corosync[15924]:   [TOTEM ] Initializing
>> transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
>> Mar  2 21:33:16 no2 corosync[15924]:   [TOTEM ] The network interface
>> [172.16.100.2] is now up.
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> process_ais_conf: Reading configure
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_init: Local handle: 4730966301143465987 for logging
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_next: Processing additional logging options...
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Found 'off' for option: debug
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Found 'no' for option: to_logfile
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Found 'yes' for option: to_syslog
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Found 'daemon' for option: syslog_facility
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_init: Local handle: 7739444317642555396 for quorum
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_next: No additional configuration supplied for: quorum
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> No default for option: provider
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_init: Local handle: 5650605097994944517 for service
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> config_find_next: Processing additional service options...
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Found '0' for option: ver
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Defaulting to 'pcmk' for option: clustername
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Defaulting to 'no' for option: use_logd
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: get_config_opt:
>> Defaulting to 'no' for option: use_mgmtd
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: pcmk_startup:
>> CRM: Initialized
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] Logging: Initialized
>> pcmk_startup
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: pcmk_startup:
>> Maximum core file size is: 18446744073709551615
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: pcmk_startup: Service: 10
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: pcmk_startup:
>> Local hostname: no2.informidia.int
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_update_nodeid: Local node id: 40112300
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Creating entry for node 40112300 born on 0
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> 0x766520 Node 40112300 now known as no2.informidia.int (was: (null))
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node no2.informidia.int now has 1 quorum votes (was 0)
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node 40112300/no2.informidia.int is now: member
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15930 for process stonith-ng
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15931 for process cib
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15932 for process lrmd
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 15
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: Invoked:
>> /usr/lib64/heartbeat/stonithd
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: crm_log_init_worker:
>> Changed active directory to /var/lib/heartbeat/cores/root
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info:
>> G_main_add_SignalHandler: Added signal handler for signal 17
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: get_cluster_type:
>> Cluster type is: 'openais'
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info:
>> init_ais_connection_classic: Creating connection to our Corosync
>> plugin
>> Mar  2 21:33:16 no2 cib: [15931]: info: crm_log_init_worker: Changed
>> active directory to /var/lib/heartbeat/cores/hacluster
>> Mar  2 21:33:16 no2 cib: [15931]: info: G_main_add_TriggerHandler:
>> Added signal manual handler
>> Mar  2 21:33:16 no2 cib: [15931]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 17
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 17
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15933 for process attrd
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info:
>> init_ais_connection_classic: AIS connection established
>> Mar  2 21:33:16 no2 cib: [15931]: info: retrieveCib: Reading cluster
>> configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
>> /var/lib/heartbeat/crm/cib.xml.sig)
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: enabling coredumps
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15934 for process pengine
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: get_ais_nodeid: Server
>> details: id=40112300 uname=no2.informidia.int cname=pcmk
>> Mar  2 21:33:16 no2 cib: [15931]: info: validate_with_relaxng:
>> Creating RNG parser context
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 10
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: spawn_child:
>> Forked child 15935 for process crmd
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info:
>> init_ais_connection_once: Connection to 'classic openais (with
>> plugin)': established
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 12
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> Pacemaker Cluster Manager 1.1.6
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: crm_new_peer: Node
>> no2.informidia.int now has id: 40112300
>> Mar  2 21:33:16 no2 lrmd: [15932]: info: Started.
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync extended virtual synchrony service
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: crm_new_peer: Node
>> 40112300 is now known as no2.informidia.int
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync configuration service
>> Mar  2 21:33:16 no2 stonith-ng: [15930]: info: main: Starting
>> stonith-ng mainloop
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync cluster closed process group service v1.01
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync cluster config database access v1.01
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync profile loading service
>> Mar  2 21:33:16 no2 corosync[15924]:   [SERV  ] Service engine loaded:
>> corosync cluster quorum service v0.1
>> Mar  2 21:33:16 no2 corosync[15924]:   [MAIN  ] Compatibility mode set
>> to whitetank.  Using V1 and V2 of the synchronization engine.
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] notice:
>> pcmk_peer_update: Transitional membership event on ring 24: memb=0,
>> new=0, lost=0
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] notice:
>> pcmk_peer_update: Stable membership event on ring 24: memb=1, new=1,
>> lost=0
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: NEW:  no2.informidia.int 40112300
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: MEMB: no2.informidia.int 40112300
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node no2.informidia.int now has process list:
>> 00000000000000000000000000111312 (1118994)
>> Mar  2 21:33:16 no2 corosync[15924]:   [TOTEM ] A processor joined or
>> left the membership and a new membership was formed.
>> Mar  2 21:33:16 no2 corosync[15924]:   [CPG   ] chosen downlist:
>> sender r(0) ip(172.16.100.2) ; members(old:0 left:0)
>> Mar  2 21:33:16 no2 corosync[15924]:   [MAIN  ] Completed service
>> synchronization, ready to provide service.
>> Mar  2 21:33:16 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Recorded connection 0x775b60 for stonith-ng/15930
>> Mar  2 21:33:16 no2 cib: [15931]: info: startCib: CIB Initialization
>> completed successfully
>> Mar  2 21:33:16 no2 cib: [15931]: info: get_cluster_type: Cluster type
>> is: 'openais'
>> Mar  2 21:33:16 no2 cib: [15931]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Mar  2 21:33:16 no2 cib: [15931]: info: init_ais_connection_classic:
>> Creating connection to our Corosync plugin
>> Mar  2 21:33:17 no2 cib: [15931]: info: init_ais_connection_classic:
>> AIS connection established
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Recorded connection 0x7715b0 for cib/15931
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Sending membership update 24 to cib
>> Mar  2 21:33:17 no2 cib: [15931]: info: get_ais_nodeid: Server
>> details: id=40112300 uname=no2.informidia.int cname=pcmk
>> Mar  2 21:33:17 no2 cib: [15931]: info: init_ais_connection_once:
>> Connection to 'classic openais (with plugin)': established
>> Mar  2 21:33:17 no2 cib: [15931]: info: crm_new_peer: Node
>> no2.informidia.int now has id: 40112300
>> Mar  2 21:33:17 no2 cib: [15931]: info: crm_new_peer: Node 40112300 is
>> now known as no2.informidia.int
>> Mar  2 21:33:17 no2 cib: [15931]: info: cib_init: Starting cib mainloop
>> Mar  2 21:33:17 no2 cib: [15931]: info: ais_dispatch_message:
>> Membership 24: quorum still lost
>> Mar  2 21:33:17 no2 cib: [15931]: info: crm_update_peer: Node
>> no2.informidia.int: id=40112300 state=member (new) addr=r(0)
>> ip(172.16.100.2)  (new) votes=1 (new) born=0 seen=24
>> proc=00000000000000000000000000111312 (new)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: Invoked: /usr/lib64/heartbeat/crmd
>> Mar  2 21:33:17 no2 attrd: [15933]: info: Invoked: /usr/lib64/heartbeat/attrd
>> Mar  2 21:33:17 no2 pengine: [15934]: info: Invoked:
>> /usr/lib64/heartbeat/pengine
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_log_init_worker: Changed
>> active directory to /var/lib/heartbeat/cores/hacluster
>> Mar  2 21:33:17 no2 crmd: [15935]: info: main: CRM Hg Version:
>> a02c0f19a00c1eb2527ad38f146ebc0834814558
>> Mar  2 21:33:17 no2 attrd: [15933]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crmd_init: Starting crmd
>> Mar  2 21:33:17 no2 crmd: [15935]: info: G_main_add_SignalHandler:
>> Added signal handler for signal 17
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Recorded connection 0x77a380 for attrd/15933
>> Mar  2 21:33:17 no2 attrd: [15933]: notice: main: Starting mainloop...
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] notice:
>> pcmk_peer_update: Transitional membership event on ring 28: memb=1,
>> new=0, lost=0
>> Mar  2 21:33:17 no2 crmd: [15935]: info: do_cib_control: CIB
>> connection established
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: memb: no2.informidia.int 40112300
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] notice:
>> pcmk_peer_update: Stable membership event on ring 28: memb=2, new=1,
>> lost=0
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Creating entry for node 23335084 born on 28
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node 23335084/unknown is now: member
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: NEW:  .pending. 23335084
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: MEMB: .pending. 23335084
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> pcmk_peer_update: MEMB: no2.informidia.int 40112300
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> send_member_notification: Sending membership update 28 to 1 children
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> 0x766520 Node 40112300 ((null)) born on: 28
>> Mar  2 21:33:17 no2 crmd: [15935]: info: get_cluster_type: Cluster
>> type is: 'openais'
>> Mar  2 21:33:17 no2 crmd: [15935]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_classic:
>> Creating connection to our Corosync plugin
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 corosync[15924]:   [TOTEM ] A processor joined or
>> left the membership and a new membership was formed.
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> 0x77a1b0 Node 23335084 (no1.informidia.int) born on: 16
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> 0x77a1b0 Node 23335084 now known as no1.informidia.int (was: (null))
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node no1.informidia.int now has process list:
>> 00000000000000000000000000111312 (1118994)
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: update_member:
>> Node no1.informidia.int now has 1 quorum votes (was 0)
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info:
>> send_member_notification: Sending membership update 28 to 1 children
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] WARN:
>> route_ais_message: Sending message to local.crmd failed: ipc delivery
>> failed (rc=-2)
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_classic:
>> AIS connection established
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Recorded connection 0x77ebf0 for crmd/15935
>> Mar  2 21:33:17 no2 corosync[15924]:   [pcmk  ] info: pcmk_ipc:
>> Sending membership update 28 to crmd
>> Mar  2 21:33:17 no2 crmd: [15935]: info: get_ais_nodeid: Server
>> details: id=40112300 uname=no2.informidia.int cname=pcmk
>> Mar  2 21:33:17 no2 crmd: [15935]: info: init_ais_connection_once:
>> Connection to 'classic openais (with plugin)': established
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node
>> no2.informidia.int now has id: 40112300
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node 40112300
>> is now known as no2.informidia.int
>> Mar  2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
>> no2.informidia.int is now unknown
>> Mar  2 21:33:17 no2 crmd: [15935]: info: do_ha_control: Connected to the cluster
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 corosync[15924]:   [CPG   ] chosen downlist:
>> sender r(0) ip(172.16.100.1) ; members(old:1 left:0)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: do_started: Delaying start,
>> no membership data (0000000000100000)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crmd_init: Starting crmd's mainloop
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 corosync[15924]:   [MAIN  ] Completed service
>> synchronization, ready to provide service.
>> Mar  2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
>> Shutdown escalation occurs after: 1200000ms
>> Mar  2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
>> Checking for expired actions every 900000ms
>> Mar  2 21:33:17 no2 crmd: [15935]: info: config_query_callback:
>> Sending expected-votes=2 to corosync
>> Mar  2 21:33:17 no2 crmd: [15935]: notice: ais_dispatch_message:
>> Membership 28: quorum acquired
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node
>> no1.informidia.int now has id: 23335084
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_new_peer: Node 23335084
>> is now known as no1.informidia.int
>> Mar  2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
>> no1.informidia.int is now unknown
>> Mar  2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
>> no1.informidia.int is now member (was unknown)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_update_peer: Node
>> no1.informidia.int: id=23335084 state=member (new) addr=r(0)
>> ip(172.16.100.1)  votes=1 born=16 seen=28
>> proc=00000000000000000000000000111312
>> Mar  2 21:33:17 no2 crmd: [15935]: notice: crmd_peer_update: Status
>> update: Client no2.informidia.int/crmd now has status [online]
>> (DC=<null>)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: ais_status_callback: status:
>> no2.informidia.int is now member (was unknown)
>> Mar  2 21:33:17 no2 crmd: [15935]: info: crm_update_peer: Node
>> no2.informidia.int: id=40112300 state=member (new) addr=r(0)
>> ip(172.16.100.2)  (new) votes=1 (new) born=28 seen=28
>> proc=00000000000000000000000000111312 (new)
>> Mar  2 21:33:17 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:17 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:17 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>> Mar  2 21:33:17 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:17 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:17 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 crmd: [15935]: info: do_started: The local CRM is
>> operational
>> Mar  2 21:33:17 no2 crmd: [15935]: info: do_state_transition: State
>> transition S_STARTING -> S_PENDING [ input=I_PENDING
>> cause=C_FSA_INTERNAL origin=do_started ]
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:17 no2 cib: [15931]: WARN: check_message_sanity: Message
>> header contains an error: 16777216
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: check_message_sanity: Invalid
>> message 33554432: (dest=<all>:unknown, from=:unknown.156688,
>> compressed=0, size=1392508928, total=-1560281088)
>> Mar  2 21:33:17 no2 cib: [15931]: ERROR: ais_dispatch_message: Invalid
>> message (id=33554432, dest=<all>:unknown, from=:unknown.156688):
>> min=592, total=-1560281088, size=1392508928, bz2_size=1
>> Mar  2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>> Mar  2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>> Mar  2 21:33:18 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:18 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>> Mar  2 21:33:18 no2 crmd: [15935]: info: te_connect_stonith:
>> Attempting connection to fencing daemon...
>> Mar  2 21:33:19 no2 crmd: [15935]: info: te_connect_stonith: Connected
>> Mar  2 21:33:19 no2 crmd: [15935]: WARN: check_message_sanity: Message
>> with no size
>> Mar  2 21:33:19 no2 crmd: [15935]: ERROR: check_message_sanity:
>> Invalid message 0: (dest=<all>:unknown, from=local:unknown.1678814208,
>> compressed=1280, size=14592, total=0)
>> Mar  2 21:33:19 no2 crmd: [15935]: ERROR: ais_dispatch_message:
>> Invalid message (id=0, dest=<all>:unknown,
>> from=local:unknown.1678814208): min=592, total=0, size=0,
>> bz2_size=14592
>>
>> I've tried to erase the CIB, remove the machine from the cluster and
>> even uninstall and re-install corosync and openais (and its deps) but
>> the error persists.
>>
>
> This looks like a recently discovered IPC error with how messages are
> defined inside pacemaker.

No, that was introduced when I added support for corosync 2.0 which
was long after 6.2 went out.

>  I believe andrew has a fix for that.  CCing
> pacemaker list for that topic.
>
> Regards
> -steve
> _______________________________________________
> discuss mailing list
> discuss@corosync.org
> http://lists.corosync.org/mailman/listinfo/discuss

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org