Mailing List Archive: vfiler migrate: overview and thoughts

Feb 8, 2012, 10:26 AM

Post #1 of 6 (7258 views)

vfiler migrate: overview and thoughts

When I first read there was a way to move a vFiler from one node of a
NetApp cluster to another I was excited. I was imagining something akin to
VMware's vMotion, a transparent movement of services. Digging a little
deeper showed that NetApp's "vfiler migrate" functionality isn't nearly as
automagic as I'd hoped.

Here are some observations.

* Disk ownership of the resources must be software-based, whether your
filer is using actual disks or array LUNs (on a V-Series filer like our
3170). We had some concerns that the feature might not work well with
array LUNs, but it appears that Data ONTAP makes no distinction
between an array LUN and an actual disk in this context.

* The vfiler migrate command effectively moves complete aggregates from
one filer head to another. This means that all volumes on the aggregate(s)
involved must be tied only to the vFiler being moved, with no LUNs,
exports or shares presented from the context of the root filer or any
other vFiler (in our environment we already had a standard of creating
separate aggregates for each vFiler, so this wasn't a problem). For
example, after one failed attempt to migrate a vFiler, I added a CIFS
share to the root volume of the vFiler via the root filer, to gain access
to the etc folder of the vFiler. I forgot to remove that share, and it
broke later migration attempts for a new reason.

* We've tested the vfiler migrate command dozens of times now on three
different vFilers, in preparation for the migration of a production vFiler
later this week. Two of those vFilers have migrated flawlessly every time,
and one seems to fail about 30% of the time for various reasons which we
can sometimes identify and sometimes not.

* Reasons for failure include:
  - A CIFS share from the root filer head to the vFiler's root
    volume. My bad.
  - Possible FC noise between the root filer and the SAN behind it.
  - Possible SCSI reservation issues between the root filer and the
    SAN.
  - Invalid credentials (fat-fingered a password, I think) for the
    "source" remote root filer. Oddly, the migrate command still stopped
    the vFiler, offlined its volumes and aggregates, and removed the
    vFiler from the source root filer before the process failed.
  - Poor alignment of the planets? Bad karma?

* In general, it seems that vfiler migrate just fails sometimes. In
every failure, however, recovery has been straightforward. The "vfiler
create <vFiler name> -r <path to vFiler root volume>" command recovers the
vFiler every time, albeit without a proper network configuration. The
vFiler comes back up, but its virtual NIC has no subnet mask and no
assignment to a physical interface. Makes sense, I guess, as the migrate
command never got to the part where it normally asks what mask and
interface to use. These need to be reassigned, either from the CLI using
ifconfig or with the "manage vFiler" wizard in FilerView (note that the
wizard will also overwrite etc/exports with a default, but will save a
backup first).
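
The aggregate rule in the second observation lends itself to a quick pre-flight check. The Python sketch below is only an illustration under stated assumptions: the `Volume` and `Share` records and the function name `migration_blockers` are hypothetical, and the inventory would have to be filled in by hand or from your own tooling, not from any NetApp API. It flags volumes on the vFiler's aggregates that belong to someone else, and shares, exports or LUNs presented from another context.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    aggregate: str
    owner: str           # vFiler that owns the volume, e.g. "vfiler0"


@dataclass
class Share:
    volume: str          # volume the share, export or LUN lives on
    presented_by: str    # vFiler context it was created from


def migration_blockers(vfiler, volumes, shares):
    """Return human-readable reasons the vFiler's aggregates cannot move cleanly."""
    # Aggregates that hold at least one volume of the vFiler being moved.
    aggregates = {v.aggregate for v in volumes if v.owner == vfiler}
    on_those_aggregates = [v for v in volumes if v.aggregate in aggregates]

    problems = []
    # Every volume on those aggregates must belong to the vFiler being moved.
    for v in on_those_aggregates:
        if v.owner != vfiler:
            problems.append(f"volume {v.name} on {v.aggregate} belongs to {v.owner}")

    # Nothing on those volumes may be presented from another context.
    moved_names = {v.name for v in on_those_aggregates}
    for s in shares:
        if s.volume in moved_names and s.presented_by != vfiler:
            problems.append(f"share on {s.volume} is presented by {s.presented_by}")
    return problems


if __name__ == "__main__":
    vols = [
        Volume("vf1_root", "aggr_vf1", "vf1"),
        Volume("vf1_data", "aggr_vf1", "vf1"),
    ]
    # The forgotten CIFS share created from the root filer (vfiler0).
    shares = [Share("vf1_root", "vfiler0")]
    for problem in migration_blockers("vf1", vols, shares):
        print(problem)
```

An empty result only means the inventory you supplied looks clean; it says nothing about the FC or SCSI reservation issues listed above.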

Given that there are arguably a slew more shops out there running HA
VMware clusters than there are running HA NetApp clusters, it's probably
not fair to expect vfiler migrate to be as slick, or even as
well understood/documented in the wild, as vMotion. Also, inherent
limitations in things like the CIFS protocol mean that some
services will have to be interrupted. But overall I'd claim that the
feature is useful, and even when it doesn't work as hoped, recovery is
straightforward and reliable. We're planning on proceeding with our
production move later this week.

Hope this helps anyone in the same situation,

Randy
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters

Re: vfiler migrate: overview and thoughts

fcocquyt at stanford

Feb 8, 2012, 10:55 AM

Post #2 of 6 (7132 views)

Hi Randy,
Are you managing vfiler migrations through the NMC (NetApp Management Console), or are you initiating them via the command line with 'vfiler migrate'?
In my experience the NMC is more robust in terms of error checking and cleanup if the migration fails.

There are bugs, but by using the NMC I found the risks are mitigated. I wrote up our experiences here:

http://www.vmadmin.info/2011/02/vfiler-non-disruptive-migration.html

We just upgraded both our clusters to 8.1RC2 and plan to use vFiler migration (now datamotion) to evacuate the prod cluster so we can upgrade its disk trays non-disruptively.

good luck,

Fletcher

RE: vfiler migrate: overview and thoughts

rrue at fhcrc

Feb 8, 2012, 11:11 AM

Post #3 of 6 (7146 views)

I've been working via the CLI; good to know the NMC will give better
options.

RE: vfiler migrate: overview and thoughts

sgelb at insightinvestments

Feb 8, 2012, 11:59 AM

Post #4 of 6 (7194 views)

Really good discussion here. You are using the -m nocopy method, which
works by disk reassignment. That method is not available from NMC. NMC
does Data Motion, which guarantees a 120-second failover but does not work
between cluster pairs. Regular vfiler migrate uses SnapMirror, as Data
Motion does, but doesn't guarantee failover in 120 seconds. The -m nocopy
method was formerly called SnapMover and works between cluster pairs or
V-Series neighborhoods. vfiler migrate works between cluster pairs and
other clusters, but for -m nocopy all nodes must see the disks (cluster
pair or V-Series neighborhood).

The CIFS share from vfiler0 would be a hangup, so it was good to test
that first like you did.

I really like the NMC method with Data Motion, but unless you have
another cluster pair to migrate to, that isn't an option. migrate -m
nocopy should also be faster than that, since there is no data movement
(still no guarantee on timing, but with no copy it should always be
faster).

Also, when you recreate a vFiler there is no network and the interfaces
are unconfigured. You have to ifconfig them, then run "vfiler run
<vfiler name> route add default" as well. I wouldn't use FilerView or
the setup -e wizard, since that does whack exports, hosts.equiv, hosts,
etc. Running ifconfig is much easier, since it just configures and binds
the unconfigured IP that matches the one in the ifconfig command, without
modifying files the way setup does. Then confirm the entry in /etc/rc of
vfiler0 so that it persists in case of a reboot. Ideally there would be
more automation, but it is quick to reconcile when you check vfiler
status -a and /etc/rc.
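
The recovery sequence described in this thread can be written down as an ordered checklist. The Python sketch below only builds the command strings for review; it does not talk to a filer. The function name `recovery_commands`, the example values, the argument order for ifconfig, and the trailing route metric of 1 are all assumptions made for illustration, so check them against the man pages for your Data ONTAP release before typing anything on a console.

```python
def recovery_commands(vfiler, root_path, interface, ip, netmask, gateway):
    """Build the console commands, in order, for rebuilding a vFiler after a failed migrate."""
    return [
        # Recreate the vFiler from its existing root volume.
        f"vfiler create {vfiler} -r {root_path}",
        # Bind the unconfigured IP to a physical interface and give it a mask.
        f"ifconfig {interface} {ip} netmask {netmask}",
        # Restore the default route inside the vFiler (metric 1 is an assumption).
        f"vfiler run {vfiler} route add default {gateway} 1",
        # Confirm the result.
        "vfiler status -a",
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    for cmd in recovery_commands(
        "vf1", "/vol/vf1_root", "e0a", "192.0.2.10", "255.255.255.0", "192.0.2.1"
    ):
        print(cmd)
```

The last manual step is not in the list: confirm that the matching ifconfig line is present in /etc/rc on vfiler0 so the configuration survives a reboot.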

RE: vfiler migrate: overview and thoughts

andrey.borzenkov at ts

Feb 8, 2012, 7:02 PM

Post #5 of 6 (7225 views)

"does not work between cluster pairs" - you mean "between heads in cluster pair"? Maybe these are subtleties of the English language that a non-native speaker misses.

RE: vfiler migrate: overview and thoughts

sgelb at insightinvestments

Feb 8, 2012, 11:09 PM

Post #6 of 6 (7158 views)

Correct.. between 2 nodes in the same cluster.
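
With that clarification, the three methods discussed in this thread can be summarized as a small decision helper. The Python sketch below is one reading of the posts above, not NetApp documentation: the function name `candidate_methods`, its two boolean inputs and the method labels are hypothetical, and listing the SnapMirror-based migrate unconditionally is an assumption drawn from the statement that it works between cluster pairs and other clusters.

```python
def candidate_methods(same_ha_pair, destination_sees_disks):
    """List migration methods that look applicable, per this thread's description.

    same_ha_pair: source and destination are the two nodes of one cluster.
    destination_sees_disks: destination can see the source disks
        (cluster pair or V-Series neighborhood).
    """
    # Assumption: the SnapMirror-based migrate is always a candidate.
    methods = ["vfiler migrate (SnapMirror copy)"]
    # -m nocopy reassigns disks, so every node involved must see them.
    if destination_sees_disks:
        methods.append("vfiler migrate -m nocopy")
    # Data Motion via NMC does not work between 2 nodes in the same cluster.
    if not same_ha_pair:
        methods.append("Data Motion via NMC")
    return methods


if __name__ == "__main__":
    # Moving a vFiler to the partner node of the same cluster.
    print(candidate_methods(same_ha_pair=True, destination_sees_disks=True))
```

Only Data Motion was described as guaranteeing a 120-second failover, so timing requirements may narrow the list further.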
