Mailing List Archive

snapmirror source and target aggregate usage don't match?
Hello All,

We run two 8.3 filers with a list of vservers and their associated volumes, with each volume snapmirrored (volume level) from the active primary cluster to matching vserver/volumes on the passive secondary.

Both clusters have a similar set of aggregates of just about equal size. Both clusters' aggregates contain the same list of volumes of the same size, with the same space total/used/available on both sets.

But on the target cluster the same aggregates are reporting 30% more used space.

This is about on par with the dedupe savings we're getting on the primary, so when I first noticed this my thought was to check that dedupe was OK on the target. But the webUI reports that no "storage efficiency" information is available on a replication target, and I ended up thinking this meant that the secondary data would have to be full-sized. I even recall asking someone and having this confirmed, but can't recall whether that came from the vendor SE, our VAR's SE, or a support tech.

Now we're approaching the space limit of the secondary cluster and I'm looking deeper. At this point, as it appears that for each volume the total/used/free space matches after dedupe on the source, I'm thinking that dedupe properties aren't exposed on the target but the data is still a true copy of the deduped original. This is supported by being able to view dedupe stats on the target via the CLI that show the same savings as on the source.
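For example, the kind of thing I can see on the target (volume names and numbers here are invented, but the shape matches the 8.3 CLI):

    target::> volume show -vserver vs1-dr -volume vol1 -fields sis-space-saved,sis-space-saved-percent
    vserver volume sis-space-saved sis-space-saved-percent
    ------- ------ --------------- -----------------------
    vs1-dr  vol1   1.2TB           30%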

Note that we're also snapshotting these volumes, and while we're deduping daily, we're snapshotting hourly. A colleague mentioned remembering that this could mean mirrored data that's not deduped yet is being replicated full-size. But if so, wouldn't this be reflected in the dedupe stats on the target?

OK, just found that "storage aggregate show -fields usedsize,physical-used" on the primary/source cluster shows that used and physical-used are about identical for all aggrs. On the secondary/target, used is consistently larger than physical-used and the total difference makes up the 30% I'm "missing."
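Roughly what I'm seeing (aggregate names and numbers invented, shape per the 8.3 CLI):

    source::> storage aggregate show -fields usedsize,physical-used
    aggregate usedsize physical-used
    --------- -------- -------------
    aggr1     30.1TB   29.8TB

    target::> storage aggregate show -fields usedsize,physical-used
    aggregate usedsize physical-used
    --------- -------- -------------
    aggr1     39.2TB   30.0TB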

Is this a problem with my reporting? Are we actually OK and I need to look at physical-used instead of used? Or if we're not OK, where is the space being used and can I get it back?

Thanks in advance for your guidance...

Randy
Re: snapmirror source and target aggregate usage don't match? [ In reply to ]
It could be the target is not thin provisioned.  What is the output of "vol show -fields space-guarantee,space-guarantee-enabled" for one of the volumes on the source and the target?  Do they match?
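If they don't match, you'd see something like this (vserver/volume names and values are illustrative):

    source::> vol show -vserver vs1 -volume vol1 -fields space-guarantee,space-guarantee-enabled
    vserver volume space-guarantee space-guarantee-enabled
    ------- ------ --------------- -----------------------
    vs1     vol1   none            false

    target::> vol show -vserver vs1-dr -volume vol1 -fields space-guarantee,space-guarantee-enabled
    vserver volume space-guarantee space-guarantee-enabled
    ------- ------ --------------- -----------------------
    vs1-dr  vol1   volume          true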


From: "Rue, Randy" <rrue@fredhutch.org>
To: "'toasters@teaparty.net'" <toasters@teaparty.net>
Sent: Saturday, December 17, 2016 8:09 AM
Subject: snapmirror source and target aggregate usage don't match?

<!--#yiv8850567270 _filtered #yiv8850567270 {font-family:Calibri;panose-1:2 15 5 2 2 2 4 3 2 4;}#yiv8850567270 #yiv8850567270 p.yiv8850567270MsoNormal, #yiv8850567270 li.yiv8850567270MsoNormal, #yiv8850567270 div.yiv8850567270MsoNormal {margin:0in;margin-bottom:.0001pt;font-size:11.0pt;font-family:"Calibri", "sans-serif";}#yiv8850567270 a:link, #yiv8850567270 span.yiv8850567270MsoHyperlink {color:blue;text-decoration:underline;}#yiv8850567270 a:visited, #yiv8850567270 span.yiv8850567270MsoHyperlinkFollowed {color:purple;text-decoration:underline;}#yiv8850567270 span.yiv8850567270EmailStyle17 {font-family:"Calibri", "sans-serif";color:windowtext;}#yiv8850567270 .yiv8850567270MsoChpDefault {font-family:"Calibri", "sans-serif";} _filtered #yiv8850567270 {margin:1.0in 1.0in 1.0in 1.0in;}#yiv8850567270 div.yiv8850567270WordSection1 {}-->Hello All,   We run two 8.3 filers with a list of vservers and their associated volumes, with each volume snapmirrored (volume level) from the active primary cluster to matching vserver/volumes on the passive secondary.   Both clusters have a similar set of aggregates of just about equal size. Both clusters’ aggregates contain the same list of volumes of the same size, with the same space total/used/available on both sets.   But on the target cluster the same aggregates are reporting 30% more used space.   This is about on par with the dedupe savings we’re getting on the primary so when I first noticed this my thought was to check that dedupe was OK on the target. But if you look in the webUI, it reports that no “storage efficiency” is available on a replication target, and ended up thinking this meant that the secondary data would have to be full-sized. I even recall asking someone and having this confirmed, but can’t recall if that came from the vendor SE or our VAR SE or a support tech or.   Now we’re approaching the space limit of the secondary cluster and I’m looking deeper. At this point, as it appears that for each volume the total/used/free space matches after dedupe on the source, I’m thinking that dedupe properties aren’t exposed on the target but the data is still a true copy of the deduped original. This is supported by being able to view dedupe stats on the target via the CLI that show the same savings as on the source.   Note that we’re also snapshotting these volumes, and while we’re deduping daily, we’re snapshotting hourly. A colleague mentioned remembering that this could mean mirrored data that’s not deduped yet is being replicated full-size. But if so, wouldn’t this be reflected in the dedupe stats on the target?   OK, just found that “storage aggregate show -fields usedsize,physical-used” on the primary/source cluster shows that used and physical-used are about identical for all aggrs. On the secondary/target, used is consistently larger than physical-used and the total difference makes up the 30% I’m “missing.”   Is this a problem with my reporting? Are we actually OK and I need to look at physical-used instead of used? Or if we’re not OK, where is the space being used and can I get it back?   Thanks in advance for your guidance…   Randy    
_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters
RE: snapmirror source and target aggregate usage don't match? [ In reply to ]
Thank you to a list member for the NetApp docs pointer and the opportunity for a little RTFM before I reply to the list!

Looks like my colleague’s recollection of earlier versions still applies to 8.3. Essentially, we've been snapmirroring un-deduped data.

My next question is, is it realistic to run dedupes hourly on 30-40 volumes totalling ~100TB? Because that's a much easier proposition than amending our SLAs to lengthen our snapshot cycles.

And to get both on the same cycle, is it possible to make snapshots dependent on dedupe finishing, or do we just assume dedupe will complete? And if it doesn't, what are the consequences? For example, if a dedupe that usually finishes at 5 minutes after the hour isn't done, and snapmirror runs then, will snapmirror be syncing a full hour of full-sized changes?

Last question for now: assuming both are on the same schedule, how do I get the current lost space back? Will it be reclaimed when the schedules are synced and the snapshots have rolled off? Or do I need to destroy and recreate the target volumes?

Hope to hear from you, especially from any other shop running both snapmirror and dedupe.

Randy

(and if a solution requires DOT9 we do have an upgrade on our roadmap)


Replying to just you for now. Hopefully this will help and you can report back.
Look at this...
https://library.netapp.com/ecm/ecm_download_file/ECMLP2348026

Page 142.

I think you may need to work out some scheduling and that may help. Going to look a bit more... I have another idea, but that may only be ONTAP 9 related.



Re: snapmirror source and target aggregate usage don't match? [ In reply to ]
Trial by fire?

You will likely need to get a rough baseline of how long a single dedupe run takes on a source volume, then look at doing multiple volumes and see how much the process slows down. With spinning media, the dedupe process can take a while on larger volumes.
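One way to get that baseline is the last-operation fields in "volume efficiency show" -- a sketch, with vserver/volume names as placeholders and invented values:

    ::> volume efficiency show -vserver vs1 -volume vol1 -fields last-op-begin,last-op-end,last-op-size
    vserver volume last-op-begin            last-op-end              last-op-size
    ------- ------ ------------------------ ------------------------ ------------
    vs1     vol1   Sat Dec 17 01:05:00 2016 Sat Dec 17 02:40:00 2016 220GB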

You may need to get a little creative: group some of the volumes together and create schedules for the deduping to happen at specific times. I really doubt you are going to be able to dedupe 30-40 volumes totalling ~100TB every hour. You should probably stagger the runs throughout the day with different schedules.

You can create custom volume efficiency policies that only run for a certain duration ("-" is the default, meaning no limit; otherwise a whole number of hours, up to 999), and the QoS policy (for volume efficiency operations, not client I/O!) can be set to background (run, but do not impede) or best-effort (may slightly impact operations). A sketch follows this paragraph.
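For example, something like this -- schedule, policy, and vserver/volume names are all placeholders, so check the syntax against your 8.3 man pages:

    ::> job schedule cron create -name eff-batch-a -hour 1 -minute 5
    ::> volume efficiency policy create -vserver vs1 -policy eff-4h-bg -schedule eff-batch-a -duration 4 -qos-policy background
    ::> volume efficiency modify -vserver vs1 -volume vol1 -policy eff-4h-bg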

As far as the timing goes, you may want to look at scripting with PowerShell or the NetApp SDK: wait for a volume efficiency operation to finish, then take the snapshot and/or update the mirror -- something like the sketch below.
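Here is a very rough, untested sketch using plain SSH and standard ONTAP commands; the cluster addresses, vserver, and volume names are all placeholders:

    #!/bin/sh
    # Untested sketch: start dedupe, wait for the volume to report Idle,
    # then snapshot on the source and update the mirror from the destination.
    SRC=admin@cluster1-mgmt   # source cluster mgmt LIF (placeholder)
    DST=admin@cluster2-mgmt   # destination cluster mgmt LIF (placeholder)

    ssh "$SRC" "volume efficiency start -vserver vs1 -volume vol1"

    # The instance output shows "Idle for ..." once the run completes.
    until ssh "$SRC" "volume efficiency show -vserver vs1 -volume vol1" | grep -qi idle; do
        sleep 60
    done

    # $(date ...) expands locally, which is what we want for the snapshot name.
    ssh "$SRC" "volume snapshot create -vserver vs1 -volume vol1 -snapshot hourly.$(date +%Y%m%d%H%M)"
    ssh "$DST" "snapmirror update -destination-path vs1-dr:vol1"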

We could spend lots of time on this. I hope this helps at least a little.


--tmac

Tim McCarthy, Principal Consultant
Proud Member of the #NetAppATeam <https://twitter.com/NetAppATeam>
I Blog at TMACsRack <https://tmacsrack.wordpress.com/>




Re: snapmirror source and target aggregate usage don't match? [ In reply to ]
Maybe also review the dedupe savings you are achieving and consider disabling it on volumes getting around 10% savings or less, since the metadata required for dedupe is something like 7%.
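To check, something like this (vserver/volume names and numbers are placeholders; and I believe there is also an advanced-mode "volume efficiency undo" to back the shared blocks out, but I'd check with support before using it):

    ::> volume show -vserver vs1 -fields sis-space-saved-percent
    vserver volume sis-space-saved-percent
    ------- ------ -----------------------
    vs1     vol1   31%
    vs1     vol2   4%

    ::> volume efficiency off -vserver vs1 -volume vol2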


_______________________________________________
Toasters mailing list
Toasters@teaparty.net
http://www.teaparty.net/mailman/listinfo/toasters