Mailing List Archive

Libvirt Snapshots
Hello Everyone,

I've been trying to come up with a solution for libvirt snapshots to fix the issue with snapshotting when a volume is attached:

https://bugs.launchpad.net/nova/+bug/946830

The main issue here is that calling snapshot in libvirt makes an internal snapshot of the entire VM, which a) doesn't work for attached volumes and b) wastes a bunch of space by snapshotting memory and ephemeral disks that aren't used.

There are two potential approaches to solving the issue, and I've prototyped them below. I need feedback on which approach is better.

OPTION A --> snapshot using qemu-img

This method shuts down the VM and uses qemu-img to create the snapshot in the disk image.

Pros:
works with older versions of libvirt

Cons:
shutting off the vm during snapshotting is overkill and annoying

Caveats:
If it is safe to create disk file snapshots while libvirt has a file handle open, I can use suspend/resume, which is better than managedSave.
If it is safe to delete snapshots while the disk is being written to, I can resume sooner, minimizing pause time.
If it is additionally safe to create snapshots while the disk is being written to, we can avoid pausing the VM altogether! (Sounds dangerous, though.)

https://github.com/vishvananda/nova/blob/fix-libvirt-snapshot-old/nova/virt/libvirt/connection.py#L619
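
A minimal sketch of the qemu-img sequence that Option A implies, to be run only while the VM is shut off. It is not the code in the branch linked above. The helper name and the paths are hypothetical, and the `-s` option of `qemu-img convert` is assumed to be available as it was in qemu-img releases of this era. The function only builds the command lines, it does not execute them.

```python
import shlex


def qemu_img_snapshot_cmds(disk_path, snapshot_name, out_path, out_format="qcow2"):
    """Return the qemu-img invocations for Option A, in order, as argv lists.

    Hypothetical helper: the caller is expected to have shut the guest down
    (for example with managedSave) before running any of these.
    """
    # 1. Create an internal snapshot inside the qcow2 disk image.
    create = ["qemu-img", "snapshot", "-c", snapshot_name, disk_path]
    # 2. Extract that snapshot into a standalone image file.
    extract = ["qemu-img", "convert", "-f", "qcow2", "-O", out_format,
               "-s", snapshot_name, disk_path, out_path]
    # 3. Delete the internal snapshot so it does not keep consuming space.
    delete = ["qemu-img", "snapshot", "-d", snapshot_name, disk_path]
    return [create, extract, delete]


if __name__ == "__main__":
    # Paths below are made up for illustration.
    for argv in qemu_img_snapshot_cmds("/tmp/demo/disk", "snap1", "/tmp/snap1.qcow2"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

After step 1 the guest could in principle be restored only if steps 2 and 3 are safe on an open image, which is exactly what the questions further down ask.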

OPTION B --> libvirt 0.9.5 snapshots

This method uses the newer snapshot XML in libvirt 0.9.5 to snapshot only the root disk.

Pros:
plays nicely with libvirt, so the VM is only paused for the minimum amount of time

Cons:
requires libvirt 0.9.5, which doesn't exist in Oneiric

Caveats:
This code is untested, and a couple of tests don't pass yet because I haven't made an Oneiric VM. I want to make sure this is the right approach before I go through the hassle of updating.

https://github.com/vishvananda/nova/blob/fix-libvirt-snapshot/nova/virt/libvirt/connection.py#L619
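
A rough sketch of the kind of snapshot XML Option B relies on, assuming the per-disk `<disks>` element that libvirt added to `<domainsnapshot>` in 0.9.5. This is not the code in the branch linked above. The helper name and the device names `vda`/`vdb` are hypothetical. The resulting string would be passed to the domain's `snapshotCreateXML()` together with `VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY`; that call is left out so the example runs without libvirt installed.

```python
import xml.etree.ElementTree as ET


def disk_only_snapshot_xml(name, root_disk, other_disks=(), mode="external"):
    """Build a <domainsnapshot> document that snapshots only root_disk.

    Every device in other_disks (attached volumes, ephemeral disks) is
    marked snapshot='no' so libvirt skips it.
    """
    snap = ET.Element("domainsnapshot")
    ET.SubElement(snap, "name").text = name
    disks = ET.SubElement(snap, "disks")
    # The one disk we actually want captured.
    ET.SubElement(disks, "disk", {"name": root_disk, "snapshot": mode})
    # Explicitly exclude everything else.
    for dev in other_disks:
        ET.SubElement(disks, "disk", {"name": dev, "snapshot": "no"})
    return ET.tostring(snap, encoding="unicode")


if __name__ == "__main__":
    print(disk_only_snapshot_xml("snap1", "vda", ["vdb"]))
```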

So I could use some specific feedback from kvm/libvirt folks on the following questions:

a) Is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt is writing to?
if not:
b) Is it safe to use qemu-img to delete a snapshot in a disk file that libvirt is writing to but not actively using?
if not:
c) Is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt has an open file handle to?

And I could use input from the community on which of the approaches above to use:

Do we standardize on libvirt 0.9.5+, or do we use the compatible version that causes a bigger outage during the snapshot?

Ideal for me would be that at least b) above is true and we can get by with the compatible version.

Vish
Re: Libvirt Snapshots
> Pros:
> plays nicely with libvirt, so the vm is only paused for the minimum amount
> of time
> Cons:
> requires libvirt 0.9.5, which doesn't exist in oneiric

The libvirt 0.9.5 requirement sounds acceptable.

Reason:
In general, Essex does not target 11.10-Oneiric, because Diablo is
part of Oneiric.

Essex is really designed to be fully integrated with newer platforms,
such as Ubuntu 12.04 LTS "Precise" and Debian 7.0 "Wheezy", so those
are the targets. (not sure about other vendor's plans)

As for 11.10-Oneiric:
Essex was essentially back-ported to it, so a good solution would be
just extending the backport effort to include the newer libvirt, as
part of the OpenStack PPA for Oneiric.

--
-Alexey Eromenko "Technologov"
Re: Libvirt Snapshots
Hmmm,

I think that's a dangerous thing to say.
OpenStack (IMHO) should be trying to work with as many distro versions as possible, not just the newest (and not just the newest Ubuntu/Debian).
Otherwise nobody will ever really be "using" OpenStack (or they will have to wait a year or more until the "new" distro it was built on becomes accepted as "standard").

I think it's fine to target libvirt 0.9.5, but since OpenStack is more than just one distro, this should really be discussed and not taken lightly (package version bumps affect many consumers, integrators, and distributions, who all then need to figure out how to get that version working). So tread carefully here please :-) Especially with something as key as libvirt...

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
Re: Libvirt Snapshots
On Thu, Mar 08, 2012 at 06:02:54PM -0800, Vishvananda Ishaya wrote:
> So I could use some specific feedback from kvm/libvirt folks on the following questions:
>
> a) is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt is writing to.
> if not:
> b) is it safe to use qemu-img to delete a snapshot in a disk file that libvirt is writing to but not actively using.
> if not:
> c) is it safe to use qemu-img to create/delete a snapshot in a disk file that libvirt has an open file handle to.

Sadly, the answer is no to all those questions. For qcow2 files using
internal snapshots, you cannot make *any* changes to the qcow2 file
while QEMU has it open. The reasons are, first, that QEMU may have metadata
changes pending to the file which have not yet been flushed to disk, and
second, that creating/deleting the snapshot with qemu-img may cause
metadata changes that QEMU won't be aware of. Either way you will likely
cause corruption of the qcow2 file.

For these reasons, QEMU provides monitor commands for snapshotting,
which libvirt uses whenever the guest is running. Libvirt will only
use qemu-img if the guest is offline.
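
The rule above could be expressed as a small guard, sketched here under some assumptions. The state constants are defined locally with the values of libvirt's virDomainState enum so the example runs without libvirt; in real code they would come from the libvirt module and the state from the domain's `info()` or `state()` call. The function names are hypothetical. Note that a paused (suspended) guest still has a QEMU process holding the image open, so only a shut-off domain counts as safe.

```python
# Local copies of libvirt's virDomainState values (assumption: kept in
# sync with the libvirt headers; real code would use libvirt.VIR_DOMAIN_*).
VIR_DOMAIN_RUNNING = 1
VIR_DOMAIN_PAUSED = 3
VIR_DOMAIN_SHUTOFF = 5


def qemu_img_is_safe(domain_state):
    """True only when no QEMU process can have the qcow2 file open."""
    return domain_state == VIR_DOMAIN_SHUTOFF


def require_offline(domain_state):
    """Raise instead of letting qemu-img touch an image QEMU has open."""
    if not qemu_img_is_safe(domain_state):
        raise RuntimeError(
            "refusing to run qemu-img: domain state %d is not shut off" % domain_state)


if __name__ == "__main__":
    for state in (VIR_DOMAIN_RUNNING, VIR_DOMAIN_PAUSED, VIR_DOMAIN_SHUTOFF):
        print(state, qemu_img_is_safe(state))
```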

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: Libvirt Snapshots
2012/3/9 Vishvananda Ishaya <vishvananda@gmail.com>:
> OPTION B --> libvirt 0.9.5 snapshots
>
> This method uses the newer snapshot xml in libvirt 0.9.5 to snapshot only the
> root disk.
>
> Pros:
> plays nicely with libvirt, so the vm is only paused for the minimum amount
> of time
> Cons:
> requires libvirt 0.9.5, which doesn't exist in oneiric

Oneiric is old hat. I'm cool with 0.9.5. If someone wants to spend
time building an inferior (i.e. that requires more downtime during
snapshotting) implementation that works with older versions of
libvirt, that's fine, but as a project, my view is: There's excellent
free software available that enables us to build cooler things faster.
We should use it.

Besides, libvirt 0.9.5 has been out for over 6 months. It's not *that* new.

--
Soren Hansen             | http://linux2go.dk/
Senior Software Engineer | http://www.cisco.com/
Ubuntu Developer         | http://www.ubuntu.com/
OpenStack Developer      | http://www.openstack.org/
Re: Libvirt Snapshots
On Fri, Mar 09, 2012 at 03:57:30PM +0100, Soren Hansen wrote:
> 2012/3/9 Vishvananda Ishaya <vishvananda@gmail.com>:
> > OPTION B --> libvirt 0.9.5 snapshots
> >
> > This method uses the newer snapshot xml in libvirt 0.9.5 to snapshot only the
> > root disk.
> >
> > Pros:
> > plays nicely with libvirt, so the vm is only paused for the minimum amount
> > of time
> > Cons:
> > requires libvirt 0.9.5, which doesn't exist in oneiric
>
> Oneiric is old hat. I'm cool with 0.9.5. If someone wants to spend
> time building an inferior (i.e. that requires more downtime during
> snapshotting) implementation that works with older versions of
> libvirt, that's fine, but as a project, my view is: There's excellent
> free software available that enables us to build cooler things faster.
> We should use it.
>
> Besides, libvirt 0.9.5 has been out for over 6 months. It's not *that* new.

And it is possible for people to build any new libvirt release for
any old Ubuntu release they desire to support. So if someone really
wants to use new OpenStack on older distros they aren't locked out.

Regards,
Daniel
Re: Libvirt Snapshots
Pedantry: It's QEMU/KVM, not libvirt, that holds the disks open.  The
pedantry does make a difference here I think...

A more sustainable option than being on the bleeding edge of libvirt
may be to try to bypass libvirt and issue those safe QEMU monitor
commands directly.  Libvirt would normally prevent this, but it looks
like there is a QEMU monitor command built into libvirt:
http://blog.vmsplice.net/2011/03/how-to-access-qemu-monitor-through.html
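
For illustration only, this is roughly what such a pass-through command might look like, assuming QEMU's QMP `blockdev-snapshot-sync` command (which creates an external snapshot, i.e. a new overlay file) and its `device`, `snapshot-file` and `format` arguments. The device name and overlay path are hypothetical. The sketch only builds the JSON string; it would then be sent through something like `virsh qemu-monitor-command`, which sits outside libvirt's supported API.

```python
import json


def blockdev_snapshot_sync(device, snapshot_file, fmt="qcow2"):
    """Build the QMP command that redirects a drive's writes to a new overlay.

    device is QEMU's own drive id, not the guest-visible name; the value
    used below is an assumed example.
    """
    return json.dumps({
        "execute": "blockdev-snapshot-sync",
        "arguments": {
            "device": device,
            "snapshot-file": snapshot_file,
            "format": fmt,
        },
    })


if __name__ == "__main__":
    print(blockdev_snapshot_sync("drive-virtio-disk0", "/tmp/overlay.qcow2"))
```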

So what does a libvirt snapshot actually do?  Here's the code:
http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_driver.c;h=be678f36527fd7918a1aaa69f54a6c939f258714;hb=HEAD

qemuDomainSnapshotCreateXML is the main function; it looks like we
want to pass in VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY.
That calls qemuDomainSnapshotCreateDiskActive, which then pauses the
VM, snapshots each disk in series using the QEMU monitor, and then
resumes the VM.

We should do the same thing, but better:

1. Suspend the domain using libvirt.
2. Snapshot each disk we want to snapshot _in parallel_, by using the
   libvirt QEMU monitor pass-through.  Remote volumes could use the
   correct driver so that e.g. a SAN disk could make use of a hardware
   snapshot.
3. Resume the domain using libvirt.

For Essex, it sounds like step 2 probably means "snapshot the root
volume only" using QEMU, and don't snapshot remote volumes.
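
The suspend / snapshot-in-parallel / resume sequence could be sketched like this. It is only an outline under assumptions: `suspend`, `resume` and `snapshot_disk` are hypothetical callables supplied by the caller (for example thin wrappers around libvirt or a SAN driver), and a thread pool stands in for whatever concurrency mechanism nova would actually use.

```python
from concurrent.futures import ThreadPoolExecutor


def snapshot_paused(suspend, resume, snapshot_disk, disks):
    """Pause the guest, snapshot every disk concurrently, always resume.

    Returns a dict mapping each disk name to whatever snapshot_disk
    returned for it. If any snapshot raises, the guest is still resumed
    and the exception propagates to the caller.
    """
    suspend()
    try:
        with ThreadPoolExecutor(max_workers=max(1, len(disks))) as pool:
            # pool.map preserves input order, so zip pairs results correctly.
            return dict(zip(disks, pool.map(snapshot_disk, disks)))
    finally:
        resume()


if __name__ == "__main__":
    log = []
    result = snapshot_paused(
        suspend=lambda: log.append("suspend"),
        resume=lambda: log.append("resume"),
        snapshot_disk=lambda dev: dev + ".snap",
        disks=["vda", "vdb"],
    )
    print(result, log)
```

Keeping the resume in a `finally` block means a failed snapshot cannot leave the guest paused, which matters because the pause is the whole cost of this approach.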

Post Essex:
1. We could try to avoid suspending / resuming the domain, by using a
   filesystem freeze/thaw.  It looks like libvirt has some (very new) support
   for this, but as this relies on a QEMU guest agent, I suspect we'd do
   better to roll our own here that could be cross-hypervisor.
2. We can allow selective snapshotting of disks.  On a database, for
   example, you really do want to snapshot all the disks together.
3. We could also support "optimistic" snapshots, which just do a
   snapshot without suspending anything. The use-case is that the caller
   issues a filesystem freeze e.g. over SSH, then requests a snapshot on
   each disk that they care about through OpenStack, then resumes normal
   operation over SSH.

I actually like that third option the best. I'd like to be able to
snapshot the root volume like I do any other volume, and I'd like to
be in control of the suspension mechanism. Suspending the entire VM
is a little extreme, particularly if I'm using a
filesystem/application that offers me a lower-impact alternative!
However, an easy way to "just snapshot everything" with a single call
is also attractive, and I'd imagine people using OpenStack directly
(rather than through code) would definitely use that.

Justin


Re: Libvirt Snapshots
On Fri, Mar 09, 2012 at 08:13:06AM -0800, Justin Santa Barbara wrote:
> Pedantry: It's QEMU/KVM, not libvirt, that holds the disks open.  The
> pedantry does make a difference here I think...
>
> A more sustainable option than being on the bleeding edge of libvirt
> may be to try to bypass libvirt and issue those safe QEMU monitor
> commands directly.  Libvirt would normally prevent this, but it looks
> like there is a QEMU monitor command built into libvirt:
> http://blog.vmsplice.net/2011/03/how-to-access-qemu-monitor-through.html

The command line and monitor passthrough capability is there as a means
to perform short term workarounds/hacks on a specific version of libvirt,
for functionality not already available via a supported API.

We make absolutely no guarantee that usage of passthrough capabilities
will continue to work correctly if you upgrade to a newer libvirt. As
such it is not something you really want to use as the basis for a
production release of OpenStack that expects compatibility with future
libvirt releases.

http://berrange.com/posts/2011/12/19/using-command-line-arg-monitor-command-passthrough-with-libvirt-and-kvm/

Regards,
Daniel
Re: Libvirt Snapshots
Even though it's more of a libvirt question, since the topic of snapshots is being discussed I thought I'd ask: does libvirt 0.9.5 use the backing file concept? Or is that the same thing that Vish mentioned as option 1?

Ranga

Re: Libvirt Snapshots
On Fri, Mar 09, 2012 at 10:43:35AM -0600, rbabu@hexagrid.com wrote:
> Even though it's more of a libvirt question since the topic of snapshot
> is being discussed, thought of asking it. Does libvirt 0.9.5 use the
> backing file concept? or is that the same thing that Vish mentioned
> as option 1

The latest snapshot APIs in libvirt are broadly configurable by passing
in suitable XML. So if you want to take snapshots on the SAN, or using
LVM or backing files, they can all be made to fit in with libvirt's
new APIs. I'm not entirely familiar with how to use them, so if you want
fine details, head over to the libvirt mailing lists, where the authors
of the libvirt snapshot code will be able to assist.

Daniel
Re: Libvirt Snapshots
Thanks for the background. My thoughts:

* Telling a user to build from source isn't a great option for them -
it's painful, they don't get updates automatically etc. Are we going
to start distributing packages again?

* I can't imagine any open source project removing functionality like
the QEMU pass-through - those sorts of tactics are confined to
commercial products that are trying to lock you in, in my experience.
I'm sure libvirt would never try that, and if they did I'm sure it
would rapidly be forked.

* I can imagine that libvirt might switch to the JSON QEMU monitor
protocol (I don't understand why it isn't already, I'm probably wrong
here?). Thankfully the QEMU JSON protocol also has a pass-through,
but we would ideally switch to the better protocol when/if libvirt
switches over.

* It sounds like this is the exact use-case for which the pass-through
functionality was designed - libvirt doesn't support what we need (at
least in commonly-utilized versions), we have some additional
potential roadmap features that may not be supported, so we _should_
use the pass-through.


Once commonly distributed versions of libvirt have the functionality
we need, then we can remove this functionality from OpenStack, but we
do have to fix this for Essex. Hopefully this can continue long-term:
OpenStack can develop the functionality that clouds need that libvirt
is missing, in Python instead of in C, and then libvirt can
incorporate this functionality where you want to.

Justin


Re: Libvirt Snapshots
On Fri, Mar 09, 2012 at 09:21:59AM -0800, Justin Santa Barbara wrote:
> Thanks for the background. My thoughts:
>
> * Telling a user to build from source isn't a great option for them -
> it's painful, they don't get updates automatically etc. Are we going
> to start distributing packages again?
>
> * I can't imagine any open source project removing functionality like
> the QEMU pass-through - those sort of tactics are confined to
> commercial products that are trying to lock you in, in my experience.
> I'm sure libvirt would never try that, and if they did I'm sure it
> would rapidly be forked.

It is not that we would intentionally remove the functionality. It
is more a problem that libvirt or QEMU changes the way they behave,
and this breaks some assumptions that are being relied upon by the
user of the passthrough. For example, in the past we have changed
the way we configure networks, and changed the way we configure
disks several times. Both those cases would have broken apps if
they had been using the passthrough feature.

I can't guarantee that if you use passthrough for managing snapshots,
you won't get broken by a future libvirt, because libvirt's
support for snapshots is under active development both at the libvirt
and QEMU layers. So an assumption that works today may not work in
the future.

> * I can imagine that libvirt might switch to the JSON QEMU monitor
> protocol (I don't understand why it isn't already, I'm probably wrong
> here?). Thankfully the QEMU JSON protocol also has a pass-through,
> but we would ideally switch to the better protocol when/if libvirt
> switches over.

We already use the JSON monitor protocol, if enabled at build time,
but it is configurable by the packager, so you can't necessarily
predict it.

> * It sounds like this is the exact use-case for which the pass-through
> functionality was designed - libvirt doesn't support what we need (at
> least in commonly-utilized versions), we have some additional
> potential roadmap features that may not be supported, so we _should_
> use the pass-through.

The only safe way to use the passthrough is to limit it to precise
versions, i.e. you would want to default to using the modern libvirt
APIs, and *only* fall back to using passthrough on older versions. That
should be reasonably safe, since old code doesn't often change itself :-)
So basically you'd be writing two versions of the OpenStack snapshot
code, and hopefully throwing away the passthrough version in a couple of
releases' time.
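
A minimal sketch of that version gating, assuming the integer encoding libvirt uses for version numbers (major * 1,000,000 + minor * 1,000 + release, as returned by the connection's `getLibVersion()`). The strategy names and the set of "known good" passthrough versions are hypothetical placeholders, not a tested list.

```python
def encode_version(major, minor, micro):
    """Encode a version the way libvirt reports it as a single integer."""
    return major * 1000000 + minor * 1000 + micro


# First release with the per-disk snapshot XML discussed in this thread.
MIN_NATIVE_SNAPSHOT = encode_version(0, 9, 5)


def pick_snapshot_strategy(lib_version, passthrough_versions):
    """Choose how to snapshot for a given libvirt version.

    "native"      -> use the supported snapshot API (the default)
    "passthrough" -> only for exact older versions it was verified against
    "offline"     -> shut the guest down and use qemu-img
    """
    if lib_version >= MIN_NATIVE_SNAPSHOT:
        return "native"
    if lib_version in passthrough_versions:
        return "passthrough"
    return "offline"


if __name__ == "__main__":
    # Illustrative only: pretend passthrough was verified against 0.9.2.
    verified = {encode_version(0, 9, 2)}
    for version in (encode_version(0, 9, 8), encode_version(0, 9, 2), encode_version(0, 8, 8)):
        print(version, pick_snapshot_strategy(version, verified))
```

Matching exact versions rather than a range keeps the passthrough path from silently applying to a libvirt release it was never tried on.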

Daniel
Re: Libvirt Snapshots
On Mar 9, 2012, at 8:51 AM, Daniel P. Berrange wrote:

> On Fri, Mar 09, 2012 at 10:43:35AM -0600, rbabu@hexagrid.com wrote:
>> Even though it's more of a libvirt question since the topic of snapshot
>> is being discussed, thought of asking it. Does libvirt 0.9.5 use the
>> backing file concept? or is that the same thing that Vish mentioned
>> as option 1
>
> The latest snapshot APIs in libvirt are broadly configurable by passing
> in suitable XML. So if you want to take snapshots on the SAN, or using
> LVM or backing files, they can all be made to fit in with libvirt's
> new APIs. I'm not entirely familiar with how to use it, so if you want
> fine details head over to the libvirt mailing lists where the authors
> of the libvirt snapshot code will be able to assist.

So far my experiments with the newer code in libvirt have been unsuccessful, so we may have to just go with the older version that does a managed save in between.

Using the external snapshot creates a backing file which seems to be impossible to remove. I can extract the snapshot, but attempting to delete it gives me:

libvirtError: unsupported configuration: deletion of 1 external disk snapshots not supported yet

I can't seem to do an internal only snapshot of just the disk. Doing it while the machine is running gives me:

libvirtError: unsupported configuration: active qemu domains require external disk snapshots; disk vda requested internal

and using managed save first:

libvirtError: unsupported configuration: disk snapshots of inactive domains not implemented yet

Unless there is some magic incantation that lets me create an external snapshot and then later remove it, so I don't leave a chain of backing files around, I think we have to go with the compatibility version.