Mailing List Archive

Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
Hi

Today, SnapshotDeletionPolicy (SDP) allows taking only one snapshot. I need
to be able to take multiple snapshots. Consider multiple processes doing
several things on the index - each needs a snapshot of the index so that
the commit it needs doesn't get deleted under the covers. SDP is perfect,
except that it allows only one snapshot. So I was thinking of extending it
to a MultiSDP which adds an 'id' parameter to snapshot() and release(). But
then I thought - why shouldn't that exist in SDP? It won't make the API
any more complicated, and in addition won't introduce yet another DP class.

This can be done in two ways:
1) snapshot() and release() get the extra parameter. For convenience we can
allow a null id, in which case only one snapshot w/ a null id can be
taken (until it's released). That avoids making up an id if you really need
only a single snapshot.
2) add variants of snapshot() and release() which take an id as an argument.
Or ... extend SDP to MultiSDP.
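
A minimal sketch of what option 1 could look like, for illustration only. The class IdSnapshotSketch, its onCommit() hook and isProtected() are hypothetical stand-ins, and commits are represented by plain segments file names; none of this is Lucene's actual SnapshotDeletionPolicy API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the proposal; NOT Lucene's SnapshotDeletionPolicy.
class IdSnapshotSketch {
    // Snapshot id -> name of the commit it protects (e.g. "segments_3").
    // HashMap permits a null key, which models the single "no id" snapshot.
    private final Map<String, String> snapshots = new HashMap<>();
    private String lastCommit; // most recent commit seen so far

    // Assumed hook: called whenever a new commit becomes visible.
    synchronized void onCommit(String segmentsName) {
        lastCommit = segmentsName;
    }

    // Takes a snapshot of the latest commit under the given id (may be null).
    synchronized String snapshot(String id) {
        if (lastCommit == null) {
            throw new IllegalStateException("no commit to snapshot yet");
        }
        if (snapshots.containsKey(id)) {
            throw new IllegalStateException("snapshot id already in use: " + id);
        }
        snapshots.put(id, lastCommit);
        return lastCommit;
    }

    // Releases the snapshot registered under the given id.
    synchronized void release(String id) {
        if (!snapshots.containsKey(id)) {
            throw new IllegalStateException("no snapshot with id: " + id);
        }
        snapshots.remove(id);
    }

    // True while at least one snapshot still refers to this commit.
    synchronized boolean isProtected(String segmentsName) {
        return snapshots.containsValue(segmentsName);
    }
}
```

With this shape, a caller that needs only one snapshot passes null to both methods, and a second snapshot(null) fails until the first is released.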

I'd prefer if we keep that functionality in SDP, but if you prefer an
extension to it, then we'll need to allow for easier extension of SDP (I
think we should do that anyway).

What do you think?

Shai
Re: Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
This would be great!

I think we should just enhance SDP rather than make a new MDP.

Mike

On Tue, Apr 27, 2010 at 9:59 AM, Shai Erera <serera@gmail.com> wrote:
> Today, SnapshotDeletionPolicy (SDP) allows taking only one snapshot. I need
> to be able to take multiple snapshots.
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
I think we should enhance SDP.
I also think we shouldn't do IDs. snapshot() returns IndexCommitPoint,
release() should get a parameter accepting IndexCommitPoint, that's
all.
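
A rough sketch of the ID-free shape suggested here, under stated assumptions: CommitHandle is a hypothetical stand-in for the commit object that snapshot() would return, and HandleSnapshotSketch, onCommit() and isProtected() are invented for illustration. It is not Lucene code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical handle standing in for the commit object returned by snapshot().
// It does not override equals/hashCode, so every handle is distinct by identity.
final class CommitHandle {
    final String segmentsName;

    CommitHandle(String segmentsName) {
        this.segmentsName = segmentsName;
    }
}

// Hypothetical policy sketch; NOT Lucene's SnapshotDeletionPolicy.
class HandleSnapshotSketch {
    private final Set<CommitHandle> held = new HashSet<>();
    private String lastCommit;

    // Assumed hook: called whenever a new commit becomes visible.
    synchronized void onCommit(String segmentsName) {
        lastCommit = segmentsName;
    }

    // Each call returns a fresh handle, so no caller-supplied id is needed.
    synchronized CommitHandle snapshot() {
        if (lastCommit == null) {
            throw new IllegalStateException("no commit to snapshot yet");
        }
        CommitHandle handle = new CommitHandle(lastCommit);
        held.add(handle);
        return handle;
    }

    // The caller hands back exactly the handle it received.
    synchronized void release(CommitHandle handle) {
        if (!held.remove(handle)) {
            throw new IllegalStateException("not a live snapshot");
        }
    }

    // True while at least one live handle refers to this commit.
    synchronized boolean isProtected(String segmentsName) {
        for (CommitHandle handle : held) {
            if (handle.segmentsName.equals(segmentsName)) {
                return true;
            }
        }
        return false;
    }
}
```

The handle itself plays the role of the id, which keeps the API small but leaves nothing obvious to persist outside the JVM.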

On Tue, Apr 27, 2010 at 18:54, Michael McCandless
<lucene@mikemccandless.com> wrote:
> This would be great!
>
> I think we should just enhance SDP rather than make a new MDP.
>
> Mike



--
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

Re: Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
On Tue, Apr 27, 2010 at 11:03 AM, Earwin Burrfoot <earwin@gmail.com> wrote:
> I think we should enhance SDP.
> I also think we shouldn't do IDs. snapshot() returns IndexCommitPoint,
> release() should get a parameter accepting IndexCommitPoint, that's
> all.

+1

Mike

Re: Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
That's an interesting point ...

I still think we should use IDs though, because the application that uses
this (like me :)) may want to persist the snapshots it holds somewhere else,
in case the JVM crashes. That's another weak point of SDP today - if the JVM
dies, your snapshot might get deleted, which depending on why you took it may
not be what you want. So, what I've implemented actually allows init'ing SDP
w/ existing snapshot information (a Map from String [id] to String
[segmentsName]), so that when you open IndexWriter, the wrapped DP won't
delete those snapshots for you (onInit will ensure that).
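
To illustrate the restore-on-startup idea, here is a small sketch under loud assumptions: PersistedSnapshotSketch and commitsToDelete() are hypothetical names, commits are plain segments file names listed oldest first, and the wrapped policy is assumed to keep only the newest commit. The implementation described above is not shown in this thread and may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the implementation described in the message.
class PersistedSnapshotSketch {
    // Snapshot id -> segments file name, as persisted by the application.
    private final Map<String, String> idToSegments = new HashMap<>();

    // Seed the policy with snapshot information saved before a crash or restart.
    PersistedSnapshotSketch(Map<String, String> saved) {
        if (saved != null) {
            idToSegments.putAll(saved);
        }
    }

    // Rough analogue of an onInit pass over the commits found at startup
    // (oldest first). Assumes the wrapped policy keeps only the newest commit;
    // any commit named by a restored snapshot is additionally spared.
    List<String> commitsToDelete(List<String> commits) {
        List<String> deletable = new ArrayList<>();
        for (int i = 0; i < commits.size() - 1; i++) {
            String name = commits.get(i);
            if (!idToSegments.containsValue(name)) {
                deletable.add(name);
            }
        }
        return deletable;
    }
}
```

For example, with a saved entry mapping "copyIndex" to "segments_2" and commits segments_1 through segments_3 on disk, only segments_1 would be reported as deletable.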

Another thing I've run into is several processes that take a snapshot of
the same commit. The IDs allow me to safely release only one snapshot, while
the commit remains because it's still snapshotted by another process. That
would still work if release() received an IndexCommit (I could decRef that IC
or something), but wouldn't it be less clear?
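
The "decRef" alternative mentioned above could be sketched roughly as follows. RefCountSnapshotSketch, incRef() and decRef() are hypothetical names chosen for illustration, and commits are again identified by segments file name; this is one possible reading of the idea, not existing Lucene behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-commit reference counting; NOT Lucene code.
class RefCountSnapshotSketch {
    // Segments file name -> number of snapshots currently holding that commit.
    private final Map<String, Integer> refCounts = new HashMap<>();

    // One more snapshot now holds this commit.
    synchronized void incRef(String segmentsName) {
        refCounts.merge(segmentsName, 1, Integer::sum);
    }

    // One snapshot lets go of this commit.
    // Returns true if other snapshots still hold it, false once it is free.
    synchronized boolean decRef(String segmentsName) {
        Integer count = refCounts.get(segmentsName);
        if (count == null) {
            throw new IllegalStateException("commit is not snapshotted: " + segmentsName);
        }
        if (count == 1) {
            refCounts.remove(segmentsName);
            return false;
        }
        refCounts.put(segmentsName, count - 1);
        return true;
    }
}
```

The count keeps the commit alive correctly, but it does not record which process holds which reference, which is the clarity concern raised above.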

So ... while both those scenarios are still supported if we don't use IDs, I
think IDs will be easier for apps to integrate w/ that feature. Sometimes
it's useful to know why a snapshot is held, and if you use a meaningful ID,
like "copyIndex[timestamp]", one can determine that the snapshot is not
needed anymore. In fact, some comments I've received about using IDs were
that IDs are too simple, and perhaps a more complex object can be associated
w/ a snapshot. But for now I think an ID is enough, until proven otherwise.

What do you think?

Shai

On Tue, Apr 27, 2010 at 6:04 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Tue, Apr 27, 2010 at 11:03 AM, Earwin Burrfoot <earwin@gmail.com>
> wrote:
> > I think we should enhance SDP.
> > I also think we shouldn't do IDs. snapshot() returns IndexCommitPoint,
> > release() should get a parameter accepting IndexCommitPoint, that's
> > all.
>
> +1
>
> Mike