Mailing List Archive

how to detect index integrity?
Hi,

Is there any way to check an index's integrity?
Sometimes I run into exceptions like these. When that happens, my only
option is to delete the corrupted index.

* Exception in thread "main" java.io.IOException : read past EOF
* java.lang.ArrayIndexOutOfBoundsException



Being able to verify the index's integrity would prevent nightmares like
this one: http://java2.5341.com/msg/73077.html


Chris
Re: how to detect index integrity? [ In reply to ]
Somebody recently submitted a NoOpDirectory, which may help you detect
a corrupt index. There are no tools that will fix a corrupt index,
though.

Otis


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: how to detect index integrity? [ In reply to ]
That would definitely help!
Where can I get the source? I couldn't find it in the Sandbox or in any
part of the source tree that I know of.

Chris

Otis Gospodnetic wrote:

>Somebody recently submitted a NoOpDirectory, which may help you detect
>a corrupt index. There are no tools that will fix a corrupt index,
>though.
>
>Otis
Re: how to detect index integrity? [ In reply to ]
It is currently in bugzilla.

Otis

--- Chris Lu <chris.lu@gmail.com> wrote:
> That would definitely help!
> Where to get the source? I couldn't find it in Sandbox or the source
> code tree that I know of.
>
> Chris

RE: how to detect index integrity? [ In reply to ]
> From: chris.lu@gmail.com
> Sent: Fri 3/18/2005 11:34 PM

> Is there any way to detect the index's integrity?
> Sometimes I came upon exceptions like these. If it happens, my only way
> is to delete the corrupted index.

> * Exception in thread "main" java.io.IOException : read past EOF
> * java.lang.ArrayIndexOutOfBoundsException

> [ ... ]

I did too, which is why I wrote NullDirectory. You can find the
sources and a description in bugzilla.

http://issues.apache.org/bugzilla/show_bug.cgi?id=33851

Look at the tests for examples of use. I would value your feedback.
--
Ravi/
Re: how to detect index integrity? [ In reply to ]
Hi, Ravi,

I used your NullDirectory.java and found that it works fine for smaller
indexes. With larger indexes, though (I am not quite sure; this is just
an observation), it always throws this exception here:

    private void refill() throws IOException {
        long start = bufferStart + bufferPosition;
        long end = start + BUFFER_SIZE;
        if (end > length) // don't read past EOF
            end = length;
        bufferLength = (int)(end - start);
        if (bufferLength == 0)
            throw new IOException("read past EOF");
        ....

The printout and stack trace are:

merging segments _2io (50 docs) _2k3 (50 docs) _2li (50 docs) _2mx (50 docs) _2oc (50 docs) _2pr (50 docs) _2r6 (50 docs) _2sl (50 docs) _2u0 (50 docs) _2u4 (3 docs) into _0 (453 docs)
merging segments _25x (50 docs) _27c (50 docs) _28r (50 docs) _2a6 (50 docs) _2bl (50 docs) _2d0 (50 docs) _2ef (50 docs) _2fu (50 docs) _2h9 (50 docs) _0 (453 docs)
ERROR 44|java.io.IOException: read past EOF|...
java.io.IOException: read past EOF
    at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:66)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:104)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:94)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:480)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:366)
    at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:389)
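
For anyone following along, here is a minimal, self-contained sketch (not the Lucene source) of how an end-of-file check like the one in refill() turns into "read past EOF" when a file is shorter than the reader expects. The class name TruncatedReadDemo, the helper tryReadVInt, the deliberately tiny BUFFER_SIZE, and the in-memory byte array standing in for an index file are all assumptions made for illustration. The integer decoding uses the usual variable-length scheme of seven payload bits per byte, with the high bit marking "more bytes follow".

```java
import java.io.IOException;

/** Hypothetical stand-in for a buffered index input; not Lucene's actual class. */
class TruncatedReadDemo {
    private static final int BUFFER_SIZE = 4; // tiny on purpose, so refills happen often

    private final byte[] file;                 // pretend file contents
    private final long length;                 // file length in bytes
    private final byte[] buffer = new byte[BUFFER_SIZE];
    private long bufferStart = 0;              // file offset of buffer[0]
    private int bufferLength = 0;              // number of valid bytes in buffer
    private int bufferPosition = 0;            // next byte to hand out

    TruncatedReadDemo(byte[] file) {
        this.file = file;
        this.length = file.length;
    }

    /** Same end-of-file check as the snippet quoted in the thread. */
    private void refill() throws IOException {
        long start = bufferStart + bufferPosition;
        long end = start + BUFFER_SIZE;
        if (end > length)                      // don't read past EOF
            end = length;
        bufferLength = (int) (end - start);
        if (bufferLength == 0)
            throw new IOException("read past EOF");
        System.arraycopy(file, (int) start, buffer, 0, bufferLength);
        bufferStart = start;
        bufferPosition = 0;
    }

    byte readByte() throws IOException {
        if (bufferPosition >= bufferLength)
            refill();
        return buffer[bufferPosition++];
    }

    /** Decodes a variable-length int: 7 bits per byte, high bit = continuation. */
    int readVInt() throws IOException {
        byte b = readByte();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = readByte();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Illustrative helper: reports either the decoded value or the error message. */
    static String tryReadVInt(byte[] data) {
        try {
            return "value " + new TruncatedReadDemo(data).readVInt();
        } catch (IOException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReadVInt(new byte[] {(byte) 0x96, 0x01})); // complete VInt
        System.out.println(tryReadVInt(new byte[] {(byte) 0x96}));       // continuation byte missing
        System.out.println(tryReadVInt(new byte[0]));                    // empty file
    }
}
```

In this sketch a complete two-byte value decodes normally, while a file that ends after a continuation byte, or an empty file, fails in refill() with the same message as in the trace above. It only illustrates the failure mode; it says nothing about why the file in the real index was short.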



Chris

RE: how to detect index integrity? [ In reply to ]
> From: chris.lu@gmail.com
> Sent: Fri 3/25/2005 2:19 AM

> I used your NullDirectory.java, and found it works fine for smaller
> indexes, but when it comes to larger indexes(I am not quite sure, just
> observation), it always throws this exception at here:

> [. ... code and stack trace moved to end of mail ...]

Chris,

The call to addIndexes results in the target directory (in this case
an instance of NullDirectory) being optimized twice. Your stacktrace
shows that it is the second call to optimize that is the problem. I
assume this is 'lucene-1.4-final'.

I will get to this as soon as I can. Thanks for taking the time to
send me this information.
--
Ravi/
-----------------------------------------------------------

    private void refill() throws IOException {
        long start = bufferStart + bufferPosition;
        long end = start + BUFFER_SIZE;
        if (end > length) // don't read past EOF
            end = length;
        bufferLength = (int)(end - start);
        if (bufferLength == 0)
            throw new IOException("read past EOF");
        ....

The printout and stack trace are:

merging segments _2io (50 docs) _2k3 (50 docs) _2li (50 docs) _2mx (50 docs) _2oc (50 docs) _2pr (50 docs) _2r6 (50 docs) _2sl (50 docs) _2u0 (50 docs) _2u4 (3 docs) into _0 (453 docs)
merging segments _25x (50 docs) _27c (50 docs) _28r (50 docs) _2a6 (50 docs) _2bl (50 docs) _2d0 (50 docs) _2ef (50 docs) _2fu (50 docs) _2h9 (50 docs) _0 (453 docs)
ERROR 44|java.io.IOException: read past EOF|...
java.io.IOException: read past EOF
    at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:66)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:104)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:94)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:480)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:366)
    at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:389)
Re: how to detect index integrity? [ In reply to ]
Hi Ravi:

I'd like to use this too. Do you have an update on this?

Thanks

-John


On Fri, 25 Mar 2005 07:28:17 -0600, Ravi Rao <rrao@alterpoint.com> wrote:
> Chris,
>
> The call to addIndexes results in the target directory (in this case
> an instance of NullDirectory) being optimized twice. Your stacktrace
> shows that it is the second call to optimize that is the problem. I
> assume this is 'lucene-1.4-final'.
>
> I will get to this as soon as I can. Thanks for taking the time to
> send me this information.
