Mailing List Archive

XML dumps
The XML database dumps are missing all through May, apparently
because of a memory leak that is being worked on, as described
here,
https://phabricator.wikimedia.org/T98585

However, that information doesn't reach the person who wants to
download a fresh dump and looks here,
http://dumps.wikimedia.org/backup-index.html

I think it should be possible to make a regular schedule for
when these dumps should be produced, e.g. once each month or
once every second month, and treat any delay as a bug. The
process to produce them has been halted by errors many times
in the past, and even when it runs as intended the interval
is unpredictable. Now when there is a bug, all dumps are
halted, i.e. much delayed. For a user of the dumps, this is
extremely frustrating. With proper release management, it
should be possible to run the old version of the process
until the new version has been tested, first on some smaller
wikis, and gradually on the larger ones.
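
As a rough sketch of what "treat any delay as a bug" could mean in practice, the check below flags every wiki whose latest dump is older than an agreed interval. The 31-day interval, the function name overdue_dumps, and the wiki names and dates are all assumptions made for illustration; none of this reflects how the actual dump scripts track completion.

```python
from datetime import date, timedelta

# Hypothetical policy: every wiki gets a fresh dump at least this often.
MAX_INTERVAL = timedelta(days=31)


def overdue_dumps(last_completed, today, max_interval=MAX_INTERVAL):
    """Return {wiki: days_late} for every wiki whose last dump is too old."""
    late = {}
    for wiki, finished in last_completed.items():
        age = today - finished
        if age > max_interval:
            late[wiki] = (age - max_interval).days
    return late


if __name__ == "__main__":
    # Made-up completion dates, for illustration only.
    example = {
        "smallwiki": date(2015, 5, 20),
        "bigwiki": date(2015, 4, 3),
    }
    report = overdue_dumps(example, date(2015, 5, 29))
    for wiki, days in sorted(report.items()):
        print(f"{wiki}: dump is {days} days overdue, file a bug")
```

The same report could just as well be shown on the download index page, so that someone looking for a fresh dump sees that it is late and why.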


--
Lars Aronsson (lars@aronsson.se)
Aronsson Datateknik - http://aronsson.se



_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: XML dumps
Hear, hear.
Gerard

On 29 May 2015 at 01:52, Lars Aronsson <lars@aronsson.se> wrote:

> The XML database dumps are missing all through May, apparently
> because of a memory leak that is being worked on, as described
> here,
> https://phabricator.wikimedia.org/T98585
>
> However, that information doesn't reach the person who wants to
> download a fresh dump and looks here,
> http://dumps.wikimedia.org/backup-index.html
>
> I think it should be possible to make a regular schedule for
> when these dumps should be produced, e.g. once each month or
> once every second month, and treat any delay as a bug. The
> process to produce them has been halted by errors many times
> in the past, and even when it runs as intended the interval
> is unpredictable. Now when there is a bug, all dumps are
> halted, i.e. much delayed. For a user of the dumps, this is
> extremely frustrating. With proper release management, it
> should be possible to run the old version of the process
> until the new version has been tested, first on some smaller
> wikis, and gradually on the larger ones.
Re: XML dumps
On 05/28/2015 07:52 PM, Lars Aronsson wrote:
> With proper release management, it
> should be possible to run the old version of the process
> until the new version has been tested, first on some smaller
> wikis, and gradually on the larger ones.

I understand your frustration; however, release management was not the
issue in this case. According to Ariel Glenn on the task
(https://phabricator.wikimedia.org/T98585#1284441), "It's not a new
leak, it's just that the largest single stubs file in our dumps runs is
now produced by wikidata!".

In other words, it was caused by changes to the input data (i.e. our
projects), not by changes to the code.

Matt Flaschen
