Postmortem for the MediaWiki 1.17 release
Hi everyone

I've just posted postmortem notes on the MediaWiki 1.17 release here:
http://www.mediawiki.org/wiki/MediaWiki_1.17/Release_postmortem

...and since I expect there will be some editing/futzing with that
page, I've included the full wikitext below. Also, I wouldn't be
surprised if this generates some discussion on this list.

(start of wikitext):

We released [[MediaWiki 1.17]] on June 22. In the interests of doing
better next time, a small group of us (Tim, Chad, Sam, Sumana, and
RobLa) got together to brainstorm what went right and what we need to
look at. [[User:RobLa-WMF|RobLa]] then summarized that discussion
in this writeup. Any first-person references are probably me (RobLa),
and any references to "we" are probably to the group above.
See the history for this page for the raw notes.

Note: this is specifically about the MediaWiki 1.17.0 release, rather
than the 1.17 deployment.

== Timeline ==

Here is the timeline, derived from SVN commit logs:
* 2010-07-28 - MediaWiki 1.16.0 released
* 2010-12-07 - REL1_17 branched. This is the branch that MediaWiki
1.17.0 was based on.
* 2011-02-03 - 1.17wmf1 branched
* 2011-05-05 - MediaWiki 1.17.0beta1 tagged
* 2011-06-14 - MediaWiki 1.17.0rc1 released
* 2011-06-22 - MediaWiki 1.17.0 released
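
As a rough cross-check of the intervals discussed in this writeup, the gaps between the milestones above can be computed directly. The snippet below is only a sketch: the dates are copied from the timeline, while the names MILESTONES and days_between are made up for illustration and are not part of any MediaWiki tooling.

```python
from datetime import date

# Milestone dates copied from the timeline above.
# The dictionary keys are illustrative labels, not official identifiers.
MILESTONES = {
    "1.16.0 released": date(2010, 7, 28),
    "REL1_17 branched": date(2010, 12, 7),
    "1.17wmf1 branched": date(2011, 2, 3),
    "1.17.0beta1 tagged": date(2011, 5, 5),
    "1.17.0rc1 released": date(2011, 6, 14),
    "1.17.0 released": date(2011, 6, 22),
}


def days_between(start: str, end: str) -> int:
    """Return the number of calendar days from one named milestone to another."""
    return (MILESTONES[end] - MILESTONES[start]).days


if __name__ == "__main__":
    pairs = [
        ("1.16.0 released", "1.17.0 released"),      # full release cycle
        ("REL1_17 branched", "1.17.0 released"),     # branch to release
        ("1.17.0beta1 tagged", "1.17.0rc1 released"),  # length of the beta
    ]
    for start, end in pairs:
        print(f"{start} -> {end}: {days_between(start, end)} days")
```

The first pair covers the full release cycle, the second the time from branching to release, and the third the length of the beta period.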

== How it went ==

We started by brainstorming "what went well" and "what to look at".
In the initial brainstorming, the original group had many more items
in the "what to look at" section than in the "what went well". I
then set about organizing things, and settled upon four categories:
substance, polish, timing, and process. What became clear was that we
felt pretty good about the substance and polish of the release (where
positives and negatives balanced out pretty well), but the timing and
process categories had the most that we needed to look at.

=== Substance and polish ===

As for the substance, it went very well. We had three large features
(ResourceLoader, category sorting and the new installer) that
complicated this release. As of this writing, it looks like these
features are in pretty good shape, and we can be pretty proud of
releasing them in the state that they're in. We fixed a lot of bugs
(207 noted in the [[Release notes/1.17|release notes]]), and made many
smaller improvements to the codebase. Everyone was right to be very
eager to get this release out.

Things of substance that didn't go so well: our PostgreSQL support
suffered until quite late in the process, and our command line
installer is incomplete in some frustrating ways. On PostgreSQL: the
developers who fixed the last of the bugs aren't people that use
PostgreSQL on a day-to-day basis. The folks that normally develop our
PostgreSQL support had other engagements, and we don't have a very
deep list of people to fall back on. We need to work out a plan for
engaging PostgreSQL users as developers in this area, or it will be
very difficult to continue support for this DB. The command line
interface to the installer just needs a little more time to mature;
there are many ways of solving this problem without delaying a
release, but I won't get overly prescriptive in this writeup.

The polish of 1.17 was superb. The release notes were well-written,
and there hasn't been an urgent need for a rapid 1.17.1 release.
We'll do one anyway, since there were a couple of niggly bugs that can
be fixed easily enough.

=== Timing ===

As noted, the biggest area for improvement is around the timing and
release process. It wasn't all bad; we did (just barely) manage to
keep the release cycle under one year. Still, that's much longer than
our aspiration of quarterly releases, or even the previous historic
norm of 2-3 releases per year. Moreover, it has been a long time
since branching 1.17, so we already have seven months' worth of work
backed up for future releases. 1.18 was branched in early May, so in
addition to the five months of changes we have backed up for that
release, we already have two more months of changes backed up for
1.19.

The biggest thing that delayed this release (and the 1.17 deployment
in March) was the code review backlog. That topic has been covered in
many earlier threads, but a brief recap: after the 1.16 release, we
fell way behind on code review, relying solely on Tim up until that
point. We added more reviewers in October, which helped us get the
backlog down to a reasonable level by December. We branched, finished
off the 1.17-specific review, and deployed. Further minor review work
was needed prior to the 1.17 release. With more Wikimedia Foundation
developers spending 20% of their time on review, we're optimistic
we'll be able to finish off the backlog and stay on top of the review
process.

As we drew closer to the 1.17 release, we issued 1.17 beta 1. This
beta unintentionally lasted several weeks as we tried to finish off
the last of the release blockers. In particular, a security bug we
worked on during this time created an awkward situation, since we had
to iterate multiple times to fully plug the hole. The good news,
though, is that the period was long enough for us to get some good
end-user testing and bug reporting prior to the final release.

=== Process ===

Process is where we need the most work. The actual logistics of
putting up the tarball and other bits are working well (these haven't
changed in years), but everything leading up to that point could use a
lot of streamlining.

The first issue is purely one of scoping. Right now, we're not
terribly deliberate about what goes in and what is out. Part of the
problem we have here is that opinions vary as to what a reasonable
release interval is. The range of opinion seems to be anywhere from
"multiple times a day" to "every six months". It's difficult to plan
this without getting consensus on this point, and it's difficult to
get consensus without first proving that we can get on top of the code
review backlog and stay on top of it. If we go with a longer cycle,
we can consider adopting a process similar to GNOME<ref>Example of
GNOME release timeline: http://live.gnome.org/ThreePointOne</ref>,
Ubuntu, or another project that has a good track record of sticking
with regular releases. The most interesting practices there involve
having clear deadlines for proposing new features, deadlines for
features being done or pulled, and other date-risk mitigation
strategies.

As with the code review process last year, this year, we're probably
too reliant on Tim to not only drive but execute many steps. One way
we can speed up the process is to document it, making it clear where
we are in the process, and more importantly, how people can help.
"Help" can mean explicitly doing the work, but it can also be simply
"don't do things that delay the release further", or "stop others from
delaying the release". We have a wonderful [[Release checklist]], but
that list was too focused on the last steps before the release. Many
steps before the actual publication of the tarball were missing, so
they've been added into that document. More work can be done there.
Additionally, we will probably experiment with other team members
(e.g. Chad) performing at least alpha or beta releases.

During this release, we tagged many things "1.17" for backporting from
trunk. This process was useful, as long as people remembered to untag
once they'd merged. There was some confusion at various times about who
was responsible for doing this work. It switched sometimes between
Roan, Chad, Tim and others. Additionally, pretty much everyone felt
empowered to tag things for backporting, but there probably wasn't
enough discipline in trimming that list back before actually making
the change. Some unreviewed changes were backported (or directly
applied) to the release branch, causing confusion and delay. We have
a policy about backporting
<ref>http://www.mediawiki.org/wiki/Commit_access_requests#Guidelines_for_applying_patches
- bullet points 4 & 5</ref>, but that policy wasn't followed very
closely.

The process of finding release notes that weren't added and then
backporting them was work that could have been done by people other
than Tim, but Tim ended up doing most of this. This is work that
needs to happen sooner in the process in a more distributed fashion.
Additionally, one way to avoid this extra work is to keep backporting
to a minimum in the first place.

This gets to the larger issue of communication and momentum at the end
of this process. With timezone differences, it's not sustainable to
have daily scrums all of the time, but having scrums during the last
couple of weeks or so in the process may help keep things moving to
the end.

== Recommendations ==
This section is intentionally left unfinished. The goal of this was
to establish and document what happened. To the extent anything is
incorrect or misleading above, corrections are encouraged.
Recommendations for new things to try based on lessons learned from
this release should be included below:

* ''your recommendation here''

...and possibly discussed on the talk page (suggestions above may be
ruthlessly edited; talk page is better for attribution and
preservation).

== References ==

<references/>

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Postmortem for the MediaWiki 1.17 release
Rob Lanphier wrote:
> As noted, the biggest area for improvement is around the timing and
> release process. It wasn't all bad; we did (just barely) manage to
> keep the release cycle under one year. Still, that's much longer than
> our aspiration of quarterly releases, or even the previous historic
> norm of 2-3 releases per year. Moreover, it has been a long time
> since branching 1.17, so we already have seven months worth of work
> backed up for future releases. 1.18 was branched in early May, so in
> addition to the five months of changes we have backed up for that
> release, we already have two more months of changes backed up for
> 1.19.

[...]

> The first issue is purely one of scoping. Right now, we're not
> terribly deliberate about what goes in and what is out. Part of the
> problem we have here is that opinions vary as to what a reasonable
> release interval is. The range of opinion seems to be anywhere from
> "multiple times a day" to "every six months". It's difficult to plan
> this without getting consensus on this point, and it's difficult to
> get consensus without first proving that we can get on top of the code
> review backlog and stay on top of it. If we go with a longer cycle,
> we can consider adopting a process similar to GNOME<ref>Example of
> GNOME release timeline: http://live.gnome.org/ThreePointOne</ref> or
> Ubuntu or other project that has a good track record for sticking with
> a regular releases. The most interesting practices there involve
> having clear deadlines for proposing new features, deadlines for
> features being done or pulled, and other date-risk mitigation
> strategies.

Thank you for writing all of this up. It looks like it probably took quite a
bit of time, and I appreciate it.

I pulled out two paragraphs that seem to be the nuggets. Without having this
thread devolve into another chase-your-tail thread, I'd say that the main
issue is that the release manager for 1.17 has a much more conservative
approach, and when looking at it through that lens, 1.17 was right on time.

Tim has outlined on this mailing list why he believes that more infrequent
releases are better, and his arguments are not necessarily invalid; I just
don't think they have any consensus behind them. I think Wikimedia and other
MediaWiki users would like a faster release process. But that's _completely
irrelevant_ when it's one person doing the work and putting together the
final release.

That, in a nutshell, seems to be the point of contention. The release (and
deployment!) timelines are perfectly aligned with a conservative approach,
but a lot of others (Brion, Neil, Chad, Roan, and in some ways Erik, among
others) have recommended a less conservative approach ("perfect is the enemy
of the done") that I believe would keep end-users and developers much
happier.

There's been a recent change-up in Wikimedia staffing, so I don't know who
will be managing the 1.18 release, but if it's the same person, my bet is
that it's going to take the same amount of time. In my view, a few people
(one?) see the longer release/deployment period as a feature, while the
majority of people see it as a bug. :-)

MZMcBride



Re: Postmortem for the MediaWiki 1.17 release
On 7 July 2011 20:55, MZMcBride <z@mzmcbride.com> wrote:

> Tim has outlined on this mailing list why he believes that more infrequent
> releases are better, and his arguments are not necessarily invalid, I just
> don't think they have any consensus behind them. I think Wikimedia and other
> MediaWiki users would like a faster release process. But that's _completely
> irrelevant_ when it's one person doing the work and putting together the
> final release.


Are we talking about WMF deployments or tarballs here? Speaking as a
tarball user, 2 releases a year, maybe 3, is *just fine*.


- d.

Re: Postmortem for the MediaWiki 1.17 release
On Thu, Jul 7, 2011 at 10:02 PM, David Gerard <dgerard@gmail.com> wrote:
> Are we talking about WMF deployments or tarballs here? Speaking as a
> tarball user, 2 releases a year, maybe 3, is *just fine*.
>
I think 3 releases per year is fine. However, I think we should deploy
to WMF sites much more often than that. That's basically been my
position throughout this debate.

Roan

Re: Postmortem for the MediaWiki 1.17 release
On Thu, Jul 7, 2011 at 10:18 PM, Roan Kattouw <roan.kattouw@gmail.com> wrote:
> On Thu, Jul 7, 2011 at 10:02 PM, David Gerard <dgerard@gmail.com> wrote:
>> Are we talking about WMF deployments or tarballs here? Speaking as a
>> tarball user, 2 releases a year, maybe 3, is *just fine*.
>>
> I think 3 releases per year is fine. However, I think we should deploy
> to WMF sites much more often than that. That's basically been my
> position throughout this debate.
>
+1

Re: Postmortem for the MediaWiki 1.17 release
David Gerard wrote:
> On 7 July 2011 20:55, MZMcBride <z@mzmcbride.com> wrote:
>> Tim has outlined on this mailing list why he believes that more infrequent
>> releases are better, and his arguments are not necessarily invalid, I just
>> don't think they have any consensus behind them. I think Wikimedia and other
>> MediaWiki users would like a faster release process. But that's _completely
>> irrelevant_ when it's one person doing the work and putting together the
>> final release.
>
> Are we talking about WMF deployments or tarballs here? Speaking as a
> tarball user, 2 releases a year, maybe 3, is *just fine*.

As far as I'm aware, tarball releases and Wikimedia deployments have largely
shifted to being at approximately the same (slower) pace, but they're not
synchronized. But you're absolutely right that there's no need for that to
be the case.

I'm muddying the waters a bit by discussing both releases and deployments at
once, and for that I apologize. That said, they are obviously
interconnected. Ideally you want code (Wikimedia deployments) that has been
run in the wild for a while in order to catch issues that would never be
caught in development. That makes for a better tarball release.

In this case, you also largely have the same person filling both roles
(currently? I don't know). That is, Tim was the 1.17 release manager and he
was the point-person doing the 1.17 deployment, as far as I remember, at
least. As I said in my previous post, there have been some shifts in job
titles (cf. Erik's e-mail a few weeks ago), which I think correlate to some
shifts in job responsibilities, but that's still unclear to me.

For what it's worth, I agree that two or three tarball releases per year
would be fine; that just means getting Wikimedia deployments off of the same
schedule.

MZMcBride



Re: Postmortem for the MediaWiki 1.17 release
On 07/07/11 03:42, Rob Lanphier wrote:
> http://live.gnome.org/ThreePointOne <snip>
> having clear deadlines for proposing new features, deadlines for
> features being done or pulled, and other date-risk mitigation
> strategies.

Having a roadmap like GNOME's is the approach I am advocating.

Another approach I could consider is having a stable branch and merging in
only stable/reviewed patches. After each merge you can either:
- hold for more patches
- release on live site
- tag a release (beta, RC...)
This path is probably as predictable as the first one. Its drawback is
that new features might have less attention.

Anyway, both ways are *very* far away from our wiki-way of handling
/trunk/ (which is messy).

--
Ashar Voultoiz


Re: Postmortem for the MediaWiki 1.17 release
On 07/07/11 03:42, Rob Lanphier wrote:
> we're probably too reliant on Tim to not only drive but execute
> many steps.
<snip>
> Additionally, we will probably experiment with other team members
> (e.g. Chad) performing at least alpha or beta releases.

This is actually a great way to train new people. Let Chad take the
release management cycle. Make sure Tim is around, though, or next release
he will have to be trained by Chad :-b

--
Ashar Voultoiz

