Mailing List Archive

Status of 1.17 & 1.18
Hi all,


I'm wondering what the status for 1.17 is. How far are we from RC? Is
there any more review left?

Related to this, as our review burden for 1.17 lessens, we should
start to think about 1.18: re-recruit reviewers again, start thinking
about when to branch 1.18, etc. Are there any plans related to that?


Bryan

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Status of 1.17 & 1.18
Bryan Tong Minh <bryan.tongminh@gmail.com> writes:

> I'm wondering what the status for 1.17 is. How far are we from RC? Is
> there any more review left?

Initially, Chad was going to be the release manager for 1.17, but other
issues meant that he isn't going to have time to manage it right now.

I think the current plan is to have Tim make the 1.17 release. I don't
think any merges are left at this time besides, maybe, the recent XSS
fix.

Still, since Tim is the release manager, he should have more
definitive answers.

> Related to this, as our review burden for 1.17 lessens, we should
> start to think about 1.18: re-recruit reviewers again, start thinking
> about when to branch 1.18, etc. Are there any plans related to that?

Nothing formal yet, but all of us are very aware of the need to make
code review happen in a timely manner. I'll be watching CRStats
(http://toolserver.org/~robla/crstats/crstats.html) closely and
encouraging developers to help in code review.

I think branching 1.18 immediately after the 1.17 release (or now, for
that matter) will help us manage code review better. If we have people
testing the 1.18 branch and updating regularly (similar to what Ubuntu
does for their development) and we set a date (July 15th?) when we know
we have to have a release prepared, then that will help Code Review all
the more.

But after a bit of discussion on IRC, I think we should try to get
Tim's, Brion's and anyone else's opinion on what they think about the
release schedule and code review.

Mark

Re: Status of 1.17 & 1.18
2011/4/14 Mark A. Hershberger <mhershberger@wikimedia.org>:
> I think branching 1.18 immediately after the 1.17 release (or now, for
> that matter) will help us manage code review better.  If we have people
> testing the 1.18 branch and updating regularly (similar to what Ubuntu
> does for their development) and we set a date (July 15th?) when we know
> we have to have a release prepared, then that will help Code Review all
> the more.
>
July?!?

I know 1.17 took a long time, but that was like a year's worth of
code. We should strive to keep the branch-to-release time as low as we
can, and it definitely needs to be WAY less than 3 months. It's been
like 4 months for 1.17, but 1.17 was quite exceptional, and more
frequent and quicker releases should become the rule.

My opinion is it would be best to branch 1.18 now-ish and revert
Happy-melon's Action changes (he wholeheartedly agreed that's 1.19
material).

Slightly off-topic:

Also, we should get our code review act together in a more sustainable
way. I've brought this up before, but it hasn't gotten a lot of
attention, probably due to the 1.17 craze. We have to have a serious
discussion about code review reform (to use a political-sounding
term); I think the tech staff meeting after the Berlin hackathon would
be a good venue for discussing the WMF side of this. The conference
itself is really supposed to be a hackathon this time, so I'm not sure
that having a protracted discussion there would be a very good idea;
that's basically what we did the whole time last year, and this year
is supposed to not be like that for a reason.

As always we do of course need to be careful to not want to solve this
"internally" between WMF staff, but have a public discussion with
everyone regardless of whether they happen to be paid. However, my
impression is that this particular topic is one that mainly involves
staff and that it would be acceptable to hammer something out
internally and propose that on wikitech-l as something of a draft, in
this particular case. I'd be very interested to hear how unpaid
developers feel about that, as some of them have called out this
practice as undesirable back in September.

Roan Kattouw (Catrope)

Re: Status of 1.17 & 1.18
2011/4/14 Mark Hershberger <mhershberger@wikimedia.org>:
> Sorry, I should have been clearer.  Yes, branch now(ish) and then aim for a
> 1.18 release on July 15th.  My idea is that setting a date for the release
> to be soon and early would provide the motivation to the people involved in
> code review to keep it up-to-date.
>
The point I was trying to make was that July is by no means "soon and
early" in my book. It's three months away, which is way too long.
Setting a date is nice, but if we can get a release out before the set
date, that's a good thing, and I think we can (and /should/) get 1.18
out way faster.

Roan Kattouw (Catrope)

Re: Status of 1.17 & 1.18
Roan Kattouw wrote:
> As always we do of course need to be careful to not want to solve this
> "internally" between WMF staff, but have a public discussion with
> everyone regardless of whether they happen to be paid. However, my
> impression is that this particular topic is one that mainly involves
> staff and that it would be acceptable to hammer something out
> internally and propose that on wikitech-l as something of a draft, in
> this particular case. I'd be very interested to hear how unpaid
> developers feel about that, as some of them have called out this
> practice as undesirable back in September.

I'm not a developer, but I can say this: from the outside, there's the
appearance that a major part of the problem is that nobody seems to really
be in charge. I see this problem in both Wikimedia and MediaWiki code
development.

MZMcBride



Re: Status of 1.17 & 1.18
On 14/04/11 21:19, Bryan Tong Minh wrote:
> Hi all,
>
>
> I'm wondering what the status for 1.17 is. How far are we from RC? Is
> there any more review left?

I'll be doing a 1.17beta1 release soon, probably early next week.
There are 16 revisions tagged for backporting, those will have to be
reviewed and backported, and I'll have to have a quick look through
the older backports to make sure everything has a RELEASE-NOTES entry
and looks more or less sane.

-- Tim Starling


Re: Status of 1.17 & 1.18
On 15/04/11 04:22, Roan Kattouw wrote:
> 2011/4/14 Mark Hershberger <mhershberger@wikimedia.org>:
>> Sorry, I should have been clearer. Yes, branch now(ish) and then aim for a
>> 1.18 release on July 15th. My idea is that setting a date for the release
>> to be soon and early would provide the motivation to the people involved in
>> code review to keep it up-to-date.
>>
> The point I was trying to make was that July is by no means "soon and
> early" in my book. It's three months away, which is way too long.
> Setting a date is nice, but if we can get a release out before the set
> date, that's a good thing, and I think we can (and /should/) get 1.18
> out way faster.

My preference is for 2 to 3 major releases per year. We branched 1.17
in December and we're looking at doing a release in April. So a 4
month cycle would imply branching 1.18 in April and releasing in August.

I don't think having 4 or 5 major releases per year would serve anyone
particularly well. A slower release cadence means:

* Less hassle for non-Wikimedia users, since upgrades between major
releases require more work. Extensions break, patches break, DB
upgrades need to be done.

* Fewer branches to backport to. This reduces the amount of work that
needs to be done to backport security fixes and other bug fixes. We
drop support for branches based on time elapsed, not number of
versions released.

* Fewer branches to test against. If you're writing an extension that
is meant to work on multiple MediaWiki versions, it will be easier if
there are fewer versions that you need to test against, and potentially
write special-case code for.

* It's easier to do major projects in trunk. When you merge work into
trunk from a development branch, it's necessary to stabilise the code
before the next release. This can take a long time for a major
project. Both the new installer and the resource loader benefited from
a long release cycle in this way.

* More opportunity for whole-project review. When a project begins and
ends in a single release cycle, reviewers can wait for the project to
reach a state where the original developer is happy with it before
they start reviewing and giving comments. This means that the reviewer
doesn't have to spend so much time looking at intermediate commits.

-- Tim Starling


Re: Status of 1.17 & 1.18
On 15 April 2011 06:53, Tim Starling <tstarling@wikimedia.org> wrote:
> * Less hassle for non-Wikimedia users, since upgrades between major
> releases require more work. Extensions break, patches break, DB
> upgrades need to be done.

People upgrade seldom. If we have one release per year it is likely that
the code they upgrade to is already so old nobody remembers how it works.

> * Fewer branches to backport to. This reduces the amount of work that
> needs to be done to backport security fixes and other bug fixes. We
> drop support for branches based on time elapsed, not number of
> versions released.

I agree with this one, although I'm not the one who feels the pain here.

> * Fewer branches to test against. If you're writing an extension that
> is meant to work on multiple MediaWiki versions, it will be easier if
> there are fewer versions that you need to test against, and potentially
> write special-case code for.

On the other hand, with releases few and far between, I need to write a lot
of compatibility code in the Translate extension to even support the latest
stable release and trunk at the same time. Having branches for different
releases for my extension sounds like a lot of effort to maintain them,
not even speaking about supporting them.

But all of this is moot, since you're proposing 3 releases per year
and I'm complaining about having only one or two releases per year.
Three releases would be enough for me.

-Niklas

--
Niklas Laxström

Re: Status of 1.17 & 1.18
2011/4/15 Tim Starling <tstarling@wikimedia.org>:
> My preference is for 2 to 3 major releases per year. We branched 1.17
> in December and we're looking at doing a release in April. So a 4
> month cycle would imply branching 1.18 in April and releasing in August.
>
> I don't think having 4 or 5 major releases per year would serve anyone
> particularly well. A slower release cadence means:
>
I can get on board with having 3 releases per year, but I'll reiterate
that 3 months, let alone 4, between branching and releasing is too
long. Yes, 1.17 took 4 months to stabilize, but it was 10 months'
worth of code, so that's a 1:2.5 ratio. Interpolating that suggests
that a release with 4 months' worth of code can be prepared in less
than 2 months, and I think that once code review is organized properly
such that large backlogs don't happen anymore (we had a very large
backlog for 1.17 and I think we'll have a comparable one, considering
the difference in elapsed time, for 1.18, but I'd really like to have
this organized properly for 1.19 or 1.20), we can do better than that.
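
The estimate above can be checked with a short calculation. This is only a minimal sketch in Python, assuming stabilization time scales linearly with the amount of code being released; the function name and parameters are illustrative and do not come from the thread.

```python
def stabilization_months(code_months, ref_code_months=10.0, ref_stab_months=4.0):
    """Scale the 1.17 experience linearly.

    Reference point from the message: about 4 months of stabilization
    for about 10 months' worth of code (a 1:2.5 ratio).
    """
    return code_months * ref_stab_months / ref_code_months


# A release carrying 4 months' worth of code, under the linear assumption.
estimate = stabilization_months(4)
print(f"{estimate:.1f} months")  # prints "1.6 months", i.e. less than 2
```

The linear model is the simplest reading of the ratio argument; a large review backlog could easily make the real figure worse.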

Instead, you're proposing a 1:1 workflow where, at any given point in
time, we always have a release branch that's being stabilized, which
means we have to perpetually maintain three branches (trunk,
deployment, release) instead of two, and are always in the process of
preparing a release. I don't like that idea, and I think it's
unnecessary.

Roan Kattouw (Catrope)

Re: Status of 1.17 & 1.18
On 15/04/11 19:26, Roan Kattouw wrote:
> 2011/4/15 Tim Starling <tstarling@wikimedia.org>:
>> My preference is for 2 to 3 major releases per year. We branched 1.17
>> in December and we're looking at doing a release in April. So a 4
>> month cycle would imply branching 1.18 in April and releasing in August.
>>
>> I don't think having 4 or 5 major releases per year would serve anyone
>> particularly well. A slower release cadence means:
>>
> I can get on board with having 3 releases per year, but I'll reiterate
> that 3 months, let alone 4, between branching and releasing is too
> long. Yes, 1.17 took 4 months to stabilize, but it was 10 months'
> worth of code, so that's a 1:2.5 ratio. Interpolating that suggests
> that a release with 4 months' worth of code can be prepared in less
> than 2 months, and I think that once code review is organized properly
> such that large backlogs don't happen anymore (we had a very large
> backlog for 1.17 and I think we'll have a comparable one, considering
> the difference in elapsed time, for 1.18, but I'd really like to have
> this organized properly for 1.19 or 1.20), we can do better than that.
>
> Instead, you're proposing a 1:1 workflow where, at any given point in
> time, we always have a release branch that's being stabilized, which
> means we have to perpetually maintain three branches (trunk,
> deployment, release) instead of two, and are always in the process of
> preparing a release. I don't like that idea, and I think it's
> unnecessary.

That's a fair point. I didn't mean to propose a 1:1 workflow, I meant
to just make a point about release schedules.

I know that different developers have different ideas about branch
point schedules and how they should relate to release schedules. I
don't have a strong view at this stage.

-- Tim Starling


Re: Status of 1.17 & 1.18
2011/4/15 Tim Starling <tstarling@wikimedia.org>:
> That's a fair point. I didn't mean to propose a 1:1 workflow, I meant
> to just make a point about release schedules.
>
OK. If your main point was to say that we should branch 1.18 in April
and 1.19 in August, I'm cool with that. Releasing too infrequently is
bad, as we saw with 1.17, but you make solid points to support the
notion that releasing too frequently introduces problems of its own,
and that we should find middle ground. Speaking in terms of release
cycle length, I think that 4-6 months (2-3 releases/yr) is a bit long
and 3-4 months (3-4 releases/yr) is better, but I'm sure we can work
out a number. Your point that release cycle length should be
consciously and carefully decided on is a very good one, and I'm sorry
I hijacked it with my release latency argument.

Roan Kattouw (Catrope)

Re: Status of 1.17 & 1.18
On Thu, Apr 14, 2011 at 9:17 AM, Mark A. Hershberger <mhershberger@wikimedia.org> wrote:

> But after a bit of discussion on IRC, I think we should try to get
> Tim's, Brion's and anyone else's opinion on what they think about the
> release schedule and code review.
>

I say push out a release immediately.

-- brion

Re: Status of 1.17 & 1.18
On Fri, Apr 15, 2011 at 2:26 AM, Roan Kattouw <roan.kattouw@gmail.com> wrote:

> 2011/4/15 Tim Starling <tstarling@wikimedia.org>:
> > My preference is for 2 to 3 major releases per year. We branched 1.17
> > in December and we're looking at doing a release in April. So a 4
> > month cycle would imply branching 1.18 in April and releasing in August.
> >
> > I don't think having 4 or 5 major releases per year would serve anyone
> > particularly well. A slower release cadence means:
> >
> I can get on board with having 3 releases per year, but I'll reiterate
> that 3 months, let alone 4, between branching and releasing is too
> long.


I'd be happy with about two weeks: push 'beta' tarballs in the first week,
'release candidates' in the second week.

In the meantime, we should be running 1.18 on live servers, with a maximum
of a week lag from trunk, and preferably much less. Ongoing work on trunk
should always be keeping stability in mind, and code review should
concentrate on ensuring that code is being actively tested and used.

I know we had some delays due to wanting to finish the security fixes, but
I'm extremely concerned that trunk hasn't been maintained this way
since the initial 1.17 push.

Unexercised code is dangerous code that will break when you least expect it;
we need to get code into use fast, where it won't sit idle until we push it
live with a thousand other things we've forgotten about.

-- brion

Re: Status of 1.17 & 1.18
2011/4/15 Brion Vibber <brion@pobox.com>:
> I'd be happy with about two weeks: push 'beta' tarballs in the first week,
> 'release candidates' in the second week.
>
> In the meantime, we should be running 1.18 on live servers, with a maximum
> of a week lag from trunk, and preferably much less. Ongoing work on trunk
> should always be keeping stability in mind, and code review should
> concentrate on ensuring that code is being actively tested and used.
>
Amen to this, the rest of your post, and your previous post (release
1.17 ASAP). You're formulating my opinions better than I could;
cheesy-sounding but true :P

Roan Kattouw (Catrope)

Re: Status of 1.17 & 1.18
On 04/15/2011 12:07 PM, Brion Vibber wrote:
> Unexercised code is dangerous code that will break when you least expect it;
> we need to get code into use fast, where it won't sit idle until we push it
> live with a thousand other things we've forgotten about.

Translate wiki deserves major props for running a real world wiki on
trunk. It's hard to count all the bugs that get caught that way. Maybe once
the heterogeneous deployment situation gets figured out we could do
something similar with a particular project...

peace,
--michael

Re: Status of 1.17 & 1.18
Hoi,
It makes sense for translatewiki.net to run on trunk. This way we are
exposed to the latest messages and get as much localisation done before code
actually hits production servers. Running another project just because it
will run trunk only makes sense when its running trunk has added value.

What you can do is adopt translatewiki.net as your barometer for code
quality and help it run as smoothly as possible.
Thanks,
GerardM

On 15 April 2011 19:36, Michael Dale <mdale@wikimedia.org> wrote:

> On 04/15/2011 12:07 PM, Brion Vibber wrote:
> > Unexercised code is dangerous code that will break when you least expect it;
> > we need to get code into use fast, where it won't sit idle until we push it
> > live with a thousand other things we've forgotten about.
>
> Translate wiki deserves major props for running a real world wiki on
> trunk. It's hard to count all the bugs that get caught that way. Maybe once
> the heterogeneous deployment situation gets figured out we could do
> something similar with a particular project...
>
> peace,
> --michael
>

Re: Status of 1.17 & 1.18
On Fri, Apr 15, 2011 at 12:10 PM, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:

> Hoi,
> It makes sense for translatewiki.net to run on trunk. This way we are
> exposed to the latest messages and get as much localisation done before code
> actually hits production servers. Running another project just because it
> will run trunk only makes sense when its running trunk has added value.
>
> What you can do is adopt translatewiki.net as your barometer for code
> quality and help it run as smoothly as possible.
>

translatewiki.net is a great help, but don't forget that it doesn't run all
the same extensions as are used in Wikimedia production sites. Regressions
affecting things like CentralAuth can and do strike with very little
warning; we've had several in the last few weeks that are only being caught
because I have it set up on my workstation's dev instance and I see the
breakages while I'm testing unrelated things.

It's important to actually be exercising the same code and the same
configurations that are running in production. And when some bugs still
don't get caught during that testing, it helps *a lot* to have only a
minimal change set to look at since your last deployment. Changes can be
rolled back more easily, and the problems found and fixed and redeployed
more easily.

-- brion

Re: Status of 1.17 & 1.18
On 16/04/11 03:07, Brion Vibber wrote:
> In the meantime, we should be running 1.18 on live servers, with a maximum
> of a week lag from trunk, and preferably much less. Ongoing work on trunk
> should always be keeping stability in mind, and code review should
> concentrate on ensuring that code is being actively tested and used.

Yeah, I've heard this before. It didn't work the first time around,
and I don't think it can work now. We can't use Wikipedia as a testing
site for alpha-quality code anymore.

I think we should have a cycle of:

* Development branch merges and other major work in trunk.
* Review and stabilisation over the course of a couple of months,
alongside general development work.
* Branch point.
* A period of backports and review to ensure the stability of the new
branch.
* Testing, for 1-2 weeks.
* Deployment.

This is what we did for 1.17, and it worked well, leading to a 1.17
deployment which caused a minimum of disruption.

> Unexercised code is dangerous code that will break when you least expect it;
> we need to get code into use fast, where it won't sit idle until we push it
> live with a thousand other things we've forgotten about.

This certainly wasn't my experience with the 1.17 deployment. We had a
great deal of review and testing of the 1.17 branch, and many bugs
were fixed without having to get Wikipedians to tell us about them.

> translatewiki.net is a great help, but don't forget that it doesn't run all
> the same extensions as are used in Wikimedia production sites.

No it doesn't, that's why we set up public test wikis which did have a
similar set of extensions: first a set of wikis separate from the main
cluster on prototype.wikimedia.org, and then a test wiki which was
part of the cluster. Then we did a staged deployment, deploying 1.17
to several wikis at a time.

CT and Robla were very supportive of this deployment strategy, and
setting up permanent systems for deploying different versions to
different wikis is now a high priority project.

We had a significant amount of manpower dedicated to testing the
software on prototype.wikimedia.org, both Wikimedia staff and experts
contracted via Calcey QA.

It's not the same site as it was when you first proposed this policy.

-- Tim Starling


Re: Status of 1.17 & 1.18
On Fri, Apr 15, 2011 at 5:06 PM, Tim Starling <tstarling@wikimedia.org> wrote:

>
> This is what we did for 1.17, and it worked well, leading to a 1.17
> deployment which caused a minimum of disruption.
>
> > Unexercised code is dangerous code that will break when you least expect it;
> > we need to get code into use fast, where it won't sit idle until we push it
> > live with a thousand other things we've forgotten about.
>
> This certainly wasn't my experience with the 1.17 deployment. We had a
> great deal of review and testing of the 1.17 branch, and many bugs
> were fixed without having to get Wikipedians to tell us about them.
>

*nod*

I think I've oversimplified with the 'deploy more often' part of things;
lemme try to reorganize my arguments into something hopefully more cogent.

**tl;dr summary: More frequent testing of smaller pieces of changed code
among small, but real sets of users should be a useful component of getting
things tested and deployed faster and more safely. Judicious testing and
deployment should help support a safer, but still much more aggressive,
overall update frequency.**


It's certainly a fact that many bugs were found and fixed before deployment
-- the organization around testing and bugfixing was in many ways FAR FAR
superior to any deployment we've done before, and I don't mean to take away
from that.

But it's also true that there were other bugs buried in code that had been
changed 8-9 months previously, making it harder to track them down -- and
much more difficult to revert them if a fix wasn't obvious. I certainly
experienced that during the 1.17 deployment, and received the same
impression from other developers at the time.

There was also a production outage for a time due to the choice not to
initially do a staged rollout. This lesson has been learned, so should not
be an issue in future deployments.

> > translatewiki.net is a great help, but don't forget that it doesn't run all
> > the same extensions as are used in Wikimedia production sites.
>
> No it doesn't, that's why we set up public test wikis which did have a
> similar set of extensions: first a set of wikis separate from the main
> cluster on prototype.wikimedia.org, and then a test wiki which was
> part of the cluster.


Indeed, that is a very useful component of an ongoing development+deployment
strategy. But lack of real traffic and real usage makes this only a limited
part of testing. I also experienced that some of the prototype sites were
broken for days or weeks (CentralAuth configuration problems was my
impression?), which prevented me from being able to confirm some bugs
reported against prototype sites at the time.

One thing that can help with this is to run more actual, but lower traffic,
sites on the prototype infrastructure so people are really dogfooding them:
a broken prototype site should be something requiring an immediate fix.

For instance, we programmers probably use www.mediawiki.org a lot more
aggressively than regular people do, *and* we have access to the code and
some have access to the server infrastructure. It might be an ideal
candidate for receiving more frequent updates from trunk.


> Then we did a staged deployment, deploying 1.17
> to several wikis at a time.
>

This was one of my recommendations for the 1.17 deployment, so yes that's
exactly the sort of thing I'm advocating.

It was initially rejected because the old heterogeneous deploy scripts were
out of date and there was concern that they wouldn't get done in time and might
just break things worse. They then got reimplemented in a hurry when it
turned out that yes, indeed, 1.17 broke when simply applied to the entire
cluster at once -- reimplementing it was definitely the right choice and it
significantly smoothed out the deployment once it happened.

> It's not the same site as it was when you first proposed this policy.
>

It's a bigger site with more users, increasing the danger that small changes
will cause unexpected breakages. I believe that smaller change sets that get
more directly tested will help to reduce that danger.

Major sites like Google and Facebook are much more aggressive about A/B
testing and progressive rollouts than we've ever been -- not in place of all
other forms of testing and review, but definitely in addition. We have
relatively limited resources, but we're not just three guys with an rsync
script anymore... I think we can do better with what we've got.


I think this is a situation that will benefit from more aggressive testing,
including more live & A/B testing: fine-grained rollouts mean fine-grained
testing and fine-grained debugging. Not always perfect, but if problems get
exposed and fixed quicker, in a relatively small audience but still big
enough to drive real usage behavior, I think that's a win.

I do agree that just slapping trunk onto *.wikipedia.org every couple days
isn't a great idea at this stage, but I think we can find an intermediate
level that gets code into real, live usage on an ongoing rolling basis. Some
things that may help:

* Full heterogeneous deployment system so real but lower-traffic sites can be
regularly run on more aggressive update schedules than high-traffic sites
* Targeting specific experimental code to specific sites (production
prototypes?)
* Being able to better separate fixed backend and more experimental frontend
code for a/b testing
* Cleaner separation of modules: we shouldn't have to update CentralAuth to
update ProofreadPage on Wikisource.
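
To illustrate the first point in the list above, here is a minimal sketch in Python of how sites might be grouped into rollout stages, lowest traffic first. Everything in it is an assumption made for illustration: the `Wiki` class, the `plan_stages` function, and the traffic figures are invented and do not describe the actual Wikimedia deployment scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wiki:
    name: str
    daily_requests: int  # made-up traffic figure, for illustration only


def plan_stages(wikis, stage_sizes):
    """Order wikis from lowest to highest traffic and split them into stages.

    Any wikis left over after the requested stage sizes form a final stage.
    """
    ordered = sorted(wikis, key=lambda w: w.daily_requests)
    stages, start = [], 0
    for size in stage_sizes:
        stages.append([w.name for w in ordered[start:start + size]])
        start += size
    if start < len(ordered):
        stages.append([w.name for w in ordered[start:]])
    return stages


# Hypothetical sites with invented relative traffic.
wikis = [
    Wiki("enwiki", 4),
    Wiki("mediawiki.org", 1),
    Wiki("commons", 3),
    Wiki("wikisource", 2),
]
for number, names in enumerate(plan_stages(wikis, [1, 2]), start=1):
    print(f"stage {number}: {', '.join(names)}")
```

Lower-traffic stages would receive updates first and more often, so a regression reaches a small audience before it can reach the largest sites.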

One issue we see at present is that since we version and deploy core and
extensions together, it's tough to get a semi-experimental extension into
limited deployment with regular updates. Let's make sure that's clean and
easy to do; right now it's very easy to deploy experimental JavaScript
into a gadget or site JS, but an extension may just sit idle in SVN for
years, unusable in production even if it's limited, modular code because no
one wants to deploy it. If there's interest it may get a prototype site, but
if they only get used by the testing crew or when we ask someone to go and
make some fake edits on them, they're not going to have all their bugs
exercised.

Being able to do more self-directed prototype sites with the upcoming
virtualization infrastructure should help with that, and for certain
front-end things it should be possible to use JS whatsits to hook some of
that code into live sites for opt-in or a/b testing -- further reducing
dangers by removing the server-side variations and providing an instant
switch-back to the old code.
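
The opt-in, a/b split and instant switch-back described above could look roughly like the following. This is a hedged sketch in Python of the selection logic only, not of any existing gadget or MediaWiki mechanism; the function names, the hashing scheme, and the `kill_switch` flag are all assumptions.

```python
import hashlib


def in_test_group(user_id: str, experiment: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets for an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def choose_code_path(user_id, experiment, percent, opted_in=False, kill_switch=False):
    """Return "new" or "old" for this user.

    kill_switch models the instant switch-back: when set, everyone gets
    the old code regardless of opt-in or bucket.
    """
    if kill_switch:
        return "old"
    if opted_in or in_test_group(user_id, experiment, percent):
        return "new"
    return "old"


# A user who opted in gets the new code until the switch is flipped.
print(choose_code_path("ExampleUser", "new-feature", 5, opted_in=True))
print(choose_code_path("ExampleUser", "new-feature", 5, opted_in=True, kill_switch=True))
```

Hashing the user and experiment name together keeps each user's assignment stable between page views without storing any per-user state.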


I don't advocate just blindly updating the whole stack all the time; I
advocate aiming for smaller pieces that can be run and tested more easily
and more safely in more flexible ways.

As a power user willing to risk my neck to make things better, I want to be
able to opt in to the "Wikipedia beta" and actually get an experimental new
feature *on Wikipedia or Commons* a lot more often. As a developer, I want
to be able to get things into other people's hands so they can test them for
me and give me feedback.

This is one of the reasons I'm excited about the future of Gadgets -- the
JS+CSS side has always been the free-for-all where experimental tools can
actually be created and tested and used in a real environment, while
MediaWiki's PHP side has remained difficult to update in pieces. It's easier
to deploy those things, and should get even easier and more powerful with
time.

We should consider what we can do to make the PHP side smoother and easier
as well, though obviously we are much more limited for security and
functional safety reasons.

-- brion
Re: Status of 1.17 & 1.18 [ In reply to ]
On Fri, 15-04-2011 at 18:41 -0700, Brion Vibber wrote:

>
> One issue we see at present is that since we version and deploy core and
> extensions together, it's tough to get a semi-experimental extension into
> limited deployment with regular updates. Let's make sure that's clean and
> easy to do too; right now it's very easy to deploy experimental JavaScript
> into a gadget or site JS, but an extension may just sit idle in SVN for
> years, unusable in production even if it's limited, modular code because no
> one wants to deploy it. If there's interest it may get a prototype site, but
> if they only get used by the testing crew or when we ask someone to go and
> make some fake edits on them, they're not going to have all their bugs
> exercised.
>
> Being able to do more self-directed prototype sites with the upcoming
> virtualization infrastructure should help with that, and for certain
> front-end things it should be possible to use JS whatsits to hook some of
> that code into live sites for opt-in or a/b testing -- further reducing
> dangers by removing the server-side variations and providing an instant
> switch-back to the old code.
>
>
> I don't advocate just blindly updating the whole stack all the time; I
> advocate aiming for smaller pieces that can be run and tested more easily
> and more safely in more flexible ways.
>
> As a power user willing to risk my neck to make things better, I want to be
> able to opt in to the "Wikipedia beta" and actually get an experimental new
> feature *on Wikipedia or Commons* a lot more often. As a developer, I want
> to be able to get things into other people's hands so they can test them for
> me and give me feedback.

The ability to easily test a feature, an extension or an update on a
small percentage of users, based on opt-in, project/language or simple
random percentage, is something that many shops have and that we should
prioritize adding to our deployment toolkit. It's unrealistic to think
that we will uncover all of the issues, even the serious ones, ourselves
running on a test environment, or even on the cluster. Our users
exercise this code in ways that aren't even on our radar, which is a
good thing; let's make use of it.
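
As a rough illustration of the three selection criteria mentioned above (opt-in, project/language, random percentage), here is a minimal sketch. The function names, parameters and wiki identifiers are hypothetical and not part of MediaWiki; the percentage is made sticky per user by hashing, which is one common approach rather than anything the thread prescribes.

```python
import hashlib


def bucket(user_id: str, feature: str) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(feature, user_id, wiki, *, opted_in=frozenset(),
               wikis=frozenset(), percent=0):
    """Decide whether a user on a given wiki sees an experimental feature."""
    if user_id in opted_in:   # explicit opt-in always wins
        return True
    if wiki in wikis:         # whole project/language is enrolled
        return True
    # Otherwise enroll a fixed share of users; the same user always
    # lands in the same bucket, so their experience does not flip-flop.
    return bucket(user_id, feature) < percent
```

Setting percent back to 0 and emptying the two sets turns the feature off for everyone, which gives the quick switch-back that makes this kind of limited rollout low-risk.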

FWIW I also support having a much more aggressive testing, deployment and
release schedule, for many of the reasons already described by others.

Ariel


Re: Status of 1.17 & 1.18 [ In reply to ]
Assuming that there are no destructive bugs in the reviewed code, we
could have en.alpha.wikipedia.org URLs.

Our tests also need to be improved, so that we don't keep hitting the
same boulders.


Re: Status of 1.17 & 1.18 [ In reply to ]
On Fri, Apr 15, 2011 at 9:41 PM, Brion Vibber <brion@pobox.com> wrote:
> But it's also true that there were other bugs buried in code that had been
> changed 8-9 months previously, making it harder to track them down -- and
> much more difficult to revert them if a fix wasn't obvious. I certainly
> experienced that during the 1.17 deployment, and received the same
> impression from other developers at the time.

For a concrete example, see
<http://www.mediawiki.org/wiki/Special:Code/MediaWiki/83544> and
follow-up commits. I made that commit shortly after 1.17 deployment,
working with Roan to resolve a bug in my categorylinks rewrite. It
turned out that I got confused by my own variable naming and
completely broke non-uppercase collations in the process of fixing the
bug that was visible on Wikimedia. That required effort by
translatewiki.net to track down the bug again more than a month later.
I'm pretty sure I wouldn't have made that mistake if I had been
writing the fix two weeks later instead of six months later.

Re: Status of 1.17 & 1.18 [ In reply to ]
On Thu, Apr 14, 2011 at 8:27 PM, Tim Starling <tstarling@wikimedia.org> wrote:

> On 14/04/11 21:19, Bryan Tong Minh wrote:
> > Hi all,
> >
> >
> > I'm wondering what the status for 1.17 is. How far are we from RC? Is
> > there any more review left?
>
> I'll be doing a 1.17beta1 release soon, probably early next week.
> There are 16 revisions tagged for backporting, those will have to be
> reviewed and backported, and I'll have to have a quick look through
> the older backports to make sure everything has a RELEASE-NOTES entry
> and looks more or less sane.
>

Any updates on this process? What's left to do, and what can people help
with?

-- brion
Re: Status of 1.17 & 1.18 [ In reply to ]
On Wed, Apr 27, 2011 at 12:33 PM, Brion Vibber <brion@pobox.com> wrote:
> Any updates on this process? What's left to do, and what can people help
> with?
>
> -- brion
>

Getting the merges reviewed [0] and making sure we have a good set of
release notes. That's what I know of, and Reedy's been working on the
latter.

-Chad

[0] http://mediawiki.org/wiki/Special:Code/MediaWiki/status/new?path=/branches/REL1_17/
