Mailing List Archive

editing channels - "How was this edit made?"
Hi,

At the Bangalore DevCamp I spoke a bit with Brion about a way to
measure the various ways of editing MediaWiki pages. The original idea was
to measure how much mobile editing, when it becomes widely
available, is actually used. A simplistic solution would be to add a
boolean "rev_mobile" field to the revision table, but the same question
applies to a lot of other things, for example:
* Visual Editor vs. the current wiki-syntax editor
* A usual browser vs. AutoWikiBrowser vs. direct API calls
* bots vs. non-bots
* for file uploads, Special:Upload vs. Special:UploadWizard

Things get even more complicated, because several such flags may apply
at once: for example, I can imagine a human editor using a mobile
editing interface with a bot flag, because he makes a lot of tiny
edits and the community doesn't want them to appear in RecentChanges.

And of course, there may be privacy and performance implications, too.

Nevertheless, some kind of metrics for the various contribution
channels would be useful. Any more ideas?
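
A minimal sketch of why a single boolean column falls short, assuming a hypothetical set of channel flags (the names below are illustrative and not part of MediaWiki). Several flags can be combined on one edit, such as the human on mobile with a bot flag described above:

```python
from enum import Flag, auto


class EditChannel(Flag):
    """Hypothetical flags describing how an edit was made."""
    MOBILE = auto()
    VISUAL_EDITOR = auto()
    API = auto()
    BOT = auto()


def describe(channels: EditChannel) -> list[str]:
    """Return the names of all flags set on an edit, in declaration order."""
    return [c.name for c in EditChannel if c in channels]


# A human editing on mobile whose account carries a bot flag:
edit = EditChannel.MOBILE | EditChannel.BOT
print(describe(edit))
```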

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: editing channels - "How was this edit made?"
Dario has been proposing RevTagging to address exactly this need, see:
http://www.mediawiki.org/wiki/Revtagging

I really think we should put this on the MediaWiki roadmap for 2013; we
definitely need this more granular level of instrumentation for determining
the source of an edit.

Best
Diederik


On Tue, Nov 13, 2012 at 6:19 AM, Amir E. Aharoni <
amir.aharoni@mail.huji.ac.il> wrote:

> Hi,
>
> In the Bangalore DevCamp I spoke a bit with Brion about a way to
> measure various ways of editing MediaWiki pages. The original idea was
> to measure how much the mobile editing, when it becomes widely
> available, is actually used. A simplistic solution would be add a
> boolean "rev_mobile" field to the revision table, but this can apply
> to a lot of other things, for example:
> * Visual Editor vs. the current wiki-syntax editor
> * A usual browser vs. AutoWikiBrowser vs. direct API calls
> * bots vs. non-bots
> * for file uploads, Special:Upload vs. Special:UploadWizard
>
> Things get even more complicated, because several such flags may apply
> at once: for example, I can imagine a human editor using a mobile
> editing interface with a bot flag, because he makes a lot of tiny
> edits and the community doesn't want them to appear in RecentChanges.
>
> And of course, there may be privacy and performance implications, too.
>
> Nevertheless, some kind of metrics of the various contributions
> channels would be useful. Any more ideas?
>
> --
> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
> http://aharoni.wordpress.com
> ‪“We're living in pieces,
> I want to live in peace.” – T. Moore‬
>
Re: editing channels - "How was this edit made?"
Diederik van Liere wrote:
> Dario has been proposing RevTagging to exactly address this need, see:
> https://www.mediawiki.org/wiki/Revtagging
>
> I really think we should put this on the roadmap for 2013 for Mediawiki, we
> definitely need this more granular level of instrumentation for determining
> the source of an edit.

Please stop top-posting. If you don't understand what that means, please
read <https://wiki.toolserver.org/view/Mailing_list_etiquette>.

As I posted at <https://www.mediawiki.org/wiki/Talk:Revtagging>, it's not
clear to me why the built-in revision tagging system in MediaWiki is
insufficient for your needs. It _feels_ like wheel-reinvention, but perhaps
there's some key component I'm missing.

MZMcBride



Re: editing channels - "How was this edit made?"
On 13/11/12 23:42, MZMcBride wrote:
> Please stop top-posting. If you don't understand what that means, please
> read <https://wiki.toolserver.org/view/Mailing_list_etiquette>.
>
> As I posted at <https://www.mediawiki.org/wiki/Talk:Revtagging>, it's not
> clear to me why the built-in revision tagging system in MediaWiki is
> insufficient for your needs. It _feels_ like wheel-reinvention, but perhaps
> there's some key component I'm missing.

It should indeed be enough to use change_tag.

Also note that some parameters listed in the page are redundant for some
campaigns (such as adding the bot name).


Re: editing channels - "How was this edit made?"
On 2012-11-14, at 18:33, Platonides <Platonides@gmail.com> wrote:

> On 13/11/12 23:42, MZMcBride wrote:
>> Please stop top-posting. If you don't understand what that means, please
>> read <https://wiki.toolserver.org/view/Mailing_list_etiquette>.
>>
>> As I posted at <https://www.mediawiki.org/wiki/Talk:Revtagging>, it's not
>> clear to me why the built-in revision tagging system in MediaWiki is
>> insufficient for your needs. It _feels_ like wheel-reinvention, but perhaps
>> there's some key component I'm missing.
>
> It should indeed be enough to use change_tag.
>
> Also note that some parameters listed in the page are redundant for some
> campaigns (such as adding the bot name).


I think that the Analytics team would prefer either:
1) detecting the source of the edit from the URL, or
2) having a hook fire after a successful edit and send the data to the pixel service.

Having this data in a MySQL table poses a lot of challenges with respect to importing it into the analytics cluster.
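
Option 2 could look roughly like the sketch below. It is only an illustration: the beacon URL, the parameter names and the function names are all assumptions, since the thread does not describe the pixel service's actual interface.

```python
from urllib.parse import urlencode

# Hypothetical beacon endpoint; the real pixel service URL is not given here.
BEACON_URL = "https://pixel.example.org/event.gif"


def build_beacon_url(rev_id, channels):
    """Build the URL a post-save hook could request to report one edit."""
    query = urlencode({"rev_id": rev_id, "channel": sorted(channels)}, doseq=True)
    return f"{BEACON_URL}?{query}"


def on_edit_saved(rev_id, channels, send=print):
    """Stand-in for a hook that runs only after an edit is saved successfully.

    `send` is injected so the sketch stays free of network calls.
    """
    send(build_beacon_url(rev_id, channels))


on_edit_saved(12345, ["mobile", "bot"])
```

Because the data leaves through the beacon at save time, nothing has to be imported from the transactional database afterwards.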

Best
Diederik
Re: editing channels - "How was this edit made?"
On 15.11.2012, 4:06 Diederik wrote:

> I think that the Analytics team would prefer either:
> 1) detect source of edit in the URL
> Or
> 2) have a hook activated after a successful edit and have the data send to the pixel service

> Having this data in a MySQL table poses a lot of challenges with
> respect of importing that data into the analytics cluster

That's for analytics purposes. However, there can be other use cases
for which tags in the DB are perfect, for example filtering recent
changes to show only edits made via a particular channel.

--
Best regards,
Max Semenik ([[User:MaxSem]])


Re: editing channels - "How was this edit made?"
Max Semenik wrote:
> On 15.11.2012, 4:06 Diederik wrote:
>> I think that the Analytics team would prefer either:
>> 1) detect source of edit in the URL
>> Or
>> 2) have a hook activated after a successful edit and have the data send to
>> the pixel service
>
>> Having this data in a MySQL table poses a lot of challenges with
>> respect of importing that data into the analytics cluster
>
> That's for analytics purposes. However, there can be other use cases
> for which tags in the DB are perfect, for example filter recent
> changes for edits made only via a particular channel.

Right, which is why a revision tagging system exists in MediaWiki core
currently. If someone wanted to, for example, modify the MobileFrontend
extension to add a "mobile" tag to edits, it would be trivial to do. The
tagging infrastructure is already in place.

Going back to the broader point, I'm completely lost as to why the Analytics
team can't handle a structured database.

MZMcBride



Re: editing channels - "How was this edit made?"
On Nov 15, 2012, at 2:51 PM, MZMcBride <z@mzmcbride.com> wrote:

> Max Semenik wrote:
>> On 15.11.2012, 4:06 Diederik wrote:
>>> I think that the Analytics team would prefer either:
>>> 1) detect source of edit in the URL
>>> Or
>>> 2) have a hook activated after a successful edit and have the data send to
>>> the pixel service
>>
>>> Having this data in a MySQL table poses a lot of challenges with
>>> respect of importing that data into the analytics cluster
>>
>> That's for analytics purposes. However, there can be other use cases
>> for which tags in the DB are perfect, for example filter recent
>> changes for edits made only via a particular channel.

Max is right.

The general issue is that the revision table could use a generalized metadata store, the same way that the page table has page_props[1]. This need is not the same as the analytical one, but the two sometimes coincide (I assume that if we come up with a way to attach revision-based metadata, it would be easy to expose that same data to the analytics pipeline for RevTagging).

To Amir's original suggestion: I think that hacking a rev_mobile field into the revision table sounds extremely clunky. I'd be worried that over time this will end up as an explosion of columns resembling our Recentchanges table[2]. I assume that's why Amir brought this up for suggestions.

On the other hand, a "revision_props" table modeled on page_props would be a terrible waste of space and performance: imagine storing a boolean in a BLOB with every revision.

Perhaps we could use a small varchar or smallint in place of the BLOB, which would not have too high an impact but would be flexible enough to handle both existing (key: mobile_edit; value: 1) and future needs? Especially if mobile_edit=0 isn't actually stored as an entry at all.
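
A minimal sketch of that sparse key/value idea, using SQLite in place of MySQL. The revision_props table and its rp_* columns are hypothetical and do not exist in MediaWiki's schema; the point is only that a value of 0 is represented by the absence of a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: not part of MediaWiki's schema.
conn.execute(
    """
    CREATE TABLE revision_props (
        rp_rev_id INTEGER NOT NULL,
        rp_key    VARCHAR(32) NOT NULL,
        rp_value  SMALLINT NOT NULL,
        PRIMARY KEY (rp_rev_id, rp_key)
    )
    """
)


def set_prop(rev_id, key, value):
    """Store a property only when it differs from the implied default of 0."""
    if value == 0:
        conn.execute(
            "DELETE FROM revision_props WHERE rp_rev_id = ? AND rp_key = ?",
            (rev_id, key),
        )
    else:
        conn.execute(
            "INSERT OR REPLACE INTO revision_props VALUES (?, ?, ?)",
            (rev_id, key, value),
        )


def get_prop(rev_id, key):
    """Read a property, treating a missing row as 0."""
    row = conn.execute(
        "SELECT rp_value FROM revision_props WHERE rp_rev_id = ? AND rp_key = ?",
        (rev_id, key),
    ).fetchone()
    return row[0] if row else 0


set_prop(101, "mobile_edit", 1)  # stored as a row
set_prop(102, "mobile_edit", 0)  # not stored at all
print(get_prop(101, "mobile_edit"), get_prop(102, "mobile_edit"))
print(conn.execute("SELECT COUNT(*) FROM revision_props").fetchone()[0])
```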


> Right, which is why a revision tagging system exists in MediaWiki core
> currently. If someone wanted to, for example, modify the MobileFrontend
> extension to add a "mobile" tag to edits, it would be trivial to do. The
> tagging infrastructure is already in place.

It's unfortunate that RevTagging got mixed in this discussion, but I hope this clarifies the distinction between mobile's needs and RevTagging.

Currently, MW has a very limited ability to attach metadata to the revision table, in the form of new cols (existing cols are… limited[3]). The issue is that this data is prioritized for transactional use, not necessarily analytical use (in the wiki[4]: "is needed to operate the website and, in particular, to populate article revision histories").

In analytical systems, data is fed down a different pipeline in order to be "online" and have no impact on the web transactions. Naïvely, that's because analytical questions on transactional databases look like "COUNT * FROM sometable", which are full table scans (or thereabouts) and are expensive. Adding the metadata for analytical purposes based on the OLTP store would then be "COUNT * FROM sometable GROUP BY datafromothertable JOIN awholemessoftables", which is multiple full table scans; pretty soon that would require a dedicated offline read-only DB, and it would still be terribly slow.
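
To make the shape of such a query concrete, here is a small sketch against SQLite. The rev_meta table and its columns are invented for illustration, and the revision table is cut down to two columns. Even this tiny aggregate has to visit every revision row and match each one against the metadata table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for transactional tables; rev_meta is hypothetical.
conn.executescript(
    """
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER);
    CREATE TABLE rev_meta (rm_rev_id INTEGER, rm_channel TEXT);
    INSERT INTO revision VALUES (1, 10), (2, 10), (3, 11), (4, 12);
    INSERT INTO rev_meta VALUES (1, 'mobile'), (3, 'mobile'), (4, 'api');
    """
)

# Count edits per channel; revisions without metadata are grouped as 'unknown'.
rows = conn.execute(
    """
    SELECT COALESCE(m.rm_channel, 'unknown') AS channel, COUNT(*) AS edits
    FROM revision AS r
    LEFT JOIN rev_meta AS m ON m.rm_rev_id = r.rev_id
    GROUP BY channel
    ORDER BY channel
    """
).fetchall()
print(rows)
```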

So there is a need to attach metadata needed for analytics (which may or may not be the same metadata "needed to operate the website") at runtime so that it can be run down the analytical data pipeline without needing to hit the live OLTP store continually asking things like "give me the campaign that this revision occurred under?" especially when things like "campaign" probably have no importance at all to the website itself.

My thinking is that if we had a way of attaching arbitrary meta to revisions, then, in cases where the two needs coincide, all we have to do is expose that same meta to analytics through their pixel service (revtagging) and we're good to go. If revtagging isn't up, or hasn't recorded it, we could still go back to the transactional store offline and fill in the missing information.

> Going back to the broader point, I'm completely lost as to why the Analytics
> team can't handle a structured database.

I assume this last is a bit tongue-in-cheek, but I LOL'd… for completely different reasons.

[1]: http://www.mediawiki.org/wiki/Manual:Page_props_table
[2]: http://www.mediawiki.org/wiki/Manual:Recentchanges_table
[3]: http://www.mediawiki.org/wiki/Manual:Revision_table
[4]: http://www.mediawiki.org/wiki/Revtagging



terry chay 최태리
Director of Features Engineering
Wikimedia Foundation
“Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment.”

p: +1 (415) 839-6885 x6832
m: +1 (408) 480-8902
e: tchay@wikimedia.org
i: http://terrychay.com/
w: http://meta.wikimedia.org/wiki/User:Tychay
aim: terrychay
Re: editing channels - "How was this edit made?"
On Nov 15, 2012, at 2:51 PM, MZMcBride <z@mzmcbride.com> wrote:

> Right, which is why a revision tagging system exists in MediaWiki core
> currently. If someone wanted to, for example, modify the MobileFrontend
> extension to add a "mobile" tag to edits, it would be trivial to do. The
> tagging infrastructure is already in place.

I misread this; I didn't realize MZMcBride was talking about RecentChanges.

How unreasonable would it be to call ChangeTags::addTags('mobile', $rc_id); for mobile edits? At first blush, the only major consequence is for extracting the data, since it'd be buried in a ts_tags blob?
Re: editing channels - "How was this edit made?"
>
> I misread this, I didn't realize MZMcBride is talking about RecentChanges.
>
> How unreasonable would it be to call ChangeTag::AddTags('mobile', $rc_id); for mobile edits? On first blush, the only major consequences is extracting the data since it'd be buried in a ts_tags blob?

Why would you look at ts_tags? change_tag table is much easier to pull
out as it uses a more normalized layout.
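
A rough sketch of the difference, using SQLite and cut-down versions of the two tables (only the columns needed here are kept). It assumes ts_tags holds a comma-separated list of tag names; the 'mobile web' tag is invented to show where substring matching goes wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced versions of the two tables; the sample rows are made up.
conn.executescript(
    """
    CREATE TABLE change_tag (ct_rev_id INTEGER, ct_tag TEXT);
    CREATE TABLE tag_summary (ts_rev_id INTEGER, ts_tags TEXT);
    INSERT INTO change_tag VALUES
        (1, 'mobile'), (1, 'visualeditor'), (2, 'mobile web'), (3, 'mobile');
    INSERT INTO tag_summary VALUES
        (1, 'mobile,visualeditor'), (2, 'mobile web'), (3, 'mobile');
    """
)

# Normalized: an exact match on one column.
normalized = [r[0] for r in conn.execute(
    "SELECT ct_rev_id FROM change_tag WHERE ct_tag = ? ORDER BY ct_rev_id",
    ("mobile",),
)]

# Denormalized: the list has to be split in application code; a naive
# LIKE '%mobile%' would also match the unrelated tag 'mobile web'.
denormalized = sorted(
    rev_id
    for rev_id, tags in conn.execute("SELECT ts_rev_id, ts_tags FROM tag_summary")
    if "mobile" in tags.split(",")
)
print(normalized, denormalized)
```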

-bawolff

Re: editing channels - "How was this edit made?"
On Nov 16, 2012, at 3:25 PM, bawolff <bawolff+wn@gmail.com> wrote:

>>
>> I misread this, I didn't realize MZMcBride is talking about RecentChanges.
>>
>> How unreasonable would it be to call ChangeTag::AddTags('mobile', $rc_id); for mobile edits? On first blush, the only major consequences is extracting the data since it'd be buried in a ts_tags blob?
>
> Why would you look at ts_tags? change_tag table is much easier to pull
> out as it uses a more normalized layout.


True, sorry, I didn't look closely enough to realize that tag_summary is a denormalized change_tag.

However, this doesn't deal with the problem that [[Special:Tags]] will get cluttered with this approach. It might work for a "mobileedit" tag, but valid_tag cannot grow arbitrarily.


terry chay 최태리

Re: editing channels - "How was this edit made?"
>
>
>
> True, sorry, I didn't look closely enough to realize that tag_summary is
> denormalized change_tag.
>
> However, this doesn't deal with the problem that [[Special:Tags]] will get
> cluttered with this approach. It might work for a "mobileedit" tag, but
> valid_tag cannot grow arbitrarily.
>

True, but that's already a problem with the tag system. English Wikipedia's
Special:Tags is full of entries with the description "This tag is
inactive." In the long run we will probably have to find some way of
managing a very long list of tags.

Another problem, though, is that people might want to track which edits are
mobile (or whatever else) without wanting each one to be shown
in RC as a mobile edit (since it adds clutter). Perhaps using the
currently unused ct_params to make certain tags hidden (so that one can
filter by them, but they do not show up on the line in RC) would be a
solution.
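
A small sketch of how such hidden tags might behave. The {"hidden": True} parameter is a made-up stand-in for whatever would be stored in ct_params, and the Change class is purely illustrative:

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    title: str
    # tag name -> params; the "hidden" key is a hypothetical use of ct_params
    tags: dict = field(default_factory=dict)


def filter_by_tag(changes, tag):
    """Hidden tags still take part in filtering."""
    return [c for c in changes if tag in c.tags]


def visible_tags(change):
    """Only tags not marked hidden are shown on the RecentChanges line."""
    return sorted(t for t, params in change.tags.items() if not params.get("hidden"))


changes = [
    Change("Foo", {"mobile": {"hidden": True}, "possible vandalism": {}}),
    Change("Bar", {}),
]
print([c.title for c in filter_by_tag(changes, "mobile")])
print(visible_tags(changes[0]))
```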

-bawolff

Re: editing channels - "How was this edit made?"
You n*rds are 100 years behind Facebook, who already shows
Yesterday via email
About an hour ago via mobile
59 minutes ago near Tsoying, Kao-hsiung
24 minutes ago via POCO Beautycamera
Throw in the towel.

Re: editing channels - "How was this edit made?"
On Mon, Nov 19, 2012 at 7:43 PM, <jidanni@jidanni.org> wrote:
> You n*rds are 100 years behind Facebook, who already shows
> Yesterday via email
> About an hour ago via mobile
> 59 minutes ago near Tsoying, Kao-hsiung
> 24 minutes ago via POCO Beautycamera
> Throw in the towel.
>

Another excellent post by jidanni


--
John

Re: editing channels - "How was this edit made?"
On Mon, 19 Nov 2012 21:11:40 -0800, John Du Hart <compwhizii@gmail.com>
wrote:

> On Mon, Nov 19, 2012 at 7:43 PM, <jidanni@jidanni.org> wrote:
>> You n*rds are 100 years behind Facebook, who already shows
>> Yesterday via email
>> About an hour ago via mobile
>> 59 minutes ago near Tsoying, Kao-hsiung
>> 24 minutes ago via POCO Beautycamera
>> Throw in the towel.
>>
>
> Another excellent post by jidanni
>

Sooo... when do we set him as 'moderated'?

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]


Re: editing channels - "How was this edit made?"
On Tue, Nov 20, 2012 at 4:00 PM, Daniel Friesen
<daniel@nadir-seen-fire.com> wrote:
> Sooo... when do we set him as 'moderated'?

I've already notified one of the list moderators. We don't need to
discuss this or bring any more attention to it.
