Mailing List Archive

Re: [Wikimedia-l] Quality issues
Hi Markus,


On 1 December 2015 at 23:43, Markus Krötzsch
<markus at semantic-mediawiki.org> wrote:

> [I continue cross-posting for this reply, but it would make sense to
> return the thread to the Wikidata list where it started, so as to avoid
> partial discussions happening in many places.]


Apologies for the late reply.

While you indicated that you had crossposted this reply to
Wikimedia-l, it didn't turn up in my inbox. I only saw it today, after
Atlasowa pointed it out on the Signpost op-ed's talk page.[1]


> On 27.11.2015 12:08, Andreas Kolbe wrote:

> > Wikipedia content is considered a reliable source in Wikidata, and
> > Wikidata content is used as a reliable source by Google, where it
> > appears without any indication of its provenance.

> This prompted me to reply. I wanted to write an email that merely says:
> "Really? Where did you get this from?" (Google using Wikidata content)

Multiple sources, including what appears to be your own research
group's writing:[2]

---o0o---

In December 2013, Google announced that their own collaboratively
edited knowledge base, Freebase, is to be discontinued in favour of
Wikidata, which gives Wikidata a prominent role as an in[p]ut for
Google Knowledge Graph. The research group Knowledge Systems
<https://ddll.inf.tu-dresden.de/web/Knowledge_Systems/en> is working
in close cooperation with the development team behind Wikidata, and
provides, e.g., the regular Wikidata RDF-Exports.

---o0o---


> But then I read the rest ... so here you go ...


> Your email mixes up many things and effects, some of which are important
> issues (e.g., the fact that VIAF is not a primary data source that
> should be used in citations). Many other of your remarks I find very
> hard to take serious, including but not limited to the following:

> * A rather bizarre connection between licensing models and
> accountability (as if it would make content more credible if you are
> legally required to say that you found it on Wikipedia, or even give a
> list of user names and IPs who contributed)


Both Freebase and Wikipedia have attribution licences. When Bing's
Snapshot displays information drawn from Freebase or Wikipedia, it's
indicated thus at the bottom of the infobox[3]:

---o0o---

Data from Freebase · Wikipedia

---o0o---

I take this as a token gesture to these sources' attribution licences.

Given the amount of space they have available, I would think most
people would agree that this form of attribution is sufficient. You
couldn't possibly expect them to list all contributors who have ever
contributed to the lead of the Wikipedia article, for example, as the
letter of the licence might require.

However, I think it's proper and important that those minimal
attributions are there. And given Wikidata's CC0 licence, I don't
expect re-users to continue attributing in this manner. This view is
shared by Max Klein for example, who is quoted to that effect in the
Signpost op-ed.[4]


> * Some stories that I think you really just made up for the sake of
> argument (Denny alone has picked the Wikidata license?


Denny led the development team. There are multiple public instances
and accounts of his having advocated this choice and convinced people
of the wisdom of it, in Wikidata talk pages and elsewhere, including a
recent post on the Wikidata mailing list.[5]

Interestingly, he originally said that this would mean there could be
no imports from Wikipedia, and that there was in fact no intention to
import data from Wikipedias (see op-ed).[6] He also said, higher up on
that page, that this was "for starters", and that that decision could
easily be changed later on by the community.[7]


> Google displays Wikidata content?


See above. If Wikidata plays "a prominent role as an in[p]ut for
Google Knowledge Graph" then I would expect there to be
correspondences between Knowledge Graph and Wikidata content.


> Bing is fuelled by Wikimedia?)


I spoke of "Wikimedia-fuelled search engines like Google and Bing" in
the context of the Google Knowledge Graph and Bing's Snapshot/Satori
equivalent.

We all know that in both cases, much of the content Google and Bing
display in these infoboxes comes from Wikimedia projects (Wikipedia,
Commons and now, apparently, Wikidata).

> * Some disjointed remarks about the history of capitalism
> * The assertion that content is worse just because the author who
> created it used a bot for editing


I spoke of "bot users mass-importing unreliable data". It's not the
bot method that makes the data unreliable: they are unreliable to
begin with (because they are unsourced, nobody verifies the source,
etc.).

As I pointed out in this week's op-ed, of the top fifteen hoaxes in
the English Wikipedia, six have active Wikidata items (or rather, had:
they were deleted this morning, after the op-ed appeared).

This is what I mean by unreliable data.


> * The idea that engineers want to build systems with bad data because
> they like the challenge of cleaning it up -- I mean: really! There is
> nothing one can even say to this.


Again, this is not quite what I was trying to convey. My impression is
that the current community effort at Wikidata emphasises speed: hence
the mass imports of data from Wikipedia, whether verifiable or not,
contrary to original intentions, as represented by Denny's quote
above.

As far as I can make out, present-day thinking among many Wikidatans
is: let's get lots of data in fast even though we know some of it will
be bad. Afterwards, we can then apply clever methods to check for
inconsistencies and clean our data up -- which is a challenge people
do seem to warm to. Meanwhile, others throw up their arms in dismay
and say, "Stop! You're importing bad data."

Wouldn't you agree that this characterises some of the recent
discussions on the Wikidata Project Chat page?

The two camps seem approximately evenly represented in the discussions
I've seen. But while the one camp says "Stop!", the other camp
continues importing. So in practice, the importers are getting their
way.


> * The complaint that Wikimedia employs too much engineering expertise
> and too little content expertise (when, in reality, it is a key
> principle of Wikimedia to keep out of content, and communities regularly
> complain WMF would still meddle too much).


Is it not obvious that I was talking about community practices rather
than the actions of Wikimedia staff?


> * All those convincing arguments you make against open, anonymous
> editing because of it being easy to manipulate (I've heard this from
> Wikipedia critics ten years ago; wonder what became of them)


Such criticisms are still regularly levelled at Wikipedia, in
top-quality publications. If you really want, I can send you a
literature list, but you could begin with this article in Newsweek.[8]


> * And, finally, the culminating conspiracy theory of total control over
> political opinion, destroying all plurality by allowing only one
> viewpoint (not exactly what I observe on the Web ...) -- and topping
> this by blaming it all on the choice of a particular Creative Commons
> license for Wikidata! Really, you can't make this up.


The information provided by default to billions of search engine users
*matters*. You can never prevent an individual from going to a website
that espouses a different view, but you don't have to for that
information to have a measurable effect.

Robert Epstein and Ronald E. Robertson recently published a paper on
what they called "The search engine manipulation effect (SEME) and its
possible impact on the outcomes of elections".[9] It provides further
detail.


> Summing up: either this is an elaborate satire that tries to test how
> serious an answer you will get on a Wikimedia list, or you should
> *seriously* rethink what you wrote here, take back the things that are
> obviously bogus, and have a down-to-earth discussion about the topics
> you really care about (licenses and cyclic sourcing on Wikimedia
> projects, I guess; "capitalist companies controlling public media"
> should be discussed in another forum).


No satire was intended. I hope I have succeeded in making my points clearer.


Regards,

Andreas

[1] https://en.wikipedia.org/wiki/Wikipedia_talk:Wikipedia_Signpost/2015-12-02/Op-ed
[2] https://ddll.inf.tu-dresden.de/web/Wikidata/en
[3] http://www.bing.com/search?q=jerusalem&go=Submit&qs=n&form=QBLH&pq=jerusalem&sc=9-9&sp=-1&sk=&cvid=62C12B6CC7B94CD1A9081E17AC205270
[4] https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-02/Op-ed
[5] https://lists.wikimedia.org/pipermail/wikidata/2015-December/007769.html
[6] https://archive.is/ZbV5A#selection-2997.0-3009.26
[7] https://archive.is/ZbV5A#selection-2755.308-2763.27
[8] http://www.newsweek.com/2015/04/03/manipulating-wikipedia-promote-bogus-business-school-316133.html
[9] http://www.pnas.org/content/112/33/E4512.abstract
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Re: [Wikimedia-l] Quality issues
Such issues are always going to crop up when you're attempting to describe
the world using Aristotelian propositions. In a source like Wikipedia, we
can provide some nuance, explain both sides of the issue, the history of
both claims, and let the reader decide. In a database, we are limited to
saying that Jerusalem either is or is not the capital of Israel.

To be fair, this is not a weakness that is implementation-specific to
Wikidata; it is always going to happen when you try to describe the world
in this way. It's not something that can be fixed by adding sources, or
by bolting fancy new technical gadgets onto the side of the database.

Cheers,
Craig

On 8 December 2015 at 06:58, Andrea Zanni <zanni.andrea84@gmail.com> wrote:

> On Mon, Dec 7, 2015 at 9:53 PM, Andreas Kolbe <jayen466@gmail.com> wrote:
>
> > Hi Yaroslav,
> >
> > Thanks for the background. The "POV pushing" you describe is of course
> > what Graham and Ford are examining in their paper.
> >
> > For what it's worth, the Wikidata item for Jerusalem[1] still contains
> > the statement "capital of Israel" today.
> >
>
>
> Really, I do not understand the difference between this kind of problem and
> Wikipedia's edit wars or conflicts.
> Wikidata represents knowledge in a structured, collaborative way: both
> features define it, and it seems the op-ed just doesn't like them (either
> one or both).
>
> Aubrey
Re: [Wikimedia-l] Quality issues
Craig is right: this can't be fixed within the database, because the
database is applying one truth where there is no one truth for everyone.
This will always be the single biggest flaw of Wikidata: no matter how the
data is presented, it can never be the absolute truth unless it's
measurable through some mathematical or scientific process that can be
replicated by everyone and translated into any language.

Wikipedia's answer is to present all considerations in an equal manner and
not interpret the facts....

Wikidata defines what is fact, what is truth, what is right. That's a big
task, and it is something the community has never tackled before... Should
we even try? Has the damage already been done, or should we narrow the
range of recorded data? Could we flag alternatives? Could we give a measure
of acceptance for each fact? Are there alternative means?

Quality itself has many different measures and many different ways of being
measured, all of which are the truth for the question being asked...

Are we even asking the questions we need to, in the way we need to?



On 8 December 2015 at 07:52, Craig Franklin <cfranklin@halonetwork.net>
wrote:

> Such issues are always going to crop up when you're attempting to describe
> the world using Aristotelian propositions. In a source like Wikipedia, we
> can provide some nuance, explain both sides of the issue, the history of
> both claims, and let the reader decide. In a database, we are limited to
> saying that Jerusalem either is or is not the capital of Israel.
>
> To be fair, this is not a weakness that is implementation-specific to
> Wikidata; it is always going to happen when you try to describe the world
> in this way. It's not something that can be fixed by adding sources, or
> by bolting fancy new technical gadgets onto the side of the database.
>
> Cheers,
> Craig



--
GN.
President Wikimedia Australia
WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
Photo Gallery: http://gnangarra.redbubble.com
Re: [Wikimedia-l] Quality issues
On Tue, Dec 8, 2015 at 1:15 AM, Gnangarra <gnangarra@gmail.com> wrote:
> Craig is right: this can't be fixed within the database, because the
> database is applying one truth where there is no one truth for everyone.
> This will always be the single biggest flaw of Wikidata: no matter how the
> data is presented, it can never be the absolute truth unless it's
> measurable through some mathematical or scientific process that can be
> replicated by everyone and translated into any language.
>
> Wikipedia's answer is to present all considerations in an equal manner and
> not interpret the facts....
>
> Wikidata defines what is fact, what is truth, what is right. That's a big
> task, and it is something the community has never tackled before... Should
> we even try? Has the damage already been done, or should we narrow the
> range of recorded data? Could we flag alternatives? Could we give a measure
> of acceptance for each fact? Are there alternative means?

That is actually not correct. We have built Wikidata from the very
beginning with some core beliefs. One of them is that Wikidata isn't
supposed to have the one truth but instead is able to represent
various different points of view and link to sources claiming these.
Look for example at the country statements for Jerusalem:
https://www.wikidata.org/wiki/Q1218
Now I am the first to say that this will not be able to capture the
full complexity of the world around us. But that's not what it is
meant to do. However, please be aware that we have built more than just
a dumb database with Wikidata and have gone to great lengths to make it
possible to capture knowledge diversity.
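
As a rough illustration of the idea of holding several points of view side
by side, here is a minimal Python sketch. It is not Wikidata's actual data
model or API: the class names, the helper methods and the placeholder
values are all assumptions made for this example. Only the item ID Q1218
and the property IDs P17 (country) and P31 (instance of) are real
identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One claim about an item, with the sources that make the claim."""
    prop: str                      # property ID, e.g. "P17" (country)
    value: str                     # placeholder value, not real data
    references: List[str] = field(default_factory=list)


@dataclass
class Item:
    """An item holding any number of statements per property."""
    qid: str
    label: str
    statements: List[Statement] = field(default_factory=list)

    def values_for(self, prop: str) -> List[str]:
        # All recorded values for a property, in insertion order.
        return [s.value for s in self.statements if s.prop == prop]

    def unreferenced(self) -> List[Statement]:
        # Statements that have no source attached.
        return [s for s in self.statements if not s.references]


# Hypothetical example: two competing values for the same property.
jerusalem = Item("Q1218", "Jerusalem", [
    Statement("P17", "value claimed by source A", ["source A"]),
    Statement("P17", "value claimed by source B", ["source B"]),
    Statement("P31", "city"),      # deliberately left without a reference
])

print(jerusalem.values_for("P17"))
print(len(jerusalem.unreferenced()))
```

The point of the sketch is only that a property can carry more than one
value, each tied to whoever claims it, and that unsourced statements can
be listed separately.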


Cheers
Lydia

--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
(Society for the Promotion of Free Knowledge)

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg (district court) under number 23855 Nz. Recognised
as a charitable organisation by the Finanzamt für Körperschaften I Berlin
(tax office for corporations), tax number 27/681/51985.

Re: [Wikimedia-l] Quality issues
Amen to that! This discussion about Jerusalem reminds me of the discussion
we had about the nationality of Anne Frank. For those interested, there
have been some heated debates about whether Mobile should use the text in
Wikidata "label descriptions" or rather some basic presentation of the P31
property. Most descriptions are still blank anyway. Personally I think
texts such as "capital of Israel" or "holocaust victim" are both better
than blank, but many disagree with me.

Both of these represent associated items that have a lot of eyes on them,
but what about our more obscure items? Lots of these may be improved by the
people who originally created a Wikipedia page for them. As a Wikipedia
editor who has created over 2000 Wikipedia pages, I feel somewhat dismayed
at the idea that I need to walk through this long list and add statements
to their Wikidata items as the responsible party who introduced them to the
Wikiverse in the first place. But if I had a gadget that would tell me
which of my created Wikipedia articles had 0-3 statements, I would probably
update those.
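
The core of such a gadget could be as small as the Python sketch below. It
assumes that the statement count for each item has already been fetched
somehow; the function name, the input mapping and the item IDs are all
made up for illustration, and nothing here calls a real Wikidata service.

```python
from typing import Dict, List


def items_needing_statements(statement_counts: Dict[str, int],
                             max_statements: int = 3) -> List[str]:
    """Return item IDs with at most `max_statements` statements, sparsest first."""
    sparse = [(count, qid) for qid, count in statement_counts.items()
              if count <= max_statements]
    # Sorting the (count, id) pairs puts the emptiest items at the top.
    return [qid for count, qid in sorted(sparse)]


# Hypothetical input: item ID -> number of statements (made-up values).
my_items = {"item-a": 0, "item-b": 12, "item-c": 3, "item-d": 4, "item-e": 1}
print(items_needing_statements(my_items))
```

Sorting by count first means the emptiest items are at the top of the
to-do list.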

On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher <
lydia.pintscher@wikimedia.de> wrote:

> On Tue, Dec 8, 2015 at 1:15 AM, Gnangarra <gnangarra@gmail.com> wrote:
> > Criag is right this cant be fixed within the database because the data
> base
> > is applying one truth where there is no one truth for everyone. This will
> > always be the single biggest flaw of Wikidata no matter how data is
> > presented it can never be the absolute truth unless its measurable
> through
> > some mathematical scientific process that can replicated by everyone,
> > translated into any language.
> >
> > Wikipedia's answer is to present all considerations in an equal manor and
> > not interpret the facts....
> >
> > Wikidata defines what is fact, what is truth, what is right thats a big
> > task and is something the community has never tackled before... should we
> > even try, has the damage already been done or should we narrow the range
> of
> > recorded data, could we flag alternatives, could we give a measure of
> > acceptance for each fact. are there alternative means....
>
> That is actually not correct. We have built Wikidata from the very
> beginning with some core beliefs. One of them is that Wikidata isn't
> supposed to have the one truth but instead is able to represent
> various different points of view and link to sources claiming these.
> Look for example at the country statements for Jerusalem:
> https://www.wikidata.org/wiki/Q1218
> Now I am the first to say that this will not be able to capture the
> full complexity of the world around us. But that's not what it is
> meant to do. However, please be aware that we have built more than just
> a dumb database with Wikidata and have gone to great lengths to make it
> possible to capture knowledge diversity.
>
>
> Cheers
> Lydia
Re: [Wikimedia-l] Quality issues
I agree that getting bogged down on one item of data isn't helpful, but the
data does need to show that it's disputed, and the data item on Israel
<https://www.wikidata.org/wiki/Q801> should at least have Tel Aviv listed
as its metonym.


> within the database because the database
> is applying one truth where there is no one truth for everyone. This will
> always be the single biggest flaw of Wikidata no matter how data is
> presented it can never be the absolute truth

The Jerusalem/Israel example, where the data doesn't indicate that it's
disputed, means that it will be propagated as an absolute truth...


Then again, this is shifting away from the original concern over quality:
that the ability to verify the information isn't clear, combined with the
CC0 license and the already established practice on other sources. With
Wikidata being easily manipulated for falsehoods, it's going to have an
impact.

On 10 December 2015 at 16:44, Jane Darnell <jane023@gmail.com> wrote:

> Amen to that! This discussion about Jerusalem reminds me of the discussion
> we had about the nationality of Anne Frank. For those interested, there
> have been some heated debates about whether Mobile should use the text in
> Wikidata "label descriptions" or rather some basic presentation of the P31
> property. Most descriptions are still blank anyway. Personally I think
> texts such as "capital of Israel" or "holocaust victim" are both better
> than blank, but many disagree with me.
>
> Both of these represent associated items that have a lot of eyes on them,
> but what about our more obscure items? Lots of these may be improved by the
> people who originally created a Wikipedia page for them. As a Wikipedia
> editor who has created over 2000 Wikipedia pages, I feel somewhat dismayed
> at the idea that I need to walk through this long list and add statements
> to their Wikidata items as the responsible party who introduced them to the
> Wikiverse in the first place. But if I had a gadget that would tell me
> which of my created Wikipedia articles had 0-3 statements, I would probably
> update those.



--
GN.
President Wikimedia Australia
WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
Photo Gallery: http://gnangarra.redbubble.com
Re: [Wikimedia-l] Quality issues
Just as this discussion shifts, so does Wikidata quality. Both, hopefully,
in a more constructive direction, which was Lydia's original point.

On Thu, Dec 10, 2015 at 10:14 AM, Gnangarra <gnangarra@gmail.com> wrote:

> I agree that getting bogged down on one item of data isn't helpful, but the
> data does need to show that it's disputed, and the data item on Israel
> <https://www.wikidata.org/wiki/Q801> should at least have Tel Aviv listed
> as its metonym.
>
>
> > within the database because the database
> > is applying one truth where there is no one truth for everyone. This will
> > always be the single biggest flaw of Wikidata no matter how data is
> > presented it can never be the absolute truth
>
> The Jerusalem/Israel example, where the data doesn't indicate that it's
> disputed, means that it will be propagated as an absolute truth...
>
>
> Then again, this is shifting away from the original concern over quality:
> that the ability to verify the information isn't clear, combined with the
> CC0 license and the already established practice on other sources. With
> Wikidata being easily manipulated for falsehoods, it's going to have an
> impact.
>
> On 10 December 2015 at 16:44, Jane Darnell <jane023@gmail.com> wrote:
>
> > Amen to that! This discussion about Jerusalem reminds me of the
> discussion
> > we had about the nationality of Anne Frank. For those interested, there
> > have been some heated debates about whether Mobile should use the text in
> > Wikidata "label descriptions" or rather some basic presentation of the
> P31
> > property. Most descriptions are still blank anyway. Personally I think
> > texts such as "capital of Israel" or "holocaust victim" are both better
> > than blank, but many disagree with me.
> >
> > Both of these represent associated items that have a lot of eyes on them,
> > but what about our more obscure items? Lots of these may be improved by
> the
> > people who originally created a Wikipedia page for them. As a Wikipedia
> > editor who has created over 2000 Wikipedia pages, I feel somewhat
> dismayed
> > at the idea that I need to walk through this long list and add statements
> > to their Wikidata items as the responsible party who introduced them to
> the
> > Wikiverse in the first place. But if I had a gadget that would tell me
> > which of my created Wikipedia articles had 0-3 statements, I would
> probably
> > update those.
> >
> > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher <
> > lydia.pintscher@wikimedia.de> wrote:
> >
> > > On Tue, Dec 8, 2015 at 1:15 AM, Gnangarra <gnangarra@gmail.com> wrote:
> > > > Criag is right this cant be fixed within the database because the
> data
> > > base
> > > > is applying one truth where there is no one truth for everyone. This
> > will
> > > > always be the single biggest flaw of Wikidata no matter how data is
> > > > presented it can never be the absolute truth unless its measurable
> > > through
> > > > some mathematical scientific process that can replicated by everyone,
> > > > translated into any language.
> > > >
> > > > Wikipedia's answer is to present all considerations in an equal manor
> > and
> > > > not interpret the facts....
> > > >
> > > > Wikidata defines what is fact, what is truth, what is right thats a
> big
> > > > task and is something the community has never tackled before...
> should
> > we
> > > > even try, has the damage already been done or should we narrow the
> > range
> > > of
> > > > recorded data, could we flag alternatives, could we give a measure of
> > > > acceptance for each fact. are there alternative means....
> > >
> > > That is actually not correct. We have built Wikidata from the very
> > > beginning with some core believes. One of them is that Wikidata isn't
> > > supposed to have the one truth but instead is able to represent
> > > various different points of view and link to sources claiming these.
> > > Look for example at the country statements for Jerusalem:
> > > https://www.wikidata.org/wiki/Q1218
> > > Now I am the first to say that this will not be able to capture the
> > > full complexity of the world around us. But that's not what it is
> > > meant to do. However please be aware that we have built more than just
> > > a dumb database with Wikidata and have gone to great length to make it
> > > possible to capture knowledge diversity.
> > >
> > >
> > > Cheers
> > > Lydia
> > >
> > > --
> > > Lydia Pintscher - http://about.me/lydia.pintscher
> > > Product Manager for Wikidata
> > >
> > > Wikimedia Deutschland e.V.
> > > Tempelhofer Ufer 23-24
> > > 10963 Berlin
> > > www.wikimedia.de
> > >
> > > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> > >
> > > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> > >
> >
>
>
>
> --
> GN.
> President Wikimedia Australia
> WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
> Photo Gallery: http://gnangarra.redbubble.com
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Re: [Wikimedia-l] Quality issues [ In reply to ]
Hoi,
The other side of being easily manipulated is that it is easy to rectify.
The Signpost is FUD in so many ways and incorrect as well. Yes, you may
have a concern about falsehoods. However, this is not going to be helped
much by insisting that everything is to be sourced. It is also not the only
way to consider quality and arguably it is the least helpful way of
improving the quality at Wikidata.

Typically what has been established on other sources is acceptable as valid
for now. When we compare and find differences, it is of relevance to find
sources and even document the differences. When something is a falsehood we
should flag it as such. Sources can be wrong or considered to be wrong.

The case for the CC-0 license is so in line with what the WMF stands for.
Our aim is to share in the sum of all knowledge and it is the most obvious
way to do it. When Wikidata is found to document falsehoods or established
truths that are problematic, we gain a quality where people come to
Wikidata to learn what they need to learn.

So where some see a problem, there is opportunity.
Thanks,
GerardM

On 10 December 2015 at 10:14, Gnangarra <gnangarra@gmail.com> wrote:

> I agree getting bogged down on one item of data isnt helpful but the data
> does need to show its disputed and the data item on Israel
> <https://www.wikidata.org/wiki/Q801> should at least have Tel Aviv listed
> as its mentonym
>
>
> within the database because the data base
> > is applying one truth where there is no one truth for everyone. This will
> > always be the single biggest flaw of Wikidata no matter how data is
> > presented it can never be the absolute truth
>
> The Jerusalem/Israel example where the data doesnt indicate its disputed
> means that it will propagated as an absolute truth...
>
>
> Then again this is shifting away from the original concern over quality
> that the ability to verify the information isnt clear combined with the
> CC0 license the already established practice on other sources. Wikidata for
> falsehoods being easily manipulated its going to have a impact.
>
> On 10 December 2015 at 16:44, Jane Darnell <jane023@gmail.com> wrote:
>
> > Amen to that! This discussion about Jerusalem reminds me of the
> discussion
> > we had about the nationality of Anne Frank. For those interested, there
> > have been some heated debates about whether Mobile should use the text in
> > Wikidata "label descriptions" or rather some basic presentation of the
> P31
> > property. Most descriptions are still blank anyway. Personally I think
> > texts such as "capital of Israel" or "holocaust victim" are both better
> > than blank, but many disagree with me.
> >
> > Both of these represent associated items that have a lot of eyes on them,
> > but what about our more obscure items? Lots of these may be improved by
> the
> > people who originally created a Wikipedia page for them. As a Wikipedia
> > editor who has created over 2000 Wikipedia pages, I feel somewhat
> dismayed
> > at the idea that I need to walk through this long list and add statements
> > to their Wikidata items as the responsible party who introduced them to
> the
> > Wikiverse in the first place. But if I had a gadget that would tell me
> > which of my created Wikipedia articles had 0-3 statements, I would
> probably
> > update those.
> >
> > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher <
> > lydia.pintscher@wikimedia.de> wrote:
> >
> > > On Tue, Dec 8, 2015 at 1:15 AM, Gnangarra <gnangarra@gmail.com> wrote:
> > > > Criag is right this cant be fixed within the database because the
> data
> > > base
> > > > is applying one truth where there is no one truth for everyone. This
> > will
> > > > always be the single biggest flaw of Wikidata no matter how data is
> > > > presented it can never be the absolute truth unless its measurable
> > > through
> > > > some mathematical scientific process that can replicated by everyone,
> > > > translated into any language.
> > > >
> > > > Wikipedia's answer is to present all considerations in an equal manor
> > and
> > > > not interpret the facts....
> > > >
> > > > Wikidata defines what is fact, what is truth, what is right thats a
> big
> > > > task and is something the community has never tackled before...
> should
> > we
> > > > even try, has the damage already been done or should we narrow the
> > range
> > > of
> > > > recorded data, could we flag alternatives, could we give a measure of
> > > > acceptance for each fact. are there alternative means....
> > >
> > > That is actually not correct. We have built Wikidata from the very
> > > beginning with some core believes. One of them is that Wikidata isn't
> > > supposed to have the one truth but instead is able to represent
> > > various different points of view and link to sources claiming these.
> > > Look for example at the country statements for Jerusalem:
> > > https://www.wikidata.org/wiki/Q1218
> > > Now I am the first to say that this will not be able to capture the
> > > full complexity of the world around us. But that's not what it is
> > > meant to do. However please be aware that we have built more than just
> > > a dumb database with Wikidata and have gone to great length to make it
> > > possible to capture knowledge diversity.
> > >
> > >
> > > Cheers
> > > Lydia
> > >
> > > --
> > > Lydia Pintscher - http://about.me/lydia.pintscher
> > > Product Manager for Wikidata
> > >
> > > Wikimedia Deutschland e.V.
> > > Tempelhofer Ufer 23-24
> > > 10963 Berlin
> > > www.wikimedia.de
> > >
> > > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> > >
> > > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> > >
> >
>
>
>
> --
> GN.
> President Wikimedia Australia
> WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
> Photo Gallery: http://gnangarra.redbubble.com
>
Re: [Wikimedia-l] Quality issues [ In reply to ]
Hoi,
The other side of the coin of being easily manipulated is that it is easy
to rectify. The Signpost is FUD in so many ways and incorrect as well. Yes,
you may have a concern about falsehoods. However, this is not going to be
helped much by insisting that everything is to be sourced. It is also not
the only way to consider quality and arguably it is the least helpful way
of improving the quality at Wikidata.

Typically what has been established on other sources is acceptable as valid
for now. When we compare and find differences, it is of relevance to find
sources and even document the differences. When something is a falsehood we
should flag it as such. Sources can be wrong or considered to be wrong. The
point however is that by concentrating on differences first we make the
most effective use of people who like these kinds of puzzles.

The case for the CC-0 license is so in line with what the WMF stands for.
Our aim is to share in the sum of all knowledge and it is the most obvious
way to do it. When Wikidata is found to document falsehoods or established
truths that are problematic, we gain a quality where people come to
Wikidata to learn what they need to learn.

When you say it has an impact, OK. Let it have an impact, but let's consider
the arguments, which is exactly what the author of this article did not do.
It is the one reason why what he wrote is FUD. So do consider quality and
recognise that we have made enormous strides forward. When this recognition
sinks in, when people understand how quality actually works, the kind of
quality that makes a difference improving Wikidata, we can easily go on
doing what we do. We may be bold and should be bold, we may make mistakes
and we do learn as we go along.
Thanks,
GerardM

On 10 December 2015 at 10:14, Gnangarra <gnangarra@gmail.com> wrote:

> I agree getting bogged down on one item of data isnt helpful but the data
> does need to show its disputed and the data item on Israel
> <https://www.wikidata.org/wiki/Q801> should at least have Tel Aviv listed
> as its mentonym
>
>
> within the database because the data base
> > is applying one truth where there is no one truth for everyone. This will
> > always be the single biggest flaw of Wikidata no matter how data is
> > presented it can never be the absolute truth
>
> The Jerusalem/Israel example where the data doesnt indicate its disputed
> means that it will propagated as an absolute truth...
>
>
> Then again this is shifting away from the original concern over quality
> that the ability to verify the information isnt clear combined with the
> CC0 license the already established practice on other sources. Wikidata for
> falsehoods being easily manipulated its going to have a impact.
>
> On 10 December 2015 at 16:44, Jane Darnell <jane023@gmail.com> wrote:
>
> > Amen to that! This discussion about Jerusalem reminds me of the
> discussion
> > we had about the nationality of Anne Frank. For those interested, there
> > have been some heated debates about whether Mobile should use the text in
> > Wikidata "label descriptions" or rather some basic presentation of the
> P31
> > property. Most descriptions are still blank anyway. Personally I think
> > texts such as "capital of Israel" or "holocaust victim" are both better
> > than blank, but many disagree with me.
> >
> > Both of these represent associated items that have a lot of eyes on them,
> > but what about our more obscure items? Lots of these may be improved by
> the
> > people who originally created a Wikipedia page for them. As a Wikipedia
> > editor who has created over 2000 Wikipedia pages, I feel somewhat
> dismayed
> > at the idea that I need to walk through this long list and add statements
> > to their Wikidata items as the responsible party who introduced them to
> the
> > Wikiverse in the first place. But if I had a gadget that would tell me
> > which of my created Wikipedia articles had 0-3 statements, I would
> probably
> > update those.
> >
> > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher <
> > lydia.pintscher@wikimedia.de> wrote:
> >
> > > On Tue, Dec 8, 2015 at 1:15 AM, Gnangarra <gnangarra@gmail.com> wrote:
> > > > Criag is right this cant be fixed within the database because the
> data
> > > base
> > > > is applying one truth where there is no one truth for everyone. This
> > will
> > > > always be the single biggest flaw of Wikidata no matter how data is
> > > > presented it can never be the absolute truth unless its measurable
> > > through
> > > > some mathematical scientific process that can replicated by everyone,
> > > > translated into any language.
> > > >
> > > > Wikipedia's answer is to present all considerations in an equal manor
> > and
> > > > not interpret the facts....
> > > >
> > > > Wikidata defines what is fact, what is truth, what is right thats a
> big
> > > > task and is something the community has never tackled before...
> should
> > we
> > > > even try, has the damage already been done or should we narrow the
> > range
> > > of
> > > > recorded data, could we flag alternatives, could we give a measure of
> > > > acceptance for each fact. are there alternative means....
> > >
> > > That is actually not correct. We have built Wikidata from the very
> > > beginning with some core believes. One of them is that Wikidata isn't
> > > supposed to have the one truth but instead is able to represent
> > > various different points of view and link to sources claiming these.
> > > Look for example at the country statements for Jerusalem:
> > > https://www.wikidata.org/wiki/Q1218
> > > Now I am the first to say that this will not be able to capture the
> > > full complexity of the world around us. But that's not what it is
> > > meant to do. However please be aware that we have built more than just
> > > a dumb database with Wikidata and have gone to great length to make it
> > > possible to capture knowledge diversity.
> > >
> > >
> > > Cheers
> > > Lydia
> > >
> > > --
> > > Lydia Pintscher - http://about.me/lydia.pintscher
> > > Product Manager for Wikidata
> > >
> > > Wikimedia Deutschland e.V.
> > > Tempelhofer Ufer 23-24
> > > 10963 Berlin
> > > www.wikimedia.de
> > >
> > > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> > >
> > > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> > >
> >
>
>
>
> --
> GN.
> President Wikimedia Australia
> WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
> Photo Gallery: http://gnangarra.redbubble.com
>
Re: [Wikimedia-l] Quality issues [ In reply to ]
On Thu, Dec 10, 2015 at 10:27 AM, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:

> The case for the CC-0 license is so in line with what the WMF stands for.
> Our aim is to share in the sum of all knowledge and it is the most obvious
> way to do it. When Wikidata is found to document falsehoods or established
> truths that are problematic, we gain a quality where people come to
> Wikidata to learn what they need to learn.
>


According to Denny, Wikidata, under its CC0 licence, must not import data
from Share-Alike sources. He reconfirmed this yesterday when I asked him
whether he still stood by that.

In practice though we have Wikidata importing massive amounts of data from
Wikipedia, which was a Share-Alike source last time I looked. Isn't
Wikidata then infringing Wikipedia contributors' rights?

Why is it okay to import data from the CC BY-SA Wikipedia, but not from
European CC BY-SA population statistics?

There are inchoate and uncomfortable parallels to licence laundering here,
which I would hope is not something the WMF stands for. Could someone
please explain?
Re: [Wikimedia-l] Quality issues [ In reply to ]
Hoi,
What other people say is their choice. The law is simple. Facts cannot be
copyrighted and consequently the preference / the opinion of Denny is
simply that.

Typically statistics organisations are more than happy to share their data.
They do so in the Netherlands and it is only for a lack of organisation on
our end that it has not happened yet.

When I copy data from Wikipedia, it is unstructured in every sense. As a
follow-up I often spend time improving upon it further.

I do not care for your opinion. So far I have only seen your FUD. You
present the preferences of people like Denny as grounds for compliance; they
are not, and there is not much positive in what I have seen from you so far.

What is your contribution, what is it that you hope to achieve?

You point to statistics organisations as if they are the ones not
interested in collaboration. They are ever so happy to collaborate
and we are happy to acknowledge them as the source of information when
they do. By seeking collaboration, by seeking to bring data together and
achieve more, we are able to make a difference. This is not done by
publicly claiming, as you do, that you are not involved and do not want to
know. It is done by being involved, knowing what quality means and how we
can achieve it, and walking the walk and talking the talk.
Thanks,
GerardM

On 10 December 2015 at 13:17, Andreas Kolbe <jayen466@gmail.com> wrote:

> On Thu, Dec 10, 2015 at 10:27 AM, Gerard Meijssen <
> gerard.meijssen@gmail.com
> > wrote:
>
> > The case for the CC-0 license is so in line with what the WMF stands for.
> > Our aim is to share in the sum of all knowledge and it is the most
> obvious
> > way to do it. When Wikidata is found to document falsehoods or
> established
> > truths that are problematic, we gain a quality where people come to
> > Wikidata to learn what they need to learn.
> >
>
>
> According to Denny, Wikidata, under its CC0 licence, must not import data
> from Share-Alike sources. He reconfirmed this yesterday when I asked him
> whether he still stood by that.
>
> In practice though we have Wikidata importing massive amounts of data from
> Wikipedia, which was a Share-Alike source last time I looked. Isn't
> Wikidata then infringing Wikipedia contributors' rights?
>
> Why is it okay to import data from the CC BY-SA Wikipedia, but not from
> European CC BY-SA population statistics?
>
> There are inchoate and uncomfortable parallels to licence laundering here,
> which I would hope is not something the WMF stands for. Could someone
> please explain?
>
Re: [Wikimedia-l] Quality issues [ In reply to ]
On Thu, Dec 10, 2015 at 4:18 AM Andreas Kolbe <jayen466@gmail.com> wrote:

> According to Denny, Wikidata, under its CC0 licence, must not import data
> from Share-Alike sources. He reconfirmed this yesterday when I asked him
> whether he still stood by that.
>
> In practice though we have Wikidata importing massive amounts of data from
> Wikipedia, which was a Share-Alike source last time I looked. Isn't
> Wikidata then infringing Wikipedia contributors' rights?
>
> Why is it okay to import data from the CC BY-SA Wikipedia, but not from
> European CC BY-SA population statistics?
>
>
Andreas, what I said was that Wikidata must not import data from a data
source licensed under Share-Alike.

The important thing that differentiates what I said from what you think I
said is "import data from a data source". Wikipedia is not a data source,
but text. Extracting facts or data from a text is a very different thing
from taking data from one place and putting it in another place. There was no
database that contained the content of Wikipedia and that could be queried.
Indeed, that is the whole reason why Wikidata was started in the first
place.

In fact, extracting facts or data from one text and then writing a
Wikipedia article is what Wikipedians do all the time, and the license of
the original text we read has no effect on the license of the output text.

So, there is no such thing as an import of data from Wikipedia, because
Wikipedia is not a database.

I have repeatedly pointed you to
http://simia.net/wiki/Free_data
and you yourself have repeatedly pointed to
https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights
so I would assume that you would have by now read these and developed an
understanding of these issues. I am not a lawyer, and my understanding of
these issues is also lacking, but I wanted at least to point out that you
are misquoting me.

Please, would you mind correcting your misquoting of me in the places where
you did so, or at least pointing to this email for further context?
Re: [Wikimedia-l] Quality issues [ In reply to ]
Denny,


I quoted your statement verbatim and in full in the op-ed. Moreover, your
statement had a context. Alexrk2 had said,[1]



---o0o---

Read the above.. at least under European Union law databases are protected
by copyright. CC0 won't be compatible with other projects like
OpenStreetMap *or Wikipedia*. This means a CC0-WikiData won't be allowed to
*import content from Wikipedia*, OpenStreetMap or any other share-alike data
source. The worst case IMO would be if WikiData *extracts content out of
Wikipedia and release it as CC0*. Under EU law this would be illegal. As a
contributor in DE Wikipedia I would feel like being expropriated somehow.
This is not acceptable! --Alexrk2 (talk) 15:32, 16 June 2012 (UTC)

---o0o---



Note Alexrk2's three (3) specific references to Wikipedia.

Alexrk2 referred to imports of content from Wikipedia, and how it would
make her or him feel expropriated if WikiData extracted content out of
Wikipedia and released it under CC0.

You replied,



---o0o---

Alexrk2, it is true that Wikidata under CC0 would not be allowed to import
content from a Share-Alike data source. *Wikidata does not plan to extract
content out of Wikipedia at all*. Wikidata will *provide* data that can be
reused in the Wikipedias. And a CC0 source can be used by a Share-Alike
project, be it either Wikipedia or OSM. But not the other way around. Do we
agree on this understanding? --Denny Vrandečić (WMDE) (talk) 12:39, 4 July
2012 (UTC)

---o0o---



Alexrk2 specifically mentioned Wikipedia. So did you in your reply,
assuring Alexrk2 that Wikidata did not in fact plan to extract content out
of Wikipedia at all. Does this lend itself to the interpretation that you
were talking only about databases, and not about Wikipedia?

Alexrk2 then replied to you,



---o0o---

@Denny Vrandečić: I agree. But I thought, the aim (or *one* aim) of
WikiData would be to *draw all the data out of Wikipedia (infoboxes and
such things)*.

---o0o---



You did not respond to that post, or participate further in that section.
And these bot imports of Wikipedia infobox contents etc. have happened and
are ongoing. They have been mentioned in many discussions. There are
millions of statements in Wikidata that are cited to Wikipedia.

Just a few days ago, Jheald said on Project Chat,[2]



---o0o---

But my own view is that we should very definitely be trying, as urgently as
possible, to *capture as much as possible of the huge amount of data in
infoboxes, templates, categorisations, etc on Wikipedia that is not yet in
Wikidata* -- and that (at least in most subject areas) calls to restrict to
only data from independent external sources are utterly utterly misguided,
and typically bear no relation to either what is desirable, what is
available, or what is still needed in order to utilise such sources
effectively. Jheald (talk) 23:49, 8 December 2015 (UTC)

---o0o---



It's not plausible to my understanding to argue that Wikipedia's templates,
infoboxes etc. are not "data sources" when contributors speak of capturing
"the huge amount of data" contained in them. Much of the existing content
of Wikidata consists of data extracted from Wikipedias.

If you feel I have misquoted you anywhere on-wiki, please point me to the
corresponding place (here or via my talk page in that project), and I will
do whatever is necessary.



[1]
https://meta.wikimedia.org/wiki/Talk:Wikidata#Is_CC_the_right_license_for_data.3F
[2]
https://www.wikidata.org/w/index.php?title=Wikidata:Project_chat&diff=281930638&oldid=281906226



On Sat, Dec 12, 2015 at 12:05 AM, Denny Vrandečić <vrandecic@gmail.com>
wrote:

> On Thu, Dec 10, 2015 at 4:18 AM Andreas Kolbe <jayen466@gmail.com> wrote:
>
> > According to Denny, Wikidata, under its CC0 licence, must not import data
> > from Share-Alike sources. He reconfirmed this yesterday when I asked him
> > whether he still stood by that.
> >
> > In practice though we have Wikidata importing massive amounts of data
> from
> > Wikipedia, which was a Share-Alike source last time I looked. Isn't
> > Wikidata then infringing Wikipedia contributors' rights?
> >
> > Why is it okay to import data from the CC BY-SA Wikipedia, but not from
> > European CC BY-SA population statistics?
> >
> >
> Andreas, what I said was that Wikidata must not import data from a data
> source licensed under Share-Alike date source.
>
> The important thing that differentiates what I said from what you think I
> said is "import data from a data source". Wikipedia is not a data source,
> but text. Extracting facts or data from a text is a very different thing
> than taking data from one place and put it in another place. There was no
> database that contains the content of Wikipedia and that can be queried.
> Indeed, that is the whole reason why Wikidata has been started in the first
> place.
>
> In fact, extracting facts or data from one text and then writing a
> Wikipedia article is what Wikipedians do all the time, and the license of
> the original text we read has no effect on the license of the output text.
>
> So, there is no such thing as an import of data from Wikipedia, because
> Wikipedia is not a database.
>
> I have repeatedly pointed you to
> http://simia.net/wiki/Free_data
> and you yourself have repeatedly pointed to
> https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights
> so I would assume that you would have by now read these and developed an
> understanding of these issues. I am not a lawyer, and my understanding of
> these issues is also lacking, but I wanted at least to point out that you
> are misquoting me.
>
> Please, would you mind to correct your misquoting of me in the places where
> you did so, or at least point to this email for further context?
>
>
Re: [Wikimedia-l] Quality issues [ In reply to ]
Andreas,

Why is it that Denny is to answer on your terms, and why is it that you have
not addressed any of the points I made on quality? Moreover, you deny his
argument because YOU are not willing to acknowledge his point, thereby
making him out to be a liar.

You have not acknowledged that Wikidata is a wiki and you do not appreciate
its implications. You are told that your notion of quality has the least
operational value in Wikidata. You have been told repeatedly why and how
considering these other definitions of quality contributes to improved
quality and participation, and it is as if this is totally irrelevant.
This all means nothing to you because you do not care, you are
intentionally not involved. You are like a pharisee in the temple.

I have heard it said several times now that your attitude is the same as
that of the ones mocking Wikipedia when it was young. Given that you stand for
Wikipedia Signpost, you degrade the appreciation of the English Wikipedia
considerably because you seem to be arguing the antithesis of the wiki
concept.

Get a life.
Thanks,
GerardM

On 12 December 2015 at 07:01, Andreas Kolbe <jayen466@gmail.com> wrote:

> Denny,
>
>
> I quoted your statement verbatim and in full in the op-ed. Moreover, your
> statement had a context. Alexrk2 had said,[1]
>
>
>
> ---o0o---
>
> Read the above.. at least under European Union law databases are protected
> by copyright. CC0 won't be compatible with other projects like
> OpenStreetMap *or Wikipedia*. This means a CC0-WikiData won't be
> allowed to *import
> content from Wikipedia*, OpenStreetMap or any other share-alike data
> source. The worst case IMO would be if WikiData *extracts content out of
> Wikipedia and release it as CC0*. Under EU law this would be illegal. As a
> contributor in DE Wikipedia I would feel like being expropriated somehow.
> This is not acceptable! --Alexrk2 (talk) 15:32, 16 June 2012 (UTC)
>
> ---o0o---
>
>
>
> Note Alexrk2's three (3) specific references to Wikipedia.
>
> Alexrk2 referred to imports of content from Wikipedia, and how it would
> make her or him feel expropriated if WikiData extracted content out of
> Wikipedia and released it under CC0.
>
> You replied,
>
>
>
> ---o0o---
>
> Alexrk2, it is true that Wikidata under CC0 would not be allowed to import
> content from a Share-Alike data source. *Wikidata does not plan to extract
> content out of Wikipedia at all*. Wikidata will *provide *data that can be
> reused in the Wikipedias. And a CC0 source can be used by a Share-Alike
> project, be it either Wikipedia or OSM. But not the other way around. Do we
> agree on this understanding? --Denny Vrandečić (WMDE) (talk) 12:39, 4 July
> 2012 (UTC)
>
> ---o0o---
>
>
>
> Alexrk2 specifically mentioned Wikipedia. So did you in your reply,
> assuring Alexrk2 that Wikidata did not in fact plan to extract content out
> of Wikipedia at all. Does this lend itself to the interpretation that you
> were talking only about databases, and not about Wikipedia?
>
> Alexrk2 then replied to you,
>
>
>
> ---o0o---
>
> @Denny Vrandečić: I agree. But I thought, the aim (or *one* aim) of
> WikiData would be to *draw all the data out of Wikipedia (infoboxes and
> such things)*.
>
> ---o0o---
>
>
>
> You did not respond to that post, or participate further in that section.
> And these bot imports of Wikipedia infobox contents etc. have happened and
> are ongoing. They have been mentioned in many discussions. There are
> millions of statements in Wikidata that are cited to Wikipedia.
>
> Just a few days ago, Jheald said on Project Chat,[2]
>
>
>
> ---o0o---
>
> But my own view is that we should very definitely be trying, as urgently as
> possible, to *capture as much as possible of the huge amount of data in
> infoboxes, templates, categorisations, etc on Wikipedia that is not yet in
> Wikidata* -- and that (at least in most subject areas) calls to restrict to
> only data from independent external sources are utterly utterly misguided,
> and typically bear no relation to either what is desirable, what is
> available, or what is still needed in order to utilise such sources
> effectively. Jheald (talk) 23:49, 8 December 2015 (UTC)
>
> ---o0o---
>
>
>
> It's not plausible to my understanding to argue that Wikipedia's templates,
> infoboxes etc. are not "data sources" when contributors speak of capturing
> "the huge amount of data" contained in them. Much of the existing content
> of Wikidata consists of data extracted from Wikipedias.
>
> If you feel I have misquoted you anywhere on-wiki, please point me to the
> corresponding place (here or via my talk page in that project), and I will
> do whatever is necessary.
>
>
>
> [1]
>
> https://meta.wikimedia.org/wiki/Talk:Wikidata#Is_CC_the_right_license_for_data.3F
> [2]
>
> https://www.wikidata.org/w/index.php?title=Wikidata:Project_chat&diff=281930638&oldid=281906226
>
>
>
> On Sat, Dec 12, 2015 at 12:05 AM, Denny Vrandečić <vrandecic@gmail.com>
> wrote:
>
> > On Thu, Dec 10, 2015 at 4:18 AM Andreas Kolbe <jayen466@gmail.com>
> wrote:
> >
> > > According to Denny, Wikidata, under its CC0 licence, must not import
> data
> > > from Share-Alike sources. He reconfirmed this yesterday when I asked
> him
> > > whether he still stood by that.
> > >
> > > In practice though we have Wikidata importing massive amounts of data
> > from
> > > Wikipedia, which was a Share-Alike source last time I looked. Isn't
> > > Wikidata then infringing Wikipedia contributors' rights?
> > >
> > > Why is it okay to import data from the CC BY-SA Wikipedia, but not from
> > > European CC BY-SA population statistics?
> > >
> > >
> > Andreas, what I said was that Wikidata must not import data from a data
> > source licensed under Share-Alike date source.
> >
> > The important thing that differentiates what I said from what you think I
> > said is "import data from a data source". Wikipedia is not a data source,
> > but text. Extracting facts or data from a text is a very different thing
> > than taking data from one place and put it in another place. There was no
> > database that contains the content of Wikipedia and that can be queried.
> > Indeed, that is the whole reason why Wikidata has been started in the
> first
> > place.
> >
> > In fact, extracting facts or data from one text and then writing a
> > Wikipedia article is what Wikipedians do all the time, and the license of
> > the original text we read has no effect on the license of the output
> text.
> >
> > So, there is no such thing as an import of data from Wikipedia, because
> > Wikipedia is not a database.
> >
> > I have repeatedly pointed you to
> > http://simia.net/wiki/Free_data
> > and you yourself have repeatedly pointed to
> > https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights
> > so I would assume that you would have by now read these and developed an
> > understanding of these issues. I am not a lawyer, and my understanding of
> > these issues is also lacking, but I wanted at least to point out that you
> > are misquoting me.
> >
> > Please, would you mind to correct your misquoting of me in the places
> where
> > you did so, or at least point to this email for further context?
> >
> >
Re: [Wikimedia-l] Quality issues [ In reply to ]
On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher
<lydia.pintscher@wikimedia.de> wrote:
> That is actually not correct. We have built Wikidata from the very
> beginning with some core believes. One of them is that Wikidata isn't
> supposed to have the one truth but instead is able to represent
> various different points of view and link to sources claiming these.
> Look for example at the country statements for Jerusalem:
> https://www.wikidata.org/wiki/Q1218
> Now I am the first to say that this will not be able to capture the
> full complexity of the world around us. But that's not what it is
> meant to do. However please be aware that we have built more than just
> a dumb database with Wikidata and have gone to great length to make it
> possible to capture knowledge diversity.

I've taken the time and written a longer piece about data quality and
knowledge diversity on Wikidata for the current edition of the
Signpost: https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed


Cheers
Lydia

--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

Re: [Wikimedia-l] Quality issues [ In reply to ]
Thanks for that essay, Lydia! You said it well, and I especially agree with
what you wrote about trust and believing in ourselves. I had to laugh at
some of the comments, because if you substitute "Wikipedia" for "Wikidata"
those comments could have been written 3 years ago before Wikidata came on
the scene.

On Sat, Dec 12, 2015 at 10:18 PM, Lydia Pintscher <
lydia.pintscher@wikimedia.de> wrote:

> On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher
> <lydia.pintscher@wikimedia.de> wrote:
> > That is actually not correct. We have built Wikidata from the very
> > beginning with some core believes. One of them is that Wikidata isn't
> > supposed to have the one truth but instead is able to represent
> > various different points of view and link to sources claiming these.
> > Look for example at the country statements for Jerusalem:
> > https://www.wikidata.org/wiki/Q1218
> > Now I am the first to say that this will not be able to capture the
> > full complexity of the world around us. But that's not what it is
> > meant to do. However please be aware that we have built more than just
> > a dumb database with Wikidata and have gone to great length to make it
> > possible to capture knowledge diversity.
>
> I've taken the time and written a longer piece about data quality and
> knowledge diversity on Wikidata for the current edition of the
> Signpost:
> https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
>
Re: [Wikimedia-l] Quality issues [ In reply to ]
Jane,

The issue is that you can't cite one Wikipedia article as a source in
another. If, as some envisage, you were to fill Wikipedia's infoboxes with
Wikidata content that's unsourced, or sourced only to a Wikipedia, you'd be
doing exactly that, and violating WP:V in the process:

"Do not use articles from Wikipedia as sources. Also, do not use *websites
that mirror Wikipedia content or publications that rely on material from
Wikipedia as sources*." (WP:CIRCULAR)

That includes Wikidata. As long as Wikidata doesn't provide external
sourcing, it's unusable in Wikipedia.

Andreas

On Sun, Dec 13, 2015 at 9:15 AM, Jane Darnell <jane023@gmail.com> wrote:

> Thanks for that essay, Lydia! You said it well, and I especially agree with
> what you wrote about trust and believing in ourselves. I had to laugh at
> some of the comments, because if you substitute "Wikipedia" for "Wikidata"
> those comments could have been written 3 years ago before Wikidata came on
> the scene.
>
> On Sat, Dec 12, 2015 at 10:18 PM, Lydia Pintscher <
> lydia.pintscher@wikimedia.de> wrote:
>
> > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher
> > <lydia.pintscher@wikimedia.de> wrote:
> > > That is actually not correct. We have built Wikidata from the very
> > > beginning with some core believes. One of them is that Wikidata isn't
> > > supposed to have the one truth but instead is able to represent
> > > various different points of view and link to sources claiming these.
> > > Look for example at the country statements for Jerusalem:
> > > https://www.wikidata.org/wiki/Q1218
> > > Now I am the first to say that this will not be able to capture the
> > > full complexity of the world around us. But that's not what it is
> > > meant to do. However please be aware that we have built more than just
> > > a dumb database with Wikidata and have gone to great length to make it
> > > possible to capture knowledge diversity.
> >
> > I've taken the time and written a longer piece about data quality and
> > knowledge diversity on Wikidata for the current edition of the
> > Signpost:
> >
> https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed
> >
> >
> > Cheers
> > Lydia
> >
> > --
> > Lydia Pintscher - http://about.me/lydia.pintscher
> > Product Manager for Wikidata
> >
> > Wikimedia Deutschland e.V.
> > Tempelhofer Ufer 23-24
> > 10963 Berlin
> > www.wikimedia.de
> >
> > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> >
> > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> >
Re: [Wikimedia-l] Quality issues [ In reply to ]
On 13 December 2015 at 15:57, Andreas Kolbe <jayen466@gmail.com> wrote:

> Jane,
>
> The issue is that you can't cite one Wikipedia article as a source in
> another.
>


However you can within the same article per [[WP:LEAD]].

--
geni
Re: [Wikimedia-l] Quality issues [ In reply to ]
Hoi,
Wikidata is not Wikipedia. When data is imported from Wikipedia, Wikidata
often says so. That does not mean that all the related data comes from one
Wikipedia; consequently, the composite data may differ in relevant ways.

Again you insist on your point of view. If you think that Wikidata is
inferior for the reasons that you give; fine. Never mind, move on.

In the meantime we will continually improve the quality of Wikidata and
when Wikipedians fail to take notice they will find slowly but surely that
the information in Wikidata is increasingly superior in the one area where
it is most obvious: the silly mistakes that come to light when it is not
only one Wikipedia that is the source of data.
Thanks,
GerardM

On 13 December 2015 at 16:57, Andreas Kolbe <jayen466@gmail.com> wrote:

> Jane,
>
> The issue is that you can't cite one Wikipedia article as a source in
> another. If, as some envisage, you were to fill Wikipedia's infoboxes with
> Wikidata content that's unsourced, or sourced only to a Wikipedia, you'd be
> doing exactly that, and violating WP:V in the process:
>
> "Do not use articles from Wikipedia as sources. Also, do not use *websites
> that mirror Wikipedia content or publications that rely on material from
> Wikipedia as sources*." (WP:CIRCULAR)
>
> That includes Wikidata. As long as Wikidata doesn't provide external
> sourcing, it's unusable in Wikipedia.
>
> Andreas
>
> On Sun, Dec 13, 2015 at 9:15 AM, Jane Darnell <jane023@gmail.com> wrote:
>
> > Thanks for that essay, Lydia! You said it well, and I especially agree
> with
> > what you wrote about trust and believing in ourselves. I had to laugh at
> > some of the comments, because if you substitute "Wikipedia" for
> "Wikidata"
> > those comments could have been written 3 years ago before Wikidata came
> on
> > the scene.
> >
> > On Sat, Dec 12, 2015 at 10:18 PM, Lydia Pintscher <
> > lydia.pintscher@wikimedia.de> wrote:
> >
> > > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher
> > > <lydia.pintscher@wikimedia.de> wrote:
> > > > That is actually not correct. We have built Wikidata from the very
> > > > beginning with some core believes. One of them is that Wikidata isn't
> > > > supposed to have the one truth but instead is able to represent
> > > > various different points of view and link to sources claiming these.
> > > > Look for example at the country statements for Jerusalem:
> > > > https://www.wikidata.org/wiki/Q1218
> > > > Now I am the first to say that this will not be able to capture the
> > > > full complexity of the world around us. But that's not what it is
> > > > meant to do. However please be aware that we have built more than
> just
> > > > a dumb database with Wikidata and have gone to great length to make
> it
> > > > possible to capture knowledge diversity.
> > >
> > > I've taken the time and written a longer piece about data quality and
> > > knowledge diversity on Wikidata for the current edition of the
> > > Signpost:
> > >
> >
> https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed
> > >
> > >
> > > Cheers
> > > Lydia
> > >
> > > --
> > > Lydia Pintscher - http://about.me/lydia.pintscher
> > > Product Manager for Wikidata
> > >
> > > Wikimedia Deutschland e.V.
> > > Tempelhofer Ufer 23-24
> > > 10963 Berlin
> > > www.wikimedia.de
> > >
> > > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> > >
> > > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> > >
Re: [Wikimedia-l] Quality issues [ In reply to ]
On Sun, Dec 13, 2015 at 5:32 PM, geni <geniice@gmail.com> wrote:

> On 13 December 2015 at 15:57, Andreas Kolbe <jayen466@gmail.com> wrote:
>
> > Jane,
> >
> > The issue is that you can't cite one Wikipedia article as a source in
> > another.
> >
>
>
> However you can within the same article per [[WP:LEAD]].
>


Well, of course, if there are reliable sources cited in the body of the
article that back up the statements made in the lead. You still need to
cite a reliable source though; that's Wikipedia 101.
Re: [Wikimedia-l] Quality issues [ In reply to ]
I really feel we are drowning in a glass of water.
The issue of "data quality" or "reliability" that Andreas raises is well
known:
what I don't understand is whether the "scale" of it is much bigger on
Wikidata than on Wikipedia,
and whether this different scale makes it much more important. The scale of the
issue is maybe something worth discussing, and not the issue itself? Is the
fact that Wikidata is centralised different from statements on Wikipedia? I
don't know, but to me this is a more neutral and interesting question.

I often say that the Wikimedia world made quality a "Heisenbergian"
feature: you always have to check if it's there.
The point is: it has always been like this.
We always had to check for quality, even when we used Britannica or
authority controls or whatever "reliable" sources we wanted. Wikipedia, and
now Wikidata, is made for everyone to contribute, it's open and honest in
being open, vulnerable, prone to errors. But we are transparent, we say
that in advance, we can claim any statement to the smallest detail. Of
course it's difficult, but we can do it. Wikidata, as Lydia said, can
actually have conflicting statements in every item: we "just" have to put
them there, as we did on Wikipedia.

If Google uses our data and they are wrong, that's bad for them. If they
correct the errors and do not give us the corrections, that's bad for us
and not ethical of them. The point is: there is no license (as far as I
know) that can force them to contribute to Wikidata. That is, IMHO, the
problem with "over-the-top" actors: they can harness collective intelligence
and "not give back." Even with CC-BY-SA, they could store (as they are
probably already doing) all the data in their knowledge vault, which is
secret as it is an incredible asset for them.

I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or
CC0, but it's not there.

So, as we are working via GLAMs with Wikipedia for getting reliable
sources and content, we are working with them also for good statements and
data. Putting good data in Wikidata makes it better, and I don't understand
what the problem is here (I understand, again, the issue of putting too
much data and still having a small community).
For example: if we are importing different reliable databases, and the
institutions behind them find it useful and helpful to have an aggregator
of identifiers and authority controls, what is the issue? There is value in
aggregating data, because you can spot errors and inconsistencies. It's not
easy, of course, to find a good workflow, but, again, that is *another*
problem.

So, in conclusion: I find many issues in Wikidata, but not on the
mission/vision, just in the complexity of the project, the size of the
dataset, the size of the community.

Can we talk about those?

Aubrey



On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe <jayen466@gmail.com> wrote:

> On Sun, Dec 13, 2015 at 5:32 PM, geni <geniice@gmail.com> wrote:
>
> > On 13 December 2015 at 15:57, Andreas Kolbe <jayen466@gmail.com> wrote:
> >
> > > Jane,
> > >
> > > The issue is that you can't cite one Wikipedia article as a source in
> > > another.
> > >
> >
> >
> > However you can within the same article per [[WP:LEAD]].
> >
>
>
> Well, of course, if there are reliable sources cited in the body of the
> article that back up the statements made in the lead. You still need to
> cite a reliable source though; that's Wikipedia 101.
Re: [Wikimedia-l] Quality issues [ In reply to ]
Hoi,
Thank you for another approach. When Wikidata imports data from Wikipedia,
it essentially stands on the shoulders of giants. Yes, there are sources in
Wikipedia and it does not prevent occasional issues. Yes, we import a lot
of data from Wikipedia and this makes life at Wikidata easy and what we do
obvious. It all started with improving quality at Wikidata by making
interwiki links manageable and we are still often involved in fixing
wikilinks in Wikidata because the assumptions to link some articles are
"funny".

When you look at Wikipedia, a lot of the fixtures are essentially about
data. A category or a list can be replicated in many ways by querying
Wikidata. The inverse is that Wikidata can be populated from Wikipedia.
Consequently when we say that we know about men and women in so many
Wikipedias it is because of this that we can and do. When Wikipedia is
correct, Wikidata is. When Wikipedias do not agree, you will find this
expressed in Wikidata.

When people build tools and bots, and they have done so for a long time, it is
EXACTLY based on the assumption that Wikipedia is essentially correct, and
that is why the quality and quantity of Wikidata is already this good. When
you want to consider Wikidata and its complexity, it is important to look
at the statistics. The statistics by Magnus are the most relevant because
they help explain many of the issues of Wikidata.

One important point. No Wikipedia can claim Wikidata as it is a composite.
Wikipedia policies do not apply. When people insist that all the data in
Wikidata has to be 100% correct, forget it. Wikipedia is not 100% correct
either and that is what we build upon. It has never been this way and it is
impossible to do this any time soon.

What we can do is build upon existing qualities, compare and curate. It is
for instance fairly easy to improve on Wikipedia based upon the information
that is already there but shown to be problematic. It is easy when we
collaborate as it will improve the quality of what we offer. One problem is
that we are SO bad at collaboration. Wikipedians work on one article at a
time and when I work on awards there are easily 60 persons involved and I
trust Wikipedia to be right. The kind of issues I encounter I blog about
regularly. I am not involved in single items or they have to be of
relevance to me like Bassel, the only Wikipedian sentenced to death. So I
did add new items that exist as red links in the award he received and I
did ask Magnus to help me with a list for the award he received. I added
the website I used on the award and that is as far as I go.

When you want to talk about the issues, what is it that you want to
achieve? So far there has been little interest in Wikidata. When you want
to learn about issues, research the issues. Find methods to calculate the
error rate, find methods to compare Wikidata with the Wikipedias and with
other sources in a meaningful way. But do approach it like Magnus does. His
contributions help us make a positive difference. When you find numbers for
now that you cannot replicate with the next dump and the next, they are
essentially without much value because they do not enable us to improve on
what we have. They do not help us engage our minds to make a difference. I
ask Amir regularly to run a bot based on the statistics produced by Magnus,
we are not at the stage where we have such tasks automated...

Andrea, Wikidata is a wiki. It is young and it has already proven itself
for several applications. What can be done improves as our data improves.
We have a lack of data on many subjects because it is where Wikipedia is
lacking. How will we approach for instance the fact that we have fewer than
1000 Syrians and one of them is an emperor of the Roman empire and another
is Bassel?

Let us be bold and allow us to be a wiki. Let us work towards the quality
that is possible to achieve and do not burden us with the assumptions of
some Wikipedias. When you are serious, get involved.
Thanks,
GerardM

On 13 December 2015 at 19:10, Andrea Zanni <zanni.andrea84@gmail.com> wrote:

> I really feel we are drowning in a glass of water.
> The issue of "data quality" or "reliability" that Andreas raises is well
> known:
> what I don't understand if the "scale" of it is much bigger on Wikidata
> than Wikipedia,
> and if this different scale makes it much more important. The scale of the
> issue is maybe something worth discussing, and not the issue itself? Is the
> fact that Wikidata is centralised different from statements on Wikipedia? I
> don't know, but to me this is a more neutral and interesting question.
>
> I often say that the Wikimedia world made quality an "heisemberghian"
> feature: you always have to check if it's there.
> The point is: it's been always like this.
> We always had to check for quality, even when we used Britannica or
> authority controls or whatever "reliable" sources we wanted. Wikipedia, and
> now Wikidata, is made for everyone to contribute, it's open and honest in
> being open, vulnerable, prone to errors. But we are transparent, we say
> that in advance, we can claim any statement to the smallest detail. Of
> course it's difficult, but we can do it. Wikidata, as Lydia said, can
> actually have conflicting statements in every item: we "just" have to put
> them there, as we did to Wikipedia.
>
> If Google uses our data and they are wrong, that's bad for them. If they
> correct the errors and do not give us the corrections, that's bad for us
> and not ethical from them. The point is: there is no license (for what I
> know) that can force them to contribute to Wikidata. That is, IMHO, the
> problem with "over-the-top" actors: they can harness collective intelligent
> and "not give back." Even with CC-BY-SA, they could store (as they are
> probably already doing) all the data in their knowledge vault, which is
> secret as it is an incredible asset for them.
>
> I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or
> CC0, but it's not there.
>
> So, as we are working via GLAMs with Wikipedia for getting reliable
> sources and content, we are working with them also for good statements and
> data. Putting good data in Wikidata makes it better, and I don't understand
> what is the problem here (I understand, again, the issue of putting too
> much data and still having a small community).
> For example: if we are importing different reliable databases, andthe
> institutions behind them find it useful and helpful to have an aggregator
> of identifiers and authority controls, what is the issue? There is value in
> aggregating data, because you can spot errors and inconsistencies. It's not
> easy, of course, to find a good workflow, but, again, that is *another*
> problem.
>
> So, in conclusion: I find many issues in Wikidata, but not on the
> mission/vision, just in the complexity of the project, the size of the
> dataset, the size of the community.
>
> Can we talk about those?
>
> Aubrey
>
>
>
> On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe <jayen466@gmail.com> wrote:
>
> > On Sun, Dec 13, 2015 at 5:32 PM, geni <geniice@gmail.com> wrote:
> >
> > > > On 13 December 2015 at 15:57, Andreas Kolbe <jayen466@gmail.com> wrote:
> > >
> > > > Jane,
> > > >
> > > > The issue is that you can't cite one Wikipedia article as a source in
> > > > another.
> > > >
> > >
> > >
> > > However you can within the same article per [[WP:LEAD]].
> > >
> >
> >
> > Well, of course, if there are reliable sources cited in the body of the
> > article that back up the statements made in the lead. You still need to
> > cite a reliable source though; that's Wikipedia 101.
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Re: [Wikimedia-l] Quality issues [ In reply to ]
Andreas,
That's just not true. You can re-use and remix Wikimedia content as much as
you like. When you say you "can't cite one Wikipedia article as a source in
another", this is also not true, as we see this done in translated articles
in the edit summary. Fortunately Wikipedia articles need sources, so those
are translated along with the rest of the content and are perfectly valid to
take from one project to another. In art history, when we are talking about
paintings, we are all mostly talking about the same sources anyway,
worldwide. This is probably true for most other disciplines as well.

As far as citing goes, the ratio of cited vs. uncited statements in
Wikipedia is probably much greater than in Wikidata, except we can't
measure that. All we measure is the "reference" statement, but there are
lots of sources in various properties and my guess is that most items with
zero statements are early imports that have just not had anyone click on
them yet. When we use images in Wikipedia articles, we do not "cite"
Wikimedia Commons. Indeed, this is exactly the problem we have when we talk
to GLAMs about image donations. The link itself is enough to allow the user
with a few clicks to get at the image information on Commons, where there
is more information, including sources. When I as a Wikipedian use images
of paintings from Commons in a Wikipedia article, I am using multiple
sources for that article, but some of those sources may be from the Commons
image itself, as some of these are particularly well-sourced. When I am
updating the associated Wikidata item, I add all of the sources that I have
found, and for the more famous paintings, others add links from their own
sources, making Wikidata much richer as a source of references than any
single project. As Lydia explained however, not every individual statement
in Wikidata is sourced, though each item may be sourced to multiple
references. This is partially because we lack the tools to easily source
each statement when we update multiple statements at a time, but it is also
because we don't *need* to source obvious statements.

The point is that publishing on any Wikimedia project, whether it's
Wikipedia, Wikimedia Commons, or Wikidata, is a manually-driven complex
process done by volunteers. It is not and never will be automatic.

Jane

On Sun, Dec 13, 2015 at 4:57 PM, Andreas Kolbe <jayen466@gmail.com> wrote:

> Jane,
>
> The issue is that you can't cite one Wikipedia article as a source in
> another. If, as some envisage, you were to fill Wikipedia's infoboxes with
> Wikidata content that's unsourced, or sourced only to a Wikipedia, you'd be
> doing exactly that, and violating WP:V in the process:
>
> "Do not use articles from Wikipedia as sources. Also, do not use *websites
> that mirror Wikipedia content or publications that rely on material from
> Wikipedia as sources*." (WP:CIRCULAR)
>
> That includes Wikidata. As long as Wikidata doesn't provide external
> sourcing, it's unusable in Wikipedia.
>
> Andreas
>
> On Sun, Dec 13, 2015 at 9:15 AM, Jane Darnell <jane023@gmail.com> wrote:
>
> > Thanks for that essay, Lydia! You said it well, and I especially agree with
> > what you wrote about trust and believing in ourselves. I had to laugh at
> > some of the comments, because if you substitute "Wikipedia" for "Wikidata"
> > those comments could have been written 3 years ago before Wikidata came on
> > the scene.
> >
> > On Sat, Dec 12, 2015 at 10:18 PM, Lydia Pintscher <
> > lydia.pintscher@wikimedia.de> wrote:
> >
> > > On Thu, Dec 10, 2015 at 9:27 AM, Lydia Pintscher
> > > <lydia.pintscher@wikimedia.de> wrote:
> > > > That is actually not correct. We have built Wikidata from the very
> > > > beginning with some core beliefs. One of them is that Wikidata isn't
> > > > supposed to have the one truth but instead is able to represent
> > > > various different points of view and link to sources claiming these.
> > > > Look for example at the country statements for Jerusalem:
> > > > https://www.wikidata.org/wiki/Q1218
> > > > Now I am the first to say that this will not be able to capture the
> > > > full complexity of the world around us. But that's not what it is
> > > > meant to do. However please be aware that we have built more than just
> > > > a dumb database with Wikidata and have gone to great lengths to make it
> > > > possible to capture knowledge diversity.
> > >
> > > I've taken the time and written a longer piece about data quality and
> > > knowledge diversity on Wikidata for the current edition of the
> > > Signpost:
> > >
> > > https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed
> > >
> > >
> > > Cheers
> > > Lydia
> > >
> > > --
> > > Lydia Pintscher - http://about.me/lydia.pintscher
> > > Product Manager for Wikidata
> > >
> > > Wikimedia Deutschland e.V.
> > > Tempelhofer Ufer 23-24
> > > 10963 Berlin
> > > www.wikimedia.de
> > >
> > > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> > >
> > > Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> > > unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> > > Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> > >
Re: [Wikimedia-l] Quality issues [ In reply to ]
Andrea,
I totally agree on the mission/vision thing, but am not sure what you mean
exactly by scale - do you mean that Wikidata shouldn't try to be so
granular that it has a statement to cover each factoid in any Wikipedia
article, or do you mean we need to talk about what constitutes notability
in order not to grow Wikidata exponentially to the point the servers crash?
Jane

On Sun, Dec 13, 2015 at 7:10 PM, Andrea Zanni <zanni.andrea84@gmail.com>
wrote:

> I really feel we are drowning in a glass of water.
> The issue of "data quality" or "reliability" that Andreas raises is well
> known:
> what I don't understand is whether the "scale" of it is much bigger on Wikidata
> than on Wikipedia,
> and whether this different scale makes it much more important. The scale of the
> issue is maybe something worth discussing, and not the issue itself? Is the
> fact that Wikidata is centralised different from statements on Wikipedia? I
> don't know, but to me this is a more neutral and interesting question.
>
> I often say that the Wikimedia world made quality a "Heisenbergian"
> feature: you always have to check if it's there.
> The point is: it has always been like this.
> We always had to check for quality, even when we used Britannica or
> authority controls or whatever "reliable" sources we wanted. Wikipedia, and
> now Wikidata, is made for everyone to contribute, it's open and honest in
> being open, vulnerable, prone to errors. But we are transparent, we say
> that in advance, we can claim any statement to the smallest detail. Of
> course it's difficult, but we can do it. Wikidata, as Lydia said, can
> actually have conflicting statements in every item: we "just" have to put
> them there, as we did to Wikipedia.
>
> If Google uses our data and they are wrong, that's bad for them. If they
> correct the errors and do not give us the corrections, that's bad for us
> and not ethical of them. The point is: there is no license (as far as I
> know) that can force them to contribute to Wikidata. That is, IMHO, the
> problem with "over-the-top" actors: they can harness collective intelligence
> and "not give back." Even with CC-BY-SA, they could store (as they are
> probably already doing) all the data in their knowledge vault, which is
> secret as it is an incredible asset for them.
>
> I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or
> CC0, but it's not there.
>
> So, as we are working via GLAMs with Wikipedia for getting reliable
> sources and content, we are working with them also for good statements and
> data. Putting good data in Wikidata makes it better, and I don't understand
> what is the problem here (I understand, again, the issue of putting too
> much data and still having a small community).
> For example: if we are importing different reliable databases, and the
> institutions behind them find it useful and helpful to have an aggregator
> of identifiers and authority controls, what is the issue? There is value in
> aggregating data, because you can spot errors and inconsistencies. It's not
> easy, of course, to find a good workflow, but, again, that is *another*
> problem.
>
> So, in conclusion: I find many issues in Wikidata, but not on the
> mission/vision, just in the complexity of the project, the size of the
> dataset, the size of the community.
>
> Can we talk about those?
>
> Aubrey
Re: [Wikimedia-l] Quality issues [ In reply to ]
On Sun, Dec 13, 2015 at 6:10 PM, Andrea Zanni <zanni.andrea84@gmail.com>
wrote:

> I really feel we are drowning in a glass of water.
> The issue of "data quality" or "reliability" that Andreas raises is well
> known:
> what I don't understand is whether the "scale" of it is much bigger on Wikidata
> than on Wikipedia,
> and whether this different scale makes it much more important. The scale of the
> issue is maybe something worth discussing, and not the issue itself? Is the
> fact that Wikidata is centralised different from statements on Wikipedia? I
> don't know, but to me this is a more neutral and interesting question.
>


Wikidata's (envisaged) centralised nature certainly makes a difference,
because the promise was that it would inform the Wikipedias.

Wikipedia started out with people just writing from their personal
knowledge. The early articles had no footnotes. Then after a while people
noticed problems like cranks filling pages with their abstruse theories
(hence the ban on original research), people adding material from their
blogs, etc. Over the course of a decade, Wikipedia developed the idea and
the culture that you have to cite a professionally published source for
everything you add to Wikipedia.

Wikidata is in its early stages. In a way it really is like Wikipedia in
2003. New content welcome! No references required!

But at the same time, Wikidata is supposed to inform the Wikipedias, as a
central data repository. This creates a mismatch between Wikidata's "early
days -- anything goes, let's just get content in, we'll sort it out later"
attitude and the relatively mature Wikipedias where editors insist on
sources for any new content added.

This out-of-synch-ness is a real problem if you want Wikipedias to actually
use Wikidata content. Wikipedians will not accept content generation models
that take Wikipedia back to its bad old days where you could write anything
you liked without a source to back it up.

Wikipedia is of course still a long way away from citing such sources for
all its content. There are vast amounts of legacy material left over from
the early days. But in the pages that are being created now (like
developing news stories, an area where the quality of Wikipedia's coverage
is often praised), pages that see a lot of traffic, pages that are
controversial, etc., it is well established that you have to cite sources
for any new assertions.

Unsourced content is unceremoniously deleted.

If Wikipedia's reputation for reliability has improved since 2003, that
change in culture from the early days is the reason.

The Age for example published an article the other day that is probably one
of the most celebratory articles ever written about Wikipedia.[1] If you're
a Wikipedian, you'll probably enjoy reading it.

Among the aspects that the author, Elizabeth Farrelly, said she liked most
about Wikipedia was "its ruthless commitment to the printed, demonstrable
source." She ended the article as follows:

---o0o---

But most interesting to me is the ban on primary research. The demand that
every input be traced to a published and authoritative source doesn't make
it true, necessarily, but does enable genuine crowd-sourcing of
scholarship. This is a revelation, and a revolution.

So yes, Wikipedia is flawed. Above all, it needs more female input. But the
obvious response, for you-and-me users who encounter something stupid or
biased or just plain wrong, is to hop in there and fix it. I'll see you
there, yes? Oh, and honey? Cite away!

---o0o---

Abandoning the principles that have elicited such praise -- traceability to
published sources, verifiable citations -- is not something Wikipedians
will entertain. To them, it would be a step back. If Wikidata wants to be
an input to Wikipedia, it will have to bear that in mind.


[1]
http://www.theage.com.au/comment/why-wikipedia-at-15-is-a-beautiful-exercise-in-scholarly-excellence-20151209-glj79f.html

