Mailing List Archive

Is Google translation is good for Wikipedias?
Hello All,

Recently there have been a lot of discussions (on this list as well)
regarding Google's translation project for some of the big language
Wikipedias. The Foundation also seems to have approved Google's efforts. But
I am not sure whether anyone is interested in consulting the respective
language communities to learn their views.

As far as I know, only Tamil, Bengali, and Swahili Wikipedians have raised
concerns about Google's project. But does this mean that the other
communities are happy with Google's efforts? If there is no active community
in a Wikipedia, how can we expect a response from that community? And if
there is no response from a community, does that mean Google can hire some
native speakers and use machine translation to create articles for that
Wikipedia?

Now let us go back to a basic question. Does WMF require a wiki community in
order to create a Wikipedia in a language? Or can it use the services of
companies like Google to create Wikipedias in any number of languages?

One of the main points raised by supporters of Google translation is that
Google's project is good *for the online version of the language*. That
might be true. But nobody has cared to verify whether it is good for
Wikipedia.

As pointed out by Ravi in his presentation at Wikimania
(http://docs.google.com/present/view?id=ddpg3qwc_279ghm7kbhs), the Google
translation of Wikipedia articles:

- will affect the organic growth of a Wikipedia article
- will create copies of English Wikipedia articles in local wikis
- is against some of the basic philosophies of Wikipedia

People outside the wiki will definitely benefit if a Google translation tool
is developed for each language. I saw a working example of this in Poland
during Wikimania, when some people who were not fluent in English used
Google's translator to communicate with us. :)

Apart from the points raised by Ravi in his presentation, this will affect
community growth. If there is no active wiki community, how can we expect
anyone to look after all these junk articles uploaded to the wiki every day?
When all the important article links have already turned blue, how can we
expect any potential future editors? So in my view, Google's project is
killing the growth of active wiki communities.

Of course, Tamil Wikipedia is trying to use the Google project effectively.
But only Tamil is doing that, since it has an active wiki community. *Many
wiki communities are not even aware that such a project is happening in
their wiki.*

I do not want to single out specific language Wikipedias to prove my point.
But visit the Wikipedias (especially Wikipedias *that use non-Latin scripts*)
to see the status of the Google translation project. Loads of junk articles
are uploaded to the wiki every day. Most of the time, the only edits to these
articles are those by the creator and the interlanguage wiki bots.

This effort will definitely affect community growth. Kindly see the points
raised by a Swahili Wikipedian
<http://muddybtz.blog.com/2010/07/16/what-happened-on-the-google-challenge-the-swahili-wikipedia/>.
Many Swahili users (and users of other languages) now expect a laptop or some
other monetary benefit for writing in their Wikipedia. That affects
community growth.

So what is the solution? Can we take lessons from the
Tamil/Bengali/Swahili Wikipedias and find ways to use this service
effectively, or should we continue with the current article creation process?

One last question. Is the tool being developed by Google open source? If
not, we need to answer the many questions that may follow.

Regards

Shiju Alex
http://en.wikipedia.org/wiki/User:Shijualex
_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: Is Google translation is good for Wikipedias?
I think the answer is "Yes and No". As with any new
project/concept/idea/trial, there are pros and there are cons. The real
question is: do the pros outweigh the cons?

From just reading what you linked (and not being involved in any way with
these language projects) and from my own experience of how I work on
Wikipedia: yes, I think it is a good thing overall.

From what I've seen, it is much easier to convince someone who has never
edited to fix grammatical, spelling or other "simple" mistakes. Generally
people don't dive in and write/translate entire articles - it is simply too
high a barrier to entry. These pre-translated articles give people an
"in": they are already there, and they have obvious errors that are easy to
fix.


More "ok" content is better than no content, at least if I have my druthers.

-Jon

On Sat, Jul 24, 2010 at 23:12, Shiju Alex <shijualexonline@gmail.com> wrote:

> [...]


--
Jon
[[User:ShakataGaNai]] / KJ6FNQ
http://snowulf.com/
http://ipv6wiki.net/
Re: Is Google translation is good for Wikipedias?
Hi,

On Sun, Jul 25, 2010 at 3:52 PM, Jon Davis <wiki@konsoletek.com> wrote:
> I think the answer is "Yes and No". As with any new
> project/concept/idea/trial, there are pros and there are cons. The real
> question is: do the pros outweigh the cons?
>
> From just reading what you linked (and not being involved in any way with
> these language projects) and from my own experience of how I work on
> Wikipedia: yes, I think it is a good thing overall.
>
> From what I've seen, it is much easier to convince someone who has never
> edited to fix grammatical, spelling or other "simple" mistakes. Generally
> people don't dive in and write/translate entire articles - it is simply too
> high a barrier to entry. These pre-translated articles give people an
> "in": they are already there, and they have obvious errors that are easy to
> fix.

In my experience at Transcom and my own as a translator, people
appreciate pre-translated articles only when they are of good quality.
Some pre-translations are of such poor quality that they contain too
many obvious errors to fix in a reasonable time frame.

I've seen several requests, both on Meta and on language projects, to
delete this kind of poor-quality "translation", where people think it
better to write a new version from scratch.

And in my observation, Google translation is still at this level in
many languages. Even with Western languages, unless one of them is
English, results may be of poor quality (e.g. it cannot keep the
distinction between tu/vous, du/Sie, etc.).

Cheers,


--
KIZU Naoko
http://d.hatena.ne.jp/Britty (in Japanese)
Quote of the Day (English): http://en.wikiquote.org/wiki/WQ:QOTD

Re: Is Google translation is good for Wikipedias?
Two things:

1) Please define "junk articles". Do you mean articles that you think
nobody in your community wants to read (like, say, an article about an
American singer or actor, for example [[Lady Gaga]]), or do you mean
articles that are written in such a way as to be incomprehensible, or
are filled with linkspam, etc.? Or do you mean something else entirely?
Please explain.
2) Community is certainly important, but aren't we here to write an
encyclopedia? I don't think having all links turned blue is a bad
thing at all. In fact, it seems to me that over time, a larger article
base will result in more users joining. Note that I said over time; in
the short term, it may not have much effect.

-m.


On Sat, Jul 24, 2010 at 11:12 PM, Shiju Alex <shijualexonline@gmail.com> wrote:
> [...]

Re: Is Google translation is good for Wikipedias?
Aphaia, a great deal of confusion has been created with regard to
this project. I hope you'll allow me to attempt to clear it up.

These are NOT articles that were translated directly by Google
Translate. Rather, they were created using Google Translator Toolkit,
which requires human intervention by a speaker of the language -
someone to check and correct every single sentence translated, in the
case of languages where Google already has machine translation, or to
write entirely new _human_ translations, in the cases where no Google
Translate module exists (for example, Tamil), with the aid of
Translation Memory software.

I currently work as a translator and have found that Google Translator
Toolkit is great for speeding up and improving the consistency of
translations, and at least the results of my work are usually better
with it than they would be without (I'm glad for the consistency - if
I'm translating a large document, I'd like to make sure to translate
the same phrases the same way every time they occur rather than using
slightly different wording the second time around). Since they're
revised and corrected by a human, they _should_ have the same level of
grammatical correctness, comprehensibility and translation quality as
a pure human translation. If they don't, this is the fault of the
person using the toolkit, not the software itself.

-m.

On Sun, Jul 25, 2010 at 1:53 AM, Aphaia <aphaia@gmail.com> wrote:
> [...]

Re: Is Google translation is good for Wikipedias?
Thanks for your clarification, Node.ue. I know about it because I
attended their presentation at Wikimania. It is an ambitious project
that I'd like to see grow, but at the moment there seems to be a
serious problem in the system. They seem to use English as a stem
language and assume that all translations are first done into English
and then into another language. On the other hand, at least on major
non-English Western-language Wikipedias, some share of translations
(1/3 IIRC) are not related to English.

If you think it works for you, that's fine, but please be aware that it
might not work as well for non-English speakers as it does for you.

Cheers,

On Sun, Jul 25, 2010 at 7:18 PM, Mark Williamson <node.ue@gmail.com> wrote:
> Aphaia, a great deal of confusion has been created with regards to
> this project. I hope you'll allow me to attempt to clear it up.
>
> These are NOT articles that were translated directly by Google
> Translate. Rather, they were created using Google Translator Toolkit,
> which requires human intervention by a speaker of the language -
> someone to check and correct every single sentence translated, in the
> case of languages where Google already has machine translation, or to
> write entirely new _human_ translations, in the cases where no Google
> Translate module exists (for example, Tamil), with the aid of
> Translation Memory software.
>
> I currently work as a translator and have found that Google Translator
> Toolkit is great for speeding up and improving the consistency of
> translations, and at least the results of my work are usually better
> with it than they would be without (I'm glad for the consistency - if
> I'm translating a large document, I'd like to make sure to translate
> the same phrases the same way every time they occur rather than using
> slightly different wording the second time around). Since they're
> revised and corrected by a human, they _should_ have the same level of
> grammatical correctness, comprehensibility and translation quality as
> a pure human translation. If they don't, this is the fault of the
> person using the toolkit, not the software itself.
>
> -m.
>



--
KIZU Naoko
http://d.hatena.ne.jp/Britty (in Japanese)
Quote of the Day (English): http://en.wikiquote.org/wiki/WQ:QOTD

Re: Is Google translation is good for Wikipedias? [ In reply to ]
> I've seen several requests, both on meta and on language projects, to
> delete this kind of bad quality "translation" which people think
> better to scratch a new version.

Uhm. On pl wiki, Google Translate is considered evil. Translations made with Google Translate are deleted (though not speedily). Users who use Google Translate for mass production of articles are blocked.

So it's generally a problem with copying (articles, ideas, etc.) from en wiki (the most popular):

http://pl.wikipedia.org/wiki/Wikipedia:Enwikizm

"Not all things in en wiki are good. Just don't copy thoughtlessly."

przykuta

Re: Is Google translation is good for Wikipedias? [ In reply to ]
Aphaia, any machine translation system that produces even remotely
comprehensible results should be usable for machine-aided
translation. Its utility is low if the output is complete
gibberish, but that doesn't seem to be the case here. Regardless, it's
possible to turn off automatic translation and use the system
merely as a translation memory, which would be useful if
the automatic translation actually did produce gibberish. It is still
useful, I think, because it automatically breaks text into segments
and is at least *intended* to preserve formatting (this seems to be an
issue for WP articles) without requiring users to re-type every single
wikilink.
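
To make the segmenting and link-preserving idea concrete, here is a
minimal sketch in Python. It is only an illustration under my own
assumptions, not how Google Translator Toolkit actually works: the
function names, the __LINKn__ placeholder format and the naive sentence
splitter are all hypothetical.

```python
import re

# Matches a simple, non-nested [[wikilink]] (assumption: no nested brackets).
WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")


def protect_links(text):
    """Replace each wikilink with a numbered placeholder; return text and links."""
    links = []

    def stash(match):
        links.append(match.group(0))
        return f"__LINK{len(links) - 1}__"

    return WIKILINK.sub(stash, text), links


def split_segments(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def restore_links(segment, links):
    """Put the original wikilinks back in place of their placeholders."""
    return re.sub(r"__LINK(\d+)__", lambda m: links[int(m.group(1))], segment)


if __name__ == "__main__":
    source = "Warsaw is the capital of [[Poland]]. It lies on the [[Vistula]]."
    protected, links = protect_links(source)
    for segment in split_segments(protected):
        # A translator would edit the segment here, leaving placeholders intact.
        print(restore_links(segment, links))
```

Links are stashed before splitting so that a full stop inside a link
target cannot break a segment in the wrong place. Piped links such as
[[target|label]] would still need their label translated by hand.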

-m.

On Sun, Jul 25, 2010 at 3:47 AM, Aphaia <aphaia@gmail.com> wrote:
> Thanks for your clarification, Node.ue, I know it because I attended
> their presentation on Wikimania. It is an ambitious project I'd like
> to see it growing, but at this moment they seem to have a serious
> problem in its system. They seem to use English as a stem language,
> and assumes all translations are first done into English and then to
> another language. On the other hand, at least on major non-English
> Western language Wikipedia some amount of translations (1/3 IIRC) are
> not related to English.
>
> If you think it works for you, it's fine, but please be aware it might
> not work for non-English speakers as well as for you.
>
> Cheers,
>
Re: Is Google translation is good for Wikipedias? [ In reply to ]
Can we clarify here, are we talking about Google Translate or Google
Translator Toolkit?

-m.

On Sun, Jul 25, 2010 at 3:49 AM, Przykuta <przykuta@o2.pl> wrote:
>> I've seen several requests, both on meta and on language projects,  to
>> delete this kind of bad quality "translation" which people think
>> better to scratch a new version.
>
> Uhm. In pl wiki google translate is evil. Translations by google translate are deleted (not speedy). Users who use google translate for mass production of articles are blocked.
>
> So, it's generaly problem with copy (articles, ideas etc.) from en wiki (most popular):
>
> http://pl.wikipedia.org/wiki/Wikipedia:Enwikizm
>
> "Not all things in en wiki are good. Just don't copy thoughtlessly."
>
> przykuta
>
Re: Is Google translation is good for Wikipedias? [ In reply to ]
about google translation, I think.

przykuta


> Can we clarify here, are we talking about Google Translate or Google
> Translator Toolkit?
>
> -m.
>
Re: Is Google translation is good for Wikipedias? [ In reply to ]
> about google translation, I think.
>
> przykuta
>

Oops, sorry, I found an e-mail from Shiju Alex in my spam box.


Re: Is Google translation is good for Wikipedias? [ In reply to ]
Well - this seems a bit confusing. I think Shiju Alex was talking
about the toolkit, but I got the impression you're referring to Google
Translate, which I agree is always unsuitable to produce usable
articles.

-m.

On Sun, Jul 25, 2010 at 4:26 AM, Przykuta <przykuta@o2.pl> wrote:
> about google translation, I think.
>
> przykuta
>
>
Re: Is Google translation is good for Wikipedias? [ In reply to ]
On Sun, Jul 25, 2010 at 8:33 AM, Mark Williamson <node.ue@gmail.com> wrote:

> about the toolkit, but I got the impression you're referring to Google
> Translate, which I agree is always unsuitable to produce usable
> articles.
>

Machine translation is always unsuitable to produce usable articles, but can
help to start new ones in smaller wikipedias.

If we want to use machine translation we should try with a free project like
Apertium:

http://www.apertium.org/
http://wiki.apertium.org/wiki/Main_Page
irc://irc.freenode.net/apertium

--
Fajro
Re: Is Google translation is good for Wikipedias? [ In reply to ]
2010/7/25 Shiju Alex <shijualexonline@gmail.com>:
> Hello All,
>
> Recently there are lot of discussions (in this list also) regarding the
> translation project by Google for some of the big language wikipedias. The
> foundation also seems like approved the efforts of Google. But I am not sure
> whether any one is interested to consult the respective language community
> to know their views.

At the same session at Wikimania a very sensible approach was
presented by Mikel Iturbe from the Basque Wikipedia:

* They didn't use Google Translate, but an academically-developed
tool, which also happened to be Free Software - which diminished the
arguments about commercialization.

* The editor community was involved throughout the whole process.

* Articles were not uploaded without correcting mistakes that the
translation software made.

* What's also important, the corrections were reported to the
translation software developers, so they would try to improve it.

Of course, not every language community can afford developing
Free-as-in-speech academic translation software, but the other points
are useful to everybody.

Mikel Iturbe's presentation:
* http://www.slideshare.net/janfri/wikimania2010

The academic papers related to that project:
* http://ixa.si.ehu.es/openmt2/argitalpenak_html
* http://ixa.si.ehu.es/Ixa/Argitalpenak/Artikuluak/index_html?Atala=Artikulua_Itzulpen_automatikoa

--
אָמִיר אֱלִישָׁע אַהֲרוֹנִי
Amir Elisha Aharoni

http://aharoni.wordpress.com

"We're living in pieces,
 I want to live in peace." - T. Moore

Re: Is Google translation is good for Wikipedias? [ In reply to ]
--- On Sun, 25/7/10, Fajro <faigos@gmail.com> wrote:
> Machine translation is always unsuitable to produce usable
> articles, but can
> help to start new ones in smaller wikipedias.


I second that. About 50% of machine translation output is gibberish, or worse, plausible-sounding text that actually says the opposite of what the original said. To get it into readable form takes about as long as starting from scratch.

Translation memory software only helps where content is repetitive.

A.




Re: Is Google translation is good for Wikipedias? [ In reply to ]
On Sunday 25 July 2010 08:12:43 Shiju Alex wrote:
> So what is the solution for this? Can we take lessons from
> Tamil/Bengali/Swahili wikipedias and find methods to use this service
> effectively or continue with the current article creation process.

I was thinking about a website that would have static copies of all Wikipedia
articles translated into all languages. That should dissuade people from using
Google Translate to make Wikipedia articles, since the articles would already
be online; and even if someone did that, admins would have community
support for deleting such articles because they already exist online. And
if someone wanted to fix a Google Translate translation and make a real
article, they could do that too...

Re: Is Google translation is good for Wikipedias? [ In reply to ]
As an admin on Bengali Wikipedia, I had to deal with this issue a lot
(some of it was discussed in the Telegraph (India) newspaper
article). But I'd like to elaborate on our stance here:

(The tool used was Google Translator Toolkit, not Google Translate.
There is a distinction between these two tools. Google Translator
Toolkit (GTT) is a translation-memory based semi-manual translation
tool. That is, it learns translation skills as you gradually translate
articles by hand. Later, this can be used to automate translation.)
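
As a rough illustration of what "translation-memory based" means, here
is a minimal Python sketch. It is not GTT's implementation; the class
name, the 0.8 similarity threshold and the use of difflib for fuzzy
matching are my own assumptions, chosen only to show the idea of
reusing earlier hand translations.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores segment pairs and suggests reuse."""

    def __init__(self, threshold=0.8):
        self.pairs = {}              # source segment -> human translation
        self.threshold = threshold   # minimum similarity for a fuzzy match

    def add(self, source, target):
        """Record a translation that a human has produced or approved."""
        self.pairs[source] = target

    def suggest(self, source):
        """Return (translation, similarity), or (None, best similarity seen)."""
        if source in self.pairs:
            return self.pairs[source], 1.0
        best, best_score = None, 0.0
        for known, target in self.pairs.items():
            score = SequenceMatcher(None, source, known).ratio()
            if score > best_score:
                best, best_score = target, score
        if best_score >= self.threshold:
            return best, best_score
        return None, best_score


if __name__ == "__main__":
    tm = TranslationMemory()
    tm.add("The city is located in the north.", "<human translation>")
    print(tm.suggest("The city is located in the north."))  # exact reuse
    print(tm.suggest("The city is located in the south."))  # fuzzy suggestion
    print(tm.suggest("Rivers flow."))                       # nothing useful
```

A fuzzy suggestion is only a starting point: in the second call the
stored translation still says "north", so a human has to correct it.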

Issues:
1. Community involvement: First of all, the local community was not at
all involved in or informed about this project. All of a sudden, we found
new users signing up, dropping a large article on a random topic, and
moving away. These users never responded to any talk page messages, so
we first assumed these were just random users experimenting with
Wikipedia.

Even now, no one from Google has contacted us at Bengali Wikipedia to
inform us about Google's intentions. This is not a problem by itself,
but see the following points.

2. Translation quality: The quality of the translations was awful. The
translations added to Bengali wikipedia were artificial, dry, and used
obscure words and phrases. It looked as if a non-native speaker sat
down with a dictionary in hand, and mechanically translated each
sentence word by word. That led to sentences which are hard to
understand, or downright nonsensical.

The articles were half-done. Numerals were not translated at all. The
punctuation symbol for Bengali language (the "danda" symbol: । ) was
not used. (apparently, GTT and/or the google transliteration tool does
not support that).
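
For what it is worth, the numeral and danda issues are the kind of thing
a small post-processing step could catch. The Python sketch below is
only an illustration under my own assumptions (the function name and
the rule for spotting a sentence-final full stop are hypothetical); it
is not something GTT or the translators were known to use.

```python
import re

# Map ASCII digits 0-9 to Bengali digits U+09E6..U+09EF.
BENGALI_DIGITS = {ord(str(i)): chr(0x09E6 + i) for i in range(10)}
DANDA = "\u0964"  # the danda punctuation mark


def localize_punctuation_and_digits(text):
    """Convert ASCII digits to Bengali digits and final full stops to danda."""
    text = text.translate(BENGALI_DIGITS)
    # Only a full stop followed by whitespace or end of text is treated as
    # sentence-final, so a decimal point between digits is left alone.
    return re.sub(r"\.(?=\s|$)", DANDA, text)


if __name__ == "__main__":
    print(localize_punctuation_and_digits("Founded in 1971. Area: 3.5 km."))
```

Abbreviations that end in a full stop would be converted wrongly by this
rule, so the output would still need a human check.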

The articles were also full of spelling mistakes. The paid translator
misspelled many simple words, or even used different spellings for the
same word in different parts of the article.

Finally, different languages have different sentence structures.
Sometimes, a complex sentence is better expressed if broken up into two
sentences in another language. We found that the translators simply
translated sentences preserving their English language structure. This
made the resulting Bengali sentences awkward and artificial to read.
For example, we do not write "If x then y" in Bengali just by
replacing "if" and "then" with the corresponding Bengali words. But the
translators did that; apparently this is an artifact of using GTT.


3. Lack of follow-up: When we found the above problems, naturally, we
asked the contributors to fix them. We got no reply. It is NOT the task of
volunteers to clean up the mess after the one-night-standish paid
translators. Given the small number of volunteers active at any given
moment, it will take enormous effort on our part to go through these
articles and fix the punctuation, spelling, and grammar issues. Not to
mention the awkward language style used by the translators.

So, after getting a cold shoulder from the paid translators about
fixing their mess, we had to ban such edits outright. We didn't know
who was behind this until the Wikimania talk from Google. Not that it
matters ... even now, we won't allow these half-done and badly
translated articles on Bengali Wikipedia.

Bengali wikipedia is small (21k articles), but we do not want to
populate it overnight with badly translated content, some of which
won't even qualify as grammatically correct Bengali. While wikipedia
may be a perpetual work in progress, that does not mean we need to be
guinea-pigs of some careless experiments. So, our stance is, "Thanks,
but NO Thanks!". Unless, of course, they can put enough commitment
into the translations and fix mistakes.

We welcome automation in translation, but not at the expense of
introducing incorrect and messy content on Wikipedia. We'd rather stay
small and hand-crafted than allow an experimental tool and unskilled
paid translators to create a big mess.


Thanks

Ragib (User:Ragib on en and bn)

--
Ragib Hasan, Ph.D
NSF Computing Innovation Fellow and
Assistant Research Scientist

Dept of Computer Science
Johns Hopkins University
3400 N Charles Street
Baltimore, MD 21218

Website:
http://www.ragibhasan.com





_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: Is Google translation is good for Wikipedias?
On Tue, Jul 27, 2010 at 8:38 PM, Ragib Hasan <ragibhasan@gmail.com> wrote:

> (The tool used was Google Translation Toolkit. (not Google Translate).
> There is a distinction between these two tools. Google Translation
> Toolkit (GTT) is a translation-memory based semi-manual translation
> tool. That is, it learns translation skills as you gradually translate
> articles by hand. Later, this can be used to automate translation.)

Another issue: The resulting translation memory is not free.

--
Fajro

Re: Is Google translation is good for Wikipedias?
On Tue, Jul 27, 2010 at 8:43 PM, Fajro <faigos@gmail.com> wrote:
> Another issue: The resulting translation memory is not free.


My guess is that the translation memory will be used to enhance
Google Translate (the automated translator). That is probably one
reason for creating these translations in the first place.

(See http://en.wikipedia.org/wiki/Google_Translate : "According to
Och, a solid base for developing a usable statistical machine
translation system for a new pair of languages from scratch, would
consist in having a bilingual text corpus (or parallel collection) of
more than a million words and two monolingual corpora of each more
than a billion words")

--

Ragib

Re: Is Google translation is good for Wikipedias?
>
> We welcome automation in translation, but not at the expense of
> introducing incorrect and messy content on wikipedia. We'd rather stay
> small and hand-craft than allow an experimental tool and unskilled
> paid translators creating a big mess.
>


Yes. This is the answer you will get from most of the active (small)
wiki communities where this project is going on. Many small wiki
communities do not worry about article counts the way some big Wikipedias
do. Quality matters more for small wikis, where contributors are few.
*Many of us use this quality metric* itself to bring in more people.

My real concern is the rift this project is creating within a language
community. Issues of a language wiki are being taken outside the wiki to
prove points against its contributors. Two types of communities are
evolving out of this project: *Google's wiki community* and *Wiki's wiki
community*. :) This is really annoying as far as small wikis are concerned.

So some sort of intervention is required to make sure this project runs
smoothly on the different Wikipedias.


~Shiju


Re: Is Google translation is good for Wikipedias?
Dear colleagues,

My experiences with the Translator Toolkit are negative, too. Too often
a sentence came out so twisted that I did not understand it. Checking
it against the original took me a lot of time, so I decided that doing
the translation myself is much quicker and more reliable. It does
nobody any good to read Wikipedia articles written in gibberish.

The idea that the translation tool does the work and a human being
only has to make a few small corrections has simply failed. What I
found especially negative is that the Translator Toolkit encourages
you to translate sentence by sentence.

I don't want to do injustice to anyone, but in my view there are two
groups of Wikipedians:
- those who want to see huge article numbers and believe that any
article with any content, of any quality, is good, and that the
Wikipedians are sufficient to do the rest.
- those who believe that (at least a minimum of) quality is important
and that articles below a certain level do damage to a Wikipedia. The
small number of Wikipedians cannot cope with the work. They welcome
not just any content, but content that meets the likely interests of
their readers.

It seems to me that the first group is mainly populated by computer
specialists and native speakers of English. The second group consists
of language specialists and non-native speakers of English. But of
course there are many exceptions.

Kind regards
Ziko van Dijk





--
Ziko van Dijk
Niederlande

Re: Is Google translation is good for Wikipedias?
Just to be sure I understand... What's happening here is that human
beings, using a software tool, are translating articles from the
English Wikipedia into a variety of other languages and posting them
on the comparatively small Wikipedia projects in these languages. The
articles, of unknown intrinsic quality, are usually mid to low quality
translations.

In the projects with an active community, some have rejected these
articles because they are not high quality and because the community
refuses to be responsible for fixing punctuation and other errors made
by editors who are not members of the community. In the projects
without an active community, Wikimedians (who may not speak any of the
languages affected by the Google initiative) are objecting for a
variety of other reasons - because the software used to assist
translation isn't free, because the effort is managed by a commercial
organization or because the endeavor wasn't cleared with the Wikimedia
community first. Some are also concerned that these new articles will
somehow deter new editors from becoming involved, despite clear
evidence that a larger base of content attracts more readers, and more
readers plus imperfect content leads to more editors.

What I find interesting is that few seem to be interested in keeping
or improving the translated articles; Google's attempt to provide
content in under-served languages is actually offending Wikimedians,
despite our ostensible commitment to the same goal. Concerns like
bureaucratic pre-approval, using free software, etc. are somehow more
important than reaching more people with more content. It all seems
strange and un-Wikimedian-like to me. Obviously there are things
Google should have done differently. Maybe working with them to
improve their process should be the focus here?

Re: Is Google translation is good for Wikipedias?
2010/7/28 Nathan <nawrich@gmail.com>:
> Just to be sure I understand...

It's good that you ask, indeed. :-)

No, it's not about free software, and the Wikimedians are not too
snobby or lazy to correct poor language. That is what I frequently do
in de.WP and eo.WP, as I suppose Ragib and many others do as well. The
point is: the machine-translated articles are often so bad that I
simply don't understand them. I *cannot* correct them, because I don't
know what they are saying.

Kind regards
Ziko



What's happening here is that human
> beings, using a software tool, are translating articles from the
> English Wikipedia into a variety of other languages and posting them
> on the comparatively small Wikipedia projects in these languages. The
> articles, of unknown intrinsic quality, are usually mid to low quality
> translations.
>
> In the projects with an active community, some have rejected these
> articles because they are not high quality and because the community
> refuses to be responsible for fixing punctuation and other errors made
> by editors who are not members of the community. In the projects
> without an active community, Wikimedians (who may not speak any of the
> languages affected by the Google initiative) are objecting for a
> variety of other reasons - because the software used to assist
> translation isn't free, because the effort is managed by a commercial
> organization or because the endeavor wasn't cleared with the Wikimedia
> community first. Some are also concerned that these new articles will
> somehow deter new editors from becoming involved, despite clear
> evidence that a larger base of content attracts more readers, and more
> readers plus imperfect content leads to more editors.
>
> What I find interesting is that few seem to be interested in keeping
> or improving the translated articles; Google's attempt to provide
> content in under-served languages is actually offending Wikimedians,
> despite our ostensible commitment to the same goal. Concerns like
> bureaucratic pre-approval, using free software, etc. are somehow more
> important than reaching more people with more content. It all seems
> strange and un-Wikimedian like to me. Obviously there are things
> Google should have done differently. Maybe working with them to
> improve their process should be the focus here?



--
Ziko van Dijk
Niederlande

Re: Is Google translation is good for Wikipedias?
Consider the Malayalam sentence "വിക്കിപീഡിയ ഒരു നല്ല വിജ്ഞാനകോശം ആണ്",
which means "Wikipedia is a good encyclopedia". How can anyone understand
the result if a translator picks the meaning of each Malayalam word and
creates an English sentence like "wikipedia one good encyclopedia is"?
Now think about more complex sentences. The sentence structure of Indian
languages is completely different from that of English or other European
languages. Google's current attempt puts extra weight on tiny communities
by pushing them into complete rewriting (the easiest way out is deletion,
because some sentences do not make any sense at all). I am not against
machine translation, but Google must improve its tool or toolkit before
trying it on small Wikipedias.




On Sunday 25 July 2010 09:01 PM, Andreas Kolbe wrote:
> --- On Sun, 25/7/10, Fajro <faigos@gmail.com> wrote:
>
>> Machine translation is always unsuitable to produce usable
>> articles, but can
>> help to start new ones in smaller wikipedias.
>>
>
> I second that. About 50% of machine translation output is gibberish, or worse, plausible-sounding text that actually says the opposite of what the original said. To get it into readable form takes about as long as starting from scratch.
>
> Translation memory software only helps where content is repetitive.
>
> A.
Re: Is Google translation is good for Wikipedias?
On Wed, Jul 28, 2010 at 10:20 AM, praveenp <me.praveen@gmail.com> wrote:
> Consider the Malayalam sentence "വിക്കിപീഡിയ ഒരു നല്ല വിജ്ഞാനകോശം ആണ്",
> which means "Wikipedia is a good encyclopedia". How can anyone understand
> the result if a translator picks the meaning of each Malayalam word and
> creates an English sentence like "wikipedia one good encyclopedia is"?
> Now think about more complex sentences. The sentence structure of Indian
> languages is completely different from that of English or other European
> languages. Google's current attempt puts extra weight on tiny communities
> by pushing them into complete rewriting (the easiest way out is deletion,
> because some sentences do not make any sense at all). I am not against
> machine translation, but Google must improve its tool or toolkit before
> trying it on small Wikipedias.
>

Neither Google nor the WMF is creating articles automatically via machine
translation.
Google is not pushing translated articles.

The Toolkit is a page where you can see a (sometimes poor)
translation, and (if you want to) you can complete or fix it.

When you believe it is complete, you upload it to Wikipedia, just as
you would upload a fully manual translation once you consider it
complete.
