Mailing List Archive

Commons ZIP file upload for admins
Hello all,

for some types of resources, it's desirable to upload source files
(whether it's Blender, COLLADA, Scribus, EDL, or some other format),
so that others can more easily remix and process them. Currently, as
far as I know, there's no way to upload these resources to Commons.

What would be the arguments against allowing administrators to upload
arbitrary ZIP files on Wikimedia Commons, allowing the Commons
community to develop policy and process around when such archived
resources are appropriate? An alternative, of course, would be to
whitelist every possible source format for admins, but it seems to me
that it would be a good general policy to not enable additional
support for formats that aren't officially supported (reduces
confusion among users about what's permitted -- there's only one file
format they can't use).

Thoughts?

Thanks,
Erik

--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Commons ZIP file upload for admins [ In reply to ]
On 25.10.2010, 23:02 Erik wrote:

> Hello all,

> for some types of resources, it's desirable to upload source files
> (whether it's Blender, COLLADA, Scribus, EDL, or some other format),
> so that others can more easily remix and process them. Currently, as
> far as I know, there's no way to upload these resources to Commons.

> What would be the arguments against allowing administrators to upload
> arbitrary ZIP files on Wikimedia Commons, allowing the Commons
> community to develop policy and process around when such archived
> resources are appropriate? An alternative, of course, would be to
> whitelist every possible source format for admins, but it seems to me
> that it would be a good general policy to not enable additional
> support for formats that aren't officially supported (reduces
> confusion among users about what's permitted -- there's only one file
> format they can't use).

> Thoughts?

Instead of amassing social constructs around technical deficiency, I
propose to fix bug 24230 [1] by implementing proper checking for JAR
format. Also, we need to check all contents with antivirus and
disallow certain types of files inside archives (such as .exe). Once
we have taken all these precautions, I see no need to restrict ZIPs to
any special group. Of course, this doesn't mean that we should allow
all the safe ZIPs, just several open ZIP-based file formats.

-------------
[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=24230

--
Best regards,
Max Semenik ([[User:MaxSem]])


Re: Commons ZIP file upload for admins [ In reply to ]
On 10/25/2010 12:02 PM, Erik Moeller wrote:
> Hello all,
>
> for some types of resources, it's desirable to upload source files
> (whether it's Blender, COLLADA, Scribus, EDL, or some other format),
> so that others can more easily remix and process them. Currently, as
> far as I know, there's no way to upload these resources to Commons.
>
> What would be the arguments against allowing administrators to upload
> arbitrary ZIP files on Wikimedia Commons, allowing the Commons
> community to develop policy and process around when such archived
> resources are appropriate? An alternative, of course, would be to
> whitelist every possible source format for admins, but it seems to me
> that it would be a good general policy to not enable additional
> support for formats that aren't officially supported (reduces
> confusion among users about what's permitted -- there's only one file
> format they can't use).
>
> Thoughts?
>
> Thanks,
> Erik
>
>

Ideally we would actually support these formats, so we can do things
like thumbnails, basic metadata etc. Failing that, it's better to support
a given file extension than it is to support zip files. This way, if in
'the future' we add support for X file format, then we have X format
files stored consistently so we can support representation of that file
format.

If we add blanket support for 'throw whatever you want' into a zip file,
it will be difficult to give a quality representation of that asset in
the future. ( other than as a zip file with multiple sub assets ).

If for example someone writes a diff engine for representing 3d model
transformations, we won't as easily be able to plug-in that tool, if we
don't have a consistent storage model for that file format.

That being said, there may be some composite asset sets that lack
container systems, in which case it would not be bad to support some open
container format.

The number of formats or multimedia asset compositing systems that are
not web representable with JavaScript engines or natively supported in
the browser should be on a dramatic decline in the next decade, so best
to just focus on support for such formats.

For example, we prefer svg uploads to a zip file with Illustrator
assets, because svg is representable in the browser, there are
JavaScript-based engines for editing svg
[http://svg-edit.googlecode.com/svn/branches/2.4/editor/svg-editor.html]
etc. Likewise for 3d model representation with the COLLADA format
(although it is much more in its infancy at this point in time).

--michael


Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik <maxsem.wiki@gmail.com> wrote:
> Instead of amassing social constructs around technical deficiency, I
> propose to fix bug 24230 [1] by implementing proper checking for JAR
> format.

Does that bug even affect Wikimedia? We have uploads segregated on
their own domain, where we don't set cookies or do anything else
interesting, so what would an uploaded JAR file even do? If that kind
of attack is still a problem even with separate domains, we can do
like Mozilla's Bugzilla and serve each uploaded file from its own
unique domain (that would have ramifications for how browsers fetch
the images, but they might be positive anyway).

Re: Commons ZIP file upload for admins [ In reply to ]
Aryeh Gregor wrote:
> On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik <maxsem.wiki@gmail.com> wrote:
>> Instead of amassing social constructs around technical deficiency, I
>> propose to fix bug 24230 [1] by implementing proper checking for JAR
>> format.
>
> Does that bug even affect Wikimedia? We have uploads segregated on
> their own domain, where we don't set cookies or do anything else
> interesting, so what would an uploaded JAR file even do? If that kind
> of attack is still a problem even with separate domains, we can do
> like Mozilla's Bugzilla and serve each uploaded file from its own
> unique domain (that would have ramifications for how browsers fetch
> the images, but they might be positive anyway).

Well, the fact that an attacker would not be able to steal the cookies if
they could place a jar file there* doesn't mean a malicious applet there
isn't bad.

*Not sure if we can really assert that. Most likely it varies depending
on browser, JVM and version.

Doing a full ZIP exploration against java classes is simple. However, we
should check that everything there is clean, not that nothing there is
blacklisted.

Archive formats have their own can of worms. We don't want people to
upload an "OASIS file" that contains a videogame, even if it's not a jar
or a virus. How do we determine whether a file should be in the archive
or not? What do we do with archived archives?


Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor
<Simetrical+wikilist@gmail.com> wrote:
> On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik <maxsem.wiki@gmail.com> wrote:
>> Instead of amassing social constructs around technical deficiency, I
>> propose to fix bug 24230 [1] by implementing proper checking for JAR
>> format.
>
> Does that bug even affect Wikimedia?  We have uploads segregated on
> their own domain, where we don't set cookies or do anything else
> interesting, so what would an uploaded JAR file even do?
upload.wikimedia.org could end up on Google's Safe Browsing blacklist
for hosting malicious JARs which are injected on another pwned web site
or loaded through pwned advertising brokers. Given the fact that Java is
the 2nd biggest exploit vector in terms of exploits (but 1st in terms of
impact - users don't update Java as often as the Adobe Reader), it
should not be allowed to upload JARs (or things that look like something
else, but in fact can be loaded and executed by the JRE) to Wikipedia.

Marco
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Oct 25, 2010 at 10:51 PM, Marco Schuster
<marco@harddisk.is-a-geek.org> wrote:
> On Mon, Oct 25, 2010 at 10:09 PM, Aryeh Gregor
> <Simetrical+wikilist@gmail.com> wrote:
>> On Mon, Oct 25, 2010 at 3:50 PM, Max Semenik <maxsem.wiki@gmail.com> wrote:
>>> Instead of amassing social constructs around technical deficiency, I
>>> propose to fix bug 24230 [1] by implementing proper checking for JAR
>>> format.
>>
>> Does that bug even affect Wikimedia?  We have uploads segregated on
>> their own domain, where we don't set cookies or do anything else
>> interesting, so what would an uploaded JAR file even do?
> upload.wikimedia.org could end up on Google's Safe Browsing blacklist
> for hosting malicious JARs which are injected on another pwned web site
> or loaded through pwned advertising brokers. Given the fact that Java is
> the 2nd biggest exploit vector in terms of exploits (but 1st in terms of
> impact - users don't update Java as often as the Adobe Reader), it
> should not be allowed to upload JARs (or things that look like something
> else, but in fact can be loaded and executed by the JRE) to Wikipedia.

Should we also be exploring any possibly malicious archives inside
archives recursively, or is it enough to just make sure the archive
itself is good?

Re: Commons ZIP file upload for admins [ In reply to ]
Martijn Hoekstra wrote:
> Should we also be exploring any possibly malicious archives inside
> archives recursively, or is it enough to just make sure the archive
> itself is good?

I think that we should block such files.
Also note that we can't recursively analyse everything, since that would
allow someone to DoS us.
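
A minimal sketch of what "block such files" could look like, written in
Python rather than MediaWiki's PHP. It only reads the ZIP directory, flags
nested archives by extension, and applies size limits so that the check
itself stays cheap. The limits, the extension list and the function name
are assumptions made for illustration; nothing in this thread specifies
them.

```python
import io
import zipfile

# Hypothetical limits; the thread does not propose specific numbers.
MAX_MEMBERS = 1000
MAX_TOTAL_UNCOMPRESSED = 50 * 1024 * 1024  # 50 MiB

# Extensions treated as "archive inside an archive" for this sketch only.
NESTED_ARCHIVE_EXTENSIONS = (".zip", ".jar", ".tar", ".gz", ".bz2", ".7z", ".rar")


def check_archive(data):
    """Return a list of reasons to reject the upload (empty if none found)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_MEMBERS:
            problems.append("too many members")
        # Sizes are the ones declared in the ZIP directory; nothing is
        # decompressed here. The uploader controls these values, so a real
        # check would also cap the bytes actually read later on.
        total = sum(info.file_size for info in infos)
        if total > MAX_TOTAL_UNCOMPRESSED:
            problems.append("declared uncompressed size too large")
        for info in infos:
            # Nested archives are rejected outright instead of being
            # unpacked recursively.
            if info.filename.lower().endswith(NESTED_ARCHIVE_EXTENSIONS):
                problems.append("nested archive: " + info.filename)
    return problems
```

Rejecting by extension does not catch an archive hidden under another
name; that would need a content-based type check on each member, which is
outside this sketch.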


Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Oct 25, 2010 at 1:05 PM, Michael Dale <mdale@wikimedia.org> wrote:

> Ideally we would actually support these formats, so we can do things
> like thumbnails, basic metadata etc. Failing that, it's better to support
> a given file extension than it is to support zip files. This way, if in
> 'the future' we add support for X file format, then we have X format
> files stored consistently so we can support representation of that file
> format.
>
> If we add blanket support for 'throw whatever you want' into a zip file,
> it will be difficult to give a quality representation of that asset in
> the future. ( other than as a zip file with multiple sub assets ).
>

I tend to agree that it's preferable to be able to recognize and validate
formats; though as noted sometimes you're going to have stuff that doesn't
really fit well in an individual file.

Certainly for Wikibooks I could envision *all sorts* of totally legitimate
use for being able to upload/download various files, including archives. The
Blender handbook could use example files and projects to download, which
might include dozens of support files. A programming module might need to
provide source code and sample input files.

Then we have the 'media source file' case: an animation should be able to
include the Blender or POV-Ray or whatever sources that were used to create
it. A pretty picture built in a layered raster system like GIMP or Photoshop
would do better to include the source .xcf or .psd than not to, even if the
source file is in a format that's harder to work with.

I believe we've got an old bug on the idea of being able to explicitly attach a
source file:
https://bugzilla.wikimedia.org/show_bug.cgi?id=17012


In all cases we have the worry that if we allow uploading those funky
formats, we'll either a) end up with malicious files or b) end up with lazy
people using and uploading non-free editing formats when we'd prefer them to
use freely editable formats. I'm not sure I like the idea of using admin
powers to control being able to upload those, though; bottlenecking content
reviews as a strict requirement can be problematic on its own.

What I'd probably like to see is a more wide-open allowal of arbitrary
'source files' which can be uploaded as attachments to standalone files. We
could give them more limited access: download only, no inline viewing, only
allowed if DLs are on separate safe domain, etc.

I don't really relish the thought of checking image source data for warez
archives, though. :) Can't guarantee a magic solution there.

-- brion
Re: Commons ZIP file upload for admins [ In reply to ]
2010/10/25 Brion Vibber <brion@pobox.com>:
> In all cases we have the worry that if we allow uploading those funky
> formats, we'll either a) end up with malicious files or b) end up with lazy
> people using and uploading non-free editing formats when we'd prefer them to
> use freely editable formats. I'm not sure I like the idea of using admin
> powers to control being able to upload those, though; bottlenecking content
> reviews as a strict requirement can be problematic on its own.

Yeah, I don't like the bottleneck approach either, but in the absence
of better systems, it may be the best way to go as an immediate
solution. We could do it for a list of whitelisted open formats that
are requested by the community. And we'd see from usage which file
types we need to prioritize proper support/security checks for.

> What I'd probably like to see is a more wide-open allowal of arbitrary
> 'source files' which can be uploaded as attachments to standalone files. We
> could give them more limited access: download only, no inline viewing, only
> allowed if DLs are on separate safe domain, etc.

It seems fairly straightforward to me to say: "These free file formats
are permitted to be uploaded. We haven't developed fully sophisticated
security checks for them yet, so we're asking trusted users to do
basic sanity checks until we've developed automatic checks." We can
then prod people to convert any proprietary formats into free ones
that are on that whitelist. And if they're free formats, I'm not sure
why they shouldn't be first-class citizens -- as Michael mentioned,
that makes it possible to plop in custom handlers at a later time. A
COLLADA handler for 3D files may seem like a remote possibility, but
it's certainly within the realm of sanity. ZIP files would have to be
specially treated so they're only allowed if they contain only files
in permitted formats.

So, consistent with Michael's suggestion, we could define a
'restricted-upload' right, initially given to admins only but possibly
expanded to other users, which would allow files from the "potentially
insecure" list of extensions to be uploaded, and for ZIP files, would
ensure that only accepted file types are contained within the archive.
The resultant review bottleneck would simply be a reflection that we
haven't gotten around to adding proper support for these file types
yet. On the plus side, we could add restricted upload support for new
open formats as soon as there's consensus to do so.

The main downside I would see is that users might end up being
confused why these files get uploaded. To mitigate this, we could add
a "This file has a restricted filetype. Files of this type can
currently only be uploaded by administrators for security reasons"
note on file description pages.
--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

Re: Commons ZIP file upload for admins [ In reply to ]
@2010-10-26 03:45, Erik Moeller:
> 2010/10/25 Brion Vibber<brion@pobox.com>:
>> In all cases we have the worry that if we allow uploading those funky
>> formats, we'll either a) end up with malicious files or b) end up with lazy
>> people using and uploading non-free editing formats when we'd prefer them to
>> use freely editable formats. I'm not sure I like the idea of using admin
>> powers to control being able to upload those, though; bottlenecking content
>> reviews as a strict requirement can be problematic on its own.
> Yeah, I don't like the bottleneck approach either, but in the absence
> of better systems, it may be the best way to go as an immediate
> solution. We could do it for a list of whitelisted open formats that
> are requested by the community. And we'd see from usage which file
> types we need to prioritize proper support/security checks for.
>
>> What I'd probably like to see is a more wide-open allowal of arbitrary
>> 'source files' which can be uploaded as attachments to standalone files. We
>> could give them more limited access: download only, no inline viewing, only
>> allowed if DLs are on separate safe domain, etc.
> It seems fairly straightforward to me to say: "These free file formats
> are permitted to be uploaded. We haven't developed fully sophisticated
> security checks for them yet, so we're asking trusted users to do
> basic sanity checks until we've developed automatic checks." We can
> then prod people to convert any proprietary formats into free ones
> that are on that whitelist. And if they're free formats, I'm not sure
> why they shouldn't be first-class citizens -- as Michael mentioned,
> that makes it possible to plop in custom handlers at a later time. A
> COLLADA handler for 3D files may seem like a remote possibility, but
> it's certainly within the realm of sanity. ZIP files would have to be
> specially treated so they're only allowed if they contain only files
> in permitted formats.
>
> So, consistent with Michael's suggestion, we could define a
> 'restricted-upload' right, initially given to admins only but possibly
> expanded to other users, which would allow files from the "potentially
> insecure" list of extensions to be uploaded, and for ZIP files, would
> ensure that only accepted file types are contained within the archive.
> The resultant review bottleneck would simply be a reflection that we
> haven't gotten around to adding proper support for these file types
> yet. On the plus side, we could add restricted upload support for new
> open formats as soon as there's consensus to do so.
>
> The main downside I would see is that users might end up being
> confused why these files get uploaded. To mitigate this, we could add
> a "This file has a restricted filetype. Files of this type can
> currently only be uploaded by administrators for security reasons"
> note on file description pages.

ODS, ODT and such should be fairly easy to check, at least on a basic
level. A very basic check would be to check if it contains a "Basic" or
"Scripts" folder. A bit more advanced would be to check if manifest.xml
contains "application/binary" (to check if anyone tried to change the
default naming) and check if any file contains "<script:module" (for the
same reason).
If any of these is true, then there should be a warning.
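
The three checks above could be sketched roughly as follows, in Python
rather than MediaWiki's PHP. The function name and the warning strings are
made up for illustration, and the sketch assumes the manifest sits at its
usual OpenDocument location, META-INF/manifest.xml. It reads every member
in full, so a real version would need size limits.

```python
import io
import zipfile

# Folder names taken from the checks described above.
SUSPICIOUS_DIRS = ("Basic/", "Scripts/")


def opendocument_warnings(data):
    """Return a list of warnings for an ODF package (hypothetical helper)."""
    warnings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()

        # Check 1: macro folders present under their default names.
        for name in names:
            if name.startswith(SUSPICIOUS_DIRS):
                warnings.append("macro folder entry: " + name)

        # Check 2: the manifest declares binary content, which may point to
        # macro storage kept under a non-default name.
        if "META-INF/manifest.xml" in names:
            manifest = zf.read("META-INF/manifest.xml")
            if b"application/binary" in manifest:
                warnings.append("manifest declares application/binary")

        # Check 3: script module markup anywhere in the package.
        for name in names:
            if name.endswith("/"):
                continue  # directory entry, nothing to read
            if b"<script:module" in zf.read(name):
                warnings.append("script module markup in " + name)
    return warnings
```

An empty list only means none of these three heuristics fired; it is not
a statement that the file is safe.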

I think we should also support Dia for diagrams and XCF for layered
bitmaps. Don't know much about XCF, but Dia is a simple XML file (which
might be zipped) and so shouldn't be dangerous at all. I guess it could
even be unzipped upon loading because Dia supports both zipped and
unzipped versions alike. There is/was also Extension:Dia which generates
thumbnails... It seems to work fine even with 1.16 from the trunk and
the latest Dia version. It doesn't work with zipped Dia files but this
would be manageable.

Regards,
Nux.
Re: Commons ZIP file upload for admins [ In reply to ]
On Tue, Oct 26, 2010 at 6:50 AM, Max Semenik <maxsem.wiki@gmail.com> wrote:
> Instead of amassing social constructs around technical deficiency, I
> propose to fix bug 24230 [1] by implementing proper checking for JAR
> format. Also, we need to check all contents with antivirus and
> disallow certain types of files inside archives (such as .exe). Once
> we have taken all these precautions, I see no need to restrict ZIPs to
> any special group. Of course, this doesn't mean that we should allow
> all the safe ZIPs, just several open ZIP-based file formats.

If we only want zips for several formats, we should check that they
are of the expected type, _and_ that they consist of open file formats
within the zip.

e.g. Office Open XML (the MS format) can include binary files for OLE
objects and fonts (I think)

see "Table 2. Content types in a ZIP container"

http://msdn.microsoft.com/en-us/library/aa338205(office.12).aspx

OOXML can also include any other mimetype, which are registered
_within_ the zip, and linked into the main content file.

afaics, allowing only safe zips to be uploaded isn't difficult.

Expand the zip, and reject any zip which contains files that are on
$wgFileBlacklist or that are not on $wgFileExtensions + $wgZipFileExtensions.

$wgZipFileExtensions would consist of array('xml')

Then check the mimetypes of the files in the zip, against
$wgMimeTypeBlacklist (with 'application/zip' removed), again allowing
desired XML mimetypes through.
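
A rough sketch of the extension step, in Python rather than MediaWiki's
PHP. The three sets stand in for $wgFileBlacklist, $wgFileExtensions and
the proposed $wgZipFileExtensions; their contents here are illustrative
assumptions, not the real configuration, and the helper names are made up.
The MIME type step is left out.

```python
import io
import zipfile

# Stand-ins for $wgFileBlacklist, $wgFileExtensions and the proposed
# $wgZipFileExtensions. The values are examples only.
FILE_BLACKLIST = {"exe", "php", "js", "jar", "class"}
FILE_EXTENSIONS = {"png", "jpg", "svg", "ogg"}
ZIP_FILE_EXTENSIONS = {"xml"}


def extension(name):
    """Lower-cased extension of an archive member, or '' if it has none."""
    base = name.rsplit("/", 1)[-1]
    return base.rsplit(".", 1)[-1].lower() if "." in base else ""


def rejected_members(data):
    """Return the members that would cause the whole zip to be rejected."""
    allowed = FILE_EXTENSIONS | ZIP_FILE_EXTENSIONS
    bad = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry
            ext = extension(name)
            # Reject anything blacklisted and anything not explicitly allowed.
            if ext in FILE_BLACKLIST or ext not in allowed:
                bad.append(name)
    return bad
```

With these example sets, a zip holding content.xml and image.png passes,
while one that also holds evil.exe or notes.txt is rejected. Members
without any extension are rejected too, which a real implementation would
have to handle for formats that rely on them.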

--
John Vandenberg

Re: Commons ZIP file upload for admins [ In reply to ]
[Kicking this thread back to life, full-quoting below only for quick reference.]

I've collected some additional notes on this here:
http://commons.wikimedia.org/wiki/Commons:Restricted_uploads

Would appreciate feedback & will circulate further in the Commons community.

Thanks,
Erik


2010/10/25 Erik Moeller <erik@wikimedia.org>:
> 2010/10/25 Brion Vibber <brion@pobox.com>:
>> In all cases we have the worry that if we allow uploading those funky
>> formats, we'll either a) end up with malicious files or b) end up with lazy
>> people using and uploading non-free editing formats when we'd prefer them to
>> use freely editable formats. I'm not sure I like the idea of using admin
>> powers to control being able to upload those, though; bottlenecking content
>> reviews as a strict requirement can be problematic on its own.
>
> Yeah, I don't like the bottleneck approach either, but in the absence
> of better systems, it may be the best way to go as an immediate
> solution. We could do it for a list of whitelisted open formats that
> are requested by the community. And we'd see from usage which file
> types we need to prioritize proper support/security checks for.
>
>> What I'd probably like to see is a more wide-open allowal of arbitrary
>> 'source files' which can be uploaded as attachments to standalone files. We
>> could give them more limited access: download only, no inline viewing, only
>> allowed if DLs are on separate safe domain, etc.
>
> It seems fairly straightforward to me to say: "These free file formats
> are permitted to be uploaded. We haven't developed fully sophisticated
> security checks for them yet, so we're asking trusted users to do
> basic sanity checks until we've developed automatic checks." We can
> then prod people to convert any proprietary formats into free ones
> that are on that whitelist. And if they're free formats, I'm not sure
> why they shouldn't be first-class citizens -- as Michael mentioned,
> that makes it possible to plop in custom handlers at a later time. A
> COLLADA handler for 3D files may seem like a remote possibility, but
> it's certainly within the realm of sanity. ZIP files would have to be
> specially treated so they're only allowed if they contain only files
> in permitted formats.
>
> So, consistent with Michael's suggestion, we could define a
> 'restricted-upload' right, initially given to admins only but possibly
> expanded to other users, which would allow files from the "potentially
> insecure" list of extensions to be uploaded, and for ZIP files, would
> ensure that only accepted file types are contained within the archive.
> The resultant review bottleneck would simply be a reflection that we
> haven't gotten around to adding proper support for these file types
> yet. On the plus side, we could add restricted upload support for new
> open formats as soon as there's consensus to do so.
>
> The main downside I would see is that users might end up being
> confused why these files get uploaded. To mitigate this, we could add
> a "This file has a restricted filetype. Files of this type can
> currently only be uploaded by administrators for security reasons"
> note on file description pages.



--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

Re: Commons ZIP file upload for admins [ In reply to ]
Erik Moeller wrote:
> I've collected some additional notes on this here:
> http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
>
> Would appreciate feedback & will circulate further in the Commons community.

From a social and technical perspective, this proposal is horribly hackish.
The over-arching goal should be to implement fewer hacks, though we
obviously don't live in an ideal world.

Given the current parameters, this is probably the best solution. However,
there needs to be a more in-depth analysis of the potential security
implications of some of these file types. Even trusted users shouldn't be
able to upload files that allow for the arbitrary injection of PHP, for
example. I suppose that's why you're asking for more feedback from
wikitech-l.

The current proposal is vague about which specific file types are desired. A
concrete list ought to be generated so that people can research the known
security implications of allowing those file types to be uploaded.

I don't think there is ever going to be (or ever should be) a generic
whitelist to allow any and all free/open file types. What are the specific
file types that are currently banned that you're seeking to have partially
unbanned?

MZMcBride



Re: Commons ZIP file upload for admins [ In reply to ]
On Thu, Nov 25, 2010 at 12:46 AM, Erik Moeller <erik@wikimedia.org> wrote:
> [Kicking this thread back to life, full-quoting below only for quick reference.]
>
> I've collected some additional notes on this here:
> http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
>
> Would appreciate feedback & will circulate further in the Commons community.
>

I think you are taking the wrong approach here, although I agree with
MZMcBride's reply to your mail "From a social and technical
perspective, this proposal is horribly hackish. [...] Given the
current parameters, this is probably the best solution. [...]"

I believe that we should really be aiming to scan for security
vulnerabilities and reject only those files that pose a vulnerability.
For example, we now outright reject OpenOffice files, as they may
encapsulate files that will be executed by the JVM. We should be able
to determine the exact circumstances that pose a vulnerability and
only reject those files, similar to what we have done for the embedded
HTML in files that affects IE.


Bryan

Re: Commons ZIP file upload for admins [ In reply to ]
On 25 November 2010 07:58, Bryan Tong Minh <bryan.tongminh@gmail.com> wrote:

> I think you are taking the wrong approach here, although I agree with
> MZMcBride's reply to your mail "From a social and technical
> perspective, this proposal is horribly hackish. [...] Given the
> current parameters, this is probably the best solution. [...]"


The rock and hard place here are:

1. This solution is horribly hacky and bletcherous.
2. The ideal is the enemy of the actually adequate; at present things
are not adequate.

Do we have a clear picture of what the ideal looks like? Are the hacks
clearly on the path to that, and sure not to obstruct it in any way?


- d.

Re: Commons ZIP file upload for admins [ In reply to ]
Erik Moeller wrote:
> [Kicking this thread back to life, full-quoting below only for quick reference.]
>
> I've collected some additional notes on this here:
> http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
>
> Would appreciate feedback & will circulate further in the Commons community.
>
> Thanks,
> Erik

How do you expect the end users to send the files? Uploading to a service like
megaupload? As email attachments? Via OTRS? Using a toolserver app?

Seems a use case for the upload stash. Allow the users to upload the
file, but require approval before it is finally publicly shown.
We could even show the files publicly, as long as there's no direct
download, requiring downloaders to provide a session token in the process.

In any case, files treated as html by IE would still need to be disallowed.


Re: Commons ZIP file upload for admins [ In reply to ]
> Message: 5
> Date: Wed, 24 Nov 2010 15:46:24 -0800
> From: Erik Moeller <erik@wikimedia.org>
> Subject: Re: [Wikitech-l] Commons ZIP file upload for admins
> To: Wikimedia developers <wikitech-l@lists.wikimedia.org>
> Message-ID:
>       <AANLkTimD7kXngs4azgPanR_84Ok_th9T1DsANc7stkSh@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> [Kicking this thread back to life, full-quoting below only for quick reference.]
>
> I've collected some additional notes on this here:
> http://commons.wikimedia.org/wiki/Commons:Restricted_uploads
>
> Would appreciate feedback & will circulate further in the Commons community.
>
> Thanks,
> Erik

Personally I think it would be nicer if you could associate source
files with the final files.
Something like:
* A user uploads a JPEG of a 3D image (or whatever).
* On the image description page for the JPEG, there is an "upload
  source file" link.
* Users (who have appropriate permissions) can upload the associated
  source files with this link.
* These source files might appear as a subpage of the primary
  image/document/media, or they might just appear in list form at the
  bottom of the image description page of the main image/media. Either
  way, the source files would be associated with a single "main" file.

Doing it this way would limit the feature to source files of actually
uploaded files (so less random cruft lying around, no orphaned source
files, less chance of people abusing the feature to get around file
type restrictions). I also personally don't like the idea of uploading
archives. Instead, I think it would be better just to upload all the
source files needed (although that might fall apart if you're
uploading source files for something very complex which has many
source files in a specific directory structure). There could also be a
"download all" option where all the source files get tarred together on
the server side for an easy download.
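
A rough sketch of what a server-side "download all" bundle could look like,
using Python's standard tarfile module. The function names bundle_sources
and bundle_names and the in-memory dict of files are assumptions for
illustration only; a real implementation would stream from file storage
rather than hold everything in memory.

```python
import io
import tarfile


def bundle_sources(files):
    """Pack {path: bytes} into a gzip-compressed tar archive and return its bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must be set before adding the member
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def bundle_names(bundle):
    """List the member paths of a bundle produced by bundle_sources."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as tar:
        return tar.getnames()
```

Because the member paths are kept as given, a directory structure such as
"scene/textures/wood.png" survives the round trip.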

-bawolff

Re: Commons ZIP file upload for admins [ In reply to ]
2010/11/25 bawolff <bawolff+wn@gmail.com>:
> Personally I think it would be nicer if you could associate source
> files with the final files.

Yeah, this was discussed a bit earlier in this thread. As far as I can
tell, that approach adds a fair degree of complexity (requirement of
tracking a whole new class of files in association with other files,
including versioning, deletion, etc.). It also seems to presume that
you'd never want to reference those same files using standard
MediaWiki links. It's not clear to me that such a system has clear
advantages over using normal wiki-links to source files from
appropriate places.

Stepping back a bit, I did a bit more research over the weekend as to
the current state of sourcing in Wikimedia Commons, and which file
types would be the most important to support.

Generally speaking, there's an existing (albeit limited) practice of
adding sources that can be represented as simple plain-text files,
such as POV-Ray, Gnuplot, etc. Sometimes these are formatted using the
syntax-highlighting extension, sometimes not. This practice could be
made more formal by directly requesting that users add source data
when they specify that a file has been created using one of these
applications (which is often identified using "Created with"
templates). But I don't necessarily see that any additional software
support is needed for these formats, save perhaps easier
downloadability, which could be added to the syntax-highlighting
extension.

For binary formats (and perhaps complex XML-based formats), the
following stand out as being of high significance:

* .blend as Blender's native export format and COLLADA as an open
interchange format
* .xcf as Gimp's native format (preserving layers and other
meta-information for bitmap images)
* .scribus as Scribus' native format (XML, but files can get very
large + have dependencies)
* .odt, .odp, .od as OpenDocument formats
* potentially OpenEXR and some other open interchange formats.

As far as I understand the pure security (as opposed to content)
concerns, these fall primarily into these categories:

* client-side execution of unsafe formats using designated
applications (embedded macros, references to other malicious content
etc.)
* exploitation of browser in-line display for purposes of XSS attacks or similar

Let me know if I'm missing a large category. I'm assuming server-side
execution is not an issue for Wikimedia given correct server
configuration.

Full security for these and other conceivably useful binary formats
seems difficult to obtain to me (that is, making sure that nothing bad
ever runs on a user's computer if they open a file). The restricted
upload (or restricted attachment) approach builds on social trust to
complement technical verification methods. We'd still have to invent
some additional machinery to implement security warnings before ever
exposing such files directly to the user.

Sacrificing easy individual file manageability, I wonder if it
wouldn't be most straightforward to write a decent ZIP handler (with
directory display, and thumbnailing of included images, for purposes
of patrolling), to disallow ZIP files that contain non-whitelisted
filetypes, and to use ZIPs as the container for all complex,
free-format source uploads. [[File:Bla source.zip]] could then just be
referenced as part of the file description pages where relevant.
Because some of the aforementioned binary formats are effectively
archives, some of this work would likely be necessary anyway.
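
As a minimal sketch of the "disallow ZIP files that contain non-whitelisted
filetypes" step, assuming Python's standard zipfile module: the whitelist
below is hypothetical (the thread does not settle on one), and
disallowed_members and build_zip are illustrative names. Note that this only
reflects how Python's ZIP reader sees the archive; other readers may
interpret the same bytes differently.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical whitelist for illustration; not an agreed-upon list.
ALLOWED_EXTENSIONS = {".blend", ".dae", ".xcf", ".odt", ".odp", ".png", ".jpg", ".txt"}


def disallowed_members(zip_bytes, allowed=ALLOWED_EXTENSIONS):
    """Return the names of archive members whose extension is not whitelisted."""
    rejected = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            extension = PurePosixPath(info.filename).suffix.lower()
            if extension not in allowed:
                rejected.append(info.filename)
    return rejected


def build_zip(members):
    """Helper for experiments: build a ZIP in memory from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

An upload would be refused whenever disallowed_members returns a non-empty
list; the same directory walk could feed the listing shown to patrollers.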

That said, I'm not wedded to any particular approach. I hope we can
identify reasonably simple steps that we can take to significantly
expand our support for source files in the near term, because such
files are essential for re-use.
--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

Re: Commons ZIP file upload for admins [ In reply to ]
2010/11/29 Erik Moeller <erik@wikimedia.org>:
> As far as I understand the pure security (as opposed to content)
> concerns, these fall primarily into these categories:
>
> * client-side execution of unsafe formats using designated
> applications (embedded macros, references to other malicious content
> etc.)
> * exploitation of browser in-line display for purposes of XSS attacks or similar
>
> Let me know if I'm missing a large category. I'm assuming server-side
> execution is not an issue for Wikimedia given correct server
> configuration.
>
Server side execution is not an issue, no.

The client-side issues can all be reduced to a file acting as type A
to MediaWiki and as type B to the victim, where A is some harmless
file type we'd like to allow users to upload and B is some potentially
dangerous file type. This is usually enabled by one or more of the
following factors:
* IE second-guesses the server-provided MIME type in favor of its own
brain-dead MIME type detection algorithm, which in particular is
extremely eager to treat things as HTML (causing any embedded JS to be
executed): the presence of certain HTML tags or tag-like strings in
the first 255 bytes is sufficient reason for IE to call something HTML
* File formats are often interpreted flexibly, so a file that doesn't
conform to the standard completely may be read just fine by most
applications. These flexibilities allow for creating a file that looks
like an A but also comes close enough to being a B. For example,
running an HTML page containing unified diff text in the middle
through patch(1) will usually work, because patch(1) discards
"garbage" before and after the diff. These flexibilities are usually
undocumented and vary between applications, so it can be difficult to
predict whether a file qualifies as "almost a B"
* Some file formats are designed in such a way that a file can
actually be a completely valid A *and* a completely valid B all at the
same time. This is the case for most ZIP and ZIP-like formats
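
To make the first bullet concrete, here is a rough sketch of a check for
tag-like strings in the first 255 bytes of an upload. The marker list is a
hypothetical, incomplete sample and is not IE's actual list; the function
name looks_like_html_to_sniffer is likewise made up for illustration.

```python
# Hypothetical sample of tag-like markers; a sniffing browser's real list differs.
SUSPICIOUS_MARKERS = (b"<html", b"<script", b"<body", b"<head", b"<title", b"<a href")


def looks_like_html_to_sniffer(data, window=255):
    """Return True if the leading bytes contain something a sniffer might call HTML."""
    head = data[:window].lower()  # case-insensitive match on the leading window only
    return any(marker in head for marker in SUSPICIOUS_MARKERS)
```

A file that starts with a valid image signature can still trip such a check
if tag-like text appears early enough, which is exactly the "type A to
MediaWiki, type B to the victim" situation.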

To illustrate the last sentence of the second bullet point, I'll quote
Tim's blog post on upload security [1] (which is a fun read for anyone
even mildly interested in the topic). It's part of the section on the
GIFAR vulnerability, which involves a file that's a valid GIF or ZIP
file, but which Java happily executes as a JAR (a ZIP-like format for
executable Java bytecode) file because Java's JAR format validation is
extremely lax, almost nonexistent. The only validation it does do is
check for a certain magic number at the end of the file, so rejecting

"An alternative [to rejecting all ZIP files] would be to parse the
entire zip directory and to reject any archives that contain a file
with a .class extension. I can’t vouch for this method. **If you did
this, the zip library you used would have to be exactly as tolerant of
zip format errors as the one used by Java.** It would probably be best
to actually shell out to Java to do the test."

(emphasis mine)
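
A small experiment illustrating the tolerance problem, assuming Python's
standard zipfile module as the stand-in ZIP reader: bytes that begin with a
GIF signature can still be opened as a ZIP, because this reader locates the
archive from the end of the file. The prefix here is only a GIF signature
plus padding, not a complete image, and the function names are hypothetical.
This says nothing about what Java's reader accepts; it only shows that one
particular library is lenient about leading data.

```python
import io
import zipfile

# GIF signature followed by arbitrary padding; not a complete, renderable GIF.
GIF_PREFIX = b"GIF89a" + b"\x00" * 16


def make_prefixed_zip(member_name, member_data):
    """Build a ZIP in memory and prepend GIF-looking bytes to it."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(member_name, member_data)
    return GIF_PREFIX + buffer.getvalue()


def member_names(blob):
    """List archive members as Python's zipfile sees them."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.namelist()
```

A check that only inspects the first bytes would classify such a blob as a
GIF, while a ZIP reader still finds the archive inside it. Two ZIP readers
with different tolerances could likewise disagree about what the archive
contains.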

Roan Kattouw (Catrope)

[1] http://tstarling.com/blog/2008/12/secure-web-uploads/

Re: Commons ZIP file upload for admins [ In reply to ]
Roan Kattouw wrote:
> "An alternative [to rejecting all ZIP files] would be to parse the
> entire zip directory and to reject any archives that contain a file
> with a .class extension. I can’t vouch for this method. **If you did
> this, the zip library you used would have to be exactly as tolerant of
> zip format errors as the one used by Java.** It would probably be best
> to actually shell out to Java to do the test."
>
> (emphasis mine)

If we consider the performance cost of parsing full zip files acceptable
(as opposed to parsing just 512 bytes or the central directory), we can
quite easily accept many zip files.

There's also the issue of the jar: protocol, but that seems to have been
fixed as of Firefox 2.0.0.10, so it's probably not worth taking into account.
http://kb.mozillazine.org/Network.jar.open-unsafe-types


Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Nov 29, 2010 at 9:29 PM, Roan Kattouw <roan.kattouw@gmail.com> wrote:
> "An alternative [to rejecting all ZIP files] would be to parse the
> entire zip directory and to reject any archives that contain a file
> with a .class extension. I can’t vouch for this method. **If you did
> this, the zip library you used would have to be exactly as tolerant of
> zip format errors as the one used by Java.** It would probably be best
> to actually shell out to Java to do the test."
>

I was thinking about this. There appears to be no option to the java
command line client to only check a file without executing. An option
would be to invoke the java debugger (jdb), which initially breaks at
the first instruction and presumably fails if the file is not a valid
jar. Still sounds nasty though, plus the fact that jdb is not a
generally installed program.


Bryan

Re: Commons ZIP file upload for admins [ In reply to ]
Bryan Tong Minh wrote:
> On Mon, Nov 29, 2010 at 9:29 PM, Roan Kattouw <roan.kattouw@gmail.com> wrote:
>> "An alternative [to rejecting all ZIP files] would be to parse the
>> entire zip directory and to reject any archives that contain a file
>> with a .class extension. I can’t vouch for this method. **If you did
>> this, the zip library you used would have to be exactly as tolerant of
>> zip format errors as the one used by Java.** It would probably be best
>> to actually shell out to Java to do the test."
>>
>
> I was thinking about this. There appears to be no option to the java
> command line client to only check a file without executing. An option
> would be to invoke the java debugger (jdb), which initially breaks at
> the first instruction and presumably fails if the file is not a valid
> jar. Still sounds nasty though, plus the fact that jdb is not a
> generally installed program.
>
>
> Bryan

Note that you can't simply check (or reverse-engineer) that JVM X
doesn't treat it as a jar, since it could still be accepted by X-1 or
X+1. So we would need to check against a range of JVMs that are still
in use.


Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Nov 29, 2010 at 11:10 PM, Platonides <Platonides@gmail.com> wrote:
> Note that you can't simply check (or reverse-engineer) that JVM X
> doesn't treat it as a jar, since it could be detected in X-1 or X+1.
> So there should be a range of still in use JVMs to assert.

I run my own IT support company, and I've seen both private and
company clients running three-year-old Java and Flash versions. Of
course, the machines had a load of malware on them (which was the
reason I got called). The problem is that a lot of users out there are
confused by the update messages, or by the Windows UAC prompt that
launches with every update: they get a LOT of lookalike messages from
sites like kino.to and can no longer tell what is real and what is
not.

Securing against only the "most in use JVM/PDF/Flash/whatever" version
is pointless, as you have to cover around three years of version
history, if not more. For OpenOffice clients it's even worse, as some
companies introduce their own private patch sets. I haven't seen this
myself so far, but I've never been at the really big companies where
it is actually likely to happen.

Marco


--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de

Re: Commons ZIP file upload for admins [ In reply to ]
On Mon, Nov 29, 2010 at 11:10 PM, Platonides <Platonides@gmail.com> wrote:
> Bryan Tong Minh wrote:
>> On Mon, Nov 29, 2010 at 9:29 PM, Roan Kattouw <roan.kattouw@gmail.com> wrote:
>>> "An alternative [to rejecting all ZIP files] would be to parse the
>>> entire zip directory and to reject any archives that contain a file
>>> with a .class extension. I can’t vouch for this method. **If you did
>>> this, the zip library you used would have to be exactly as tolerant of
>>> zip format errors as the one used by Java.** It would probably be best
>>> to actually shell out to Java to do the test."
>>>
>>
>> I was thinking about this. There appears to be no option to the java
>> command line client to only check a file without executing. An option
>> would be to invoke the java debugger (jdb), which initially breaks at
>> the first instruction and presumably fails if the file is not a valid
>> jar. Still sounds nasty though, plus the fact that jdb is not a
>> generally installed program.
>>
>>
>> Bryan
>
> Note that you can't simply check (or reverse-engineer) that JVM X
> doesn't treat it as a jar, since it could be detected in X-1 or X+1.
> So there should be a range of still in use JVMs to assert.
>
I think that the most recent version should be sufficient. I don't
think Java would break backwards compatibility: users wouldn't be
happy if their old jar suddenly stops working on a new JVM.


Bryan

