Mailing List Archive

Dealing with a really large number of files in a single directory
I've got a directory on an NFS server. I think it's been filling up with
log files, and may now hold in excess of a million.
(ls -d shows me 350MB rather than the usual 4k).

strace -fTt ls -1 -f is stalling on the 'getdents' system call. I assume it'll
complete eventually, but ... well, 'leave it and wait' hasn't worked so far.

Does anyone have any good tricks on a filer for helping me a) confirm that
this directory is in fact absurdly large, and b) start retrieving
filenames so I can delete/move them in sensibly sized batches?

(I'm wondering if a filer-side ls is going to help or hinder, for example).
Re: Dealing with a really large number of files in a single directory [ In reply to ]
Hi Edward,

BTDT many times unfortunately.
The volume option "maxdirsize" is what usually limits you when creating directories of that size, so what is maxdirsize set to on your volume? You can check it with `vol options` if that's a 7-Mode filer.

In order to delete this directory, you have only a few sane choices.
One that usually works well is to rsync an empty directory over it, e.g.:

/nfs/reallybigdir <- this is the victim

mkdir /nfs/emptydir
rsync -a --delete /nfs/emptydir/ /nfs/reallybigdir/

Let this run for a while, and make sure that you don't omit the trailing slashes on the directories.
That, of course, is only valid if you want to get rid of all the files in the directory. If you want to keep some of them and need to find out the names of the files, you could try `find /nfs/reallybigdir -type f -print` and redirect that output to a file, but I suppose it will also take very long.
You can check whether this command would actually work by limiting its output to, say, 10 files, by running:
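Before pointing rsync at the real mount, you can rehearse the empty-directory trick on throwaway local paths (the /tmp directory names below are made up for illustration):

```shell
# Rehearse the empty-directory rsync trick on throwaway local paths
# before aiming it at the real NFS mount (these /tmp paths are examples).
mkdir -p /tmp/rsync-demo/victim /tmp/rsync-demo/empty
touch /tmp/rsync-demo/victim/log.{1..100}    # stand-ins for the log files
# Trailing slashes matter: sync the *contents* of empty over victim.
rsync -a --delete /tmp/rsync-demo/empty/ /tmp/rsync-demo/victim/
ls -A /tmp/rsync-demo/victim | wc -l         # should report 0
```

If that leaves the victim directory empty, the same command line is safe to run against the real path.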

find /nfs/reallybigdir -type f | head

If that completes in time, you can increase the number of lines by passing an argument to head, e.g.

find /nfs/reallybigdir -type f | head -2000
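To go from listing to actually deleting in batches, one sketch that assumes a GNU userland (find -quit, head -z, xargs -r) — the demo directory below is invented so you can rehearse safely before substituting the real path:

```shell
# Delete at most BATCH files per pass until the directory is empty.
# -print0 / -z / -0 keep filenames with odd characters safe (GNU tools).
DIR=/tmp/bigdir-demo   # stand-in for the real /nfs/reallybigdir
BATCH=500
mkdir -p "$DIR"
touch "$DIR"/file.{1..2000}    # fake log files to chew through
while [ -n "$(find "$DIR" -type f -print -quit)" ]; do
  find "$DIR" -type f -print0 | head -z -n "$BATCH" | xargs -0 -r rm --
done
```

Each pass re-runs find, so only BATCH directory entries are read per iteration rather than the whole getdents stream at once.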

`rm -r /nfs/reallybigdir` usually also works pretty well, but it takes notably more resources on the host OS than the rsync variant.

If none of that helps, I have a C program which I used in the past to delete 1023MB directories; I could share it then.

Best,

Alexander Griesser
Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: AGriesser@anexia-it.com
Web: http://www.anexia-it.com

Address (Klagenfurt headquarters): Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Commercial register: FN 289918a | Court of jurisdiction: Klagenfurt | VAT ID: ATU63216601

From: toasters-bounces@teaparty.net [mailto:toasters-bounces@teaparty.net] On behalf of Edward Rolison
Sent: Thursday, 17 March 2016 11:27
To: toasters@teaparty.net
Subject: Dealing with a really large number of files in a single directory
