Mailing List Archive

Downloading BLOBs
(This is related to my discussion of BLOBs via DBIx::Class on their list)

Given a $c->stash->{content_oid} that contains a PostgreSQL OID for a
large object, this should cause Catalyst::Engine to send it to the
client without slurping the entire contents into RAM:

sub finalize_body {
    my $c = shift;

    if (defined $c->stash->{'content_oid'}) {
        $c->log->debug("In custom finalize_body() routine..");
        # This should be the OID of a PostgreSQL large object
        my $oid = $c->stash->{'content_oid'};
        my $dbh = $c->model('DB')->schema->storage->dbh;

        # Large-object reads must happen inside a transaction
        $c->model('DB')->schema->txn_begin;
        my $io = IO::BLOB::Pg->new($dbh, $oid);
        die("Failed to create IO::BLOB::Pg") unless defined $io;
        $c->response->body($io);
        $c->NEXT::finalize_body;
        $c->model('DB')->schema->txn_commit;
    }
    else {
        $c->NEXT::finalize_body;
    }
    return $c;
}


It's rather hacky, but I thought I'd submit it for your comments.

I'll try to create a DBIx::Class::..::BLOB interface at some point, and
if I do, I'll definitely arrange it so that it hopefully doesn't require
the hacky transaction-surrounding bits.

Also, I have a bad feeling about doing the transaction with
begin/commit rather than the txn_do( sub {} ) method. However, that
isn't available here, since NEXT doesn't function within an anonymous
sub (of course).

An alternative is to just call $c->engine->finalize_body( $c, @_ )
inside the txn_do() sub, instead of NEXTing onto it. I think I prefer
that method, since it allows us to end the transaction cleanly in case
of death.. but it would break any other plugins hooking finalize_body.

$c->model('DB')->schema->txn_do( sub {
    my $io = IO::BLOB::Pg->new($dbh, $oid);
    die("Failed to create IO::BLOB::Pg") unless defined $io;
    $c->response->body($io);
    $c->engine->finalize_body( $c, @_ );
});


Any suggestions?
(It seems like "write the D::C plugin already" is the neat solution,
although I won't have time to do that immediately.)

Cheers,
Toby

_______________________________________________
Catalyst mailing list
Catalyst@lists.rawmode.org
http://lists.rawmode.org/mailman/listinfo/catalyst
Re: Downloading BLOBs
Toby Corkindale <toby@ymogen.net> writes:
> It's rather hacky, but I thought I'd submit it for your comments.

Unless you've got something historical to support, don't use BLOBs in
Pg. Seriously, with real prepared statements, supported by DBD::Pg,
the one remaining argument for using the LO interface (that all that
data didn't have to be parsed, making things much faster) has gone the
way of the dodo.
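
For anyone wanting to compare, the bytea-with-prepared-statements route
looks roughly like this. (A sketch only: the `files` table, its columns,
and the connection string are made up, and it assumes a reachable
database.)

```perl
use strict;
use warnings;
use DBI;
use DBD::Pg qw(:pg_types);

# Hypothetical setup -- adjust to taste:
#   CREATE TABLE files (id serial PRIMARY KEY, data bytea);
my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });

# Some arbitrary binary payload
my $binary_data = join '', map { chr int rand 256 } 1 .. 1024;

# Binding with pg_type => PG_BYTEA lets DBD::Pg transfer the binary
# payload correctly via a prepared statement, with no LO interface
# involved.
my $sth = $dbh->prepare('INSERT INTO files (data) VALUES (?)');
$sth->bind_param(1, $binary_data, { pg_type => PG_BYTEA });
$sth->execute;
```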

Mike
--
You better not mess with Major Tom

Re: Downloading BLOBs
On May 9, 2006, at 7:32 AM, Michael Alan Dorman wrote:

> Toby Corkindale <toby@ymogen.net> writes:
>> It's rather hacky, but I thought I'd submit it for your comments.
>
> Unless you've got something historical to support, don't use BLOBs in
> Pg. Seriously, with real prepared statements, supported by DBD::Pg,
> the one remaining argument for using the LO interface (that all that
> data didn't have to be parsed, making things much faster) has gone the
> way of the dodo.

They're still useful for a few things - the ability to read
just part of the field is something that bytea doesn't have, for
instance. And being able to send the whole content out without
smashing the RAM PostgreSQL is using is extremely useful.

That they're horrible to use doesn't mean there aren't places
they're still useful (though the places they're useful tend to be
exactly the same ones where you'd likely be just as happy
with filenames and metadata in the DB and files in a directory
tree outside it).
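
(For reference, the partial-read side is exposed by DBD::Pg's pg_lo_*
methods. A sketch, assuming a reachable database and a valid
large-object OID -- the OID below is made up, and the lo_* calls must
run inside a transaction:)

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });
my $oid = 12345;    # hypothetical large-object OID

$dbh->begin_work;   # lo_* calls only work inside a transaction
my $fd = $dbh->pg_lo_open($oid, $dbh->{pg_INV_READ});

# Stream the object out in 64k chunks; the whole thing is never
# held in memory at once.
my $buf;
while ($dbh->pg_lo_read($fd, $buf, 65536)) {
    print $buf;
}
$dbh->pg_lo_close($fd);
$dbh->commit;
```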

Cheers,
Steve


Re: Downloading BLOBs
Michael Alan Dorman wrote:
> Toby Corkindale <toby@ymogen.net> writes:
>> It's rather hacky, but I thought I'd submit it for your comments.
>
> Unless you've got something historical to support, don't use BLOBs in
> Pg. Seriously, with real prepared statements, supported by DBD::Pg,
> the one remaining argument for using the LO interface (that all that
> data didn't have to be parsed, making things much faster) has gone the
> way of the dodo.

So, are you suggesting that one just makes extensive use of the
substring() Pg function to retrieve chunks-at-a-time from the field?
And include some hacks to prevent Class::DBI or DBIx::Class from
automatically attempting to retrieve the entire BYTEA column
(potentially a gig or two) from the DB every time you hit that record?

What is the performance of substring() like over very large BYTEA
columns? I'm concerned that it wouldn't match the lo_read interface.
Does it support in-place writing, à la Perl's substr()? Or would one
need to read out the entire BYTEA column in order to be able to make a
change and write it back?
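
For what it's worth, I imagine the chunked-read version would look
something like this -- a sketch only, with a made-up `files` table and
no performance testing behind it:

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection and row; assumes
#   CREATE TABLE files (id serial PRIMARY KEY, data bytea);
my $dbh     = DBI->connect('dbi:Pg:dbname=test', '', '', { RaiseError => 1 });
my $file_id = 1;

my $chunk_size = 65536;
my $sth = $dbh->prepare(
    'SELECT substring(data from ? for ?) FROM files WHERE id = ?'
);

# substring() is 1-based; keep fetching fixed-size slices until we
# get back an empty (or NULL) chunk, so memory use stays bounded.
my $offset = 1;
while (1) {
    $sth->execute($offset, $chunk_size, $file_id);
    my ($chunk) = $sth->fetchrow_array;
    $sth->finish;
    last unless defined $chunk and length $chunk;
    print $chunk;
    $offset += $chunk_size;
}
```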


I like the lo_ interface, as it means that I can send/receive/access
large files, yet keep a very low memory footprint. When you're talking
about having many files being sent over the wire simultaneously, and
slowly, it's advantageous to keep memory use low and thus let the
process-count go high.

Cheers,
Toby
