[Refdb-devel] DB_VERSION redux
David Nebauer
2006-09-02 15:14:40 UTC
Hi Markus,

The change in database version in svn has exposed some problems. First,
use of refdba and refdbc has not caused /var/lib/refdb/db/DB_VERSION to
increment from 1 to 2. I've restarted the refdbd daemon and run clients
as root but it stays at '1'. I haven't yet investigated log messages.

The second problem is more complex. It has to do with how the packaging
system handles an upgrade in database version. You may recall the
unitary refdb package handled this by having the pre-install script
compare the database versions of the new and existing packages. If the
new database version was higher, then the user was urged to abort the
upgrade and back up all data. The backup and restore scripts could
assist. The upgrade abort was achieved by having the pre-install script
exit with a non-zero status. This forced the package manager to abort
the upgrade.
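
The pre-install check described above can be sketched roughly as follows.
This is only an illustration in Python, not the actual Debian maintainer
script: NEW_DB_VERSION, read_installed_version() and preinstall_status()
are hypothetical names, and a missing or unreadable version file is
assumed to mean a fresh install.

```python
import sys
from pathlib import Path

# Hypothetical value: the database version shipped with the new package.
NEW_DB_VERSION = 2


def read_installed_version(version_file: Path) -> int:
    """Return the version stored in the file, or 0 if it is missing or unreadable."""
    try:
        return int(version_file.read_text().strip())
    except (OSError, ValueError):
        return 0


def preinstall_status(version_file: Path, new_version: int = NEW_DB_VERSION) -> int:
    """Return the exit status a pre-install script would use (0 means proceed)."""
    installed = read_installed_version(version_file)
    if 0 < installed < new_version:
        print(
            f"Database version changes from {installed} to {new_version}; "
            "back up your data before upgrading.",
            file=sys.stderr,
        )
        return 1  # a non-zero status makes the package manager abort
    return 0


# A real script would end with something like:
#   sys.exit(preinstall_status(Path("/var/lib/refdb/db/DB_VERSION")))
```

Returning the status from a function instead of exiting directly keeps
the check testable outside the package manager.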

Now refdb is split into clients and server packages. Only the server
package has the version-checking machinery. If both clients and server
packages are running on the same machine, the clients package updates
but the server package update may be aborted. This may result in the
application being unable to access *any* reference data.

The most obvious solution is to use the pre-install script to force a
data backup (by, say, refdb-backup) and use the post-install script to
restore the data (by, say, refdb-restore). The problem is the server
package may be running on a computer without the clients package. The
backup and restore scripts require the refdba and refdbc clients
supplied by the clients package. In fact, since both clients are used
in backing up and restoring reference data, neither operation can be
done on server-managed data in the absence of the clients package. The
relevant client executables cannot be packaged into both packages under
the same names as this is forbidden by all package managers. Forcing
the server package to be dependent on the clients package would,
however, defeat the purpose of splitting refdb into multiple packages in
the first place.

What this problem cries out for is a data conversion utility that ships
with the server package and can be called by pre- and/or post-install
scripts to upgrade all databases in-place and without user involvement.
A "poor man's" approach would be to move the backup and restore scripts
from client package to server package. The server package would also
need renamed copies of the refdba and refdbc clients (as they are called
by the backup and restore scripts).

When DB_VERSION was first added to refdb you declared it to be a "slimy
hack". Perhaps now is a good time to implement a more robust solution.

Having fixed the documentation problems, I am now reluctant to post a
new svn debian package until this "migration" issue is resolved.

Regards,
David.
Markus Hoenicka
2006-09-02 23:11:35 UTC
Hi David,
Post by David Nebauer
Hi Markus,
The change in database version in svn has exposed some problems. First,
use of refdba and refdbc has not caused /var/lib/refdb/db/DB_VERSION to
increment from 1 to 2. I've restarted the refdbd daemon and run clients
as root but it stays at '1'. I haven't yet investigated log messages.
They should be informative. Just grep for "version file". Let me know
what you find. The upgrade did work on my box. DB_VERSION contains a
'2' here without any manual intervention.

Anyway, we had to bump into this problem eventually. I prefer this to
happen now (between prereleases) than during a release.
Post by David Nebauer
What this problem cries out for is a data conversion utility that ships
with the server package and can be called by pre- and/or post-install
scripts to upgrade all databases in-place and without user involvement.
A "poor man's" approach would be to move the backup and restore scripts
from client package to server package. The server package would also
need renamed copies of the refdba and refdbc clients (as they are called
by the backup and restore scripts).
This is not much different from saying "the server package depends on
the client package". Maybe we have to face it for the moment.

An entirely different solution is to let refdbd handle the upgrades
internally. refdbd would have to retain the ability to read old-style
databases, save the data to temporary tables, alter the database
schema, and write the data back into place. This is doable, although
it requires quite a bit of coding. It may turn out to be a nightmare
to maintain in the long run, as I'd have to test an increasing number
of combinations (1->2, then 1->3 and 2->3, then 1->4, 2->4, 3->4 and
so on) before releasing a new backwards-incompatible version.
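
The save/alter/restore sequence described above can be illustrated with
a minimal sketch against SQLite, using Python's sqlite3 module. The
table name refs, its columns, and the function upgrade_v1_to_v2() are
made up for the example and do not reflect the real refdb schema or any
refdbd code.

```python
import sqlite3


def upgrade_v1_to_v2(conn: sqlite3.Connection) -> None:
    """Rebuild the hypothetical `refs` table with an added `citekey` column."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        # 1. Save the old-style data to a temporary table.
        cur.execute("CREATE TEMPORARY TABLE refs_old AS SELECT id, title FROM refs")
        # 2. Replace the old schema with the new one.
        cur.execute("DROP TABLE refs")
        cur.execute(
            "CREATE TABLE refs (id INTEGER PRIMARY KEY, title TEXT, citekey TEXT)"
        )
        # 3. Write the data back into place and clean up.
        cur.execute("INSERT INTO refs (id, title) SELECT id, title FROM refs_old")
        cur.execute("DROP TABLE refs_old")
        cur.execute("COMMIT")
    except sqlite3.Error:
        cur.execute("ROLLBACK")
        raise


# Demo on an in-memory database; isolation_level=None leaves transaction
# control to the explicit BEGIN/COMMIT statements above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE refs (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO refs (title) VALUES ('An example title')")
upgrade_v1_to_v2(conn)
rows = conn.execute("SELECT id, title, citekey FROM refs").fetchall()
```

Wrapping the steps in one transaction means a failure part-way through
leaves the old table untouched.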
Post by David Nebauer
When DB_VERSION was first added to refdb you declared it to be a "slimy
hack". Perhaps now is a good time to implement a more robust solution.
Yup. Lesson learned. Sooner or later, all hacks will start to haunt
you.
Post by David Nebauer
Having fixed the documentation problems, I am now reluctant to post a
new svn debian package until this "migration" issue is resolved.
I guess it is not Debian-like to abort the installation if the
database needs to be updated and ask the user to back up his data?
Instead of upgrading you'd have to deinstall the old package and
reinstall the new one, followed by restoring the data. The problem is
that I badly need some feedback about the new citekey citation style
which only works in the SVN version...

regards,
Markus
--
Markus Hoenicka
***@cats.de
(Spam-protected email: replace the quadrupeds with "mhoenicka")
http://www.mhoenicka.de
David Nebauer
2006-09-03 15:29:51 UTC
Hi Markus,
Post by Markus Hoenicka
Post by David Nebauer
What this problem cries out for is a data conversion utility that ships
with the server package and can be called by pre- and/or post-install
scripts to upgrade all databases in-place and without user involvement.
An entirely different solution is to let refdbd handle the upgrades
internally. refdbd would have to retain the ability to read old-style
databases, save the data to temporary tables, alter the database
schema, and write the data back into place.
Sounds like a brilliant idea and it gets my vote.
Post by Markus Hoenicka
This is doable, although
it requires quite a bit of coding. It may turn out to be a nightmare
to maintain in the long run, as I'd have to test an increasing number
of combinations (1->2, then 1->3 and 2->3, then 1->4, 2->4, 3->4 and
so on) before releasing a new backwards-incompatible version.
I may be naive, but why not have a translation routine for each
increment in database version, i.e., from 1 to 2, 2 to 3, and so on. If
refdbd found it necessary to translate from version 1 to 4, it would
sequentially call the translation routines 1 to 2, 2 to 3 and 3 to 4.
It may not be the most elegant solution, but it has the virtues that:
- the number of translation routines increases linearly with version
numbers rather than quadratically,
- it is easier to code and maintain, and
- it's hidden from the user anyway so who cares?

The longer translation time for a large version number jump is not
significant. It is a one-off cost and most people regularly update
their systems so they will only ever jump a single version number at a time.
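
Chaining one translation routine per version increment could look
something like the sketch below. It is written in Python purely for
illustration; the dict standing in for a database and the names STEPS
and upgrade() are hypothetical and not part of refdbd.

```python
from typing import Callable, Dict, List

Database = Dict[str, int]  # stand-in for a real database handle


def step_1_to_2(db: Database) -> None:
    db["schema"] = 2  # a real routine would alter tables here


def step_2_to_3(db: Database) -> None:
    db["schema"] = 3


def step_3_to_4(db: Database) -> None:
    db["schema"] = 4


# One routine per increment, keyed by the version it upgrades from.
STEPS: Dict[int, Callable[[Database], None]] = {
    1: step_1_to_2,
    2: step_2_to_3,
    3: step_3_to_4,
}


def upgrade(db: Database, target: int) -> List[str]:
    """Apply the single-step routines in order until `target` is reached."""
    applied: List[str] = []
    version = db["schema"]
    while version < target:
        step = STEPS.get(version)
        if step is None:
            raise ValueError(f"no translation routine from version {version}")
        step(db)
        applied.append(f"{version}->{version + 1}")
        version += 1
    return applied
```

For n database versions this needs n-1 routines, whereas a direct
routine for every pair of versions would need n(n-1)/2.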
Post by Markus Hoenicka
I guess it is not Debian-like to abort the installation if the
database needs to be updated and ask the user to back up his data?
That is precisely the solution I used in the previous unitary refdb
package. The problem with using it now is it only stops the server
package upgrading. If the clients package is installed on the same
machine it *will* be upgraded. The danger then is the clients and
server will be at different versions. I assumed this would be a *BAD
THING*. I wasn't sure the backup script, which uses refdba and refdbc,
would work if the clients were from the new package (and therefore new
database version) while the server was from the old package (and
therefore old database version). Was I right to be worried? If you can
guarantee the clients will always work in this situation then I can use
the previous solution of aborting the server upgrade so the user can
back up the databases before upgrading again.

I'm not sure whether such a solution is "Debian-like". I do know it's
goddamn ugly. Rather than gracefully upgrading the databases in situ,
the entire update (which likely involved many packages other than refdb)
fails with an error message. This forces the user to trace back to find
the package that failed to upgrade and try to figure out the reason why.

Even if this solution is usable, I would urge you to consider it an
interim one and to implement the translation scheme you proposed earlier.

Regards,
David.
Markus Hoenicka
2006-09-03 16:09:13 UTC
Hi David,
Post by David Nebauer
Post by Markus Hoenicka
An entirely different solution is to let refdbd handle the upgrades
internally. refdbd would have to retain the ability to read old-style
databases, save the data to temporary tables, alter the database
schema, and write the data back into place.
Sounds like a brilliant idea and it gets my vote.
Before I go into the details, I've got one question. Does the package
postinstall script have access to the database admin username and
password?

I don't see a chance to let refdbd handle upgrades silently. To this
end you'd have to make sure the first connection after upgrading is
from the database admin account. This would be easy if the clients
were around - the postinstall script could send a ping to trigger the
upgrade. But as we've seen before it is not an option to rely on the
clients (at least not a *desirable* option).

What I could do is add a command-line switch to refdbd that lets it
run as sort of a one-time version checker. But I'd have to provide the
database superuser name and password to be able to upgrade the
database.

regards,
Markus
David Nebauer
2006-09-04 09:32:13 UTC
Hi Markus,
Post by Markus Hoenicka
Before I go into the details, I've got one question. Does the package
postinstall script have access to the database admin username and
password?
The Debian package uses sqlite as the database backend and it does not,
as I understand it, accept any username and password. All that is
essential is the calling program have root access, which the package
manager certainly does.

I hope I've interpreted your question correctly.

Regards,
David.
Markus Hoenicka
2006-09-04 10:20:49 UTC
Post by David Nebauer
The Debian package uses sqlite as the database backend and it does not,
as I understand it, accept any username and password. All that is
essential is the calling program have root access, which the package
manager certainly does.
Oh, I'm sorry. I forgot about this. In this case it is indeed sufficient to run
refdbd with root access. I'll try to whip up a solution for 0.9.8.

regards,
Markus
David Nebauer
2006-09-27 09:33:27 UTC
Hi Markus,
Post by Markus Hoenicka
Post by David Nebauer
The change in database version in svn has exposed some problems. First,
use of refdba and refdbc has not caused /var/lib/refdb/db/DB_VERSION to
increment from 1 to 2. I've restarted the refdbd daemon and run clients
as root but it stays at '1'. I haven't yet investigated log messages.
They should be informative. Just grep for "version file". Let me know
what you find. The upgrade did work on my box. DB_VERSION contains a
'2' here without any manual intervention.
I ran the server in standalone mode to capture log output and, lo and
behold, the version file updated on the first database read. No idea
why earlier attempts failed.

Regards,
David.
Markus Hoenicka
2006-09-27 11:26:10 UTC
Post by David Nebauer
I ran the server in standalone mode to capture log output and, lo and
behold, the version file updated on the first database read. No idea
why earlier attempts failed.
Come to think of it, the version file will not be updated if you run refdbd -a.
This will update the main database if necessary, but the version file gets
updated only once refdbd forks to handle a client request. I probably ran
refdbd -s after each refdbd -a run to see whether refdbd starts up correctly.
Therefore I always saw the version file being updated.

Would it be hard to change your install scripts and get rid of DB_VERSION in
favour of a refdbd -c call? I'd suggest dropping DB_VERSION then. I could also
make the refdbd -c output easier to parse.

regards,
Markus