Better Reading

Monday, March 23, 2015

A functional database

I mentioned in an earlier post that I was again working on a functional programming language. Probably the reason it is going so well is that the language itself is not my main goal. What I really want to do is build a functional database.

On the face of it, this seems like a ridiculous idea, because in order to be purely functional you would need to pass the whole database in as a parameter to any query, and return the updated database whenever you make a change to it.

But put aside the ridiculousness of it for a moment and imagine the following:

1. Your database is represented as a linked list of data-changing transactions, so changing the database is just a matter of consing the new transaction onto the transaction chain.

2. Every transaction in the database carries a timestamp and a checksum, where the checksum is built from the data in the transaction and the checksum of the last transaction.

3. The data storage is trusted and the checksum has a vanishingly low rate of collision, so the checksum at time T can act as a reliable proxy for the whole database at time T.
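To make the three points above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `Transaction` and `cons`, the use of SHA-256 as the checksum, and plain strings as transaction data are not part of any actual design.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """One immutable link in the transaction chain (hypothetical layout)."""
    data: str
    timestamp: float
    checksum: str
    prev: Optional["Transaction"]  # None stands for the empty database


def cons(data: str, timestamp: float, prev: Optional[Transaction]) -> Transaction:
    """Return a new database: the old chain with one more transaction on top."""
    prev_checksum = prev.checksum if prev is not None else ""
    # The checksum covers this transaction's data and the previous checksum,
    # so it depends on the entire history behind it.
    payload = f"{prev_checksum}|{timestamp!r}|{data}".encode("utf-8")
    checksum = hashlib.sha256(payload).hexdigest()
    return Transaction(data, timestamp, checksum, prev)


db0 = None                            # empty database
db1 = cons("set x = 1", 1.0, db0)     # db0 is untouched
db2 = cons("set x = 2", 2.0, db1)     # db1 is untouched and still usable
```

Nothing is ever modified in place: `db1` remains a valid database after `db2` is built, and `db2.checksum` can be handed around as a short stand-in for the whole chain.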

What good would this thing be? I think it could be very powerful for certain applications. Consider the following:

  • All changes to the database would be audited. In an important sense, the database itself is the audit trail.
  • It would be trivial to add a date/time parameter to any query and get back the result of the query as it would have been at any time in the past.
  • The checksum could be used to verify whether the results of a query did indeed come from a given database at a given time.
  • Brute-force backdoor changes to the transaction chain would show up in the checksums.
  • You could allow "alternate universes" to be forked off of the main database at any time, to experiment with different scenarios. These would exist in parallel time with the main database, but would be distinguished by their own checksums.

This database would be useful in environments where data needs to be reliable and auditable.
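The properties listed above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: transactions are single key/value assignments, timestamps are integers, SHA-256 is the checksum, and the names `Tx`, `cons`, `as_of`, `lookup` and `verify` are all hypothetical.

```python
import hashlib
from typing import NamedTuple, Optional


class Tx(NamedTuple):
    key: str
    value: int
    timestamp: int
    checksum: str
    prev: Optional["Tx"]  # None stands for the empty database


def digest(key, value, timestamp, prev_checksum):
    payload = f"{prev_checksum}|{timestamp}|{key}={value}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def cons(key, value, timestamp, prev):
    """Return a new chain with one more transaction on top of prev."""
    prev_checksum = prev.checksum if prev is not None else ""
    return Tx(key, value, timestamp, digest(key, value, timestamp, prev_checksum), prev)


def as_of(db, t):
    """The database as it was at time t: skip transactions newer than t."""
    while db is not None and db.timestamp > t:
        db = db.prev
    return db


def lookup(db, key):
    """Most recent value assigned to key, or None if it was never set."""
    while db is not None:
        if db.key == key:
            return db.value
        db = db.prev
    return None


def verify(db):
    """Recompute every checksum in the chain and compare with the stored one."""
    while db is not None:
        prev_checksum = db.prev.checksum if db.prev is not None else ""
        if db.checksum != digest(db.key, db.value, db.timestamp, prev_checksum):
            return False
        db = db.prev
    return True


main = cons("price", 12, 3, cons("price", 10, 1, None))

# An "alternate universe": branch from the state at t=1 and diverge.
fork = cons("price", 99, 3, as_of(main, 1))

# A backdoor change: the value is altered but the stored checksum is not.
tampered = main._replace(value=1000)
```

Here `lookup(as_of(main, 2), "price")` answers the query as of time 2 and gives 10, while the same query against `main` gives 12. The fork shares its history with `main` but ends in a different checksum, and `verify(tampered)` fails because the stored checksum no longer matches the data.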
