More Advanced Database Features and Rails (Page 1 of 4 )
Triggers, Rules, and Stored Procedures
Now we're in dangerous territory. Let it be known that you should probably have a good reason to use triggers, rules, or stored procedures for anything terribly complicated. That is not to say that they have no purpose; they can be lifesavers. But they should be used to address a specific problem or concern, such as the following:
A complicated process that involves searching through lots of data (such as OLAP or log analysis) can be much faster if offloaded to the database server. As always, profiling is key; premature optimization can cost you execution speed, not just developer time.
Concerns that have little to do with the application logic, such as audit logs, can usually be safely moved to the database as triggers.
PostgreSQL can use rules to create updateable views. Unfortunately, this is currently the only way to get updateable views.
When using Postgres large objects, you should use a trigger to delete the large object when the corresponding record (containing the LOB's OID) is deleted. Consider this a form of referential integrity.
Extended or non-native types will use stored procedures for access. PostGIS, a geospatial database for Postgres, uses functions to manage spatial data and indexes.
The TSearch2 library, integrated into PostgreSQL 8.3 and later, uses functions to access full-text indexing functions.
Some applications use stored procedures for all data access, in order to enforce access control. This is definitely not the Rails way. Although it can be made to work, it will be more difficult than directly accessing tables and views. Views provide sufficient access control for most enterprise applications; only use stored procedures if you have to. ActiveRecord can transparently use updateable views as if they were concrete tables.
Large object deletion
Since PostgreSQL's large objects are decoupled from their associated record, it is useful to set up a simple rule to delete them when the corresponding record is deleted. The rule can be implemented as follows:
-- (table name is 'attachments'; LOB OID is 'file_oid')
CREATE RULE propagate_deletes_to_lob AS ON DELETE TO attachments DO ALSO SELECT lo_unlink(OLD.file_oid) AS lo_unlink
PostgreSQL has a very powerful rule system that can rewrite incoming queries in many ways. One use for this rule system is to implement partitioning, where data from one table is federated into one of several tables depending on some condition. Consider a database of real estate listings. For historical purposes, we may want to keep listings that have expired, been sold, or been removed from the system. However, most of the data being used on a day-to-day basis is derived from listings that are current and for sale.
In addition, the datasets of "current listings" and "all listings" will have differing data needs; the former is likely to be used transactionally while the latter is probably used analytically. It makes sense to store these separately, as they may have different characteristics.
First, we assume that we already have listing data in a table called listings, and it has a status column representing the status of the listing. We create the two tables, current_listings and non_current_listings, which inherit from the main table. This way, we can say SELECT * FROM listings and Postgres will include the data from the two inherited tables automatically.
Next, we create rules that rewrite inserts on the parent table to inserts on the proper child:
CREATE RULE listings_insert_current AS ON INSERT TO listings WHERE (status = 'C') DO INSTEAD INSERT INTO current_listings VALUES(NEW.*); CREATE RULE listings_insert_non_current AS ON INSERT TO listings WHERE (status != 'C') DO INSTEAD INSERT INTO non_current_listings VALUES(NEW.*);
Now that the rules are set up, we move the existing data in listings to the proper subtable:
INSERT INTO current_listings SELECT * FROM listings WHERE STATUS = 'C'; INSERT INTO non_current_listings SELECT * FROM listings WHERE STATUS != 'C'; DELETE FROM listings;
We know that the DELETE statement is safe because no new data has been inserted into the listings table, thanks to the rewrite rules. This is why it is important that the partition conditions are a proper partitioning such as status = 'C' and status != 'C' (non-overlapping and completely covering all possibilities). This ensures that every row is inserted into one of the child tables, not the parent. Note that this would not be a proper partitioning if the status column allowed NULL values, as both conditions would be false.
Now we can insert and select data against listings as if it were one table, while PostgreSQL transparently handles the partitioning and works with the proper partition. This is a very simple example. In particular, we need to implement rules for UPDATE and DELETE queries before using this scheme. This method can easily be extended to many partitions, even on complicated conditions.