Beginners guide

= SUMMARY = In summary, Class::DBI maps an instance of an object to a row in a relational database table. When you operate upon the object, you affect the underlying database. Class::DBI takes care of generating all the select, inserts, updates, deletes for you. It makes working with a relational database almost trivial...

= INTRODUCTION = If you're anything like me, you started interacting with your Relational Database Management System (RDBMS), such as: Oracle, Postgres, MySQL, Sybase, SQL Server... (in my order of preference :) using Perl via the vendor supplied command line tool. Shelling out in your script, launching the command line tool to execute some SQL, catching the output, parsing it, and using it in your Perl in some meaningful way.

Then I found out about DBI, and the world was beautiful. I could now connect and execute SQL within perl in an concise and intuitive way. But then I got lazy (a good thing) and started writing all kinds of helper modules to do all the things I was doing over and over.

Then I thought, "dummy, I bet someone has already done this", so I searched around the web and CPAN, and found Class::DBI. Since then I've been doing my database interaction exclusively via Class::DBI.

While this beginner's guide overlaps much of the Class::DBI documentation (probably entirely), it is not, I repeat, it is not a substitute for the Class::DBI documentation.

This article attempts to be many things:


 * Pedantic, obviously the seasoned Perl coder can do what I do in 5 lines, in a one-liner, but I felt it was easier to show how CDBI worked by having verbose examples than slick Perl code.


 * Beginners guide showing constistent examples of all Class::DBI methods


 * Show some common patterns in Class::DBI


 * FAQ to things I see on the mailing list

Class::DBI is commonly called CDBI by the lazy. I will be calling it CDBI from now on.

= Entity Relationship Diagram (ERD) =

This beginner's guide uses the following ERD: +--+ | artistgroup     | |--| | bandartistid    | | personartistid  | +--+       +--+     \|/      \|/            | band         | |       |             |--|      |        |             | artistid     | +--+       | creationdate | | artist          || enddate      | |--|       +--+ |  artistid        | | name            |        +---+ | popularity      || person    | +--+       |---|          |                  | artistid  | |                 | gender    | /|\                | haircolor | +-+        | birthdate | | cd             |         +---+ |-| | cdid           | | artistid       | | title          | | publishdate    | +-+         |          |         /|\ +-+ |  track          | |-| | cdid           | | songid         | | sequence       | +-+        \|/          |          |    +--+    |  song    | |--|   |  songid  | | name    | +--+

For those unfamiliar with ERDs, ERDs are a graphical representation showing tables, columns within those tables, and the relationships between tables. See entity relationship diagram for more information on ERDs.

The Data Definition Language (DDL) for creating the above tables and relationships is included in the appendix of this document.

= SETUP =

Base Class
It's usually wise to set up a "top level" class for your entire application to inherit from, rather than have each class inherit directly from Class::DBI. This gives you a convenient point to place system-wide overrides and enhancements to Class::DBI's behavior. For example, it's common to put your database connection into your base class.

package Music::MusicDB; use strict; use warnings; use base 'Class::DBI::Oracle'; my ($dsn, $username, $password) = getConfig; Music::MusicDB->set_db('Main',                             $dsn,                              $username,                              $password,                              {AutoCommit=>1},                             ); sub getConfig { return ('dbi:Oracle:o901.mydomain.com','me', 'me'); } 1;

Regular Class
To start off slow let's implement the Artist class. CDBI maps, one to one, a class to a table. The artist class looks like:

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); 1;

You indicate the table via the ->table method. And, you indicate the columns in the table via the ->columns method. When not otherwise specified, CDBI uses the first column "artistid" as the primary key of the table.

CDBI automatically creates methods with the same names as your columns. All the methods are in lower case (by default), regardless of the character case in your database.

So, if I have an Artist object "$artist" and I want to print out the "name" field, I do:

print $artist->name,"\n";

If I want to print a bunch of column values:

@column = qw/name artistid/; @value = $artist->get(@column); foreach (@value) { print $_,"\n"; }

If I want to change the value of the "name" field:

$artist->name("New Name");

If I want to change a bunch of attributes:

%hash = (name=>"New Name", artistid=>17); $artist->set(%hash);

CDBI, uses Class::Accessor internally to create the methods. So if you are interested, check out the Class::Accessor docs for how to use it (also a very helpful module).

= CRUD (Create, Read, Update, Delete) =

Insert (formerly create)
Note that create is deprecated.

Inserting (creating) data in CDBI, like everything else, is extremely easy. I want to create some artists. I create a script like:

use strict; use Music::Artist; my %hash = (artistid=>1, name=>'sublime'); my $obj1 = Music::Artist->insert(\%hash); %hash = (artistid=>2, name=>'beatles'); my $obj2 = Music::Artist->insert(\%hash);
 * 1) !/usr/bin/perl

Simple, I've now created 2 rows in my database. I also have two (2) objects available in my code, which are the instantiation of those two rows, upon which I can perform useful functionality.

Read (aka select)
There are a number of methods with which to instantiate objects in your perl code from the database.

retrieve_all
Like the name indicates this method returns all of the rows in a table.

@obj = Class->retrieve_all; use strict; use Music::Artist; my @obj = Music::Artist->retrieve_all;
 * 1) Generic form
 * 1) In your code
 * 2) !/usr/bin/perl

Which obviously returns an array of objects, one for each row in the database.

retrieve
You use this method to "retrieve" a row by primary key. It returns a single row.

$obj = Class->retrieve( $id ); $obj = Class->retrieve( %hash ); my $artist = Music::Artist->retrieve(2); print $artist->name,"\n";
 * 1) Generic forms
 * 1) In your code

If the table uses multiple columns to define the primary key, you use a hash to define the key values:

my %key = (CD=>100,songid=>1000); my $track = Music::Track->retrieve(%key); print $track->sequence,"\n";

search
Returns an array of objects matching the criteria you supply.

my @objs = Class->search(column1 => $value, column2 => $value ...);
 * 1) Generic form

The criteria are exact and additive, meaning that when you say:

my @cds = Music::CD->search(title => "Greatest Hits", year => 1990);

Means "retrieve all the rows where title equals 'Greatest Hits' AND year equals 1990".

Conveniently, if you want to search for any NULL fields:

my @cds = Music::CD->search(title => "Greatest Hits", year => undef);

Returns any albums that haven't yet been released to the public.

search_like
This is a simple search for all objects where the columns specified are "like" the values specified.

@obj = Class->search_like(column1 => $like_pattern, ....); my @cd = Music::CD->search_like(title => 'October%'); my @cd = Music::CD->search_like(title => 'Hits%', artist => 'Various%');
 * 1) Generic form
 * 1) In your code

$like_pattern is a pattern given in SQL LIKE predicate syntax. '%' means "any one or more characters", '_' means "any single character". The wildcards can be used in the front, middle, or end of the string.

my @obj = Music::Artist->search_like(name => '_eatles'); my @obj = Music::Artist->search_like(name => '%les'); my @obj = Music::Artist->search_like(name => 'Be%les');

search_where
While not actually a part of the CDBI distribution, it is a nearly indispensable plugin component.

Essentially it lets you create queries that are much more complex than the standard "search" method. Using AbstractSearch you can use "AND"s, "OR"s, <, >, !=, <=, >= and probably more in your search criteria.

To use "search_where" you must "use" Class::DBI::AbstractSearch within your class definition, for example:

package Music::CD use Class::DBI::AbstractSearch; use strict my @obj = Music::Artist->search_where(           name => [ 'Ozzy', 'Kelly' ],            status => { '!=', 'outdated' },        );
 * 1) Then in a nearby piece of code
 * 1) !/usr/bin/perl
 * 1) select the rows where the (name equals 'Ozzy' or 'Kelly') AND
 * 2) (status is not 'outdated')

Class::DBI::AbstractSearch is built upon SQL::Abstract. Look at SQL::Abstract to learn the exactly how to form complicated where clauses.

Iterator
With the inclusion of "iterator" CDBI is CRUDI (hardy har har)

All of the above retrieval methods can also return an "iterator".

use strict; use Music::Artist; my $iterator = Music::Artist->retrieve_all;
 * 1) !/usr/bin/perl

Notice the above returns a scalar as opposed to an array. This form of return is called an "iterator". CDBI determines which type of return you want based on the target's datatype (i.e. the datatype scalar or array of the field on the left of the equals "=" sign).

You use an iterator like:

my $iterator = Music::Artist->retrieve_all; while (my $artist = $iterator->next) { print $artist->name,"\n"; }

You might ask, "what's the big deal". The big deal occurs during execution. The array method returns all the rows and all the columns contained in the "essential" column group (discussed later) from the database, and creates the associated CDBI objects at execution. Whereas, the iterator fetches only the primary keys of all the rows and holds off creating the object until it is asked for, via ->next.

find_or_create
There is another form, that is a combination of search and create. It first looks for the row with the values you specify, if it can't be found, CDBI creates it for you.

my $cd = Music::Artist->find_or_create({name=>'Elvis Presley'});

If CDBI can't find "Elvis" then it will create the King.

Update
The update takes any changes you have made to the data in the object and applies them to the database.

my $obj = Music::Artist->retrieve(1); $obj->name("New Name"); $obj->status("discontinued");
 * 1) First get the object
 * 1) Change the data one column at a time

my $obj2 = Music::Artist->retrieve(2); $obj2->set(name=>"something", status=>"top 40"); $obj2->update;
 * 1) Or change multiple columns en masse
 * 1) Store changes in the database

Delete
You can probably guess what the delete method does. use strict; use Music::Artist; my @obj = Music::Artist->retrieve_all; foreach (@obj) { $_->delete; }
 * 1) !/usr/bin/perl

= RELATIONSHIPS = One of the fundamental points is the "R" of RDBMS, that is "relational". CDBI makes it really easy to navigate from one object to another related object.

has_many
Simply, you define which tables (classes) are children of the table you are working upon. That is, which tables store the primary key of your table, in a column of their table. This means that for every one of a specific object, there are many directly related objects. When you invoke the "has_many" method, it creates a method in your classes definition which you can then call from your code.

Class->has_many(method_to_create => "Child::Class");

In our model, each artist puts out one or more CDs (what's the point of being an artist if you aren't going to have a CD).

So let's add a bit to the Artist class

Music::Artist->has_many(cds => 'Music::CD');

that lets it relate to the CD class. More specifically, it says an Artist has_many CDs. The entire Artist class looks like:

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD'); 1;

Behind the scenes, it makes a method available called "cds", which you can call on an instance of an Artist.

has_a
Class->has_a(foreign_key_column => 'foreign_class');

Similarly, a CD has to relate to the Artist class. We haven't seen the CD class yet so let's build it now and put in the "has_a" code that relates it to the Artist class.

package Music::CD; use strict; use base 'Music::MusicDB'; Music::CD->set_up_table('cd'); Music::CD->has_a(artistid => 'Music::Artist'); Music::CD->has_many(tracks=>Music::Track=>'cdid'); 1;

In database speak, you use has_a to obtain data via a foreign key. In other words, your object contains the value of another object's primary key.

The has_a code says "CD" has_a "Artist". It also says, use the artistid in the CD to look up the corresponding value in the primary key of the Artist.

Something you might not have noticed is that the Artist class didn't need to specify how to find the relating CDs. That is because CDBI is smart enough to examine the CD class and learn how it relates to the Artist class.

Now let's load up some sample data (see Appendix 2), so we can do some examples. Whew, that was tough.

Let's read down the hierarchy, via the has_many, from Artist to CD

use strict; use Music::Artist; my ($artist) = Music::Artist->search_like(name=>'%Beatles%'); print $artist->name,"\n"; foreach ($artist->cds) { print "\t",$_->title,"\n"; }
 * 1) !/usr/bin/perl

Here's the output:

The Beatles Abbey Road Rubber Soul Sgt. Peppers Lonely Hearts Club Band The Beatles (aka The White Album) Revolver

Or if you want to read up the hierarchy, via the has_a, from CD to Artist:

use strict; use Music::CD; my ($cd) = Music::CD->search(title=>'Abbey Road'); print "The CD: ",$cd->title," written by the: ",$cd->artistid->name,"\n";
 * 1) !/usr/bin/perl

Here's the output:

The CD: Abbey Road written by the: The Beatles

You can chain these together, you can read down one relationship, up another and so on. Here we'll print out all the songs on all the CDs written by the Beatles (at least for the data we have):

use strict; use Music::Artist; my ($artist) = Music::Artist->search_like(name => '%Beatles%'); print $artist->name,"\n"; foreach ($artist->cds) { print "\t", $_->title, "\n"; foreach my $track ($_->tracks) { print "\t\t", $track->songid->name,"\n"; } }
 * 1) !/usr/bin/perl

Here's the output:

The Beatles Abbey Road Rubber Soul Sgt. Peppers Lonely Hearts Club Band The Beatles (aka The White Album) Back in the U.S.S.R.                Dear Prudence Glass Onion Ob-La-Di, Ob-La-Da Wild Honey Pie Continuing Story of Bungalow Bill While My Guitar Gently Weeps Happiness Is a Warm Gun Martha My Dear I'm So Tired Blackbird Piggies Rocky Raccoon Don't Pass Me By                Why Don't We Do It in the Road? I Will Julia Revolver

Many to Many (aka lookups)
Many to many really just means traversing an intersection table (i.e. the Track table). An intersection table is just a table that contains 2 or more has_a definitions, linking to different tables.

In the above example where we went, Artist => CD => Track => Song, the long winded way. CDBI provides a facility for taking some of the work out of it. Really, we're making use of the "mapping" functionality of CDBI, discussed later, but in a very specific context.

Suppose we're really not interested in the sequence of a Track on a CD, we're only concerned (at this time) with the Song.

I'd add to the CD class:

Music::CD->has_many(songs => [ Track => 'songid' ]);

making it look like:

package Music::CD; use strict; use base 'Music::MusicDB'; Music::CD->set_up_table('cd'); Music::CD->has_a(artistid => 'Music::Artist'); Music::CD->has_many(tracks=>'Music::Track'); Music::CD->has_many(songs => ['Music::Track' => 'songid']); 1;

then I could use it like:

use strict; use Music::CD; my ($cd) = Music::CD->search_like(title => 'The Bea%'); foreach ($cd->songs) { # Note at this point $_ is an instance of a Song print $_->name,"\n"; }
 * 1) !/usr/bin/perl

might_have
Class->might_have(method_name => Class => (@fields_to_import));

might_have makes using (one-to-one) relationships transparent. A one-to-one relationship is typically used when subtyping a more general class. In our example, there are really 2 types of Artists: Bands, People. Bands and People share some of the same attributes (specifically, name and popularity), but they differ in that a person has: gender, birthdate, haircolor, whereas a band has: creationdate, breakupdate.

Let's change the Artist class (once again) to accomondate the subtypes of Person and Band:

package Music::Artist;

use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD');

# Music::Artist->might_have(        person => 'Music::Person' => qw/gender haircolor birthdate/); Music::Artist->might_have(        band=> 'Music::Band' => qw/creationdate breakupdate/); 1;
 * 1) Adding the might_have definitions below

The cool thing is how CDBI handles the might_have relationship. CDBI lets you access the subtype's attribute "as if" they were in your super-type. CDBI will automagically create a coresponding row in the subtype table, if one doesn't already exist. For example:

use strict; use Music::Artist; use Music::Person; my $artist = Music::Artist->retrieve(name=>'Elvis Presley'); $artist->gender('M'); $artist->haircolor('Brown'); $artist->update; my $a = Music::Artist->retrieve(name=>'Elvis Presley'); print $a->haircolor,"\n";
 * 1) !/usr/bin/perl


 * NOTE**: You cannot retrieve/search via the attributes in the subtype. That is, you cannot do:

my $b = Music::Artist->retrieve(gender=>'M');

You'll get an error like:

gender is not a column of Music::Artist at ...

= Mapping = = TRIGGERS = __PACKAGE__->add_trigger(trigger_point_name => \&code_to_execute);

Triggers are bits of code that run at predefined stages during an object or column's retrieval, creation, and deletion. You can create any number of triggers for each point, but you cannot specify the order in which they will be run.

Below are the list of triggers:

before_create after_create before_set_$column after_set_$column before_update after_update before_delete after_delete select

For example, let's set the PUBLISHDATE of a CD if it is not defined.

package Music::CD; use strict; use base 'Music::MusicDB'; Music::CD->set_up_table('cd'); Music::CD->has_a(artistid => 'Music::Artist'); Music::CD->has_many(tracks=>'Music::Track'); Music::CD->has_many(songs => ['Music::Track' => 'songid']); __PACKAGE__->add_trigger(before_create => \&make_publish_date);

sub make_publish_date { } 1;

= HELPER FUNCTIONS (FOR FREE) = Please contribute text.

= ORDERING = Please contribute text.

= LOWER LEVEL CODING WITH DBI = Please contribute text.

= TRAPPING ERRORS = Please contribute text.

= CUSTOM/ADVANCED SQL = You can create your own SQL in CDBI too. This invaluable when you want to use database SQL functions or you need to do complex queries like joins and subqueries.

CDBI gives you a couple of ways to make custom sql. If all you need to do is change the "where" clause of your SQL (regardless of how complex) you can use either the "add_constructor" or "retrieve_from_sql" constructs. If your database supports them you can even do subqueries within these.

But, if you need to return columns that are calculated (like: sum, count, min, max...) or you want to perform a join then you leverage the underlying technology of Ima::DBI that is inherited by CDBI. You access this functionality via the "set_sql" construct.

add_constructor
__PACKAGE__->add_constructor(method_name => 'SQL_where_clause');

When used in your class definition, this creates a method, "method_name", which you can call in your executable, which returns a list of objects that match your "SQL_where_clause".

For example, for some reason I need to select the ARTISTs whose name length is greater than 15 characters. So, I could do it by changing the ARTIST class like:

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD'); Music::Artist->might_have(        person => 'Music::Person' => qw/gender haircolor birthdate/); Music::Artist->might_have(band=> 'Music::Band' =>         qw/creationdate breakupdate/); # # __PACKAGE__->add_constructor(long_names => qq{ length(name) > 15 }); 1;
 * 1) Add the add_constructor for looking for name lengths

Then in my executable I'd do:

use strict; use Music::Artist; foreach (Music::Artist->long_names) { print $_->name,"\n"; }
 * 1) !/usr/bin/perl

If I want to be tricky, and not so hard coded I can pass in the length of the names I'm looking for. First I need to change the add_constructor to accept a parameter, or placeholder, indicated by the "?". In my case I'm just going to add a new add_constructor:

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD'); Music::Artist->might_have(        person => 'Music::Person' => qw/gender haircolor birthdate/); Music::Artist->might_have(band=> 'Music::Band' =>         qw/creationdate breakupdate/); __PACKAGE__->add_constructor(long_names => qq{ length(name) > 15 }); __PACKAGE__->add_constructor(dynamic_long_names => qq{length(name)> ?}); 1;

And execute it like:

use strict; use Music::Artist; foreach (Music::Artist->dynamic_long_names(20)) { print $_->name,"\n"; } Lastly, if I wanted to do a subquery within my where clause, I can do that too. Lets return only the artist who have more than one CD.
 * 1) !/usr/bin/perl

Again, I'll add use "add_constructor" in my Music::Artist class.

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD'); Music::Artist->might_have(        person => 'Music::Person' => qw/gender haircolor birthdate/); Music::Artist->might_have(band=> 'Music::Band' =>         qw/creationdate breakupdate/); __PACKAGE__->add_constructor(long_names => qq{ length(name) > 15 }); __PACKAGE__->add_constructor(dynamic_long_names => qq{length(name)> ?}); # #	 __PACKAGE__->add_constructor(prolific => qq{        artistid in (select artistid from cd                      having count(*) > 1 group by artistid)}); 1;
 * 1) Here is the new code

Then I use it in my new code (just like you'd expect):

use strict; use Music::Artist; my @artist = Music::Artist->prolific; foreach (@artist) { print $_->name,"\n"; }
 * 1) !/usr/bin/perl

You can even do correlated subqueries, I just can't think of an example :)

retrieve_from_sql
This method provides the same functionality as the add_constructor method, except it is setup and performed in your executable as opposed to your class. This can be particularly handy when you don't have permission to alter the classes, or you don't want to build an add_constructor method for ever wacky SQL you might need during your execution.

So, lets do the same example above (in the add_constructor) here with retrieve_from_sql:

use strict; use Music::Artist; my @artist = Music::Artist->retrieve_from_sql(qq{length(name) > 15}); foreach (@artist) { print $_->name,"\n"; }
 * 1) !/usr/bin/perl
 * 1) retrieve_from_sql

set_sql
__PACKAGE__->set_sql("method_name","sql");

Using the inherited functionality of Ima::DBI you can create CDBI objects out of complicated SQL.

For example, let's say you want a method that returns only the Artists that are also Persons. One way to do this is by joining the ARTIST table with the PERSON table. So lets do it.

First you'd add the new functionality to your Artist class:

package Music::Artist; use strict; use base 'Music::MusicDB'; Music::Artist->table('artist'); Music::Artist->columns(All => qw/artistid name popularity/); Music::Artist->has_many(cds => 'Music::CD'); Music::Artist->might_have(        person => 'Music::Person' => qw/gender haircolor birthdate/); Music::Artist->might_have(band=> 'Music::Band' => 	qw/creationdate breakupdate/); __PACKAGE__->add_constructor(long_names => qq{ length(name) > 15 }); __PACKAGE__->add_constructor(dynamic_long_names => qq{length(name)> ?}); __PACKAGE__->add_constructor(prolific => qq{        artistid in (select artistid from cd                      having count(*) > 1 group by artistid)}); # # __PACKAGE__->set_sql(persons => qq {        	select artist.id 	from artist, person 	where artist.artistid = person.artistid}); 1;
 * 1) Heres the new code

When you do this CDBI will setup a method by the name of "search_persons". CDBI prepends the "search_" to the method name you supply.

Then you'd call it from your code like:

use strict; use Music::Artist; my @persons = Music::Artist->search_persons; foreach (@persons) { print $_->name,"\n"; }
 * 1) !/usr/bin/perl

Like in most of CDBI you can also use placeholders or parameters in your "set_sql" commands to accect runtime arguments. Like:

__PACKAGE__->set_sql(persons => qq { 	select artist.artistid 	from artist, person 	where artist.artistid = person.artistid 	and person.gender = ?});

Then in your code you'd call it like:

my @persons = Music::Artist->search_persons("F");

NOTE: When using iterators with set_sql command, you must include the tables primary key in the result set. If the primary key is missing the iterators next method will always return false.

= Common Errors = Can't locate object method "xxxxx" via package "Class::DBI::Iterator" (perhaps you forgot to load "Class::DBI::Iterator"?)

This means you retrieved an array of object, but you assigned it to a scalar and then tried to call methods on the scalar. For example:

my $cd = Music::CD->search_like(title => 'The Bea%');

instead of:

my ($cd) = Music::CD->search_like(title => 'The Bea%');

Turning on tracing to see your SQL:

my $dbh = Music::Artist->db_Main; $dbh->trace(2);

= APPENDIX = The following DDL is for Oracle, I'm sure you can convert it to your RDBMS of choice fairly simply.


 * Oracle DDL for Example ERD