only for RuBoard - do not distribute or recompile |
What was the point in creating the logical data model? You want to create a database to store data about CDs. The data model is only an intermediate step along the way. Ultimately, you would like to end up with a MySQL database in which you can store data. How do you get there? Physical database design translates your logical data model into a set of SQL statements that define your MySQL database.
Since MySQL is a relational database system, it is relatively easy to translate from a logical data model, such as the one we described earlier, into a physical MySQL database. Here are the rules for translation:
Attributes become columns in the physical database. Choose an appropriate data type for each column.
Unique identifiers become columns that are not allowed to have NULL values. These are called primary keys in the physical database. You may also choose to create a unique index on the identifiers to enforce uniqueness.
Relationships are modeled as foreign keys.
If we apply the first three rules to our data model—minus the Record Label address information—we will end up with the physical database described in Table 7-2.
Table |
Column |
Data type |
Notes |
---|---|---|---|
CD |
CD_ID |
INT |
Primary key |
CD_TITLE |
VARCHAR(50) |
||
ARTIST |
ARTIST_ID |
INT |
Primary key |
ARTIST_NAME |
VARCHAR(50) |
||
SONG |
SONG_ID |
INT |
Primary key |
SONG_NAME |
VARCHAR(50) |
||
SONG_LENGTH |
TIME |
||
RECORD_LABEL |
RECORD_LABEL_ID |
INT |
Primary key |
RECORD_LABEL_NAME |
VARCHAR(50) |
Note that all of the spaces are gone from the entity names in our physical schema. This is because these names need to translate into SQL calls to create these tables. Table names should thus conform to SQL naming rules. Another thing to notice is we made all primary keys type INT. Because these attributes are complete inventions on our part, they can be of any indexible data type.[1] The fact that they are of type INT here is almost purely arbitrary—or rather, almost arbitrary, because it is actually faster to search on numeric fields in many database engines; hence, numeric fields make good primary keys. However, we could have chosen CHAR as the type for the primary key fields, and everything would work just fine. The bottom line is that this choice should be driven by your criteria for choosing identifiers.
[1] We covered the MySQL data types in Chapter 3, and a full reference is in Chapter 16.
CD_TITLE, ARTIST_NAME, SONG_NAME, and RECORD_LABEL_NAME are VARCHAR with a length of 50. The length has been chosen arbitrarily for the sake of this example. In reality, you should do some analysis of sample data to determine the length of text fields. If you set them too short, you may end up with a database that cannot capture all the data you need to store.
SONG_LENGTH is set to type TIME, which can store elapsed time.
We now have a starting point for a physical schema. We have not yet translated the relationships into the physical data model. As we discussed earlier, once you have refined your data model, you should have all 1-to-1 and 1-to-M relationships—the M-to-M relationships were resolved via junction tables. We model relationships by adding a foreign key to one of the tables involved in the relationship. A foreign key is the unique identifier, or primary key, of the table on the other side of the relationship.
The most common relationship is the 1-to-M relationship. This relationship is mapped by placing the primary key from the "one" side of the relationship into the table on the "many" side. In our example, this rule means we need to do the following:
Place a RECORD_LABEL_ID column in the CD table.
Place a CD_ID column in the SONG table.
Place an ARTIST_ID column in the SONG table.
Table 7-3 shows the new schema.
Table |
Column |
Data type |
Notes |
---|---|---|---|
CD |
CD_ID |
INT |
Primary key |
CD_TITLE |
VARCHAR(50) |
||
RECORD_LABEL_ID |
INT |
Foreign key |
|
ARTIST |
ARTIST_ID |
INT |
Primary key |
ARTIST_NAME |
VARCHAR(50) |
||
SONG |
SONG_ID |
INT |
Primary key |
SONG_NAME |
VARCHAR(50) |
||
SONG_LENGTH |
TIME |
||
CD_ID |
INT |
Foreign key |
|
ARTIST_ID |
INT |
Foreign key |
|
RECORD_LABEL |
RECORD_LABEL_ID |
INT |
Primary key |
RECORD_LABEL_NAME |
VARCHAR(50) |
We do not have any 1-to-1 relationships in this data model. If we did have such a relationship, we would map it by picking one of the tables and giving it a foreign key column that matches the primary key from the other table. In theory, it does not matter which table you choose, but practical considerations may dictate which column makes the most sense as a foreign key. Another way to handle a 1-to-1 relationship is to simply combine both entities into a single table. In that case, you have to pick a primary key from one of the tables to be the primary key of the combined table.
We now have a complete physical database schema. The last remaining task is to translate that schema into SQL. For each table in the schema, you write one CREATE TABLE statement. Typically, you should create unique indexes on the primary keys to enforce uniqueness.
Example 7-1 is an example SQL script for creating the example database in MySQL.
CREATE TABLE cd (cd_id INT NOT NULL PRIMARY KEY, record_label INT, cd_title VARCHAR(50)); CREATE TABLE artist (artist_id INT NOT NULL PRIMARY KEY, artist_name VARCHAR(50)); CREATE TABLE song (song_id INT NOT NULL PRIMARY KEY, song_name VARCHAR(50), song_length TIME, cd_id INT, artist_id INT); CREATE TABLE record_label (record_label_id INT NOT NULL PRIMARY KEY, record_label_name VARCHAR(50));
Note that no FOREIGN KEY reference is used in the script. This is because MySQL does not support FOREIGN KEY constraints in its default data type. MySQL will allow you to embed them in your CREATE TABLE statements, but they will not be enforced. The InnoDB table type, which was recently stabilized and is documented on the MySQL web site, supports foreign keys.
Data models are meant to be database independent. You can therefore take the techniques and the data model we have generated in this chapter and apply them not only to MySQL, but to Oracle, Sybase, or any other relational database engine. In the following chapters, we will discuss the details of how you can use your new database design knowledge to build applications.
only for RuBoard - do not distribute or recompile |