Here's a scenario where I'm wondering whether to denormalize a database.
I have users (stored in a users table).
Users belong to Accounts:
- An account is created before its users are created
- Every user must belong to one account
- Any account may contain many users
The Users table is predefined so I can't add an "accountId" column to the Users table; so to implement accounts I have:
- A Users table with a userId key
- An Accounts table with an accountId key
- A UserAccounts table with userId and accountId columns (to identify the account associated with each user)
The UserAccounts table is constrained:
- Foreign key relations to Users and Accounts table
- Uniqueness constraint to ensure that a userId appears no more than once
Because I can't modify the Users table to insert and accountId column in it, I can't ensure that every user has an accountId. And in fact I will insert into the UserAccounts table after inserting into the Users table, though probably in the same transaction.
Now for the question:
- I have a dozen other tables (e.g. Sales) which define various types of thing which belong to users
- These tables have a userId column (to identify which user each row belongs to)
- The problem is that I also need to know what account each thing belongs to
There are two ways to implement knowing which account each thing belongs to:
- Join the table to the UserAccounts table (to select the accountId for each userId)
- Denormalize by storing the accountId as well as the userId in each table
If I did the latter I could ensure integrity by defining the userId plus accountId pair/combination as a foreign key constraint into the UserAccounts table (to ensure that the pairing of accountId with userId matches the pairing defined in the UserAccounts table).
I don't have vast experience with database design: I learned it from books (by Joe Celko).
I'm inhibited against denormalizing in general, but denormalizing seems appropriate here: why an extra JOIN in every SELECT statement (to get the accountId), when the same could be obtained by storing the accountId in the table with a (compound or composite) foreign key.
Is there a compelling reason to use (or to not use) either of these two possibilities?
If not then I'm tempted to use the 2nd method:
- because it's simpler to define the select statement (without a join)
- because it's (presumably) slightly better performance (without a join)
- because it ensures that a corresponding row (with an accountId) exists
Aucun commentaire:
Enregistrer un commentaire