Help with an elegant database design... (MySQL / PHP)

I'm building a movie website. I need to display information about each movie, including genres, actors, and a lot of other details (similar to IMDB.com).

I created a 'movies' table with an ID and some basic information. For the genres I created a 'genres' table with two columns: ID and genre. Then I use a 'genres2movies' table with two columns, movieID and genreID, to connect the genres and movies tables.

This way, for example, if a movie has 5 different genres, its movieID appears in 5 different rows of the 'genres2movies' table. It's better than storing the genre name each time for each movie, but...
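
For illustration, the layout described above could be sketched roughly as follows. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the table and column names follow the post, and the `title` column and the sample genre names are assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the DDL would need minor changes there.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (ID INTEGER PRIMARY KEY, title TEXT NOT NULL);  -- 'title' is assumed
CREATE TABLE genres (ID INTEGER PRIMARY KEY, genre TEXT NOT NULL UNIQUE);
CREATE TABLE genres2movies (
    movieID INTEGER NOT NULL REFERENCES movies(ID),
    genreID INTEGER NOT NULL REFERENCES genres(ID),
    PRIMARY KEY (movieID, genreID)
);
""")

conn.execute("INSERT INTO movies (ID, title) VALUES (1, 'Example Movie')")
genre_names = ["Action", "Comedy", "Drama", "Sci-Fi", "Thriller"]
conn.executemany("INSERT INTO genres (genre) VALUES (?)", [(g,) for g in genre_names])

# One link row per (movie, genre) pair: a movie with 5 genres gets 5 rows.
conn.execute("INSERT INTO genres2movies (movieID, genreID) SELECT 1, ID FROM genres")

link_rows = conn.execute(
    "SELECT COUNT(*) FROM genres2movies WHERE movieID = 1"
).fetchone()[0]

# Resolve the genre names for one movie through the link table.
movie_genres = [row[0] for row in conn.execute("""
    SELECT g.genre
    FROM genres AS g
    JOIN genres2movies AS gm ON gm.genreID = g.ID
    WHERE gm.movieID = ?
    ORDER BY g.genre
""", (1,))]

print(link_rows, movie_genres)
```

Each link row holds only two integers, so the genre text itself is stored once in 'genres' no matter how many movies use it.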

Is there a better way of doing this?

I also need to do this for actors, languages, and countries, so performance and database size are really important.

Thanks!!!

3 Solutions

#1

You are on the right track. That's the way to do many-to-many relationships. Database size won't grow much because you use integers, and for speed you must set up correct indexes on those IDs. When writing SELECT queries, check them with EXPLAIN; it helps find the performance bottlenecks.
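
As a rough illustration of checking a query plan before and after adding an index, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in, where the statement is `EXPLAIN QUERY PLAN ...`; in MySQL you would prefix the query with `EXPLAIN` instead and the output columns differ. The index name `idx_g2m_genre` and the helper `query_plan` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Link table deliberately created without any index to show the "before" plan.
conn.execute(
    "CREATE TABLE genres2movies (movieID INTEGER NOT NULL, genreID INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO genres2movies (movieID, genreID) VALUES (?, ?)",
    [(m, g) for m in range(1, 51) for g in range(1, 6)],
)

def query_plan(connection, sql):
    """Return SQLite's plan description for a query (hypothetical helper)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the 'detail' text

sql = "SELECT movieID FROM genres2movies WHERE genreID = 3"

plan_before = query_plan(conn, sql)   # expected to report a full table scan
conn.execute("CREATE INDEX idx_g2m_genre ON genres2movies (genreID)")
plan_after = query_plan(conn, sql)    # expected to report a search using the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between versions; what matters is whether the plan mentions a scan of the whole table or a lookup through an index.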

#2

It sounds like you are following proper normalisation rules at the moment, which is exactly what you want.

However, if performance is a key factor, you may want to de-normalise some parts of your data, since JOINs between tables are relatively expensive operations.
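
One possible form of de-normalisation, sketched under assumptions rather than as a recommendation: keep the normalised link table as the source of truth, but also store a redundant, pre-joined list of genre names on the movie row so that simple listing pages can skip the JOIN. The `genre_list` column, the `title` column, and the `set_genres` helper are hypothetical, and SQLite (Python's sqlite3) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (
    ID INTEGER PRIMARY KEY,
    title TEXT NOT NULL,                  -- assumed column
    genre_list TEXT NOT NULL DEFAULT ''   -- redundant copy, hypothetical column
);
CREATE TABLE genres (ID INTEGER PRIMARY KEY, genre TEXT NOT NULL UNIQUE);
CREATE TABLE genres2movies (
    movieID INTEGER NOT NULL,
    genreID INTEGER NOT NULL,
    PRIMARY KEY (movieID, genreID)
);
INSERT INTO movies (ID, title) VALUES (1, 'Example Movie');
INSERT INTO genres (ID, genre) VALUES (1, 'Drama'), (2, 'Comedy'), (3, 'Action');
""")

def set_genres(connection, movie_id, genre_ids):
    """Rewrite the link rows and refresh the redundant copy in one transaction."""
    with connection:  # commits on success, rolls back on error
        connection.execute("DELETE FROM genres2movies WHERE movieID = ?", (movie_id,))
        connection.executemany(
            "INSERT INTO genres2movies (movieID, genreID) VALUES (?, ?)",
            [(movie_id, g) for g in genre_ids],
        )
        names = [row[0] for row in connection.execute("""
            SELECT g.genre
            FROM genres AS g
            JOIN genres2movies AS gm ON gm.genreID = g.ID
            WHERE gm.movieID = ?
            ORDER BY g.genre
        """, (movie_id,))]
        connection.execute(
            "UPDATE movies SET genre_list = ? WHERE ID = ?",
            (", ".join(names), movie_id),
        )

set_genres(conn, 1, [1, 2])

# The read path needs no JOIN; the cost moved to the write path.
cached = conn.execute("SELECT genre_list FROM movies WHERE ID = 1").fetchone()[0]
print(cached)
```

The price is that every write touching a movie's genres must also refresh the copy, and renaming a genre means updating every movie that uses it.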

It's usually a trade-off between proper/full normalisation and performance.

#3

You're on exactly the right track - this is the correct, normalized approach.

The only thing I would add is to ensure that your index on the join table (genres2movies) includes both the genre ID and the movie ID. It is generally worthwhile (depending on the selects used) to define indexes in both directions, i.e. two indexes, one ordered (genre-id, movie-id) and one ordered (movie-id, genre-id). This ensures that any range select on either genre or movie can use an index to retrieve all the data it needs, without resorting to a full table scan or even having to access the table rows themselves.
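
A minimal sketch of the two-direction indexing idea, assuming SQLite (Python's sqlite3) in place of MySQL. The index names `idx_movie_genre` and `idx_genre_movie` and the `query_plan` helper are made up for the example; in MySQL the analogous check would be `EXPLAIN SELECT ...`, and the output looks different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genres2movies (
    movieID INTEGER NOT NULL,
    genreID INTEGER NOT NULL
);
-- One composite index per lookup direction; each contains both columns.
CREATE UNIQUE INDEX idx_movie_genre ON genres2movies (movieID, genreID);
CREATE INDEX idx_genre_movie ON genres2movies (genreID, movieID);
""")
conn.executemany(
    "INSERT INTO genres2movies (movieID, genreID) VALUES (?, ?)",
    [(m, g) for m in range(1, 51) for g in range(1, 6)],
)

def query_plan(connection, sql):
    """Return SQLite's plan description for a query (hypothetical helper)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Movies for a genre: should be answered from idx_genre_movie alone.
plan_by_genre = query_plan(conn, "SELECT movieID FROM genres2movies WHERE genreID = 3")
# Genres for a movie: should be answered from idx_movie_genre alone.
plan_by_movie = query_plan(conn, "SELECT genreID FROM genres2movies WHERE movieID = 7")

print(plan_by_genre)
print(plan_by_movie)
```

When the plan reports a covering index, every column the query needs is already in the index, so the table rows themselves are never read.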
