Postgresql:使用具有一个模式的多个数据库,还是使用具有多个模式的一个数据库更好?

时间:2022-03-24 04:18:07

After this comment to one of my question, I'm thinking if it is better using 1 database with X schemas or vice versa.

在我对其中一个问题的评论之后,我在想,使用一个带有X模式的数据库是否更好,反之亦然。

My situation: I'm developing a web-app where, when people register, I create (actually) a database (no, its not a social network: everyone must have access to his own data and never see the data of the other user).

我的情况是:我正在开发一个网络应用程序,当人们注册时,我创建了一个数据库(不,它不是一个社交网络:每个人都必须能够访问自己的数据,而不能看到其他用户的数据)。

That's the way I used for the previous version of my application (that is still running on mysql): through the plesk api, for every registration, I do:

这就是我在之前版本的应用程序中使用的方式(它仍然在mysql上运行):通过plesk api,每次注册,我都会这样做:

  1. Create a database user with limited privileges;
  2. 创建权限有限的数据库用户;
  3. Create a database that can be accessed just by the previous created user and the superuser (for maintenance)
  4. 创建可以由先前创建的用户和超级用户访问的数据库(用于维护)
  5. Populate the db
  6. 填充数据库

Now, I'll need to do the same with postgresql (the project is getting mature and mysql.. don't fulfill all the needs)

现在,我需要对postgresql做同样的事情(项目变得成熟,mysql…不要满足所有的需求)

I need to have all the databases/schemas backups independent: pg_dump works perfectly in both ways, the same for the users that can be configured to access just 1 schema or 1 database.

我需要使所有的数据库/模式备份独立:pg_dump在这两种情况下都能很好地工作,对于只能访问一个模式或一个数据库的用户来说也是如此。

So, assuming you are more experienced potsgres users than me, what do you think is the best solution for my situation, and why?

那么,假设您比我更有经验的potsgres用户,您认为对我的情况最好的解决方案是什么?为什么?

Will there be performance differences using $x db instead of $x schemas? And what solution will be better to maintain in future (reliability)?

使用$x db而不是$x模式会有性能差异吗?什么样的解决方案在未来更好地维护(可靠性)?

Edit: I almost forgot: all of my databases/schemas will always have the same structure!

编辑:我几乎忘了:我的所有数据库/模式都将始终具有相同的结构!

Edit2: For the backups issue (using pg_dump), is maybe better using 1 db and many schemas, dumping all the schemas at once: recovering will be quite simple loading the main dump in a dev machine and then dump and restore just the schema needed: there is 1 additional step, but dumping all the schema seem faster then dumpin them one by one.

Edit2:备份问题(使用pg_dump),可能是更好的使用1 db和许多模式,将所有的模式:复苏将非常简单加载主转储在dev机然后转储和恢复所需的模式:1额外的步骤,但倾倒所有的模式似乎更快然后dumpin他们一个接一个。

p.s: sorry if i forgot some 'W' char in the text, my keyboard suffer that button ;)

p。s:对不起,如果我在文字中忘记了W字字符,我的键盘就会受到这个按钮的影响。

UPDATE 2012

Well, the application structure and design are changed so much dirung those last two years. Im still using the 1 db with many schemas approach, but still, I have 1 database for each version of my application:

在过去的两年里,应用结构和设计发生了很大的变化。我仍然使用带有许多模式的1db,但是我的应用程序的每个版本都有一个数据库:

Db myapp_01
    \_ my_customer_foo_schema
    \_ my_customer_bar_schema
Db myapp_02
    \_ my_customer_foo_schema
    \_ my_customer_bar_schema

For backups, im dumping each database regularly, then moving the backups on the dev server.

对于备份,im定期转储每个数据库,然后在dev服务器上移动备份。

Im also using the PITR/WAL backup but, as I said before, its not likely i'll have to restore all database at once.. so it will probably be dismissed this year (in my situation is not the best approach).

我也在使用PITR/WAL备份,但是,正如我之前所说的,我不太可能一次恢复所有的数据库。因此,今年它可能会被驳回(就我而言,这不是最好的方法)。

The 1-db-many-schema approach worked very well for me since now, even if the app structure is totally changed:

1-db-many-schema方法从现在起对我来说非常有效,即使应用程序结构完全改变:

i almost forgot: all of my databases/schemas will always have the same structure!

我几乎忘记了:我的所有数据库/模式都具有相同的结构!

...now, every schema has its own structure that change dinamycally reacting to users data flow.

…现在,每个模式都有它自己的结构,它会改变对用户数据流的反应。

6 个解决方案

#1


82  

A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database.

PostgreSQL“模式”与MySQL“数据库”大致相同。在PostgreSQL安装上拥有许多数据库可能会带来问题;拥有许多模式将不会有任何问题。所以你肯定想在这个数据库中使用一个数据库和多个模式。

#2


20  

Definitely, I'll go for the 1-db-many-schemas approach. This allows me to dump all the database but restore just 1 very easily, in many ways:

当然,我将采用1-db-many-schemas方法。这使我可以转储所有的数据库,但仅恢复1非常容易,在许多方面:

  1. Dump the db (all the schema), load the dump in a new db, dump just the schema i need, and restore back in main db
  2. 转储db(所有模式),在新的db中加载转储,仅转储我需要的模式,并在主db中恢复
  3. Dump the schema separately, one by one (but I think the machine will suffer more this way - and I'm expecting like 500 schemas!)
  4. 分别转储模式(逐个转储模式)(但我认为机器将遭受更大的损失——我预计会有500个模式!)

Otherwise, googling around I've seen that there is no auto-procedure to duplicate a schema (using one as a template), but many suggest this way:

否则,在谷歌上搜索,我发现没有自动过程来复制一个模式(使用一个模式作为模板),但是很多人建议这样做:

  1. Create a template-schema
  2. 创建一个template-schema
  3. When need to duplicate, rename it with new name
  4. 当需要复制时,用新名字重命名
  5. Dump it
  6. 把它倾倒
  7. Rename it back
  8. 重命名它
  9. Restore the dump
  10. 恢复转储
  11. The magic is done.
  12. 魔术就完成了。

I've written 2 rows in python to do that; I hope they can help someone (in-2-seconds-written-code, don’t use it in production):

我用python写了两行;我希望他们能帮助某人(2秒内写代码,不要在生产中使用):

import os
import sys
import pg

#Take the new schema name from the second cmd arguments (the first is the filename)
newSchema = sys.argv[1]
#Temp folder for the dumps
dumpFile = '/test/dumps/' + str(newSchema) + '.sql'
#Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'

#Connection
pgConnect = pg.connect(dbname= db_name, host='localhost', user= db_user, passwd= db_pass)
#Rename schema with the new name
pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))
#Dump it
command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
os.system(command)
#Rename back with its default name
pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)
#Restore the previus dump to create the new schema
restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
os.system(restore)
#Want to delete the dump file?
os.remove(dumpFile)
#Close connection
pgConnect.close()

#3


7  

I would say, go with multiple databases AND multiple schemas :)

我会说,使用多个数据库和多个模式:)

Schemas in postgres are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to differentiate between entire sets of data, while schemas are more like data entities.

postgres中的模式非常类似于Oracle中的包,如果您熟悉它们的话。数据库旨在区分整个数据集,而模式更像是数据实体。

For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences etc. that are needed for the user management.

例如,您可以为整个应用程序使用模式“UserManagement”、“LongTermStorage”等数据库。然后,“UserManagement”将包含“User”表,以及用户管理所需的所有存储过程、触发器、序列等。

Databases are entire programs, schemas are components.

数据库是整个程序,模式是组件。

#4


3  

A number of schemas should be more lightweight than a number of databases, although I cannot find a reference which confirms this.

许多模式应该比许多数据库更轻量级,尽管我找不到一个证实这一点的引用。

But if you really want to keep things very separate (instead of refactoring the web application so that a "costomer" column is added to your tables), you may still want to use separate databases: I assert that you can more easily make restores of a particular customer's database this way -- without disturbing the other customers.

但如果你真的想让事情很独立(而不是重构的web应用程序,这样一个“客户”列添加到你的表),您可能还想使用不同的数据库:我说,你可以更容易地使恢复的特定客户的数据库——而不打扰其他客户。

#5


2  

In a postgres-context I recommend to use one db with multiple schemas, as you can (e.g.) UNION ALL across schemas but not across databases. For that reason, a database is really completely insulated from another database whilst schemas are not insulated from other schemas within the same database. If you -for some reason- have to consolidate data across schemas in the future, it will be easy to do this over multiple schemas. With multiple databases you would need multiple db-connections and collect and merge the data from each database "manually" by application logic.

在postgres-context中,我建议使用一个带有多个模式的db,因为您可以(例如)在所有模式之间(而不是在数据库之间)联合。由于这个原因,一个数据库实际上与另一个数据库完全隔离,而模式与同一数据库中的其他模式并不隔离。如果您(出于某种原因)将来必须跨模式合并数据,那么在多个模式上进行合并将很容易。对于多个数据库,您将需要多个db-connections并根据应用程序逻辑“手动”收集和合并每个数据库中的数据。

The latter have advantages in some cases, but for the major part I think the one-database-multiple-schemas approach is more useful.

后者在某些情况下具有优势,但对于主要部分,我认为单数据库多模式方法更有用。

#6


-1  

Get the things clear First most the time you would like to make Some Db read Only and Some read/write So keep schema used as Read only can be kept on diff Db And read/write Schema in Diff database although i would suggest you to keep MAX 25-30 schema in one DB as you don't want to create a load on the database for logs for all schema

先清楚的事情大部分时间你想让一些Db只读和读/写所以可以继续保持模式作为只读diff Db和读/写模式diff数据库中虽然我建议你保持最大25 - 30在一个Db模式如您不想创建一个数据库上的负载日志模式

Here is one article if u want to read more

如果你想读更多,这里有一篇文章

#1


82  

A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database.

PostgreSQL“模式”与MySQL“数据库”大致相同。在PostgreSQL安装上拥有许多数据库可能会带来问题;拥有许多模式将不会有任何问题。所以你肯定想在这个数据库中使用一个数据库和多个模式。

#2


20  

Definitely, I'll go for the 1-db-many-schemas approach. This allows me to dump all the database but restore just 1 very easily, in many ways:

当然,我将采用1-db-many-schemas方法。这使我可以转储所有的数据库,但仅恢复1非常容易,在许多方面:

  1. Dump the db (all the schema), load the dump in a new db, dump just the schema i need, and restore back in main db
  2. 转储db(所有模式),在新的db中加载转储,仅转储我需要的模式,并在主db中恢复
  3. Dump the schema separately, one by one (but I think the machine will suffer more this way - and I'm expecting like 500 schemas!)
  4. 分别转储模式(逐个转储模式)(但我认为机器将遭受更大的损失——我预计会有500个模式!)

Otherwise, googling around I've seen that there is no auto-procedure to duplicate a schema (using one as a template), but many suggest this way:

否则,在谷歌上搜索,我发现没有自动过程来复制一个模式(使用一个模式作为模板),但是很多人建议这样做:

  1. Create a template-schema
  2. 创建一个template-schema
  3. When need to duplicate, rename it with new name
  4. 当需要复制时,用新名字重命名
  5. Dump it
  6. 把它倾倒
  7. Rename it back
  8. 重命名它
  9. Restore the dump
  10. 恢复转储
  11. The magic is done.
  12. 魔术就完成了。

I've written 2 rows in python to do that; I hope they can help someone (in-2-seconds-written-code, don’t use it in production):

我用python写了两行;我希望他们能帮助某人(2秒内写代码,不要在生产中使用):

import os
import sys
import pg

#Take the new schema name from the second cmd arguments (the first is the filename)
newSchema = sys.argv[1]
#Temp folder for the dumps
dumpFile = '/test/dumps/' + str(newSchema) + '.sql'
#Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'

#Connection
pgConnect = pg.connect(dbname= db_name, host='localhost', user= db_user, passwd= db_pass)
#Rename schema with the new name
pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))
#Dump it
command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
os.system(command)
#Rename back with its default name
pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)
#Restore the previus dump to create the new schema
restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
os.system(restore)
#Want to delete the dump file?
os.remove(dumpFile)
#Close connection
pgConnect.close()

#3


7  

I would say, go with multiple databases AND multiple schemas :)

我会说,使用多个数据库和多个模式:)

Schemas in postgres are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to differentiate between entire sets of data, while schemas are more like data entities.

postgres中的模式非常类似于Oracle中的包,如果您熟悉它们的话。数据库旨在区分整个数据集,而模式更像是数据实体。

For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences etc. that are needed for the user management.

例如,您可以为整个应用程序使用模式“UserManagement”、“LongTermStorage”等数据库。然后,“UserManagement”将包含“User”表,以及用户管理所需的所有存储过程、触发器、序列等。

Databases are entire programs, schemas are components.

数据库是整个程序,模式是组件。

#4


3  

A number of schemas should be more lightweight than a number of databases, although I cannot find a reference which confirms this.

许多模式应该比许多数据库更轻量级,尽管我找不到一个证实这一点的引用。

But if you really want to keep things very separate (instead of refactoring the web application so that a "costomer" column is added to your tables), you may still want to use separate databases: I assert that you can more easily make restores of a particular customer's database this way -- without disturbing the other customers.

但如果你真的想让事情很独立(而不是重构的web应用程序,这样一个“客户”列添加到你的表),您可能还想使用不同的数据库:我说,你可以更容易地使恢复的特定客户的数据库——而不打扰其他客户。

#5


2  

In a postgres-context I recommend to use one db with multiple schemas, as you can (e.g.) UNION ALL across schemas but not across databases. For that reason, a database is really completely insulated from another database whilst schemas are not insulated from other schemas within the same database. If you -for some reason- have to consolidate data across schemas in the future, it will be easy to do this over multiple schemas. With multiple databases you would need multiple db-connections and collect and merge the data from each database "manually" by application logic.

在postgres-context中,我建议使用一个带有多个模式的db,因为您可以(例如)在所有模式之间(而不是在数据库之间)联合。由于这个原因,一个数据库实际上与另一个数据库完全隔离,而模式与同一数据库中的其他模式并不隔离。如果您(出于某种原因)将来必须跨模式合并数据,那么在多个模式上进行合并将很容易。对于多个数据库,您将需要多个db-connections并根据应用程序逻辑“手动”收集和合并每个数据库中的数据。

The latter have advantages in some cases, but for the major part I think the one-database-multiple-schemas approach is more useful.

后者在某些情况下具有优势,但对于主要部分,我认为单数据库多模式方法更有用。

#6


-1  

Get the things clear First most the time you would like to make Some Db read Only and Some read/write So keep schema used as Read only can be kept on diff Db And read/write Schema in Diff database although i would suggest you to keep MAX 25-30 schema in one DB as you don't want to create a load on the database for logs for all schema

先清楚的事情大部分时间你想让一些Db只读和读/写所以可以继续保持模式作为只读diff Db和读/写模式diff数据库中虽然我建议你保持最大25 - 30在一个Db模式如您不想创建一个数据库上的负载日志模式

Here is one article if u want to read more

如果你想读更多,这里有一篇文章