How to store an array or multiple values in one column

Date: 2022-07-10 13:34:44

Running Postgres 7.4 (Yeah we are in the midst of upgrading)

I need to store from 1 to 100 selected items into one field in a database. 98% of the time it's just going to be 1 item entered, and 2% of the time (if that) there will be multiple items.

The items are nothing more than a text description, (as of now) nothing more than 30 characters long. They are static values the user selects.

I wanted to know the optimal column data type for storing this data. I was thinking BLOB but didn't know if that is overkill. Maybe JSON?

I also thought of ENUM, but I can't really use it for now since we are running Postgres 7.4.

I also want to be able to easily identify the item(s) entered, so no mapping or reference tables.

2 answers

#1


32  

You have a couple of questions here, so I'll address them separately:

I need to store a number of selected items in one field in a database

My general rule is: don't. This all but requires a second (or third) table with a foreign key. Sure, a single field may seem easier now, but what if a use case comes along where you need to query for those items individually? Separate tables also give you more options for lazy instantiation and a more consistent experience across frameworks and languages. Further, you are less likely to run into connection timeout issues from oversized fields (100 items of up to 30 characters each is already about 3,000 characters).

You mentioned that you were thinking about using ENUM. Are these values fixed? Do you know them ahead of time? If so this would be my structure:

Base table (what you have now):

| id primary_key sequence
| -- other columns here.

Items table:

| id primary_key sequence
| descript VARCHAR(30) UNIQUE

Map table:

| base_id  bigint
| items_id bigint

Map table would have foreign keys so base_id maps to Base table, and items_id would map to the items table.

And if you'd like an easy way to retrieve this from a DB, then create a view which does the joins. You can even create insert and update rules so that you're practically only dealing with one table.
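
As a rough sketch of the layout above, with a view over the joins and an insert rule: the table names `base`, `items`, and `base_items`, the view name `base_with_items`, and the rule name are placeholders of mine, not from the question, and the constraints are one reasonable choice rather than the only one.

```sql
-- Hypothetical names throughout; only the column names come from the answer.
CREATE TABLE base (
    id bigserial PRIMARY KEY
    -- other columns here
);

CREATE TABLE items (
    id       bigserial PRIMARY KEY,
    descript varchar(30) NOT NULL UNIQUE
);

-- Map table: one row per (base row, selected item) pair.
CREATE TABLE base_items (
    base_id  bigint NOT NULL REFERENCES base (id) ON DELETE CASCADE,
    items_id bigint NOT NULL REFERENCES items (id),
    PRIMARY KEY (base_id, items_id)
);

-- The view hides the joins, so readers see plain text descriptions.
-- Inner joins: a base row with no items does not appear here.
CREATE VIEW base_with_items AS
SELECT b.id AS base_id, i.descript
FROM base b
JOIN base_items m ON m.base_id = b.id
JOIN items i ON i.id = m.items_id;

-- Insert rule: writing to the view adds a row to the map table,
-- looking the item up by its description.
CREATE RULE base_with_items_insert AS
ON INSERT TO base_with_items
DO INSTEAD
INSERT INTO base_items (base_id, items_id)
SELECT NEW.base_id, i.id
FROM items i
WHERE i.descript = NEW.descript;

-- Usage
INSERT INTO items (descript) VALUES ('red widget');
INSERT INTO base DEFAULT VALUES;  -- receives id 1 on a freshly created table
INSERT INTO base_with_items (base_id, descript) VALUES (1, 'red widget');
SELECT base_id, descript FROM base_with_items;
```

Note that with this rule, inserting a description that is not in `items` silently adds nothing; whether that or an error is preferable depends on the application.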

What format should I use to store the data?

If you have to do something like this, why not just use a character-delimited string? With a delimiter that can never appear inside an item, it takes less processing than fully quoted CSV, XML, or JSON, and it will be shorter.
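
A minimal sketch of the delimited-string approach, assuming a hypothetical table `base_flat`, a `|` delimiter that never appears inside an item, and the `string_to_array` function together with `= ANY (array)`, both of which I believe are available from 7.4 onward (worth verifying on your server):

```sql
-- Hypothetical table: all selected items live in one TEXT column,
-- separated by '|' (an assumed delimiter, not from the question).
CREATE TABLE base_flat (
    id    bigserial PRIMARY KEY,
    items text NOT NULL
);

INSERT INTO base_flat (items) VALUES ('red widget');
INSERT INTO base_flat (items) VALUES ('red widget|blue widget');

-- Find rows containing an exact item. Splitting first avoids the false
-- matches that a LIKE '%widget%' style search would produce.
SELECT id
FROM base_flat
WHERE 'blue widget' = ANY (string_to_array(items, '|'));
```

This query cannot use an ordinary index on `items`, which is part of why the normalized layout scales better once you need to search by item.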

What column type should I use to store the data?

Personally, I would use TEXT. It does not sound like you'd gain much by making this a BLOB, and TEXT, in my experience, is easier to read if you're using some form of IDE.

#2


4  

Well, there is an array type in recent Postgres versions (I'm not 100% sure about PG 7.4). You can even index array columns, using a GIN or GiST index. The syntax is:

create table foo (
  bar  int[] default '{}'
);

-- rows whose array overlaps {1}; equivalent to bar && '{1}'::int[]
select * from foo where bar && array[1];

-- lets the query above use an index
create index on foo using gin (bar);
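
Since the question's items are text rather than integers, and the server is 7.4, here is a hedged variant of the same idea. The table name `foo_text` and the column `items` are made up for illustration. It sticks to the `'{...}'` array literal and `= ANY (array)`, which as far as I know work on 7.4; the `&&` operator and GIN indexes only arrived in later releases, so on 7.4 this lookup would be a sequential scan.

```sql
-- Hypothetical table: one text[] column holding the selected items.
create table foo_text (
  id    serial primary key,
  items text[] not null default '{}'
);

insert into foo_text (items) values ('{"red widget"}');
insert into foo_text (items) values ('{"red widget","blue widget"}');

-- rows where any element equals the given item exactly
select id from foo_text where 'blue widget' = any (items);

-- number of items per row (array_upper returns null for an empty array)
select id, array_upper(items, 1) as item_count from foo_text;
```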

But as the prior answer suggests, it will be better to normalize properly.
