Is there a way to effect user-defined data types in MySQL?

Time: 2022-03-23 15:52:56

I have a database which stores (among other things), the following pieces of information:

  • Hardware IDs: BIGINTs
  • Storage Capacities: BIGINTs
  • Hardware Names: VARCHARs
  • World Wide Port Names (WWPNs): VARCHARs

I'd like to be able to capture a more refined definition of these datatypes. For instance, the hardware IDs have no numerical significance, so I don't care how they are formatted when displayed. The Storage Capacities, however, are cardinal numbers and, at a user's request, I'd like to present them with thousands and decimal separators, e.g. 123,456.789. Thus, I'd like to refine BIGINT into, say ID_NUMBER and CARDINAL.

The same goes for Hardware Names, which are simple text, and WWPNs, which are hex strings, e.g. 24:68:AC:E0. Thus, I'd like to refine VARCHAR into ENGLISH_WORD and HEXSTRING.

The specific datatypes I made up are just for illustrative purposes.

I'd like to keep all this information in one place and I'm wondering if anybody knows of a good way to hold this all in my MySQL table definitions. I could use the Comment field of the table definition, but that smells fishy to me.

One approach would be to define the data structure elsewhere and use that definition to generate my CREATE TABLEs, but that would be a major rework of the code that I currently have, so I'm looking for alternatives.

Any suggestions? The application language in use is Perl, if that helps.

2 Answers

#1


6  

A good way to do this is to use views. For example, to insert commas into a cardinal number, you can use:

mysql> create table foo (id int);
Query OK, 0 rows affected (0.12 sec)

mysql> insert into foo (id) values ( 123456789);
Query OK, 1 row affected (0.00 sec)

mysql> create view v_foo as select format(id, 0) as id from foo;
Query OK, 0 rows affected (0.10 sec)

mysql> select * from v_foo;
+---------------+
| id            |
+---------------+
| 123,456,789   |
+---------------+
1 row in set (0.02 sec)

You can use other string functions to format your other fields, and store them in the view definition.

#2


1  

I'll propose an answer that questions the question.

One of the mantras that people who model databases like to hum is the separation of the presentation layer (formatting) from the data, and I believe the relevant part goes something like:

'Thou shalt not store formatted data in your databases, nor shalt thou discriminate against any formatting choice. Thou shalt store the data in the natively supported data types. Thy applications shall provide the presentation layer and format your columns.'

Well, friedo's answer (the view-based one above) does not go directly against this: the data is only presented through a view, and the storage is still native.

Still, it depends on how you define the presentation layer. If the view and the server settings are considered part of the presentation layer, then all is fine. Otherwise there is potential trouble, because I, a possible user of your system, will not be able to specify that my thousands separator is a single quote (and it is, at least in my current place of residence).
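
To make the separator point concrete, here is a minimal sketch of doing the grouping in the application instead of in a view. It is in Python rather than the asker's Perl, and `format_cardinal` and its `thousands_sep` parameter are names made up for illustration:

```python
def format_cardinal(value: int, thousands_sep: str = ",") -> str:
    """Group an integer's digits in threes using the caller's separator."""
    # Python's format spec only offers "," and "_" for grouping, so group
    # with "," first and then swap in whatever the user asked for.
    return f"{value:,}".replace(",", thousands_sep)


print(format_cardinal(123456789))       # 123,456,789
print(format_cardinal(123456789, "'"))  # 123'456'789
```

The stored value stays a plain integer; only the display string changes per user.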

Also, once you go down that road, how long do you think it will be before you have to deal with requests to re-parse the data from text back into a number, possibly ending up in situations where this is ambiguous (such as DD/MM/YY vs MM/DD/YY)?
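
A small illustration of that ambiguity, as a sketch in Python (the sample string is made up): the same text parses to two different dates depending on which convention the reader assumes.

```python
from datetime import datetime

text = "03/04/21"  # formatted date as it might come back from a view

# Day-first reading (DD/MM/YY)
as_day_first = datetime.strptime(text, "%d/%m/%y").date()
# Month-first reading (MM/DD/YY)
as_month_first = datetime.strptime(text, "%m/%d/%y").date()

print(as_day_first)    # 2021-04-03
print(as_month_first)  # 2021-03-04
```

Nothing in the string itself says which reading is right, which is why re-parsing formatted output is fragile.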

The above rant only concerns formatting. Determining the number of decimal digits, by contrast, defines the domain of your data and is a good thing, as it limits the possibility of inconsistent data entering your database.

EDIT: (entertaining the purist point of view a bit further, regarding number bases) Saying that hexadecimal number data has no meaning in other bases is generally a false statement. Number values have no base and can be represented in any base. Their domain (the set of allowed values) is the same.
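
As a rough sketch of "values have no base" (Python, with hypothetical helper names `wwpn_to_int` and `int_to_wwpn`), the question's example hex string can be turned into a plain integer and rendered back without losing anything. The byte count is an assumption passed in by the caller; the example from the question happens to be four bytes long.

```python
def wwpn_to_int(text: str) -> int:
    """Parse a colon-separated hex string into the number it represents."""
    return int(text.replace(":", ""), 16)


def int_to_wwpn(value: int, n_bytes: int) -> str:
    """Render a number as colon-separated, upper-case hex byte pairs."""
    raw = value.to_bytes(n_bytes, "big")
    return ":".join(f"{byte:02X}" for byte in raw)


value = wwpn_to_int("24:68:AC:E0")
print(value)                  # 610839776 (same value, shown in decimal)
print(int_to_wwpn(value, 4))  # 24:68:AC:E0
```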

The choice of hexadecimal for MAC addresses is a natural one, for historical reasons and because, for example, it is easy to read the vendor part in that format. The choice of the 'funny' format for IPv4 addresses is a historical one, probably with an anecdotal reason behind it.

But both are only a choice, and internally a good system will store them without bias (for example, storing IPv4 addresses as text is not a good thing). When an RDBMS presents the results of a query to you (on a screen), it is already taking the role of an application and formatting the results in some way.
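
For the IPv4 case, a minimal Python sketch of "store the value, format on the way out" using the standard `ipaddress` module (the address is just a sample; on the MySQL side, `INET_ATON()` and `INET_NTOA()` do the equivalent conversion):

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.0.1")

# The value that would go into an unsigned integer column.
as_int = int(addr)
print(as_int)  # 3232235521

# Dotted-quad text is only produced again when presenting the value.
print(ipaddress.IPv4Address(as_int))  # 192.168.0.1
```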

This is not significant, and the format you use in your application should not influence how you store the Storage Capacities or other entity properties.

So I am saying that this is application configuration data (metadata about the core data), and of course it can and should be stored in the database. But with MySQL (which is not so rich in defining custom types) it can't fit in the table definition. It should simply be stored in another table that the application reads and applies to your columns when presenting data to the user, and not in some hackish way that will not be portable.

For example, the view idea works, but can you easily query the view to get the formats that are applied to fields? Or let's say you want to change the formatting of every occurrence of the field WWPN in all queries that use it (hexstring already sounds wrong, too): would that be easy? Or if there are other queries that transform the data and write it to another table, will you write it with the format applied or without it (re-parsing)? Etc.

Now if you had a table that stores application configuration data such as FieldFormatting: Table, Field, Format, CheckRules, LongFormat (or whatever makes most sense in your situation) then the above questions become a bit easier to deal with and you get to choose extra options for your application and business logic.
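
Here is one way such a table could be used, as a hedged sketch only. It is written in Python with an in-memory SQLite database standing in for MySQL so that it runs without a server; the `hardware` and `field_formatting` tables, the format codes `plain` and `grouped`, and all function names are assumptions made for illustration, not anything MySQL provides.

```python
import sqlite3

# Hypothetical format codes mapped to application-side formatters.
FORMATTERS = {
    "plain": lambda value, sep: str(value),
    "grouped": lambda value, sep: f"{value:,}".replace(",", sep),
}

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL connection
conn.execute("CREATE TABLE hardware (hardware_id INTEGER, capacity INTEGER)")
conn.execute(
    "CREATE TABLE field_formatting "
    "(table_name TEXT, field_name TEXT, format TEXT)"
)
conn.executemany(
    "INSERT INTO field_formatting VALUES (?, ?, ?)",
    [("hardware", "hardware_id", "plain"), ("hardware", "capacity", "grouped")],
)
conn.execute("INSERT INTO hardware VALUES (?, ?)", (900001, 123456789))


def load_formats(conn, table):
    """Return {field_name: format_code} for one table."""
    rows = conn.execute(
        "SELECT field_name, format FROM field_formatting WHERE table_name = ?",
        (table,),
    )
    return dict(rows)


def present_row(conn, table, row, thousands_sep=","):
    """Apply the configured format to each field; unknown fields stay plain."""
    formats = load_formats(conn, table)
    return {
        field: FORMATTERS[formats.get(field, "plain")](value, thousands_sep)
        for field, value in row.items()
    }


cur = conn.execute("SELECT hardware_id, capacity FROM hardware")
columns = [d[0] for d in cur.description]
row = dict(zip(columns, cur.fetchone()))
print(present_row(conn, "hardware", row, thousands_sep="'"))
# {'hardware_id': '900001', 'capacity': "123'456'789"}
```

Because the formats live in an ordinary table, "which format applies to which field" becomes a plain SELECT, and changing a format is a single UPDATE.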

If you really (really, really) have to provide direct access to the database, and the native types would make the data unreadable for the users, and you simply must preformat, then you could even use the above table to generate and update the views/queries semi-automatically.
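
A sketch of what "semi-automatically" might look like: a small Python function (hypothetical name `build_view_sql`, hypothetical format code `grouped`) that turns formatting metadata into the kind of view statement shown in the first answer. It only builds the SQL text, and it assumes the table and column names come from your own trusted configuration, since they are interpolated directly.

```python
def build_view_sql(table, columns, formats):
    """Build a MySQL view statement that formats the configured columns."""
    select_items = []
    for column in columns:
        if formats.get(column) == "grouped":
            # MySQL's FORMAT(X, D) adds thousands separators, D decimals.
            select_items.append(f"FORMAT(`{column}`, 0) AS `{column}`")
        else:
            select_items.append(f"`{column}`")
    return (
        f"CREATE OR REPLACE VIEW `v_{table}` AS "
        f"SELECT {', '.join(select_items)} FROM `{table}`"
    )


print(build_view_sql("foo", ["id"], {"id": "grouped"}))
# CREATE OR REPLACE VIEW `v_foo` AS SELECT FORMAT(`id`, 0) AS `id` FROM `foo`
```

Re-running the generator after a metadata change regenerates every view from the same source of truth.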

NOTE: I am taking a purist point of view here because I have a feeling that you are making design decisions rather than chasing the last drop of performance or convenience (for example, between application data types and database data types), where practical issues can be more important than modelling guidelines and rules. But the questions from the last paragraph still stand.
