Best technique for storing gender in a MySQL database

Time: 2021-09-19 16:57:14

Which is the best way to store gender in a MySQL database? I am a bit confused because different people recommend different things. Some suggest INT, others suggest TINYINT or ENUM, and still others suggest CHAR(1), with M for male and F for female.

Reading http://en.wikipedia.org/wiki/ISO_5218 only adds to my doubts.

In my view, storing it in CHAR is a good idea because it seems more robust than ENUM. I am also concerned about scalability and would like to know a better solution for storing millions of records.

A suggestion from an expert would be highly appreciated.

2 solutions

#1


4  

Personally (because this is a somewhat subjective question) I'd go with ENUM. MySQL versions before 8.0.16 parse CHECK constraints but don't enforce them, so on those versions the ENUM is the only way to really make sure the value is M or F (or m or f). To me, that's the most important point.

In addition, the ENUM should only need one byte of storage space (according to the docs), so it's just as efficient storage-wise as CHAR(1) or TINYINT.
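
A minimal sketch of what the ENUM approach could look like. The table and column names (person_enum, person_check, gender) are hypothetical, and the second table assumes MySQL 8.0.16 or later, where CHECK constraints are actually enforced:

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE person_enum (
    id INT AUTO_INCREMENT PRIMARY KEY,
    gender ENUM('M', 'F') NOT NULL
) ENGINE=INNODB;

INSERT INTO person_enum (gender) VALUES ('M'), ('F');

-- In strict SQL mode this statement is rejected with an error;
-- in non-strict mode MySQL stores the empty string '' instead.
INSERT INTO person_enum (gender) VALUES ('X');

-- Alternative, assuming MySQL 8.0.16 or later, where CHECK is enforced.
CREATE TABLE person_check (
    id INT AUTO_INCREMENT PRIMARY KEY,
    gender CHAR(1) NOT NULL,
    CONSTRAINT chk_person_gender CHECK (gender IN ('M', 'F'))
) ENGINE=INNODB;
```

Note that the ENUM only rejects bad values outright when strict SQL mode is on, so it is worth checking the server's sql_mode before relying on it.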

I don't understand the TINYINT approach at all because you end up with queries like this:

SELECT * FROM myTable WHERE gender = 1;

Is 1 male or female? And if it's male, is female 0? Or is it 2? Or maybe 16? You already have to remember a pile of things to write (and maintain) an application; no need to add to that pile.
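
For what it's worth, ISO 5218 (linked in the question) fixes the codes as 0 = not known, 1 = male, 2 = female, 9 = not applicable. A rough sketch of how a numeric column could be made self-documenting, assuming a hypothetical lookup table named iso5218_sex and assuming myTable.gender holds those ISO codes:

```sql
-- Hypothetical lookup table holding the four ISO 5218 codes.
CREATE TABLE iso5218_sex (
    code TINYINT UNSIGNED PRIMARY KEY,
    label VARCHAR(20) NOT NULL
) ENGINE=INNODB;

INSERT INTO iso5218_sex (code, label) VALUES
(0, 'not known'),
(1, 'male'),
(2, 'female'),
(9, 'not applicable');

-- The bare `gender = 1` filter can then be written by label instead.
SELECT t.*
FROM myTable AS t
JOIN iso5218_sex AS s ON s.code = t.gender
WHERE s.label = 'male';
```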


Addendum 2017-12-01 by Ed Gibbs: Revisiting my answer when I stumbled across it on an unrelated Google search...

The ENUM approach has merit in use cases with a static, single-dimensional domain of values (e.g., Y/N, To/Cc/Bcc), but it's not valid for gender. My answer was in the nerd context of "how do you limit a column to M or F" and not in the broader context of gender definition.

D Mac's solution is more robust and enlightened, but it's still incomplete because it too is single-dimensional whereas gender is multi-dimensional.

When classifying human beings in any subjective category (gender, race, class identity, religion, political affiliation, employment status, ethnic identity, sexual preference, amorousness, etc.), consider the multiple ways in which they may identify themselves. There isn't always a "check a single box" solution.

This goes beyond ideology. Trying to categorize a multi-dimensional entity into a single dimension is inaccurate, and inaccuracy has a cost.
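
The addendum doesn't prescribe a schema, but one possible reading of "more than one box" is a junction table, so that a person can be linked to zero, one, or several rows of the static_gender reference table from answer #2. A sketch under that assumption; person, person_gender, and their columns are hypothetical names, and static_gender must already exist:

```sql
-- Hypothetical names; static_gender is the reference table from answer #2.
CREATE TABLE person (
    id INT AUTO_INCREMENT PRIMARY KEY,
    display_name VARCHAR(100) NOT NULL
) ENGINE=INNODB;

-- One row per (person, gender value) pair; the composite primary key
-- prevents the same value from being recorded twice for one person.
CREATE TABLE person_gender (
    person_id INT NOT NULL,
    static_gender_id INT NOT NULL,
    PRIMARY KEY (person_id, static_gender_id),
    FOREIGN KEY (person_id) REFERENCES person (id),
    FOREIGN KEY (static_gender_id) REFERENCES static_gender (ID)
) ENGINE=INNODB;
```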

#2


8  

If you might ever have to deal with more complex gender issues (in-process gender changes or transgender), the best way is to use a reference table of possible values:

CREATE TABLE static_gender (
    ID INT AUTO_INCREMENT PRIMARY KEY,
    Name varchar(10),
    Description varchar(100)
) ENGINE=INNODB;

Initially, you can load it up with:

INSERT INTO static_gender VALUES
(DEFAULT, 'F', 'female'),
(DEFAULT, 'M', 'male');

That way you can expand the table as new values for gender become necessary. In your USER (or whatever) table, you store static_gender_id and get the value for the gender with a JOIN.
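
A short sketch of the referencing table and the JOIN described above. The table app_user and its columns are hypothetical stand-ins for "your USER (or whatever) table"; only static_gender and static_gender_id come from the answer:

```sql
-- Hypothetical user table; static_gender_id points at static_gender.ID.
CREATE TABLE app_user (
    id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    static_gender_id INT NULL,
    CONSTRAINT fk_app_user_gender
        FOREIGN KEY (static_gender_id) REFERENCES static_gender (ID)
) ENGINE=INNODB;

-- LEFT JOIN keeps users whose gender has not been recorded (NULL).
SELECT u.username, g.Name AS gender
FROM app_user AS u
LEFT JOIN static_gender AS g ON g.ID = u.static_gender_id;
```

The foreign key gives the same "only valid values" guarantee the first answer wanted from ENUM, while new values can be added with a plain INSERT instead of an ALTER TABLE.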
