I am a C++ programmer who occasionally uses MySQL to work with databases, but my SQL knowledge is rather limited. However, I am certainly willing to change that.
At the moment I am trying to do analysis (!) on the data in my database solely with SQL queries. But I am about to give up and instead import the data into C++ and do the analysis in C++ code.
I have discussed this with my colleagues, and they also push me to use C++, saying that SQL is not meant for complex analysis but mainly for importing data (from existing tables) and exporting it (to new tables), plus a little more, such as merging data into joined tables.
Can somebody help me draw the line, so I know when to switch to C++? Of course, performance is also an issue.
What are the indications that things are getting too complex in SQL? Or maybe I am just taking the wrong approach to designing the queries. In that case, where can I find tutorials, books, etc. to learn a better approach?
I hope this is not too vague. I am really a bit lost.
8 answers
#1
23
SQL excels at analyzing large sets of relational data.
The place to draw the line is the scale of your analysis.
If you analyze individual records one at a time, do it in your application.
If you analyze large sets of records as a unit, SQL is definitely the best tool for that job.
SQL is not designed or optimized for row-by-row analysis. But if you want to know something about a million-row group of data, do it in the database.
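To make "analyze a set as a unit" concrete, here is a minimal sketch. It assumes Python's standard sqlite3 module as a stand-in for MySQL, and the table `measurements` and its `value` column are made up for illustration. One statement asks a question about the whole group (row count, mean, and population variance), and only a single result row comes back.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO measurements (value) VALUES (?)",
    [(float(i % 100),) for i in range(10_000)],
)

# Set-based: one statement describes the whole group and the engine scans it.
# Population variance is computed as E[x^2] - (E[x])^2.
row_count, mean, variance = conn.execute(
    "SELECT COUNT(*), AVG(value), "
    "       AVG(value * value) - AVG(value) * AVG(value) "
    "FROM measurements"
).fetchone()

print(row_count, mean, variance)
```

The same numbers could be computed by looping over every row in application code, but then every row has to leave the database first.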
#2
7
I have discussed this with my colleagues, and they also push me to use C++, saying that SQL is not meant for complex analysis but mainly for importing data (from existing tables) and exporting it (to new tables), plus a little more, such as merging data into joined tables.
This is completely arbitrary. Learn SQL. There are a lot of resources available on the web for free.
#3
5
You can do very complex analysis of data in SQL, provided you know how to use the features that SQL offers.
SQL has features for relational operations, like joins and projections. It also has set operations like union, intersection, and restriction (subset); basic arithmetic on numbers, such as the four arithmetic operators and built-in functions like SQRT; and statistical functions like COUNT, SUM, and AVG that can be combined with projections in very interesting ways. A good DBMS will let you extend the built-in functions with your own functions written in C, C++ or maybe PL/SQL.
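A small sketch of how these features combine. It uses Python's standard sqlite3 module in place of MySQL, and the `sensors` and `readings` tables are hypothetical. The function `py_sqrt` is a user-defined function registered from the host language; it only illustrates the "extend the built-in functions" idea and is not how MySQL itself loads C or C++ functions.

```python
import math
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensors  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE readings (sensor_id INTEGER REFERENCES sensors(id), value REAL);
    INSERT INTO sensors  VALUES (1, 'north'), (2, 'south');
    INSERT INTO readings VALUES (1, 4.0), (1, 16.0), (2, 9.0), (2, 9.0), (2, 9.0);
""")

# Extend the function set with a user-defined scalar function (hypothetical name).
conn.create_function("py_sqrt", 1, math.sqrt)

# Join + projection + aggregates + the user-defined function in one query.
rows = conn.execute("""
    SELECT s.name,
           COUNT(*)                AS n,
           SUM(r.value)            AS total,
           AVG(r.value)            AS mean,
           py_sqrt(AVG(r.value))   AS root_of_mean
    FROM readings AS r
    JOIN sensors  AS s ON s.id = r.sensor_id
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

for row in rows:
    print(row)
```

Each output row summarizes one sensor, so the per-group statistics never require the individual readings to reach the application.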
The power you get from these features depends on how well designed the database is. A well-designed database conforms to the relational model and should be relevant to your intended use of the data.
SQL code can be stored in the database in stored procedures. It can be stored in SQL script files. And, as you already know, it can be embedded in application programs. In addition to SQL, you can use OLAP tools and report generators to do standard things with the data very easily.
The people who advise you to keep all of your processing in C++ sound like they have learned just enough to use a database like a big and stupid file system. A good DBMS is much more than that.
#4
4
A SQL server is usually very efficient at handling its own database (depending on the server implementation).
You should use queries to analyze the database.
The main reason for that is the communication overhead.
Even if the server is on the local machine (remote servers have obvious communication overhead), you still have to retrieve the stored information from the SQL server into your C++ program for analysis.
Now if you have tens of thousands of rows in the database, you would have to get the SQL server to read them all and send them to your program, which would probably create a local copy of the data for you to work on.
If you let the SQL server do it with queries, you gain the complex optimizations it applies according to the kind of query you're executing, and in the end you retrieve only a limited amount of data (what you actually need) over the connection.
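The difference in how much data crosses the connection can be sketched like this. Python's standard sqlite3 module stands in for a MySQL server here, so there is no real network involved, and the `events` table is made up for illustration. The point is only the row counts: both approaches give the same answer, but one ships every row to the client and the other ships three.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a MySQL server; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(("a", "b", "c")[i % 3], float(i)) for i in range(30_000)],
)

# Approach 1: pull every row into the application and aggregate there.
all_rows = conn.execute("SELECT category, amount FROM events").fetchall()
client_max = {}
for category, amount in all_rows:
    client_max[category] = max(amount, client_max.get(category, amount))

# Approach 2: let the database aggregate and return only the answer.
summary = conn.execute(
    "SELECT category, MAX(amount) FROM events GROUP BY category"
).fetchall()

print(len(all_rows), "rows fetched client-side vs", len(summary), "from the query")
```

With a real server, the first approach also pays for serializing and copying all of those rows, which is the overhead described above.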
#5
2
You made the right decision to begin data analysis with SQL. Now that you feel your knowledge of SQL is limiting you, you have two choices: give up and switch back to a familiar but not very efficient toolset (C++), or raise your SQL skills.
It's possible that at some point SQL will become too complex as well, but then C++ won't be the answer either; most likely some specialized tools will be.
#6
2
In my opinion, you should only perform analysis in C++ if the database server provides no equivalent for the analysis function. Database servers are very smart, and it is hard, almost impossible, to beat the algorithmic efficiency of their analysis functions. Bringing raw data to the application for analysis also adds a lot of overhead.
If at some point plain SQL becomes overly complex, the native PL of the server could be a good choice.
#7
0
I agree with JNK and Jochai, but disagree with Ascanio. It's better to improve your knowledge of database systems. SQL comes with it.
#8
0
So, this is something I've been thinking about, and it seems to me that SQL, as just a platform/language for storing and manipulating data, should have no inherent advantage over a C++ or C library. It seems to me that theoretically you could build a C++ library just as efficient as SQL, if not more efficient, at doing this. In doing so, you would be able to build it from the ground up, in terms of how ints, chars, strings, and other data types are stored, and make it easier to interface with your particular application (like web development). You could even make it so that the queries could be written in a language like JavaScript (allowing web developers to focus on learning just one language really well).