My project requires a solution to store billions of rows of data with minimal relational data.
The raw data is currently in a text file and looks something like this: Id(), Type(int), Data(binary data between 1-10MB).
The Id column in the raw text file can be ignored when importing and replaced with a new int, bigint or uniqueidentifier, whichever has better performance.
Any suggestions on what I should use and how I should design the database?
Also, the front end will be written in C# with EF4 (or something else; I'm open to all suggestions).
7 solutions
#1
2
I think you might be interested in a serverless database, like SQLite or SQL Server Compact.
You do not have to install a server, but you can query your data using SQL, LINQ etc.
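As a rough illustration of the serverless approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `records` and its column names are assumptions made for the example, not something from the question, and small random payloads stand in for the real 1-10MB values.

```python
import os
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")  # pass a file path for a persistent database
conn.execute(
    """
    CREATE TABLE records (
        id   INTEGER PRIMARY KEY,  -- 64-bit key, assigned automatically on insert
        type INTEGER NOT NULL,
        data BLOB    NOT NULL
    )
    """
)

# Small random payloads stand in for the 1-10MB binary values.
rows = [(t, os.urandom(1024)) for t in (1, 2, 2)]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO records (type, data) VALUES (?, ?)", rows)

# Query by type without loading the blobs themselves.
count_type_2 = conn.execute(
    "SELECT COUNT(*) FROM records WHERE type = ?", (2,)
).fetchone()[0]
first_id, first_len = conn.execute(
    "SELECT id, length(data) FROM records ORDER BY id LIMIT 1"
).fetchone()
print(count_type_2, first_id, first_len)
```

The original Id column is dropped here and replaced by an auto-assigned integer key, as the question allows. Whether a single-file database copes with billions of rows of this size is a separate question that this sketch does not answer.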
#2
1
Windows Azure Storage Services is the closest you're going to get if you're looking for a NoSQL product by Microsoft.
It's a cloud thing and Microsoft doesn't have a separate product that you yourself host.
Windows Azure Storage Services is, however, built on top of MS SQL Server; it is just not exposed through the normal TDS protocol. That way, they never allow access to the database without NoSQL in mind. That doesn't stop you from treating your typical SQL Server database as if it were NoSQL, and if you did, you should be able to scale really well. The idea of NoSQL is just that you don't do stuff that doesn't scale horizontally.
#3
1
http://en.wikipedia.org/wiki/NoSQL
NoSQL is not the equivalent of any RDBMS, so "What's the NoSQL equivalent of MS SQL Server" makes no sense. It should either be NoSQL vs MS SQL or no mention of NoSQL at all.
#4
1
There's a provider that gives you some feeling of having a NoSQL document store over SQL Server: http://www.sisodb.com
#5
0
What is the problem you are attempting to solve with the system? What type of data are you analysing?
Assuming this is an analysis system rather than a transaction processing system, there are tools for analysing large data sets that might have the functionality you need without requiring you to write too much code. For example, the Visualisation Toolkit from Kitware (www.vtk.org) or MIDAS (http://www.kitware.com/products/midas.html).
MIDAS integrates multimedia server technology with Kitware’s open-source data analysis and visualization clients. The server follows open standards for data storage, access and harvesting. MIDAS has been optimized for storing massive collections of scientific data and related metadata and reports. MIDAS is available under a non-restrictive (BSD) open-source license.
Alternatively IBM have OpenDX http://www.research.ibm.com/dx/
#6
0
I suggest you consider using SQL Server with the FILESTREAM feature for the binary data.
http://technet.microsoft.com/en-us/library/bb933993.aspx
Your question has nothing much to do with NoSQL. Don't go thinking that filestream is the SQL Server "equivalent" of NoSQL!
#7
0
I think the closest approach you will get from MS SQL is the XML column type, since XML is by definition semi-structured data. So you can make a field of xml type and add your document there, with the binary data encoded in hex or base64 format (if data storage space is not an issue for you).
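To put a number on the storage caveat, here is a small sketch that wraps a binary payload in an XML document and compares the encoded sizes. The element and attribute names (`document`, `data`, `encoding`) are made up for the example, and a 3 KiB random payload stands in for the real data. The ratios are character counts relative to the raw byte count; the size on disk depends on how the database stores the XML.

```python
import base64
import os
import xml.etree.ElementTree as ET

payload = os.urandom(3 * 1024)  # stand-in for a 1-10MB binary value

b64_text = base64.b64encode(payload).decode("ascii")
hex_text = payload.hex()

# Encoded length relative to the raw payload length.
b64_ratio = len(b64_text) / len(payload)
hex_ratio = len(hex_text) / len(payload)
print(f"base64: {b64_ratio:.2f}x, hex: {hex_ratio:.2f}x")

# Hypothetical document layout: element and attribute names are illustrative.
doc = ET.Element("document", {"type": "1"})
ET.SubElement(doc, "data", {"encoding": "base64"}).text = b64_text
xml_text = ET.tostring(doc, encoding="unicode")

# Round trip: parse the XML and recover the original bytes.
restored = base64.b64decode(ET.fromstring(xml_text).find("data").text)
print(restored == payload)
```

Base64 emits four characters for every three input bytes and hex emits two characters per byte, so base64 is the more compact of the two text encodings.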