Original article: http://msdn.microsoft.com/en-us/sync/bb887608.aspx
Introduction
Supporting mobile and remote workers is becoming more important for organizations every day. It is critical that organizations ensure these users have access to the same information they have when they are in the office. In most cases, these workers will have some sort of laptop, office desktop, smartphone or PDA. From these devices, users may be able to access their data directly through VPN connections, Web servers or some other connectivity method into the corporate network, as seen below.
This type of solution is fairly simple to implement. Unfortunately, for most remote workers it is less than satisfactory. Some major disadvantages of this type of solution include:
- Network Requirements: To let users access their information, the remote device needs a constant connection to the corporate network while it accesses the data. For some workers, such as those working from home, this may not be a problem. For others, such as sales reps who are constantly on the move, it may be more difficult. For example, if a sales rep visiting a customer cannot access inventory data because there is no network connectivity, it is very difficult for that user to do the job effectively.
- Data Access Speeds: In a typical client/server corporate environment, users have high-speed networks that give them quick access to information. Remote workers, however, are typically connected over slow, unreliable wired or wireless networks. With this solution, every piece of data the user needs must be downloaded every time it is requested, because there is no way to persist the data on the device. For example, if a sales rep has to download his product list every time he opens his application, he will quickly become frustrated by the time it takes to populate the application with information.
- Single Point of Failure: With this type of solution, all users rely on a single server. If that database becomes unavailable, whether through planned server downtime or server failure, all of the remote workers are disconnected from their data.
- Server Scalability: As more workers work remotely, the performance of the corporate servers is affected, leading to a need for additional hardware.
One alternative to this solution is to implement an Occasionally Connected Application (OCA). An OCA allows a remote worker to continue to access their data, but unlike the previous scenario, where the user accessed the corporate database directly, the information the worker requires is stored locally on the user’s device. To populate this local database, an OCA will typically include some data synchronization capabilities. Data synchronization consists of periodically taking information that is stored in the client database (such as SQL Server Compact) and synchronizing changes with a server database (such as SQL Server). The advantage of a synchronization-based solution is that users no longer need a constant network connection to access their information. Because the data is stored locally, users have constant access to it, and processing is offloaded from the central database. Furthermore, the user is no longer limited by network speed and can access the data at the speed of the device.
In the following two diagrams we can see examples of OCAs where data (represented by a green database) is persisted locally on the remote worker’s device. The first example is a standalone database system where information is stored directly on the user’s device. The second example is a remote office where information is stored in a workgroup database within that office so that multiple local workers can access the data.
A common extension to this type of OCA is support for data collaboration between databases. As seen below, a remote database is free to exchange information with any other database. This type of solution is useful when a team of people is working in remote locations without access to a central database. These workers often need to share information with each other, but because they have no connectivity to the central database, they need to share it through some sort of peer-to-peer network.
Throughout the rest of this document we will discuss OCAs with a special emphasis on Sync Services for ADO.NET, a key technology that enables developers to build OCAs.
Sync Services for ADO.NET and the Microsoft Sync Framework
Sync Services for ADO.NET is a part of the Microsoft Sync Framework (MSF). MSF is a comprehensive synchronization platform that enables developers to add synchronization capabilities to applications, services and devices. MSF solves the fundamental problem of how to synchronize any type of data in any store using any protocol over any topology. Fundamental to MSF is the ability to support offline access and collaboration of data between any types of endpoints (e.g. device to desktop, device to server, etc.).
Sync Services for ADO.NET enables synchronization between ADO.NET-enabled databases. Since Sync Services for ADO.NET is part of MSF, any database that uses Sync Services for ADO.NET can also exchange information with other data sources that MSF supports, such as web services, file systems or custom data stores.
The primary focus of this document is on synchronizing information between database systems and on how Sync Services for ADO.NET helps developers avoid many of the common issues associated with OCAs.
Challenges of Building an OCA
With an OCA, users have quick access to their data and do not require a network connection to the central database in order to access their information. Updates are made locally and then synchronized into the central database, rather than being made directly at the central database. Although OCAs solve the key problems associated with directly accessing a central server, there are a number of challenges associated with building them. The following sections discuss these challenges and propose ways that Sync Services for ADO.NET can be used to avoid them.
Change Tracking
To make data synchronization efficient, some method of change tracking is required. Change tracking is the ability to provide a list of changes (inserts, updates and deletes) that were made to the database from one point in time to another. Imagine a remote user who connects to the central database and wishes to bring their data up to date since the last time they were online. Without change tracking, this user would need to take all of the data from the central data source and then merge that data with the changes made to the local database on the user’s computer or device. In a mobile environment, this is extremely inefficient because of the amount of time a wireless network takes to exchange this information. It is especially slow when large datasets are exchanged. Furthermore, if the connection drops, the data must be downloaded again, making this even less efficient.
To avoid these issues, a change-tracking system is typically used. One popular method for change tracking is through the use of rowversions and triggers. This method requires a rowversion column to be added to each table. For deletions, a separate table or a “deleted” flag column, maintained through triggers, is typically required to log the removed rows.
The major disadvantages of this solution are:
- Changes are required in the central database schema to add columns and tables, which may affect current applications.
- Triggers are fired for each change made to a row, which has performance implications.
- Logic for maintaining proper rowversions and row deletions can get extremely complicated.
- Long-running transactions can result in some data being missed during synchronization, resulting in data inconsistencies.
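The rowversion-and-trigger approach described above can be sketched as follows. This is a minimal illustration, not the mechanism any particular product uses: it assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the central database, and the names customer, customer_tombstone, sync_clock and changes_since are hypothetical. A single counter table plays the role of the rowversion source, since SQLite has no built-in rowversion type.

```python
import sqlite3


def create_tracked_db() -> sqlite3.Connection:
    """Build an in-memory database with trigger-based change tracking."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Hypothetical global counter standing in for a rowversion source.
        CREATE TABLE sync_clock (version INTEGER NOT NULL);
        INSERT INTO sync_clock VALUES (0);

        -- The extra row_version column is the schema change the text warns about.
        CREATE TABLE customer (
            id INTEGER PRIMARY KEY,
            address TEXT,
            row_version INTEGER NOT NULL DEFAULT 0
        );
        -- Separate table that logs deleted rows.
        CREATE TABLE customer_tombstone (
            id INTEGER PRIMARY KEY,
            row_version INTEGER NOT NULL
        );

        CREATE TRIGGER customer_ins AFTER INSERT ON customer BEGIN
            UPDATE sync_clock SET version = version + 1;
            DELETE FROM customer_tombstone WHERE id = NEW.id;
            UPDATE customer SET row_version = (SELECT version FROM sync_clock)
                WHERE id = NEW.id;
        END;
        CREATE TRIGGER customer_upd AFTER UPDATE OF address ON customer BEGIN
            UPDATE sync_clock SET version = version + 1;
            UPDATE customer SET row_version = (SELECT version FROM sync_clock)
                WHERE id = NEW.id;
        END;
        CREATE TRIGGER customer_del AFTER DELETE ON customer BEGIN
            UPDATE sync_clock SET version = version + 1;
            INSERT OR REPLACE INTO customer_tombstone
                VALUES (OLD.id, (SELECT version FROM sync_clock));
        END;
    """)
    return db


def changes_since(db: sqlite3.Connection, last_version: int):
    """Return (upserted rows, deleted ids, current version) after last_version."""
    upserts = db.execute(
        "SELECT id, address FROM customer WHERE row_version > ? ORDER BY id",
        (last_version,),
    ).fetchall()
    deletes = [row[0] for row in db.execute(
        "SELECT id FROM customer_tombstone WHERE row_version > ? ORDER BY id",
        (last_version,),
    )]
    current = db.execute("SELECT version FROM sync_clock").fetchone()[0]
    return upserts, deletes, current
```

A client would store the returned version after each successful download and pass it back on the next call. Even in this small form, the sketch shows the costs listed above: an extra column, an extra table, and three triggers that fire on every row change.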
SQL Server 2008 has introduced a new alternative method for tracking changes called SQL Server 2008 Change Tracking. The concept behind change tracking is that an administrator marks certain tables to be monitored for changes. From that point, SQL Server 2008 keeps track of any inserts, updates or deletes that are made. When a remote “requestor” requests changes, SQL Server 2008 provides all of the changes that have occurred since the last successful download, as specified by the requestor. Sync Services for ADO.NET has been built to take advantage of SQL Server 2008 change tracking, which provides the following advantages for an OCA environment:
- No schema changes are required to be able to track changes.
- Triggers are not required for tracking changes, which means that tracking changes has far less of an impact on the server. In certain cases, the DML overhead associated with trigger-based change tracking can be 400% greater than that of SQL Server 2008 change tracking. The overhead of enabling SQL Server 2008 change tracking is similar to the overhead of maintaining a second index.
- All of the logic for tracking changes is internal to the SQL Server engine, which reduces the complexity of setting up this type of system.
- Data consistency issues associated with long-running transactions are no longer a concern.
- Integrated database administration features such as Dynamic Management Views and security are included.
Maintaining Change Data
Change tracking tables will typically grow quite quickly, taking up storage space and affecting the performance of queries executed against them. As such, the next logical step is to determine when tracked changes can be removed. Basing cleanup on when users last synchronized is difficult because a device may not have been synchronized or the user may have simply stopped using it. These factors make it difficult for administrators to determine when all active users have received the changes, which is what would allow the change data to be cleaned up.
In SQL Server 2008, customizable retention thresholds are used to maintain this change data and automatically clean up old data. Furthermore, cleanup runs as a background process to help offset any performance impact on the server.
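The idea of a retention threshold can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the ChangeLog class, its fields and its methods are hypothetical and do not reproduce how SQL Server stores or cleans up change data. It shows the consequence that matters to a client: once changes older than the threshold have been removed, a client that has not synchronized since before those changes can no longer be brought up to date incrementally and must reload its data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ChangeLog:
    """Hypothetical change log with time-based retention."""
    retention_seconds: float
    entries: List[Tuple[int, float, str]] = field(default_factory=list)
    min_valid_version: int = 0  # highest version already cleaned up
    next_version: int = 1

    def record(self, row_id: str, now: float) -> None:
        """Log a change to row_id at time `now`."""
        self.entries.append((self.next_version, now, row_id))
        self.next_version += 1

    def cleanup(self, now: float) -> None:
        """Drop entries older than the retention threshold."""
        cutoff = now - self.retention_seconds
        expired = [e for e in self.entries if e[1] < cutoff]
        if expired:
            self.min_valid_version = max(e[0] for e in expired)
        self.entries = [e for e in self.entries if e[1] >= cutoff]

    def changes_since(self, last_version: int) -> Optional[List[str]]:
        """Return changed row ids, or None if history was already removed."""
        if last_version < self.min_valid_version:
            return None  # client must perform a full re-download
        return [e[2] for e in self.entries if e[0] > last_version]
```

Choosing the threshold is a trade-off: a longer retention period keeps rarely connected devices on the incremental path, while a shorter one keeps the tracking data small.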
Conflict Detection and Resolution
Conflicts are another issue that can arise in an OCA. A conflict occurs when two or more databases change the same piece of data and the synchronization engine then tries to apply those changes to a single database. For example, imagine two sales reps who try to update a customer’s address to two different values. The first sales rep successfully synchronizes the update to the central database. When the second sales rep goes to synchronize the update to the central database, a conflict occurs because the current state of the row is different from what the synchronization engine was expecting. There are a variety of ways to resolve these conflicts. For example, the last change to come in may be the one that wins, or the outcome may be based on which user has the most seniority.
Sync Services for ADO.NET provides conflict detection and resolution capabilities out of the box, and SQL Server 2008 improves on this experience by decreasing the complexity associated with identifying conflicts. Using this technique, the first sales rep in our example will successfully upload the change to the central server. It will be applied since there is no conflict yet. When the second sales rep goes to upload the change, a conflict will be detected because the change version on the central server does not exist in the second sales rep’s database. From this point, logic in the remote application determines how to handle the conflict.
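The two-sales-rep scenario can be walked through with a small version-comparison sketch. It assumes a simplified model in which each row carries a single integer version and each upload states the version it was based on; the Server class, the apply method and the resolver functions are hypothetical names for illustration and are not the Sync Services for ADO.NET API.

```python
from typing import Callable, Dict, Tuple

# A resolver receives (server_value, client_value) and returns the winner.
Resolver = Callable[[str, str], str]


def client_wins(server_value: str, client_value: str) -> str:
    return client_value


def server_wins(server_value: str, client_value: str) -> str:
    return server_value


class Server:
    """Hypothetical central store mapping row id -> (value, version)."""

    def __init__(self) -> None:
        self.rows: Dict[int, Tuple[str, int]] = {}

    def apply(self, row_id: int, new_value: str, base_version: int,
              resolve: Resolver) -> str:
        current_value, current_version = self.rows.get(row_id, ("", 0))
        if base_version == current_version:
            # The client saw the latest state, so the change applies cleanly.
            self.rows[row_id] = (new_value, current_version + 1)
            return "applied"
        # The row changed since the client last synchronized: a conflict.
        winner = resolve(current_value, new_value)
        if winner != current_value:
            self.rows[row_id] = (winner, current_version + 1)
        return "conflict"
```

Passing the resolver in as a function mirrors the point made above that application logic, rather than the detection step, decides which change wins.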
Prioritizing Data Exchange
When an OCA is deployed, users will often look for ways to optimize their data exchange given limited or slow network connections. One good way to optimize data synchronization is to prioritize data, identifying the data that is high priority or critical. Using priority data exchange, critical changes can be synchronized immediately while less important data is left to be synchronized at a later time. For example, imagine a user who currently has access only to a slow connection. Given the limited bandwidth, this user wishes to send changes from a single (highly important) table and leave changes from other tables for later in the day, when she can connect through a faster network connection. With Sync Services for ADO.NET, synchronization can be executed from the remote application on a table-by-table basis and on an upload-only or download-only basis. This means that specific tables can be synchronized, leaving others to be synchronized at a later time, while still guaranteeing data exchange for all data.
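One way an application might decide which tables to synchronize on a slow link is sketched below. The TableSync record, its priority and direction fields, and the plan_sync function are hypothetical and chosen only for illustration; they do not correspond to types in Sync Services for ADO.NET.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableSync:
    """Hypothetical per-table synchronization setting."""
    name: str
    priority: int   # lower number = more urgent
    direction: str  # "upload", "download" or "bidirectional"


def plan_sync(tables: List[TableSync],
              max_priority: int) -> Tuple[List[str], List[str]]:
    """Split tables into those to sync now and those deferred for later."""
    ordered = sorted(tables, key=lambda t: t.priority)
    now = [t.name for t in ordered if t.priority <= max_priority]
    later = [t.name for t in ordered if t.priority > max_priority]
    return now, later
```

On a slow connection the application could call plan_sync with a low max_priority and run only the first list, then call it again with a higher value once a faster network is available, so that every table is eventually exchanged.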
Background Synchronization
Nothing frustrates a user more than being locked out of their application while another process (such as data synchronization) takes control. With most OCAs, users are not able to access their local database while processes like data synchronization are being executed.
With Sync Services for ADO.NET, synchronization can run as a background thread. As long as the local database supports synchronizing on a separate connection (as SQL Server Compact does), synchronization can be executed in the background. This allows a local user to continue to use and change their database while synchronization is being completed.
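The general pattern of running synchronization on a background thread can be sketched with Python's standard threading module. The BackgroundSync class and the sync function passed to it are hypothetical; the sketch assumes, as the text does, that the sync function uses its own database connection so that the foreground can keep working.

```python
import threading
from typing import Any, Callable, Optional


class BackgroundSync:
    """Hypothetical helper that runs a sync function on a worker thread."""

    def __init__(self, sync_fn: Callable[[], Any]) -> None:
        self._sync_fn = sync_fn
        self._thread: Optional[threading.Thread] = None
        self.result: Any = None
        self.error: Optional[Exception] = None

    def start(self) -> bool:
        """Start a sync unless one is already running."""
        if self._thread is not None and self._thread.is_alive():
            return False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def _run(self) -> None:
        try:
            self.result = self._sync_fn()
        except Exception as exc:  # surface failures to the caller
            self.error = exc

    def wait(self, timeout: Optional[float] = None) -> Any:
        """Block until the running sync finishes, then return its result."""
        if self._thread is not None:
            self._thread.join(timeout)
        return self.result
```

The application calls start and returns to the user immediately; it checks result or error later, for example to refresh the display once new data has arrived.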
Multiple Synchronization Topologies
When first architecting an OCA, organizations will typically start with a single network topology. For example, imagine a group of delivery workers who download their route information at the start of the day from a central warehouse database and then upload their package and route completion information to the same warehouse database at the end of the day. In a traditional synchronization environment this would be considered a hub-and-spoke model, as seen below.
However, what happens if a delivery worker completes his work on the other side of the city from the warehouse and wishes to synchronize with a different warehouse? Or, to take it one step further, why not let the delivery worker synchronize data with another delivery worker by using data collaboration? The second worker could then go to a warehouse and upload both workers’ data. As you can imagine, the logic required to orchestrate this type of synchronization can quickly become very complicated.
With Sync Services for ADO.NET, virtually any type of synchronization topology can be used. Organizations are no longer limited to a single topology but can choose any one or a combination of topologies, meaning they can create combinations of offline and collaboration-based architectures.
Custom Client and Server Databases
Without a doubt, there will come a time when you need to add a new database into your synchronization environment. For example, an enterprise may have legacy systems or mainframes that need to be integrated into the existing synchronization environment. Alternatively, a mobile device may already have a custom database that needs to be integrated with the back-end database. To make matters worse, what if you are not able to modify that database to allow you to track changes? With Sync Services for ADO.NET, providers for common databases such as SQL Server Compact and SQL Server 2008 are included out of the box. However, using the Microsoft Sync Framework, a developer can create custom providers that integrate synchronization with virtually any place you store your data, whether that is a Web service, a USB key-chain drive, or a SIM card on a cell phone.
Security
In an OCA, there are many aspects of security that may need to be implemented. Some of these areas include:
- Database encryption
- Database authentication
- Encryption of data synchronization
- Internal authentication
Sync Services for ADO.NET helps to increase the security of applications that depend on synchronization. On the device side, SQL Server Compact offers the ability both to encrypt the database and to enable user authentication. From a synchronization perspective, Sync Services for ADO.NET supports encrypting data as it travels between databases. On the corporate side, SQL Server 2008 as well as existing IIS security can be leveraged for user authentication as users exchange data.
Summary
Sync Services for ADO.NET is a comprehensive data synchronization solution that enables developers to build solutions that support synchronization of any database, on any data protocol, over any network topology. With Sync Services for ADO.NET and the Microsoft Sync Framework, the synchronization of information can flow virtually anywhere in or out of the organization, allowing developers to build efficient and highly scalable Occasionally Connected Applications. Using the Microsoft Sync Framework, developers can extend Sync Services for ADO.NET applications to enable collaboration of data between devices using any data source.
For more information about Sync Services for ADO.NET and the Microsoft Sync Framework, please visit: http://msdn2.microsoft.com/en-us/sync/default.aspx .