【文件属性】:
文件名称:DbfDotNet_version_1.0_Source
文件大小:65KB
文件格式:ZIP
更新时间:2012-06-25 11:17:29
Dbf DataSet
Introduction
This article present a tiny database engine which implements fixed width record tables and BTree indexes.
This library is a work in progress, I am looking for some feedback on this to see if there is a need for it.
This article is also a work in progress please bookmark and come back later, if you want to see the final article.
Why an embedded database
Although most of us will use a SQL Server to store and retrieve data sets.
There are several situation where an embedded database make sense.
When you don't have a SQL Server available
When you want your footprint as small as possible and can't afford SQL Express
When you want to manipulate or cache SQL data
When you need to write highly procedural data manipulation routines
When you want maximum speed
Features
Despite its small size DbfDotNet provides a number of features that you might find useful
Type safe
In DbfDotNet you manipulate classes with native field types. All data conversion plumbing is done automatically.
Very simple entity framework
Creating a record and accessing its propery is only what you need.
Very small memory footprint
Last time I checked the dbfDotNet dll was 50Kb. Other databases are 1Mb to 10Mb.
I would appreciate if someone could do some memory usage comparison (I will insert it here).
Fast
DbfDotNet was conceived for speed.
DbfDotNet do not use PInvoke, Threading locks, and do not implement any transaction system.
Those 3 technologies have a performance cost that it won't have to pay.
In contrast it is using TypeSafe records (without boxing/unboxing) and type safe emitted code. The code is emitted only once per table.
It has therefore I believe the potential to be the fastest embedded .Net database there is.
I would appreciate if someone could do some speed comparison (I will insert it here).
Very small runtime memory usage
When you use in Memory DataTable or SQL requests that return DataSets, the entire result sets is in memory.
DbfDotNet works conjointly with the garbage collector. As soon as you're finished modifying an entity the garbage collector will mark the record buffer to be saved to disk and released from memory.
Why Dbf
By default the files are compatible with dBase and can therefore be open in Excel and many other packages.
I have been asked : Why Dbf ? Dbf is an old format.
The answer is a bit long but simple.
As I said earlier DbfDotNet is designed to be as fast as possible.
In order to get the database started and get some interest I need two things:
A good product
A good user base
I know by experience that the DBF format will appeal to some of you for several reason:
You can easily backup DBF files (and leave index files)
You can check DBF content using Excel and many other tools
DBF is well known and simple to implement
It can be extended to modern types (and has been by clipper and fox pro)
Most importantly for me, implementing the .DBF rather that my own custom format has no impact on runtime speed.
How does it compare to ADO.Net, SQL, SqlLite, SharpSQL ...
I did some speed test against another database (which I won't name)
The results are quite encouraging.
Dbf.Net ADO.Net
Collapse Copy CodeOpening DbfDotNetDatabase: 185 ms
Insert 1000 individuals: 39 ms
Read individuals sequentially: 5 ms
Read individual randomly: 3 ms
Modifying individuals: 21 ms
Create DateOfBirth index: 77 ms
Michael Simmons 22/07/1909
Mark Adams 21/09/1909
Charles Edwards 28/09/1909
... total 1000 records
Enumerate Individuals by age: 36 ms
Closing DbfDotNetDatabase: 44 ms Collapse Copy CodeOpening ADO.Net Database: 459 ms
Insert 1000 individuals: 80601 ms
Read individuals sequentially: 1655 ms
Read individual randomly: 1666 ms
Modifying individuals: 75574 ms
Create DateOfBirth index: 80 ms
Michael Simmons 22/07/1909
Mark Adams 21/09/1909
Charles Edwards 28/09/1909
... total 1000 records
Enumerate Individuals by age: 29 ms
Closing ADO.Net Database: 0 ms
In this test Dbf.Net runs nearly 400 times faster. This is quite unfair however. Dbf.Net does not have transactions and is not ACID.
Lets not focus to much on speed but more on code differences:
Creating a Table
Creating the table is quite different. Dbf.Net requires a type safe record upfront to create a table. In ADO.Net you provide a string.
Dbf.Net ADO.Net
Collapse Copy CodeDbfTable mIndividuals;
void CreateIndividualTable()
{
mIndividuals =
new DbfTable(
@"individuals.dbf",
Encoding.ASCII,
DbfDotNet.DbfVersion.dBaseIV);
}
class Individual
: DbfDotNet.DbfRecord, IIndividual
{
[DbfDotNet.Column(Width = 20)]
public string FIRSTNAME;
[DbfDotNet.Column(Width = 20)]
public string MIDDLENAME;
[DbfDotNet.Column(Width = 20)]
public string LASTNAME;
public DateTime DOB;
[DbfDotNet.Column(Width = 20)]
public string STATE;
}
Collapse Copy CodeConnection _cnn = null;
void ITestDatabase.CreateIndividualTable()
{
_cnn = new System.Data.Connection(
"Data Source=adoNetTest.db");
_cnn.Open();
using (DbCommand cmd = _cnn.CreateCommand())
{
cmd.CommandText = "CREATE TABLE
INDIVIDUAL (ID int primary key,
FIRSTNAME VARCHAR(20),
MIDDLENAME VARCHAR(20),
LASTNAME VARCHAR(20),
DOB DATE,
STATE VARCHAR(20))";
cmd.ExecuteNonQuery();
}
}
Inserting new entries in a table:
Dbf.Net ADO.Net
Collapse Copy Codevoid InsertNewIndividual(
int id,
string firstname,
string middlename,
string lastname,
DateTime dob,
string state)
{
var indiv = mIndividuals.NewRecord();
indiv.FIRSTNAME = firstname;
indiv.MIDDLENAME = middlename;
indiv.LASTNAME = lastname;
indiv.DOB = dob;
indiv.STATE = state;
indiv.SaveChanges();
}
Collapse Copy Codevoid InsertNewIndividual(
int id,
string firstname,
string middlename,
string lastname,
DateTime dob,
string state)
{
using (DbCommand cmd =
_cnn.CreateCommand())
{
cmd.CommandText = string.Format(
"INSERT INTO INDIVIDUAL (ID,
FIRSTNAME, MIDDLENAME, LASTNAME,
DOB, STATE) VALUES({0},
'{1}', '{2}', '{3}',
'{4}', '{5}');",
id, firstname, middlename,
lastname,
dob.ToString("yyyy-MM-dd HH:mm:ss"),
state);
cmd.ExecuteNonQuery();
}
}
Getting an individual by record ID
Dbf.Net ADO.Net
Collapse Copy CodeIIndividual GetIndividualById(int id)
{
DbfDotNetIndividual result =
mIndividuals.GetRecord(id);
return result;
}
Collapse Copy CodeIIndividual GetIndividualById(int id)
{
using (DbCommand cmd =
_cnn.CreateCommand())
{
cmd.CommandText = string.Format(
"SELECT * FROM INDIVIDUAL
WHERE ID=" + id);
var reader = cmd.ExecuteReader();
try
{
if (reader.Read())
return GetNewIndividual(reader);
else return null;
}
finally
{
reader.Close();
}
}
}
Individual GetNewIndividual(
DbDataReader reader)
{
var res = new Individual();
res.ID = reader.GetInt32(0);
res.FirstName = reader.GetString(1);
res.MiddleName = reader.GetString(2);
res.LastName = reader.GetString(3);
res.Dob = reader.GetDateTime(4);
res.State = reader.GetString(5);
return res;
}
class Individual : IIndividual
{
public int ID { get; set; }
public string FirstName { get; set; }
public string MiddleName { get; set; }
public string LastName { get; set; }
public DateTime Dob { get; set; }
public string State { get; set; }
}
Saving a modified individual back to the database.
In Dbf.Net you don't have to write any code, if you don't want to wait for the garbage collector to collect your individual you can call SaveChanges.
Dbf.Net ADO.Net
Collapse Copy Codevoid SaveIndividual(
Individual individual)
{
individual.SaveChanges();
}
Collapse Copy Codevoid SaveIndividual(
IIndividual individual)
{
using (DbCommand cmd =
_cnn.CreateCommand())
{
cmd.CommandText = string.Format(
"UPDATE INDIVIDUAL
SET DOB='{1}' WHERE ID={0};",
individual.ID,
individual.Dob.ToString(
"yyyy-MM-dd HH:mm:ss"));
cmd.ExecuteNonQuery();
}
}
Creating an Index
Dbf.Net ADO.Net
Collapse Copy Codevoid CreateDobIndex()
{
var sortOrder =
new DbfDotNet.SortOrder(
/*unique*/false);
sortOrder.AddField("DOB");
mDobIndex = mIndividuals.GetIndex(
"DOB.NDX", sortOrder);
}
Collapse Copy Codevoid CreateDobIndex()
{
using (DbCommand cmd =
_cnn.CreateCommand())
{
cmd.CommandText =
string.Format(
"CREATE INDEX DOB_IDX ON
INDIVIDUAL (DOB)");
cmd.ExecuteNonQuery();
}
}
Getting individuals sorted by Age
Dbf.Net ADO.Net
Collapse Copy CodeIEnumerable
IndividualsByAge()
{
foreach (Individual indiv
in mDobIndex)
{
yield return indiv;
}
}
Collapse Copy CodeIEnumerable
IndividualsByAge()
{
using (DbCommand cmd =
_cnn.CreateCommand())
{
cmd.CommandText = string.Format(
"SELECT * FROM INDIVIDUAL
ORDER BY DOB");
var reader = cmd.ExecuteReader();
try
{
while (reader.Read())
{
yield return
GetNewIndividual(reader);
}
}
finally
{
reader.Close();
}
}
}
High Level Interface
I have been asked how I compare to other SQL databases.
Again DbfDotNet is not a SQL engine.
It is rather an object persistence framework, like the Microsoft Entity Framework or NHibernate.
The difference is that it doesn't translate object manipulations into SQL requests because it speaks directly to the database layer.
I would love to write a proper Dbf to Linq interface, if you want to help me on this please volunteer.
The difference
Using the code
Warning: This project is at its infancy, it has not been tested thoroughly.
You can try it but please don't use it in a live environment.
If you want speed however and are ready to either report or fix issues that might arrise:
Create a C# project
Reference DbfDotNet.dll in your project
Create a record class
Write some code manipulate the records
Point 3 and 4 are expanded below.
The DbfRecord class
The DbfRecord class represent one row in your table.
You can can the column attribute to change DBF specific parameters.
Collapse Copy Code class Individual : DbfDotNet.DbfRecord
{
[Column(Width = 20)] public string FIRSTNAME;
[Column(Width = 20)] public string MIDDLENAME;
[Column(Width = 20)] public string LASTNAME;
public DateTime DOB;
[Column(Width = 20)] public string STATE;
}The system automatically chooses the DbfField most appropriate for your datatype.
The DbfTable class
In order to store your records somewhere you need to create a Table:
Collapse Copy Code individuals = new DbfTable(
@"individuals.dbf",
Encoding.ASCII,
DbfVersion.dBaseIV);
Note that this using a type safe template. Every record in the table are individual's.
Record Manipulation
You can add new lines in the table by using the NewRecord
Collapse Copy Code var newIndiv = individuals.NewRecord();Then you simply use the fields in your record
Collapse Copy Code newIndiv.LASTNAME = "GANAYE";Optionally you can make a call to SaveChanges to immediately save your changes.
If you don't the data will be saved when your individual is garbage collected.
Collapse Copy Code newIndiv.SaveChanges();
Index support
This is still very basic. First you define your sort order:
Collapse Copy Code var sortOrder = new SortOrder(/* unique */ false);
sortOrder.AddField("LASTNAME");Then you can get your index:
Collapse Copy Code mIndex = individuals.GetIndex("lastname.ndx", sortOrder);
You can then, In a type safe way, retrieve any individual from your index.
Collapse Copy Code individual = mIndex.GetRecord(rowNo);
In order to maximize speed, the index emit its own type safe code for :
reading the index fields from the DBF record
reading and writing index entries
comparing index entries
Inner architecture
DbfDotNet main class is the ClusteredFile
The ClusteredFile is a wrapper around stream that provide paging and caching support.
The ClusteredFile is the base class for DbfFile and NdxFile. It will also be the base class for memo files when I write them.
The ClusteredFile uses a class called QuickSerializer to serialize the record content to a byte array.
QuickSerializer parse the Record fields and generate a bit of IL code for every fields to allow reading, saving and comparison.
NdxFile implements a B+Tree index
Roadmap
My plan is to keep this library extremelly small. It is not my intention to implement any transaction or multi-threading support.
I will implement :
support for every DBF fields types
memo fields (VARCHAR type)
multiple indexes files (*.mdx)
Proper documentation
LINQ (in a separate dll)
If you want to help me on this project please contact me.
Points of Interest
In order to maximize speed I forced myself to not use any thread synchronization locking.
Each set of Dbf + Indexes must be called from a given thread.
In other word each dbf file and its index can be used by only one thread.
I encountered a problem though when the Garbage Collector finalize a record, this is done in the Garbage Collector thread. I did not want to lock a resource and ended up writing this code:
Collapse Copy Codeclass Record
{
private RecordHolder mHolder;
~Record()
{
try
{
...
}
finally
{
mHolder.RecordFinalized.Set();
}
}
}
Each record has a RecordHolder that store a ReadBuffer and potentially a WriteBuffer.
When the record finalize it signal the RecordHolder that the record has been finalized. This instruction is not blocking, it raises a flag that can be used in other threads.
Collapse Copy Codeclass ClusteredFile
{
internal virtual protected Record InternalGetRecord(UInt32 recordNo)
{
RecordHolder holder = null;
if (!mRecordsByRecordNo.TryGetValue(recordNo, out holder)) {...}
record = holder.mRecordWeakRef.Target;
if (record==null)
{
// the object is not accessible it has finalized a while ago or is being finalized
if (holder.RecordFinalized.WaitOne())
{
//Now it has finalized we will create a new record
holder.RecordFinalized.Reset();
holder.Record = OnCreateNewRecord(/*isnew*/false, recordNo);
}
}
return holder.Record;
}
}
And then when the table thread try to get the record while it is disposing we use the method : holder.RecordFinalized.WaitOne() to make sure the finalization has completed first. Most of the time this method won't be blocking your DBF thread as the record has been finalized some time ago.
History
2009 June 4th : Added samples and ADO.Net comparison
2009 June 1st : First DbfDotNet (C#) release.
2000 May 21st : I wrote my first database engine, it is called tDbf and works on Delphi.
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
【文件预览】:
DbfDotNet version 1.0 Source
----DbfDotNet()
--------DbfDotNet.Linq()
--------DbfDotNet()
--------DemoApplication1()
--------DbfDotNet.sln(3KB)