ADO.NET has long provided a set of generic data objects you can use to access data in a variety of data stores. A few of the most common objects that you’ve probably used since the initial debut of the .NET Framework in the early part of the millennium include:
- SqlConnection and OleDbConnection: Represent a connection to a SQL Server database and an OLE DB data source, respectively.
- SqlDataAdapter and OleDbDataAdapter: Represent data commands and a database connection used to read and update a database.
- DataSet: An in-memory data cache with one or more DataTables, used to access and update data as a generic data container with various data-related behaviors.
- DataTable: An in-memory cache of a single table of data, something like a lightweight DataSet.
- SqlDataReader and OleDbDataReader: Provide a forward-only stream of data from a SQL Server or OLE DB database. These objects don’t actually “contain” data, but provide a fast way to retrieve data for immediate processing or caching in other objects in memory.
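The forward-only streaming style described for SqlDataReader might look something like the following sketch. The connection string, table, and column names are illustrative assumptions, not from the text:

```csharp
// A minimal sketch of forward-only reading with SqlDataReader, assuming a
// SQL Server database with a Customers table (connection string and schema
// are illustrative).
using System;
using System.Data.SqlClient;

class ReaderSketch
{
    static void Main()
    {
        var connectionString = "Server=(local);Database=Sales;Integrated Security=true";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT CustomerID, CompanyName FROM Customers", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                // Forward-only: each row is available exactly once, in order.
                while (reader.Read())
                {
                    // Type-specific getters retrieve fields by ordinal position.
                    int id = reader.GetInt32(0);
                    string name = reader.GetString(1);
                    Console.WriteLine($"{id}: {name}");
                }
            }
        }
    }
}
```

Because the reader holds no data itself, you would typically copy values you need into your own objects before the reader (and its connection) is disposed.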
Together, these objects provide a means for accessing data in a data store, manipulating it, and persisting updates back to the data store. The two data containers, DataSet and DataTable, can contain any two-dimensional data (rows and columns) that you read from a data store or generate in memory as the application executes. They have properties and behaviors that let you perform tasks such as getting a list of the fields in a DataTable (whether a standalone object or part of a DataSet), reading each row of data as a whole or field by field, comparing updated field values in a row to the original values read from the data store, tracking changes in values as well as new and deleted rows, and saving changes back to the database. These objects are remarkably versatile, and developers have built untold numbers of applications using them.
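The fill-modify-update round trip described above can be sketched as follows. The query, table, and column names are illustrative assumptions:

```csharp
// A sketch of the DataAdapter/DataSet round trip: fill a DataSet, change a
// row, and push only the changes back (schema names are illustrative).
using System;
using System.Data;
using System.Data.SqlClient;

class AdapterSketch
{
    static void Main()
    {
        var connectionString = "Server=(local);Database=Sales;Integrated Security=true";

        using (var adapter = new SqlDataAdapter(
            "SELECT CustomerID, CompanyName FROM Customers", connectionString))
        using (var builder = new SqlCommandBuilder(adapter)) // generates UPDATE/INSERT/DELETE
        {
            var dataSet = new DataSet();
            adapter.Fill(dataSet, "Customers");

            DataRow row = dataSet.Tables["Customers"].Rows[0];
            row["CompanyName"] = "Contoso, Ltd.";

            // The original value is retained alongside the change:
            Console.WriteLine(row["CompanyName", DataRowVersion.Original]);

            adapter.Update(dataSet, "Customers"); // persists only changed rows
        }
    }
}
```

The DataRowVersion lookup shows the change tracking the text describes: each row remembers both its original and current field values until the changes are accepted.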
As useful as these objects are, there are a number of problems that developers had to overcome, making data code unnecessarily complex, including:
- Strong coupling between an application and database: The application code that reads data using a DataAdapter or DataReader needs to know the structure of the data coming from the database in response to some kind of query. This means that any change to the database schema requires a change to your application code, and the problem is magnified as more applications use the database. There are ways to mitigate this issue, but they require more complex code.
- Loose typing: DataSets and DataTables primarily rely on strings to identify the table (recordset) and field you want to access or update. The objects have to be able to handle any kind of data contained within a field—strings, dates, numbers of various kinds, etc.—and therefore return generic objects of type System.Object. You often have to perform an expensive conversion to a specific, strong data type before using it, another way that the application is strongly coupled to the database. DataReaders have type-specific methods to retrieve specific data types from a field, which again causes strong coupling.
- Object interactions: A DataSet is able to contain relationships that define how the data in various DataTables are related to each other, something like what foreign keys accomplish in a relational database. But extracting related data requires some often complex, unintuitive code, and sorting and filtering can be a challenge as well.
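The loose-typing problem in particular is easy to see in a few lines. This sketch uses an in-memory DataTable with illustrative column names:

```csharp
// The loose-typing problem in miniature: lookups are by string, values come
// back as System.Object, and casts are checked only at run time.
using System;
using System.Data;

class LooseTypingSketch
{
    static void Main()
    {
        var table = new DataTable("Customers");
        table.Columns.Add("CustomerID", typeof(int));
        table.Columns.Add("BirthDate", typeof(DateTime));
        table.Rows.Add(1, new DateTime(1980, 5, 1));

        object raw = table.Rows[0]["BirthDate"];  // typed as object
        DateTime birthDate = (DateTime)raw;       // runtime cast; a wrong type throws

        Console.WriteLine(birthDate.Year);

        // A typo in the column name compiles fine and fails only at run time:
        // var oops = table.Rows[0]["BirthDtae"]; // throws ArgumentException
    }
}
```

The compiler has no way to catch either the bad cast or the misspelled column name; both surface as runtime exceptions.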
As you can see, there are various problems with using the generic ADO.NET objects for data access, even though developers have used these objects for years. The problem is that every one of those non-trivial applications has had to deal with these objects’ limitations, over and over and over again. But for about a decade, it was the best we had, short of manually developing custom entity objects that encapsulate all the data access code but require massive amounts of custom code.
Entity objects are a big improvement over generic data objects. Instead of instantiating a DataSet object that contains DataTables for Customers, Orders, and OrderDetails, you can instantiate Customer, Order, and OrderDetail objects, each with custom properties and behaviors germane to its entity type, along with navigation properties that provide easy access to related entities. An application can then make use of these entity objects in a very object-oriented fashion: the underlying data is nicely encapsulated in, say, a customer object that acts the way a native customer object should, making the data far easier to work with.
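A minimal sketch of that entity shape might look like this; the class and property names are illustrative, not prescribed by the text:

```csharp
// A sketch of simple entity classes: typed properties plus navigation
// properties linking related entities (names are illustrative).
using System;
using System.Collections.Generic;

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }

    // Navigation property: related orders, no join code required.
    public List<Order> Orders { get; set; } = new List<Order>();
}

public class Order
{
    public int OrderID { get; set; }
    public DateTime OrderDate { get; set; }

    // Navigation property back to the parent entity.
    public Customer Customer { get; set; }
}
```

Walking from a customer to its orders becomes `customer.Orders`, with no DataRelation lookups or string keys involved.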
The interface of an entity object can closely mirror the tables and fields in a relational database, such as with an entity object for each relevant table and properties for all or most fields in the underlying table. Or the object can be completely different with, perhaps, a single entity object that uses data from multiple tables in the database, with properties that bear no resemblance to the fields in the various underlying tables. Or something between these two extremes, whatever makes the entity objects most useful for applications that use them. No matter how you design the objects, somewhere deep within them the objects have to have a mechanism for mapping the database schema to the properties of the entity objects. Entity Framework has all the features it needs to manage this mapping for you.
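One way that mapping mechanism can be expressed, sketched here with EF Core's fluent API (an assumption about the Entity Framework version; the table and column names are also illustrative):

```csharp
// A sketch of schema-to-property mapping in EF Core's fluent API: the
// entity's shape need not mirror the table's (names are illustrative).
using Microsoft.EntityFrameworkCore;

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
}

public class SalesContext : DbContext
{
    public DbSet<Customer> Customers => Set<Customer>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Customer>(entity =>
        {
            entity.ToTable("tblCustomer");           // entity name need not match the table
            entity.HasKey(c => c.CustomerID);
            entity.Property(c => c.CompanyName)
                  .HasColumnName("company_nm")       // property need not match the column
                  .HasMaxLength(100);
        });
    }
}
```

The mapping lives in one place inside the context, so application code works only with the Customer entity and never sees the underlying table or column names.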
There are many benefits to using entity objects instead of generic data objects:
- Strong typing: Each property of an entity object can be the specific type needed for the data it exposes. Strings can be strings, dates can be dates, numbers can be whichever of several numeric types most closely matches the data, and so on. You no longer have to do type conversions when using data in the application; the entity object takes care of that when reading and writing data from and to the database.
- Compile-time checking: Closely related to strong typing is compile-time checking. The generic data objects mostly return data as System.Object, so your application code has to manage conversions and make sure they’re right. With entity objects, the compiler can take care of this task for you, dramatically reducing the potential for runtime bugs.
- Persistence ignorance: You can design these entity objects to encapsulate the persistent data store so that the application doesn’t need to know anything about how and where data is stored, or even its structure. This is known as persistence ignorance, in which information about persistence is segregated from business logic. It disconnects the data storage from the application, making both resilient to changes in the other, and is a major benefit of using entity objects.
- Productivity: A benefit of both strong typing and compile-time checking, as well as the inherent design of entity objects, is that application code is far easier and faster to write. You don’t have to worry about writing and rewriting code to connect to the database or manage the flow of data between objects and the database, as you do with the generic data objects. Even better, because entity objects expose strongly typed members, IntelliSense in Visual Studio can assist you in writing clean, correct code.
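The strong-typing and compile-time-checking benefits can be seen side by side in a short sketch, assuming a simple Customer entity with illustrative property names:

```csharp
// Strong typing and compile-time checking with an entity object
// (the Customer class and its members are illustrative).
using System;

public class Customer
{
    public int CustomerID { get; set; }
    public string CompanyName { get; set; }
}

class BenefitsSketch
{
    static void Main()
    {
        var customer = new Customer { CustomerID = 1, CompanyName = "Contoso" };

        string name = customer.CompanyName;  // already a string: no cast, no conversion
        Console.WriteLine(name);

        // A typo is caught before the program ever runs:
        // customer.CompanyNam = "typo";     // compile error, not a runtime exception

        // Compare the generic-object version, where both issues surface at run time:
        // string name = (string)row["CompanyName"];
    }
}
```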
Entity data objects provide a far richer way to access data and manage its flow between the application and database than the generic ADO.NET data objects. You can do a better job modeling the real world with them, and they can encapsulate behaviors that know how to perform the actions appropriate to each entity.