Posts Tagged ‘Data modeling’

In the previous post we looked at the architecture of the Entity Framework. Carrying on from there we will see how to setup a basic Model First Design Scenario.

Common Tasks

For any operation against the Entity Framework, you will need to instantiate a data context. Since this object is bound to external resources it is important to clean it up after use. Wrapping the instantiation in a “using” block should take care of this.

A Simple Designer Scenario

Is this example we will add an entity to the designer in model first mode. We can then add some properties. We will also run through some of the options available to you as we go.

It is a good idea to keep all your EF Models in a class library off your main project. This will afford you the opportunity to reuse the model between multiple projects.

To add a EF Model

  • Right click the project -> Add New Item
  • Select ADO .NET Entity Data Model
  • Give your model a name

  • Select “Empty Model” from the wizard

This will create an EDMX file and add it to your project. You will also notice some references being added to the project.

You should now be presented with the designer surface. Selecting the design surface, explore the properties of the data context but hitting F4

Lets run through some of the more interesting properties here, bearing in mind that we are going to model first then generate our database off the model. This means we need to provide Entity Framework with some working knowledge about our database.

  • Code Generation Strategy: Allows us to use our own T4 templates for code generation.
  • Connection String: Notice this is blanked out, this will be populated once we bind the model to a database.
  • Database Schema Name: Notice in this image it is bold, meaning it has been changed. The default is “dbo”, however it is convenient to partition off your system areas. This will add every generated entity in the model to the database schema specified.
  • Lazy Load Enabled: This allows us to use deferred loading of entities.
  • Metadata Artifact Processing: We will go into this in more detail, but for now we can say it relates to the special way entity framework handles its connection strings.
  • Namespace: This is the code namespace the model will be stored under
  • Pluralize New Objects: This tries to cleverly pluralize the entity collections and database object names. So Company would have a collection name of Companies.
  • Transform Related Text Template on Save: This specifies whether to invoke the T4 code generation when the EDMX is saved.
  • Validate on Build: Verifies that the model is valid and there is no unbound entities or properties.

Once we are happy with the data context we can begin creating entities. We can either drag an entity from the toolbox onto the design surface, or we can use the context menus from the design surface. Either method is fine, but the context menu gives you a net dialog to full out: right click -> Add -> Entity…

  • Properties
    • Entity Name: The name of your entity (singular)
    • Base Type: Relates to inheritance, which we will cover later on.
    • Entity Set: The collection name for your entity, this get auto populated but you can change it if you feel the need.
  • Key Property
    • Create Key Property: Tell the designer to create a property the entity can key off. Every entity MUST have a key. Composite keys can be defined later in the designer.
    • Property Name and Property Type: If generating a single primary key, you can set the type and name of it here

Once the entity has been created we can start adding properties to it. Right click on the properties section in the entity -> Add -> Scalar. Give your property a name. You can get more detail on your property by hitting F4 while it is selected:

Here I have created a Company entity and given it a few properties. Inspecting one of these we see:

  • Concurrency Mode: If set, this property will for part of the update statement generated along with the primary key. This is similar to the concurrency model in the old DataAdapters, except you have much finer control in the Entity Framework. By way of example, lets look at two SQL statements:

A) No Concurrency Control:

    UPDATE Company SET RegistrationNumber = ‘9999’ WHERE Id = 3

B) With Concurrency Control:

    UPDATE Company SET RegistrationNumber = ‘9999’ WHERE Id = 3 AND RegistrationNumber = ‘123’

In example A there is no guarantee that the record has not been modified since we read it. The only way we could detect a concurrency issue is if the row was deleted in which case the ROWS AFFECTED will return 0.

In the example B, because the original value/s of the row participates in locating the record, we can now detect if a concurrency violation has taken place since out last read, i.e.: The record has been changed, since no match will be made. If ROWS AFFECTED returns 0, a concurrency exception is thrown, and the application must offer up some form of conflict resolution.

  • Default Value: This is the default code value. Note: This does NOT set the database default value for a property. It will only initialize the objects property value.
  • Entity Key: Specifies whether this property participates in the entities key.
  • Fixed Length: This is for database generation, will make the property either varchar of char if a string.
  • Max Length: This is for database generation, will set the field length in the database.
  • Nullable: Affects both the entity and the database object.
  • StoreGeneratedPattern: Allows the database to set the value of this property, this is primarily used for auto increment values.
  • Type: the CLR type of this property
  • Unicode: if the type is a string, specifies whether the property supports an extended character set (nvarchar vs varchar)

We now have a basic entity model to query. Although at this stage it is not bound to a database


LinQ to Entities

Since the Entity Framework supports the Linq query interface we have all the Linq commands available to us. The following link provides several querying examples that should cover most development scenarios.

It is important to note however that it is never a good idea to have overly complicated database queries in a Linq statement, lest we lose SQL pre-compilation optimizations, instead for this it is probably better to wrap the query logic in a stored proc and expose it via the entity framework. Here is an example of a simple query layout that one might find in a service class that handles paging.

Some things to note here:

  • The data context instance is wrapped in a using statement so it is nicely cleaned up after use.
  • We ToList() the query, in essence what this does is execute the query and enumerate the results into a List object.
Lazy Loading

Entity framework supports deferred loading, that is, a query will not be executed unless it is used via iteration or other operation. This gives us the ability to logically supplement the query as we pass it through our code. We could also create a base query (say limit the results to the current user ID) and then build upon this. Lazy loading may be switched on an off via the DataContext.

Loading a Network Graph

It is also possible when querying an entity to return all related entities via a navigation property. In this case we instruct the data context to also load a “path”. This will load the queried entities as well as all related entities in the graph. This is a useful for caching or master / detail scenarios. We do this by querying using lambda expressions queries off the data context collection using the “Include” method:

Context.Companies.Include(“Employees”).Select(x => x.Name == “My Start-up”);

In this example we load the company “My Start-up” and all its Employees.

Loading from Inheritance

It is possible to define an inheritance relationship between entities (covered later on). However when querying, it is important to note that principles of polymorphism apply. When querying a base type, all sub types that match the base type query are returned too. Although returned as an enumeration of the base type, the underlying CLR type that they will be instantiated as is the sub type. This has some profound implications on the data loading from SQL too. Any properties that extend the base type will also be loaded from their respective mapped types. If this behaviour is undesirable, then perhaps one should consider implementing a splitting scenario over an inheritance structure.

In the next part we will look into generating the database off the model as well as Inserting, Deleting and Updating.


Recently I decided to roll up my sleeves and demystify the workings of the Entity Framework. I had taken a look at it a while back but found the tooling a little premature, so I put it back on the shelf and carried on bashing away with LinQ to SQL which to my mind “did everything that Entity Framework does”. Boy was I wrong!

This is the first of a multi part series that will take us from EF Noobs to Pirate Ninja EF Ubersoldata. But first. The basics.

What is EF?

Simply put, the Entity Framework is an object relational mapper. That is it takes database objects like tables, sprocs, views, etc and maps them into a nice easy to use object model that can be called in code. If done correctly, this should take the nightmare out of managing the translation of SQL types and structures into CLR types. Entity Framework off this and SO much more.


The basic architecture of Entity Framework is similar to most ORM mappers. The DataContext is a class that holds all the meta data about the model as well as the entity definitions. The DataContext is broken up into three physical layers that aid the transformation from the data layer into the conceptual layer. These three layers are the Conceptual Layer, Storage Layer and the Mapping Layer. Any data changes (CRUD, etc) made in the Conceptual Layer are passed through the mapping layer and mapped onto SQL commands, generated dynamically or called via procs. This gives us a powerful mechanism to perform some advanced operations with the complexity abstracted from the developer. These include:

  • Inheritance scenarios
  • Data splitting scenarios
  • Data composition scenarios
  • Parent / Child awareness in the conceptual layer
  • Renaming and remapping of database entities.
  • Allows strong typed set based operations in code

CSDL: Conceptual Layer

The conceptual layer is a layer whose functionality maps closely onto your applications functionality. Moreover, since we can have multiple Data Contexts, it gives us the ability to map our conceptual layer closely with our problem domain and remove the nuances of having to handle database specific issues in our code. We can target multiple EDMX files to different parts of the system.

SSDL: Store Layer

The Storage layer maps directly onto the database structure. Since this layer has semantic knowledge of the database, it can perform code generation for most database scenarios at runtime. You may map most database entities in the Store layer and bubble them up to the Conceptual Layer. This includes:

  • Tables
  • Views
  • Stored Procs
  • Functions

In some more advanced scenarios you are permitted to define SQL commands in your store and treat them as first class Store elements

MSDL: Mapping Layer

The Mapping Layer ties the Conceptual and Store layers together and handles translations between the two. The existence of this layer allows us some complex binding and mapping scenarios (These will be outlined later on) There are several options when mapping entities:

  • Procs to CLR methods
  • CRUD through auto generated SQL
  • CRUD though stored proc to entity mapping

The EDMX and Code Generation

When creating a new entity data model is created, in fact what happens is the designer tool generates an XML file (.EDMX). We can explore this file by opening the EDMX file in an XML editor. From there you can clearly see the three sections: CSDL, SSDL and MSDL. There is also a fourth section for the designer to track the graphical objects on the design surface.

Using T4 template code generation the XML is parsed and the code behind is generated. This code behind file includes the classes for the three layers as well as the data context. Since T4 code generation is used, we can change the T4 template to use a custom template if we want to modify or extend the generated classes. Out the box the generator creates a Table per Type inheritance strategy with table mappings and all entity classes deriving off the EntityObject class. We will see how to change these behaviors later on.

Model First vs Data First

The Visual Studio 2010 tooling provides a rich modelling environment. We are given two options to create the data model.

  • Model First
  • Generate from Existing Database

Model first allows us to play with the designer and let us generate a database off the model. It abstracts some of the noramlization concerns like associative entities. This is a very natural way for a developer to work. From this we can specify some database options like schema namespace, properties that enlist in concurrency checking, etc. At this stage all your work resides on the conceptual layer. Given this it has enough information to generate a new database script. Use this method when designing a new system from scratch as it employs some best practices that are often overlooked. Once generated, the designer then populates its store layer and mapping layer to map onto the newly created database.

Generate from existing database does exactly what it says. It will go through the database and extract database objects and build the model from the store layer upwards. Once complete, you may remove entities from the conceptual layer and be prompted if you would like to remove from the mapping and store layer accordingly. In some cases you want items to exist in the store layer but mapped/remapped to one or more entities in the conceptual layer. You may also choose to update the model from the database as it changes over time.

In part 2 we will be diving into some code and basic examples.