Real-World Validation

by msimpson 5/12/2009 8:10:00 AM

I'm in the middle of rewriting this site using ASP.NET MVC, and am wrestling with the age-old question of how best to validate my data.  Like every developer, I've addressed this in different ways in the past, and thought I would take some time to think through the problem systematically and figure out the 'best' solution for my needs.

The first thing to do is examine the problem.  Here are the goals and objectives as I see them:

  • Ensure that all input data is validated before it is persisted;
  • Keep my domain model clean at all times, to the degree possible;
  • Provide enough flexibility to handle complex business logic;
  • Expose validation through the entity model, while avoiding or at least minimizing dependencies on it;
  • Gracefully handle multiple levels or stages of validation without unduly duplicating code;
  • Make it easy to write validation code by providing 'helper' methods for common types of validation;
  • Facilitate a good user experience by making it easy to collect all validation errors and display them.

As I see it, there are various ways in which data validation approaches differ from one another.  The first is when the validation occurs: it can either be done immediately when working with the entity model, or it can be deferred until later and done all at once.  It seems clear that deferred validation is significantly more flexible, but at the cost of removing your ability to rely on the domain model always being valid.

The second characteristic is division of responsibility, in other words who's validating whom. I differentiate between the local approach, where domain objects validate themselves, vs. the remote approach where a validator object validates another object.  The remote approach seems more flexible here, but there are a lot of options that blur the lines - validators that exist as separate objects but are incorporated into entities via composition, for example, a la the Strategy pattern.

The third aspect is support for arbitrary context.  Complex validation often depends on things besides the entity's current internal state; some validation approaches support passing context, and some don't.

As a practical example of context, my team wrote a parser for ACH data that's used both when originating files and receiving files.  The structural model for the data is always the same, but the validation rules for the two scenarios are vastly different - when originating, we are much more strict than when receiving.  Also, the current date is a factor, because rules need to take effect on certain configurable dates, and expire on other dates.  We can't simply use DateTime.Now, because we might want to validate 'as of' a certain date.  We need to pass the date, and the current originate/receive scenario, as context to inform the validation.
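
A minimal sketch of what such a context might look like, written in Python for brevity rather than the C# of the actual parser (which isn't shown in this post). ValidationContext, Scenario, Rule and the sample rule are hypothetical names invented for illustration; the only point is that the 'as of' date and the originate/receive scenario are passed in, rather than read from the clock or from global state.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Scenario(Enum):
    ORIGINATE = "originate"
    RECEIVE = "receive"


@dataclass(frozen=True)
class ValidationContext:
    scenario: Scenario
    as_of: date  # validate "as of" this date rather than today's date


@dataclass(frozen=True)
class Rule:
    name: str
    effective: date            # first date on which the rule applies
    expires: Optional[date]    # last date on which it applies, or None
    originate_only: bool       # stricter rules may apply only when originating

    def applies(self, ctx: ValidationContext) -> bool:
        if self.originate_only and ctx.scenario is not Scenario.ORIGINATE:
            return False
        if ctx.as_of < self.effective:
            return False
        return self.expires is None or ctx.as_of <= self.expires


# Hypothetical rule: strict, and only in force from 1 June 2009 onward.
strict_rule = Rule("strict-check", date(2009, 6, 1), None, originate_only=True)
```

Because the date travels in the context, the same rule can be evaluated 'as of' any day without touching the system clock.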

When reviewing the approaches above against our goals, it's clear we need to make some tradeoffs.  For example, 'immediate local' validation (via property setters for example) ensures that input is validated, and helps keep the domain model clean at all times.  DDD purists might favor this approach.

But with immediate local validation, we don't have any control over when the validation runs, and if properties are used (instead of methods), then we can't pass context unless we set some reference to it ahead of time.  Immediate validation can also make it hard to collect validation errors and display them to the user; if an exception occurs, do we stop in our tracks and display the error, or keep going?  The calling code has to create and manage the exception collection.  Any and all code that sets properties on an entity needs to be prepared to handle validation exceptions and deal with them appropriately.
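
The burden this places on calling code can be sketched as follows. This is an illustration in Python, not code from this site; the User class, its two fields, and apply_form are hypothetical. The setters validate immediately, so the caller has to wrap every assignment in order to gather all the errors instead of stopping at the first.

```python
class User:
    """Entity with 'immediate local' validation in its property setters."""

    def __init__(self):
        self._name = ""
        self._email = ""

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("name is required")
        self._name = value

    @property
    def email(self):
        return self._email

    @email.setter
    def email(self, value):
        if "@" not in value:
            raise ValueError("email must contain '@'")
        self._email = value


def apply_form(user, form):
    """Caller-side bookkeeping: catch each failure so all errors can be shown."""
    errors = []
    for field, value in form.items():
        try:
            setattr(user, field, value)
        except ValueError as exc:
            errors.append(f"{field}: {exc}")
    return errors
```

Every piece of code that sets properties on the entity would need a loop like apply_form, which is the duplication the paragraph above warns about.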

Earlier I mentioned different levels or stages of validation.  Not all validation occurs within the domain entities.  For example, when one is creating a new user account, there may be an email and/or password confirmation field that is not reflected in the entity.  This type of thing needs to be validated independently of the normal User entity validation. 

And there may be larger-scale validation that occurs between or across entities, and not just within them.  In the ACH parser example above, the file has an internal hierarchy of records.  Each record in the file becomes an entity with its own internal validation, but there are control records in the file that contain counts and totals of the previous records.  In this particular case, since it's a single-parent hierarchy, it's easy enough to decide where this cross-entity validation should go; in DDD terms, the file might be designated an 'aggregate root' that is responsible for everything it contains.  But one can easily imagine validation that might need to occur across nominally unrelated entities, due to their participating in the same business process.
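
As a rough sketch of this kind of cross-entity check, assume a drastically simplified file model (real ACH records carry many more fields, and the names below are hypothetical). The file, acting as the aggregate root, compares its control record against the entries it contains.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntryRecord:
    amount_cents: int


@dataclass
class ControlRecord:
    entry_count: int
    total_cents: int


@dataclass
class AchFile:
    entries: List[EntryRecord]
    control: ControlRecord


def validate_control_totals(ach_file: AchFile) -> List[str]:
    """Cross-entity check owned by the aggregate root (the file)."""
    errors = []
    actual_count = len(ach_file.entries)
    actual_total = sum(e.amount_cents for e in ach_file.entries)
    if ach_file.control.entry_count != actual_count:
        errors.append(
            f"control count {ach_file.control.entry_count} != {actual_count} entries"
        )
    if ach_file.control.total_cents != actual_total:
        errors.append(
            f"control total {ach_file.control.total_cents} != {actual_total}"
        )
    return errors
```

Neither the entries nor the control record can perform this check alone, which is why it sits at the level of the file.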

So what's the best general approach?  In Scott Guthrie's NerdDinner tutorial, he uses what I would call 'deferred local' validation.  He extends his LINQ-to-SQL entity objects via partial classes, and adds support for deferred validation with a custom API.  This approach works well enough, but I feel that partial classes are too tight a coupling to the entity objects, and don't address cross-entity and extra-entity validation.

I've chosen to go with deferred remote validation.  Remote validation keeps me from getting too intimate with my Entity Framework entities, and deferred validation is simply more flexible.  I've decided to live with my domain model being suspect until I validate it.  In some cases it's actually desirable to allow your domain model to become invalid.  Going back (again) to the ACH example, as I mentioned, when we receive files we are pretty loose about validation rules.  We want to go ahead and receive the file but note the problems we found with it.  If we allow the model to exist in an invalid state, we can use it as input to a reporting mechanism, for example.

Anyway, I've defined two interfaces, ILocalValidator and IRemoteValidator, that contain two methods each:  IsValid() and Validate().  Each of these methods takes a context object (currently declared simply as 'object' though I may define this more tightly later).  IRemoteValidator also takes a reference to the object to be validated, whereas ILocalValidator does not; this reflects the semantic difference between them.  ILocalValidator also includes a method that allows setting a reference to the object to be validated.
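
The shape of the two interfaces can be approximated like this. It is a Python sketch, not the C# declarations, and it makes assumptions the post does not state: that Validate() returns a list of error messages, and that a local validator can be built by wrapping a remote one (the BoundValidator adapter is hypothetical).

```python
from abc import ABC, abstractmethod
from typing import Any, List


class RemoteValidator(ABC):
    """Validates some other object, which is passed in on every call."""

    @abstractmethod
    def validate(self, target: Any, context: Any = None) -> List[str]:
        """Return a list of error messages (empty when valid)."""

    def is_valid(self, target: Any, context: Any = None) -> bool:
        return not self.validate(target, context)


class LocalValidator(ABC):
    """Validates the object it has been attached to via set_target()."""

    @abstractmethod
    def set_target(self, target: Any) -> None: ...

    @abstractmethod
    def validate(self, context: Any = None) -> List[str]: ...

    def is_valid(self, context: Any = None) -> bool:
        return not self.validate(context)


class BoundValidator(LocalValidator):
    """Adapter: reuses a remote validator's rules as a local validator."""

    def __init__(self, remote: RemoteValidator):
        self._remote = remote
        self._target = None

    def set_target(self, target: Any) -> None:
        self._target = target

    def validate(self, context: Any = None) -> List[str]:
        return self._remote.validate(self._target, context)
```

The only difference between the two contracts is where the target comes from: an argument on each call, or a reference set ahead of time.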

Implementing these is a class named AbstractValidator<T> that contains a number of helpful methods for common validation tasks.  T is constrained to be an EntityObject, in my case.  To create a validator, all one needs to do is create a class that inherits from AbstractValidator<T>, supplying a specific type of entity object for T, and override one method to perform the validation.
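
A base class along these lines might look like the following Python sketch. The helper names (require, max_length), the overridable check() method, and the User fields are all hypothetical; the post does not list the real helpers or the name of the method to override.

```python
from typing import Any, Generic, List, TypeVar

T = TypeVar("T")


class AbstractValidator(Generic[T]):
    """Base class: subclasses override check() and call the helpers."""

    def __init__(self) -> None:
        self._errors: List[str] = []

    def validate(self, target: T, context: Any = None) -> List[str]:
        self._errors = []          # start fresh on every run
        self.check(target, context)
        return list(self._errors)

    def is_valid(self, target: T, context: Any = None) -> bool:
        return not self.validate(target, context)

    def check(self, target: T, context: Any) -> None:
        raise NotImplementedError

    # Helper methods for common kinds of validation.
    def require(self, value: Any, field: str) -> None:
        if value is None or value == "":
            self._errors.append(f"{field} is required")

    def max_length(self, value: str, field: str, limit: int) -> None:
        if value is not None and len(value) > limit:
            self._errors.append(f"{field} must be at most {limit} characters")


class User:
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email


class UserValidator(AbstractValidator[User]):
    def check(self, target: User, context: Any) -> None:
        self.require(target.name, "name")
        self.require(target.email, "email")
        self.max_length(target.name, "name", 50)
```

Because the helpers append to a list instead of throwing, one call to validate() reports every problem at once.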

For convenience, I have another class called EntityObjectExtensions that hooks the above into the entity objects via extension methods, allowing one to simply say, myEntity.Validate().  It uses a naming convention to look up the appropriate validator class and instantiate it.  Of course it would be easy enough to hook into the validator in other ways, for example by using an IoC container to instantiate the appropriate validator and install it in the entity during construction.
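
The convention-based lookup can be sketched as below. This is a Python approximation under stated assumptions: the convention is taken to be '<EntityType>Validator', and a plain dictionary stands in for whatever type lookup the C# extension method actually performs. The User and UserValidator classes are hypothetical.

```python
class User:
    def __init__(self, name):
        self.name = name


class UserValidator:
    def validate(self, target, context=None):
        return [] if target.name else ["name is required"]


# Registry keyed by class name, standing in for the real type lookup.
VALIDATORS = {cls.__name__: cls for cls in (UserValidator,)}


def validate(entity, context=None):
    """Find '<EntityType>Validator' by naming convention and run it."""
    validator_name = type(entity).__name__ + "Validator"
    validator_cls = VALIDATORS.get(validator_name)
    if validator_cls is None:
        raise LookupError(f"no validator named {validator_name}")
    return validator_cls().validate(entity, context)
```

The entity class itself knows nothing about its validator; the coupling lives entirely in the naming convention.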

Per DDD, I use the Repository pattern for persistence (I implemented this layer a while back to hide EF).  To ensure that only valid objects are persisted, I call validation from my repository classes before I persist them.
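
In outline, the repository acts as a gate. The sketch below is Python with an in-memory list standing in for the real data layer; UserRepository, ValidationError, and the injected validate callable are hypothetical names, and throwing on failure is an assumption about how invalid entities are rejected.

```python
class ValidationError(Exception):
    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = list(errors)


class UserRepository:
    """In-memory stand-in for a repository that hides the real data layer."""

    def __init__(self, validate):
        self._validate = validate   # callable: (entity, context) -> list of errors
        self._saved = []

    def save(self, user, context=None):
        errors = self._validate(user, context)
        if errors:
            raise ValidationError(errors)   # refuse to persist invalid entities
        self._saved.append(user)

    def count(self):
        return len(self._saved)
```

Callers that have already validated and displayed errors never hit the exception; it is a last line of defense for code paths that forgot to.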

This approach avoids touching my entity objects at all, allows me to choose when validation is performed, allows me to pass arbitrary context to my validation routines, and provides reasonable assurance that my domain entities are valid before they are saved to the database.  It also provides both local and remote validation using the same validation code.  Finally, the approach does not preclude cross-entity and extra-entity validation - in fact, in the case above where I have a 'repeat password' field, I validate that in my controller immediately before calling user.Validate().  Both checks are encapsulated in a method in my controller that I call from the Create() and Update() action methods.
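
The controller-side helper amounts to something like this Python sketch. The form field names and the validate_user callable are hypothetical; the real method lives in an ASP.NET MVC controller and calls user.Validate().

```python
def validate_registration(form, user, validate_user):
    """Collect extra-entity errors first, then the entity's own errors."""
    errors = []
    # 'confirm_password' exists only on the form, never on the User entity.
    if form.get("password") != form.get("confirm_password"):
        errors.append("Passwords do not match.")
    errors.extend(validate_user(user))
    return errors
```

Returning one combined list lets the view display the confirmation error alongside any entity errors in a single pass.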

A colleague of mine has proposed implementing a 'View Model' (a la Martin Fowler) pattern to further isolate and validate model data going to and coming from the UI.  I have not implemented that (yet), but don't see that it would be difficult or cause any problems.  In fact, if carefully written, the same validator might be reused to validate both the view model and the domain model.


Comments

6/5/2009 2:35:02 PM

msimpson

Here's an interesting project providing validation extensions for ASP.NET MVC: http://www.codeplex.com/xval.
