Jul 1 2010

An Improved Rx Approach For Limiting Asynchronous Calls

Category: .Net Framework | Tips | Tools
AlexRobson @ 00:44

Not a day after I posted Using Reactive Extensions To Throttle Asynchronous Tasks, Josh Bush was already (kindly) saying “I think your code may have a problem”. The issue with the first example is two-fold: first, it doesn’t really work as posted and second, if it did, it would behave in a less than ideal way. Basically, it calls wait after each item. Not exactly what I was going for.

It Would Be Cool If…

You have some known/unknown quantity of total work (Y), and you want to limit the number of worker threads in process at any given time (X). What my first try was actually doing was making X asynchronous calls and for every call past that (X+N) it was immediately calling WaitOne on the wait handle.

Josh To The Rescue

I fought with my example for an hour or so and realized that there wasn’t just some really simple thing for me to tack on to the existing code sample. This morning, I read Josh’s post about his approach to limiting the number of asynchronous calls using IEnumerable only. The key to his approach (and little did I know, to mine as well) was the really cool Aggregate extension method.

My Improved Solution

Action<IList<XElement>> saveAction = SaveChunk;
var loader = new PostReader(@"e:\stackoverflow\062010 so\posts.xml");
var batches = loader.BufferWithCount(5000);
var results = batches.Select(x => saveAction.BeginInvoke(x, null, null));
        
results
    .Aggregate(new HashSet<IAsyncResult>(), (set, result) =>
    {
        set.RemoveWhere(x => x.IsCompleted);
        set.Add(result);
        if(set.Count > 5)
        {
            var inProcess = set.Where(x => !x.IsCompleted).FirstOrDefault();
            if(inProcess != null)
            {
                inProcess.AsyncWaitHandle.WaitOne();
            }
        }
        return set;
    })
    .Subscribe(x => {});

As you can see, the main difference is that I’m using a HashSet to aggregate calls over the stream. Each time a new result arrives, I first clear out any calls that have completed, so that finished calls don’t make my code behave as if the number of calls in process has reached the limit. Then I add the most recent async handle to the set and, if the set is above the limit, I take an uncompleted call and wait on it.
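If you want to play with the same idea without Rx or a 2.8 GB file, here is a minimal sketch that uses plain LINQ Aggregate over lazily started tasks. It is not the loader code from this post: ThrottleSketch, Run and the Thread.Sleep stand-in for the save call are made up for illustration, and Task replaces IAsyncResult. It also waits as soon as the set reaches the limit, so no more than the limit is ever in flight.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs one task per work item, keeping at most `limit` unfinished tasks,
    // and returns the highest number of tasks observed running at once.
    public static int Run(IEnumerable<int> workItems, int limit)
    {
        int inFlight = 0, peak = 0;
        var gate = new object();

        // Select is lazy: a task is only started when Aggregate pulls the next item.
        var started = workItems.Select(item => Task.Run(() =>
        {
            lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
            Thread.Sleep(20); // hypothetical stand-in for the real save call
            lock (gate) { inFlight--; }
        }));

        var remaining = started.Aggregate(new HashSet<Task>(), (set, task) =>
        {
            set.RemoveWhere(t => t.IsCompleted); // forget finished work
            set.Add(task);
            if (set.Count >= limit)
            {
                // Block until at least one task finishes before pulling more work.
                Task.WaitAny(set.ToArray());
            }
            return set;
        });

        Task.WaitAll(remaining.ToArray()); // let the tail of the work finish
        return peak;
    }
}
```

Calling ThrottleSketch.Run(Enumerable.Range(0, 20), 3) should never report a peak above 3, because a new task is only started after the set has dropped below the limit.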

Well That Wasn’t So Bad, Was It?

By now you’ve certainly realized I’m not an Rx expert, I’m just sharing what I’m trying to learn as I learn it. Hopefully it’s more helpful than it is distracting.

Jun 15 2010

ETL To CouchDB With Symbiote, Relax and Reactive Extensions

Category: Open Source | .Net Framework | Symbiote
AlexRobson @ 11:47

I’ve been working on Relax a lot lately. I’ve recently added a Lucene.Net Symbiote project which Relax then uses to provide document indexing and LINQ queries for CouchDB (more about that in another post). A very important part of getting Relax to an RC is understanding how it all behaves under high load.

But what’s high load? We generally target SMBs or internal applications which aren’t going to see social networking kinds of stress. Still, I like knowing the upper bounds of what I’m working on.

I think if you’re familiar with this type of problem and you’re familiar with Rx, just looking at the code samples is probably all you need to appreciate what this is doing. This was a learning experience for me, and I was so happy with what a drastic improvement Reactive Extensions allowed me to easily introduce, I wanted to share it.

Finding A Good Source Of Data

I chose the Stack Overflow dataset (you need to scroll down to find the ClearBits link). Though I often disagree philosophically with what Jeff Atwood and Joel Spolsky say, Stack Overflow is a good thing and I really admire the SO team for sharing their data.

I’m only using the posts file atm, which is > 2 million records and the file size is roughly 2.8 GB. I think that’s plenty of data for what I need : )

The Best Way To Bulk Load In Relax / CouchDB

CouchDB provides a bulk document API which allows us to store multiple documents at once in order to save on the overhead involved in the persistence call. Relax makes extensive use of this API behind the scenes. In this case, we want to be able to batch several thousand documents together to persist at once to minimize the overhead cost.

The other thing to note is that CouchDB handles concurrent load exceptionally well (at least from my experience) and so I want the save commands firing off asynchronously as soon as the batch is ready.

Let The Fun Begin

The SO data is all in XML. Yay. This would allow me to use an XML reader to stream through the file and create documents. I’m doing this through an IObservable implementation. I use a base abstract class that provides me with my standard IObservable code. It’s nothing magic, but here’s the source for the sake of clarity:

public abstract class BaseObservable<TNotification> 
    : IObservable<TNotification>, IDisposable
{
    protected ConcurrentBag<IObserver<TNotification>> observers { get; set; }

    public virtual void Notify(TNotification notification)
    {
        observers.ForEach(x => x.OnNext(notification));
    }

    public virtual void SendCompletion()
    {
        observers.ForEach(x => x.OnCompleted());
    }

    public virtual IDisposable Subscribe(IObserver<TNotification> observer)
    {
        var disposable = this as IDisposable;
        observers.Add(observer);
        return disposable;
    }

    protected BaseObservable()
    {
        this.observers = new ConcurrentBag<IObserver<TNotification>>();
    }

    public void Dispose()
    {
        while (observers.Count > 0)
        {
            IObserver<TNotification> o;
            observers.TryTake(out o);
        }
    }
}

Now for the important part, the observable XmlReader:

public class PostReader
    : BaseObservable<XElement>
{
    protected string xmlExportPath { get; set; }

    public void Start()
    {
        using(var stream = new FileStream(
                    xmlExportPath, 
                    FileMode.Open, 
                    FileAccess.Read, 
                    FileShare.None, 
                    2048, 
                    true))
        {
            using(var reader = XmlReader.Create(stream))
            {
                reader.MoveToContent();

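                // XNode.ReadFrom leaves the reader on the node after the element,
                // so only call Read() when the current node was not consumed;
                // otherwise adjacent row elements would be skipped.
                while(!reader.EOF)
                {
                    if(reader.NodeType == XmlNodeType.Element && reader.Name == "row")
                    {
                        var element = (XElement) XNode.ReadFrom(reader);
                        Notify(element);
                    }
                    else
                    {
                        reader.Read();
                    }
                }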
            }
        }
        SendCompletion();
    }

    public PostReader(string xmlExportPath)
    {
        this.xmlExportPath = xmlExportPath;
    }
}

Basically, all I’m doing is reading in each row element (the row element represents a Post item), creating an XElement, notifying the observer(s), and sending the complete signal after the entire file has been read.
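For comparison, the same streaming read can be written pull-style as an iterator. This is only a sketch: RowStreamer and Rows are hypothetical names that are not part of the loader, and it reads from a TextReader so it can be tried against a small string instead of the full dump. It checks the current node before advancing, so rows that sit right next to each other are not skipped.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class RowStreamer
{
    // Lazily yields each <row> element without loading the whole document.
    public static IEnumerable<XElement> Rows(TextReader input)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
                {
                    // ReadFrom consumes the element and moves the reader past it.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Because it is an iterator, nothing is read until the caller starts enumerating, which is the pull counterpart to calling Start() on the observable.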

Why Bother With The Reactive Extensions?

Starting off, I didn’t know how many records I was looking at. I did know I didn’t want to deserialize everything into memory first and then save because that’s a waste of time, waste of RAM and wouldn’t be easy to parallelize. I also know from experience that my bottleneck is IO. Spinning up async tasks faster than the tasks can complete creates memory issues and, in this case, out of memory exceptions.

I’m not suggesting you can’t handle all this without Reactive Extensions. I am suggesting you won’t be able to do it as elegantly or as simply without them.

Enter Reactive Extensions

The Reactive Extensions (or Rx) is a library from Microsoft DevLabs and created by Erik Meijer and his team of ninja assassin developers. Rx and now RxJS are two projects you really ought to be learning about. And yes, that dizzy feeling you’ll get is normal; the human brain isn’t meant to take in so much distilled awesome.

Rx makes it easy to program against asynchronous event streams. Take a moment to think about that and let it sink in…

Making Friends With IObservable

Get comfortable with IObservable because it’s the core of Rx. I like to think of IObservable as a message pump. Erik Meijer likes to compare IObservable with IEnumerable: essentially he sums it up as IEnumerable is a pull mechanism and IObservable is a push mechanism. Rx helps bridge the gap between functionality and tooling for pull mechanisms and push mechanisms and in some cases allows us to interchange the two.
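To make the pull versus push distinction concrete, here is a small sketch that needs only the IObservable and IObserver interfaces from the System namespace, no Rx. PushSource, CollectingObserver and PullVersusPush are hypothetical names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

// A tiny observable that pushes a fixed set of values to each subscriber.
public sealed class PushSource : IObservable<int>
{
    private readonly int[] values;
    public PushSource(params int[] values) { this.values = values; }

    public IDisposable Subscribe(IObserver<int> observer)
    {
        // Push: the source decides when each value is delivered.
        foreach (var v in values) observer.OnNext(v);
        observer.OnCompleted();
        return new NoOp();
    }

    private sealed class NoOp : IDisposable { public void Dispose() { } }
}

// An observer that simply records what it is sent.
public sealed class CollectingObserver : IObserver<int>
{
    public readonly List<int> Received = new List<int>();
    public bool Completed { get; private set; }
    public void OnNext(int value) { Received.Add(value); }
    public void OnError(Exception error) { throw error; }
    public void OnCompleted() { Completed = true; }
}

public static class PullVersusPush
{
    public static List<int> Pull(IEnumerable<int> source)
    {
        var result = new List<int>();
        // Pull: the consumer asks for each value via the enumerator.
        foreach (var v in source) result.Add(v);
        return result;
    }

    public static List<int> Push(IObservable<int> source)
    {
        var observer = new CollectingObserver();
        using (source.Subscribe(observer)) { }
        return observer.Received;
    }
}
```

Both methods end up with the same values; the difference is which side drives the loop.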

Enough Talk

I’m going to show you the rest of the code and then break it down. Each section has a header so if you’re not interested in that portion of the source, just skip ahead.

class Program
{
    static void Main(string[] args)
    {
        Assimilate
            .Core()
            .Daemon(x => x
                .Arguments(args)
                .Name("SOBulkLoader")
                .DisplayName("Stack Overflow Bulk Loading Service")
                .Description("Does what it says"))
            .Relax(x => x.UseDefaults().TimeOut(1000000))
            .AddConsoleLogger<LoadingService>(x => x.Info().MessageLayout(m => m.Date().Message().Newline()))
            .RunDaemon();
    }
}

public class LoadingService
    : IDaemon
{
    protected IDocumentRepository repository { get; set; }
    protected XmlSerializer postSerializer { get; set; }

    public void Start()
    {
        "Loading service starting"
            .ToInfo<LoadingService>();

        Action<IList<XElement>> saveAction = SaveChunk;
        var loader = new PostReader(@"e:\stackoverflow\062010 so\posts.xml");
        var batches = loader.BufferWithCount(5000);
        var results = batches.Select(x => saveAction.BeginInvoke(x, null, null));
        
        results
            .BufferWithCount(5)
            .Subscribe(x => x.ForEach(y => y.AsyncWaitHandle.WaitOne()));

        loader.Start();
    }

    protected void SaveChunk(IList<XElement> x)
    {
        var list = x.Select(ProcessPost).ToList();
        repository.SaveAll(list);
        "Posts {0} to {1} chunked and saved"
            .ToInfo<LoadingService>(list.First().Id, list.Last().Id);
    }

    public Post ProcessPost(XElement element)
    {
        var content = element.ToString();
        return postSerializer.Deserialize(new StringReader(content)) as Post;
    }

    public void Stop()
    {
        "Loading service stopping"
            .ToInfo<LoadingService>();
    }

    public LoadingService(IDocumentRepository repository)
    {
        this.repository = repository;
        this.postSerializer = new XmlSerializer(typeof (Post));
    }
}

Program Main (A Shameless Symbiote Plug)

If you haven’t seen it before, this is a Symbiote Assimilation call: a centralized, fluent API for configuring multiple open source frameworks. In this code, I’m initializing Symbiote with a call to .Core() (that’s always required). Next I define the service I’m creating using Daemon. Then I’m using the default configuration for Relax and changing the timeout to 1k seconds (more than I need). I’m also adding a Log4Net console logger and telling it how I want the message laid out. Lastly, I’m starting the Daemon. The great thing is that Symbiote is registering everything (including configuration and all the different project dependencies) with StructureMap, which has a lot of good implications.

What’s an IDaemon?

The Daemon project takes TopShelf and makes it super easy to create Windows services. IDaemon requires Start and Stop methods and Symbiote handles the rest (like dependency injection, etc.). While TopShelf is really for Windows services, I use it for all my console applications because it adds some really nice things and it’s simple to use.

IDocumentRepository

This is the primary interface for storage and retrieval of documents in CouchDB. I’m taking a dependency on it which is supplied by Symbiote when it instantiates and runs the service.

Putting Rx To Work

The loader variable is an instance of the PostReader class and takes the path to the posts XML file. From that, we use the BufferWithCount extension method from Rx to produce a new IObservable<IList<XElement>>. At this point I hope you picked up on two things: 1) I haven’t called loader.Start() yet, so nothing is happening. 2) loader is an instance of IObservable<XElement> but calling BufferWithCount produces an IObservable<IList<XElement>>, meaning that it transforms messages from the origin into lists of messages of the requested size.

It’s about to get more awesomer. Now that I have an observable that will produce messages containing a list of 5k messages, I want a way to asynchronously queue transforming and persisting these. Calling select against the batches observable lets me kick off an asynchronous call to the SaveChunk method (via the delegate defined earlier). This produces a new IObservable<IAsyncResult> so now we have an observable list of asynchronous results. The usefulness may not seem readily apparent, but remember BufferWithCount? I can use that same call to batch IAsyncResults, then subscribe to each batch of five, get the wait handles and block until all calls in the batch have completed.

Once I have created the IObservables and set everything up, I then tell my loader to start. Everything up to that call is wiring up and defining how I want to handle the XElements as they’re produced. Since I don’t want any XElements getting lost, I don’t start the message pump until everything’s in place.
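If the batching step is hard to picture, here is a pull-based sketch of the same transformation over IEnumerable. It is not the Rx operator: BufferSketch and InBatchesOf are hypothetical names, and emitting a final partial batch at the end is an assumption about how you would want leftovers handled.

```csharp
using System.Collections.Generic;

public static class BufferSketch
{
    // Groups a sequence into lists of `size` items; the last list may be shorter.
    public static IEnumerable<IList<T>> InBatchesOf<T>(this IEnumerable<T> source, int size)
    {
        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size); // start a fresh list for the next batch
            }
        }
        if (batch.Count > 0)
        {
            yield return batch; // leftover items that did not fill a full batch
        }
    }
}
```

Seven items in batches of three come out as lists of three, three and one, which is the shape of message the save step receives, just at 5k per list.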

The Result

Running all this on my local development (moderate Core 2 Duo) laptop yields 100,000 inserts per minute for 15 minutes. Memory utilization is hardly noticeable; this process is actually CPU bound due to the transform from XML to class type being fairly expensive.

To put these metrics into some perspective: the highest tweet-per-minute average this month (88k) could be imported real-time into CouchDB via Relax without any special tuning or hardware.

Jun 10 2010

A Peek Inside My Brain

While my recent work on Symbiote and Relax probably appears to be all over the place, there is a unifying, underlying purpose behind all the work I’m doing. This post is about my short and long term goals. It’s about the technologies and architectures I believe are going to become important in the not-too-distant-yet-not-immediate future.

Who I’m Building For

I’m primarily building tools for our development team at work. We have seven developers (myself included), who work on multiple projects for internal and external customers. We target SMBs, usually via SaaS solutions.

I’m hopeful that more developers on similar teams with similar needs will find that the Symbiote libraries provide a simple, easy way to adopt some of the great open source frameworks available.

Tenets

If I had a technical manifesto, it would be ridiculously opinionated, long and, in printed form, might be used to even out furniture with wobbly legs. It would also talk a lot about the following tenets:

1. Open standards are the way forward. Proprietary is bad.
2. Distributed architectures will be the best way to take advantage of the new advances in hardware.
3. Open source alternatives to bloated, closed technologies will become vital to small/medium development shops.
4. “Teh Cloud” will become a great place for solutions which aren’t built from proprietary pieces.
5. APIs that are “discoverable”, provide extensibility via dependency injection, and are built around conventions but still allow configuration provide the most value. (shameless plug for Symbiote)

Technologies and Architectures I <3

Symbiote isn’t a complete list, but it’s a good start on the architectures and technologies I’m excited about. RabbitMQ, CouchDB and Lucene are three technologies I love. CQRS, messaging, and REST are just a few of the architectures that can produce powerful and agile solutions.

Technologies I Avoid

Knowing when a technology is bad for you is a skill that I’ve learned the hard way. It’s been painful. It’s been costly. My poor coworkers are probably developing mental defense mechanisms due to my tendency to wax bitter about certain technologies that are infamous for burning projects and small teams to the ground.

Bloated, proprietary, and closed systems are a bad fit for most projects I’ve ever worked on. If you’re going to use technologies built for Fortune 500 IT organizations, you need to BE a Fortune 500 IT organization. These kinds of technologies are career paths unto themselves. If you’re a smaller shop, you generally can’t dedicate entire people to these things alone.

What I’m Focused On Now

Our team is investing a good bit of time developing technologies around CouchDB. Why? It performs well. Schema-less storage is a huge time-savings for us because it doesn’t require a ton of up-front design and it ‘evolves’ gracefully with our domain model (at least so far). The team is going to need to address certain things that CouchDB doesn’t do out of the box. Those things are:

1. Handling relationships between document types
2. Open search capabilities. Writing views for everything a user may want to search on isn’t practical.
3. Reporting capabilities. Josh Bush and Jim Cowart are doing a brilliant job in this area.

That said, I’m trying to fast track a Relax RC that provides indexing and query services using Lucene. I have a proof of concept for those services and I’m also working on a LINQ provider.

Where To Get More Information

Following me on this blog or on Twitter is a good start. If you’re feeling adventurous and want to play with the code, check out http://github.com/arobson and see the Symbiote and Relax repositories. There is currently a wiki at http://sharplearningcurve.com/wiki and I’m also (slowly) working on a site dedicated to Relax documentation, features and updates.

Jan 22 2010

My Crash Course In High Performance NHibernate

It’s never good when your boss appears in your office unexpectedly to tell you that the deadline you thought was a few days out is actually tomorrow. It’s also not good when it happens right after your analyst informs you that the system you thought was producing valid output was actually built on an oversimplification that was only just discovered. It’s especially bad when the model you’re working against is supposed to be crawling a payroll system with insufficient metadata to support the business rules. This particular model is very complex. So complex that there are professionals who dedicate their entire career just to understanding this single facet of their industry.

Welcome to my hell, circa yesterday morning. The problem is that the process I wrote to handle all this in the first place was already written under a relatively aggressive deadline. This is my preface for telling you that I wrote a crappy console app to “get-er done!”. The issue is that the sheer volume of data, coupled with the awful schema we inherited and the complex business rules and model, made for a very slow load of the better part of the database into memory. Only then could my code handle all the calculations and the creation of new structures, which would then be saved back to a newer (still fairly complex) schema in the database. This wondrous and unnatural process took anywhere from 1.5 to 2 hours to complete. Still, as of last Friday, we thought we were in great shape…

The real issue with a long running process like this is that when a problem is identified, you have to identify the root cause, adapt the model/logic, test, then complete a full run. When there’s a 2 hour overhead in that process, it gets really, really painful. Now I wasn’t just on the hook for this one thing, so it’s not like I’d been able to give this my full attention. I ignorantly thought “this is good enough for now…”

I’m always saying what a good team we have here. Evan Hoff and Jim Cowart really helped me a lot. In one 18 hour day we managed to turn this slow crappy process into a fast crappy process (about 4 to 5 times faster). I also have to give credit to Oren Eini for making the wonderful NHibernate Profiler, a tool no dev wishing to remain sane should be without. Anyway, here’s what I learned:

The NHibernate.Linq Library Is Dangerous
You should only use it for fun time. The eager loading does not work correctly. In situations where you don’t care about lazy loading additional child collections, it’s worked just fine for me. I actually still use it for those cases because it’s type-safe and compile time checked for typos : )

You Can Die From Lazy Loading
Lazy loading ain’t free. It doesn’t seem like it would be a huge deal but when you have a model that’s > 2 levels deep with more than just one or two nodes off each aggregate root, lazy loading will kill you dead.

Use The Future Query API To Eager Load
This is awesome. Fortunately Evan had just read Oren’s latest blog entry on this. With some HQL experimentation we figured out how incredibly powerful this is. Sadly, HQL is just a flipping string so it’s easy to mess up. The NH error messages were good enough to point me in the right direction. Read Oren’s post here and the HQL chapter here.

Second Level Caching Is Not Your Friend For High Volume
This wasn’t what I expected but sure enough, turning off the second level cache made the writes back to the database go much, much faster. Calling flush on the session was taking seconds just for a few persists until we eliminated the second level cache.

Use The Reflection Optimizer For High Volume
There is some up-front penalty here but it did help performance. If you’re using Fluent NH like I am, it’s a simple .UseReflectionOptimizer() call during the fluent database configuration step.

You Need One Session Per Thread And Objects Cannot Be Shared Across Sessions
To get this monstrosity running faster we needed to make all this processing happen concurrently. Unfortunately, this process was very complex in how the new object model was created. Certain objects needed to be created and shared across models on an as needed basis. Before parallelizing it, I was able to store these shared objects in a hash and wrap access to them in a nice little function call that abstracted away the fact that I was creating them if they didn’t exist and retrieving them if they did.

This does not work when you’re spinning up threads with a session per thread (this is required) because as soon as you try to associate the shared instance across more than one session, NH breaks. Here’s how we got around this limitation:

Implement a double checked lock pattern so that you have a dictionary of locks per shared object id and a lock that protects access to that dictionary. When the consumer asks for a specific shared object by id, you check the database first. If the object wasn’t there, then you lock on the outer dictionary lock and then check to see if a lock exists for that shared object id. If it doesn’t, you create a lock and store it by the requested object id. After that, you lock on the lock object for that id and check the database again. If there is still no record, you create the object, save it and exit the lock. If it was in the database, you simply return it. Here’s some demo code to reinforce that messy explanation:

private object _dictionaryLock = new object();
private Dictionary<int, object> _sharedObjectLock = new Dictionary<int, object>();

public bool GetSharedInstanceFromDB(ISession session, int id, out SharedObject instance)
{
    instance = session.Linq<SharedObject>().FirstOrDefault(x => x.Id == id);
    return instance != null;
}

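public SharedObject GetSharedInstance(ISession session, int id)
{
    SharedObject instance = null;
    if(!GetSharedInstanceFromDB(session, id, out instance))
    {
        object idLock;
        lock(_dictionaryLock)
        {
            // Dictionary is not safe for concurrent reads and writes,
            // so fetch (or create) the per-id lock while holding the outer lock
            if(!_sharedObjectLock.TryGetValue(id, out idLock))
            {
                idLock = new object();
                _sharedObjectLock.Add(id, idLock);
            }
        }
        lock(idLock)
        {
            // second check: another thread may have created it while we waited
            if(!GetSharedInstanceFromDB(session, id, out instance))
            {
                // code to create instance
                session.Save(instance);
                session.Flush();
            }
        }
    }
    return instance;
}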

Far from simple, but for us, unfortunately, it was necessary. The nice thing about this is that it gives you a way to multi-thread session access and still share a common object between threads without causing session collisions.

DO NOT USE IDENTITY COLUMNS! AHHHHHHHHH
We used identity columns : \ I’ve pretty much always been against them because I don’t like the idea of my database telling me what the identifiers for my records are. I like to have control (does that make me crazy?). NH pros will tell you to use Hi-Lo or something like that which allows your clients to create unique, yet arbitrary ids for your tables. Why does it matter?

Well, unlike my now dead ORM, NHibernate does not attempt to write your FK values from parent objects in one go. Instead it will do a follow-up Update to all the child rows to provide the database-specified parent Id when you’re using identity columns. This can get very expensive and chatty, very, very quickly. On the other hand, if you’re specifying the id in your client code, it’s already available to the child FK rows. IGNORE THIS ADVICE AT YOUR OWN FLIPPING PERIL. Sadly, we can’t just change all the schema and models at the last minute, but it’s definitely something I will take with me moving forward.
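To show why a Hi-Lo style generator lets the client know ids before anything is inserted, here is a deliberately simplified sketch. It is not NHibernate’s generator and does not use its exact formula: HiLoGenerator, nextHi and maxLo are hypothetical names, and nextHi stands in for whatever atomically reserves the next block number in the database.

```csharp
using System;

public sealed class HiLoGenerator
{
    private readonly Func<long> nextHi; // hypothetical: reserves the next block number
    private readonly int maxLo;         // how many ids each reserved block provides
    private readonly object sync = new object();
    private long hi;
    private int lo;

    public HiLoGenerator(Func<long> nextHi, int maxLo)
    {
        this.nextHi = nextHi;
        this.maxLo = maxLo;
        this.lo = maxLo; // forces a block to be reserved on first use
    }

    public long NextId()
    {
        lock (sync)
        {
            if (lo >= maxLo)
            {
                hi = nextHi(); // one trip to the shared counter per block
                lo = 0;
            }
            return hi * maxLo + lo++; // ids within a block are handed out locally
        }
    }
}
```

With a block size of 3, the first five ids are 0 through 4 and the shared counter is only consulted twice, so parent ids are known up front and child rows can carry their FK values in the initial insert.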

 

And that’s all I have to say about that. Hope it’s helpful : )

Oct 22 2009

Simplify jqGrid JSON Generation

Category: .Net Framework | Web Development
AlexRobson @ 15:09

So I’m trying to learn this jqGrid thing and so far it seems pretty cool but one thing was really bothering me: every example I saw that was using JSON was using anonymous types to create the required JSON format. Yuck city. Yes. I’m insane and whiney but there’s method to my madness.

Here’s the anonymous type approach:

var jsonData = new
{
    total = totalPages,
    page = page,
    records = totalRecords,
    rows = (
        from record in records
        select new
        {
            i = record.Id,
            cell = new string[] {
                record.Name, 
                record.Date.ToShortDateString(), 
                record.Description.ToString() 
            }
        }).ToArray()
};

Imagine you have to change any of these properties or the “schema” of your JSON output. What if you had to do that application wide? No refactoring tool can save you. Not only that, but you have to memorize/lookup the schema every time you create it. Typos will ruin you but there’s no compile time checking. These are just a few of the reasons I really dislike that approach.

What if you could do this instead?

var gridData = new jqGridData<Task>(
    tasks,
    t => t.Id,
    t => new object[]
    {
        t.Name,
        t.Date,
        t.Description
    })
{
    page = 1,
    // round up so a partial last page is still counted
    total = (tasks.Count + pageSize - 1) / pageSize,
    records = tasks.Count
};

Now that’s more like it. Strongly typed, intellisense, compile time checked and refactor friendly. So here’s the code to make it happen (it’s very simple):

public class jqGridData<T>
	where T : class
{
	public int total { get; set; }
	public int page { get; set; }
	public int records { get; set; }
	public IList<jqGridRow<T>> rows { get; set; }

	public jqGridData(IList<T> list, Func<T, object> idMember, Func<T, object[]> columns)
	{
		rows = list.Select(i => new jqGridRow<T>(i, idMember, columns)).ToList();
	}
}
	
public class jqGridRow<T>
	where T : class
{
	public string id { get; set; }
	public string[] cell { get; set; }
	
	public jqGridRow(T rowInstance, Func<T, object> idMember, Func<T, object[]> columns)
	{
		id = idMember(rowInstance).ToString();
		cell = columns(rowInstance).Select(c => c.ToString()).ToArray();
	}
}

Right, not an earth-shattering discovery, certainly not going to change your life, but it does make using jqGrid with JSON simple.

Oct 22 2009

The Bourne Framework – A High Level Introduction

Category: .Net Framework | Open Source
AlexRobson @ 03:16

For a little over a month now, I’ve been contributing to an open source project started and architected by Evan Hoff. After the first week on the project, I started bugging him about when I could blog about it. The project is called the Bourne Framework, and it’s changing the way I write code*. The one thing I should make abundantly clear is that Bourne is new and subject to change. The good news is that it’s also very usable in its current state.

Technology Stack
A lot of the framework comes from Evan’s past experience with several open source projects. Most of them are fairly widely known:

  • NHibernate
  • Fluent NHibernate
  • NHibernate.Linq
  • StructureMap
  • MassTransit
  • TopShelf (part of MassTransit)
  • log4net

The framework uses these open source libraries in order to provide out-of-the-box infrastructure for the following types of .Net applications:

  • ASP.Net MVC
  • WCF
  • Windows Services

What It Does
It’s difficult to summarize without just throwing around buzz-words. I would say that it does an excellent job of tying together leading open source frameworks by providing an integrated infrastructure for configuration and application of these libraries. I feel that that’s particularly valuable to developers who want to use the best technologies available but don’t necessarily have the time and/or resources available to do deep dives on all of them. It doesn’t and can’t completely abstract everything away. In fact, anyone who has experience in these open source frameworks knows that you’ll still need to understand what they do and how to use them in general. The difference between using them on your own and using Bourne is that Bourne gives you a really nice structure and simplifies the configuration experience while reducing LoC, something that I’m a big fan of.

How Do You Learn It?
Bourne has a fairly respectable set of unit tests as well as some demo code included in the source. That’s a good place to start. I am going to start a blog series where I go through several different (very simple) types of applications which show off some of Bourne’s features. I also plan to make the source for all of these demos available on GitHub.

Where Can You Get It?
Bourne Framework is on GitHub. Evan’s url is git://github.com/therealhoff/BourneFramework.git and mine is at git://github.com/arobson/BourneFramework. Evan hasn’t had time to review all the code I’ve added to it, so if you want the purest Bourne, check out his repository first.

 

*Which, if you’ve seen my code, is a really good thing : )

Oct 16 2009

ASP.Net MVC 2 Preview 2 Installation Issue

Category: .Net Framework | Web Development
AlexRobson @ 09:28

Thanks to some help from Elijah Manor, I was ready to pull down the Preview 2 installer for ASP.Net MVC 2 and play around with the new features over the weekend. Long story short: there are some people born lucky, and I am not one of them. The installer kept tanking hard, telling me that I was missing important system updates and that the installer couldn’t continue without those. Say what?

After messing around for a bit, I remembered having other issues due to my (foolish?) decision to install the VS 2010 beta on the same machine as VS 2008 (instead of in a VM like Evan Hoff did). So I started looking through my installed items and saw MVC 1.1, which is the VS 2010 compatible version. After I uninstalled that, MVC 2 Preview 2 installed without a hitch.

The happy end to this story is that you can go back and re-install MVC 1.1 for VS 2010 once the Preview 2 installation has completed without any other issues. HTH.


Sep 14 2009

Installing WCF Activation on Windows 7 with VS 2010

Category: .Net FrameworkAlexRobson @ 17:34

I’ve had a problem on my most recent Windows 7 install with getting WCF activation to install. I kept getting errors telling me that an error occurred and some of the features were not installed, and then I was prompted to restart now or later… Well, great as that was, I was sort of hoping to find a solution…

Thanks to Stack Overflow and Jörg Battermann, I just found the following post here. The short of the post is that installing VS 2010 replaces a file (SMConfigInstaller) which is required to correctly install WCF activation. Uninstalling VS 2010 will not correct the issue. Instead, you need to copy the 3.0 version of the file over the 4.0 version. The only tricks are getting access to the original version of SMConfigInstaller and having permission to replace the file.

Regarding the first step, here’s a zip file containing the 64 bit versions of both files (assuming I don’t get told to stop sharing them; I figured since they’re freely available, what’s the harm?). The file you need to copy over what’s on your system is under the pre 4.0 folder and needs to be copied to c:\windows\microsoft.net\framework64\windows communication foundation\.

Before you can copy the 3.0 version of the file, you’ll have to give yourself permission to the file on your system. In my case, I had to take ownership of the file, then give myself permission (a process you will want to reverse once this whole thing is over and done).

Once you’ve copied the file, open Programs and Features, click Turn Windows Features on or off, expand Microsoft .Net Framework 3.5.1 and select both activations (well, you can go with one, but the process is tedious, so I recommend just installing both at once). Everything should install just fine.

Once you’re done, and before removing your access to the file and restoring the previous owner, copy the 4.0 version of that file back in place. Of course, if you’re not me, you probably installed your WCF activation before you installed VS 2010 : p


Mar 7 2009

Create Dictionaries From Two IEnumerables With Zip In .Net

I've been borrowing some functionality from our Haskell brethren (thanks to Jeff Cutsinger for enlightening me). There is a function in Haskell called zip; you can see the Haskell documentation here. It essentially walks two lists together and creates a list of Tuples. A Tuple is basically a generically typed, primitive data structure used in Haskell to group multiple values (for example, to return them from a function), and in the case of zip each Tuple is composed like a key-value pair.

I keep running into instances in .Net where I need to take two lists/collections/arrays of values and turn them into a dictionary. Now, I could write tedious code each and every time to do this, but I finally decided I should see what kind of quagmire I could get myself into trying to recreate Zip functionality. May I also plug TDD here? It made this task nearly trivial. Here was my first test:

[TestMethod]
public void TestZip()
{
    var names = new[]
                    {
                        "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
                    };

    var numbers = new int[]
                      {
                          1, 2, 3, 4, 5, 6, 7, 8, 9
                      };

    var lookup = Zip(names, numbers);
    int testValue = 1;
    Assert.AreEqual(lookup["one"], testValue++);
    Assert.AreEqual(lookup["two"], testValue++);
    Assert.AreEqual(lookup["three"], testValue++);
    Assert.AreEqual(lookup["four"], testValue++);
    Assert.AreEqual(lookup["five"], testValue++);
    Assert.AreEqual(lookup["six"], testValue++);
    Assert.AreEqual(lookup["seven"], testValue++);
    Assert.AreEqual(lookup["eight"], testValue++);
    Assert.AreEqual(lookup["nine"], testValue);
}


Simple, right? Sure, could've written a much more succinct lambda to assert, but hey, I'm trying to make the concept easier to grok : ) My first attempt at implementation looked like this:

public Dictionary<K, V> Zip<K, V>(IEnumerable<K> keys, IEnumerable<V> values)
{
    // Collects the keys seen so far; its count doubles as a position tracker
    var keyList = new List<K>();
    return keys.ToDictionary(
        key =>
        {
            keyList.Add(key);
            return key;
        },
        // The key was just added, so Count - 1 is the ordinal of the current key
        key => values.ElementAt(keyList.Count - 1));
}


Right. Probably not obvious. If you're thinking "uhhhhhhhhh", no worries, I can make this simple (I think). Everyone who <3's functional programming, lambdas, etc. will scold you for not minding your closures. In this case, trying to introduce any kind of counter index inside one of my lambdas would be very, very bad. But I needed a way to get the value from the value collection that sits at the same ordinal as the current key in the key collection. How? That's when I realized that, if I didn't mind wasting space on a List, I could simply add each key to a new list and use the list count (minus 1) as the ordinal of the key at any point during the iteration. This works because ToDictionary runs the key selector and then the element selector for each item in turn.

It wasn't long until I realized that Haskell lets us zip uneven collections (by limiting the size of the result to the smaller of the two collections) and I thought, "Gee, that's nicer than throwing exceptions at our users' heads", and since I'm borrowing Haskell concepts, why not just stay consistent? Fortunately, someone at Microsoft likes programmers, because the answer was simple. Let's look at my additional unit tests first though (since I did write them first); they will help us prove the code.
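If the list-count trick feels a bit too clever, here is another way to get at the same idea. This is only a minimal sketch, not the code from this post: it walks both sequences with their enumerators in lockstep and stops as soon as either one runs out. The names ZipSketch and ZipToDictionary are made up for illustration.

```csharp
using System.Collections.Generic;

public static class ZipSketch
{
    // Hypothetical helper (not the post's Zip): pairs keys and values by position.
    public static Dictionary<K, V> ZipToDictionary<K, V>(
        IEnumerable<K> keys, IEnumerable<V> values)
    {
        var result = new Dictionary<K, V>();
        using (var keyEnumerator = keys.GetEnumerator())
        using (var valueEnumerator = values.GetEnumerator())
        {
            // Advance both sequences together; stop when either is exhausted.
            while (keyEnumerator.MoveNext() && valueEnumerator.MoveNext())
            {
                // Like ToDictionary, Add throws if the same key shows up twice.
                result.Add(keyEnumerator.Current, valueEnumerator.Current);
            }
        }
        return result;
    }
}
```

Calling ZipSketch.ZipToDictionary(names, numbers) enumerates each sequence once, so it avoids repeated ElementAt lookups on sources that are not indexable.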

[TestMethod]
public void TestZipTooManyKeys()
{
    var names = new[]
                    {
                        "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
                    };

    var numbers = new int[]
                      {
                          1, 2, 3, 4, 5, 6, 7
                      };

    var lookup = Zip(names, numbers);
    int testValue = 1;
    Assert.AreEqual(lookup.Count, 7);
    Assert.AreEqual(lookup["one"], testValue++);
    Assert.AreEqual(lookup["two"], testValue++);
    Assert.AreEqual(lookup["three"], testValue++);
    Assert.AreEqual(lookup["four"], testValue++);
    Assert.AreEqual(lookup["five"], testValue++);
    Assert.AreEqual(lookup["six"], testValue++);
    Assert.AreEqual(lookup["seven"], testValue);
}

[TestMethod]
public void TestZipTooManyValues()
{
    var names = new[]
                    {
                        "one", "two", "three", "four", "five", "six", "seven"
                    };

    var numbers = new int[]
                      {
                          1, 2, 3, 4, 5, 6, 7, 8, 9
                      };

    var lookup = Zip(names, numbers);
    int testValue = 1;
    Assert.AreEqual(lookup.Count, 7);
    Assert.AreEqual(lookup["one"], testValue++);
    Assert.AreEqual(lookup["two"], testValue++);
    Assert.AreEqual(lookup["three"], testValue++);
    Assert.AreEqual(lookup["four"], testValue++);
    Assert.AreEqual(lookup["five"], testValue++);
    Assert.AreEqual(lookup["six"], testValue++);
    Assert.AreEqual(lookup["seven"], testValue);
}


Makes sense, right? I have one test where I declare more keys than I have values, and one where it's the other way around. I assert the count is limited to the shorter of the two (in this case 7) and proceed with the assertions. So how'd I do this? It was simpler than I thought it would be when I wrote the tests, and I think you'll see it immediately:

public static Dictionary<K, V> Zip<K, V>(this IEnumerable<K> keys, IEnumerable<V> values)
{
    var keyList = new List<K>();
    // Take caps the keys at the number of values (Count() enumerates values once up front)
    return keys.Take(values.Count()).ToDictionary(
        key =>
        {
            keyList.Add(key);
            return key;
        },
        // The key was just added, so Count - 1 is the ordinal of the current key
        key => values.ElementAt(keyList.Count - 1));
}


So for all my Microsoft whining, they obviously do lots of things right. The Take extension method returns at most the number of elements you ask for; if the source has fewer than that, you simply get all of them. This means I don't have to qualify or limit Take when keys doesn't have as many elements as the values collection. Just try to tell me that's not awesome.
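To see that behaviour in isolation, here is a tiny sketch. TakeSketch and CountTaken are made-up names for illustration; Take and Count are the standard LINQ extension methods.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TakeSketch
{
    // Returns how many items Take actually yields for a given request.
    public static int CountTaken(IEnumerable<string> source, int requested)
    {
        // Take never throws for asking too much; it just stops at the end of the source.
        return source.Take(requested).Count();
    }
}
```

With a three-item source, TakeSketch.CountTaken(source, 10) gives 3 and TakeSketch.CountTaken(source, 2) gives 2, which is exactly the "smaller of the two" rule the tests expect.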

You may be thinking, "But, Alex, you jerkface, you told us all about Tuples for nothing!?". Sorry, thought I could pull a fast one there for a minute. While I have also included Tuple support in Nvigorate, I don't think they should be used 'willy-nilly' in .Net. In this case, a dictionary feels like the right use since we're essentially dealing with a pair and what better than being able to quickly access the pair you want by one of the keys?
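For comparison, here is what the pair-first route could look like. This is a sketch under one loud assumption: that you are on a framework version that ships Enumerable.Zip with a result selector (.NET 4.0 and later). PairSketch and PairUp are made-up names, and this is not the Nvigorate code.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PairSketch
{
    // Hypothetical helper: builds pairs first, then turns them into a dictionary.
    public static Dictionary<K, V> PairUp<K, V>(IEnumerable<K> keys, IEnumerable<V> values)
    {
        return keys
            // Assumes the framework's Enumerable.Zip; it stops at the shorter sequence.
            .Zip(values, (key, value) => new KeyValuePair<K, V>(key, value))
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

The intermediate pairs are thrown away as soon as the dictionary exists, which is part of why going straight to a dictionary feels like the better fit here.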

Right, so turning this into an extension method is trivial. It's important too because by making it an extension method, we can now write:

var names = new[]
                {
                    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
                };

var numbers = new int[]
                  {
                      1, 2, 3, 4, 5, 6, 7, 8, 9
                  };

var lookup = names.Zip(numbers);


Awesome? C'mon, you know it is. This extension method is now becoming a part of the Nvigorate framework.


Aug 25 2008

Translate List<A> to List<B> With LINQ

Category: .Net Framework | TipsAlexRobson @ 17:23

Let me just skip to the fun bit that the title promised before going off on some rant-flavored-jelly-filled-technical-jargon-journey that you really didn't pay for. If you have a generic List of type A and you want a generic List of type B, AND you know A can be cast as B then you could do the following:

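List<B> TranslateList<A, B>(List<A> listofATypeThingies)
{
    // The result list has to be created before anything can be added to it
    List<B> listofBTypeThingies = new List<B>();
    
    foreach(A item in listofATypeThingies)
    {
        // Going through object lets the cast compile for unconstrained A and B;
        // it still throws an InvalidCastException if the item isn't really a B
        listofBTypeThingies.Add((B)(object) item);
    }
 
    return listofBTypeThingies;
}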

And then call TranslateList all over the place until the end of time. I prefer the following LINQified snippet:

List<B> listofBTypeThingies = new List<B>(listofATypeThingies.Select(item => (B) item));

Yes. The appropriate response in this case would be "Oh snap!"

*** EDIT ***

Fortunately for all of us, someone smarter than me is reading this blog so that they can correct me when I post something ignorant (like just now). There's already an extension method that addresses this particular use case and can be used like so:

List<B> listofBTypeThingies = new List<B>(listofATypeThingies.Cast<B>());

Thanks to Jeff Cutsinger for catching this one. (If you ever start posting to a blog, Jeff, I'll link it :)

******
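One thing worth knowing about Cast is what happens when an element can't be converted. Here's a small sketch contrasting it with its sibling OfType; both are standard LINQ methods, while CastSketch, CastAll and KeepMatching are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CastSketch
{
    // Cast<T> converts every element and throws InvalidCastException on a mismatch.
    public static List<string> CastAll(IEnumerable<object> items)
    {
        // Cast is lazy, so the exception (if any) surfaces when ToList enumerates.
        return items.Cast<string>().ToList();
    }

    // OfType<T> keeps only the elements that already are a T and skips the rest.
    public static List<string> KeepMatching(IEnumerable<object> items)
    {
        return items.OfType<string>().ToList();
    }
}
```

Given new object[] { "a", 1, "b" }, KeepMatching returns the two strings while CastAll throws, so Cast is the right choice only when you really do know every A is a B.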

If you want to know WHY you have to do this in the first place read on. Otherwise, ignorance really is bliss. I know, because I'm still pretty ignorant and consequently, moderately happy.

First you need to speak the language of computer science. No, I'm not going to regurgitate textbook definitions at you; if you want that, read a textbook. But in order to understand the why, you need to know the following definitions (someone correct me if I'm butchering this): covariance lets the implicit conversion from a derived type to a base type carry over to a larger construct (a Child1[] can be used as a MotherOfAllClasses[]), contravariance lets it carry over in the reverse direction (a method that accepts the base type can stand in where one accepting a derived type is expected), and invariance disallows the conversion either way.

.Net allows for variance when matching methods to delegates (covariant return types, contravariant parameter types) and for covariance in arrays of reference types. Generic type parameters, as of C# 3.0, are always invariant. At first I was irritated because I thought it was an unreasonable limitation, but then I started to play around with all this and now I understand why there's a genuine need for some type invariance in statically typed languages that allow side effects. Behold, I give you example code:
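Before the main event, a quick sketch of the delegate side of this, since it's the least obvious. The types and names here (Animal, Cat, AnimalFactory, CatHandler, DelegateVarianceSketch) are all made up for illustration; the method-to-delegate conversions themselves are plain C#.

```csharp
public class Animal { }
public class Cat : Animal { }

public delegate Animal AnimalFactory();
public delegate void CatHandler(Cat cat);

public static class DelegateVarianceSketch
{
    public static int Handled;

    public static Cat MakeCat() { return new Cat(); }

    public static void HandleAnimal(Animal animal) { Handled++; }

    public static Animal Run()
    {
        // Covariant return: a method returning Cat fits a delegate returning Animal.
        AnimalFactory factory = MakeCat;

        // Contravariant parameter: a method accepting Animal fits a delegate accepting Cat.
        CatHandler handler = HandleAnimal;

        handler(new Cat());
        return factory();
    }
}
```

Both conversions are safe because nothing can be written back through them: the caller of factory only ever reads an Animal, and handler only ever receives a Cat.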

public class MotherOfAllClasses
{
}

public class Child1 : MotherOfAllClasses
{    
}

public class Child2 : MotherOfAllClasses
{    
}

public class TypeShinanigans
{
    public void ThorDestroyerOfTypeSafety()
    {
        Child1 child1 = new Child1();
        Child2 child2 = new Child2();

        MotherOfAllClasses mom1 = new MotherOfAllClasses();
        MotherOfAllClasses mom_Child1 = new Child1();
        MotherOfAllClasses mom_Child2 = new Child2();

        //This line is legal because mom_Child1 actually
        //holds a reference to an instance of Child1
        Child1 workingCast1 = (Child1)mom_Child1;

        //Here is some wonderfully breaking code
        //which upon first glance may seem legal
        //and even more disturbingly will build
        //(each line throws an InvalidCastException at runtime)
        
        //Child1 breakingCast1 = (Child1) mom1;
        //Child2 breakingCast2 = (Child2) mom1;
        //Child1 breakingCast3 = (Child1) mom_Child2;

        //The third line here builds but throws an
        //ArrayTypeMismatchException at runtime because we're
        //trying to sneak an instance of the parent class
        //into what is really an array of Child1
        Child1[] chillenz1 = new Child1[5];
        MotherOfAllClasses[] mamaz = chillenz1;
        mamaz[0] = new MotherOfAllClasses();
    }
}

Hopefully that helps explain WHY variance is dangerous enough to begin with, and why it would be especially easy to break in conjunction with generics. If you'd like to discuss it more, I'll tell you what I can, but I'm at a slight disadvantage not having taken lambda calculus (no, I'm not making it up, it's a real class).
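To tie that back to the List translation this post started with, here is a minimal sketch of the safe way around invariance: copy into a new list instead of reinterpreting the old one. Parent, ChildA, ChildB, InvarianceSketch and Widen are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Parent { }
public class ChildA : Parent { }
public class ChildB : Parent { }

public static class InvarianceSketch
{
    public static List<Parent> Widen(List<ChildA> children)
    {
        // List<Parent> parents = children;   // does not compile: List<T> is invariant

        // Copying is safe: writes to the new list never reach 'children'.
        return children.Cast<Parent>().ToList();
    }
}
```

Adding a ChildB to the list returned by Widen is perfectly legal and leaves the original List<ChildA> untouched, which is precisely the guarantee the array example above fails to give.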
