Friday, May 27, 2016

Comparison and Review of .NET Fuzzy Matching NuGet Packages

I only need Jaro-Winkler, to get a similarity score between two strings.  I use it for name and address comparisons and do my own score aggregation and weighting.
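
For reference, here is a minimal sketch of what that looks like without any package: a textbook Jaro-Winkler (prefix scale 0.1, prefix capped at 4 characters) plus a weighted average of a name score and an address score.  The class name NameMatching, the method names, and the 0.6 / 0.4 weights are all made up for illustration; they are not taken from any of the packages reviewed here, and the weights in particular are placeholders rather than recommendations.

```csharp
using System;

public static class NameMatching
{
    // Jaro similarity in [0, 1]; 1.0 means identical strings.
    public static double Jaro(string s1, string s2)
    {
        if (s1.Length == 0 && s2.Length == 0) return 1.0;
        if (s1.Length == 0 || s2.Length == 0) return 0.0;

        // Characters count as matching only within this distance of each other.
        int window = Math.Max(0, Math.Max(s1.Length, s2.Length) / 2 - 1);
        var matched1 = new bool[s1.Length];
        var matched2 = new bool[s2.Length];
        int matches = 0;

        for (int i = 0; i < s1.Length; i++)
        {
            int lo = Math.Max(0, i - window);
            int hi = Math.Min(s2.Length - 1, i + window);
            for (int j = lo; j <= hi; j++)
            {
                if (matched2[j] || s1[i] != s2[j]) continue;
                matched1[i] = true;
                matched2[j] = true;
                matches++;
                break;
            }
        }
        if (matches == 0) return 0.0;

        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        for (int i = 0, k = 0; i < s1.Length; i++)
        {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1[i] != s2[k]) outOfOrder++;
            k++;
        }

        double m = matches;
        return (m / s1.Length + m / s2.Length + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    // Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
    public static double JaroWinkler(string s1, string s2)
    {
        double jaro = Jaro(s1, s2);
        int maxPrefix = Math.Min(4, Math.Min(s1.Length, s2.Length));
        int prefix = 0;
        while (prefix < maxPrefix && s1[prefix] == s2[prefix]) prefix++;
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    // Hypothetical aggregation: weighted average of two similarity scores.
    public static double WeightedScore(double nameSimilarity, double addressSimilarity,
                                       double nameWeight = 0.6, double addressWeight = 0.4)
    {
        return (nameSimilarity * nameWeight + addressSimilarity * addressWeight)
               / (nameWeight + addressWeight);
    }
}
```

With this sketch, NameMatching.JaroWinkler("MARTHA", "MARHTA") comes out at roughly 0.961, which matches the usual textbook value for that pair.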

I first tried FuzzyString.  Unfortunately, it has several issues that prevent it from working properly.  Beyond those issues, I found other inputs that sent its Jaro-Winkler implementation into an infinite loop.  It's funny that this package has a 5-star rating, because for my use case, which only needs Jaro-Winkler, it failed miserably.

Then I tried BlueSimilarity.  This package had problems too: it could not load BlueSimilarity.Interop.dll.  At this point I was tired of troubleshooting and just wanted a solution that worked.  Besides, the project site link on NuGet is broken.  Man.

Finally I tried SimMetrics-TextFunctions.  This worked really well!  I wrote a few small unit tests to verify that the bugs in FuzzyString are not present in this implementation.  Awesome!
EDIT: Wow.  I found out 7 months later that this package does indeed have a bug.  It is easy to work around, but I consider it a bug nonetheless.  When one of the two strings has a leading space, the comparison returns a similarity of zero.
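
One plausible workaround, assuming the leading whitespace carries no meaning in your data, is to trim both inputs before handing them to whatever similarity function you use.  The sketch below deliberately takes the similarity function as a delegate rather than calling the package directly, so it makes no claims about that package's API; SafeSimilarity and Compare are hypothetical names.

```csharp
using System;

public static class SafeSimilarity
{
    // Trims both inputs, then delegates to the supplied similarity function.
    // 'similarity' stands in for the real Jaro-Winkler call from your library.
    public static double Compare(string a, string b, Func<string, string, double> similarity)
    {
        string left = (a ?? string.Empty).Trim();
        string right = (b ?? string.Empty).Trim();
        return similarity(left, right);
    }
}
```

For example, SafeSimilarity.Compare(" Smith", "Smith", myJaroWinkler) passes "Smith" and "Smith" to myJaroWinkler, where myJaroWinkler is whichever function you already call.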

Monday, May 9, 2016

My Entity Framework Cheat Sheet

This is a list of the tools and steps I use to generate code and get work done quickly with Entity Framework.

Generate Classes for Entity Framework Code-First from Database

  1. Prerequisite: An existing database
  2. One-time setup: download and install Entity Framework 6 Tools for Visual Studio 2012 & 2013.
  3. Right-click a project.
  4. Add > New Item...
  5. Search for ADO and select ADO.NET Entity Data Model.
  6. Name it.
  7. Select Code First from database.
  8. Select or create a database connection.
  9. Select the tables you want to include.
  10. Click Finish.
  11. Now you have models and a DbContext to work with.
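
As a rough idea of what you end up with, the wizard output looks something like the sketch below, and you can query it straight away.  The names MyModel and Customer and the Name column are invented for illustration; the real names come from whatever you typed in step 6 and from your own tables.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;
using System.Linq;

// Hypothetical entity, standing in for a class generated from a table.
public partial class Customer
{
    public int Id { get; set; }

    [StringLength(100)]
    public string Name { get; set; }
}

// Hypothetical context; "name=MyModel" refers to a connection string
// entry with that name in App.config or Web.config.
public partial class MyModel : DbContext
{
    public MyModel() : base("name=MyModel")
    {
    }

    public virtual DbSet<Customer> Customers { get; set; }
}

public static class CustomerQueries
{
    // Reads all customer names, sorted alphabetically.
    public static List<string> GetCustomerNames()
    {
        using (var db = new MyModel())
        {
            return db.Customers
                     .OrderBy(c => c.Name)
                     .Select(c => c.Name)
                     .ToList();
        }
    }
}
```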

Generate SQL/Database from a DbContext

  1. View > Other Windows > Package Manager Console.
  2. Make sure the console's Default project is set to the correct project.
  3. If you are trying to recreate your database, make sure to drop the DB or change the connection string.  Also remove the Migrations folder.
  4. Make sure you have a DbContext file with:
    1. A valid database connection string.
    2. A class that subclasses DbContext.
    3. A DbSet property for each entity, e.g. public DbSet<Entity> Entities { get; set; }
  5. Run: Enable-Migrations
  6. Run: Add-Migration MigrationName
  7. Run: Update-Database -Script.  This generates the SQL without applying it; save it off if you want it.
  8. Run: Update-Database.  Now your database tables are created.
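
A minimal sketch of a DbContext file that satisfies step 4 might look like this.  BlogContext, Post, and the LocalDB connection string are placeholders I made up; substitute your own entity classes and a connection string that points at your server.

```csharp
using System.Data.Entity;

// Hypothetical entity; by convention EF treats a property named Id as the primary key.
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
}

// Hypothetical context covering the three requirements from step 4.
public class BlogContext : DbContext
{
    // Placeholder connection string; the DbContext constructor also accepts
    // "name=SomeName" to look the string up in App.config or Web.config.
    private const string ConnectionString =
        @"Server=(localdb)\MSSQLLocalDB;Database=BlogDb;Trusted_Connection=True;";

    public BlogContext() : base(ConnectionString)
    {
    }

    // One DbSet per entity that should become a table.
    public DbSet<Post> Posts { get; set; }
}
```

With a file like this in the selected project, the console commands in steps 5 through 8 should have everything they need to scaffold the migration and create the tables.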