Welcome to Atlanta .NET Regular Guys Sign in | Join | Help

Ok people - tell me what I'm doing wrong

I want to be a good boy, I really do!

Consider this scenario for me:  I've got a SQL query which returns 7000 rows.  I need to put a single value of this row into a combo box.  This value is also the primary identifier for one of my custom business objects.  That leaves me with the following three options:

  1. select the results out with a dataset and bind the combobox to the dataset
  2. select the results out with a datareader and iterate through the results, adding the values one at a time to the combobox
  3. select the results out with a datareader and for each value, create my custom business object and add it to a collection.  Then bind the combobox to the collection

I actually tried each of these.  I started out with solution #2 - but that took too long.  On my 3Ghz machine with 1Gb RAM, it took over a minute to iterate through all the results and add the values to the combobox.  Way too long.  So I moved on to solution #3 - much more elegant.  I create a collection of objects and all my databinding all over the form means I have virtually no code.  Sweet.  Except that took too long too.  Remember my post on delegates?  I implemented that technique and the process of creating all my CBOs took 40 seconds on a high powered HyperThreaded/Dual Core CPU.  So that moved me to solution #1 - which takes about 2 seconds (I also used the worker thread, so that might be helping).

Let's review:

  1. Use a simple dataset (and 2 threads).  This takes 2 seconds.  Of course, I have to create my CBO from the selected value later, but that's fast when I'm only doing it once.
  2. Use a data reader and a single thread.  This takes 75 seconds.
  3. Use a custom collection of CBOs and 2 threads.  This takes 40 seconds.

Clearly solution #1 wins.  Of course, it's not as elegant as #3, but it's so much faster that anyone who talks about architectual purity needs to get slapped.  Now my question is why?  What the F$#K is taking so long?

I have a theory - I'm using the Enterprise Library to handle my calls to the database.  Hydrating my CBO makes a single call to the database and gets about 6 values out as output parameters.  My theory is that b/c the EntLib handles the opening and closing of the database inside of the ExecuteReader function that I'm actually opening and closing the connection 7000 times!  Now that totally sucks and if I was interested in writing the ADO code I'd move the open and close outside of the loop.  But I'm not really interested in writing that code b/c even if I could speed up the connection to the database, I doubt I'd shave 37 seconds off of my time.

So my question to the general public is this: when is it right to use a dataset and when should I use a CBO?  I'm not doing a lot of complex work here - just filling out some screens with some properties of some objects.  Sounds like a job for a CBO collection, but I'm typically dealing with THOUSANDS of objects.  Datasets seem to win on raw speed.

Thoughts?  Comments?

-- Matt Ranlett

Published 19-01-2006 04:23 by Matt Ranlett
Filed Under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Jim Wooley said:

I just tried pushing 30000 records into a combo box using business objects populated by a data reader. The combobox was populated within 4 seconds and available on the screen. Thus there does sound like there is some additional issues with how the CBO is being populated in your code.

If you think about it, the data set is just an array of objects populated by a datareader under the covers. In addition, the dataset has additional functionality that is additional overhead a CBO wouldn't necessarily need, thus the CBO should be more performant.

Another extention to try as you are having fun with Background Workers: Open the form with the combobox set with the AutoCompleteMode set to SuggestAppend. Set the AutoCompleteSource to the lookup object once it comes back from the worker thread. That way, it should be populated before the user can get to it, but they don't have to wait for it to be populated in order to view the screen. Could be something to try.
January 19, 2006 2:41 PM
 

Keith Rome said:

1. Run SQL Profiler while performing each test. The trace might provide some valuable insight.

2. Why force yourself into using the EntLib db block? ADO.NET is insanely simply to work with directly.

3. Actually, of the three options you present, #1 is the most elegant and prevents a few problems that #2 can cause. #3 is just simply not a good idea.

Some things that come to mind from reading your post though:

> Make sure you are only selecting out the fields that you need... no "select *" crazyness... you probably only need the key field and description field.

> 7000 records is a hell of a lot of stuff to dump into a selection box... you might want to consider another UI approach there.
January 19, 2006 4:17 PM
 

Veruca Salt said:

I wouldn't touch a dataset with a 10 foot keyboard. EVER. Keith Rome are you freakin kidding me? #3 is simply not a good idea? Do you have any clue how to write maintainable code? DO YOU????

January 19, 2006 5:37 PM
 

Matt Ranlett said:

I should have specified that I'm only selecting a single field from the database. No "select * craziness". The key field IS the description field.

I agree that 7000 items in a single UI element sucks, but I'm not sure how else to present this to them. Basically, this app is a "warehouse configurator". There are 10 or so logical warehouses and a combined total of 7000 bins inside all of these warehouses. The requirements of the form are that the user can modify the settings of any bin in any warehouse. I'm open to UI suggestions!

And Keith, why do you think #3 is not a good idea? I don't believe datasets are bad things, but if I use option #3 - a custom collection of objects, all my databinding for my other form elements is done for me (practically). Why should I avoid that?
January 20, 2006 5:00 AM
 

Atlanta .NET Regular Guys said:

Last night seven of us met at the 5 Seasons to talk about everything geeky.  We talked about some...
January 20, 2006 7:36 AM
 

Atlanta .NET Regular Guys said:

In my last post about data access woes, I talked about how using the Enterprise Library to handle my...
January 20, 2006 11:45 AM
 

Paul Wilson said:

None of the 3 options you listed should perform badly, so there is either something wrong with your code or the Ent Lib code. I can't comment on which since I haven't seen your code and since I don't use the Ent Lib due to it being overly complex and nothing special. If you want raw speed then the pure datareader approach (#2) is going to win, but I would go with #1 or #3 since there shouldn't be enough of a difference to avoid the benefits of the other approaches. If you're one that likes CBOs, as it seems you are, then #3 sounds like the better of those 2 approaches for you personally.

However, that makes me wonder why you aren't using a good O/R Mapper to take care of this for you if you like CBOs. Not only would you not have to write and maintain a lot of code, but you'll also avoid the potential pitfall of things like hanging connections that can kill an app. Sure no one believes they will ever do that, and we all certainly know how to avoid it (at least I hope), but all to often things do get careless when you have to do it all yourself, which is why a good DAL or O/R Mapper can really protect you. I've seen this all too often, and even from very senior devs due to the need to cut-n-paste a lot of code (even in some "expert" books) -- so why write the code at all if something else will do it for you in a safe way every time?

One last comment, there are performance differences between CBOs and DataSets that aren't discussed much beyond the overly simple statement that DataSets have "more overhead". The simple story is that CBOs are typically faster for small recordsets, there isn't much difference for medium sized recordsets, and DataSets are actually faster for large recordsets! My experience is that 7000 records is in the medium range, and anytime you are doing a "large" amount of records you really ought to either be paging server-side anyhow, unless you're doing something like reports in which case CBOs don't really make sense.
January 20, 2006 1:20 PM
 

Keith Rome said:

Coding a custom business object and a custom collection merely for the purpose of loading a freeking combo box in the UI is not my idea of "maintainable".

A collection of Data Transfer Objects (Entities, Data Value Objects, whatever your preferred name for them is) is a good solution. But in that sense, isnt a Dataset really just a highly generic DTO?? Keep It Simple and don't overcomplicate the solution.

It has been my experience that when people talk about "custom business objects", they are referring to more than simple DTO's. These things typically include lots of bloat, BLL methods, even self-servicing lazy-loaded properties. Nothing kills performance like iterating a custom collection of thousands of these things while inspecting a lazy loaded property!

I gotta say, Paul W is right. a reasonable O/R mapper or DAL is what is called for here. The EntLib block (which is really a dumbed-down DAL) just isn't coming through. And given the incredibly simple requirement here (select one column from a table and bind to a combo box), CBO's / CBOC's is just a misallocation of effort.
January 21, 2006 8:23 AM

What do you think?

(required) 
(optional)
(required) 

About Matt Ranlett

One of the two original Atlanta .NET Regular Guys, Matt fills his free time by helping to run several Atlanta area user groups, the Atlanta Code Camps, and works as one of the two INETA co-Vice Presidents of Technology
SkinName:iroha_Blog2
Powered by Community Server, by Telligent Systems