For unknown or inadequate reasons, I bought a copy of Office 2007 the other day and loaded it up for work. (As an aside, Amazon has Office 2007 Home & Student for $80 and Excel 2007 for $105 — and it says people are buying the individal program after comparing the two?). For the past few weeks I have been developing code that pulls financial data from Yahoo and populates a template sheet, among other things. Because of the way I add the data to the sheet (the fastest way excel supports, setting a range’s value equal to a VBA array), some summary functions at the top of the sheet have to account for the fact that the data could be of an arbitrary height.
My solution to this problem in the original version, written and debugged in Excel 2003, was to have those summary cells reference the range A23:A65536 and take a count, etc. to get the number or summary of those cells. This was fine. My code would take 30 seconds per run, most of which I thought was spent in VBA.
Enter Excel 2007. On a whim, I saved my spreadsheet in 2007 format and ran it. It was slow. I mean, mind bogglingly slow. I thought it was just the network taking a long time to download the data, however my coworker ran it a few times and it locked up his computer. Then it dawned on me: the formulas were rewritten to now reference A23:A1048576. Excel was touching 1 million rows over and over again every time the sheet calculated, which was often (he was using an RTD link, so it recalculated constantly). A simple solution: write a UDF that would count excatly the size of my data array, and use that instead.
This was the surprise: since I only had about 1,000 data points (but I didn’t want to hardcode to that number, and so I referenced all rows on the sheet), my running time, even using Excel 2003, went down from 30 seconds to about 6 seconds. The original slowness I had mostly chalked up to Excel handling data poorly; instead, it was purely a function of my forcing it to handle the data poorly, a problem only exposed by migrating to a higher-throughput device. The moral of the story is that if it doesn’t scale, it may be your fault, not theirs.

