SQL Server 2005 benchmarking

I've been curious for a while how Microsoft's latest database offering stacks up against the competition. So I did some searching today, and unfortunately couldn't find any good results. The problem is that most DB people don't understand how to do good benchmarking. Most people point to the Transaction Processing Council's database of benchmarks, but these are not valid benchmarks. Why? Because they don't standardize the hardware. Their goal is to show how fast you can run a database, without taking into account cost and personal bias. The results in their database are often from servers that cost millions of dollars. This would be fine if every test was from the same million dollar hardware, but it's not. So there are way too many competing factors.

Good benchmarks test competing systems on the same exact hardware, in the same exact lab environment, with the same exact tests for all systems. That way the only variable is the system you are testing. TCP's tests have a large array of variables, so there's no way to know why one system performs better than the other. So I'm going to have to keep my eyes peeled for a true benchmark. I did see one that compared SQL Server 2000 to Oracle, DB2, and MySQL, and it was a true benchmark (MSSQL paled in comparison). But I have yet to see one for MSSQL 2005.

Related Blog Entries

Comments
Actually, it doesn't necessarily make sense to use identical hardware to benchmark a database. For example it is quite possible that Oracle on Unix out-performs Oracle on Windows or vice-versa, so if a potential buyer is platform agnostic, they would want to evaluate performance on equivalent, but not identical hardware.

Just a thought.
# Posted By Gus | 5/5/06 1:28 PM
I disagree Gus. If you're benchmarking databases then you have to eliminate every other variable except the database. Otherwise it's not scientific, it becomes anecdotal.

If one machine has a gig of ram and the other has 128 megs. There are already too many variables. It's vital the hardware remain identical to the point of using the EXACT same machine.

I do see your point about comparing the operating systems and that is valid. But if different OSs are used, you can't come to the conclusion that one database is faster than the other. You have to come to the conclusion that one database+OS is faster than another database+OS.

What do you think?
# Posted By Steve Nelson | 5/5/06 4:51 PM
Hardware Benchmarking isn't necessarily the only factor we use at our company to determine which database platform to use.

Factors such as DBA ease of use for tasks such as Replication, backup and daily maintenance are also important. Clustering, source control, security are all factors that should weigh in also.

Hands down for the above databases you mentioned I would believe MySQL would be the most efficient for small or individual workgroups. We prefer SQL-Lite for small projects for its performance.
# Posted By David | 5/6/06 2:45 PM
I still differ with you Steve. There are already too many confounders to come up with a meaningful benchmark based on identical hardware.

Here is another simple real world example:
On a Win2003 Server with 4 megs of RAM Database A outperforms Database B.

Now bump the ram to 16 Megs and Database B outperforms Database A

Now bump the ram to 64 Megs and Database A is back on top!

Another example of confounders is what you are benchmarking. Using nothing but ANSI SQL to compare an Oracle DB to a MS SQL server is pointless. You need to run test that utilize the strengths and optimizations inherint in each DB... which by definition is going to be a confounder.

So to me, the only meaningful tests are to match strength against strength, whether hardware or DB specific features and then evaluate which is best suited for your specific requirements.

Steve.
# Posted By Gus | 5/6/06 6:22 PM
I think the problem with your analysis, Gus, is that you are talking about product evaluations, not benchmarks. A benchmark will tell you which database is the fastest, bar none. Once you've learned that you can then pick what hardware you think you'll need for your system.

Take a look at the sport of car racing. Everybody is divided up into classes, based on cost, size, engine, etc. The goal? To find out who is the best driver, not who can spend the most money. If the racers were allowed to pick their favorite track, spend as much money as they want, and potentially invent the fastest car in the world, we all know who'd win. Bill Gates. But is he really the best race car driver? No, he just has the most money.
# Posted By yacoubean | 5/8/06 7:46 PM
Yacoubean,

A benchmark will only tell you which database is the fastest on 1 particular system running 1 particular code base. Particularly in the case of databases, this is almost worthless information. As stated in my comment above, DB A may be faster than DB B on one hardware configuration, and DB B may be faster than DB A on another. So how will this tell you which DB is fastes? It won't.

As for the auto racing analogy... it is flat out wrong.

If it were true, the same driver would win all the time. In fact, some drivers perform better on certain tracks than others! And of course the cars are nowhere near identical. If they were, might as well randomly assign them to drivers before a race. Racing crews spend millions of dollars tuning the car to the specific track each time they race. They adjust suspension, balance, airflow and many other things... because the same car will perform differently on different tracks.

Just like the same database will perform differently on different hardware.

Gus
# Posted By Gus | 5/9/06 3:44 AM
Well Gus, we're just going to have to agree to disagree. Every professional benchmark I've ever seen, in both hardware and software, tries to standardize all the variables except for the thing being tested. The same thing happens in scientific testing. The only exception I've seen is the TCP lists I mentioned in my post. I think the fact that all the scientists and benchmark professionals do it the way I've stated says something, regardless of the silly analogies you and I come up with to prove our points. :)
# Posted By yacoubean | 5/9/06 7:41 AM
Just had to put my .02 in here.

Yacoubean, I agree with you that things need to be standardized for general benchmark numbers.

HOWEVER, I also agree with Gus on the point of utilizing native functions inherent to each database and coming to a more definitive answer of which database is actually faster. As long as the tweaks are documented, the benchmark figures are still accurate. Face it, benchmarking a package out of the box with the generic configurations isn't much of a benchmark.

Even with the exact same hardware, and oracle database is going to experience different IO than a SQL database and the tablespace/filegroup layouts will be different--just never going to be and apples/apples comparison.

Brett
# Posted By Brett | 6/23/06 7:58 AM
Brett,

I totally agree. I don't see anything wrong with tweaking the software to work the best it can, I'm just saying the hardware has to be static across the test. In fact, a lot of the good benchmarks I've read about bringing experts in for each system to make sure it's configured properly.
# Posted By yacoubean | 6/23/06 8:07 AM
Hi,
may you tell me the link that u mentioned in the text?
i want to know more about variouse databse benchmarking.
# Posted By mahdie | 9/1/06 10:45 PM
mahdie,

Here is a link, but again, this test is a few years old, with now outdated DB versions:
http://www.eweek.com/article2/0,3959,293,00.asp
# Posted By yacoubean | 9/2/06 7:57 AM
BlogCFC was created by Raymond Camden. This blog is running version 5.9. Contact Blog Owner