<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Hacking-Gurus &#187; Random Numbers</title>
	<atom:link href="http://www.hacking-gurus.net/tag/random-numbers/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.hacking-gurus.net</link>
	<description>Security Blog</description>
	<lastBuildDate>Thu, 19 Jan 2012 21:06:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Quick comparison of MyISAM, Infobright, and MonetDB (mysql)</title>
		<link>http://www.hacking-gurus.net/2009/09/30/quick-comparison-of-myisam-infobright-and-monetdb-mysql/</link>
		<comments>http://www.hacking-gurus.net/2009/09/30/quick-comparison-of-myisam-infobright-and-monetdb-mysql/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 04:39:44 +0000</pubDate>
		<dc:creator>r00t</dc:creator>
				<category><![CDATA[Database Security]]></category>
		<category><![CDATA[Tutorialz]]></category>
		<category><![CDATA[1 Million]]></category>
		<category><![CDATA[Amazon Server]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[Columns]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Deb]]></category>
		<category><![CDATA[Enough Memory]]></category>
		<category><![CDATA[Graph]]></category>
		<category><![CDATA[Loaded]]></category>
		<category><![CDATA[Monet]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[Mysqldump]]></category>
		<category><![CDATA[Open Source Community]]></category>
		<category><![CDATA[Oriented Database]]></category>
		<category><![CDATA[Random Numbers]]></category>
		<category><![CDATA[Sql Statements]]></category>
		<category><![CDATA[Sums]]></category>
		<category><![CDATA[Tuning Box]]></category>
		<category><![CDATA[Types Of Queries]]></category>

		<guid isPermaLink="false">http://www.hacking-gurus.net/?p=162</guid>
		<description><![CDATA[Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client&#8217;s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they&#8217;re ad-hoc (can access any [...]]]></description>
			<content:encoded><![CDATA[<p><span style="font-family: arial, sans-serif; line-height: normal; border-collapse: collapse;">Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one <a style="color: #2244bb;" href="http://www.mysqlperformanceblog.com/2009/09/28/how-number-of-columns-affects-performance/" target="_blank">Peter wrote about recently</a>). The client&#8217;s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they&#8217;re ad-hoc (can access any columns) and it seems tailor-made for a column-oriented database.</p>
<p><span id="more-162"></span></p>
<p>I decided it was time to actually give <a style="color: #2244bb;" href="http://www.infobright.org/" target="_blank">Infobright</a> a try. They have an open-source community edition, which is crippled but not enough to matter for this test. The &#8220;Knowledge Grid&#8221; architecture seems ideal for the types of queries the client runs. But hey, why not also try <a style="color: #2244bb;" href="http://monetdb.cwi.nl/" target="_blank">MonetDB</a>, another open-source column-oriented database I&#8217;ve been meaning to take a look at?</p>
<p>What follows is not a realistic or scientific benchmark; it&#8217;s just some quick and dirty tinkering. I threw up an Ubuntu 9.04 small server on Amazon (I used this version because there&#8217;s a .deb of MonetDB for it). I created a table with 200 integer columns and loaded it with random numbers between 0 and 10000. Initially I wanted to try 4 million rows, but MonetDB ran out of memory at that size. I didn&#8217;t do anything fancy with the Amazon server; for example, I didn&#8217;t fill up the /mnt disk to claim the bits. I used default tuning, out of the box, for all three databases.</p>
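<p>For anyone who wants to build a similar data set, here is a minimal Python sketch of the setup described above. It is not the script I used. The table name t and the cN column names follow the queries in this post, but numbering the columns c1 through c200, the comma delimiter, and the helper names are assumptions made for illustration.</p>

```python
import csv
import random

NUM_COLUMNS = 200   # from the post: 200 integer columns
MAX_VALUE = 10000   # from the post: random numbers between 0 and 10000


def create_table_sql(table="t", num_columns=NUM_COLUMNS):
    """Build a CREATE TABLE statement with columns c1..cN (the numbering is assumed)."""
    cols = ", ".join(f"c{i} INT" for i in range(1, num_columns + 1))
    return f"CREATE TABLE {table} ({cols})"


def write_rows(out, num_rows, seed=None):
    """Write num_rows lines of comma-separated random integers to a file-like object."""
    rng = random.Random(seed)
    writer = csv.writer(out, lineterminator="\n")
    for _ in range(num_rows):
        writer.writerow([rng.randint(0, MAX_VALUE) for _ in range(NUM_COLUMNS)])
```

<p>Calling write_rows on a file opened for writing, with num_rows set to 1000000, produces a file of the same shape as the one loaded in the tests below. It can then be fed to LOAD DATA INFILE or COPY INTO with the delimiter options set to match.</p>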
<p>The first thing I tried was loading the data with SQL statements. I wanted to see how fast MyISAM and MonetDB would interpret really large INSERT statements, the kind produced by mysqldump. But MonetDB choked and told me the number of columns mismatched. I found a reference to this problem on the mailing list and skipped that test. I used LOAD DATA INFILE instead (MonetDB&#8217;s version of that is COPY INTO). That is the only way to get data into Infobright anyway.</p>
<h3>The tests</h3>
<p>I loaded 1 million rows into the table. Here&#8217;s a graph of the times (smaller is better):</p>
<p><img title="Load Time" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/09/load_time.png" alt="Load Time" width="450" height="320" /></p>
<p>MyISAM took 88 seconds, MonetDB took 200, and Infobright took 486. Here&#8217;s the size of the resulting table on disk (smaller is better):</p>
<p><img title="Table Size in Bytes" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/09/table_size_bytes.png" alt="Table Size in Bytes" width="450" height="320" /></p>
<p>MyISAM is 787MB, MonetDB is 791MB, and Infobright is 317MB. Next I ran three queries:</p>
<ol>
<li><code>SELECT sum(c19), sum(c89), sum(c129) FROM t;</code></li>
<li><code>SELECT sum(c19), sum(c89), sum(c129) FROM t WHERE c11 &gt; 5;</code></li>
<li><code>SELECT sum(c19), sum(c89), sum(c129) FROM t WHERE c11 &lt; 5;</code></li>
</ol>
<p>Graphs of query time for all three databases are really not very helpful, because MyISAM is so much slower that you can&#8217;t see the bars for the others. So I&#8217;ll give the numbers and then omit MyISAM from the graphs. Here are the numbers for everything I measured:</p>
<p></span></p>
<table border="0">
<thead>
<tr>
<td></td>
<th>myisam</th>
<th>monetdb</th>
<th>infobright</th>
</tr>
</thead>
<tbody>
<tr>
<th>size (bytes)</th>
<td>826000000</td>
<td>829946723</td>
<td>332497242</td>
</tr>
<tr>
<th>load time (seconds)</th>
<td>88</td>
<td>200</td>
<td>486</td>
</tr>
<tr>
<th>query1 time (seconds)</th>
<td>3.4</td>
<td>0.012</td>
<td>0.0007</td>
</tr>
<tr>
<th>query2 time (seconds)</th>
<td>3.4</td>
<td>0.15</td>
<td>1.2</td>
</tr>
<tr>
<th>query3 time (seconds)</th>
<td>2.5</td>
<td>0.076</td>
<td>0.15</td>
</tr>
</tbody>
</table>
<p><span style="font-family: arial, sans-serif; line-height: normal; border-collapse: collapse;">And here is a graph of Infobright duking it out with MonetDB on the three queries I tested (shorter bar is better):</p>
<p><img title="MonetDB vs Infobright Query Time" src="http://www.mysqlperformanceblog.com/wp-content/uploads/2009/09/monetdb_infobright_query_time1.png" alt="MonetDB vs Infobright Query Time" width="492" height="320" /></p>
<p>I ran each query a few times, discarded the first run, and averaged the next three together.</p>
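<p>The measurement procedure is simple enough to sketch in a few lines. This is a Python illustration, not the Perl + Time::HiRes script I actually used for Infobright. The function name and the run_query callable are hypothetical; run_query stands in for whatever executes one query against the database under test.</p>

```python
import time


def average_warm_time(run_query, warmup_runs=1, timed_runs=3, clock=time.perf_counter):
    """Run run_query repeatedly, discard the warm-up runs, and average the rest."""
    durations = []
    for _ in range(warmup_runs + timed_runs):
        start = clock()
        run_query()
        durations.append(clock() - start)
    timed = durations[warmup_runs:]   # drop the first (cold) run
    return sum(timed) / len(timed)
```

<p>Discarding the first run keeps one-time costs, such as reading the table from disk into cache, out of the average.</p>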
<h3>Notes on Infobright</h3>
<p>A few miscellaneous notes: don&#8217;t forget that Infobright is <em>not</em> just a storage engine plugged into MySQL. It&#8217;s a complete server with a different optimizer, etc. This point was hammered home during the LOAD DATA INFILE, when I looked to see what was taking so long (I was tempted to use oprofile and see if there are sleep() statements). What did I see in &#8216;top&#8217; but a program called bhloader. This bhloader program was the only thing doing anything; mysqld wasn&#8217;t doing a thing. LOAD DATA INFILE in Infobright isn&#8217;t what it seems to be. Otherwise, Infobright behaved about as I expected it to; it seemed pretty normal to a MySQL guy.</p>
<h3>Notes on MonetDB</h3>
<p>MonetDB was a bit different. I had to be a bit resourceful to get everything going. The documentation was for an old version, and was pretty sparse. I had to go to the mailing lists to find the correct COPY syntax; it wasn&#8217;t the one listed in the online manual. And there were funny things like a &#8220;merovingian&#8221; process (think &#8220;angel&#8221;) that had to be started before the server would start, and I had to destroy the demo database and recreate it before I could start it as shown in the tutorials.</p>
<p>MonetDB has some unexpected properties; it is not a regular RDBMS. Still, I&#8217;m quite impressed by it in some ways. For example, it seems quite nicely put together, and it&#8217;s not at all hard to learn.</p>
<p>It doesn&#8217;t really &#8220;speak SQL&#8221; &#8212; it speaks relational algebra, and the SQL is just a front-end to it. You can talk XQuery to it, too. I&#8217;m not sure if you can talk dirty to it, but you can sure talk nerdy to it: you can, should you choose to, give it instructions in MonetDB Assembly Language (MAL), the underlying language. An abstracted front-end is a great idea; MySQL abstracts the storage backend, but why not do both? Last I checked, Drizzle is going this direction, hurrah!</p>
<p>EXPLAIN is enlightening and frightening! You get to see the intermediate code from the compiler. <a style="color: #2244bb;" href="http://monetdb.cwi.nl/projects/monetdb/SQL/Documentation/EXPLAIN-Statement.html" target="_blank">The goggles, they do nothing!</a></p>
<p>From what I was able to learn about MonetDB in an hour, I believe it uses memory-mapped files to hold the data in-memory. If this is true, it explains why I couldn&#8217;t load 4 million rows into it (this was a 32-bit Amazon machine).</p>
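<p>A rough back-of-the-envelope check supports this. The sketch below scales the on-disk size measured for 1 million rows up to 4 million rows and compares it with the 4 GiB that a 32-bit pointer can address. It assumes the size grows linearly with the row count and that every column would have to be mapped at once, neither of which I verified.</p>

```python
ROWS_LOADED = 1_000_000
MONETDB_BYTES = 829_946_723       # on-disk size from the table above
TARGET_ROWS = 4_000_000
ADDRESS_SPACE_32BIT = 2 ** 32     # 4 GiB, before the kernel reserves its share

# Assumption: size scales linearly with the number of rows.
bytes_per_row = MONETDB_BYTES / ROWS_LOADED
projected_bytes = bytes_per_row * TARGET_ROWS

print(f"bytes per row: {bytes_per_row:.0f}")
print(f"projected size at 4M rows: {projected_bytes / 2**30:.2f} GiB")
print(f"share of 32-bit address space: {projected_bytes / ADDRESS_SPACE_32BIT:.0%}")
```

<p>The projection comes out at a little over 3 GiB. A 32-bit Linux process typically gets about 3 GiB of user address space, and that also has to hold the server&#8217;s code, heap, and stacks, so 4 million rows would not fit under these assumptions.</p>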
<p>The SQL implementation is impressive. It&#8217;s a really solid subset of SQL:2003, much more than I expected. It even has CTEs, although not recursive ones. (No, there is no REPLACE, and there is no INSERT/ON DUPLICATE KEY UPDATE.) I didn&#8217;t try the XQuery interface.</p>
<p>Although I didn&#8217;t try them out, there appear to be some pretty useful instrumentation interfaces for profiling, debugging and the like. The query timer is in milliseconds (why doesn&#8217;t mysql show query times in microseconds? I had to resort to Perl + Time::HiRes for timing the Infobright queries).</p>
<p>I think it can be quite useful. However, I&#8217;m not quite sure it&#8217;s useful for &#8220;general-purpose&#8221; database use &#8212; there are a number of limitations (concurrency, for one) and it looks like it&#8217;s still fairly experimental.</p>
<hr style="height: 1px; margin: 0px;" />
</span></p>
<p>via <a href="http://www.mysqlperformanceblog.com/2009/09/29/quick-comparison-of-myisam-infobright-and-monetdb/">Quick comparison of MyISAM, Infobright, and MonetDB</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hacking-gurus.net/2009/09/30/quick-comparison-of-myisam-infobright-and-monetdb-mysql/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

