<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: More Cache Craziness</title>
	<atom:link href="http://lbrandy.com/blog/2009/03/more-cache-craziness/feed/" rel="self" type="application/rss+xml" />
	<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/</link>
	<description>{ on programming and the internets, every monday }</description>
	<lastBuildDate>Tue, 09 Mar 2010 07:51:25 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Betty</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4841</link>
		<dc:creator>Betty</dc:creator>
		<pubDate>Fri, 10 Apr 2009 03:35:41 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4841</guid>
		<description>I recently came across your blog and have been reading along. I thought I would leave my first comment. I don&#039;t know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.

Betty

http://laptopprocessor.info</description>
		<content:encoded><![CDATA[<p>I recently came across your blog and have been reading along. I thought I would leave my first comment. I don&#8217;t know what to say except that I have enjoyed reading. Nice blog. I will keep visiting this blog very often.</p>
<p>Betty</p>
<p><a href="http://laptopprocessor.info" rel="nofollow">http://laptopprocessor.info</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: louis</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4413</link>
		<dc:creator>louis</dc:creator>
		<pubDate>Wed, 01 Apr 2009 17:19:00 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4413</guid>
		<description>justinhj,

gnuplot</description>
		<content:encoded><![CDATA[<p>justinhj,</p>
<p>gnuplot</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: justinhj</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4412</link>
		<dc:creator>justinhj</dc:creator>
		<pubDate>Wed, 01 Apr 2009 16:53:42 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4412</guid>
		<description>Nice article. What did you use to do the heat plot?</description>
		<content:encoded><![CDATA[<p>Nice article. What did you use to do the heat plot?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom d</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4405</link>
		<dc:creator>Tom d</dc:creator>
		<pubDate>Wed, 01 Apr 2009 10:52:41 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4405</guid>
		<description>Very interesting analysis, thanks for the graphs!

It is probably worth investigating optimized BLAS libraries for your matrix calculations. For example, the Intel MKL will automatically multithread the matrix operation when it is optimal to do so, and will use the SSE vector processing units. Generally you can expect a significant speedup (2-10x) over handmade code. You can distribute the MKL libs along with your SDK; the license encourages that (e.g. Matlab comes with the MKL as its default BLAS now). Also, using the MKL means you&#039;ll automatically take advantage of any new vector processing hardware (e.g. AVX) when it becomes available, so you don&#039;t have to keep revisiting your code.</description>
		<content:encoded><![CDATA[<p>Very interesting analysis, thanks for the graphs!</p>
<p>It is probably worth investigating optimized BLAS libraries for your matrix calculations. For example, the Intel MKL will automatically multithread the matrix operation when it is optimal to do so, and will use the SSE vector processing units. Generally you can expect a significant speedup (2-10x) over handmade code. You can distribute the MKL libs along with your SDK; the license encourages that (e.g. Matlab comes with the MKL as its default BLAS now). Also, using the MKL means you&#8217;ll automatically take advantage of any new vector processing hardware (e.g. AVX) when it becomes available, so you don&#8217;t have to keep revisiting your code.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kang Su Gatlin</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4396</link>
		<dc:creator>Kang Su Gatlin</dc:creator>
		<pubDate>Tue, 31 Mar 2009 23:15:08 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4396</guid>
		<description>I had done some research on these sorts of effects in the late 90s (peers with the FFTW work).  I love your ending line where you state: &quot;Its become obvious to me, however, that for any non-trivial problem, you positively need to rigoriously experiment.&quot;  That was a lesson I also learned.  And if you have low-associativity caches then the unpredictability becomes really pronounced.  I built a system that helped automate the process of building these sorts of algorithms, although it never saw wide use.

Anyways here&#039;s a link to the work, if you&#039;re interested:
http://www.cs.ucsd.edu/~kgatlin/papers/thesis.pdf

Thanks for the interesting read.</description>
		<content:encoded><![CDATA[<p>I had done some research on these sorts of effects in the late 90s (peers with the FFTW work).  I love your ending line where you state: &#8220;Its become obvious to me, however, that for any non-trivial problem, you positively need to rigoriously experiment.&#8221;  That was a lesson I also learned.  And if you have low-associativity caches then the unpredictability becomes really pronounced.  I built a system that helped automate the process of building these sorts of algorithms, although it never saw wide use.</p>
<p>Anyways here&#8217;s a link to the work, if you&#8217;re interested:<br />
<a href="http://www.cs.ucsd.edu/~kgatlin/papers/thesis.pdf" rel="nofollow">http://www.cs.ucsd.edu/~kgatlin/papers/thesis.pdf</a></p>
<p>Thanks for the interesting read.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4393</link>
		<dc:creator>Matt</dc:creator>
		<pubDate>Tue, 31 Mar 2009 21:48:48 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4393</guid>
		<description>@louis Ahh, that makes sense; that wasn&#039;t clear from the article. I do a lot of server-side web programming, so I guess my assumption was that it was an algorithm running on your machines. Thanks for the answer.</description>
		<content:encoded><![CDATA[<p>@louis Ahh, that makes sense; that wasn&#8217;t clear from the article. I do a lot of server-side web programming, so I guess my assumption was that it was an algorithm running on your machines. Thanks for the answer.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: louis</title>
		<link>http://lbrandy.com/blog/2009/03/more-cache-craziness/comment-page-1/#comment-4390</link>
		<dc:creator>louis</dc:creator>
		<pubDate>Tue, 31 Mar 2009 20:28:27 +0000</pubDate>
		<guid isPermaLink="false">http://lbrandy.com/blog/?p=603#comment-4390</guid>
		<description>Matt, 
We are writing an SDK we want to distribute. While we could try to make it automagic, that has a high cost and a low benefit. We don&#039;t lose much by staying far away from the maximal size, whereas tuning it automatically would require a great deal of engineering and a fair number of assumptions (like what else might be running on a particular machine).</description>
		<content:encoded><![CDATA[<p>Matt,<br />
We are writing an SDK we want to distribute. While we could try to make it automagic, that has a high cost and a low benefit. We don&#8217;t lose much by staying far away from the maximal size, whereas tuning it automatically would require a great deal of engineering and a fair number of assumptions (like what else might be running on a particular machine).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
