<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Peter Williams &#187; Software Development</title>
	<atom:link href="http://barelyenough.org/blog/category/software-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://barelyenough.org</link>
	<description>… and there is much to be learned</description>
	<lastBuildDate>Thu, 09 Feb 2012 21:58:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>My &#8220;cloud&#8221; tool chain</title>
		<link>http://barelyenough.org/blog/2012/01/my-cloud-tool-chain/</link>
		<comments>http://barelyenough.org/blog/2012/01/my-cloud-tool-chain/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 15:45:56 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=623</guid>
		<description><![CDATA[Recently Mike Amundsen posted a list of the tools he uses for developing cloud applications. He also asked for others to provide their lists. So here goes: Emacs The one true development environment that all others aspire to be like when they grow up. Mike has been using Cloud9 and really seems to like it. [...]]]></description>
			<content:encoded><![CDATA[<p>Recently Mike Amundsen posted a <a href="http://www.amundsen.com/blog/archives/1116">list of the tools</a> he uses for developing cloud applications. He also asked for others to provide their lists. So here goes:</p>
<section>
<h3><a href="http://www.gnu.org/software/emacs/">Emacs</a></h3>
<p>The one true development environment that all others aspire to be like when they grow up.</p>
<p>Mike has been using <a href="http://c9.io/">Cloud9</a> and really seems to like it. I have had my eye on the browser-based dev environments for a while. Maybe it is time to give them a go.</p>
</section>
<section>
<h3><a href="http://rubyonrails.org">Ruby on Rails</a></h3>
<p>RoR is still the best way to build web applications ever created.</p>
<p>(I know node.js is getting a lot of buzz these days, but it optimizes for the wrong thing. It optimizes IO performance, but what is really important is developer productivity. Except for a few very niche situations, runtime performance is way less important than getting stuff done.)</p>
</section>
<section>
<h3><a href="http://couchdb.apache.org/">CouchDB</a></h3>
<p>For small to medium size data sets CouchDB is the best option for cloud based data storage. It is easy to use, there are good libraries for it, and you can set it up to be very reliable even when running on unreliable virtual instances.</p>
<p>We have been using <a href="http://cloudant.com">Cloudant</a> for a while now. It has been reliable, but it is a bit slow. However, that problem is easily overcome with the violent application of caching. You communicate with CouchDB over HTTP so it is very easy to set up caching.</p>
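<p>One form of HTTP caching that fits here is revalidation with conditional GETs. The following is a minimal Ruby sketch, not the setup described above. It assumes the server returns an <code>ETag</code> header and honors <code>If-None-Match</code>, which CouchDB does for documents. The <code>EtagCache</code> class, the fetcher interface and the constant name are made up for illustration.</p>

```ruby
require 'net/http'
require 'uri'

# Hypothetical in-memory cache that revalidates entries with conditional GETs.
class EtagCache
  Entry = Struct.new(:etag, :body)

  # fetcher is any callable taking (url, etag_or_nil) and returning
  # [status_code, etag, body]. Injecting it keeps the cache testable offline.
  def initialize(fetcher)
    @fetcher = fetcher
    @entries = {}
  end

  def get(url)
    cached = @entries[url]
    status, etag, body = @fetcher.call(url, cached && cached.etag)
    # 304 Not Modified: the stored body is still current, so reuse it.
    return cached.body if status == 304 && cached

    @entries[url] = Entry.new(etag, body) if status == 200 && etag
    body
  end
end

# A fetcher built on Net::HTTP from the standard library.
NET_HTTP_FETCHER = lambda do |url, etag|
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  request['If-None-Match'] = etag if etag
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  [response.code.to_i, response['ETag'], response.body]
end
```

<p>A cache would be created with <code>EtagCache.new(NET_HTTP_FETCHER)</code> and queried with <code>get(url)</code>. Revalidation still costs a round trip; skipping the round trip entirely would need a freshness policy such as a time-to-live, which is a separate decision.</p>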
</section>
<section>
<h3><a href="http://git-scm.com/">Git</a></h3>
<p>We are using Git without the <a href="http://github.com">hub</a>. It is a <em>great</em> tool even without all the fancy GUI collaboration support. If you are using any SCM tool that is not a DVCS you are missing out. Go switch to Git (or <a href="http://mercurial.selenic.com/">Mercurial</a>) right now.</p>
</section>
<section>
<h3><a href="http://www.opscode.com/chef/">Chef</a> (Solo)</h3>
<p>Working in the cloud really drives home the impermanence of all things. Machines disappear or lock up, APIs become unresponsive, etc. Being able to build replacement instances automatically, and quickly, is vital.</p>
</section>
<section>
<h3><a href="http://fog.io/1.1.2/index.html">Fog</a></h3>
<p>Fog is an adapter, in ruby, for the various cloud providers which gives them a similar interface. This is invaluable if you plan on working with more than one cloud provider.</p>
</section>
<section>
<h3><a href="http://aws.amazon.com/ec2/">Amazon EC2</a> and <a href="http://www.rackspace.com/cloud/cloud_hosting_products/servers/">Rackspace Servers</a></h3>
<p>The product we are developing is an open platform-as-a-service so we use the IaaS offerings to provide the compute power we need.</p>
</section>
<section>
<h3>Conclusion</h3>
<p>My tool chain for developing in the cloud is pretty similar to the one I used for developing apps for dedicated hardware. The biggest change is definitely the change to a document store. Traditional relational databases just don&#8217;t work in the dodgy environment of the cloud yet. I expect that will change over time, but for now the ease of replication with document stores makes them compelling for clusters of unreliable virtual machines.</p>
</section>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2012/01/my-cloud-tool-chain/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Developing software as if quality matters</title>
		<link>http://barelyenough.org/blog/2011/11/as-if-quality-matters/</link>
		<comments>http://barelyenough.org/blog/2011/11/as-if-quality-matters/#comments</comments>
		<pubDate>Thu, 17 Nov 2011 12:00:21 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=609</guid>
		<description><![CDATA[When we started developing CloudSwing we decided to develop CloudSwing as if quality actually matters. By quality I don&#8217;t just mean that the code functions as designed. High quality products also meet the needs of the business and customers. In the past there has been a sense of disconnection between the business, QA and development [...]]]></description>
			<content:encoded><![CDATA[<p>When we started developing <a href='http://cloudswing.openlogic.com/'>CloudSwing</a> we decided to develop CloudSwing as if quality actually matters. By quality I don&#8217;t just mean that the code functions as designed. High quality products also meet the needs of the business and customers.</p>
<p>In the past there has been a sense of disconnection between the business, QA and development parts of our team. (An all too common situation in my experience.) If quality actually matters there must be good connections between all the stakeholders and quality must be considered at every step of the process. We don&#8217;t have a product manager, or a large QA team<sup id='as-if-quality-mattersfnref:2'><a href='#as-if-quality-mattersfn:2' rel='footnote'>2</a></sup>. That means that if quality is going to be considered at every step it is going to be because the developers are doing it; there is no one else at many of the steps.</p>
<p>Our solution is to acknowledge developer responsibility for all aspects of feature development. That unification reduces the chances of things being overlooked. It does change the workload of developers, though. Now developers must:</p>
<ul>
<li>elicit requirements from all stakeholders</li>
<li>write <em>testable</em> acceptance criteria</li>
<li>develop automated acceptance tests</li>
<li>implement the feature</li>
<li>verify all acceptance tests (including pre-existing ones) pass</li>
<li>verify that the feature, as implemented, matches the desire of the champion</li>
<li>deploy changes to production</li>
</ul>
<p>This has lightened the load on QA to the point that our one QA resource can keep up with four developers. QA still has a lot of responsibility:</p>
<ul>
<li>verify the acceptance criteria make sense</li>
<li>verify the acceptance criteria are testable</li>
<li>verify the acceptance criteria cover all important variations of the feature</li>
<li>verify the acceptance tests cover all important variations</li>
<li>refine acceptance tests</li>
<li>verify all tests pass in a production like environment</li>
<li>perform visual inspections in various browsers</li>
<li>accept features into the production branch</li>
</ul>
<figure>
<figcaption>
<h4>New feature development process<sup id='as-if-quality-mattersfnref:1'><a href='#as-if-quality-mattersfn:1' rel='footnote'>1</a></sup></h4>
</figcaption>
<p>  <img src='/blog/uploads/as-if-quality-matters/flow.jpg' /><br />
</figure>
<p>We have been using this version<sup id='as-if-quality-mattersfnref:3'><a href='#as-if-quality-mattersfn:3' rel='footnote'>3</a></sup> of the process for a while now. I think it has worked really well. In fact, during our last retrospective our QA guy stated that he thought our quality management processes belonged in the &#8220;Good things&#8221; category. I think that is the first time I have ever heard a QA person be so positive about processes.</p>
<p>Only time will tell if this approach produces significantly better results but I am very optimistic. So far it seems to have created a world of difference.</p>
<div class='footnotes'>
<hr />
<ol>
<li id='as-if-quality-mattersfn:1'>
<p>In practice the process is more fluid than it looks in the diagram. Developers are empowered to do what it takes to get the job done. Including not following the process. However, not following the process requires an <a href='http://en.wikipedia.org/wiki/Affirmative_defense'>affirmative defense</a> when QA asks &#8220;WTF?&#8221;.</p>
<p><a href='#as-if-quality-mattersfnref:1' rev='footnote'>&#8617;</a></li>
<li id='as-if-quality-mattersfn:2'>
<p>The QA team we do have is super top notch, though.</p>
<p><a href='#as-if-quality-mattersfnref:2' rev='footnote'>&#8617;</a></li>
<li id='as-if-quality-mattersfn:3'>
<p>It has taken a lot of iterative refinements to get to this version of the process.</p>
<p><a href='#as-if-quality-mattersfnref:3' rev='footnote'>&#8617;</a></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2011/11/as-if-quality-matters/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>RDF vocabulary design issues — checksums</title>
		<link>http://barelyenough.org/blog/2011/04/vocab-design-issues-checksums/</link>
		<comments>http://barelyenough.org/blog/2011/04/vocab-design-issues-checksums/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 13:23:33 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[spdx]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=583</guid>
		<description><![CDATA[The SPDX technical team recently encountered an interesting situation while developing our RDF vocabulary. The exact scenario was as follows: the information we store about a file is potentially invalidated with every change to the contents of that file. For example, some code might be added that has different licensing requirements. Or all the code [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href='http://spdx.org/'>SPDX</a> technical team recently encountered an interesting situation while developing our <a href='http://www.w3.org/RDF/'>RDF</a> vocabulary.</p>
<p>The exact scenario was as follows: the information we store about a file is potentially invalidated with every change to the contents of that file. For example, some code might be added that has different licensing requirements. Or all the code that has a particular licensing requirement might be removed. To make sure that the information in an SPDX file is valid for a particular file it must be possible to verify that its contents are identical to the contents of the file that was analyzed.</p>
<p>In SPDX we support verification of file contents by providing message digests<sup id='vocab-design-issues-checksumsfnref:1'><a href='#vocab-design-issues-checksumsfn:1' rel='footnote'>1</a></sup> of every file. However, message digest (hash) functions come and go. MD5 used to be the norm but has fallen out of favor. SHA1 is currently very popular but is steadily being replaced with SHA256 and SHA512. We definitely need to support more than one digest algorithm.</p>
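<p>To make the idea of several digests per file concrete, here is a small Ruby sketch using the standard library <code>Digest</code> classes. The <code>DIGEST_CLASSES</code> table and the <code>digests_for</code> helper are names invented for this example and are not part of any SPDX tool.</p>

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical table of the algorithms we want to record for every file.
DIGEST_CLASSES = {
  'sha1'   => Digest::SHA1,
  'sha256' => Digest::SHA256
}.freeze

# Returns a hash mapping each algorithm name to the hex digest of the content.
def digests_for(content)
  DIGEST_CLASSES.transform_values { |digest_class| digest_class.hexdigest(content) }
end
```

<p>Supporting another algorithm means adding one entry to the table, which mirrors the requirement that the vocabulary cope with digest functions coming and going.</p>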
<p>We had three basic choices:</p>
<ul>
<li>have separate properties for each digest algorithm, each of which would be a sub-property of the <code>checksum</code> property</li>
<li>define datatypes for each digest algorithm and have a single <code>checksum</code> property</li>
<li>define a class for digests that encapsulates all the data and have a single <code>checksum</code> property</li>
</ul>
<p>For the first option the graph would look like</p>
<pre><code>&lt;http://zlib.net/zlib-1.2.5.tar.gz#Makefile&gt; a spdx:file;
  spdx:sha1 &quot;1fac389…&quot;^^xs:hexBinary;
  spdx:sha256 &quot;4aa8223…&quot;^^xs:hexBinary.</code></pre>
<p>The resulting graphs are simple and self explanatory. Support for new digest algorithms is achieved by the addition of one new property for each algorithm.</p>
<p>One potential downside is that some tools might not be able to pass through digests for algorithms they do not understand. For example, SPDX provides a tool to translate SPDX RDF data to and from spreadsheets. The digests are inserted into the spreadsheet by the tool looking for the known digest properties and putting that data into the appropriate column in the spreadsheet. This means that novel digest types would be lost in the translation. This could be avoided if the tool supported <a href='http://www.w3.org/TR/owl-ref/'>OWL</a> inferencing but it is unlikely we will implement that in the near future. I think requiring OWL inferencing to work properly is a design smell.</p>
<p>A graph for the second option would look like</p>
<pre><code>&lt;http://zlib.net/zlib-1.2.5.tar.gz#Makefile&gt; a spdx:file;
  spdx:checksum &quot;1fac389…&quot;^^spdx:sha1Hex;
  spdx:checksum &quot;4aa8223…&quot;^^spdx:sha256Hex.</code></pre>
<p>This moves the digest algorithm into the datatype specification of the literals. This approach seems quite elegant. There is a single property so it is easy for tools to deal with. It is extensible, anyone could define a new datatype. It is relatively compact.</p>
<p>However, there are very few ontologies that use XML datatypes in this way. This could be because there are subtle problems with this approach. Or it could be that it is just uncommon. This approach would break down for algorithms that have any secondary parameters. In that case you could combine it with the third option, though.</p>
<p>A graph for the third option would look like</p>
<pre><code>&lt;http://zlib.net/zlib-1.2.5.tar.gz#Makefile&gt; a spdx:file;
  spdx:checksum [spdx:algorithm &lt;spdx:sha1&gt;;
                 spdx:checksumValue &quot;1fac389…&quot;^^xs:hexBinary];
  spdx:checksum [spdx:algorithm &lt;spdx:sha256&gt;;
                 spdx:checksumValue &quot;4aa8223…&quot;^^xs:hexBinary].</code></pre>
<p>In many ways this approach is very similar to the datatype based one. The introduction of anonymous resources does, potentially, allow additional parameters to be attached to the digest algorithm. However, tools that do not understand the algorithm would probably not pass that information through correctly. One practical upside is that the label for particular algorithms can be stored in the RDF. The <code>&lt;spdx:sha1&gt;</code> resource can have a <code>dc:title</code> property with value &#8220;SHA1&#8221;. This would mean that tools don&#8217;t necessarily have to implicitly understand a digest type in order to display it to humans.</p>
<p>In the end we decided to use the third approach on SPDX. The additional flexibility was generally found appealing. Having to define a new property for each new digest algorithm was generally viewed as a bit of a kludge. When I presented this issue to the semantic web mailing list there was only one response, which preferred the first option but found the third option acceptable. Most of the people involved in the SPDX effort are not highly experienced RDF modelers. I am not sure if the distaste for defining new properties reflects our relative lack of experience with RDF or if it is more fundamental.</p>
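<p>As a rough illustration of how a consumer might handle the third shape, the Ruby sketch below models each checksum as an algorithm identifier plus a hex value and verifies only the algorithms it knows. The <code>Checksum</code> struct, the <code>KNOWN_ALGORITHMS</code> table and the <code>verify</code> helper are hypothetical names, not part of the SPDX tools.</p>

```ruby
require 'digest/sha1'
require 'digest/sha2'

# One checksum node: the algorithm resource name and the hex encoded value.
Checksum = Struct.new(:algorithm, :value)

# Hypothetical mapping from algorithm identifiers to digest implementations.
KNOWN_ALGORITHMS = {
  'spdx:sha1'   => Digest::SHA1,
  'spdx:sha256' => Digest::SHA256
}.freeze

# Returns :match, :mismatch or :unknown_algorithm for one recorded checksum.
# Unknown algorithms are reported rather than dropped, so a tool can still
# carry the (algorithm, value) pair along unchanged.
def verify(checksum, content)
  digest_class = KNOWN_ALGORITHMS[checksum.algorithm]
  return :unknown_algorithm unless digest_class

  digest_class.hexdigest(content) == checksum.value.downcase ? :match : :mismatch
end
```

<p>Because the algorithm is data rather than a property name, a checksum with an unrecognized algorithm is still an ordinary <code>Checksum</code> value that can be copied or displayed.</p>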
<p>Feedback on this choice is welcome but this post is more of an exploration of the possibilities and implications of those approaches.</p>
<div class='footnotes'>
<hr />
<ol>
<li id='vocab-design-issues-checksumsfn:1'>
<p>We decided to call this property <code>spdx:checksum</code>. While this is technically a misnomer it does effectively convey the intent of the field.</p>
<p><a href='#vocab-design-issues-checksumsfnref:1' rev='footnote'>&#8617;</a></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2011/04/vocab-design-issues-checksums/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RDFa as interchange format</title>
		<link>http://barelyenough.org/blog/2011/04/rdfa-as-interchange-format/</link>
		<comments>http://barelyenough.org/blog/2011/04/rdfa-as-interchange-format/#comments</comments>
		<pubDate>Mon, 04 Apr 2011 11:00:33 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[spdx]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=566</guid>
		<description><![CDATA[The tension between human and machine readability is never greater than when developing interchange formats. Formats that are easy and efficient for computers to read tend to be rather difficult for people to understand. When developing an interchange format you know that there will be few tools supporting it when it is released so [...]]]></description>
			<content:encoded><![CDATA[<p>The tension between human and machine readability is never greater than when developing interchange formats. Formats that are easy and efficient for computers to read tend to be rather difficult for people to understand. When developing an interchange format you know that there will be few tools supporting it when it is released, so it needs to be useful even with limited tooling. However, the format must support the development of sophisticated tools if it is to succeed in the long run.</p>
<p>A large part of the appeal of XML based languages is that XML provides a reasonable balance between those two factors &#8211; for programmers. It can be read easily by computers and understood reasonably well by a programmer with very limited tooling.</p>
<p>For the non-programmer the story is somewhat different. Most XML based languages are complete gibberish to people without significant technical expertise. A business person will need non-trivial tools to allow them to consume the information locked up in an XML file.</p>
<p>Data interchange formats implicitly value the wider distribution of information. Why else would you be exchanging data? It is disappointing that so many of these formats are based on technology that excludes all but those with sophisticated tools or deep technical knowledge. Data interchange formats should be designed first for people, and second for computers. A properly designed data interchange format should be consumable, using commonly available tools, by any person who is familiar with the domain.</p>
<p>This means that XML is pretty much right out.</p>
<p>Fortunately there is <a href='http://www.w3.org/TR/xhtml-rdfa-primer/'>HTML+RDFa</a>. RDFa allows <a href='http://www.w3.org/TR/rdf-primer/'>RDF</a> data sets to be serialized into HTML documents. The information can then be consumed by humans using any web browser. The raw data can be readily extracted by tools.</p>
<p>Consider the following two examples. Each is part of an SPDX<sup id='rdfa-as-interchange-formatfnref:1'><a href='#rdfa-as-interchange-formatfn:1' rel='footnote'>1</a></sup> file. In the first example, HTML+RDFa, both groups are easily supported. The information is displayed in a way that it can be understood by most humans and the data is machine readable. In the second the information is machine readable but quite difficult for humans to interpret.</p>
<div style='border:solid 2px gray; overflow:auto;'>
<h2 style='margin-top:0.25em; margin-bottom:0.25em;'>Files in zlib 1.2.5</h2>
<table style='border-collapse:collapse;'>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>License</th>
<th>Checksum</th>
<th>Copyright</th>
</tr>
</thead>
<tbody>
<tr about='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690#adler32.c' typeof='spdx:File'>
<td property='spdx:Name'>adler32.c<br />
<span resource='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690' rev='spdx:MemberFile' /></td>
<td property='spdx:Type'>source</td>
<td><a href='http://spdx.com/licenses/zlib' rel='spdx:License'>Zlib</a></td>
<td datatype='spdx:sha256' property='spdx:mac'>gWAPnq8fV6sVKdiYkgJQ1nFoTaXXSqoVfJbMCr9Kzd0</td>
<td property='spdx:Copyright'>unknown</td>
</tr>
<tr about='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690#amiga/Makefile.pup' typeof='spdx:File'>
<td property='spdx:Name'>amiga/Makefile.pup<span resource='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690' rev='spdx:MemberFile' /></td>
<td property='spdx:Type'>other</td>
<td><a href='http://spdx.org/licenses/Zlib' rel='spdx:License'>Zlib</a></td>
<td datatype='spdx:sha256' property='spdx:mac'>plyzzUCxuOx34oiXTdncU9ke14u.SV6UzMhN3UI.3x8</td>
<td property='spdx:Copyright'>unknown</td>
</tr>
<tr about='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690#ChangeLog' typeof='spdx:File'>
<td property='spdx:Name'>ChangeLog<span resource='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690' rev='spdx:MemberFile' /></td>
<td property='spdx:Type'>other</td>
<td><a href='http://spdx.org/licenses/Zlib' rel='spdx:License'>Zlib</a></td>
<td datatype='spdx:sha256' property='spdx:mac'>rxKcRCSHu8.4tHMdiJRINMatY4efBCMz.PmHpo.gj9s</td>
<td property='spdx:Copyright'>unknown</td>
</tr>
<tr about='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690#contrib/ada/readme.txt' typeof='spdx:File'>
<td property='spdx:Name'>contrib/ada/readme.txt<span resource='https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip&amp;package_version_id=3690' rev='spdx:MemberFile' /></td>
<td property='spdx:Type'>other</td>
<td><a href='http://spdx.org/licenses/GPL-2.0' rel='spdx:License'>GPL-2.0</a></td>
<td datatype='spdx:sha256' property='spdx:mac'>j.nlMD8ujot0bHglDnS3xK63zmIS_c51H8Ogzlakf.I</td>
<td property='spdx:Copyright'>unknown</td>
</tr>
</tbody>
</table>
</div>
<p>The following is a similar amount of information expressed in RDF/XML</p>
<pre><code>&lt;spdx:File rdf:about=&quot;https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip=3690#CMakeLists.txt&quot;&gt;
  &lt;spdx:Copyright&gt;unknown&lt;/spdx:Copyright&gt;
  &lt;spdx:License rdf:resource=&quot;http://spdx.org/licenses/Zlib&quot;/&gt;
  &lt;spdx:Name&gt;CMakeLists.txt&lt;/spdx:Name&gt;
  &lt;spdx:Type&gt;source&lt;/spdx:Type&gt;
  &lt;spdx:sha1&gt;L_NaoTEuHjO8uI1VLGIai2rDk7wPkoyeSe5vVD1gxOc&lt;/spdx:sha1&gt;
&lt;/spdx:File&gt;
&lt;spdx:File rdf:about=&quot;https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip=3690#ChangeLog&quot;&gt;
  &lt;spdx:Copyright&gt;unknown&lt;/spdx:Copyright&gt;
  &lt;spdx:License rdf:resource=&quot;http://spdx.org/licenses/Zlib&quot;/&gt;
  &lt;spdx:Name&gt;ChangeLog&lt;/spdx:Name&gt;
  &lt;spdx:Type&gt;other&lt;/spdx:Type&gt;
  &lt;spdx:sha1&gt;rxKcRCSHu8.4tHMdiJRINMatY4efBCMz.PmHpo.gj9s&lt;/spdx:sha1&gt;
&lt;/spdx:File&gt;
&lt;spdx:File rdf:about=&quot;https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip=3690#FAQ&quot;&gt;
  &lt;spdx:Copyright&gt;unknown&lt;/spdx:Copyright&gt;
  &lt;spdx:License rdf:resource=&quot;http://spdx.org/licenses/Zlib&quot;/&gt;
  &lt;spdx:Name&gt;FAQ&lt;/spdx:Name&gt;
  &lt;spdx:Type&gt;other&lt;/spdx:Type&gt;
  &lt;spdx:sha1&gt;qNaKL48VknhfAA7NgpOLjMoyHlxv45x.hMu8UfAy0Fc&lt;/spdx:sha1&gt;
&lt;/spdx:File&gt;
&lt;spdx:File rdf:about=&quot;https://olex.openlogic.com/package_versions/download/9423?path=openlogic/zlib/1.2.5/openlogic-zlib-1.2.5-all-src-1.zip=3690#INDEX&quot;&gt;
  &lt;spdx:Copyright&gt;unknown&lt;/spdx:Copyright&gt;
  &lt;spdx:License rdf:resource=&quot;http://spdx.org/licenses/Zlib&quot;/&gt;
  &lt;spdx:Name&gt;INDEX&lt;/spdx:Name&gt;
  &lt;spdx:Type&gt;other&lt;/spdx:Type&gt;
  &lt;spdx:sha1&gt;HMjPi3ZRY2Anq1vc4Lfl5x77JWj3hhaWWT.I34QPK9g&lt;/spdx:sha1&gt;
&lt;/spdx:File&gt;</code></pre>
<p>The extreme accessibility of HTML+RDFa for both humans and machines makes it an obviously superior choice for data interchange formats. HTML+RDFa is a relatively new entry into the arena. Hopefully we will see more data formats use this superb technology.</p>
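<p>To give a feel for the machine side, here is a rough Ruby sketch that pulls the annotated values back out of a table row like the ones shown earlier. It is not a real RDFa processor: it only looks at <code>about</code>, <code>property</code> and <code>rel</code> attributes, and it assumes the markup is well formed enough for REXML to parse. The sample markup, its example.org URL and the <code>extract_files</code> helper are invented for illustration.</p>

```ruby
require 'rexml/document'

# A cut down, hypothetical row in the same style as the table in this post.
SAMPLE_XHTML = <<~XHTML
  <table>
    <tr about='http://example.org/zlib-1.2.5.zip#adler32.c' typeof='spdx:File'>
      <td property='spdx:Name'>adler32.c</td>
      <td property='spdx:Type'>source</td>
      <td><a href='http://spdx.org/licenses/Zlib' rel='spdx:License'>Zlib</a></td>
    </tr>
  </table>
XHTML

# Returns one hash per spdx:File row, keyed by property or rel name.
def extract_files(xhtml)
  doc = REXML::Document.new(xhtml)
  REXML::XPath.match(doc, "//tr[@typeof='spdx:File']").map do |row|
    record = { 'about' => row.attributes['about'] }
    # Literal values live in the text of cells carrying a property attribute.
    row.elements.each('td[@property]') do |cell|
      record[cell.attributes['property']] = cell.text.to_s.strip
    end
    # Resource values live in the href of links carrying a rel attribute.
    link = row.elements['td/a[@rel]']
    record[link.attributes['rel']] = link.attributes['href'] if link
    record
  end
end
```

<p>The same document that a person reads in a browser is the input here, which is the point of the format.</p>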
<div class='footnotes'>
<hr />
<ol>
<li id='rdfa-as-interchange-formatfn:1'>
<p>The <a href='http://spdx.org/'>Software Package Data Exchange</a> project is designing a way to exchange licensing information for software packages. The current phase of development is primarily focused on simple manifest and copyright licensing related information.</p>
<p><a href='#rdfa-as-interchange-formatfnref:1' rev='footnote'>&#8617;</a></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2011/04/rdfa-as-interchange-format/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Javascript</title>
		<link>http://barelyenough.org/blog/2011/01/javascript-feelings/</link>
		<comments>http://barelyenough.org/blog/2011/01/javascript-feelings/#comments</comments>
		<pubDate>Fri, 28 Jan 2011 17:21:59 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[JavaScript]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=549</guid>
		<description><![CDATA[From the coffeescript homepage: Underneath all of those embarrassing braces and semicolons, JavaScript has always had a gorgeous object model at its heart. That sums up my feelings about javascript almost exactly.]]></description>
			<content:encoded><![CDATA[<p>From the <a href="http://jashkenas.github.com/coffee-script/">coffeescript</a> homepage:</p>
<blockquote><p>Underneath all of those embarrassing braces and semicolons, JavaScript has always had a gorgeous object model at its heart.</p></blockquote>
<p>That sums up my feelings about javascript almost exactly.</p>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2011/01/javascript-feelings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Saml SP announce</title>
		<link>http://barelyenough.org/blog/2010/11/saml-sp-announce/</link>
		<comments>http://barelyenough.org/blog/2010/11/saml-sp-announce/#comments</comments>
		<pubDate>Mon, 08 Nov 2010 12:30:15 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=536</guid>
		<description><![CDATA[Saml-sp provides support for being a SAML 2.0 service provider in an HTTP artifact binding SSO conversation. Synopsis This library provides parsing of SAML 2.0 artifacts. For example. artifact = Saml2::Type4Artifact.new_from_string(params[&#39;SAMLart&#39;]) # =&#62; #&#60;Saml2::Type4Artifact ...&#62; artifact.source_id # =&#62; &#39;a314Xc8KaSd4fEJAd8R&#39; artifact.type_code # =&#62; 4 Once you have an artifact you can resolve it into its associated [...]]]></description>
			<content:encoded><![CDATA[<p>Saml-sp provides support for being a SAML 2.0 service provider in an HTTP artifact binding SSO conversation.</p>
<h2 id='synopsis'>Synopsis</h2>
<p>This library provides parsing of SAML 2.0 artifacts. For example:</p>
<pre><code>artifact = Saml2::Type4Artifact.new_from_string(params[&#39;SAMLart&#39;])
# =&gt; #&lt;Saml2::Type4Artifact ...&gt;
artifact.source_id    # =&gt; &#39;a314Xc8KaSd4fEJAd8R&#39;
artifact.type_code    # =&gt; 4</code></pre>
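<p>For readers curious what that parsing involves: the SAML 2.0 bindings specification defines a type 4 artifact as the base64 encoding of a 2 byte type code (0x0004), a 2 byte endpoint index, a 20 byte source id and a 20 byte message handle. The sketch below decodes that layout in plain Ruby. It is not saml-sp&#8217;s implementation, and <code>Type4Fields</code> and <code>parse_type4_artifact</code> are names invented for this example.</p>

```ruby
# Hypothetical container for the four fields of a type 4 artifact.
Type4Fields = Struct.new(:type_code, :endpoint_index, :source_id, :message_handle)

# Decodes a base64 encoded SAML 2.0 type 4 artifact into its fields.
def parse_type4_artifact(encoded)
  raw = encoded.unpack1('m') # 'm' is the base64 directive of String#unpack
  raise ArgumentError, "expected 44 bytes, got #{raw.bytesize}" unless raw.bytesize == 44

  # Two big-endian unsigned 16 bit integers: type code and endpoint index.
  type_code, endpoint_index = raw.unpack('nn')
  raise ArgumentError, "not a type 4 artifact (type code #{type_code})" unless type_code == 4

  Type4Fields.new(type_code, endpoint_index, raw.byteslice(4, 20), raw.byteslice(24, 20))
end
```

<p>The source id extracted this way is the value that has to match a configured artifact resolution service.</p>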
<p>Once you have an artifact you can resolve it into its associated assertion:</p>
<pre><code>assertion = artifact.resolve     # =&gt; #&lt;Saml2::Assertion&gt;</code></pre>
<p>With the assertion you can identify the user and retrieve attributes:</p>
<pre><code>assertion.subject_name_id        # =&gt; &#39;1234&#39;
assertion[&#39;mail&#39;]                # =&gt; &#39;john.doe@idp.example&#39;</code></pre>
<h3 id='configuration'>Configuration</h3>
<p>If you are using Rails, SamlSp will automatically load configuration info from <code>config/saml_sp.conf</code>.</p>
<p>For non-Rails apps the saml-sp configuration file can be placed in the application configuration directory and loaded using the following code during application startup.</p>
<pre><code>SamlSp::Config.load_file(APP_ROOT + &quot;/config/saml_sp.conf&quot;)</code></pre>
<h4 id='logging'>Logging</h4>
<p>If you are using saml-sp in a rails app it will automatically log to the Rails default logger. For non-Rails apps you can specify a Logger object to be used in the config file.</p>
<pre><code>logger MY_APP_LOGGER</code></pre>
<h4 id='artifact_resolution_service'>Artifact Resolution Service</h4>
<p>For artifact resolution to take place you need to configure an artifact resolution service for the artifact&#8217;s source. This is done by adding a block similar to the following to your saml-sp config file.</p>
<pre><code>artifact_resolution_service {
  source_id         &#39;opaque-id-of-the-idp&#39;
  uri               &#39;https://samlar.idp.example/resolve-artifact&#39;
  identity_provider &#39;http://idp.example/&#39;
  service_provider  &#39;http://your-domain.example/&#39;
  http_basic_auth {
    realm    &#39;the-idp-realm&#39;
    user_id  &#39;my-user-id&#39;
    password &#39;my-password&#39;
  }
}</code></pre>
<p>The configuration details are:</p>
<ul>
<li>
<p>source_id: The id of the source that this resolution service can resolve. This is a 20 octet binary string.</p>
</li>
<li>
<p>uri: The endpoint to which artifact resolve requests should be sent.</p>
</li>
<li>
<p>identity_provider: The URI identifying the identity provider that issues assertions using the source id specified.</p>
</li>
<li>
<p>service_provider: The URI identifying your software (the service provider) to the identity provider.</p>
</li>
<li>
<p>http_basic_auth: (Optional) The credentials needed to authenticate with the IdP using HTTP basic authentication.</p>
</li>
</ul>
<h4 id='promiscuous_auth'>Promiscuous Auth</h4>
<p>If the IdP does not provide proper HTTP challenge responses you can specify the HTTP auth in promiscuous mode. For example,</p>
<pre><code>http_basic_auth {
  promiscuous
  user_id  &#39;my-user-id&#39;
  password &#39;my-password&#39;
}</code></pre>
<p>In promiscuous mode the credentials are sent with every request to this resolution service regardless of its realm.</p>
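<p>The configuration snippets above happen to be valid Ruby, so one plausible way to read such a file is to evaluate each block against an object that records every setting it sees. The sketch below shows that technique with <code>instance_eval</code> and <code>method_missing</code>. It is an assumption about how a block-style config could be handled, not a description of saml-sp&#8217;s internals, and <code>BlockSettings</code> is a made up name.</p>

```ruby
# Hypothetical recorder that turns a block of bare words into a nested hash.
class BlockSettings
  # Evaluates the block with a fresh recorder as self and returns the result.
  def self.evaluate(&block)
    settings = new
    settings.instance_eval(&block)
    settings.to_h
  end

  def initialize
    @values = {}
  end

  def to_h
    @values
  end

  # name { ... }  records a nested group
  # name value    records a single setting
  # name          records a flag, as in the promiscuous example
  def method_missing(name, *args, &block)
    if block
      @values[name] = BlockSettings.evaluate(&block)
    elsif args.size == 1
      @values[name] = args.first
    elsif args.empty?
      @values[name] = true
    else
      super
    end
  end
end
```

<p>Evaluating the <code>http_basic_auth</code> block from the promiscuous example this way would yield a hash with <code>:promiscuous</code> set to <code>true</code> alongside the credentials.</p>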
<h2 id='requirements'>Requirements</h2>
<ul>
<li>Nokogiri</li>
<li>Resourceful</li>
<li>uuidtools</li>
</ul>
<h2 id='install'>Install</h2>
<ul>
<li>sudo gem install saml-sp</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2010/11/saml-sp-announce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Resque-multi-step</title>
		<link>http://barelyenough.org/blog/2010/10/resque-multi-step-announce/</link>
		<comments>http://barelyenough.org/blog/2010/10/resque-multi-step-announce/#comments</comments>
		<pubDate>Wed, 06 Oct 2010 11:08:46 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[resque]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=520</guid>
		<description><![CDATA[I&#8217;ve been developing using asynchronous jobs quite a bit lately. There is only one reason to do work asynchronously: it takes too long to do it synchronously. Fortunately, it turns out that many of these very large workloads are embarrassingly parallel problems. And look, you have several (dozen) workers just waiting to do your [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been developing using asynchronous jobs quite a bit lately.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup> There is only one reason to do work asynchronously: it takes too long to do it synchronously.</p>
<p>Fortunately, it turns out that many of these very large workloads are <a href='http://en.wikipedia.org/wiki/Embarrassingly_parallel'>embarrassingly parallel problems</a>. And look, you have several (dozen) workers just waiting to do your bidding. It just makes sense to break up these large blocks of work into many smaller chunks so that they can be processed in parallel.</p>
<p>Breaking a task up into many small parts comes with some issues. Any task that takes long enough to run asynchronously is probably going to take long enough that you need to track its progress.</p>
<p>Also, the problem is probably not completely parallelizable. Most problems seem to have a large portion of easily parallelized work followed by a bit of work that can only happen after all the parallel work has been completed.</p>
<p>These patterns show up often enough that i have gotten tired of repeating myself. Hence was born <a href='http://github.com/pezra/resque-multi-step'>resque-multi-step</a>. Resque-multi-step is a <a href='http://github.com/defunkt/resque'>Resque</a> plugin that provides compound job support complete with progress tracking, error handling, and a completely serial finalization sequence.</p>
<h2 id='example'>Example</h2>
<p>Say you want to reindex all the posts in a blog. However, committing Solr for each post would be excessively slow. (Trust me, it really is.)</p>
<pre><code>Resque::Plugins::MultiStepTask.create(&quot;reindex-#{blog.name}&quot;) do |task|
  blog.posts.each do |post|
    task.add_job ReindexWithoutCommit, post
  end

  task.add_finalization_job CommitSolr
end</code></pre>
<p>This reindexes all the posts in parallel. Any available workers will pick up a job to reindex a specific blog post. Once all those reindex jobs have completed, the finalization job will be executed.</p>
<p>If you have more than one finalization job, they are executed serially in the order they were added to the task.</p>
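<p>The ordering guarantees described above can be sketched in plain ruby. This is not how resque-multi-step is implemented: it uses threads in place of Resque workers and Redis, and the <code>SketchTask</code> class and its <code>run</code> and <code>completed</code> methods are hypothetical. It only illustrates the shape of the idea: parallel jobs with a progress counter, then finalization jobs run one at a time in the order added.</p>

```ruby
require 'thread'

# Hypothetical stand-in for a multi-step task. Normal jobs may run in any
# order on any worker. Finalization jobs run serially, in the order added,
# and only after every normal job has finished.
class SketchTask
  attr_reader :completed

  def initialize
    @jobs = []
    @finalizers = []
    @completed = 0
    @lock = Mutex.new
  end

  def add_job(work)
    @jobs.push(work)
  end

  def add_finalization_job(work)
    @finalizers.push(work)
  end

  def run(worker_count = 4)
    queue = Queue.new
    @jobs.each { |job| queue.push(job) }

    workers = Array.new(worker_count) do
      Thread.new do
        loop do
          begin
            job = queue.pop(true) # non-blocking; raises ThreadError when empty
          rescue ThreadError
            break
          end
          job.call
          @lock.synchronize { @completed += 1 } # progress tracking
        end
      end
    end

    workers.each { |worker| worker.join } # wait for all the parallel work
    @finalizers.each { |job| job.call }   # then finalize serially, in order
  end
end

task = SketchTask.new
log = Queue.new
%w[post-1 post-2 post-3].each do |post|
  task.add_job(lambda { log.push("reindexed #{post}") })
end
task.add_finalization_job(lambda { log.push('committed') })
task.run

puts task.completed                # number of parallel jobs that finished
puts log.pop until log.empty?      # 'committed' is always the last line
```

<p>The reindex lines can come out in any order, but the commit is never interleaved with them.</p>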
<h2 id='administrivia'>Administrivia</h2>
<p>If these issues sound familiar, give resque-multi-step a try. It is available as a gem so installing is just:</p>
<pre><code>gem install resque-multi-step</code></pre>
<p>If you want to contribute, head on over to the <a href='http://github.com/pezra/resque-multi-step'>github project</a> and hack away. If you come up with something useful i&#8217;ll integrate it posthaste.</p>
<div class='footnotes'>
<hr />
<ol>
<li id='fn:1'>
<p><a href='http://github.com/pezra/resque-fairly'>resque-fairly</a> was one of the first public outcomes of such work. The fair scheduling it provides is the basis for <a href='http://github.com/pezra/resque-multi-step'>this effort</a>.</p>
<p><a href='#fnref:1' rev='footnote'>&#8617;</a></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2010/10/resque-multi-step-announce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>resque-fairly</title>
		<link>http://barelyenough.org/blog/2010/08/resque-fairly-announce/</link>
		<comments>http://barelyenough.org/blog/2010/08/resque-fairly-announce/#comments</comments>
		<pubDate>Tue, 24 Aug 2010 19:58:50 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[resque]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=498</guid>
		<description><![CDATA[I have been using Resque quite a bit recently. It is a really nice asynchronous job system based on Redis. Resque checks the queues for jobs to process in a fixed order. (In alphabetic order, to be precise.) This turns out to be a problem if you want predictable handling time for jobs. For example, [...]]]></description>
			<content:encoded><![CDATA[<p>I have been using <a href='http://github.com/defunkt/resque'>Resque</a> quite a bit recently. It is a really nice asynchronous job system based on <a href='http://code.google.com/p/redis/'>Redis</a>.</p>
<p>Resque checks the queues for jobs to process in a fixed order. (In alphabetic order, to be precise.) This turns out to be a problem if you want predictable handling time for jobs. For example, consider a system which has queues <code>aaa</code> and <code>zzz</code>. If you add 100 jobs to <code>aaa</code> and 1 job to <code>zzz</code>, the job on <code>zzz</code> will wait a long time before being processed.</p>
<p>This problem is easily solved by just checking the queues in random order. Over time, any particular queue will be checked early so a few deep queues will not starve the other queues in the system.</p>
<p><a href='http://github.com/pezra/resque-fairly'>resque-fairly</a> is a Resque <a href='http://wiki.github.com/defunkt/resque/plugins'>plugin</a> which provides that behavior. Just install the gem, add <code>require &#39;resque-fairly&#39;</code> and Resque will handle queues with approximate fairness.</p>
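<p>The difference between the two orderings can be seen in a small simulation. This is a hypothetical model of a single worker, not code from Resque or resque-fairly. The queue names and job counts come from the example above and all method names are made up for illustration.</p>

```ruby
# Builds the example system: 100 jobs on aaa and 1 job on zzz.
def build_queues
  { 'aaa' => Array.new(100) { |i| "aaa-job-#{i}" },
    'zzz' => ['zzz-job-0'] }
end

# Fixed (alphabetical) order: always take from the first non-empty queue.
def next_job_alphabetical(queues)
  name = queues.keys.sort.find { |n| !queues[n].empty? }
  queues[name].shift if name
end

# Random order: shuffle the queue names before every check.
def next_job_random(queues, rng = Random.new)
  name = queues.keys.shuffle(random: rng).find { |n| !queues[n].empty? }
  queues[name].shift if name
end

# Counts how many jobs are processed before the lone zzz job gets its turn.
# The block decides which job the worker picks next.
def jobs_before_zzz(queues)
  count = 0
  while (job = yield(queues))
    return count if job.start_with?('zzz')
    count += 1
  end
  count
end

puts jobs_before_zzz(build_queues) { |q| next_job_alphabetical(q) } # 100
puts jobs_before_zzz(build_queues) { |q| next_job_random(q) }       # varies from run to run
```

<p>With the fixed order the <code>zzz</code> job always waits behind all 100 <code>aaa</code> jobs. With the random order it has an even chance of being picked at every check, so it is typically reached after only a few jobs.</p>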
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2010/08/resque-fairly-announce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Vertical Slicing</title>
		<link>http://barelyenough.org/blog/2010/07/vertical-slicing/</link>
		<comments>http://barelyenough.org/blog/2010/07/vertical-slicing/#comments</comments>
		<pubDate>Mon, 12 Jul 2010 02:57:33 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[REST]]></category>

		<guid isPermaLink="false">http://barelyenough.org/?p=466</guid>
		<description><![CDATA[I am a fan of polylithic architectures. Such architectures have many advantages related to enhancing evolvability and maintainability. When you decide to create a system composed of small pieces how do you decide what functionality goes into which component? Principles The goal is to sub-divide the application into multiple highly cohesive components which are weakly [...]]]></description>
			<content:encoded><![CDATA[<p>I am a fan of <a href='http://barelyenough.org/blog/2009/09/small-pieces/'>polylithic architectures</a>. Such architectures have many advantages related to enhancing evolvability and maintainability. When you decide to create a system composed of small pieces how do you decide what functionality goes into which component?</p>
<h2 id='principles'>Principles</h2>
<p>The goal is to sub-divide the application into multiple <a href='http://en.wikipedia.org/wiki/Cohesion_(computer_science)'>highly cohesive</a> components which have weak <a href='http://github.com/jimweirich/presentation_connascence/raw/master/Connascence.key.pdf'>connascence</a> with each other. To achieve the desired cohesion it will be necessary to align the component boundaries with natural fissure points in the application.</p>
<p>The strategy should allow for the production of an arbitrary number of components. A component that was of a manageable size yesterday could easily become too large tomorrow. In that situation the over-sized component will need to be sub-divided. Applying the same strategy repeatedly will result in a system that is more easily understood.</p>
<p>We want to minimize redundancy in the components. Redundancy results in more code which must be understood and maintained. More importantly, redundancy usually introduces <a href='http://onestepback.org/articles/connascence/conalgorithm.html'>connascence of algorithm</a>, making changes more error prone and expensive. In a perfect world, any particular behavior would be implemented in exactly one component.</p>
<p>We want to isolate changes to the system. When implementing a new feature it is desirable to change as few components as possible. Each additional component that must be changed raises the complexity of the change. The componentization strategy should minimize the number of components involved in the average change to the system.</p>
<p>With those metrics in mind let&#8217;s explore the two most common approaches and see how they compare with each other. Those two patterns of componentization are horizontal slicing and vertical slicing.</p>
<h2 id='horizontal_slicing'>Horizontal slicing</h2>
<p>In this approach the component boundaries are derived from the implementation domain. The implementation is divided into a set of stacked layers in such a way that a layer initiates communication with the layers below it. This results in a standard layered architecture. By implementing each layer in a separate component you can achieve the horizontal slicing. This style of componentization strategy results in the very common <a href='http://en.wikipedia.org/wiki/Multitier_architecture'>n-tier architecture pattern</a>.</p>
<p>For example, an application that has a business logic layer and a presentation layer would be divided into two components: a business logic component and a presentation component.</p>
<h2 id='vertical_slicing'>Vertical slicing</h2>
<p>In this approach the component boundaries are derived from the application domain. Related domain concepts are grouped together into components. Individual components communicate with any other components as needed.</p>
<p>This approach is also quite common but is usually thought of a lot less formally. It is more common for this type of segmentation to develop incidentally, for example because separate teams developed the parts independently and then integrated them later. Any time you integrate separate applications you have vertical componentization.</p>
<h2 id='the_score'>The Score</h2>
<p>Against the metrics we laid out earlier, vertical slicing does much better than horizontal.</p>
<table>
<thead>
<tr>
<th />
<th>Horizontal slicing</th>
<th>Vertical slicing</th>
</tr>
</thead>
<tbody>
<tr>
<td style='text-align: left;'>Cohesion</td>
<td style='text-align: left;'>high</td>
<td style='text-align: left;'>high</td>
</tr>
<tr>
<td style='text-align: left;'>Repeatability</td>
<td style='text-align: left;'>low</td>
<td style='text-align: left;'>high</td>
</tr>
<tr>
<td style='text-align: left;'>DRYness</td>
<td style='text-align: left;'>low</td>
<td style='text-align: left;'>high</td>
</tr>
<tr>
<td style='text-align: left;'>Change isolation</td>
<td style='text-align: left;'>low</td>
<td style='text-align: left;'>high</td>
</tr>
</tbody>
</table>
<h3 id='cohesion'>Cohesion</h3>
<p>Horizontal slicing has high cohesion. Each of the components can represent a logically cohesive part of the implementation.</p>
<p>Vertical slicing also has high cohesion. Each component represents a highly cohesive part of the application domain.</p>
<h3 id='repeatability'>Repeatability</h3>
<p>Vertical slicing provides a mechanism for reapplying the subdivision pattern an arbitrary number of times. If any component gets too large to manage it can be divided into multiple components based on the application domain concepts. This same process can be repeated from the initial division of a monolithic application until components of the desired size have been achieved.</p>
<p>Horizontal slicing is less repeatable. The more tiers the harder it is to maintain cohesiveness. In practice it is very rare to see a tiered architecture with more than 4 tiers, and 3 tiers is much more common.</p>
<h3 id='dryness'><a href='http://en.wikipedia.org/wiki/Don%27t_repeat_yourself'>DRYness</a></h3>
<p>Horizontal slicing tends to result in some repetition. Certain behaviors will have to be repeated at each layer. Data validation rules are one example: you will need those in the presentation layer to provide good error messages and in the business logic layer to prevent bad data being persisted.</p>
<p>Vertical slicing allows you to reduce the connascence of algorithm because any single user activity is implemented in exactly one component. Components usually do end up communicating with each other; however, they do so in a way that does not require that the same algorithms be implemented in multiple components. For any one bit of data or behavior, one component will be its authoritative source.</p>
<h3 id='change_isolation'>Change isolation</h3>
<p>Vertical slicing tends to allow new features to be implemented by changing only one component. The component changed is the one which already contains features cohesive with the new one.</p>
<p>Horizontal slicing, on the other hand, tends to require changes in every layer. The new feature will require additions to the presentation layer, the business logic layer and the persistence layer. Having to work in every layer increases the cognitive load required to achieve the desired result.</p>
<h2 id='conclusion'>Conclusion</h2>
<p>Vertical slicing provides significant advantages. The high cohesion, dryness, and change isolation combine to drastically reduce the risks and cost of change. That in turn allows better/faster maintenance and evolution of the system. The repeatability allows you to retain these benefits even while adding functionality over time. Each time a component gets too large you can divide it until you have reached an application size that is human scaled.</p>
<p>Having a large number of components operate as a system does result in a good deal of communication between the components. It is important to pay attention to the design of the APIs. Poor API design can introduce excessive coupling which will eat up most of the advantages described above. <a href='http://barelyenough.org/blog/2007/05/hypermedia-as-the-engine-of-application-state/'>Hypermedia</a> &#8211; or more precisely, following the REST architectural style &#8211; is the best way i know to reduce coupling between the components.</p>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2010/07/vertical-slicing/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Is ruby immature?</title>
		<link>http://barelyenough.org/blog/2010/06/is-ruby-immature/</link>
		<comments>http://barelyenough.org/blog/2010/06/is-ruby-immature/#comments</comments>
		<pubDate>Thu, 01 Jul 2010 05:17:40 +0000</pubDate>
		<dc:creator>Peter Williams</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Rails]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://barelyenough.org/blog/2010/06/is-ruby-immature/</guid>
		<description><![CDATA[A friend of mine recently described why he feels ruby is immature. I, of course, disagree with him. There is much in ruby that could be improved, but the issues he raised are a) intentional design choices or b) weaknesses in specific applications built in ruby. Neither of those scenarios can be fairly described as [...]]]></description>
			<content:encoded><![CDATA[<p>A friend of mine recently <a href='http://www.evilsoft.org/2010/06/30/why-i-view-ruby-as-immature'>described why he feels ruby is immature</a>. I, of course, disagree with him. There is much in ruby that could be improved, but the issues he raised are a) intentional design choices or b) weaknesses in specific applications built in ruby. Neither of those scenarios can be fairly described as immaturity in the language, or the community using the language.</p>
<h2 id='id1'><code>Set</code></h2>
<p>Mr. Jones&#8217; main example is one regarding the <a href='http://ruby-doc.org/core/classes/Set.html'><code>Set</code></a> class in ruby. In practice <code>Set</code> is a rarely used class in ruby. I suspect it exists primarily for historical and completeness reasons. It is rather rare to see idiomatic ruby that utilizes <code>Set</code>.<sup id='fnref:1'><a href='#fn:1' rel='footnote'>1</a></sup></p>
<p>This is possible because <code>Array</code> provides a rather complete implementation of basic set operations. Rubyists are very accustomed to using arrays, so it is more common to just use the set operators on arrays rather than converting an array into a set.</p>
<p>The set operations on <code>Array</code> do not have the same performance characteristics Mr. Jones found with <code>Set</code>. For example,</p>
<pre><code>$ time ruby -rpp -e &#39;pp (1..10_000_000).to_a &amp; (1..10).to_a&#39;
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

real	0m10.152s
user	0m6.592s
sys	0m3.515s

$ time ruby -rpp -e &#39;pp (1..10).to_a &amp; (1..10_000_000).to_a&#39;
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

real	0m12.410s
user	0m8.397s
sys	0m3.860s</code></pre>
<p>Order still matters, but very much less. (That is on 1.8.6, the only version i have handy at the moment. I am sure that 1.9, or even 1.8.7, would be quite a bit faster.)</p>
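<p>For reference, these are the set operators that <code>Array</code> provides out of the box, next to the <code>Set</code> equivalent. The sample values are made up for illustration.</p>

```ruby
require 'set'

a = [1, 2, 3, 4]
b = [3, 4, 5]

p(a & b)          # intersection: [3, 4]
p(a | b)          # union: [1, 2, 3, 4, 5]
p(a - b)          # difference: [1, 2]
p([1, 1, 2].uniq) # duplicates removed: [1, 2]

# The same intersection with Set requires converting first.
p((Set.new(a) & Set.new(b)).to_a) # [3, 4]
```

<p>The array operators return plain arrays, so the results can be passed straight on to code that expects an <code>Array</code>.</p>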
<p>Libraries that are low traffic areas don&#8217;t get the effort that high use libraries do in any language. Even though <code>Set</code> is part of the standard library, it definitely counts as a low traffic area. Hence, it has never been optimized for large numbers of items. This is appropriate because as we <a href='http://www.faqs.org/docs/artu/ch01s06.html'>learned from Rob Pike</a> &#8220;n is usually small&#8221;. The benefit of handling large sets performantly is not worth the additional complexity for a low traffic library.</p>
<h2 id='id2'><code>nil</code></h2>
<p>In his other example Mr. Jones implies that the fact that <code>nil</code> is a real object is disadvantageous. On this count he is simply incorrect. Having <code>nil</code> be an object allows significant reductions in the number of special cases that must exist. This reduction in special cases often results in less code, but it always results in less cognitive load.</p>
<p>Consider the <a href='http://api.rubyonrails.org/classes/Object.html#M000027'><code>#try</code></a> method that Rails adds to ruby. While not my favorite implementation of this concept, it is still a powerful idiom for removing clutter from the code.</p>
<p><code>#try</code> executes the specified method on the receiver, unless the receiver is <code>nil</code>. When the receiver is <code>nil</code> it does nothing. This allows code to use a best effort approach to performing non-critical operations. For example<sup id='fnref:2'><a href='#fn:2' rel='footnote'>2</a></sup>,</p>
<pre><code>def remove_email(email)
  emails.find_by_email(email).try(:destroy)
end  </code></pre>
<p>This is implemented as follows:</p>
<pre><code>module Kernel
  def try(method, *args, &amp;block)
    send(method, *args, &amp;block)
  end
end

class NilClass
  def try(*args)
    # do nothing
  end
end</code></pre>
<p>You could implement something like <code>#try</code> in a system that has a non-object &#8220;no value&#8221; mechanism. It would be less elegant and less clear, though. (It would probably be less performant too because method calls tend to be optimized rather aggressively.) Having <code>nil</code> be an object like everything else means one less primitive concept that the code and the programmer must keep in mind.</p>
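<p>Because <code>nil</code> is an ordinary object it answers ordinary messages, which is what lets <code>NilClass</code> carry its own <code>#try</code>. A few core methods show the same effect without any extensions. The <code>tag_list</code> helper is a made-up example.</p>

```ruby
# nil is the sole instance of NilClass and responds to normal messages.
p nil.class # NilClass
p nil.to_a  # []
p nil.to_s  # ""
p nil.nil?  # true

# Hypothetical helper: the caller needs no explicit nil check because
# nil.to_a returns an empty array.
def tag_list(tags)
  tags.to_a.join(', ')
end

p tag_list(%w[ruby rails]) # "ruby, rails"
p tag_list(nil)            # ""
```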
<p>Mr. Jones does bring up the issue of <code>nil.id</code> returning 4 and that value being used as a foreign key in the database. This is not a problem i see very often, but it can happen.</p>
<p>This is definitely not a problem with ruby. Rather, it results from an unfortunate choice of naming convention in rails. Rails uses <code>id</code> as the name of the primary key column for database tables. This results in an <code>#id</code> method being created, which overrides the <code>#id</code> provided by ruby itself for all objects. If rails had chosen to call the primary key column something that did not conflict with an existing ruby core method &#8211; say <code>pk</code> &#8211; we would not be having this discussion.</p>
<h2 id='in_general'>In general</h2>
<p>Mr. Jones asserts that &#8220;ruby is rife with happy path coding&#8221;. I disagree with his characterization. The ruby community has a strong bias towards producing working, if incomplete, code and iterating on that code to improve it. This &#8220;simplest thing that could work&#8221; approach does result in the occasional misstep and suboptimal implementations. In return you get to use a lot of new stuff more quickly and when there are problems they are easier to fix because the code is simpler.</p>
<p>The ruby community has strongly embraced the small pieces, loosely joined approach. This is only accelerating the innovation in ruby. Gems have lowered the friction of distributing and installing components to previously unimaginable levels. This has allowed many libraries that would have been too small to be worth releasing in the past to come into existence.</p>
<p><a href='http://rack.rubyforge.org/'>Rack</a>, with its middleware concept, is an example of the ruby community taking much of the Unix philosophy and turning it up to 11. While rails has much historic baggage, even it is moving to a much more modular architecture with the upcoming 3.0 release.</p>
<p>Following these principles does result in some rough edges occasionally, but the benefits are worth the trade. The 80% solution is how <a href='http://naggum.no/worse-is-better.html'>Unix succeeded</a>. An 80% solution today is better than a 100% solution 3 months from now. (As long as you can improve it when needed.) We always have releases to get to, after all.</p>
<div class='footnotes'>
<hr />
<ol>
<li id='fn:1'>
<p>I, on the other hand, do use set rather more than the average rubyist. <code>Set</code> is a rather performant way of producing collections without duplicate entries.</p>
<p><a href='#fnref:1' rev='footnote'>&#8617;</a></li>
<li id='fn:2'>
<p>Shamelessly copied from <a href='http://ozmm.org/posts/try.html'>Chris Wanstrath</a>.</p>
<p><a href='#fnref:2' rev='footnote'>&#8617;</a></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://barelyenough.org/blog/2010/06/is-ruby-immature/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

