{"id":906,"date":"2012-05-07T01:08:48","date_gmt":"2012-05-07T06:08:48","guid":{"rendered":"http:\/\/yourLinuxGuy.com\/?p=906"},"modified":"2012-05-14T01:27:33","modified_gmt":"2012-05-14T06:27:33","slug":"the-while-read-loop-controversy","status":"publish","type":"post","link":"https:\/\/yourLinuxGuy.com\/?p=906","title":{"rendered":"The while-read loop controversy&#8230;"},"content":{"rendered":"<p>For about as long as I&#8217;ve been able to spell &#8220;bash&#8221;, I&#8217;ve seen the debates on the &#8216;Net about the proper way to use the shell to loop through a text file line-by-line (rather than item-by-item).<\/p>\n<p>Of the small handful of common methods, it often comes down to these two:<\/p>\n<p><strong>Using Cat<\/strong><br \/>\nWith this method, you call <code>cat<\/code> to pipe the contents of the file into the <code>while<\/code> loop.<\/p>\n<pre>cat $inputFile | while read loopLine\r\ndo\r\n  (some stuff)\r\ndone<\/pre>\n<p><strong>Using Redirection<\/strong><br \/>\nWith this method, you are redirecting the file into the loop, as indicated by the redirection arrow after the <code>done<\/code> statement in the last line.<\/p>\n<pre>while read loopLine\r\ndo\r\n  (some stuff)\r\ndone &lt; $inputFile<\/pre>\n<p><strong>So Which Way?<\/strong><br \/>\nOnce you talk the purists down off the ledge about you not using the IFS variable, and they get over the fact that you aren&#8217;t using <code>awk<\/code> in the first place, you can move on to the discussion regarding which <code>while read<\/code> approach you&#8217;ll use, since most people do it that way anyway, and it&#8217;s easier to understand.\u00a0 There.\u00a0 I said it.<\/p>\n<p>Between the two methods I describe above, it&#8217;s often said that the <code>cat<\/code> method is &#8220;wasteful&#8221;, but easier to understand for the person who comes after you. 
This is apparently because you see right away, as you read through the code in order, the thing that is getting passed into the loop, rather than having to wonder or look for it.<\/p>\n<p>Conversely, the redirected method is more efficient (since you aren&#8217;t spawning an extra <code>cat<\/code> process), but someone might not easily understand how it&#8217;s happening since they have to scroll down to see the input file, or may not understand what is being looped.<\/p>\n<p>Both points are kinda&#8217; true.\u00a0 But when it comes down to it, the redirected method is just not that hard to understand, and I almost always use it&#8230;<\/p>\n<p><strong>Sometimes, Joel&#8230;<\/strong><br \/>\n&#8230;except in one situation, which is why I write this post; and this is almost never mentioned in the argue-posts I read on this:\u00a0 What if you need to manipulate the content <em>before<\/em> it gets parsed by the <code>while read<\/code> loop?\u00a0 For instance, backslashes in the line, newlines in the wrong place, etc., read in from a group of files.<\/p>\n<p>Take this example: I have a few files with a path in one of the fields that is to be parsed, like this:<\/p>\n<pre>servername volumename folder1\\folder2<\/pre>\n<p>&#8230;and I want to read the contents of all of those files into one loop with an <code>ls<\/code> and a wildcard.<\/p>\n<p>Using the standard redirection method in this case, <code>read<\/code> (without its <code>-r<\/code> option) interprets the backslash as an &#8220;escape&#8221; and drops it.\u00a0 Since it&#8217;s a filesystem path, I obviously need that backslash; so I solved this by using <code>cat<\/code> and piping its output through <code>sed<\/code>, which adds a second backslash as an escape wherever a backslash occurs, before the line gets parsed by the shell <code>while read<\/code> loop.<\/p>\n<p>Here&#8217;s how I did that:<\/p>\n<pre>for inputfile in `ls $fewFiles`\r\ndo\r\n  cat $inputfile | sed -e s\/'\\\\'\/'\\\\\\\\'\/g | while read loopLine\r\n  do\r\n    # 
Then I grab the foldername from the short parent path\r\n    item=`echo $loopLine |awk -F '\\\\' '{ print $2 }'`\r\n    (some stuff)\r\n  done\r\ndone<\/pre>\n<p>Of course, you notice that the backslash even has to be escaped in the <code>sed<\/code> command, as does the double backslash that I use to replace it&#8230;<\/p>\n<p>I know, I know what you&#8217;re thinking&#8230;\u00a0 Just use Perl&#8230;<br \/>\n\ud83d\ude09<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For about as long as I&#8217;ve been able to spell &#8220;bash&#8221;, I&#8217;ve seen the debates on the &#8216;Net about the proper way to use the shell to loop through a text file line-by-line (rather than item-by-item). Of the small handful&#8230;<br \/><a class=\"read-more-button\" href=\"https:\/\/yourLinuxGuy.com\/?p=906\">Read more<\/a><\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[67,11,47],"tags":[],"class_list":["post-906","post","type-post","status-publish","format-standard","hentry","category-bash","category-intermediate","category-linuxgeneral"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pnjn1-eC","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/posts\/906","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"
href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=906"}],"version-history":[{"count":12,"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/posts\/906\/revisions"}],"predecessor-version":[{"id":911,"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=\/wp\/v2\/posts\/906\/revisions\/911"}],"wp:attachment":[{"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=906"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=906"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/yourLinuxGuy.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=906"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}