In defense of Unix
March 4, 2016 at 9:35 PM by Dr. Drang
This morning I came across this post by Nuno Brito on command line switches (or options, as they’re often called, even when they’re not optional). He doesn’t like the multitude and complexity of switches for common Unix utilities. Here’s his primary example:
How do you find files with a given extension inside a folder and sub-folders?
You see, in Unix (Linux, Mac, etc) this is NOT easy. Just like so many other commands, a very common task was not designed with an intuitive usage in mind. They work, but in so much as you learn an encyclopedia of switches. Sure, there exist manual pages and google/stackoverflow to help but what happened to simplicity in design?
In Windows/ReactOS one would type:
dir *.txt /s
In Unix/Linux this is a top answer:
find ./ -type f -name "*.txt"
I’ll first point out that although the -type f part of the Unix command is, in general, a good way to restrict the list to regular files (as opposed to, say, directories or named pipes), it’s probably unnecessary in this case and means the Unix command is actually doing more than the DOS command.[1] Even so,
find . -name "*.txt"
is still more complicated than the DOS equivalent. Why?
Mainly, I would argue, because Unix shells—bash, csh, zsh, whatever—are better at wildcard expansion than DOS is. In the great majority of cases, this works to the user’s advantage. Brito has found a particular instance where it doesn’t.
You might think the Unix ls command should be able to do a recursive search, that we ought to be able to add a switch to the simple command
ls *.txt
to make it list files with a .txt extension not just in the current directory (which is what the above command does), but also in all of its subdirectories. That, however, would be a misunderstanding of how Unix shells work.
In ls *.txt, the ls is doing almost nothing. Apart from how the list of files is formatted, you’d get the same result with echo *.txt. That’s because the shell is expanding the *.txt into a list of files before ls or echo themselves are invoked. The ls command is being fed that list, so it’s just like calling
ls a.txt b.txt c.txt d.txt ...
Adding a recursive switch (and ls does have a recursive switch) to this does no good because the arguments aren’t directories and can’t be delved into.
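Here is a small sketch of that behavior, run in a throwaway directory. The directory layout and file names are made up for the demonstration; only echo, ls, and ls -R come from the discussion above.

```shell
# Build a scratch directory tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.txt" "$demo/b.txt" "$demo/sub/c.txt"
cd "$demo" || exit 1

# The shell expands *.txt before either command starts,
# so echo and ls both receive the arguments a.txt and b.txt.
echo_out=$(echo *.txt)
ls_out=$(ls *.txt | tr '\n' ' ')

# -R has nothing to descend into: its arguments are plain files,
# so sub/c.txt never appears in the listing.
ls_r_out=$(ls -R *.txt | tr '\n' ' ')

echo "echo:  $echo_out"
echo "ls:    $ls_out"
echo "ls -R: $ls_r_out"
```

All three lines list only a.txt and b.txt. The file in sub is invisible to every one of these commands because the shell never put it in the argument list.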
Although the shell’s automatic expansion makes our find command longer than we’d like it to be for this particular case, it’s a great thing overall because we can take advantage of it when writing shell (or Perl or Python or Ruby) scripts.
Suppose you’ve written a script that does some manipulation to the files it’s given on the command line. It’s easy to write code that handles a command invocation like
manipulate a.txt b.txt c.txt
because you can just loop through the arguments. And because of the shell’s automatic expansion, that same code will work if it’s invoked as
manipulate *.txt
If the shell didn’t expand wildcards, you’d have to write code for the expansion yourself. Without bugs and consistently in every script you write. And so would everyone else. And this is not just for scripters. People who write command line utilities in C also get to take advantage of the shell’s expansion, so their work is easier, less buggy, and more consistent, too.
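To make that concrete, here is a minimal sketch of such a script, written as a shell function so it can be pasted into a terminal. The body of manipulate is a hypothetical stand-in (it just reports each file name), and the scratch files are invented for the demo.

```shell
# Hypothetical stand-in for the script: report each file it is handed.
manipulate() {
    for f in "$@"; do
        printf 'processing %s\n' "$f"
    done
}

# Scratch directory with three made-up files.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"
cd "$demo" || exit 1

# Explicit arguments and a wildcard reach the function identically,
# because the shell has already turned *.txt into the three names.
explicit=$(manipulate a.txt b.txt c.txt)
globbed=$(manipulate *.txt)

echo "$globbed"
```

The function contains no wildcard handling at all. The loop over "$@" is the only code needed for both invocations.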
DOS doesn’t expand wildcards,[2] which can make writing and using batch files a pain in the ass. Take a look at the workarounds people are suggesting on the linked page. If avoiding that nonsense means I have to use find and quote my filename pattern, I’ll accept the tradeoff.
And if I really found myself doing recursive searches like this, I’d probably define an alias in my .bashrc file:
alias lsr='find . -name'
I’d still need to quote (or escape) the wildcard pattern, but
lsr "*.md"
isn’t too bad.
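The quoting matters because of the same automatic expansion. The sketch below calls find directly rather than through the alias (bash doesn’t expand aliases in scripts by default), and the directory and file names are invented for the demo.

```shell
# Scratch tree: one Markdown file at the top, one in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/top.md" "$demo/sub/deep.md"
cd "$demo" || exit 1

# Quoted: find receives the literal pattern *.md and applies it
# at every level of the tree.
quoted=$(find . -name "*.md" | sort)

# Unquoted: the shell rewrites *.md to top.md before find runs,
# so find only looks for files named exactly top.md.
unquoted=$(find . -name *.md | sort)

echo "quoted:"
echo "$quoted"
echo "unquoted:"
echo "$unquoted"
```

With the quotes, both files turn up. Without them, only ./top.md does, and if the current directory held two or more .md files, find would be handed extra arguments it can’t parse and would stop with an error.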
Update 3/5/16 6:47 AM
Several people have tweeted me this morning that
ls **/*.txt
will also do the trick. The ** construct is a recursive globbing pattern that runs down through all the subdirectories. This is not available in the version of bash that comes with OS X (3.2) but is part of bash 4.0 and later. It is off by default in bash, so unless your system’s startup files already enable it, you’ll need to set the globstar option (shopt -s globstar) to get this working.
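A quick way to check this on your own system is the sketch below. It assumes bash 4.0 or later (shopt -s globstar fails on older versions and in other shells), and the directory tree is made up for the demo.

```shell
# Requires bash 4.0 or later; the option is off unless something enables it.
shopt -s globstar

# Scratch tree three levels deep (names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
touch "$demo/a.txt" "$demo/sub/b.txt" "$demo/sub/deeper/c.txt"
cd "$demo" || exit 1

# With globstar set, ** matches zero or more directory levels,
# so this single pattern reaches all three files.
matches=$(echo **/*.txt)

echo "$matches"
```

Because the shell is doing the recursion here, the same pattern works with any command that takes a list of files, not just ls.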
You will not be surprised to hear that this tip came from zsh users. The ** globbing pattern was, I believe, introduced in zsh, and it would be unheard of for zsh users to see an article about Unix shells without commenting on the superiority of their choice. They’re like those recent converts to vegetarianism who can’t stop themselves from pointing out how disgusting your sausage pizza is. And like vegetarians, they’re most annoying because deep down you know they’re probably right.
[1] OK, I guess it’s technically called the Windows Command Prompt nowadays, and it’s been improved a bit since the ’80s. But it still looks like DOS to me.
[2] Although apparently PowerShell does. I don’t know how it handles directory searches.