Short for concatenate, cat takes multiple files and prints them together
into one continuous stream of text. However, the first introduction to the
CLI is sure to plant in the wasteland of the hurried neophyte's mind a
simplistic mental model boiling down to: "cat is what you use when you have a
file and want its content".
The consequence? Processes multiplying like cancerous cells:
cat file | grep keyword
Is there a cure? Yes, the one that shall be obvious in retrospect: tell your
tool itself what data to use.
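As a minimal sketch of that cure, assuming a throwaway file (the name notes.txt and its contents are made up for illustration), the two invocations below should yield the same lines; only the second one spares a process and a pipe:

```shell
# Hypothetical scratch file, created only for this illustration.
dir=$(mktemp -d)
printf 'apple\nbanana\napricot\n' > "$dir/notes.txt"

# Two processes and a pipe: cat reads the file, grep reads the pipe.
with_cat=$(cat "$dir/notes.txt" | grep ap)

# One process: grep opens the file itself.
without_cat=$(grep ap "$dir/notes.txt")

printf '%s\n' "$with_cat" "$without_cat"
rm -r "$dir"
```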
Files all the way down
Files are fairly common in the world of computing, and even more so across Unix-like operating systems, where (famously) everything is a file (descriptor).
When you do cat asdf, cat asks the kernel to open asdf; the kernel looks it up
in the file system and returns a file descriptor (abbreviated FD) to the
process running cat, which reads from it and writes to another, specific FD
(numbered 1, the standard output), which by default is connected to your terminal.
That is it, no mystery, nothing special. We can now appreciate how silly cat file | grep is: reading from a file isn't any different from reading from the
standard input—the standard input is a file (in this context).
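To make the file-descriptor story a little more concrete, here is a small sketch. The scratch directory and the choice of FD 3 are assumptions for illustration; asdf is the article's own file name. The shell opens the file itself, then hands the already-open descriptor to cat as its standard input:

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/asdf"

# Ask the kernel to open the file; the shell now holds it as FD 3.
exec 3< "$dir/asdf"

# cat gets no file name: FD 3 is duplicated onto its FD 0 (stdin),
# and whatever it reads is copied to FD 1 (stdout), captured here.
from_fd=$(cat <&3)

# Close FD 3 now that it has been consumed.
exec 3<&-

# Same data, this time letting cat open the file by name.
from_name=$(cat "$dir/asdf")

printf '%s\n' "$from_fd" "$from_name"
rm -r "$dir"
```

Either way cat ends up reading from a descriptor; the only difference is who opened the file.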
Your tool only has to support being told which file to read from, and you
only need to RTFM:
GREP(1)                        User Commands                        GREP(1)

NAME
       grep - print lines that match patterns

SYNOPSIS
       grep [OPTION]... PATTERNS [FILE]...
       grep [OPTION]... -e PATTERNS ... [FILE]...
       grep [OPTION]... -f PATTERN_FILE ... [FILE]...
There you have it: grep and friends not only obviously accept file names as
arguments, they evidently do.
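One practical upshot of the [FILE]... in that synopsis, sketched below with two made-up files (one and two are hypothetical names): when grep is given several file names it can tell you where each match came from, which is information a cat pipeline has already thrown away.

```shell
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/one"
printf 'gamma\nalphabet\n' > "$dir/two"

# grep opens both files and prefixes each match with its file name.
named=$(cd "$dir" && grep alpha one two)

# Through cat, grep only sees one anonymous stream on its stdin.
anonymous=$(cd "$dir" && cat one two | grep alpha)

printf '%s\n' "$named" "$anonymous"
rm -r "$dir"
```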
When all else fails
But what if one didn't? After rounding up the usual suspects, I can
attest to the stunning rarity of common utilities not playing by this tacit
rule. Let's conjure up gerp (grep + derp!), an abominable degradation of grep that only knows of
stdin.
You can then use input redirection (<) to have the shell open your file and
attach the resulting file descriptor to FD 0 (the standard input):
gerp keyword < file
That's it! A patched gerp, no cat, no |, just a smidgen of understanding; all
POSIX-compliant, too. In fact, the pipe (|) operator is essentially a way to
connect the FD 1 ("stdout") of one process to the FD 0 ("stdin") of another.
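Since gerp is imaginary, the sketch below fakes it with a shell function that never takes a file operand; that stand-in, the scratch file and its contents are all assumptions for illustration. It compares the redirection with the cat pipeline it replaces:

```shell
# Stand-in for the article's imaginary gerp: it accepts a pattern only,
# so the data has to arrive on stdin.
gerp() {
    grep -e "$1"
}

dir=$(mktemp -d)
printf 'foo\nkeyword here\nbar\n' > "$dir/file"

# The shell opens the file and attaches it to gerp's FD 0: no cat, no pipe.
redirected=$(gerp keyword < "$dir/file")

# The pipe achieves the same wiring, at the cost of an extra process
# whose FD 1 feeds gerp's FD 0.
piped=$(cat "$dir/file" | gerp keyword)

printf '%s\n' "$redirected" "$piped"
rm -r "$dir"
```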
Yet how seldom have you come across the redirection above? It should
unquestionably be more common than cat file | grep keyword, yet I'm fairly
confident it isn't, despite its twin counterpart (the output redirection)
being virtually omnipresent: I explain this phenomenon with the concept of
rampant shell illiteracy and attempt to address it through what I refer to
as my CLI flight manual.
The legitimate use of cat
Despite my furry companion being of the canine variety, I have nothing against
cat in principle. Outside of scripts, I'd still argue that less is surely
often preferable, possibly with --quit-if-one-screen/-F to have it behave
like cat when you require no pagination; but here are a few instances where
cat is sensible nonetheless:
# Create file in-line
# Concatenate several files
# Group commands
# Concatenate standard input and file
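The four cases named above could look something like the following. This is a sketch under assumptions, not the article's own listing: the file names, contents, and the sed filter at the end of the grouped pipeline are all made up for illustration.

```shell
dir=$(mktemp -d)

# Create a file in-line: cat copies the here-document to the file.
cat > "$dir/a" <<'EOF'
first
EOF
printf 'second\n' > "$dir/b"

# Concatenate several files, which is what cat is named after.
joined=$(cat "$dir/a" "$dir/b")

# Group commands: frame a file's content between two generated lines
# and send the whole lot down a single pipe.
grouped=$({ echo BEGIN; cat "$dir/a"; echo END; } | sed 's/^/> /')

# Concatenate standard input and a file: "-" stands for stdin.
prefixed=$(echo header | cat - "$dir/a")

printf '%s\n' "$joined" "$grouped" "$prefixed"
rm -r "$dir"
```

In each case cat either joins more than one source or is the only thing producing the data, which is what separates these uses from the ceremonial cat file | tool.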
The definitive escalation mechanism
Somewhere along the way, we started using cat as a ceremonial prelude to
everything, as if every command needed a feline blessing. With everything being
a file (descriptor), it turns out that quite a few commands will happily read
from a file themselves!
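As such, here is the definitive escalation mechanism for commands that are to consume the contents of a file: pass the file as an argument; if the command only reads the standard input, redirect the file into it with <; and keep cat with | as the last resort.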
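Spelled out as a sketch, with grep standing in for "your command" and a made-up scratch file (both are assumptions, as is the exact ordering, which is inferred from the sections above rather than copied from an original listing), the escalation could read:

```shell
dir=$(mktemp -d)
printf 'one\nkeyword\n' > "$dir/file"

# 1. Preferred: hand the file name to the command.
step1=$(grep keyword "$dir/file")

# 2. Otherwise: let the shell redirect the file into the command's stdin.
step2=$(grep keyword < "$dir/file")

# 3. Last resort: an extra cat process and a pipe.
step3=$(cat "$dir/file" | grep keyword)

printf '%s\n' "$step1" "$step2" "$step3"
rm -r "$dir"
```

All three produce the same output here; what changes is how many processes and pipes it took to get it.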
Referenced tools
| tool | from | links |
|---|---|---|
| cat | core/coreutils | manual, repository |
| grep | core/grep | manual, repository |
| less | core/less | manual, repository |