Puzzles with coreutils (Part 2)

On another free weekend afternoon, I decided to finish off the rest of the coreutils brainteasers I started last month. This time I learned more about Linux audio, bash substring replacement, and the assortment of flags that ls supports.

Directly route your microphone input over the network to another computer’s speaker

While I’m not sure if this is the right solution, I ended up doing this using ALSA. This, of course, required setting up two Linux VMs, but once I had, it proved to be fairly simple:

machine2 $ ifconfig; nc -l 4321 | aplay
machine1 $ arecord -f cd | nc $OTHER_IP 4321

In brief, arecord writes the mic input to stdout, and nc sends that stream over the network to the second machine. There, the listening nc (which has to be started first) pipes what it receives into aplay, which plays it.

Messing around with this was pretty fun. The -f argument specifies the sample format of the recording, and leaving it off made the output poor quality, almost like a radio. More experimentation showed why: -f cd makes arecord record CD-quality audio (16 bit, stereo, 44.1 kHz), whereas with no format argument it defaults to much lower quality audio (8 bit, mono, 8 kHz).
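To put numbers on that difference, the raw data rate of uncompressed PCM audio is (bits per sample / 8) x channels x sample rate. A quick sketch of the arithmetic; the helper name pcm_bytes_per_sec is made up for illustration and is not part of ALSA:

```shell
# Hypothetical helper: raw PCM data rate in bytes per second.
# Arguments: bits per sample, channel count, sample rate in Hz.
pcm_bytes_per_sec() {
  echo $(( $1 / 8 * $2 * $3 ))
}

pcm_bytes_per_sec 16 2 44100   # the -f cd settings
pcm_bytes_per_sec 8 1 8000     # arecord's defaults
```

This prints 176400 and 8000, so the CD-quality stream pushes roughly 22 times as much data through nc as the default one.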

Replace all spaces in a filename with underscore for a given directory

This challenge turned out to be trivial on Linux, and a bit more frustrating on OS X.

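Linux has the handy rename command, which does exactly what we want here - it takes a Perl expression and applies it to each filename to work out where to move the file. (This is the Perl-based rename that Debian and Ubuntu ship; the util-linux rename found on some other distributions takes different arguments.)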

rename 's/ /_/g' *

That was too easy though, so I decided to figure out how to do the same thing on a BSD-based system:

ls | grep ' ' | while IFS= read -r file; do mv "$file" "${file// /_}"; done

Here, we filter to find files that actually contain a space (so that we don’t unnecessarily move files to the same location), then we pipe those results into a while loop that does what we want.
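Parsing the output of ls gets fragile with unusual filenames, so the same idea can also be sketched with a shell glob doing the filtering. This is an alternative take rather than the one-liner above, it needs bash for the ${name// /_} expansion, and the function name underscore_spaces is hypothetical:

```shell
# Hypothetical helper: rename every entry in directory $1 whose name
# contains a space, replacing each space with an underscore.
# Requires bash for the ${name// /_} expansion.
underscore_spaces() {
  dir=$1
  for path in "$dir"/*' '*; do
    [ -e "$path" ] || continue      # the glob matched nothing
    name=${path##*/}                # strip the directory part
    mv -- "$path" "$dir/${name// /_}"
  done
}
```

Calling underscore_spaces . would then handle the current directory. The glob only matches names that contain a space, so it plays the same role as the grep ' ' filter.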

Cool things I learned here:

  • It’s possible to pipe things into a while loop! I finally understand how all those scripts that use read work
  • Bash has substring replacement, and it’s awful. "${file/ /_}" replaces the first space in $file with an underscore. How do you replace globally? Add an extra slash to the first bit (of course!). This is why I usually use Python for more complex tasks
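A small side-by-side of the two substitution forms, using a made-up filename (bash only):

```shell
# Bash parameter expansion: one slash replaces the first match,
# two slashes replace every match.
file='my summer holiday.jpg'   # invented example filename
first_only=${file/ /_}
all=${file// /_}
echo "$first_only"   # my_summer holiday.jpg
echo "$all"          # my_summer_holiday.jpg
```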

Report the last ten errant accesses to the web server coming from a specific IP address

It was unclear what errant and accesses meant here, so I assumed that we were just trying to find all requests made to a web server. I also assumed that we were using nginx.

cd /var/log/nginx && zgrep "$IP" $(ls -vr access.*) | tail -n 10

The hardest part of this challenge wasn’t figuring out the IP accesses (which were trivially grep-able), but figuring out how to handle the rotated log files, the older of which are gzipped. We needed to replace plain grep with zgrep, which can read gzipped files, and then do some ls magic to fix the ordering - log lines are stored from least to most recent inside each file, but access.log is more recent than access.log.1.gz, and both are more recent than access.log.2.gz.

Passing -r to ls reversed the order of its output, which meant that the log lines would now be searched from the least to most recent, across the files. There was one extra challenge though - ls sorts lexicographically, not numerically. This means that access.log.10.gz sorts after access.log.1.gz but before access.log.2.gz. That’s not what we want! Passing -v to ls tells it to sort naturally by version numbers within the text - i.e. it will sort numerically when it finds a number.
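The ordering difference is easy to see with a few empty stand-in files. A sketch, assuming GNU ls (the -v flag is not available everywhere) and using a scratch directory so nothing real is touched:

```shell
# Create empty stand-ins for rotated logs in a scratch directory.
demo_dir=$(mktemp -d)
( cd "$demo_dir" && touch access.log access.log.1.gz access.log.2.gz access.log.10.gz )

# Plain -r: reverse lexicographic order, so 10 ends up between 2 and 1.
lexical=$(cd "$demo_dir" && LC_ALL=C ls -r | paste -s -d ' ' -)
# -v (GNU ls): numbers compare numerically; with -r the oldest file comes first.
natural=$(cd "$demo_dir" && LC_ALL=C ls -vr | paste -s -d ' ' -)

echo "$lexical"
echo "$natural"
```

The first line lists access.log.2.gz ahead of access.log.10.gz, while the second runs 10, 2, 1 and ends with access.log, which is the oldest-to-newest order the pipeline needs.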

We put these together to get all accesses from a given IP, then cut down to the most recent ten.
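Here is the whole thing end to end on a tiny fake log set. Everything in it is invented for the demo: the scratch directory, the log lines, and the address (taken from a documentation-only range). Two tweaks that are not in the command above: -F treats the IP as a fixed string so the dots are not regex wildcards, and -h drops the filename prefix that zgrep prints when it searches several files.

```shell
# Build a tiny fake log set in a scratch directory (all contents invented).
log_dir=$(mktemp -d)
ip=203.0.113.7    # documentation-range address, used as a stand-in
printf '%s - GET /old\n' "$ip" | gzip > "$log_dir/access.log.2.gz"
printf '%s - GET /mid\n198.51.100.9 - GET /other\n' "$ip" | gzip > "$log_dir/access.log.1.gz"
printf '%s - GET /new\n' "$ip" > "$log_dir/access.log"

# Oldest file first, fixed-string match (-F), filenames suppressed (-h),
# then keep only the most recent hits (2 here instead of 10).
recent=$(cd "$log_dir" && zgrep -h -F "$ip" $(ls -vr access.*) | tail -n 2)
echo "$recent"
```

This should print the /mid line followed by the /new line, skipping both the older /old hit and the request from the other address.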

Wrap up

Overall, these tasks were pretty interesting. Some of the solutions were immediately obvious to me, while others I would have had no idea how to solve, even if I were allowed to use Python.

While I’m not sure I’ll ever actually use these pipelines, solving the problems definitely gave me a better appreciation for Unix’s “everything is really a file” philosophy. Unfortunately, I still retain my negative feelings towards bash for anything that can’t be expressed in a single loop-free, non-subshell-containing command.