The unique advantage of dd over cat, cp and friends: you can specify the block size.
Just try (on OS X) dd if=disk.img of=/dev/disk1... the first speedup comes from using rdisk1, but the real improvement comes with bs=1m. 2 vs 16 vs 300 MB/s on my machine when cloning via a USB-SATA adapter.
In theory the native block size (512 bytes for most drives these days) should be the fastest, but the problem is that with such small I/O you introduce a shitload of overhead from all the individual read/write calls - I'd guess that a huge block size also benefits from DMA and read-ahead.
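Roughly what that comparison looks like if you want to try it yourself (disk1 is just an example identifier; check diskutil list before writing anything):
sudo dd if=disk.img of=/dev/disk1            # buffered device node, slowest
sudo dd if=disk.img of=/dev/rdisk1           # raw device node, faster
sudo dd if=disk.img of=/dev/rdisk1 bs=1m     # raw device + 1 MB blocks, fastest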
> native block size (512 bytes for most drives these days)
Not anymore. Advanced Format (4096-byte sector) hard drives have taken the market by storm, and SSDs benefit even more from using larger I/O sizes (because their erase blocks are way larger).
I've always used bs=4M for writing .iso or .img files to USB flash drives, as it gives me the best times. This is on Linux, OpenBSD, and macOS (using /dev/rdiskn for the latter two).
Sync writes have a place, if I'm copying a disk image to a flash drive I typically want to know it's done when dd finishes so I can yank the drive without worrying if the write cache has been flushed.
At least a couple years ago, it was said one should not rip out USB sticks without ejecting them before, because their controller might do invisible maintenance work and that might lead to data corruption...
The optimal block size is probably the amount of data which can be transferred with one DMA operation.
For NVMe disks on Linux, you can find out this size with the nvme-cli [0] tool. Use "nvme id-ctrl" to find the Maximum Data Transfer Size (MDTS) in disk (LBA) blocks and "nvme id-ns" to find the LBA Data Size (LBADS). The value is then 2^MDTS * 2^LBADS bytes.
For example, the Intel SSD 450 can transfer 32 blocks of 4096 bytes per NVMe command, so you'd want a block size of 128 KiB.
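A rough sketch of the query (the device names are placeholders; MDTS shows up in the id-ctrl output and LBADS in the "lbaf" lines of id-ns):
sudo nvme id-ctrl /dev/nvme0 | grep -i mdts      # e.g. "mdts : 5"
sudo nvme id-ns /dev/nvme0n1 | grep -i lbads     # e.g. "lbaf 0 : ms:0 lbads:12 ... (in use)"
# with mdts=5 and lbads=12: 2^5 * 2^12 = 131072 bytes = 128 KiB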
Funny you should mention OS X. Something I learned only yesterday is that if you're using dd to write an .img file to an SD card (may apply to other disk types as well I imagine), using /dev/rdisk devices instead of /dev/disk devices can be much faster; in my case using the /dev/rdisk device wrote to the SD card at nearly 20 MB/sec, vs the 2 MB/sec speed I got when using /dev/disk.
It comes from the DD (data definition) statement of OS/360 JCL, which is why dd has the unusual option syntax compared to other Unix utilities.
BTW if you are using dd to write USB drives etc., it's useful to bypass the Linux VM as much as possible to avoid system stalls, especially with slow devices.
You can do that with O_DIRECT. Also dd recently got a progress option, so...
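Something along these lines (assuming GNU dd, which has both flags; /dev/sdX is a placeholder for your target device):
dd if=image.img of=/dev/sdX bs=4M oflag=direct status=progress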
Usually just specifying a reasonable blocksize works for me. bs=1m or so.
Without that it does literally take hours.
I suspect the default block size is really small (512 bytes), and combined with uncached/unbuffered writes to slower devices it just kills all performance outright.
Per the sibling comments, you just need to specify a sane block size. dd's default is really low and if you experiment a bit with 2M or around that you'll get near-theoretical throughput.
NB: Remember the units! Without a unit suffix you're specifying the block size in bytes, or something insanely small like that. I've made that mistake more than once!
Though, a read cache can be enabled manually by creating a separate device via gcache(8). This is usually not required, because caching is done at the filesystem layer.
It's important to specify a block size for uncached devices, of course. dd(1) with the bs= option will surely work, and with cp(1) your mileage may vary, depending on whether the underlying disk driver supports I/O with partial sector sizes or not.
Sure, it's been around for a while, in the GNU version on Linux at least. Personally I've found pipe viewer (pv) quite handy too: https://www.ivarch.com/programs/pv.shtml - available in most distros.
You don't have to wait for updates to newer versions to get dd to report its progress. The status line that dd prints when it finishes can also be forced at any point during dd's operation by sending the USR1 or INFO signals to the process. E.g.:
ps a | grep "\<dd"
# [...]
kill -USR1 $YOUR_DD_PID
or
pkill -USR1 ^dd
It also doesn't require you to get everything nailed down at the beginning. You've just spent the last 20 seconds waiting and realize you want a status update, but you didn't think to specify the option ahead of time? No problem.
I've thought that dd's behavior could serve as a model for a new standard of interaction. Persistent progress indicators are known to cause performance degradation unless implemented carefully. And reality is, you generally don't need something to constantly report its progress even while you're not looking, anyway.
To figure out the ideal interaction, try modeling it after the conversation you'd have if you were talking to a person instead of your shell:
"Hey, how much longer is it going to take to flash that image?"
>I've thought that dd's behavior could serve as a model for a new standard of interaction. Persistent progress indicators are known to cause performance degradation unless implemented carefully. And reality is, you generally don't need something to constantly report its progress even while you're not looking, anyway.
Progress bars by default are also garbage if you are scripting and want to just log results. ffmpeg is terrible for this.
> Persistent progress indicators are known to cause performance degradation unless implemented carefully.
Are you referring to that npm progress bar thing a few months back? I'm pretty sure the reason for that can be summed up as "javascript, and web developers".
Anyway, he's not proposing progress bars by default, he's proposing a method by which you can query a process to see how far it's come. I think there's even a key combination to do this on FreeBSD.
Or, for example, you could write a small program that sends a USR1 signal every 5 seconds, splitting out the responsibility of managing a progress bar:
% progress cp bigfile /tmp/
And then the 'progress' program would draw you a text progress bar, or even pop up an X window with a progress bar.
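A minimal sketch of the signal-based version (hypothetical glue, not an existing tool; it only helps with programs that print something on SIGUSR1, like GNU dd - anything that doesn't handle the signal will simply be killed):
#!/bin/sh
# run the given command and poke it with SIGUSR1 every 5 seconds so it
# prints its own status line
"$@" &
pid=$!
while kill -0 "$pid" 2>/dev/null; do
    sleep 5
    kill -USR1 "$pid" 2>/dev/null
done
wait "$pid"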
That's great! I think due to the way it's implemented it wouldn't be able to do progress reporting for e.g. "dd if=/dev/zero of=bigfile bs=1M count=2048", but that's a less common case than just cp'ing a big "regular" file.
Yes, this is true. Note that BSD supports this better with Ctrl-T, which generates SIGINFO; it can be sent to any command, and commands that don't handle it simply ignore it. Using kill on Linux, where the signal's default action is to kill the process, is decidedly more awkward.
It's also worth noting the separate "progress" project, which can be used to report the progress of running file-based utilities.
We generally have pushed back on adding progress to each of the coreutils for these reasons, but the low overhead of implementation and high overlap with existing options was deemed enough to warrant adding this to dd.
Yes! I'm thinking of building something like this for my neural net training (1-2 days on AWS, 16 GPUs/processes on the job). In this case the "state" that I'd like to access is all the parameters of the model and training history, so I'm thinking I'll probably store an mmapped file so I can use other processes to poke at it while it's running. That way I can decouple the write-test-debug loops for the training code and the viz code.
I generally use a semaphore when I'm reading and writing from my shm'd things. The data structure will also likely be append-only for the training process, as I want to see how things are changing over time.
I am new to the shared memory concept. I am familiar with named pipes. Could you please elaborate a bit? I'm curious.
Are you passing a reference to an mmap address, or using the shm system calls? What language are you programming in? Do race conditions endanger the shared memory? If so, how does using semaphores help?
Sorry if I asked a lot of questions, feel free to answer any/none of them :)
Sure! SHM is really cool, I just found out about it. It's old POSIX functionality, so people should use it more!
I'm using shm system calls in Python. Basically I get a buffer of raw bytes of a fixed size that is referred to by a key. When I have multiple processes running I just have to pass that key between them and they get access to that buffer of bytes to read and write.
On each iteration I first wait until the semaphore is free and then lock it (P). That prevents anyone else from accessing the shared memory. The process then reads a set of variables from the shared memory - I have little helper functions that serialize and deserialize numpy arrays into raw bytes using fixed shapes and dtypes. Those arrays are updated using some function combining the output of the process and the current value of the array. Then those arrays are reserialized and written back to the shm buffer as raw bytes. Finally, the process releases the semaphore (V) so other processes can access it. The purpose of the semaphore is to prevent reading the arrays while another process is writing them - otherwise you might get interleaved old and new data from a given update. In a process-wise sense there is a race condition, as each process can update at different times or in a different order, but for my purposes this is acceptable since neural net training is a stochastic sort of thing and it shouldn't care too much.
If you can find a way to use 'dd' for disk/drive/device you can use it in interesting new manners (pipelines, etc.) and have very good confidence that it won't break in weird ways. It will do the small, simple thing it is supposed to do even if you are abusing it horribly.
Like this, for instance:
pg_dump -U postgres db | ssh user@rsync.net "dd of=db_dump"
You could use it to rate limit, or to arbitrarily set block sizes per use case. I've used it for the former when doing 'over the wire' backups through ssh.
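For the over-the-wire case, roughly like this (host, user, device and file names are placeholders; the point is just that the block size is pinned at both ends of the pipe):
dd if=/dev/sdX bs=64K | ssh user@backuphost "dd of=sdX.img bs=64K"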
The setting controls the block size. When writing to block devices, you can maximize throughput by tuning the block size for the filesystem, architecture, and specific disk drive in use. You can tune it by benchmarking, searching over various multiples of 512K block sizes.
For most modern systems, 1MB is a reasonable place to start. Even as high as 4MB can work well.
The block size can make a major difference in terms of sustained write speed due to reduced overhead in system calls and saturation of the disk interface.
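A crude way to run that search (a sketch assuming GNU dd; image.img and /dev/sdX are placeholders, and this rewrites the device on every pass, so only point it at a disk you intend to overwrite anyway):
for bs in 512K 1M 2M 4M 8M; do
    echo "bs=$bs"
    # conv=fdatasync makes dd flush before reporting, so the MB/s figure is honest
    dd if=image.img of=/dev/sdX bs=$bs conv=fdatasync 2>&1 | tail -n 1
done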
A similar thing happens when writing to sockets where lots of small messages kill throughput, but they can decrease latency for a system that passes a high volume of small control messages.
I came from Solaris land originally and it always surprises me how seldom people use filesystem-level utilities to copy such stuff, e.g. ufsdump | ufsrestore or dump | restore.
Works a treat, and using the fs-level tool you know everything will be properly copied - much safer.
> It’s unique in its ability to issue seeks and reads of specific lengths, which enables a whole world of shell scripts that have no business being shell scripts. Want to simulate a lseek+execve? Use dd!
How would one simulate a call to execve with dd? Seems like a totally different problem domain.
If not for that explicit `lseek`, `head -1` would have skipped the entire 8k buffer.
As far as I know, this is exclusive to GNU cat. Neither Busybox nor OS X cat will do this, and they will therefore throw away an entire buffer instead of just the first line. You can try it out yourself.
Tail employs a large read buffer as well, but it does not matter because you wouldn't use it in the same manner.
Tail is the right tool for the job here. But if you wish to stick with your idiom, read will reliably consume a single line of input, regardless of how it is implemented:
+ "On UNIX, the adage goes, everything is a file."
- Not all things on Unix are abstracted as files (or 'byte streams', to be more accurate). However, I/O resources and some IPC facilities are defined that way. An operating system provides many other abstractions in addition to these, such as processes, threads, non-stream devices, and concurrency and synchronization primitives; thus it's absolutely wrong to say that everything is a file on Unix.
Most notably: network sockets do not have file-like semantics, mainly because they were introduced as a concept and implemented long after the system was designed. Plan 9 is an effort to revise all system objects to be accessible through the same file-like open/read/write/close interface.
Only one IPC primitive (the fifo) can be used with read(2) or write(2). SysV IPC cannot, and sockets (any domain) cannot (unless there is a kernel interface via /proc or another pseudo fs). What is meant by this saying (and the way it has been used in my experience) is that everything in Unix _looks_ like a file. That is usually meant in reference to the C API, which has some r[ec]v|read|write|snd type namespace.
You can read(2) and write(2) to a socket, although it is more normal to recv and send on it. On Linux, you can also (and pretty much MUST) read(2) and write(2) to an eventfd; you can read(2) from a timerfd. That said, the semantics of the latter are considerably different from that of fifos or sockets.
You are right. I've long forgotten about read() and write() compat with sockets for very good reason.
I don't lump an eventfd into the same IPC category as those mentioned in the parent, for the semantic reason you mention and also for its intended usage.
If you need to do anything on a list of files/dirs, but they are not in the same place, or not easily filterable (e.g., all subdirectories "/Fillion" but not the ones that start with "Nathan"), you can use find to pre-filter everything, then hand it over via xargs (if you want everything handled by one instance of the action program) or with "-exec" (if you want each item handled by a separate instance of the action program).
AWK is also awesome. AWK is in my category of languages you should learn even if you never use it (but you will for sure), because it enables another kind of thinking.
Most of the time it's much better (as in, faster) to just use cat (or pv, to get a nice progress bar) for writing a file to a block device, because it streams, and lets underlying heuristics worry about block sizes and whatnot.
So:
cat foobar.img > /dev/sdi
will stream the file rather than what dd does, i.e. read block, write block, read block, write block and so on.
Usually I also lower vm.dirty_background_bytes and vm.dirty_bytes beforehand, to 16 and 48 MB respectively (the sysctls take values in bytes), which limits how much data the page cache will buffer. Otherwise the progress bar will happily claim 300 MB/s is being written, and when it completes you still have to wait a really long time for everything to actually be flushed out.
Afterwards I restore vm.dirty_ratio and vm.dirty_background_ratio to 10 and 5 respectively - the defaults on my system.
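A sketch of that workflow, run as root (I'm assuming the smaller value is the background threshold; the device name is a placeholder, and setting the *_bytes knobs overrides the corresponding *_ratio ones, which is why the ratios get restored at the end):
sysctl -w vm.dirty_background_bytes=16777216    # 16 MB
sysctl -w vm.dirty_bytes=50331648               # 48 MB
pv foobar.img > /dev/sdX                        # the redirection itself needs root
sysctl -w vm.dirty_background_ratio=5 vm.dirty_ratio=10    # back to the defaults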
I wish all of those projects, tutorials, etc. that explain how to write their image to a block device like an SD card would advise using cat instead, because there is no reason to use dd; it's just something that people stick with because others do it too.
I only use dd for specific blocks, like writing back a backup of the mbr, or as a rudimentary hex editor.
> I wish all of those projects, tutorials, etc. that explain how to write their image to a block device like an SD card would advise using cat instead, because there is no reason to use dd; it's just something that people stick with because others do it too.
I'd wondered whether dd or cat were faster, and indeed cat is faster, but not by much. Also, for some embedded devices, you have to write to specific offsets, so dd is more convenient and explicit. Lastly, cat composes poorly with sudo.
$ sudo dd if=foobar.img of=/dev/sdi # works
$ sudo cat foobar.img > /dev/sdi # fails unless root b/c redirection is done by shell
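The usual workarounds, for the record (sketches, reusing foobar.img and /dev/sdi from above): run the redirection inside a root shell, or let a root tee do the writing.
sudo sh -c 'cat foobar.img > /dev/sdi'
cat foobar.img | sudo tee /dev/sdi > /dev/null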
> I only use dd for specific blocks, like writing back a backup of the mbr, or as a rudimentary hex editor.
xxd / xxd -r is much nicer, but I suppose sometimes vim is not available...
You could have a shell with a "sudo" builtin that knew how to invoke a separate "sudo" program with the right shell syntax and quoting, such that "sudo somecommand > /path/to/root-writable-file" did the right thing.
Not really - to sudo, your shell would have to be setuid - and constantly fork stuff as you to get user permissions. Alternatively your shell could maintain a separate process for privileged access, but that puts a whole lot of your security on the assumption that your shell has no bugs that might allow escalation.
In short, you could do it, but it'd be ripped out of every server that's been hardened, and for users that don't want to care - they're just running 'sudo su' anyhow.
Speaking only for myself, the thought of my shell having a magical escalation process would scare the bejeezus out of me - and I'm supposed to have root on our boxes!
Exactly. The shell would just treat "sudo" as a builtin prefix similar to "time", but would then run the real "sudo" with an appropriate shell invocation and proper quoting.
I was originally going to put that in my examples, but opted to leave it out, because with sh -c you have to think about escaping special characters. Most of the time it doesn't matter, but when running commands as root you ought to be absolutely sure.
I've used that to write short files (such as settings in /sys or /proc), but for large files, tee has the disadvantage of writing everything twice, and the pipe adds another write and read of every byte.
Thanks for the reminder :) I do fall for the useless use of cat quite often, but most of the time I don't really care about it much. I did omit pv in the belief that tee will always be available, but pv is great and absolutely preferred when available.
Sorry, that's not quite right. `cat` (and your shell, presumably bash) does the same fundamental thing as `dd`, i.e. read block, write block. There's not really an underlying 'stream' primitive that `cat` (or `bash`, as you're using redirection to write the file) is using compared to `dd`.
What `cat` does do is a better job of trying to find an optimal block size than a naive `dd` call does. `dd` simply defaults to a 512-byte block size, which is inherited from the days when 512 bytes was the alignment for, well, everything.
There are numerous optimizations upon the fundamental read-block, write-block primitives to make it go faster (`cat` makes use of some of these). The linux kernel actually has a "stream file to socket" syscall to avoid the copy to user-land and back to "stream" a file out to the network, but that's not happening here, and there's still reading and writing of blocks in the kernel happening.
Once cat is spawned, the shell gets out of the way, and has nothing to do with it. (Exception: zsh will "helpfully" insert itself with `tee`-like operation in certain situations.)
What `cat` doesn't do is find an optimal block size for writes. It does a good job of finding an optimal block size for reads, but not for writes. Many older block devices perform very poorly when the write block size is not optimal.
The fact that zsh is a parent doesn't necessarily hurt performance. The parent and child can share the stdin fd, and the child could be the only one reading from it.
That's not what I was referring to. Though I should have been more clear: I don't believe that zsh inserts itself in the mentioned situation; just that it does insert itself in some situations, and that makes it an exception to my general "once the process is spawned, the calling shell doesn't matter" statement.
As an example of what I was referring to, in zsh, this
a >&2 | b
is equivalent to Bourne shell:
a | tee /dev/stderr | b
except that `tee` is implemented as part of zsh, rather than a separate program (for no difference in performance).
That is, zsh inserted itself into the middle of the pipeline. There are two pipes instead of one; zsh/tee reads from the a pipe, and then writes that data to the b pipe (and stderr). This does hurt performance.
Edit: OK, so I misread your comment and you were talking about >&2. Which is weird, because it actually has different behaviour in bash vs zsh. Having said that, I can't think why you would want the bash behaviour, although the bash behaviour does seem more consistent.
In the particular example I used, the Bourne shell behavior probably has no practical use (but I can write code of no practical use in many languages).
It's not particularly surprising that the zsh behavior is different than bash; zsh redirections are only Bourne shell-like in simple cases.
I dislike zsh's behavior because it isn't what the user typed. It should be simple to look at a command line and see how many pipes it makes; see the processes that will be involved in the pipeline. Zsh's behavior seems to me to be clever and implicit; I'd much rather have dumb and explicit.
I am not even sure what you mean by stream or how it would be any different from "read", "write", "read", "write". Because that is literally what cat does, and so does dd.
Many people feel cat is faster than dd because dd's default block size is 512 bytes, as you can see by running a simple dd command.
So when benchmarking dd vs cat make sure you are comparing apples to apples.
The beauty of dd is you don't have to tinker with things like vm.dirty_bytes and vm.dirty_background_bytes before using a command. Those should not be messed with on most systems and should definitely not be messed with for the sake of running a single command. When you use cat, it makes some decisions for you. Most of the time it makes good choices, but it happens to make not-so-optimal choices when large files are being written to other devices or file systems.
To avoid the pitfalls of using cat to copy image files (and not have to change global system settings) you use dd. With dd you can use O_DIRECT to bypass the VFS caching layer so your vm.dirty_* settings are ignored in general. You can also specify the block size optimal for the device you are writing to and even reading from.
While dd may not stand for disk or drive, and its name may have nothing to do with disks, it is a powerful tool that is the appropriate tool to use when working with large files, precise movement of data, and, yes, copying image files to devices.
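Something along these lines (GNU dd assumed; the device name and block size are examples, not recommendations):
dd if=foobar.img of=/dev/sdX bs=1M oflag=direct status=progress
# oflag=direct bypasses the page cache, so the vm.dirty_* settings never come into play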
buffer lets the rest of the pipeline proceed while pv is blocked writing to the output file. Yes, conventional filesystem readahead helps to some extent, but IME, not enough, especially when foo.img is a block device itself.
Holy moly. I have spent a lot of years typing Unix commands, but it never occurred to me to put the input redirection first. But that's much more pipeline-ish, so I like it a lot.
$ man dd
NAME
dd - convert and copy a file
SYNOPSIS
dd [OPERAND]...
dd OPTION
DESCRIPTION
Copy a file, converting and formatting according to the operands.
...
where bs=512 is the block size in bytes. Of course, the hdX drive must be different from the remote host's main drive, otherwise dd won't complete :-)
Can we have a single-purpose tool for getting the text between two delimiter strings?
I know it's possible with regex, but given how frequently that parsing logic is needed, and the difficulty of getting sed right, I think a "tb" tool would be very helpful.
cut only supports single ASCII characters as delimiters.
Reading a value from "<td>2017-01-09 <b>08:30</b></td>" is harder than it should be.
cut needs single-character delimiters, so only splitting on "<" or ">" won't work.
sed trips up on the "/", ":", and "-" without proper escaping.
This is before even mentioning Unicode. High-level scripting languages handle all of these just fine. I'd much rather have a standard tool and library for this purpose though.
As I say, "High-level scripting languages handle all of these just fine".
Back to the article. dd is not a disk writing tool. Everything on UNIX is a disk writing tool. dd has one job, and does it right.
Again, I propose a single-purpose text delimiter cutting program. It doesn't need to be a whole scripting language, e.g. awk, perl, python, etc. Just get the text between two strings, and have sensible syntax.
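For what it's worth, here's one way the example above can be handled today with standard tools, though it's exactly the kind of incantation the proposed tool would replace (a sketch using awk's string field separator):
echo '<td>2017-01-09 <b>08:30</b></td>' | awk -F'<b>' '{print $2}' | awk -F'</b>' '{print $1}'
# prints: 08:30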
On the other hand, cat's block size can increase over time to remain reasonable. When I first became aware of it, it was 4 KiB. Now strace shows my GNU cat using 128 KiB.
More importantly: dd is not a benchmarking tool. I can't count how many times people have complained about dd being slow on a distributed filesystem. Well, yeah. When you're writing only one block at a time with no concurrency/parallelism at all, over a network interface that has much higher latency than your disk interface, of course it's slow. When you're using tiny block sizes, as about 80% of these people do, the effect is only magnified. Unless your use case for a distributed system is a single user who doesn't even multitask, use iozone or fio instead.
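For example, something along these lines with fio gives a far more representative picture of a distributed filesystem than a lone dd (the directory, sizes, and job count are placeholders to be tuned to the system under test):
fio --name=seqwrite --directory=/mnt/dfs --rw=write --bs=1M --size=1G \
    --numjobs=8 --iodepth=16 --ioengine=libaio --direct=1 --group_reporting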
Sure it is. It tells you exactly what performance a single-threaded application can expect with serial read()s and write()s (and whatever options you choose to invoke dd with) against whatever file source and constraints/conditions are extant at the time.
Perfectly valid.
OK, so it tells you how a very poorly written application - one which probably shouldn't be running on such a system in the first place - will perform. And it's useless for everyone else. Here, have an internet point.
It also tells you how a bunch of common tools will fare against your wonderfully designed solution that is only 'truly' testable by sophisticated methods.
Thanks for whatever it is that you gave me. I don't keep up with nomenclature these days.
Let me try to be a bit clearer. Yes, dd will tell you how one instance of a common tool might perform. That's one piece of information, but very likely the least interesting piece of information for most use cases. It would be far more useful to know how performance changes as you run many instances of those same common tools simultaneously, or what kind of performance a well written application can achieve. Iozone or fio can give you all of the answers dd would have, and many more answers besides.
Using dd in this role and then complaining about the result before running any other kind of test is a waste of everyone's time. No filesystem, even local, is optimized for that kind of performance. You do know the difference between performance and scalability, don't you? People who evaluate server-oriented systems in 2017 based on a methodology more appropriate for a 1997 desktop are doomed to fail. In everything.
There are likely a number of reasons for using dd in tutorials designed for beginners:
- it is available by default on all Unix systems
- it distinguishes between input and output (i.e. if= and of=)
- it reports results
- it avoids using a common command for a dangerous operation that the user may not understand
dd also has another benefit: the ability to select a range of blocks to copy from and to. This isn't the most common scenario, but it certainly pops up on some devices.
Sometimes it seems like people try to outsmart common knowledge. It's fine to present new ideas, but one doesn't have to call what most people know works, and use, inappropriate for the tasks they use it to perform.
What? Yes, dd absolutely is a disk writing tool, although that is not its only, or even its intended primary, purpose.
It is useful for generating serial IO for a variety of purposes. For example, writing data with specific target block size; allocating contiguous blocks for use by an application (be it zeroing out a thin LUN before partitioning, or a file system); or simply dumping the content of one device to another (or to a file).
Good luck stretching out a thin LUN or creating an empty file that allocates contiguous space with cat.
The best thing about dd is that you can use it with conv=noerror, which will let you recover as much data as possible from an otherwise damaged device.
That loses you a dd blocksize chunk, which is likely much bigger than the underlying damaged sector.
ddrescue or recoverdisk (part of FreeBSD base) will both skip over unreadable blocks, then retry with smaller block sizes along the damaged areas to save as much data as possible.
Indeed. Also, you'd actually want 'conv=noerror,sync'. The 'sync' keeps the input and output block counts in sync (if an input block can't be read, it writes an output block of zeros to keep the block counts in 'sync').
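A salvage sketch for reference (the source device and output file are placeholders; a small block size limits how much good data each bad spot costs, at the price of speed):
dd if=/dev/sdX of=rescued.img bs=4096 conv=noerror,sync
# noerror keeps going past read errors; sync pads the short reads so the output stays aligned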
So what is the best way to clone a disk (or in my case a Raspberry Pi SD card)?
I tried to back up one of my cards last week using dd to an .iso file and then tried to put it onto a new card. I tried with /dev/rdisk (faster) but none of the new cards were bootable.
So this is saying just use copy.
(I ended up just creating a second boot disk and FTPing the files over, which seems less than ideal...)
The article doesn't say to stop using `dd` to write disks.
It's just saying that if you have other commands that can read/write files, such as `pv /dev/thing > file.img` (to show a progress bar), you don't have to try to shoe-horn dd into it just because /dev/thing happens to be a drive.
I've seen people try to use dd to clone a bad disk before tossing it or attempting recovery - while dd is not the right tool for the job, ddrescue [0] is.
ddrescue gives you options for error handling and will skip past bad blocks, it handles read errors much more gracefully.
If you use large block multiples on media with defects to speed things up, the good data fills the buffer first and then the remainder is zero-padded for the blocks in error. This shifts good data relative to its original position and most likely further corrupts your attempt at recovery - BTDT.
ddrescue and gddrescue also have restartable mode for failing drives, very very handy feature.
I've recovered numerous otherwise unmountable Mac/Windows drives using ddrescue, mmls, sleuthkit, and foremost.
Usually I'm able to lsblk to determine the block device, rip the data partition with ddrescue and mount it loopback with mount once I use mmls to confirm the partition type.
For Mac volumes that don't mount, either fsck.hfsplus or get the file to a Mac to run DiskWarrior (Alsoft has been repairing HFS volumes for a couple of decades).
If nothing else works once you've got the raw bits saved, foremost will usually scrape something worthwhile off your image file.
The other cool thing ddrescue does is 'mine' away at the failing blocks by reading them both forward and in reverse, clawing out every last bit of data it can. That, coupled with the resumption capability, the logging feature, and the ability to specify the number of retries on bad blocks, makes it a very useful tool, and I've used it to save both my drives and acquaintances' drives many times over the years.
I can't sing the praises of ddrescue enough. NB: ddrescue and dd_rescue are not the same!
> This belief can make simple tasks complicated. How do you combine dd with gzip? How do you use pv if the source is raw device? How do you dd over ssh?
These operations are not that complicated. Behold the magic of UNIX pipes:
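None of the three need anything special; for instance (device names and hosts here are just placeholders):
dd if=/dev/sdX bs=1M | gzip > disk.img.gz                 # dd with gzip
pv /dev/sdX > disk.img                                    # pv with a raw device as the source
dd if=/dev/sdX bs=1M | ssh user@host 'cat > disk.img'     # dd over ssh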
Another useful tip is `curl https://...iso | sudo dd of=/dev/sdb` if you don't have enough disk space to hold the ISO. And sometimes internet speeds are faster than disk speeds anyway.
Not that the parent poster's example is better than yours, since it also doesn't do this, but I believe the biggest strength of dd over cat or redirection is that you can specify the block size. With the right block size you will get far better performance and reduce wear on the device, because you're writing block by block rather than byte by byte.
The key, though, is to be aware of the device's block size, and to use a multiple of it.
That said, what you say is correct on writes. On reads, specifying block size is mostly to reduce syscall overhead because readahead will prefetch data to accelerate things.
It's not quite as simple as that when you're dealing with pipelines. For example:
cat /dev/zero | strace dd bs=1M of=/dev/null
will show that you're not actually writing 1M blocks. You're just reading whatever `cat` outputs (128KiB blocks on my system), so all you've done is an unnecessary pipe and copy. You would have been better off just redirecting.
Similarly, putting dd with a large block size in front of gzip will show that `gzip` does not start processing larger blocks. You're just adding an additional copy. Linux already has readahead, so I don't know if you'll ever see a benefit to this.
On my system, adding dd with higher block size on both sides of a gzip -1 pipeline just ends up being slower.
That's because bs means "up to", so if the input arrives in smaller blocks it will write those as-is. You should use obs, and preferably specify ibs as a multiple of obs.
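A quick way to see the difference (a sketch; the sizes are arbitrary and strace just makes the write sizes visible): with a separate obs, dd collects the small pipe reads into full-size output blocks instead of passing the 128K chunks straight through.
cat /dev/zero | strace -e trace=write dd ibs=128k obs=1M count=64 of=/dev/null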
One of those Unix tools that deserves to be better known is dcfldd (http://dcfldd.sourceforge.net/). It is basically dd with extra powers, including on-the-fly hashing, progress reporting, multiple outputs...
It also doesn't stand for 'destroy-disk', as was thought by a junior admin I once employed, and eventually had to fire because his level of incompetence was getting to the point of being almost destructive.
Nope, that's not hyperbole. I had to stop the kid from almost installing software that would have connected to a known botnet in order to help a user connect a personal computer to the VPN. He passed enough checks during the interview that we figured, "Okay, we can train him in the rest of the things."
A user wanted to bypass some of our network restrictions, and the intrepid jr. admin suggested Hola Unblocker to watch Netflix, with me sitting 3 feet away.
This was effectively the final straw and convinced me I had made the wrong hire. He was out two days later.
In my experience dd is great for binary data. Yes, pretty much everything on Unix operates on files, as does dd. But so many utilities are either line based or don't handle null bytes, and it's a pain to have determine how a given program handles binary data when I know dd will at least not mess with it.
My favorite thing to do with 'dd' is to break up multi-gigabyte log files into, say, 500MB chunks, so I can easily view and search them in XEmacs (this is 'csh' syntax as I use 'tcsh'):
foreach i (0 1 2 3 [...])
dd <big.log >big.log.$i bs=500m count=1 skip=$i
end
(XEmacs is very fast at reading large files but has a 1GB limit.)
Sorry for the confusion: dd is still a very useful tool for copying disks.
The point is that you should not feel like you have to shoehorn dd into any command dealing with disks, because only dd is somehow "raw" or "low level" enough to access them.
For example, if you have a command like this:
pv file1 | gzip | ssh host "gzip -d > file2"
and you want to make it work with disks, just replace file1 and file2 with the corresponding /dev/ disk devices and it's fine.
Now we have pretty good confirmation that this little utility is performing way more effectively than designed.
Software itself could probably benefit from some of the same approaches that allowed this little computer program to outperform its original design goals, in ways that might not have been anticipated.