
A bigger reason to avoid `dd` is unintuitive, "incorrect" behavior in some edge cases.

I don't remember the exact conditions that trigger it, but `dd` without `iflag=fullblock` can result in

    dd: warning: partial read (16384 bytes); suggest iflag=fullblock
I'm able to reliably trigger this with

    $ cat /dev/zero | openssl enc -aes-128-cbc -pbkdf2 -k foo | dd status=progress bs=1M count=100000 of=/dev/null
but not when I omit the `count=...`, for some reason (maybe the warning is suppressed in that case because it doesn't matter: apparently the effect is that one of the "blocks" ends up smaller, and thus fewer bytes get copied, but it doesn't add padding or anything stupid like that; see https://unix.stackexchange.com/questions/121865/create-rando...).

I wish we had a cat-like tool for writing into files, for the "cat foo | do-something | sudo dd of=/dev/something" use case.
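For what it's worth, `tee` mostly covers that (the device path below is the same placeholder as above):

    cat foo | do-something | sudo tee /dev/something > /dev/null

The `> /dev/null` just keeps tee from echoing all the data back to your terminal.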




According to the documentation of dd, "iflag=fullblock" is required only when dd is used with the "count=" option.

Otherwise, i.e. when dd has to read the entire input file because there is no "count=" option, "iflag=fullblock" does not have any documented effect.

From "info dd":

"If short reads occur, as could be the case when reading from a pipe for example, ‘iflag=fullblock’ ensures that ‘count=’ counts complete input blocks rather than input read operations."

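A quick way to see what that means in practice, assuming a GNU dd (how far short the first one falls depends on how much each read() from the pipe happens to return):

    # "count=" counts read() calls, and a pipe can return short reads,
    # so this may copy well under 100 MiB
    cat /dev/zero | dd bs=1M count=100 of=/dev/null

    # "count=" counts complete 1 MiB input blocks, so this copies exactly 100 MiB
    cat /dev/zero | dd bs=1M count=100 iflag=fullblock of=/dev/null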

Partial reads won't corrupt the data. dd will issue further read() calls until 1 MB of data is buffered. iflag=fullblock is only useful when counting or skipping bytes or doing direct I/O. See line 1647: https://github.com/coreutils/coreutils/blob/master/src/dd.c#...

This use of dd may cause corruption! You need iflag=fullblock to ensure it doesn't truncate any blocks, and (at the risk of cargo-culting) conv=sync doesn't hurt either. I prefer to just nc -l -p 1234 > /dev/nvme0nX.

Huh, pleasantly surprised to learn that dd correctly handles the truncated final block of a not-multiple-of-512-byte file; could have sworn that didn't work at some point.

Thank you for the clarification.

I was about to bet on a "read, fail, repeat, skip" cycle for dd's behaviour but, looking into coreutils' source code at https://github.com/goj/coreutils/blob/master/src/dd.c, if I'm not mistaken, dd does not try to be intelligent and just uses a zeroed-out buffer, so it would return 0s for unreadable blocks.


Some years ago I picked up the habit from a predecessor of testing such things with dd instead; that way you can experiment with the effect of different block sizes, so something like:

    dd if=/dev/zero of=/ddtest.out bs=64k count=65536


A note: I'd recommend using tee instead of dd for that job, or adding iflag=fullblock if your dd supports it.

The thing is that dd issues a read() for each block, but it doesn't actually care how many bytes it gets back in response (unless you turn on fullblock mode).

This isn't really a problem when you're reading from a block device, because it's pretty uncommon to get back less data than you requested. But when you're reading from a pipe, it can/does happen sometimes. So you might ask for five 32-byte chunks, and get [32, 32, 30, 32, 32]-sized chunks instead. This has the effect of messing up the contents of the file you're writing, with possibly destructive effects.

To avoid it, use `tee` or something else. Or use iflag=fullblock to ensure that you get every byte you request (up to EOF or count==N).
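Concretely, for the "pipe into a block device" case, either of these sidesteps the problem ("some-producer" and /dev/sdX are placeholders):

    some-producer | sudo tee /dev/sdX > /dev/null
    some-producer | sudo dd iflag=fullblock bs=1M of=/dev/sdX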


Interesting assertion. Can you show me a shell invocation without using dd that cuts off the first 16 bytes of a binary file, for example? This is a common reason I use dd.

Btw, there are flags in GNU dd to work in bytes as well. I have ddbytes aliased to dd iflag=count_bytes,skip_bytes oflag=seek_bytes.
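Spelled out, the alias and a call that skips the first 16 bytes of a file look something like this (file names are placeholders):

    alias ddbytes='dd iflag=count_bytes,skip_bytes oflag=seek_bytes'
    ddbytes if=input.bin of=output.bin bs=1M skip=16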

> At least cp can figure out the block size itself.

Why does it matter? Use `bs=$((4 * 1024 * 1024))`. It'll work perfectly for any imaginable block size.

My issue with dd is that it's possible to write corrupted data with some weird flags, which I did once. Something with conv=sync, I believe, which does unexpected things. But if you're not trying to be too smart, dd works fine.


These days, my main use of dd is to get a specific amount of data from a file, where both "bs" and "count" are useful (no, "bs" does not only set the buffer, it also sets the chunk size for reads and writes; this is SOMETIMES useful and/or necessary with tapes).

So, this is an approximation of a command pipeline I run several times per year, when I happen to need a secret of an approximate length:

    dd if=/dev/urandom bs=6 count=1 | base64
Tune the "bs=6", depending on how long you want your (guaranteed typeable) secret to be. Every 3 bytes in the input will give you 4 characters in the output, and keeping the input block size a multiple of 3 avoids having the output end in "=".

It MAY be possible to replace this use of dd with other shell commands. But, since I needed to learn enough of dd to cope with tapes, I use that.
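For the record, the non-dd spelling I've seen for this is the one below; both GNU and BSD head take -c, so it should behave the same:

    head -c 6 /dev/urandom | base64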


dd does all you want without a new version.

dd allows you to specify your seek, skip and count values as bytes instead of blocks using the iflag/oflag options seek_bytes, skip_bytes and count_bytes.

So to read the first MB of data starting 100 GB into a file, you can:

    dd if=/tmp/file1 bs=2M skip=100G count=1M iflag=count_bytes,skip_bytes


I am running Coreutils 8.3 on Linux Mint 20.

It is also in Coreutils 8.22 in RHEL7.

Edit: https://man7.org/linux/man-pages/man1/dd.1.html

Edit 2 : I just found this in the coreutils 9.0 changelog:

  dd now counts bytes instead of blocks if a block count ends in "B".
  For example, 'dd count=100KiB' now copies 100 KiB of data, not
  102,400 blocks of data.  The flags count_bytes, skip_bytes and
  seek_bytes are therefore obsolescent and are no longer documented,
  though they still work.
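So on coreutils 9.0 or newer, the skip/count example from upthread could presumably drop the iflag entirely; an untested sketch based on that changelog entry:

    dd if=/tmp/file1 bs=2M skip=100GiB count=1MiB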

why? the data is easily recoverable.

    # dd if=/dev/zero of=/dev
now that's more like it.

More data:

  ubuntu@c1-10-1-2-29:~$ dd if=/dev/zero of=test1 bs=1M count=512
  512+0 records in
  512+0 records out
  536870912 bytes (537 MB) copied, 5.43834 s, 98.7 MB/s

  ubuntu@c1-10-1-2-29:~$ dd if=test1 of=test2 bs=1M count=512
  512+0 records in
  512+0 records out
  536870912 bytes (537 MB) copied, 5.869 s, 91.5 MB/s

  ubuntu@c1-10-1-2-29:~$ dd if=test2 of=/dev/null bs=1M count=512
  512+0 records in
  512+0 records out
  536870912 bytes (537 MB) copied, 0.60429 s, 888 MB/s

Per the sibling comments, you just need to specify a sane block size. dd's default (512 bytes) is really low, and if you experiment a bit with 2M or thereabouts you'll get near-theoretical throughput.

NB: Remember the units! Without a unit suffix, the number is interpreted as bytes, which is insanely small for this purpose. I've made that mistake more than once!
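For example, the difference a unit suffix makes (the output file name is just a scratch placeholder):

    dd if=/dev/zero of=testfile bs=2M count=512   # 2 MiB blocks, 1 GiB written
    dd if=/dev/zero of=testfile bs=2 count=512    # 2 *byte* blocks, 1 KiB written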


You can set dd's "bs" parameter to 1, then seek, skip & count are all in terms of bytes.
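For instance, to pull 16 bytes starting at byte offset 100 (slow for large copies, since every block is a single byte, but fine for small amounts; file names are placeholders):

    dd if=input.bin of=output.bin bs=1 skip=100 count=16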

DO @ $5/mo:

    $ dd if=/dev/zero of=test bs=512 count=1500 oflag=dsync
    1500+0 records in
    1500+0 records out
    768000 bytes (768 kB) copied, 1.64471 s, 467 kB/s

You know, you can dd to a file to avoid destroying a drive. And you can provide a `count` to write fixed-size output files; dd will write bs×count bytes (technically it writes ibs×count, and bs sets both ibs and obs).

so

    dd if=/dev/zero of=foo bs=1K count=1M
would write 1 GiB in 1K blocks, and

    dd if=/dev/zero of=foo bs=1M count=1K
would write 1 GiB in 1M blocks.
