
That always reads to eof, so it can't be used in the same way.



If you take the position that the semantics of read() are 'read data from this endless stream', then EOF is an error indicating that this model no longer applies.

Doesn't bother me at all.


The one place where you’ll actually see this is in Read().

If you try to read 1000 bytes, you might get “800 bytes read, EOF” as a result. The worst part is that os.File won’t ever do this!


The section about the guarantees of read/write doesn't seem entirely correct to me. For sure it's relevant to why the data is there, but the reason the loop doesn't terminate is simply that read doesn't tell you there's an EOF until you try to read again past the end. It would be entirely possible to construct a version of this loop that terminates, though it would be awkward.
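To make the "read again past the end" point concrete, here is a rough Python sketch. The temporary file, the helper name `read_calls_until_eof`, and the 1000-byte request size are all illustrative assumptions, not anything from the article. The first call returns the data (fewer bytes than requested), and only the second call reports end-of-file, as an empty result.

```python
import os
import tempfile

def read_calls_until_eof(fd, chunk_size=1000):
    """Call os.read repeatedly; return every result, including the final b''."""
    results = []
    while True:
        data = os.read(fd, chunk_size)
        results.append(data)
        if not data:  # b'' is how os.read reports end-of-file
            return results

# Hypothetical demo: a 5-byte regular file read with 1000-byte requests.
with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    os.write(fd, b"hello")
    os.lseek(fd, 0, os.SEEK_SET)  # rewind before reading
    results = read_calls_until_eof(fd)
```

Here `results` holds two entries: the 5 bytes, then the empty result that signals EOF. A loop that stops after a fixed number of bytes never makes that second call, so it never sees the end-of-file indication.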

> Also you never really use readline(), since files are iterable

If you do that, it will try to read from a closed file.

The article's code leaves the input stream open, which causes reads to block forever instead of exiting with an end-of-file marker. As the student projects are for an introductory course, making them handle that particular detail would only be a distraction from the desired learning outcomes.
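A small Python sketch of why the open stream matters, using a pipe as a stand-in for the input stream (the helper name `drain` and the payload are assumptions made for illustration). The reader only sees end-of-file once every write end has been closed.

```python
import os

def drain(fd, chunk_size=4096):
    """Read from fd until end-of-file and return everything that was read."""
    chunks = []
    while True:
        data = os.read(fd, chunk_size)
        if not data:  # empty result: all writers have closed the pipe
            break
        chunks.append(data)
    return b"".join(chunks)

read_end, write_end = os.pipe()
os.write(write_end, b"student output\n")
os.close(write_end)  # without this close, the second os.read in drain() would block
received = drain(read_end)
os.close(read_end)
```

If the `os.close(write_end)` line is removed, `drain` gets the data on its first read and then waits indefinitely on the next one, which matches the blocking behaviour described above.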

> Just like ending files on ctrl-Z (yes, it is End-Of-File, but filesystems know where the end of the file is already)

Very old filesystems didn't track file sizes in bytes, just blocks, so a literal EOF byte was needed to know where to stop reading in the middle of the last block of 128 bytes or so.
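Roughly how that works, as a Python sketch. The 128-byte block size comes from the comment above; the function names and the choice to pad the whole tail of the block with Ctrl-Z are assumptions for illustration, not a description of any particular filesystem.

```python
CTRL_Z = b"\x1a"   # the literal end-of-file byte
BLOCK_SIZE = 128   # the filesystem only records whole blocks

def pad_to_blocks(text: bytes) -> bytes:
    """Store text in whole blocks, filling the unused tail with Ctrl-Z."""
    padding = -len(text) % BLOCK_SIZE
    return text + CTRL_Z * padding

def logical_content(stored: bytes) -> bytes:
    """Recover the text: everything before the first Ctrl-Z, if there is one."""
    end = stored.find(CTRL_Z)
    return stored if end == -1 else stored[:end]

stored = pad_to_blocks(b"hello")
recovered = logical_content(stored)
```

The stored form is always a multiple of 128 bytes, so the byte-level length has to be carried inside the data. The obvious cost is that such a file cannot contain a Ctrl-Z byte as ordinary content.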


Here is another implementation using Node.js (although it does not explicitly check for EOF):

  require("fs").createReadStream(process.argv[2]).pipe(process.stdout)
Here is an implementation in PostScript (it does explicitly check for EOF, and specifically deals with one byte at a time):

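  % Open stdout for writing and keep the file object in o.
  /o (%stdout) (w) file def
  % Open the input file for reading (ARGUMENTS is Ghostscript-specific).
  ARGUMENTS 0 get (r) file
  {
    dup read {          % read pushes "byte true", or just "false" at EOF
      o exch write      % copy the byte to stdout
    } {
      quit              % EOF: stop
    } ifelse
  } bind loop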
(I have no GitHub account, but if the owner of that repository wants to add these, they are welcome to use what I have posted; I release this message, and all of my other messages on Hacker News, into the public domain.)

If you can't write data to a file, and then read it back in, it's not valid.

I always thought it was such a shame that you couldn't use plain poll() for this. poll()-ing on a read fd at EOF for a regular file should work like poll()-ing a network socket.
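For anyone who wants to see the behaviour, here is a Python sketch using `select.poll` (available on Unix-like systems, not on Windows). The helper name `poll_readable` and the temporary file are assumptions for illustration. POSIX says regular files always poll as readable, so poll() returns immediately even when the descriptor is sitting at EOF and there is nothing new to read.

```python
import os
import select
import tempfile

def poll_readable(fd, timeout_ms=0):
    """Return True if poll() reports fd as readable within timeout_ms."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(mask & select.POLLIN for _, mask in events)

with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    os.write(fd, b"data")
    os.lseek(fd, 0, os.SEEK_SET)
    os.read(fd, 100)                    # consume everything in the file
    at_eof = os.read(fd, 100) == b""    # confirm we are now at end-of-file
    readable_at_eof = poll_readable(fd) # poll() still says "readable"
```

So a `tail -f`-style loop built on plain poll() would spin rather than sleep until new data is appended, which is why other mechanisms end up being used for regular files.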

But this doesn't read the file char-by-char, but uses buffering to read it into a string

That issue is about readLine specifically, right? Not reading a file in general. The only time I ever get to use readLine is to solve Advent of Code problems.

> More accurately, it seems the system treats them like empty files.

The reason is that the content is generated by a callback that the kernel calls, and the kernel does not want the content to be generated just in order to stat(2) the file, so it shows a zero length, and assumes that things like /bin/cat will just read(2) until EOF is returned, without trying to be too smart.
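A Python sketch of the difference between trusting the reported size and reading until EOF. Both function names are hypothetical, and the `/proc` behaviour mentioned afterwards is Linux-specific.

```python
import os

def read_by_reported_size(path):
    """Trust stat(): read exactly st_size bytes. Returns nothing for a 0-length file."""
    size = os.stat(path).st_size
    with open(path, "rb") as f:
        return f.read(size)

def read_until_eof(path, chunk_size=4096):
    """Ignore the reported size and keep calling read until it returns b''."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

For an ordinary file the two agree. For a file whose content is generated on demand and whose stat size is 0, such as many entries under `/proc` on Linux, `read_by_reported_size` returns an empty result while `read_until_eof` returns the generated content.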


That's because `UnexpectedEof` is never returned from `read()`, it's only ever returned from `read_exact()`. In fact, `UnexpectedEof` didn't exist originally, it was added together with `read_exact()` to represent its unique error case (which is: `read()` returned end-of-file, but we still needed more bytes to completely fill the buffer). It's an error to return `UnexpectedEof` from any of the other methods of the `Read` trait, and since it's an error, it makes sense for `read_to_end()` to stop and propagate that error.

(In fact, thinking better about it, there are some cases where `read()` could legitimately return `UnexpectedEof`, like when it's a wrapper for a compressed stream which has fixed-size fields, and that stream was truncated in the middle of one of these fields. It's clear that, in that case, `UnexpectedEof` is not an end-of-file for the wrapper; it should be treated as an I/O error.)
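The same contract can be sketched in Python for readers who don't know the Rust API. This is an analogy, not the Rust implementation: the function name `read_exact` is borrowed, `EOFError` stands in for `UnexpectedEof`, and the stream is assumed to be a blocking binary stream whose `read()` returns an empty result only at end-of-file.

```python
import io

def read_exact(stream, n):
    """Read exactly n bytes, looping over short reads.

    Raises EOFError if the stream ends before n bytes arrive.
    """
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:  # plain read() hit end-of-file: not an error by itself...
            # ...but it is one here, because the caller needed more bytes.
            raise EOFError(f"wanted {n} bytes, got {len(buf)} before end-of-file")
        buf += chunk
    return bytes(buf)

header = read_exact(io.BytesIO(b"abcdef"), 4)
```

The plain read reports end-of-file as an ordinary empty result; only the "fill the whole buffer" helper turns it into an error, which is the distinction the comment above draws.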


That makes the file backwards because main calls foo which calls bar, so it should be main, foo, bar; not bar, foo, main when you read it.

> looping over `read` is essentially never the right thing to do

Why? I do it quite often, though admittedly usually in one-time scripts.


> Beyond saving anything bigger than a sentence it really has no use.

It has a use for READING text files.


GNOME's GIO library has a function that reads all requested bytes in a single call, until EOF or an error condition is reached: https://developer.gnome.org/gio/stable/GInputStream.html#g-i...

If you need that, you should be using the low-level read() / write() instead.
