Pancho writes:
> On 19/01/2021 16:21, Martin Gregorie wrote:
>> On Tue, 19 Jan 2021 15:34:00 +0000, Pancho wrote:
>>> On 19/01/2021 15:24, Martin Gregorie wrote:
>>>> On Tue, 19 Jan 2021 13:46:18 +0000, Pancho wrote:
>>>>> I think this clarifies in my mind why I wouldn't ever use this
>>>>> technique to observe events in practice. It is too fragile.
>>>>
>>>> So raise a bug to get it fixed: this will help everybody and is, after
>>>> all, why most Linux distros have decent bug reporting facilities. Plus
>>>> it's quite a good way of thanking the developers for their work.
>>>
>>> There is not a bug, just different implementations, different behaviour.
>>> Different buffering, different arguments.
>>
>> Disagree: the delay you're seeing is definitely a bug, though possibly
>> it's a task scheduler issue. If you run less than a buffer-full of data
>> through a pipe there should not be a noticeable delay under a UNIX/Linux
>> OS because the buffer is in memory and the task scheduler is a
>> multitasking scheduler and so can interleave both the writing and reading
>> tasks without any delay except those caused by task switching and being
>> preempted by higher priority tasks.
>>
>> You're reporting multi-second delays, so you can see which task(s) are
>> involved: run the delayed pipe again, but this time with 'top' running in
>> another console window to see what programs are active during the delay.
>
> I think you are missing the point. If I pipe 4095 characters into
> mawk, nothing happens; if I pipe an extra char to make 4096, it prints
> out.

Agreed. It is easy to reproduce.

$ (seq 9999 | head -c 4095; sleep 2; echo) | mawk '{print}'

This pauses before printing anything whatsoever.

$ (seq 9999 | head -c 4096; sleep 2; echo) | mawk '{print}'

This immediately prints whole lines up to 1040, pauses, then prints 104
(i.e. 1041, truncated).

$ (seq 9999 | head -c 4095; sleep 2; echo) | mawk -Wi '{print}'

This immediately prints whole lines up to 1040, pauses, then prints 10
(i.e. 1041, truncated).
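
For anyone who wants to check where the byte limits cut the input, here is a quick sketch. It assumes GNU seq, head and tail, and it only inspects the producer side of the pipe; mawk is not involved.

```shell
# Show the last (possibly partial) line of the first 4096 and 4095 bytes
# of seq's output. Lines 1 to 999 account for 3888 bytes, and every later
# line is 5 bytes ("1000\n" and so on).
cut_4096=$(seq 9999 | head -c 4096 | tail -n 1)
cut_4095=$(seq 9999 | head -c 4095 | tail -n 1)
echo "4096 bytes end with: $cut_4096"
echo "4095 bytes end with: $cut_4095"
```

The 4096-byte input ends with 104 and the 4095-byte input with 10, both fragments of 1041, so the last complete line in either case is 1040.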

This is entirely down to mawk and is nothing to do with the kernel. The
effect of -Wi is twofold.

First, it disables output buffering. But this is not really relevant
here.

Second, it causes mawk to read via the stdio stream stdin rather than
directly from file descriptor 0. This is the key difference. With -Wi,
it runs fgets on stdin, and gets stdio’s buffering policy: read as much
as possible, but don’t block unless progress is impossible. Without -Wi,
it uses its own internal buffering policy: always read a whole block
even if this means blocking unnecessarily.

I have no idea what the benefit of the latter policy is; it seems to
make the code a lot more complicated for no clear gain (and it breaks
your use case). It’s plainly deliberate, so in that sense not a bug,
although it seems like a bizarre design decision to me.

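
The two read policies described above can be imitated without mawk. The following is only an analogy, not mawk's actual code: it assumes GNU dd, whose iflag=fullblock option keeps reading until the block is full or EOF, whereas plain dd with count=1 issues a single read and takes whatever is available. The function names producer, short_read and full_block are made up for this illustration.

```shell
# Producer: 3 bytes now, 3 more a second later (a stand-in for the
# seq/head/sleep pipeline in the examples).
producer() { printf 'abc'; sleep 1; printf 'def'; }

# Policy 1: one read(2) call; returns as soon as any data is available.
short_read() { dd bs=4096 count=1 2>/dev/null; }

# Policy 2: keep reading until 4096 bytes have arrived or EOF is seen.
# Assumes GNU dd (iflag=fullblock).
full_block() { dd bs=4096 count=1 iflag=fullblock 2>/dev/null; }

# The producer's second write may hit a closed pipe once dd has exited,
# so its stderr is discarded.
first=$(producer 2>/dev/null | short_read)
second=$(producer | full_block)
echo "single read saw:     $first"
echo "full-block read saw: $second"
```

The single read returns after the first three bytes, while the full-block read produces nothing until the producer has finished. That is the same shape as the difference between mawk with and without -Wi.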
>>> I'm using Raspbian Buster, default awk is mawk 1.3.3.
[...]
> Fedora 32 for these tests, which uses awk 5.0.1 - The Buster awk is very
> old, so raising a bug requesting an upgrade to the latest awk may be a
> good idea.

There is no such thing as mawk 5.0.1; Fedora is presumably using GNU Awk
(which is also available in Debian and its derivatives). These are
totally different programs and it does not make any sense to compare
their version numbers.

--
https://www.greenend.org.uk/rjk/