FISH randomly hangs in the middle of a big copy for sh://host/
Marc MERLIN
marc_mc at merlins.org
Mon Dec 23 13:02:00 UTC 2019
Hi,
I reported the problem below 5 years ago, and it's still going on. It's
happening only rarely, and never happens if I use sshfs and copy from a
mountpoint via mc.
It also happens consistently on the same files and as far as I can tell,
in the same place (i.e. if a file hangs, I need to kill fish, make a new
connection, and if I copy it again, it hangs again in the same place).
The only change from below is that I now have a newer mc:
GNU Midnight Commander 4.8.17
Built with GLib 2.48.0
Using the S-Lang library with terminfo database
With builtin Editor
With subshell support as default
With support for background operations
With mouse support on xterm and Linux console
With support for X11 events
With internationalization support
With multiple codepages support
Virtual File Systems: cpiofs, tarfs, sfs, extfs, ext2undelfs, ftpfs, sftpfs, fish
Data types: char: 8; int: 32; long: 32; void *: 32; size_t: 32; off_t: 64;
All I can tell is that it's not a random network problem, because it
seems reproducible on some specific files, but it's rare (maybe 1% of
files or so).
Original Email below:
Could you tell me how I can best file this bug in a way that you can
find out what's wrong?
This has happened with mc over multiple versions and years, currently
I have 4.8.12. Only interesting thing is linux/64bit with 32bit userland.
Maybe one file out of 20, mostly big files (over a gigabyte), copying
from sh://host/path/file hangs.
On the other side, if I strace the perl code launched via FISH, I see
data flowing from the file to the pipe, even 2h after the UI has hung on
the client making the copy.
read(3, "\356\'\r#mU\304\v((\320\324`\337\311\35\332\350\331\177"..., 8192) = 8192
write(1, "\1\305*\213\220\244z\371\370\2~\340\277\231/\364\235e!"..., 8192) = 8192
read(3, "\27\202S\35E\241G\302S\302\fj\37v0K\306z\276\260B\350\263"..., 8192) = 8192
write(1, "}\350oR\262\212j\\\277\313\1\177\254\36\213\237-\227\21"..., 8192) = 8192
read(3, "\344\261\272\32\35\2675\5\366\326c$-\'\305\313V\3770\313"..., 8192) = 8192
I'm not sure where that data is going since pipes don't have unlimited
buffering.
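To put a number on "pipes don't have unlimited buffering", here is a minimal Python sketch that fills a fresh pipe with non-blocking writes and reports how many bytes the kernel accepted before the writer would block. The function name `pipe_capacity` and the 4096-byte chunk size are illustrative choices, not anything from mc or FISH, and the result depends on the OS (commonly 64 KiB on Linux, but treat that as an assumption to verify on your own machine).

```python
import os


def pipe_capacity(chunk_size: int = 4096) -> int:
    """Fill a fresh pipe using non-blocking writes; return bytes accepted."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full pipe raise instead of hanging.
        os.set_blocking(write_fd, False)
        chunk = b"\0" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Pipe is full: nobody is reading, so the writer must wait.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe accepted", pipe_capacity(), "bytes before blocking")
```

Once that many bytes are queued, a blocking writer stalls until the reader drains the pipe, so a sender that keeps writing for hours implies that something downstream is still consuming.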
On the client, I have this:
|-bash(15452)---mc(6558)-+-bash(6560)
| `-ssh(7305)
kill -STOP 7305 on the client causes the strace perl on the other machine
to stop flowing.
I'm not too sure where that data is going, process 7305 says:
ssh 7305 root 0r FIFO 0,8 63158081 pipe
ssh 7305 root 1w FIFO 0,8 63158082 pipe
ssh 7305 root 2w CHR 1,3 1028 /dev/null
ssh 7305 root 3u IPv4 63159122 TCP client:43163->server:ssh (ESTABLISHED)
ssh 7305 root 4r FIFO 0,8 63158081 pipe
ssh 7305 root 5w FIFO 0,8 63158082 pipe
ssh 7305 root 6w CHR 1,3 1028 /dev/null
strace shows ssh is reading from FH 3, but I can't tell where it's
shoving the data:
read(3, "\0\0@\20\2365,\224\304hx$\312M\251\262\17\236D\352\354I\211\212\213+\253\3161\200@\\"..., 8192) = 1448
clock_gettime(CLOCK_MONOTONIC, {548105, 51688985}) = 0
clock_gettime(CLOCK_MONOTONIC, {548105, 51787475}) = 0
select(7, [3 4], [], NULL, NULL) = 1 (in [3])
clock_gettime(CLOCK_MONOTONIC, {548105, 51971095}) = 0
read(3, "\33{\303\3122\237!!\206\216u\321\275\265N\341\220\264\221G\6\266\227 \314\212\377\r\371\177\247\36"..., 8192) = 1448
clock_gettime(CLOCK_MONOTONIC, {548105, 53168084}) = 0
clock_gettime(CLOCK_MONOTONIC, {548105, 53254841}) = 0
select(7, [3 4], [], NULL, NULL) = 1 (in [3])
clock_gettime(CLOCK_MONOTONIC, {548105, 53439342}) = 0
read(3, "Bp=\304;WI\\x\271\26Cy\214\245\330\336*\270\35\3507\225pw\226\316\225\220\322\300\241"..., 8192) = 1448
clock_gettime(CLOCK_MONOTONIC, {548105, 53637911}) = 0
clock_gettime(CLOCK_MONOTONIC, {548105, 53723783}) = 0
Now, the parent process, mc, pid 6558, seems to be receiving data.
If I kill -STOP 6558, I can see the data flow stop.
strace of that pid shows only reading from a pipe, only one character at
a time?
read(10, "+", 1) = 1
read(10, "\316", 1) = 1
read(10, "\222", 1) = 1
read(10, "\253", 1) = 1
read(10, "\216", 1) = 1
read(10, "\317", 1) = 1
read(10, "\316", 1) = 1
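For illustration only: a reader that scans for a line terminator without any buffering produces exactly this kind of strace pattern, one read() call per byte. The Python sketch below shows such a loop. It is an assumption about what a loop of this shape looks like, not mc's actual code, and the names `read_line_bytewise`, `demo`, and the "STATUS OK" payload are hypothetical.

```python
import os
from typing import Tuple


def read_line_bytewise(fd: int) -> Tuple[bytes, int]:
    """Read up to and including a newline using 1-byte reads.

    Returns the bytes read and the number of read() calls issued.
    """
    line = bytearray()
    calls = 0
    while True:
        byte = os.read(fd, 1)  # one syscall per byte, as in the strace
        calls += 1
        if not byte:  # EOF before any newline
            break
        line += byte
        if byte == b"\n":
            break
    return bytes(line), calls


def demo() -> Tuple[bytes, int]:
    """Feed a hypothetical reply line plus payload through a pipe."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"STATUS OK\nPAYLOAD")
        return read_line_bytewise(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(demo())
```

If such a loop were ever applied to bulk binary data that contains no terminator for a long stretch, it would keep consuming input without writing anything out, which would match the symptoms here. That is a hypothesis to check, not a diagnosis.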
I can't see where mc is shoving that data; the UI is not showing any
progress.
The destination file shown in lsof shows no progress/size change:
mc 6558 root 8w REG 0,42 1186279729 25828 /mnt/dshelf1/file
So I have no idea where mc is putting that data if it is flowing in
(albeit one character at a time?) and not flowing out according to lsof.
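One more data point that could be gathered on Linux is the write offset of the destination descriptor, which the kernel exposes as the `pos:` field of /proc/PID/fdinfo/FD. The Python sketch below parses that field; the helper names `parse_fdinfo_pos` and `fd_offset` are made up for this example, and it assumes a Linux /proc filesystem and permission to read the target process's entries.

```python
from typing import Optional


def parse_fdinfo_pos(text: str) -> Optional[int]:
    """Extract the decimal 'pos' field from /proc/PID/fdinfo/FD contents."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "pos":
            return int(value.strip())
    return None


def fd_offset(pid: int, fd: int) -> Optional[int]:
    """Return the current file offset of descriptor fd in process pid."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as handle:
        return parse_fdinfo_pos(handle.read())
```

Calling `fd_offset(6558, 8)` a few seconds apart for the mc process and destination descriptor listed above would show whether the offset is moving at all, independently of what the UI or lsof reports.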
Can you suggest what other data I should gather to help file a bug
that's actionable?
gargamel:~# mc --version
GNU Midnight Commander 4.8.12
Built with GLib 2.38.2
Using the S-Lang library with terminfo database
With builtin Editor
With subshell support as default
With support for background operations
With mouse support on xterm and Linux console
With support for X11 events
With internationalization support
With multiple codepages support
Virtual File Systems: cpiofs, tarfs, sfs, extfs, ext2undelfs, ftpfs, sftpfs, fish
Data types: char: 8; int: 32; long: 32; void *: 32; size_t: 32; off_t: 64;
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08