[schilytools] star hangs randomly during copy from ZFS to ZFS
Lasse Kliemann
lasse at lassekliemann.de
Wed Nov 6 13:12:41 CET 2024
Hi Robert,
I did not change the FIFO size. It appears that FreeBSD does very little tweaking of defaults:
---------------------------------------------------------------------
# grep -v '^#' /usr/local/etc/default/star
STAR_FSYNC=N
archive0=/dev/rmt/0 20 0
archive1=/dev/rmt/0n 20 0
archive2=/dev/rmt/1 20 0
archive3=/dev/rmt/1n 20 0
archive4=/dev/rmt/0 126 0
archive5=/dev/rmt/0n 126 0
archive6=/dev/rmt/1 126 0
archive7=/dev/rmt/1n 126 0
[compress]
[decompress]
---------------------------------------------------------------------
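The same check can be scripted when comparing several machines. A minimal Python sketch, assuming the defaults file is plain key=value lines with # comments and [section] markers as in the output above; the helper name find_setting is made up for this example, and this is not how star itself parses the file:

```python
# Sketch: check whether a star defaults file sets a given key.
# SAMPLE is the grep output shown above; find_setting is a
# hypothetical helper, not part of star.

SAMPLE = """\
STAR_FSYNC=N
archive0=/dev/rmt/0 20 0
archive1=/dev/rmt/0n 20 0
archive2=/dev/rmt/1 20 0
archive3=/dev/rmt/1n 20 0
archive4=/dev/rmt/0 126 0
archive5=/dev/rmt/0n 126 0
archive6=/dev/rmt/1 126 0
archive7=/dev/rmt/1n 126 0
[compress]
[decompress]
"""


def find_setting(text, key):
    """Return the value assigned to key, or None if key is not set."""
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and section markers like [compress].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


if __name__ == "__main__":
    for key in ("STAR_FSYNC", "STAR_FIFOSIZE"):
        print(key, "->", find_setting(SAMPLE, key))
```

For the output above, STAR_FIFOSIZE comes back as None, which matches the statement that the FIFO size was not changed in the configuration file.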
Also, a quick look at the Makefile of the star port in FreeBSD Ports turned up nothing substantial.
However, the FIFO size does seem to be crucial for the issue at hand. I got hangs with fs=4m and fs=8m (which should be the default), while fs=32m seemed fine. I am running more tests now.
For fs=4m, for example, once the process had finally stalled at 0% CPU, I checked its status with kill -3 (SIGQUIT):
star: fifo had 165886 puts 164860 gets.
star: fifo was 65952 times empty and 46727 times full.
star: fifo held 4198400 bytes max, size was 4198400 bytes
star: fifo had 66 moves, total of 380416 moved bytes
star: fifo is 0% full (0k), size 4100k.
star: 0 blocks + 253428019200 bytes (total of 253428019200 bytes = 247488300.00k).
So the FIFO was empty at that point. However, it had already been empty many times before, as I saw with kill -3 while the process was still working fine. So an empty FIFO alone is not the cause, but the FIFO does seem to be empty whenever the hang occurs or shortly afterwards.
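To compare several kill -3 samples, the counters can be pulled out of the status text with a small script. A minimal Python sketch, assuming the status lines look exactly like the ones above; parse_status and its field names are made up for illustration and are not part of star:

```python
import re

# Status text as printed after kill -3 in the fs=4m run above.
STATUS = """\
star: fifo had 165886 puts 164860 gets.
star: fifo was 65952 times empty and 46727 times full.
star: fifo held 4198400 bytes max, size was 4198400 bytes
star: fifo had 66 moves, total of 380416 moved bytes
star: fifo is 0% full (0k), size 4100k.
star: 0 blocks + 253428019200 bytes (total of 253428019200 bytes = 247488300.00k).
"""


def parse_status(text):
    """Extract a few counters from the status text; missing ones are omitted."""
    patterns = {
        "puts": r"fifo had (\d+) puts",
        "gets": r"puts (\d+) gets",
        "times_empty": r"fifo was (\d+) times empty",
        "times_full": r"and (\d+) times full",
        "size_bytes": r"size was (\d+) bytes",
        "percent_full": r"fifo is (\d+)% full",
        "total_bytes": r"total of (\d+) bytes =",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            stats[name] = int(match.group(1))
    return stats


if __name__ == "__main__":
    s = parse_status(STATUS)
    print("FIFO size: %d KiB" % (s["size_bytes"] // 1024))
    print("puts - gets: %d" % (s["puts"] - s["gets"]))
    print("times empty / full: %d / %d" % (s["times_empty"], s["times_full"]))
    print("copied so far: %.1f GiB" % (s["total_bytes"] / 1024 ** 3))
```

For this sample the FIFO size works out to 4100 KiB, consistent with the "size 4100k" line, and the puts counter is 1026 ahead of the gets counter. Logging these values for a working run and for a stalled run would show which counters stop advancing.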
I will report back once I have done longer runs with fs=16m and fs=32m, which could take a couple of days.
Regards, Lasse
Robert Clausecker on Tue 2024-11(Nov)-05 at 10:38 wrote:
> Hi Lasse,
>
> This sounds interesting. I'll have a look at the issue and see if I can
> find anything.
>
> Did you change STAR_FIFOSIZE or similar variables in the star configuration
> file? I know that there is some sort of issue if the fifo is configured to
> hold more than ~2GB, but I'm not sure if this is related to your problem.
>
> Yours,
> Robert Clausecker
>
> Am Tue, Nov 05, 2024 at 12:01:56AM +0100 schrieb Lasse Kliemann:
>> Hi Robert,
>>
>> system information:
>>
>> FreeBSD 14.1-RELEASE-p5 GENERIC amd64
>> hw.model: Intel(R) Celeron(R) CPU J3455 @ 1.50GHz
>> hw.machine: amd64
>>
>> I can reproduce the issue, simply because it happens every time that I start the copy process. It does not happen at the same file or after the same amount of bytes copied though; this varies a lot - sometimes a few hundred GB, sometimes 1-2 TB.
>>
>> By the way, after having copied with bsdtar (to /data2 actually, not to /test; but /data2 showed the same issue with star as /test does) and then fixing directory mtimes with rsync, I started a star process to compare the copied data with the source:
>>
>> star -c -diff -vv -dump -acl diffopts='!atime,!ctime,!nsecs,!sparse' -C /data1 . /data2/.zfs/snapshot/after-copy
>>
>> This is running just fine so far, but will take about 1-2 more days to complete.
>>
>> Thanks,
>> Lasse
>>
>> Robert Clausecker on Mon 2024-11(Nov)-04 at 23:40 wrote:
>> > Hi Lasse,
>> >
>> > Thank you for your bug report.
>> >
>> > What operating system and architecture do you run star on?
>> > Do you have a deterministic way to reproduce this issue?
>> >
>> > Yours,
>> > Robert Clausecker
>> >
>> > --
>> > () ascii ribbon campaign - for an encoding-agnostic world
>> > /\ - against html email - against proprietary attachments
>> > --
>> > schilytools mailing list
>> > schilytools at mlists.in-berlin.de
>> > https://mlists.in-berlin.de/mailman/listinfo/schilytools-mlists.in-berlin.de
>>
>> --
>> Kind Regards / MfG
>> Dr. Lasse Kliemann
>> Westring 269, 24116 Kiel, Germany
>> E-Mail: lasse at lassekliemann.de
>> Telegram.org: @lassekliemann
>> Signal.org: +49 162 66 88 468
>> Website: https://lassekliemann.de
>> OpenPGP key: https://lassekliemann.de/gpg-common.asc
>> 4D69 BC04 CD1F 7589 334B 0B00 9FC2 FEE9 AE69 652A