High average load for IO?

Everything sounds very logical.
How can I examine the task?

I have been searching on that but unfortunately have not been able to find anything to help identify what udev-worker is dealing with.

The only thing that comes to mind is to grep the /var/log/messages file for udev. There should be some kernel message from when the process was initially set up and then put into D state.
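A rough sketch of that idea, in case it helps. The helper name `d_state_only` is made up for illustration, and the choice of `ps` columns is only my assumption of what is useful here; the WCHAN column can hint at the kernel function the task is sleeping in.

```shell
# Keep the header line plus every row whose STAT column starts with "D"
# (uninterruptible sleep, usually waiting on I/O or a driver).
d_state_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Run as root on the affected machine:
#   ps -eo pid,stat,wchan:32,comm | d_state_only
#   grep -i udev /var/log/messages
```

If the udev-worker shows up in that list on every run, it is stuck rather than just briefly busy.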

In the log file “bootlog.old” I found the line

“[1.448758] note: (udev-worker)[379] exited with irqs disabled”

But the current PID of the udev-worker is 381...
Is that a hint or a dead end? An indication of a wrong BIOS setting?

EDIT:
I try this:

[root@ipfire ~]# /etc/init.d/udev restart
Stopping udev daemon... [ OK ]
Populating /dev with device nodes...
Starting udev daemon... [ OK ]
Timed out for waiting the udev queue being empty. [ FAIL ]

Is this helpful?
Torsten

It does give some indication. When I restart udev on my system, the output stops at the “Starting udev daemon... [ OK ]” line.

I need to find where that next line is coming from. It does not look to be coming from the initscript.

EDIT:

That last line is not from IPFire; it comes from udev itself. Searching on that message turns up a variety of issues, some to do with fstab, some to do with the graphics card, and so on.

Unfortunately nothing I found gave a suggestion as to what could cause the problem in your situation.

I suppose udev is trying to initialize the various devices.
The messages look as if a device response, which would arrive by means of an interrupt, is missing.
As far as I understand it, the udev worker tasks process the device events that are held in a queue.

Are there devices not functioning or error messages in the bootlog?

In the initscript the following code section

# Now traverse /sys in order to "coldplug" devices that have
# already been discovered
/bin/udevadm trigger --action=add

looks to discover all your hardware.

In the next section

# Now wait for udevd to process the uevents we triggered
/bin/udevadm settle
evaluate_retval

the settle command is timing out because udevd is taking too long to process all the uevents triggered.

That suggests to me that something in your hardware has perhaps not failed outright but is taking too long for udev to process properly.
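The settle step can also be repeated by hand with a shorter timeout, so you do not have to wait for the initscript each time. This is only a sketch: the function name `settle_check` and the 10 second default are my own choices, and it assumes `udevadm` is on the PATH and returns a non-zero exit status when the queue does not drain in time.

```shell
# Wait for the udev event queue to empty, but give up after a short timeout
# and say which of the two happened.
settle_check() {
    timeout_s="${1:-10}"
    if udevadm settle --timeout="$timeout_s"; then
        echo "udev queue drained within ${timeout_s}s"
    else
        echo "udev queue still busy after ${timeout_s}s"
        return 1
    fi
}

# Usage (as root):
#   settle_check 10
```

If this reports a busy queue every time, even long after boot, the event that is blocking the queue is never completing rather than just being slow.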

There were a few findings about the settle command for udev but most of what I found was related to systemd and not sysvinit as is used in IPFire.

If you look through your bootlog are there any messages that look to be something not responding quickly enough or timing out or not providing the required response?

The file “/var/log/bootlog” was created on August 18 and is empty.
When I execute

[root@ipfire ~]# /etc/init.d/udev restart
Stopping udev daemon... [ OK ]
Populating /dev with device nodes...
Starting udev daemon... [ OK ]
Timed out for waiting the udev queue being empty. [ FAIL ]

in /var/log/messages only the following is logged:

Aug 23 18:57:44 ipfire root: Could not find a bridged zone for lo
Aug 23 18:57:44 ipfire root: Could not find a bridged zone for tun0

That is all. Where can I find more information about udev?

The bootlog is gzipped each week or so. If you run

ls -hal /var/log/bootlog*

then you will see all the versions of bootlog.

This is what is on my system

ls -hal /var/log/bootlog*
-rw-rw-r-- 1 root syslogd 0 Aug 18 00:01 /var/log/bootlog
-rw-r--r-- 1 root root 14K Jun 13 19:05 /var/log/bootlog.10.gz
-rw-rw-r-- 1 root syslogd 20 Jun 2 00:01 /var/log/bootlog.11.gz
-rw-r--r-- 1 root root 13K Jun 1 12:36 /var/log/bootlog.12.gz
-rw-r--r-- 1 root root 13K May 25 13:31 /var/log/bootlog.13.gz
-rw-rw-r-- 1 root syslogd 20 May 12 00:01 /var/log/bootlog.14.gz
-rw-rw-r-- 1 root syslogd 20 May 5 00:01 /var/log/bootlog.15.gz
-rw-rw-r-- 1 root syslogd 20 Apr 28 00:01 /var/log/bootlog.16.gz
-rw-r--r-- 1 root root 13K Apr 26 15:44 /var/log/bootlog.17.gz
-rw-r--r-- 1 root root 13K Apr 18 11:06 /var/log/bootlog.18.gz
-rw-rw-r-- 1 root syslogd 20 Apr 7 00:01 /var/log/bootlog.19.gz
-rw-rw-r-- 1 root syslogd 20 Aug 11 00:01 /var/log/bootlog.1.gz
-rw-rw-r-- 1 root syslogd 20 Mar 31 00:01 /var/log/bootlog.20.gz
-rw-rw-r-- 1 root syslogd 20 Mar 24 00:01 /var/log/bootlog.21.gz
-rw-rw-r-- 1 root syslogd 20 Mar 17 00:01 /var/log/bootlog.22.gz
-rw-r--r-- 1 root root 13K Mar 16 09:53 /var/log/bootlog.23.gz
-rw-rw-r-- 1 root syslogd 20 Mar 3 00:01 /var/log/bootlog.24.gz
-rw-r--r-- 1 root root 13K Feb 28 12:27 /var/log/bootlog.25.gz
-rw-rw-r-- 1 root syslogd 20 Feb 18 2024 /var/log/bootlog.26.gz
-rw-r--r-- 1 root root 13K Feb 13 2024 /var/log/bootlog.27.gz
-rw-rw-r-- 1 root syslogd 20 Feb 4 2024 /var/log/bootlog.28.gz
-rw-rw-r-- 1 root syslogd 20 Jan 28 2024 /var/log/bootlog.29.gz
-rw-r--r-- 1 root root 14K Aug 9 13:35 /var/log/bootlog.2.gz
-rw-rw-r-- 1 root syslogd 20 Jan 21 2024 /var/log/bootlog.30.gz
-rw-rw-r-- 1 root syslogd 20 Jan 14 2024 /var/log/bootlog.31.gz
-rw-rw-r-- 1 root syslogd 20 Jan 7 2024 /var/log/bootlog.32.gz
-rw-r--r-- 1 root root 13K Jan 4 2024 /var/log/bootlog.33.gz
-rw-rw-r-- 1 root syslogd 20 Dec 24 2023 /var/log/bootlog.34.gz
-rw-rw-r-- 1 root syslogd 20 Dec 17 2023 /var/log/bootlog.35.gz
-rw-rw-r-- 1 root syslogd 20 Dec 10 2023 /var/log/bootlog.36.gz
-rw-rw-r-- 1 root syslogd 20 Dec 3 2023 /var/log/bootlog.37.gz
-rw-rw-r-- 1 root syslogd 20 Nov 26 2023 /var/log/bootlog.38.gz
-rw-r--r-- 1 root root 13K Nov 23 2023 /var/log/bootlog.39.gz
-rw-r--r-- 1 root root 14K Aug 1 13:11 /var/log/bootlog.3.gz
-rw-rw-r-- 1 root syslogd 20 Nov 12 2023 /var/log/bootlog.40.gz
-rw-rw-r-- 1 root syslogd 20 Nov 5 2023 /var/log/bootlog.41.gz
-rw-rw-r-- 1 root syslogd 20 Oct 29 2023 /var/log/bootlog.42.gz
-rw-rw-r-- 1 root syslogd 20 Oct 22 2023 /var/log/bootlog.43.gz
-rw-rw-r-- 1 root syslogd 20 Oct 15 2023 /var/log/bootlog.44.gz
-rw-r--r-- 1 root root 13K Oct 12 2023 /var/log/bootlog.45.gz
-rw-r--r-- 1 root root 13K Oct 3 2023 /var/log/bootlog.46.gz
-rw-r--r-- 1 root root 13K Sep 26 2023 /var/log/bootlog.47.gz
-rw-rw-r-- 1 root syslogd 20 Sep 17 2023 /var/log/bootlog.48.gz
-rw-rw-r-- 1 root syslogd 20 Sep 10 2023 /var/log/bootlog.49.gz
-rw-rw-r-- 1 root syslogd 20 Jul 21 00:01 /var/log/bootlog.4.gz
-rw-rw-r-- 1 root syslogd 20 Sep 3 2023 /var/log/bootlog.50.gz
-rw-rw-r-- 1 root syslogd 20 Aug 27 2023 /var/log/bootlog.51.gz
-rw-rw-r-- 1 root syslogd 20 Aug 20 2023 /var/log/bootlog.52.gz
-rw-rw-r-- 1 root syslogd 20 Jul 14 00:01 /var/log/bootlog.5.gz
-rw-rw-r-- 1 root syslogd 20 Jul 7 00:01 /var/log/bootlog.6.gz
-rw-rw-r-- 1 root syslogd 20 Jun 30 00:01 /var/log/bootlog.7.gz
-rw-rw-r-- 1 root syslogd 20 Jun 23 00:01 /var/log/bootlog.8.gz
-rw-r--r-- 1 root root 14K Jun 19 14:33 /var/log/bootlog.9.gz

The ones where no reboot occurred in that week will have a size of 20 bytes and those will be empty.

The ones where a reboot happened will be something like 13KB and you can review the most recent version, the one from Aug 9th by running

zless /var/log/bootlog.2.gz

where the 2 needs to be replaced with the number of the most recent file with data on your system.
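If you do not want to pick the file out of the listing by eye, something like the following should find it. This is a sketch only: the function name `latest_bootlog` is made up, and it relies on the observation above that an empty rotated bootlog is exactly 20 bytes.

```shell
# Print the most recently modified rotated bootlog that is larger than
# 20 bytes, i.e. one that actually contains boot messages.
latest_bootlog() {
    dir="${1:-/var/log}"
    # -size +20c matches files bigger than 20 bytes; ls -t sorts newest first.
    find "$dir" -maxdepth 1 -name 'bootlog.*.gz' -size +20c -exec ls -t {} + | head -n 1
}

# Usage:
#   zless "$(latest_bootlog)"
```

It prints nothing if every rotated file is empty, in which case there has been no reboot within the retained logs.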

Hello,
OK, I have found a (supposed) solution...

Step 1:
After Adolf explained how IPFire creates the bootlogs, I looked for the most recent bootlog file with content.

Step 2:
I examined the 800 lines for anything unusual, errors, or udev references,
and found:

  • smpboot: CPU0: Intel(R) Celeron(R) J6412 @ 2.00GHz (family: 0x6, model: 0x96, stepping: 0x1)
  • BUG: kernel NULL pointer dereference, address: 00000000000000
    - CPU: 1 PID: 376 Comm: (udev-worker) Not tainted 6.6.32-ipfire #1
  • Modules linked in: pinctrl_elkhartlake
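For anyone repeating this search, the 800 lines can be narrowed down with grep instead of reading them all. A minimal sketch; the function name `scan_bootlog` and the pattern list are my own guesses at what is worth flagging, not anything IPFire provides, so extend the patterns as needed.

```shell
# Print numbered lines from a boot log on stdin that mention kernel bugs,
# oopses, timeouts, or udev (case-insensitive).
scan_bootlog() {
    grep -inE 'BUG:|null pointer|oops|timed out|udev'
}

# Usage, with N being the most recent bootlog that has content:
#   zcat /var/log/bootlog.N.gz | scan_bootlog
```

The line numbers in the output make it easy to jump to the surrounding context in zless afterwards.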

Step 3:
I googled for “Timed out for waiting the udev queue being empty” and “elkhart lake”,

found: https://forums.debian.net/viewtopic.php?t=154580

  • Bad solution: blacklisting the module
  • Good solution: BIOS update

Can this be confirmed by you or am I wrong?
Greetings Torsten

The article cited seems to describe the issue very well.
I would try a BIOS update.

Without properly functioning firmware, running an OS on top of it is a matter of luck.

How udev copes with misbehaving devices (which are represented by BIOS functions) is a separate matter.

Looking through the steps you took and the link you found, a problem with the Elkhart Lake processor, probably tied to a particular kernel update, does look to be the cause.

I think you will have to do the BIOS update. As far as I can see, blacklisting the elkhartlake module by changing the code would require you to do a full build of IPFire, or the devs would have to blacklist it in the code.

You can’t change any kernel modules on an existing system because the kernel modules are cryptographically signed when they are built and the key is thrown away afterwards. So no change can be made to any module without it being detected by the kernel and stopped.

Well done on your investigation. Fingers crossed that there is a BIOS update available and that it resolves your issue.
