FAILURE: in first configuration

mc5686 · 21 June 2023 08:02

Hi,
right after the initial network configuration I got a “FAILURE:” message that I don’t know how to handle.

It seems /usr/bin/sendprofile is unable to run, as a plain sendprofile --help crashes with “Segmentation fault”.

Apparently it bombs in .../fireinfo/system.py#70, and I can’t catch it even under pdb.

Note: this test installation is in a QEMU virtual machine.

Please help
Mauro

bonnietwin · 21 June 2023 08:10

Can you show the “segmentation fault” output when you run sendprofile --help?

mc5686 · 21 June 2023 08:27

Yes, of course I can, but I fear there’s little to be gleaned there:

It just dies.
I am available for any test, but I need directions.

Thanks for the quick answer.
Mauro

mc5686 · 21 June 2023 09:25

FYI: I traced the failure to /usr/lib/python3.10/site-packages/fireinfo/hypervisor.py line 26:

self.__hypervisor = _fireinfo.detect_hypervisor()

Apparently this is a compiled extension, as I cannot step into it.
Attempting to do so results in “Segmentation fault”.

Here is the full log:

[root@ipfire ~]# python3 -m pdb /usr/bin/sendprofile --help
> /usr/bin/sendprofile(22)<module>()
-> import json
(Pdb) b /usr/lib/python3.10/site-packages/fireinfo/system.py:96
Breakpoint 1 at /usr/lib/python3.10/site-packages/fireinfo/system.py:96
(Pdb) c
> /usr/lib/python3.10/site-packages/fireinfo/system.py(96)__init__()
-> self.hypervisor = hypervisor.Hypervisor()
(Pdb) s
--Call--
> /usr/lib/python3.10/site-packages/fireinfo/hypervisor.py(25)__init__()
-> def __init__(self):
(Pdb) l
 20  	
 21  	from . import _fireinfo
 22  	from . import system
 23  	
 24  	class Hypervisor(object):
 25  ->		def __init__(self):
 26  			self.__hypervisor = _fireinfo.detect_hypervisor()
 27  	
 28  		@property
 29  		def system(self):
 30  			"""
(Pdb) n
> /usr/lib/python3.10/site-packages/fireinfo/hypervisor.py(26)__init__()
-> self.__hypervisor = _fireinfo.detect_hypervisor()
(Pdb) s
Segmentation fault
[root@ipfire ~]#

Regards
Mauro

bonnietwin · 21 June 2023 10:12

So the code for detecting the hypervisor is crashing with a segmentation fault, which the kernel then reports.
With a segmentation fault there should be more information on the cause in the kernel-related logs. The sendprofile command is not normally run from the terminal, so it sends its output only to the logs, not to stdout or stderr.

Open a terminal window to IPFire and run
tail -f /var/log/messages

then run that same sendprofile command in another terminal window. In the window following /var/log/messages you should see more details on what is causing the segmentation fault as it occurs.

mc5686 · 21 June 2023 11:05

Thanks Adolf,
log reports:

Jun 21 13:03:05 ipfire kernel: sendprofile[15138]: segfault at 0 ip 000071470f8c465e sp 00007fff7e948ce8 error 4 in libc.so.6[71470f840000+159000] likely on CPU 0 (core 0, socket 0)
Jun 21 13:03:05 ipfire kernel: Code: b6 07 29 c8 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 89 f8 31 d2 66 0f ef ff 09 f0 25 ff 0f 00 00 3d c0 0f 00 00 0f 8f 74 02 00 00 <f3> 0f 6f 0f f3 0f 6f 06 66 0f 74 c1 66 0f da c1 66 0f ef c9 66 0f

Interpretation is beyond me.

Regards

cfusco · 21 June 2023 11:14

This is what ChatGPT 4 says about the log:

The log you’ve shared is an error message from the kernel of a Unix-like operating system (like Linux). It reports a segmentation fault (segfault) that has occurred in the sendprofile process.

Let’s dissect the log message:

“Jun 21 13:03:05 ipfire kernel: sendprofile[15138]”: The timestamp shows when this log was generated. “ipfire” is the hostname of the machine, and “kernel” indicates that the log is from the kernel subsystem. The process causing the error is named “sendprofile”, and 15138 is its process ID (PID).
“segfault at 0 ip 000071470f8c465e sp 00007fff7e948ce8 error 4 in libc.so.6[71470f840000+159000] likely on CPU 0 (core 0, socket 0)”: This indicates a segmentation fault has occurred at memory location 0. The instruction pointer (ip) was at “000071470f8c465e”, and the stack pointer (sp) was at “00007fff7e948ce8”. The error occurred within the “libc.so.6” library, a fundamental library in Linux systems that contains system call functions. The “error 4” typically means there was an error in reading memory.
“Code: b6 07 29 c8 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 89 f8 31 d2 66 0f ef ff 09 f0 25 ff 0f 00 00 3d c0 0f 00 00 0f 8f 74 02 00 00 <f3> 0f 6f 0f f3 0f 6f 06 66 0f 74 c1 66 0f da c1 66 0f ef c9 66 0f”: This is the section of machine code being executed at the time of the segfault.

The message indicates a software bug, possibly within the sendprofile process, or it could be an issue with the memory management of the kernel itself. To debug it further, a core dump may be needed and it should be analyzed with a debugger tool such as GDB to find out the exact cause of the segfault. It might also help to look at the source code of the sendprofile process if it’s available.
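
For readers who want to check that reading by hand, here is a minimal Python sketch that decodes the kernel line quoted above. It assumes the usual x86 page-fault error-code bits (bit 0: page was present, bit 1: write, bit 2: user mode, bit 4: instruction fetch). The function and field names are made up for this illustration and are not part of fireinfo or the kernel.

```python
import re

# The kernel line from the log above, without the syslog prefix.
EXAMPLE_LINE = (
    "sendprofile[15138]: segfault at 0 ip 000071470f8c465e "
    "sp 00007fff7e948ce8 error 4 in libc.so.6[71470f840000+159000] "
    "likely on CPU 0 (core 0, socket 0)"
)

# All numeric fields in this message are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode an x86 'segfault at ...' log line into a dict (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    return {
        "fault_address": int(m["addr"], 16),
        "object": m["obj"],
        # Distance of the faulting instruction from the start of the mapping.
        "offset_in_mapping": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
        "instruction_fetch": bool(err & 0x10),
    }


if __name__ == "__main__":
    for key, value in decode_segfault(EXAMPLE_LINE).items():
        print(key, "=", value)
```

For the line above this gives a fault address of 0, a user-mode read of a page that is not mapped, and an instruction 0x8465e bytes into the libc.so.6 mapping, which is consistent with a NULL pointer being read inside a libc routine.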

mc5686 · 21 June 2023 11:22

Ok.
This essentially says what we already know:
someone is passing a NULL pointer to some libc function, which blindly tries to dereference it.
The only additional information is that the access was a read.

Would a coredump be actually useful?
All executables seem stripped.

What should I do to extract more info?
Change ulimit -c and use gdb?
Is it available through pakfire?

TiA!
Mauro
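
One low-effort option that needs neither gdb nor unstripped binaries is Python's standard faulthandler module, which prints the Python-level traceback when the interpreter receives SIGSEGV. The sketch below only demonstrates the mechanism on a deliberate stand-in crash (a NULL read through ctypes). It is not the fireinfo code, and run_with_faulthandler is a name invented for this example.

```python
import signal
import subprocess
import sys

# Stand-in crash: ctypes.string_at(0) reads a C string from address 0.
CRASH = "import ctypes; ctypes.string_at(0)"


def run_with_faulthandler(code):
    """Run `code` in a child interpreter started with -X faulthandler."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_with_faulthandler(CRASH)
    # A negative return code means the child was killed by that signal.
    print("killed by SIGSEGV:", result.returncode == -signal.SIGSEGV)
    # faulthandler writes the Python traceback to stderr before the process dies.
    print(result.stderr)
```

On the IPFire machine itself the same switch would be python3 -X faulthandler /usr/bin/sendprofile --help. This shows only the Python frames leading to the crash, not the C frames inside the extension, so it complements a core dump rather than replacing one.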

bonnietwin · 21 June 2023 11:32

There must be some bug in relation to the qemu hypervisor and your hardware causing a problem.

The last change to the fireinfo code was in July 2021 to deal with the change of python from v2 to v3.

The C code involved is:

/*
 * Fireinfo
 * Copyright (C) 2010, 2011 IPFire Team (www.ipfire.org)
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

#include <Python.h>

#include <errno.h>
#include <fcntl.h>
#include <linux/hdreg.h>
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>

/* hypervisor vendors */
enum hypervisors {
	HYPER_NONE       = 0,
	HYPER_XEN,
	HYPER_KVM,
	HYPER_MSHV,
	HYPER_VMWARE,
	HYPER_OTHER,
	HYPER_LAST /* for loop - must be last*/
};

const char *hypervisor_ids[] = {
	[HYPER_NONE]    = NULL,
	[HYPER_XEN]     = "XenVMMXenVMM",
	[HYPER_KVM]     = "KVMKVMKVM",
	/* http://msdn.microsoft.com/en-us/library/ff542428.aspx */
	[HYPER_MSHV]    = "Microsoft Hv",
	/* http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458 */
	[HYPER_VMWARE]  = "VMwareVMware",
	[HYPER_OTHER]   = NULL
};

const char *hypervisor_vendors[] = {
	[HYPER_NONE]    = NULL,
	[HYPER_XEN]     = "Xen",
	[HYPER_KVM]     = "KVM",
	[HYPER_MSHV]    = "Microsoft",
	[HYPER_VMWARE]  = "VMWare",
	[HYPER_OTHER]   = "other"
};

#define NEWLINE "\n\r"

static void truncate_nl(char *s) {
	assert(s);

	s[strcspn(s, NEWLINE)] = '\0';
}

static int read_one_line_file(const char *filename, char **line) {
	char t[2048];

	if (!filename || !line)
		return -EINVAL;

	FILE* f = fopen(filename, "re");
	if (!f)
		return -errno;

	if (!fgets(t, sizeof(t), f)) {
		if (ferror(f))
			return errno ? -errno : -EIO;

		t[0] = 0;
	}

	char *c = strdup(t);
	if (!c)
		return -ENOMEM;

	truncate_nl(c);

	*line = c;
	return 0;
}

/*
 * This CPUID leaf returns the information about the hypervisor.
 * EAX : maximum input value for CPUID supported by the hypervisor.
 * EBX, ECX, EDX : Hypervisor vendor ID signature. E.g. VMwareVMware.
 */
#define HYPERVISOR_INFO_LEAF   0x40000000

int detect_hypervisor(int *hypervisor) {
#if defined(__x86_64__) || defined(__i386__)
	/* Try high-level hypervisor sysfs file first: */
	char *hvtype = NULL;
	int r = read_one_line_file("/sys/hypervisor/type", &hvtype);
	if (r >= 0) {
		if (strcmp(hvtype, "xen") == 0) {
			*hypervisor = HYPER_XEN;
			return 1;
		}
	} else if (r != -ENOENT)
		return r;

	/* http://lwn.net/Articles/301888/ */

#if defined(__amd64__)
#define REG_a "rax"
#define REG_b "rbx"
#elif defined(__i386__)
#define REG_a "eax"
#define REG_b "ebx"
#endif

	uint32_t eax = 1;
	uint32_t ecx;
	union {
		uint32_t sig32[3];
		char text[13];
	} sig = {};

	__asm__ __volatile__ (
		/* ebx/rbx is being used for PIC! */
		"  push %%"REG_b"       \n\t"
		"  cpuid                \n\t"
		"  pop %%"REG_b"        \n\t"

		: "=a" (eax), "=c" (ecx)
		: "0" (eax)
	);

	bool has_hypervisor = !!(ecx & 0x80000000U);

	if (has_hypervisor) {
		/* There is a hypervisor, see what it is... */
		eax = 0x40000000U;
		__asm__ __volatile__ (
			"  push %%"REG_b"       \n\t"
			"  cpuid                \n\t"
			"  mov %%ebx, %1        \n\t"
			"  pop %%"REG_b"        \n\t"

			: "=a" (eax), "=r" (sig.sig32[0]), "=c" (sig.sig32[1]), "=d" (sig.sig32[2])
			: "0" (eax)
		);
		sig.text[12] = '\0';

		*hypervisor = HYPER_OTHER;

		if (*sig.text) {
			for (int id = HYPER_NONE + 1; id < HYPER_LAST; id++) {
				if (strcmp(hypervisor_ids[id], sig.text) == 0) {
					*hypervisor = id;
					break;
				}
			}
		}

		return 1;
	}
#endif
	return 0;
}

static PyObject *
do_detect_hypervisor() {
	/*
		Get hypervisor from the cpuid command.
	*/
	int hypervisor = HYPER_NONE;

	int r = detect_hypervisor(&hypervisor);
	if (r >= 1) {
		const char* hypervisor_vendor = hypervisor_vendors[hypervisor];
		if (!hypervisor_vendor)
			Py_RETURN_NONE;

		return PyUnicode_FromString(hypervisor_vendor);
	}

	Py_RETURN_NONE;
}

static PyObject *
do_get_harddisk_serial(PyObject *o, PyObject *args) {
	/*
		Python wrapper around read_harddisk_serial.
	*/
	static struct hd_driveid hd;
	const char *device = NULL;
	char serial[22];

	if (!PyArg_ParseTuple(args, "s", &device))
		return NULL;

	int fd = open(device, O_RDONLY | O_NONBLOCK);
	if (fd < 0) {
		PyErr_Format(PyExc_OSError, "Could not open block device: %s", device);
		return NULL;
	}

	if (!ioctl(fd, HDIO_GET_IDENTITY, &hd)) {
		snprintf(serial, sizeof(serial) - 1, "%s", (const char *)hd.serial_no);

		if (*serial) {
			close(fd);
			return PyUnicode_FromString(serial);
		}
	}

	close(fd);

	Py_RETURN_NONE;
}

static PyMethodDef fireinfo_methods[] = {
	{ "detect_hypervisor", (PyCFunction) do_detect_hypervisor, METH_NOARGS, NULL },
	{ "get_harddisk_serial", (PyCFunction) do_get_harddisk_serial, METH_VARARGS, NULL },
	{ NULL, NULL, 0, NULL }
};

static struct PyModuleDef fireinfo_module = {
	.m_base = PyModuleDef_HEAD_INIT,
	.m_name = "_fireinfo",
	.m_size = -1,
	.m_doc = "Python module for fireinfo",
	.m_methods = fireinfo_methods,
};

PyMODINIT_FUNC PyInit__fireinfo(void) {
	PyObject* m = PyModule_Create(&fireinfo_module);
	if (!m)
		return NULL;

	return m;
}

I would suggest raising a bug on this.

https://wiki.ipfire.org/devel/bugzilla
https://bugzilla.ipfire.org/

Your IPFire People email address and password will act as your login credentials for the IPFire Bugzilla.

xperimental · 21 June 2023 11:42

What’s your host OS and what qemu version do you use? I may install a test system with qemu as well to see if this is a regular problem with qemu or just you.

mc5686 · 21 June 2023 11:51

I am running qemu via LXD on a very basic Debian Bookworm installation.
LXD was installed from packages (apt install lxd lxd-tools btrfs-progs qemu-system-x86 qemu-utils qemu-system-gui) and not via SNAP.
Starting QEMU under LXD is rather convoluted.

I can provide the whole process I used to install everything, if deemed useful, as I’m keeping detailed notes.

Note: int detect_hypervisor(int *hypervisor) tries to access "/sys/hypervisor/type" which does not exist on my IPFire (/sys/hypervisor/ exists but is empty).
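
That matches how the C code quoted above treats the file: a missing /sys/hypervisor/type yields -ENOENT, which is the one error that lets detection continue to the CPUID check. A rough Python model of just that step follows. It is a sketch for following the control flow, and the result strings are invented for illustration.

```python
import errno
import re


def read_one_line_file(path):
    """Model of the C helper: (0, first line) on success, (-errno, None) on failure."""
    try:
        with open(path, "rb") as f:
            raw = f.readline(2047)  # fgets() on a 2048-byte buffer
    except OSError as e:
        return -e.errno, None
    # truncate_nl(): cut at the first '\n' or '\r'
    line = re.split(rb"[\r\n]", raw, maxsplit=1)[0]
    return 0, line.decode(errors="replace")


def sysfs_step(path="/sys/hypervisor/type"):
    """Model of the first block of detect_hypervisor(); result strings are illustrative."""
    r, hvtype = read_one_line_file(path)
    if r >= 0:
        if hvtype == "xen":
            return "xen"
    elif r != -errno.ENOENT:
        return "error %d" % r  # any other error aborts detection
    return "continue with CPUID"


if __name__ == "__main__":
    print(sysfs_step())
```

So an empty /sys/hypervisor/ directory is not a problem in itself; it only means the CPUID branch is the one that runs.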

Should I open the bug nonetheless or should I wait for your findings?

Thanks for your patience.
Mauro

xperimental · 21 June 2023 12:11

I was confused by “Bookworm”, but it’s the name of Debian version 12.

Never used LXD before, and honestly I don’t know why you use a VMP (virtual machine player) together with a VMM (virtual machine manager).

Sure, please do, so I don’t need to find out for myself.

There are different opinions about that. I always try to cross-check the situation before raising a bug.

mc5686 · 21 June 2023 12:27

This is surely much more than you need, but I’m sending the whole thing as I have it:

docs.tar.gz (163.9 KB)

Please feel free to ask for further details, if needed.

Note: any comment on the document, including, but not limited to, clarity and completeness, would be very welcome.

Regards
Mauro

xperimental · 21 June 2023 12:46

OK I will try that tomorrow.

tphz · 21 June 2023 13:16

ChatGPT hint:

The segmentation fault error in line 25 of this code can be caused by attempting to read a value from an invalid pointer or making an invalid memory reference. In this case, the error may occur in the truncate_nl function, which is called in line 25.

In the truncate_nl function, the assert function is used to check if the s pointer is not equal to NULL. If this condition is not met, a segmentation fault error can occur at this stage.

To fix this issue, you can add a proper check to ensure that the s pointer is valid before calling the assert function. For example:

static void truncate_nl(char *s) {
	if (s != NULL) {
		s[strcspn(s, NEWLINE)] = '\0';
	}
}
This change will check if the s pointer is not NULL before using it in the strcspn function call. If the s pointer is NULL, the truncate_nl function will not attempt to access memory at that address, which should prevent the segmentation fault error.

Edit:

Could someone please check this?
If the hint is wrong I will delete my post.

BR

bbitsch · 21 June 2023 13:39

Nice idea from ChatGPT.

This suppresses the error message in case of a wrong pointer. Not really useful, IMO.
The check should be placed in the caller of truncate_nl, with an error message(!).

mc5686 · 21 June 2023 13:45

Apparently ChatGPT needs to study some more.

It is right that the parameter passed to truncate_nl() is not checked in the function (just asserted), but there’s only one call to this static function, and the argument is checked right before the call:

    ...
	char *c = strdup(t);
	if (!c)
		return -ENOMEM;

	truncate_nl(c);
    ...

I wasn’t able to find any other references.

Analyzing the code, it seems there is a problem:

In my case /sys/hypervisor/type is not present, so the __asm__ check is performed.
I am under a hypervisor (QEMU), so presumably has_hypervisor is TRUE.
If sig.text is not empty, the check loop is entered.
If the string is not found, the loop will eventually try strcmp(hypervisor_ids[HYPER_OTHER], sig.text),
but hypervisor_ids[HYPER_OTHER] = NULL.
BOOOOMMM!!!
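
The path described in these steps can be replayed with a small Python model of the C loop. This is only a sketch of the reasoning, not the fireinfo code and not a proposed patch. The table mirrors hypervisor_ids from the C source above, while the helper names and the signature string "SomeOtherHv" are invented for the example.

```python
from enum import IntEnum


class Hyper(IntEnum):
    NONE = 0
    XEN = 1
    KVM = 2
    MSHV = 3
    VMWARE = 4
    OTHER = 5
    LAST = 6  # loop bound only, has no table entry


# Mirrors hypervisor_ids[] in the C source; None stands for NULL.
HYPERVISOR_IDS = [None, "XenVMMXenVMM", "KVMKVMKVM", "Microsoft Hv", "VMwareVMware", None]


def entries_compared(signature):
    """List the table entries the original loop hands to strcmp(), in order."""
    compared = []
    for hv in range(Hyper.NONE + 1, Hyper.LAST):
        entry = HYPERVISOR_IDS[hv]
        compared.append(entry)
        if entry is None:
            break  # in C this is strcmp(NULL, sig.text): the crash
        if entry == signature:
            break  # match found, the C loop breaks here too
    return compared


def match_signature(signature):
    """Guarded variant: skip NULL slots and fall back to OTHER."""
    for hv in range(Hyper.NONE + 1, Hyper.LAST):
        entry = HYPERVISOR_IDS[hv]
        if entry is not None and entry == signature:
            return Hyper(hv)
    return Hyper.OTHER


if __name__ == "__main__":
    print(entries_compared("KVMKVMKVM"))
    print(entries_compared("SomeOtherHv"))
    print(match_signature("SomeOtherHv").name)
```

entries_compared("KVMKVMKVM") stops at the KVM entry, while any signature missing from the table runs on to the NULL slot for HYPER_OTHER. Skipping NULL entries, or ending the loop before HYPER_OTHER, would presumably avoid the crash in the C code as well.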

Can someone cross-check my analysis?

Regards
Mauro

cfusco · 21 June 2023 13:51

OT, and a bit pedantic. Also, tongue in cheek with your joke. ChatGPT does not think at all, so studying won’t help here. It is just a gigantic probability function that can be modified by providing a training set. For example, by providing your input to the model, you could change its probability functions to take into account the response for a similar situation in the future.

mc5686 · 21 June 2023 14:07

Surely OT and probably even more pedantic:

AFAIK ChatGPT heavily employs SNNs, and thus calling it “a probability function” is probably very belittling.
… in any case I find it difficult to see a significant difference between “studying” and “applying a training set”, but that may be because I’m not a native English speaker.

cfusco · 21 June 2023 14:23

Definitely, but a neural network ultimately assigns a likelihood to the next token, given all the preceding ones, and this value is what ultimately determines its choice of words. Being pedantic to the third order, ChatGPT does not make heavy use of Spiking Neural Networks; it takes a different approach, but this is well above my head.