Sos report for collecting troubleshooting information?

I’ve been doing a comparison of what sosreport collects and what eos-diagnostics collects. So far, I’ve just added flatpak and ostree support to sos. It’s very close to collecting everything you need - and could be further customized with an Endless OS-specific collection routine. It also supports automatic uploading (both via HTTPS and SFTP).

The biggest difference is that instead of a single file it would be a tar.xz archive. I understand that may impact how support is currently done on the community discourse.
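To illustrate how the single-file workflow could be kept on top of an archive, here is a minimal sketch that concatenates every regular file in a tar.xz archive into one text file with a header per member. It uses only the Python standard library and is not part of sos or eos-diagnostics; the function name `flatten_archive` and the `max_bytes` cap are assumptions made for this example.

```python
import tarfile


def flatten_archive(archive_path, out_path, max_bytes=1_000_000):
    """Write all regular files from a tar.xz archive into one text file.

    Hypothetical helper for illustration only. Each member is preceded by
    a "===== name =====" header; at most max_bytes are read per member.
    Returns the number of files written.
    """
    count = 0
    with tarfile.open(archive_path, "r:xz") as tar, \
            open(out_path, "w", encoding="utf-8") as out:
        # Sort by name so the output order is stable between runs.
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            if not member.isfile():
                continue  # skip directories, symlinks, device nodes
            handle = tar.extractfile(member)
            if handle is None:
                continue
            data = handle.read(max_bytes)
            out.write(f"===== {member.name} =====\n")
            # Logs are not guaranteed to be valid UTF-8, so replace bad bytes.
            out.write(data.decode("utf-8", errors="replace"))
            if not data.endswith(b"\n"):
                out.write("\n")
            count += 1
    return count


# Example call (the file names here are placeholders):
# flatten_archive("report.tar.xz", "report.txt")
```

The resulting text file can be searched with less in the same way as the current single-file report, at the cost of one extra step after download.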

Sosreport may help collect information in other scenarios - such as if you need more logs from a machine, or want to verify the integrity of packages (--verify flag). There are also a number of tools to help with analysis.

Thoughts on the usefulness of including sos? If you have any questions feel free to ask them here or on the project IRC/issues.

Hi Bryan,

First of all, I think that sosreport is a nice tool for generating in-depth reports for support cases, but I have some doubts, mainly for the following reasons.

The current system works quite well for our use-cases:

  1. Most of the time, the information in the diagnostic is sufficient to either solve the issue or get a grasp of what the cause could be - in which case additional diagnostic procedures are worked out on a per-case basis.
  2. It’s a simple system, containing only a single file with all the information, which makes it easy to do a quick analysis. Personally, I often use my phone or tablet to just download the file and take a look at an issue. Having an archive which contains multiple files would make such a quick look harder.
  3. sosreport seems to be a complex piece of software with many dependencies. Again, personally, I like to follow the KISS principle when possible (CLOC count: sosreport ~ 34k, eos-diagnostics ~ 1k).

These are just my personal opinions.

Thanks for taking a look! Absolutely, reasons 1 and 2 make perfect sense. If eos-diagnostics doesn’t need to evolve more there is no compelling reason to change - if it does, then there might be.

3 - There are no dependencies that are not already installed. (And even then it’s really just python3… the other dependencies are only needed for collecting from multiple nodes at once or for testing.)

That’s interesting, I’ve never seen sosreport before. The single log file from eos-diagnostics is useful since you don’t have to unpack a tarball, but I actually find it unwieldy. I’ve tried reading them on my phone before but really struggle. In order to actually find what I want in there, I always end up downloading it and reading it with less or something where I can skip around to where I expect the issue might be. I wouldn’t actually mind if there were several more targeted files in there since I have to download it to do anything useful with it, anyways.

I personally like that it’s in python as that’s something I’m comfortable with. Our script was constructed in GJS at a time when we had a lot of mindshare there. Of course, reimplementing all the various things in eos-diagnostics as sosreport plugins would be an undertaking.

It’s a lot closer than you might think.

Instead of udisksctl dump we use udevadm and other commands (lands in /sos_commands/block/)

Sos does not currently collect:
/proc/bus/input/devices (need to investigate further)
Sos sound collection might need some improvement. eos-diagnostics collects a lot more active info (we collect mostly config).
Intel HDA -> we collect /proc/asound as part of the soundcard plugin. But don’t collect sound bits from /sys/
Codecs - I don’t immediately see why it’s useful
zramctl (although we collect a lot of other memory bits so this might be covered).
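As a rough illustration of how small the remaining gap is, the sketch below gathers two of the items listed above, /proc/bus/input/devices and the output of zramctl, into a directory. It is plain standard-library Python and does not use the sos plugin API; the function name `collect_extras` and the `files/` and `commands/` layout are assumptions made for this example.

```python
import shutil
import subprocess
from pathlib import Path

# Items from the list above; anything missing on the machine is skipped.
EXTRA_FILES = ["/proc/bus/input/devices"]
EXTRA_COMMANDS = [["zramctl"]]


def collect_extras(dest, files=EXTRA_FILES, commands=EXTRA_COMMANDS):
    """Copy files and capture command output under dest.

    Hypothetical helper for illustration only. Returns the collected
    paths, relative to dest.
    """
    dest = Path(dest)
    collected = []
    for name in files:
        src = Path(name)
        if not src.is_file():
            continue
        try:
            data = src.read_bytes()
        except OSError:
            continue  # unreadable, e.g. permission denied
        target = dest / "files" / str(name).lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        collected.append(str(target.relative_to(dest)))
    for argv in commands:
        if shutil.which(argv[0]) is None:
            continue  # command not installed
        try:
            result = subprocess.run(argv, capture_output=True, text=True,
                                    timeout=30, check=False)
        except (OSError, subprocess.TimeoutExpired):
            continue
        # Build a flat file name from the command line.
        target = dest / "commands" / "_".join(argv).replace("/", "_")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(result.stdout + result.stderr, encoding="utf-8")
        collected.append(str(target.relative_to(dest)))
    return collected
```

In a real sos plugin the same two items would presumably be declared through the plugin's own file and command collection calls rather than copied by hand.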

The following I think would be an Endless-specific plugin:
Endless OS image
OSTree split system
PAYG status