Monitoring systemd logs

December 30, 2020

Systemd logs: how to monitor them with Pandora FMS

Systemd logs, without exception, can be monitored with the proof of concept that we will see next. But what is systemd and how does it work?

I will tell you in advance that I do not want to create any controversy, so I will keep my opinions on systemd to myself. Back in 2014, when I formally devoted myself to studying free software, I received classes from instructors and professors who “grew up” using a very practical initialization process (init for short) called sysvinit (also known as System V initialization, System V or SysV). That is as far as Debian is concerned, which was the distribution we studied. Let me briefly describe what made sysvinit so unique.

Sysvinit

Sysvinit was an init program that, after the Linux kernel was loaded into memory and ready to work, was in charge of executing the rest of the processes (what every init does). It was directly inspired by Unix® System V and, not surprisingly, it was configured by means of scripts that had to run in the correct order to start everything necessary on a server. For example, you had to bring up the network interfaces before loading any network services. Once everything was running, sysvinit stepped back and stayed mostly idle, using hardly any memory or processor time. Anything else you needed had to be executed by your direct order. It was a slow and very orderly process, but how many times do we reboot a GNU/Linux® server in a day, a week or a month? The “cost” of the slow start-up was repaid later in performance.

Another peculiarity was that the process was sequential: only once a service or daemon reported that it was ready and running did init go on to the next one. This was the main criticism: simple things like time synchronization with another server (Network Time Protocol) would stall the entire boot process if the network cable was disconnected, if there was no internet connection, or if the time server was off or offline (or for any other cause).

Apart from all this, you must also consider the cost of executing BASH scripts that call the same commands over and over again (cat, grep, etc.). Such scripts are easy for us to write, but they are interpreted rather than compiled to machine language, and that also slows the system down at startup.

For this reason, a new init called Upstart was developed and shipped in Ubuntu 6.10, but that is another story since another change was coming…

Systemd

In the GNU/Linux world, upper and lower case matter a lot, so systemd is written exactly like that (not to be confused with System D, which is something else). It was born in 2009 mainly by the hand of Lennart Poettering (very active on Twitter), Kay Sievers (who worked for Novell), Harald Hoyer (who works for Red Hat) and Dhaval Giani (ex-IBM employee).

The Fedora 15 distribution was the first to use this new init. What did systemd do, and what does it still do, differently? In my own words, and with the permission of engineers, graduates and computer scientists, the following.

In Unix®, there is this concept of socket to denote a point where an application connects with another; there are even web sockets for the same purpose, but located on different computers.

Well, what systemd does is create the sockets first, before the daemons. Yes, just like the old joke: “what came first, the chicken or the egg?” Once this is done, the daemons are launched and, when a daemon queries a socket whose owner daemon has not started yet, it simply waits until the owner is ready, and go!
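The “socket first, daemon later” idea can be illustrated with a minimal Python sketch. This is not systemd's code, only the kernel behaviour it relies on: a listening socket created ahead of time accepts and queues a connection before any “daemon” is ready to serve it. The function name and the choice of a loopback TCP socket are assumptions made for the illustration.

```python
import socket


def socket_first_demo(message: bytes = b"hello") -> bytes:
    """Create the socket first, let a client talk to it, start the 'daemon' last."""
    # 1. The listening socket exists before any daemon does.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    # 2. A client connects and writes; the kernel queues connection and data.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(message)
    client.shutdown(socket.SHUT_WR)  # signal that the request is complete

    # 3. Only now does the "daemon" start and pick up the waiting request.
    listener.settimeout(5)
    connection, _address = listener.accept()
    connection.settimeout(5)
    chunks = []
    while True:
        chunk = connection.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)

    connection.close()
    client.close()
    listener.close()
    return b"".join(chunks)
```

Nothing was lost even though the reader started after the writer had finished; that queueing is what lets an init start services without waiting for each one to report that it is ready.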

You can imagine then that systemd must be constantly checking that each daemon has finished loading into memory: if a daemon depends on several other daemons, it is likely to be the last one to be loaded on the server.

Recall that systemd’s job is to load each and every daemon, but what happens if any of them goes wrong? Systemd should be able to stop it and restart it, and that is when the problems appear: did that daemon relaunch itself? Did it launch another instance? That is a big problem for systemd because it cannot tell whether the daemon grew, developed and forked and is running perfectly fine.

The solution had been presented a year before and no, it had nothing to do with systemd: it was control groups (abbreviated as cgroups).

Systemd-components ( Image courtesy Wikimedia Commons, CC BY-SA 3.0 )

As you can see, cgroups, autofs and kdbus sit deep in the Linux kernel. I would really like to tell you about the whole graphic… I will only add the caveat that this image may not reflect the current architecture of the heart of *nix (kdbus, for instance, was never merged into the mainline kernel).

Returning to cgroups, this element was created as a way to control the resources (memory, processor usage, etc.) of groups of processes, which made it possible to host “containers”. This resulted, mainly, in Docker, in alternatives such as Podman and in popular “orchestrators” such as Kubernetes (containers are lighter than virtual machines and can run in a coordinated way in clusters).
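To see why cgroups solve the “did that daemon fork?” problem, remember that a child process inherits the control group of its parent, so every process started by a service stays labelled with that service. On Linux the label can be read from /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path. The sketch below parses that text; the helper names are hypothetical and the sample paths in the usage note are only examples.

```python
from typing import List, NamedTuple, Optional


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_file(text: str) -> List[CgroupEntry]:
    """Parse the contents of /proc/<pid>/cgroup into structured entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice: the path itself may contain ':' characters.
        hierarchy, controllers, path = line.split(":", 2)
        names = [name for name in controllers.split(",") if name]
        entries.append(CgroupEntry(int(hierarchy), names, path))
    return entries


def owning_service(text: str) -> Optional[str]:
    """Return the *.service component of the cgroup path, if there is one."""
    for entry in parse_cgroup_file(text):
        for part in reversed(entry.path.split("/")):
            if part.endswith(".service"):
                return part
    return None
```

For example, owning_service("0::/system.slice/ssh.service") returns "ssh.service", and it returns the same answer for every child that the service forks, however many times it forks.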

That way (and I have described it very broadly) systemd can control absolutely everything that happens under its mandate and, since power corrupts, the next step (the “dictatorial regime”) of systemd was about to begin.

Alpha and omega

The strongest criticism of systemd is that it became the first and the last, the alpha and the omega in the world of the Linux kernel. You already know the saying “don’t put all your eggs in one basket”: systemd is the first process to start, with process identifier one (Process IDentifier 1 or PID 1), and every other process descends from it; those processes eventually finish and, at the end, systemd shuts down the computer.

During the entire time the computer is on, systemd is in charge of keeping all the services running and monitoring their performance: for all of them it already has a socket ready, so it can start a process as soon as a request arrives. For example, bluetooth is something that you will hardly use on a server, so the socket will be created, but until you actually connect a device no program will be started (even so, systemd detractors point out that this also consumes some memory and CPU cycles).

For Pandora FMS, in its High Availability Cluster mode (Pandora FMS HA), systemd is extremely important: it is responsible for ensuring that the pandora_ha service is always running to monitor the server cluster (restarting Pandora FMS HA, if necessary).

Some of the major Linux distributions, apart from Fedora (and Red Hat), also use systemd.

However, I must highlight (although here I always speak of sysvinit in the past tense) that, at the time of writing this article, one Debian-based distribution still uses it (and optionally openrc, another init): Devuan.

As incredible as it may seem, you may install a Devuan server offline from a 670-megabyte compact disc. If you want, you may download a graphical interface such as Xfce or MATE on both disks (there are more graphical interfaces available).

Units in systemd

Going into the general features of systemd, it uses so-called units to start and monitor the system:

  1. service: Apart from taking care of all daemons, it also supports SysV scripts, in the case of a program that still uses that technology.
  2. socket: Which I already talked about earlier.
  3. device: For device management and control.
  4. mount: This unit encapsulates a mount point in the file system hierarchy. Systemd monitors all mount points as they come and go, and it can also be used to mount or unmount them.
  5. automount: This type of unit encapsulates an automount point in the file system hierarchy.
  6. target: This type of unit is used for logical grouping of units; instead of doing something by itself, it just refers to other units, so they can be controlled together.
  7. snapshot: Similar to target units, snapshots do nothing by themselves and their only purpose is to refer to other units. They are useful for saving states or checkpoints to restore in an emergency or at the user’s request (note that snapshot units were removed in systemd 228).
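The type of a unit is given by the suffix of its name (ssh.service, ssh.socket, multi-user.target and so on). As a small illustration, the following Python sketch groups unit names by type; it only knows the seven types listed above, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List

# Only the unit types listed in this article; systemd defines more.
UNIT_TYPES = ("service", "socket", "device", "mount",
              "automount", "target", "snapshot")


def unit_type(unit_name: str) -> str:
    """Return the type of a unit, taken from the suffix of its name."""
    name, dot, suffix = unit_name.rpartition(".")
    if not dot or not name or suffix not in UNIT_TYPES:
        raise ValueError(f"not a recognized unit name: {unit_name!r}")
    return suffix


def group_by_type(unit_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group unit names by type, keeping the order in which they were given."""
    grouped: Dict[str, List[str]] = {}
    for unit_name in unit_names:
        grouped.setdefault(unit_type(unit_name), []).append(unit_name)
    return grouped
```

A call such as group_by_type(["ssh.service", "ssh.socket", "cron.service"]) puts both services under "service" and the socket under "socket".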

I recommend reading the article devoted to demystifying systemd, which was written in response to its significant number of opponents.

Systemd and Pandora FMS logs

If you ever want to contribute to the development of systemd code, you can start by reading the coding standards (the programming style guide) on the project’s official site on GitHub. There you will find delicious details, such as the rule that you must indent with eight spaces (although there are exceptions), or much more important things, like which specific C-language functions you must use.

In your case, you may want to get to know the systemd logs, whose layout is specified in the Journal File Format document of systemd. It explains, in depth, the whole process of logging events, messages, errors, warnings, etc., not only for systemd but for everything that systemd executes.

Now, let’s get practical. First of all, systemd logs are written in a binary format and can be forwarded over the network to any other machine that you have designated to collect them… that reminds us of syslog (specifically syslog-ng, the new generation of syslog, which can forward logs over the network). When persistent storage is enabled, systemd logs are saved in “/var/log/journal” (otherwise they live in “/run/log/journal” and are lost on reboot), and you may see their size with the command sudo journalctl --disk-usage.
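If you would rather collect that figure from a script, a rough approximation is to add up the sizes of the files under the journal directory. The Python sketch below does that for any directory; it is an assumption-laden stand-in, not what journalctl does internally, and its result may differ somewhat from the number that journalctl --disk-usage reports.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all files below root (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # The file was rotated or removed while we were walking.
                continue
    return total
```

Calling directory_size_bytes("/var/log/journal") usually requires root privileges or membership in a group allowed to read the journal.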

Although you may enable more detailed logging, as well as set limits on disk space consumption, retention time, etc., those who monitor with Pandora FMS may simply use its log monitoring.

Starting with version 7.0 NG 712, Pandora FMS incorporates ElasticSearch to store log information, and starting with Pandora FMS 7.0 NG update 717, a new component appears: Pandora FMS SyslogServer. The main advantage of Pandora FMS SyslogServer consists in complementing log unification. This component allows Pandora FMS to analyze the syslog of the machine where it is located, analyzing its content and storing the references in your ElasticSearch server.

Having this powerful monitoring tool, you may configure syslog-ng to read the systemd logs directly. Or even better, configure journald.conf to forward messages to syslog-ng. I will tell you in advance that there are many additional details, but this article does not intend to be a manual or wiki: in the documentation of both programs you may find precise instructions.

Setting systemd logs to syslog in journald conf
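In journald.conf the relevant setting is ForwardToSyslog=yes, under the [Journal] section. As a quick sanity check, the Python sketch below reads the text of such a file and tells whether forwarding is switched on. It assumes that an unset option means “not forwarded” (distributions may compile in a different default), it ignores drop-in files, and the function name is hypothetical.

```python
import configparser


def forwards_to_syslog(journald_conf_text: str) -> bool:
    """Return True if the given journald.conf text enables ForwardToSyslog."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(journald_conf_text)
    # Assumption: a missing or commented-out option counts as "no".
    value = parser.get("Journal", "ForwardToSyslog", fallback="no")
    return value.strip().lower() in {"1", "yes", "y", "true", "t", "on"}
```

You could feed it the contents of /etc/systemd/journald.conf; remember that journald has to be restarted before a change to the file takes effect.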

Systemd logs and containerized virtualization

Additionally, you may monitor containers in general, Docker in particular, or both at the same time, because these containers generally do not include systemd, in order to save memory and processor cycles. Again, you may read the specific details in the Docker documentation and in the documentation for systemd-machined.service. Always remember that you have Pandora FMS SyslogServer and ElasticSearch to handle the huge amount of data that you will receive, and be aware that network traffic will also increase.

Systemd logs and Pandora FMS Software Agents

To finish off, you may check systemd logs with the command journalctl and filter information with BASH and its GNU commands. Let’s say you want to see the latest startups and shutdowns of your computer.
sudo journalctl --list-boots

You may filter by date with the parameter --since, not only using a date but also using relative expressions in quotation marks: if you want to see the logs since yesterday you should use --since "yesterday"; or you may select by unit with --unit followed by the name of a service, for example.
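When these queries are launched from a script, it helps to build the command as a list of arguments, so no shell quoting is involved. The following Python sketch does only that; the function name is hypothetical, while --since, --unit, --output and --no-pager are standard journalctl options.

```python
from typing import List, Optional


def build_journalctl_command(since: Optional[str] = None,
                             unit: Optional[str] = None,
                             json_output: bool = False) -> List[str]:
    """Build an argument list for journalctl without going through a shell."""
    command = ["journalctl", "--no-pager"]
    if since:
        command += ["--since", since]  # a date or a word such as "yesterday"
    if unit:
        command += ["--unit", unit]
    if json_output:
        command += ["--output", "json"]  # one JSON object per line
    return command
```

The resulting list can be passed directly to subprocess.run(command, capture_output=True, text=True) on a machine where journalctl is available.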

Pandora FMS software agents are small programs, installed on your computers with minimal impact, that retrieve metrics. You may check how long each unit, the agent included, took to start with the command systemd-analyze blame:
systemd-analyze blame
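The output of systemd-analyze blame is plain text with one unit per line: a time span such as 312ms, 6.468s or 1min 2.105s, followed by the unit name. A minimal Python sketch for turning one of those lines into a number of seconds could look like this; it handles only the h, min, s, ms and us suffixes, and the unit names used with it here are made up.

```python
import re
from typing import Tuple

_SECONDS_PER = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001, "us": 0.000001}
_TIME_PART = re.compile(r"([0-9]+(?:\.[0-9]+)?)(h|min|ms|us|s)")


def parse_blame_line(line: str) -> Tuple[str, float]:
    """Split one 'systemd-analyze blame' line into (unit name, seconds)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError(f"unexpected blame line: {line!r}")
    *time_parts, unit = tokens  # the last token is the unit name
    seconds = 0.0
    for part in time_parts:
        match = _TIME_PART.fullmatch(part)
        if match is None:
            raise ValueError(f"unexpected time span: {part!r}")
        seconds += float(match.group(1)) * _SECONDS_PER[match.group(2)]
    return unit, seconds
```

With the seconds as a number, a module can report the start-up time of a single unit or raise an alert when it exceeds a threshold.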

Taking less than a third of a second to start (as seen in the output of the command above), a Pandora FMS software agent can retrieve important data. With journalctl you may export your queries in JSON format so that other software can parse them (such a program is known as a parser) and quickly retrieve precise data and fields, for example to feed a plugin developed for Pandora FMS software agents.
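As an illustration of such a parser, journalctl --output json prints one JSON object per line, with fields such as MESSAGE, PRIORITY and _SYSTEMD_UNIT. The Python sketch below counts, per unit, the entries whose priority is “error” or worse (3 or lower). Treating entries without a PRIORITY field as informational is an assumption of this sketch, and how the counts reach a Pandora FMS agent is deliberately left out.

```python
import json
from collections import Counter


def count_errors_per_unit(json_lines: str, max_priority: int = 3) -> Counter:
    """Count journal entries per unit whose PRIORITY is <= max_priority."""
    counts: Counter = Counter()
    for line in json_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        try:
            # PRIORITY is exported as a string; assume 6 (info) when missing.
            priority = int(entry.get("PRIORITY", 6))
        except (TypeError, ValueError):
            continue  # skip entries with an unusable priority value
        if priority <= max_priority:
            counts[entry.get("_SYSTEMD_UNIT", "unknown")] += 1
    return counts
```

The input can be the standard output of a journalctl call with JSON output enabled, read in one go or line by line.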

In this excellent article, there are examples of how to use BASH to develop modules for Pandora FMS software agents that run on a small Raspberry Pi board computer. I hope you like it.

The popular Python programming language also has a library (python-systemd) that you may import with:

from systemd import journal

That way you can access systemd logs directly, without using journalctl, and deliver the results to the Pandora FMS software agent.
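A minimal sketch of that approach follows, assuming the python-systemd bindings are installed. It uses journal.Reader with its this_boot, log_level and add_match methods; the helper names format_entry and recent_errors are hypothetical, and the way the lines are handed over to a Pandora FMS agent is left out.

```python
from datetime import datetime

try:
    from systemd import journal  # provided by the python-systemd package
except ImportError:  # keeps the sketch importable where the bindings are missing
    journal = None


def format_entry(entry: dict) -> str:
    """Turn one journal entry (a dict of fields) into a single text line."""
    when = entry.get("__REALTIME_TIMESTAMP")
    stamp = when.isoformat() if isinstance(when, datetime) else "unknown-time"
    unit = entry.get("_SYSTEMD_UNIT", "unknown-unit")
    return f"{stamp} {unit}: {entry.get('MESSAGE', '')}"


def recent_errors(unit: str, limit: int = 10) -> list:
    """Return the last error-level messages logged by unit during this boot."""
    if journal is None:
        raise RuntimeError("python-systemd is not installed")
    reader = journal.Reader()
    reader.this_boot()                    # only entries from the current boot
    reader.log_level(journal.LOG_ERR)     # priority "error" or worse
    reader.add_match(_SYSTEMD_UNIT=unit)  # only entries from this unit
    lines = [format_entry(entry) for entry in reader]
    return lines[-limit:]
```

For instance, recent_errors("ssh.service") would return up to ten formatted lines on a machine where the journal is readable by the current user.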

Do not hesitate to send us your questions. Pandora FMS team will be happy to help you!

