Graylog2 VS Pandora FMS: a detailed comparison
Brief history of syslog and Graylog2
Before introducing Graylog2, we first have to look at what syslog is. Its history dates back to 1981, when Eric Paul Allman was working at the University of California, Berkeley, and developed Sendmail, a piece of software that would become a predecessor of modern email services. He therefore needed an application to report any event from each server on which Sendmail was running.
UNIX could generate its own messages from within the kernel, and these were stored in text files on the file system. Eric Allman developed a program that runs in the background (a daemon), called syslogd, which was in charge of reading the files containing those messages, and he created the syslog protocol to send them to other computers (called collectors) for analysis. You can learn more about this by reading RFC 5424, which describes the layers and how they work together (note: the RFC indicates that all the components of syslog can reside on a single computer, although the image shows two computers communicating with each other. We must not confuse syslog with syslogd).
Syslog became so popular that the collector soon needed to handle not just the logs from the server running Sendmail but also the logs from the clients connected to that email server, since merging both chronologically was (and still is) very useful for tracking errors in the code. This gave syslog a very different future from the one it was originally meant to have.
Graylog2 was created by Lennart Koopmann in mid-2009, when he decided to write his own software because of the high cost of monitoring software. According to him, there were no open source offerings in this sector, although we can assure you that Pandora FMS was already at version 3.0 at the time.
Its official website is www.graylog2.org (although the site redirects you automatically to www.graylog.org) and its Twitter account is @Graylog2. The name change happened on 16 January 2015 with the release of version 1.0 (beta). In this article we’ll mostly use its original name, Graylog2. Over the last two years its user base has grown explosively; version 2.0 was released on 27 April 2016, and the current version is 2.3.2 (as of 19 October 2017).
Graylog2 operation
Both proprietary software and open software are going to stay with us for a long time and that’s Graylog2’s bet. Right, let’s crack on by talking about security and performance!
syslog, syslog-ng, rsyslog, logrotate and nxlog
Being free software, Graylog2 is not a ‘black box’ controlled by third parties: we are the ones who have control over it. Thus, the data we feed into this software is our responsibility, and ours alone.
Keeping this in mind, we need to monitor many devices, both inside our local network and on other networks, and syslog only allows sending information in UDP packets. These packets provide neither an acknowledgement of receipt nor a handshake, so we can’t use a secure encrypted protocol to ensure that our information doesn’t end up in someone else’s hands. We might think of setting up a VPN (Virtual Private Network) everywhere to send them, but even so, every device will simply keep sending information: if the network fails or our Graylog2 server goes down, those packets are of no use for our purposes.
That is why, besides syslog, other solutions appeared that send packets via TCP, which can carry secure protocols and guarantee the delivery of every packet sent. They also add other features, such as delivering those messages straight into a database engine like MySQL, or preprocessing files and even choosing the names under which they are saved, so they can be stored in an organised way across many devices.
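To make the UDP versus TCP difference concrete, here is a minimal Python sketch. It assumes an RFC 5424 style message (where the priority value is facility * 8 + severity) and newline framing for TCP; the host names, ports and function names are illustrative and do not come from any of the tools discussed here.

```python
import socket
from datetime import datetime, timezone


def build_syslog_message(facility: int, severity: int, hostname: str,
                         app_name: str, text: str) -> bytes:
    """Format a minimal RFC 5424 style line; PRI is facility * 8 + severity."""
    pri = facility * 8 + severity
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
    # PROCID, MSGID and structured data are left as "-" (not provided).
    line = f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {text}"
    return line.encode("utf-8")


def send_udp(message: bytes, host: str, port: int) -> None:
    """Fire and forget: no handshake and no acknowledgement of receipt."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))


def send_tcp(message: bytes, host: str, port: int) -> None:
    """Connection-oriented: raises an OSError if the collector is unreachable."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # Newline framing is an assumption; collectors may expect other framing.
        sock.sendall(message + b"\n")
```

With `send_udp` the sender usually gets no indication that the collector is down, whereas `send_tcp` fails at connection time, which is what lets a forwarder queue and retry.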
Syslog-ng has an open version and an enterprise version (the latter with additional modules); it is available for Linux, and there is even a version for Windows running under Cygwin. Despite the many other existing alternatives, we’ll focus on rsyslog, which appeared in 2004 at the hands of Rainer Gerhards as a direct competitor to syslog-ng and nowadays comes preinstalled on Debian, CentOS and Ubuntu 16 (a Debian-based distro), where Logrotate is also included by default. For practical purposes, the event logs on each computer stop being important once they’ve been sent to Graylog2.
What are we trying to accomplish with this explanation, and what does it have to do with Graylog2? Well, Logrotate, as its name hints, is in charge of rotating and compressing the logs and erasing them periodically. If this weren’t done, our disk would end up full. The proprietary Windows operating system, from version 2000 onwards, has its own integrated service covering:
- Application event log (for example, failures when accessing a database with MS SQL).
- Events related to Active Directory, a technology that allows managing hundreds of computers by using domains with the help of DNS. Imagine the amount of data generated by one of these servers handling a tree or a whole forest of these domains!
- Events related to DNS (in a separate category), whether or not Active Directory is in use.
- File replication: handy distributed backups which synchronise with each other.
- Security, for administrators: everything related to access to the operating system, such as user management, failed password attempts, etc.
- Last but not least: the events related to the operating system and its interaction with the hardware, such as hard drive S.M.A.R.T. data, uptime, restarts and forced shutdowns, etc.
We can even save our own test events in Windows. With the required admin credentials on the command line we can practice the following:
eventcreate /s server_name /t ERROR /id 100 /l APPLICATION /d "This log is a test"
The eventcreate command saves the event so that it can later be sent to our Graylog2 server. Returning to the subject of transmission: on Windows we can install an open source solution for this purpose, Nxlog. This software is available for Windows, Unix, Linux, BSD and Android, and it handles the operating system’s event log as well as the applications’. How do we know which is which? Let’s go back to the last thing we saw about creating our own events in Windows (other operating systems work in a similar way), where the parameters “/t” and “/l” pass the importance and the origin of the log message. Let’s see:
Parameter “/t”:
- ERROR
- WARNING
- INFORMATION
- SUCCESSAUDIT
- FAILUREAUDIT
Parameter “/l”:
- APPLICATION
- SYSTEM
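As a small illustration of the two parameters listed above, the following Python sketch assembles the eventcreate argument list and rejects any value that is not in those lists. The helper name `build_eventcreate_args` is hypothetical, and the sketch only builds the arguments; running them requires Windows and the appropriate credentials.

```python
# Allowed values, copied from the "/t" and "/l" lists above.
EVENT_TYPES = {"ERROR", "WARNING", "INFORMATION", "SUCCESSAUDIT", "FAILUREAUDIT"}
EVENT_LOGS = {"APPLICATION", "SYSTEM"}


def build_eventcreate_args(server: str, event_type: str, event_id: int,
                           log_name: str, description: str) -> list:
    """Return the argument list for eventcreate (hypothetical helper)."""
    event_type = event_type.upper()
    log_name = log_name.upper()
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown /t value: {event_type}")
    if log_name not in EVENT_LOGS:
        raise ValueError(f"unknown /l value: {log_name}")
    return ["eventcreate", "/s", server, "/t", event_type,
            "/id", str(event_id), "/l", log_name, "/d", description]
```

On a Windows host the resulting list could be passed to `subprocess.run(args, check=True)`; passing a list rather than a single string avoids quoting problems in the description.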
Graylog2: functioning and requirements
The functioning and the installation of Graylog2 are closely related, so to simplify matters we have created this diagram. (This setup is only recommended for testing, as it would be impossible to use it this way in a production environment.)
We’ll run it with the recommended installation requirements (functioning included), pointing out that we’ll only install it on a Linux host, as installing it on Windows is not recommended:
- The Graylog server itself is written in Java, so we’ll need at least Java SDK 8. In essence, it is in charge of receiving the logs from other devices (without source data there is nothing to process).
- Elasticsearch, based on Apache Lucene, both written entirely in Java (use version 2.x; Graylog2 v2.2 does not support later versions). This component is in charge of doing the hard work. Elasticsearch is a program which receives ALL the logs and does whatever is required to break down, classify, link and store the information, no matter which format it arrives in (it handles a great variety of protocols). We emphasize that this is where all of our data will live, and that we’ll need several machines for it. Since this is such a demanding job, Graylog2 handles it with the following working scheme: users normally work with the last 30 days of logs, up to a maximum of one year, which is what they recommend so that the system doesn’t get overloaded. Pandora FMS, instead, constantly migrates its data with a “prediction server” in a way that is transparent to its users: you are the one who decides when to erase the information; additionally, in the enterprise version you’ll also have the Goliat server at your disposal to move and migrate large amounts of data. Remember when we talked about logrotate? Well, for this job Graylog2 offers an enterprise version to protect the data (compression, encryption and transport over the Internet with secure protocols to protect the clients’ privacy).
- With Pandora FMS we need just one database engine (MySQL), while with Graylog2 we’ll need to install, besides Java and Elasticsearch, MongoDB, which handles everything concerning user access through the web interface (HTML, CSS and JavaScript) and something of greater importance: the alert conditions we want it to report to us promptly, which get us started with basic monitoring tasks. MongoDB will also store our indexing profiles for Elasticsearch. Graylog2 will automatically erase the indexes that have reached the end of their lifecycle, or even recreate them.
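The idea of indexes reaching the end of their lifecycle can be sketched in a few lines of Python. This is only an illustration of an age-based retention rule under assumed inputs: the index names, the creation dates and the function `indices_to_delete` are hypothetical, and Graylog2's own retention logic is configured through its interface rather than written like this.

```python
from datetime import date, timedelta
from typing import Dict, List


def indices_to_delete(indices: Dict[str, date], today: date,
                      max_age_days: int) -> List[str]:
    """Return the names of indexes created before the retention cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    # An index is expired when its creation date is strictly older than the cutoff.
    return sorted(name for name, created in indices.items() if created < cutoff)


# Illustrative data only: three indexes and their creation dates.
example_indices = {
    "graylog_0": date(2016, 9, 1),
    "graylog_1": date(2017, 1, 15),
    "graylog_2": date(2017, 10, 1),
}
expired = indices_to_delete(example_indices, date(2017, 10, 19), max_age_days=365)
```

With the one-year maximum mentioned above, only the oldest index in this sample falls outside the window; with a 30-day window the first two would.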
Filtering or selecting information?
At first glance, it sounds as if both terms were the same thing but, as we will see, they are not.
We consider Elasticsearch an auditing tool that is quite helpful for data mining.
Remember rsyslog? Inside the configuration file that we’re naming “60-graylog.conf”, we’ll add the following:
*.* @Graylog_server_ip_address:8514;RSYSLOG_SyslogProtocol23Format
The asterisk-dot-asterisk at the beginning means “send everything to the Graylog2 server”, which generates a great amount of data traffic, even when compressed! In contrast, Pandora FMS delegates the delivery to Console Agents (using well-known and accepted standards which already come with the monitored devices) and to Software Agents with a wide variety of monitoring options (computer temperature, status of a web server or database server), and this way we select the information that is really of interest to us. This doesn’t mean that the rest of the data is unimportant! It’s just that some things are always more urgent to monitor than others. (Pandora FMS has fixed profiles for each type of client and standard configurations, and with the Enterprise version we can create tailored agents.)
To back up the previous statement: from version 2.x, Graylog2 introduced a `pipeline process` for which we need to configure the following elements:
- ‘Pipelines’
- ‘Rules’
- ‘Stream connections’
- ‘Functions’
Once all of this is configured, we can filter the information that the Graylog2 server receives, so that the server doesn’t process anything that fails to match what we specified (and keeps it from reaching Elasticsearch).
To conclude, remember that filtering means ‘moving’ all the data and then keeping what we are interested in, whereas selecting (the approach of Pandora FMS) means extracting only the information that we really want. With both methods we achieve the same result, but at extremely different costs.
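The cost difference between the two approaches can be illustrated with a toy Python sketch. It is not how either product is implemented: the event dictionaries, the `wanted` predicate and both function names are assumptions made for the example, and "cost" is reduced to the number of events that have to travel over the network.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Event = Dict[str, str]


def filter_at_server(events: Iterable[Event],
                     wanted: Callable[[Event], bool]) -> Tuple[List[Event], int]:
    """Filtering: every event is transmitted, then the server drops the rest."""
    transmitted = 0
    kept = []
    for event in events:
        transmitted += 1  # the event crosses the network before any decision
        if wanted(event):
            kept.append(event)
    return kept, transmitted


def select_at_source(events: Iterable[Event],
                     wanted: Callable[[Event], bool]) -> Tuple[List[Event], int]:
    """Selecting: the agent decides first, so only wanted events are transmitted."""
    kept = [event for event in events if wanted(event)]
    return kept, len(kept)
```

Both functions return the same list of kept events for the same input; only the transmitted count differs, which is the point made above about equal results at different costs.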
Extending Graylog2 capacities
If you found it difficult, remember that this was just a basic example. We could even have built it on a platform like Azure, but the truth is that, for a production environment, we need to add more, much more, to Graylog2.
- To make Graylog2 scale, we first need to set up more clusters for Elasticsearch and its database engine.
- For MongoDB, we’ll need to install replicas to increase its reliability, although we won’t see any improvement in its performance.
- Another advantage of having Elasticsearch is being able to use other products from the same company, such as Logstash, a powerful tool to transport and filter data (in Elasticsearch format, of course). We could also use Kibana for fantastic graphical representations… but then we’d need a separate solution for user management, something Graylog2 can actually do with the help of MongoDB (and with the enterprise version we’ll also be able to process user audits).
- If we need to monitor several local networks spread across a country or throughout the world, we’ll need to install a Graylog2 server in each one to collect all the data from that network and send it over the Internet later. Do you remember the difference between filtering and selecting? Now would be the time to apply it, before sending the data to a central server. Pandora FMS is capable of dealing with this through a lightweight satellite server which is specialised for this job. With Graylog2, on the other hand, we’ll practically need to install the whole solution again.
Other solutions we can choose related to the last point:
- Install Windows Server 2008 or later and install the ‘Windows Event Collector Service’ to collect the data from a local network. Then run Nxlog on those servers and, finally, set up our Graylog2 server farm to centralize the operations.
- Install ‘Graylog Collector Sidecar’ and use the integrated API of Graylog2 (with Pandora FMS this won’t be needed as it has its own Software Agents).
- Graylog2 focuses heavily on its own delivery format, GELF (Graylog Extended Log Format), which allows sending well-structured messages of more than 1024 bytes, divisible into up to 128 chunks and sent compressed via UDP (of questionable security). What prevents us from sending them compressed over TCP is the restriction around the null character that delimits each JSON message, which has also caused problems with Logstash. So far, the solution for this is the ‘Graylog TcpLogstashOutput Plugin’, a utility developed by a third party which adds yet another task to our monitoring job.
- All these scattered utilities are promoted through the ‘Graylog marketplace’, which catalogues them and keeps links to each one so that users can integrate them with Graylog2.
- As we said before, sending all the logs for later filtering once they reach a Graylog2 server involves a large amount of data, so using a load balancer for the server farm is advised. This way we’ll have everything in one place, working better and with greater fault tolerance (by contrast, with a dedicated Graylog2 server for each geographic area, some servers would be busier than others and some might even sit idle). Balancing the load keeps all of them active and working for us.
- Both Pandora FMS and Graylog2 support authentication via LDAP or Active Directory.
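The GELF chunking mentioned in the list above can be sketched in Python as follows. This is a simplified illustration, not a complete client: it assumes the chunk header layout from the GELF specification (two magic bytes 0x1e 0x0f, an 8-byte message ID, a sequence number and a sequence count), and the 1024-byte chunk size, the field values and the function names are choices made for the example.

```python
import json
import os
import struct
import zlib

GELF_MAGIC = b"\x1e\x0f"  # marks a chunked GELF datagram
MAX_CHUNKS = 128          # a message may be split into at most 128 chunks


def build_gelf_payload(host: str, short_message: str, level: int = 6,
                       **extra: str) -> bytes:
    """Serialise a GELF record as JSON and compress it with zlib."""
    record = {"version": "1.1", "host": host,
              "short_message": short_message, "level": level}
    # Additional fields are prefixed with an underscore.
    record.update({f"_{key}": value for key, value in extra.items()})
    return zlib.compress(json.dumps(record).encode("utf-8"))


def chunk_gelf(payload: bytes, chunk_size: int = 1024) -> list:
    """Split a payload into UDP-sized chunks, each with a 12-byte header."""
    if len(payload) <= chunk_size:
        return [payload]  # small messages are sent as a single datagram
    pieces = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    if len(pieces) > MAX_CHUNKS:
        raise ValueError("message needs more than 128 chunks")
    message_id = os.urandom(8)  # the same ID ties all chunks together
    return [GELF_MAGIC + message_id + struct.pack("BB", seq, len(pieces)) + piece
            for seq, piece in enumerate(pieces)]
```

Each element returned by `chunk_gelf` would be sent as its own UDP datagram; because UDP offers no delivery guarantee, losing a single chunk means the receiver cannot reassemble the message.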
Conclusions
It’s clear that Pandora FMS and Graylog2 have some similarities when it comes to collecting data, although their ways of working are quite different when it comes to filtering and selecting it. Graylog2 relies to a great extent on third-party products, with two database engines for two different sets of information, while Pandora FMS uses a single database engine, which can easily be replicated with absolutely all the information in one place, and collects the data first-hand in a well-selected manner. With Pandora FMS, flexibility also comes with simplicity, in contrast with the complexity of Graylog2.
If you’ve liked this article and you think we’ve missed something or we should correct any detail, don’t hesitate to comment below. We will be pleased to answer you.
Do you want to know more about Pandora FMS?
Programmer since 1993 at KS7000.net.ve (since 2014 free software solutions for commercial pharmacies in Venezuela). He writes regularly for Pandora FMS and offers advice on the forum. He is also an enthusiastic contributor to Wikipedia and Wikidata. He crushes iron in gyms and, when he can, he also cycles. Science fiction fan.