Secure Syslog Replacement Proposed
LinuxScribe writes with this bit from IT World: "In an effort to foil crackers' attempts to cover their tracks by altering text-based syslogs, and improve the syslog process as a whole, developers Lennart Poettering and Kay Sievers are proposing a new tool called The Journal. Using key/value pairs in a binary format, The Journal is already stirring up a lot of objections."
Log entries are "cryptographically hashed along with the hash of the previous entry in the file" resulting in a verifiable chain of entries. This is being done as an extension to systemd (git branch). The design doesn't just make logging more secure, but introduces a number of overdue improvements to the logging process. It's even compatible with the standard syslog interface allowing it to either coexist with or replace the usual syslog daemon with minimal disruption.
Re:I don't know... (Score:5, Insightful)
Witness the deeply-ingrained UNIX Philosophy thing where if you can't use grep(1), it naturally follows that the thing is impossible to search.
You can't grep a Berkeley DB, yet for some reason you can find stuff in it, too.
Re:Pointless -- there is already a secure solution (Score:4, Insightful)
How? (Score:5, Insightful)
Log entries are "cryptographically hashed along with the hash of the previous entry in the file" resulting in a verifiable chain of entries.
So this means that in order for someone malicious to modify a log entry, all they really need to do is then re-hash all subsequent entries?
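To make the question concrete, here is a minimal sketch of a hash chain over log lines. It is not The Journal's actual format: SHA-256, the all-zero starting value, and the helper names `entry_hash`, `build_chain` and `verify_chain` are assumptions made for illustration. It shows that changing a message without touching the hashes is caught, while changing it and re-hashing everything after it produces a chain that verifies cleanly.

```python
import hashlib

GENESIS = "0" * 64  # assumed "previous hash" for the very first entry


def entry_hash(prev_hash: str, message: str) -> str:
    """Hash a message together with the hash of the previous entry."""
    return hashlib.sha256((prev_hash + "\n" + message).encode("utf-8")).hexdigest()


def build_chain(messages):
    """Return a list of (message, hash) pairs forming a hash chain."""
    chain, prev = [], GENESIS
    for msg in messages:
        prev = entry_hash(prev, msg)
        chain.append((msg, prev))
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash and compare it with the stored one."""
    prev = GENESIS
    for msg, stored in chain:
        if entry_hash(prev, msg) != stored:
            return False
        prev = stored
    return True


log = build_chain(["sshd: login root", "cron: job started", "sshd: logout root"])

# Naive tampering: change one message but leave the stored hashes alone.
naive = list(log)
naive[0] = ("sshd: login alice", log[0][1])

# Thorough tampering: change the message, then re-hash everything after it.
rehashed = build_chain(["sshd: login alice"] + [m for m, _ in log[1:]])

print(verify_chain(log))       # True
print(verify_chain(naive))     # False
print(verify_chain(rehashed))  # True: the chain alone cannot expose the rewrite
```

So the chain by itself only catches edits made by someone who cannot, or does not bother to, recompute the hashes.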
Re:Pointless -- there is already a secure solution (Score:5, Insightful)
Does anyone really care about forensic analysis of single stand-alone systems? Do you think that the FBI will go after whoever broke into your home system? Just rebuild the OS and move on.
This is a fix which breaks lots of other stuff. Today, I can open up my logfiles (even the compressed ones) with "vim -R ". The convenience of that will be lost and my analysis will be limited by the tools available to analyze the undocumented, binary logs. What about old log files after the binary format changes? There are so many issues with the proposal and precious few advantages.
Re:Pointless -- there is already a secure solution (Score:5, Insightful)
The way we used to solve that was to have the syslog output write to a dot-matrix (or other) line printer. Every line in the security logs is written to paper immediately. You can substitute anything that can record things written to RS-232 (cue the arduino fanboys) for the line printer.
This doesn't seem to actually solve the problem - if the person can modify the file, they can modify the file. If the lines are hashed, they just get the plaintext ones, delete the last ones, modify them, and then replay the fake ones and generate a new sequence of hashes. This just means that you need more tools in your recovery filesystem for fault diagnosis.
Re:Easy to trash a log? (Score:2, Insightful)
You should probably go learn what a digital signature is and how it is not encryption.
Re:I don't know... (Score:5, Insightful)
The problem isn't searching in the ordinary case. The problem is searching in the failure case. I can grep a truncated, mangled text file. If I truncate and mangle your BerkeleyDB can you still search it?
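As a rough illustration of the failure case, the sketch below searches a deliberately damaged text log. The sample log lines, the damage pattern, and the helper name `grep_damaged` are made up for this example. It only shows that a line-oriented search can skip over junk bytes and a cut-off tail; it says nothing about how any particular binary format would behave.

```python
def grep_damaged(data: bytes, needle: str):
    """Return lines containing `needle`, tolerating undecodable or truncated bytes."""
    text = data.decode("utf-8", errors="replace")  # bad bytes become U+FFFD
    return [line for line in text.splitlines() if needle in line]


intact = (
    b"Nov 23 10:00:01 host sshd[411]: Accepted password for harald\n"
    b"Nov 23 10:00:07 host cron[502]: job started\n"
    b"Nov 23 10:01:12 host sshd[411]: Failed password for root\n"
)

# Simulate damage: overwrite a stretch in the middle with junk and cut off the tail.
damaged = intact[:70] + b"\xff\xfe\x00\x80" * 5 + intact[90:-10]

hits = grep_damaged(damaged, "sshd")
for line in hits:
    print(line)
```

The middle line is lost and the last one is cut short, but both surviving sshd lines are still found.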
Re:Pointless -- there is already a secure solution (Score:4, Insightful)
In your stand-alone system scenario what keeps a hacker from deleting those logs entirely or reading all the logs, removing the entries they don't want preserved, then writing them all back out, with a new hash-chain history?
Re:I don't know... (Score:5, Insightful)
Send your logs to a remote/central server (Score:3, Insightful)
There is no real problem this solves. You are far better off logging remotely. This does not stop an attacker from hiding his tracks; you'll just know the logs were altered, but you won't know what was removed, or likely if/when you can start trusting them again. Log remotely, use encryption, and use TCP. Your central/remote logger is your trusted source for logs. You close everything except incoming logs. Parse and alert on the logs from there. It's simple to do, it's real time, and it solves a lot more issues than this type of solution ever will.
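For an application-side view of "log remotely, use TCP", here is a small sketch using Python's standard `logging.handlers.SysLogHandler`. The host name, the logger name and the helper `make_remote_logger` are placeholders for illustration. The sketch does not cover encryption: the standard handler speaks plain syslog, so TLS would have to come from a forwarder such as rsyslog or syslog-ng, or from a tunnel.

```python
import logging
import logging.handlers
import socket


def make_remote_logger(host: str, port: int = 514, name: str = "remote-demo") -> logging.Logger:
    """Return a logger that forwards records to a syslog collector over TCP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,
        socktype=socket.SOCK_STREAM,  # TCP instead of the default UDP
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


# Example use (needs a reachable collector; "loghost.example" is a placeholder):
#   log = make_remote_logger("loghost.example")
#   log.warning("failed login for root from 203.0.113.7")
```

Because the handler connects when it is created, the collector has to be listening before the logger is set up.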
Serious issues with this (Score:5, Insightful)
Now, without getting into how much i dislike Pulseaudio (maybe because i'm an old UNIX fart, thank you very much), I think there are really serious issues with "The Journal", which I can summarize as such:
1. the problem it's trying to fix is already fixed
2. the problem isn't fixed by the solution
3. it makes everything more opaque
4. it makes the problem worse
The first issue is that it is trying to fix a problem that is already easily solved with existing tools: just send your darn logs to an external machine already. Syslog has supported networked logging forever.
Second, if you log on a machine and that machine gets compromised, I don't see how having checksums and a chained log will keep anyone from just trashing the whole 'journal':
rm -rf /var/log
What am i missing here?
Third, this implements yet another obscure and opaque system that keeps the users away from how their system works, making everything available only through a special tool (the journal), which depends on another special tool (systemd), both of which are already controversial. I like grepping my logs. I understand http://logcheck.org [slashdot.org] and similar tools are not working very well, but that's because there isn't a common format for logging, which makes parsing hard and application dependent. From what I understand, this is not something The Journal is trying to address either. To take an example from their document:
MESSAGE=User harald logged in
MESSAGE_ID=422bc3d271414bc8bc9570f222f24a9
_EXE=/lib/systemd/systemd-logind
[... 14 lines of more stuff snipped]
(Never mind for a second the fact that to carry the same amount of information, syslog only needs one line (not 14), which makes things actually readable by humans.)
The actual important bit here is "User harald logged in". But the thing we want to know is: is that a good thing or a bad thing? If it was "User harald login failed", would it be flagged as such? It's not in the current objectives, it seems, to improve the system in that direction. I would rather see a common agreement on syntax and keywords to use, and respect for the syslog levels [debian.net] (e.g. EMERG, ALERT, ..., INFO, DEBUG), than reinventing the wheel like this.
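As a sketch of what "flagging" could look like on top of key/value entries, the code below parses the textual KEY=VALUE form quoted above and maps a numeric severity onto the standard syslog level names. The `PRIORITY` field is an assumption here, since it is not among the fields quoted in this comment, and the helper names and the second sample entry are made up for illustration. This handles only the text form, not the binary on-disk format.

```python
# Standard syslog severities, index 0 (most severe) to 7 (least severe).
SEVERITIES = ["EMERG", "ALERT", "CRIT", "ERR", "WARNING", "NOTICE", "INFO", "DEBUG"]


def parse_entry(text: str) -> dict:
    """Turn KEY=VALUE lines into a dict; lines without '=' are ignored."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")  # split on the first '=' only
        if sep:
            fields[key] = value
    return fields


def severity_name(fields: dict, default: str = "INFO") -> str:
    """Map an assumed numeric PRIORITY field to a syslog level name."""
    try:
        level = int(fields["PRIORITY"])
    except (KeyError, ValueError):
        return default
    return SEVERITIES[level] if 0 <= level < len(SEVERITIES) else default


login_ok = parse_entry(
    "MESSAGE=User harald logged in\n"
    "PRIORITY=6\n"
    "_EXE=/lib/systemd/systemd-logind"
)
login_failed = parse_entry("MESSAGE=User harald login failed\nPRIORITY=4")

for entry in (login_ok, login_failed):
    print(severity_name(entry), entry["MESSAGE"])
```

Whether a failed login is a warning or merely informational is exactly the kind of convention that would still need to be agreed on.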
Fourth, what happens when our happy cracker destroys those tools? This is a big problem for what they are actually trying to solve, especially since they do not intend to make the format standard, according to the design document [google.com] (published on you-know-who, unfortunately). So you could end up in a situation where you can't parse those logs because the machine that generated them is gone, and you would need to track down exactly which version of the software generated it. Good luck with that.
I'll pass. Again.
I do (was: I don't know...) (Score:5, Insightful)
Your answer is right in the summary. I can use standard syslog in conjunction with it, and then have a process running in the background that notifies me if the integrity of the text file is violated, thereby getting the best of both worlds.
Absurd (Score:5, Insightful)
From the FAQ:
we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly.
Not only does it generate logfiles that are not human-readable, they're also in a format that in two years not even their own tool will be able to read. If it is still around in two years, which I doubt.
Re:I don't know... (Score:5, Insightful)
I disagree, the fact that such a model still works so well decades later is definitely evidence that they were doing something right. When it comes down to it, if you make everything a file then you don't have to worry about envisioning niche uses as most of them can be accomplished by chaining together several commands. The ones that don't are still not impossible as you can just throw together a Perl script or similar to manage them.
GNOME 3 crack (Score:5, Insightful)
This is on the same crack as the rest of GNOME 3. They've invented the Windows event log, well done! Now I hand you a trashed system, but you can read the disk. You look into /var/log/syslog ... no, you don't. "We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. The access is granted by a shared library and a command line tool."
Speaking as a sysadmin, I shudder at this incredibly stupid idea. Are they even thinking of how to get something actually readable in disaster?
Seriously? (Score:5, Insightful)
Is this a joke? Or is it someone just trying to push their ideology of what they think should be done to the rest of the world to make their idea a standard?
Doing something like this would be a sure way for Linux to shoot itself in the foot. For evidence, one only needs to look as far as Microsoft, who insists on doing it their special way and expecting everyone else to do what they deem as "good". The concept of syslog messages is that they are meant to be 'open' so disparate systems can read the data. How do you propose to integrate with large syslog reporting/analysis tools like LogZilla (http://www.logzilla.pro)?
The authors are correct that a format needs to be written so that parsing is easier. But how is their solution any "easier"? Instead, there is a much more effective solution available known as CEE (http://cee.mitre.org/) that proposes to include fields in the text.
> Syslog data is not authenticated.
If you need that, then use TLS/certificates when logging to a centralized host.
> Syslog is only one of many logging systems on a Linux machine.
Surely you're aware of syslog-ng and rsyslog.
> Access control to the syslogs is non-existent.
To locally stored logs? Maybe (if you don't chown them to root?)
But if you are using syslog-ng or rsyslog and sending to a centralized host, then what is "local" to the system becomes irrelevant.
> Disk usage limits are only applied at fixed intervals, leaving systems vulnerable to DDoS attacks.
Again, a moot point if admins are doing it correctly by centralizing with tools like syslog-ng, rsyslog and LogZilla.
> "For example, the recent, much discussed kernel.org intrusion involved log file manipulation which was only detected by chance."
Oh, you mean they weren't managing their syslog properly so they got screwed and blamed their lack of management on the protocol itself. Ok, yeah, that makes sense.
They also noted in their paper that "In a later version we plan to extend the journal minimally to support live remote logging, in both PUSH and PULL modes always using a local journal as buffer for a store-and-forward logic".
I can't understand how this would be an afterthought. They are clearly thinking "locally" rather than globally. Plus, if it is to eventually be able to send, what format will it use? Text? Ok, now they are back to their original complaint.
All of this really just makes me cringe. If RH/Fedora do this, there is no way for people that manage large system infrastructures to include those systems in their management. I am responsible for managing over 8,000 Cisco devices on top of several hundred linux systems. Am I supposed to log on to each linux server to get log information?
Re:I don't know... (Score:4, Insightful)
With binary data, you have potential issues with the binary parser (e.g. has a hacker corrupted the log to trigger a buffer exploit in the binary-to-text program?). Also, binary data is open to endian issues and integer/pointer size issues. Not to mention versioning (trying to read logs written using a different version of journald that writes an incompatible journal file). Likewise if you only have access to fragments of the log.
Re:Serious issues with this (Score:4, Insightful)
Seriously, how hard is it to set one of these up? Not very. How expensive is it to do this? Not very. Are we going to toss out the current method of logging because of the folks who only have Linux running on a laptop and have that as their only computer?
You certainly would not need a tremendously powerful PC to sit out on your network and do nothing but accept syslog messages from other systems.
My understanding (someone correct me if I'm wrong on this) is that there will be only a single logging system, not one doing this Journal format and another for text logs. The text available from the Journal would have to come from a tool that uses certain new library calls to extract information from the Journal. Users would have to pipe the output of that, one supposes, into tools to search for error messages of interest. It's not terribly hard to use but...
Not necessarily. Several of the summaries I've read about this new logging system indicate that the format hasn't been agreed on and may change from time to time. And... there is no guarantee when they'll get around to documenting the format. Good grief! First we have to change all of our log file search scripts to use the new Journal dumping tool. Then the format changes so we have to modify our scripts again. And again, perhaps, whenever it suits Lennart. How nice!
Re:Serious issues with this (Score:4, Insightful)
First issue: This is great if you have an external system to log to - if not, you're boned. This new logging system seems to cover both cases.
No, it doesn't: it does not protect you if you do not log to another server or at least back up the hashes somewhere else. You still need a secondary server.
Second issue: One of the big reasons for doing this is to be able to detect when the log has been altered to cover a cracker's tracks. Obviously, a deleted log file is easily detected and a big indicator that your system has been compromised, so I'm not seeing your point here.
Well, I was making a rather broad stroke on that one. As I explained earlier, just like with git rebase you can certainly tamper with the logs without being detected, if you are root, so this doesn't cover that case unless (again) you use a secondary server.
Third issue: As has been stated above, you can log to both the Journal and good old text based log files. That way you can still use your existing tools on the text file while still being notified of log file alteration. I agree that a common format for log entries would be nice but may not be possible since not every application logs the same kind of data. Note also that this proposal allows for arbitrary key/value pairs so some standard conventions will probably come about after it's been used for a while.
Somebody else answered this, but yeah: if you're going to write to logfiles anyway, why bother with the journal?
Fourth issue: Not sure I understand what you are talking about here... Obviously, backward compatibility will have to be taken into account by the devs. You should be able to read the files on other machines if you backed up your encryption keys, etc. (you do back up that stuff, right?). By reading the articles, it sounds like the devs have thought about these issues and/or they have already been raised by others. They seem to be fairly easy to deal with.
Backward compatibility doesn't seem to have been taken into account by the devs. It's in the FAQ:
Will the journal file format be standardized? Where can I find an explanation of the on-disk data structures?
At this point we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. The access is granted by a shared library and a command line tool. (But then again, it’s Free Software, so you can always read the source code!)
I'm not necessarily on board with this proposed system either, but your issues seem like they've already been covered by the proposed design.
I disagree with this analysis. :)
Re:I don't know... (Score:4, Insightful)
You seem to mix up binary data with ad-hoc binary files. Any reasonable binary format has a well-defined and well-specified encoding, which includes sizes and endianness of numeric data. Just fwriting from your program variables directly into the file is not defining a proper binary format. Also note that pointers have no place in binary files at all.
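A small sketch of what "well-defined" can mean in practice: a record layout whose field sizes and byte order are fixed in the format itself rather than inherited from the machine that wrote it. The layout below (timestamp, severity, length-prefixed message) is invented for illustration and has nothing to do with the journal's real on-disk structures.

```python
import struct

# Hypothetical record header, little-endian with no padding:
#   u64 timestamp in microseconds, u8 severity, u16 message length in bytes.
# The "<" prefix pins the byte order and disables native alignment.
HEADER = struct.Struct("<QBH")


def encode_record(timestamp_us: int, severity: int, message: str) -> bytes:
    """Serialize one record as header followed by the UTF-8 message."""
    payload = message.encode("utf-8")
    return HEADER.pack(timestamp_us, severity, len(payload)) + payload


def decode_record(buf: bytes):
    """Parse one record, refusing input whose payload is cut short."""
    timestamp_us, severity, length = HEADER.unpack_from(buf, 0)
    payload = buf[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated record")
    return timestamp_us, severity, payload.decode("utf-8")


record = encode_record(1322042401000000, 6, "User harald logged in")
print(decode_record(record))
```

The same bytes decode identically on a 32-bit big-endian machine and a 64-bit little-endian one, which is the point the comment is making. A length check like the one in `decode_record` is also the kind of defensive parsing the parent comment worries about.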
Re:I don't know... (Score:3, Insightful)
The more secure thing to do with logs is ship them to another host. This idea of signing log messages is ridiculous, because you will just have intruders signing their messages with the keys you thoughtfully provided.
I suspect this is really just the preference of the systemd dudes.
Re:Very simple text-based implementation (Score:1, Insightful)
Re:I don't know... (Score:4, Insightful)
text can be altered - so can a binary file
The article points out that log records will be chained with hashes (like changesets in git and mercurial) so you can't just rewrite part of a log file. You would have to replace the whole chain, and an externally held checksum would pick that up.
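A minimal sketch of the externally held checksum idea, assuming SHA-256 and a made-up helper `chain_head` that folds all messages into a single head hash. Where the trusted copy of the head lives (another machine, a printout) is outside the sketch.

```python
import hashlib


def chain_head(messages, seed: str = "0" * 64) -> str:
    """Fold the messages into one head hash, each step covering the previous hash."""
    head = seed
    for msg in messages:
        head = hashlib.sha256((head + "\n" + msg).encode("utf-8")).hexdigest()
    return head


original = ["sshd: login root", "cron: job started", "sshd: logout root"]
trusted_head = chain_head(original)  # imagine this value is kept off the machine

# An intruder edits the first entry and recomputes the whole chain locally.
forged = ["sshd: login alice"] + original[1:]

print(chain_head(original) == trusted_head)  # True
print(chain_head(forged) == trusted_head)    # False
```

The detection comes entirely from the copy held elsewhere; without it the forged chain is as self-consistent as the original.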
Dubious project... (Score:4, Insightful)
If we were to accept a binary format, then at least it shouldn't be from a group that says up front:
At this point we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. ... we don’t want any other software to read, write or manipulate our journal files directly
This is absolutely unacceptable for projects in *nix land intending to serve such a central role as logging.
Reading the actual original document, I don't think it focuses so much on security. But to the extent it does, it's pretty pointless. They make noise about an authenticated chain of entries so you can't just modify the middle, *but* that provides no benefit as the attacker can then just rebuild the chain from that point forward. Their answer is to send it to some place that cannot be modified once transmitted. This is exactly the same as remote syslog policies, no additional security, but added complexity for no gain.
Additionally, they *could* have a system with plaintext and a binary format in place and I recommend they change their minds to do so. The binary blob can contain offsets into a corresponding text file. Thus the good old unix way (which the systemd people seem intent on destroying) is preserved while at the same time get their enhancements.
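One way to picture the "binary blob with offsets into a text file" idea is a sidecar index that stores the byte offset of each line, leaving the text log itself untouched and greppable. The index layout (one little-endian u64 per line) and the helper names are assumptions made for this sketch.

```python
import io
import struct

OFFSET = struct.Struct("<Q")  # one little-endian u64 byte offset per log line


def build_index(text_log: bytes) -> bytes:
    """Return a binary index holding the starting byte offset of every line."""
    offsets, pos = [], 0
    for line in text_log.splitlines(keepends=True):
        offsets.append(pos)
        pos += len(line)
    return b"".join(OFFSET.pack(o) for o in offsets)


def read_line(log_file, index: bytes, n: int) -> str:
    """Jump straight to line n (0-based) of a binary file object via the index."""
    (offset,) = OFFSET.unpack_from(index, n * OFFSET.size)
    log_file.seek(offset)
    return log_file.readline().decode("utf-8").rstrip("\n")


text = (
    b"Nov 23 10:00:01 host sshd[411]: Accepted password for harald\n"
    b"Nov 23 10:00:07 host cron[502]: job started\n"
    b"Nov 23 10:01:12 host sshd[411]: Failed password for root\n"
)
index = build_index(text)
print(read_line(io.BytesIO(text), index, 2))
```

If the index is lost or corrupted, the text file is still readable with ordinary tools and the index can be rebuilt from it.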
They *do* have some valid points. Syslog can't cope with binary data, it doesn't provide a good per-user logging facility, large text files are hard to search, and syslog has insufficient service/event type facilities making complex analysis a requirement in some scenarios. Even in a simplistic case, I have been left at a loss for 'what string *should* I grep for?' Many services ignore syslog because of its limitations as pointed out in the article, making things that much more complicated.
But at the exact same time they bemoan so many services doing different logging, they propose making yet another facility and recommend keeping rsyslog running because they aren't going to handle syslog messages. They tell people 'tough you have to use systemd' and 'tough you must use our logging'.
They dismiss java-style namespace management due to variable width, which I think is just going *too* far to achieve theoretical performance gains. They get *very* defensive about UUIDs, and I accept when managed correctly they are unique, *but* it adds a layer of obfuscation unless you have a central coordinating master map of UUID to actual usable names. Uniqueness is an insufficient criterion. Have both worlds. An application submits a message with both a human-readable namespace *and* a UUID. If your logging facility already has the UUID, ignore the namespace. If your hash table does not have that UUID, store a mapping between the UUID and namespace. Then your tool has the added bonus of having a way to dump a quick list of currently observed message types to search by.
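The "both worlds" scheme can be sketched in a few lines. The class name `MessageRegistry`, the example namespaces and the UUID value are all hypothetical, and a real logging daemon would persist the map rather than keep it in memory.

```python
import uuid


class MessageRegistry:
    """Remember the human-readable namespace first seen with each message UUID."""

    def __init__(self):
        self._names = {}

    def observe(self, message_id: uuid.UUID, namespace: str) -> str:
        """Store the namespace on first sight; afterwards the stored name wins."""
        return self._names.setdefault(message_id, namespace)

    def known_types(self):
        """Quick list of the message types observed so far."""
        return sorted(self._names.values())


registry = MessageRegistry()
login_ok = uuid.UUID("12345678123456781234567812345678")  # made-up ID
login_failed = uuid.uuid4()                               # made-up ID

registry.observe(login_ok, "org.example.login.success")
registry.observe(login_failed, "org.example.login.failure")
# A later message with a known UUID: the namespace it carries is ignored.
registry.observe(login_ok, "something.else")

print(registry.known_types())
```

Lookups stay keyed on the fixed-width UUID, while the namespace is only consulted the first time an ID shows up.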