Splunking the Zombie Apocalypse

Published On: February 2nd, 2021

At Hurricane Labs, we love logs and we love video games. Any chance to combine the two is exciting, and recently I stumbled on just such an opportunity. 

My friends and I have been hooked on 7 Days to Die, a survival horror game in which players scavenge an apocalypse-ridden world and build a defense to survive a massive zombie horde that attacks every seven days. One of my friends had found an item–a book, specifically–that would’ve been a significant upgrade for another player, but when they met up to exchange the item it had disappeared. We dismissed it as a glitch, but I began to wonder–did the game have logs, and could they help us discover what had happened? Naturally, my weapon of choice was Splunk.

Unexpected bonus: server-side logs

Server-side logs are actually pretty rare in the gaming world these days. The concept of hosting a local server is antiquated, largely replaced by online servers hosted by the game companies themselves. 7 Days to Die, however, allows players to operate their own server and configure it however they’d like. We didn’t realize this at first–my brother was the first one to start the game and we all joined him, unaware that we were actually joining a server running locally on his computer, which also happened to be the machine with the fewest resources. Eventually he sent me the server files and I started hosting from mine.

I was pleasantly surprised by the logs that had been generated. Every log entry begins with a consistent, readable timestamp and a generally uniform message format. For example:

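For illustration, entries follow roughly this shape–a timestamp, an uptime counter, a priority, and the message (the uptime values and message text below are made-up placeholders):

```
2021-02-01T20:15:32 142.563 INF <message text>
2021-02-01T20:15:40 150.102 WRN <message text>
```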

Scrolling through the unexpectedly large file, I noticed that in addition to operational logs, there were logs about player behavior and AI behavior: kills, deaths and even how many zombies were spawned! It was time to get these log files into Splunk and really start digging in. 

I chose to use 7daystodie:output as the custom sourcetype name to reflect both the software and the log file name (output_log) in case there is ever another log filetype we want to ingest.

The first thing I noticed was that every log file begins with a header that contains interesting information but isn’t something we want to index every time:

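As a rough illustration only (the specific lines are hypothetical; the real header describes the engine version and the host machine running the server):

```
Mono path[0] = '.../7DaysToDieServer_Data/Managed'
Initialize engine version: <engine version>
<graphics, hardware, and startup parameter information>
```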

To omit this, I took advantage of the FIELD_HEADER_REGEX props setting. It uses a regular expression to identify the last line of the header and, in this case, tells Splunk not to index anything above it:

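A sketch of what that stanza could look like–the pattern below is a placeholder, since the real one depends on exactly what the final header line says:

```
[7daystodie:output]
# Placeholder pattern: match the last line of the header block so that
# everything up to it is skipped at index time
FIELD_HEADER_REGEX = <regex matching the final header line>
```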

Line-breaking and timestamp parsing were the next objective. Fortunately, this log uses a pretty standard and consistent timestamp, which is already more than I can say for some other, more common data sources. When I first loaded the file into the data preview through the web interface, Splunk actually did a good job identifying and separating events without any further configuration. The best practice, though, is to be as specific as possible here to avoid any potential issues now or in the future.

At Hurricane Labs we like to cover at least TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, and TZ. Between these settings, we’ve specified where in the log the timestamp occurs, what the timestamp looks like, how far from that point to consider part of the timestamp, and what timezone the log was written in. We also prefer to have control over the event line-breaking–in other words, specifying what constitutes a single event. In most cases, such as ours, each event in the log begins with a timestamp, so we’ve told Splunk not to line-merge automatically (SHOULD_LINEMERGE = false) and we specify our own condition in the LINE_BREAKER setting.

As you’ll recall from the brief example above, our timestamp looks like this:

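It follows an ISO 8601-style layout; for example (the date itself is arbitrary):

```
2021-02-01T20:15:32
```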

Our props configuration so far is:

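A hedged sketch of the resulting stanza, built from the timestamp layout above (the timezone here is an assumption–set it to wherever the server actually lives):

```
[7daystodie:output]
# Timestamp sits at the start of each event, 19 characters long
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
# Assumption: adjust to the server's actual timezone
TZ = America/New_York
# Break events wherever a new line starts with a timestamp
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}
```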

So, what happened to that book?

I ingested a single log file into a test index to make sure everything behaved as expected and started playing with the data to see what was available. Splunk automatically extracted some fields that were already key=value pairs, but some more specific field extractions were necessary. I had data about when players connected and disconnected, how long their sessions were, how many zombies (and what types) they killed, and whether they died. I also had some internal system messages, such as audio files failing to load, no space to load in a resource, and–most relevant to our original goal of discovering the fate of our missing item–when a resource “fell off the world.”

Some examples of the field extractions:

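As a sketch, inline extractions along these lines would cover the priority and an item name–the field names and exact regexes here are assumptions, not the originals:

```
[7daystodie:output]
# Third whitespace-delimited token carries the priority (INF, WRN, ...)
EXTRACT-priority = ^\S+\s+\S+\s+(?<priority>[A-Z]{3})
# Pull an entity/item name out of messages containing "name=<something>"
EXTRACT-entity_name = name=(?<entity_name>[^,\]\s]+)
```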

If you’re not familiar with reading regular expressions: one of the above extractions pulls out the priority, INF (informational) or WRN (warning). Narrowing down on the warning messages, I began to see events such as:

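Reconstructed here for illustration–the timestamp, uptime, and id values are made up; only the item name and the “fell off the world” text reflect what we actually saw:

```
2021-02-01T21:03:47 3057.205 WRN Entity [type=EntityItem, name=resourceScrapIron, id=1234] fell off the world, pos=<...>
```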

“Fell off the world” is an interesting fate for an item, but the game is technically in early access (meaning it is still being worked on). Honestly, open-world survival games usually have bugs like these simply due to the scope of the environment and the resources involved, so you kind of know what you’re getting into. But is that what happened to our item, a book that would’ve taught a much-needed skill to one of our players?

The log above involves resourceScrapIron, which as you can probably guess is not the book in question. But with that name extracted, we can easily perform an all-time search for the “fell off” warning message and display all the different items that have been logged:
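A hedged version of that search, assuming the priority and entity_name extraction names sketched earlier:

```
sourcetype=7daystodie:output priority=WRN "fell off the world"
| stats count BY entity_name
| sort - count
```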

Unfortunately, none of the results pointed conclusively at a book item, though we do see some generic entries, such as FallingBlock or LootContainer, that could have been involved. It is also amusing to see that zombies themselves can evidently just fall off the world. I searched through the logs from that day (and all time, to be thorough) and didn’t find any further information involving our missing book. So it seems we’ll have to accept that something glitchy happened, such as the item falling out of bounds as demonstrated above.

Tracking game stats

I couldn’t let all these other logs just go to waste though. After exploring the logs a bit more, I created a pair of dashboards: one to display some interesting player statistics and one for server statistics. Every time the server loads, it logs each individual setting, so I was able to grab that and output it to a table for reference, in addition to some information about the warnings covered above:
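One way to sketch that settings search–the "GamePref." marker and the field names are assumptions about how the settings lines are written, not confirmed from the logs:

```
sourcetype=7daystodie:output "GamePref."
| rex "GamePref\.(?<setting>\w+)\s*=\s*(?<value>.+)"
| dedup setting
| table setting, value
```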

The player stats were much more interesting, especially to my friends who had been using the server. In addition to metrics on zombie interactions, I was also able to display information about the current day, when the next horde was due (which comes every seven days and is referred to as a “Bloodmoon”), and how many of said Bloodmoons had been survived.

If you’re interested in the logic used, the game logs when each day begins, so I simply take the highest number of the “Day” field for the current day:

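A sketch of that logic, assuming a rex like the one below can pull the day number out of the day-start message:

```
sourcetype=7daystodie:output "Day"
| rex "Day (?<Day>\d+)"
| stats max(Day) AS current_day
```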

To determine the next Bloodmoon, I divide the Day value by seven and use the ceiling function to round up to the nearest integer to store what week we are in. That number is then multiplied by seven to determine which day will be the next “seventh” iteration (and we take the “latest” result with the max function as before).

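In SPL terms, the calculation could look like this (reusing the same hypothetical Day extraction; ceiling() is a standard eval function):

```
sourcetype=7daystodie:output "Day"
| rex "Day (?<Day>\d+)"
| eval week = ceiling(Day / 7)
| eval next_bloodmoon = week * 7
| stats max(next_bloodmoon) AS next_bloodmoon
```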

Finally, to calculate the number of Bloodmoons survived, I leveraged the logic from the previous panel and simply subtracted one (since we haven’t faced that upcoming Bloodmoon yet).

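Sketched the same way, under the same assumptions:

```
sourcetype=7daystodie:output "Day"
| rex "Day (?<Day>\d+)"
| eval bloodmoons_survived = ceiling(Day / 7) - 1
| stats max(bloodmoons_survived) AS bloodmoons_survived
```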

Sometimes the dashboard is interesting to look at for just the last session played – no one died!

But we can also look at statistics over all time:

Outside of some bragging rights, this is particularly useful to determine how much changing a certain multiplier in the server settings, such as how many zombies per player spawn, affects gameplay (look at the spike in kills between the 15th and the 19th)!

While we didn’t find conclusive evidence as to what happened to our friend’s upgrade book, I think we do have enough information to make a fair guess. In the process, we discovered some great data points both for understanding player and zombie behavior and for monitoring the health of our server. I’ll often keep the dashboard up on another monitor while we play, just to keep an eye on things or to be able to provide some quick statistics. I have a log file path monitor on the server to ingest the file as it’s written:

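A minimal inputs.conf sketch–the path and index are placeholders for wherever your server actually writes its logs:

```
[monitor:///opt/7daystodie/7DaysToDieServer_Data/output_log*.txt]
sourcetype = 7daystodie:output
index = <your_index>
disabled = false
```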

Conclusion

I don’t expect every reader to jump into the zombie apocalypse armed with a baseball bat and a Splunk dashboard, but many of the methods used here for data onboarding, investigation, and dashboard building can be applied to any data source. We’re here to help, whether you’re interested in the number of conference calls joined or the number of zombie hordes survived (I’ll leave it up to you which is more painful).

About Hurricane Labs

Hurricane Labs is a dynamic Managed Services Provider that unlocks the potential of Splunk and security for diverse enterprises across the United States. With a dedicated, Splunk-focused team and an emphasis on humanity and collaboration, we provide the skills, resources, and results to help make our customers’ lives easier.

For more information, visit www.hurricanelabs.com and follow us on Twitter @hurricanelabs.