This blog post is co-authored by Tom Kopchak and Brian Glenn.
One of the major features released in Splunk 7.2 is SmartStore: a mechanism for using AWS S3 (or another S3 API-compliant object store) as the volume backing your indexed data in Splunk.
We’re not going to spend a lot of time going into the details of what SmartStore is or the benefits of using it, since Splunk already has a fairly comprehensive series on the topic. Check out the blogs Splunk SmartStore: Cut the Cord by Decoupling Compute and Storage and Splunk SmartStore: Disrupting Existing Large Scale Data Management Paradigms.
The biggest takeaway here, however, is that this represents a fundamental shift in how data can be stored in Splunk.
Traditionally, Splunk relied on dedicated storage for each indexer, with various classes of storage to help manage costs (much of which is covered in one of our other tutorials, Splunking Responsibly Part 2: Sizing Your Storage). With SmartStore, the concepts of hot, warm, and cold storage are replaced with a cache manager and an object store. This allows historical data to be retained longer, and at a lower cost, than traditional on-premises storage options.
Let’s assume you have an existing Splunk environment and want to take SmartStore for a spin. Here’s what you need to do to get it deployed.
Standard Warning About Changing Things
Whenever you’re manipulating index settings, there is a risk of data loss. If you’re fortunate enough to have a test environment that isn’t doubling as production, a lab is absolutely the best place to experiment. All of this testing was first done on a standalone Splunk instance before trying it out in our lab environment, which closely mirrors one of our managed Splunk clients.
The one prerequisite for implementing SmartStore is an S3-compatible object store. For our testing, we used Amazon’s S3, so we didn’t have to worry about S3 compatibility. If you’re using a third-party object store, the Splunk docs have some information on what is required for SmartStore to work. You can read more about managing indexers and clusters of indexers in the docs.
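For context before the walkthrough below: once the object store exists, enabling SmartStore ultimately comes down to a remote volume definition and a `remotePath` in indexes.conf. Here is a rough sketch of what that looks like; the bucket name, endpoint region, and index name are placeholders for your own values, and credentials can instead come from an IAM instance role rather than being hard-coded:

```
# indexes.conf (on each indexer, or deployed from the cluster master)

# Define a remote volume backed by S3
[volume:remote_store]
storageType = remote
path = s3://your-smartstore-bucket   # placeholder bucket name
remote.s3.endpoint = https://s3.us-east-1.amazonaws.com
# These can be omitted if the indexer uses an IAM instance role
remote.s3.access_key = <access_key>
remote.s3.secret_key = <secret_key>

# Point an index at the remote volume
[my_index]
remotePath = volume:remote_store/$_index_name
repFactor = auto
```

`$_index_name` expands to the stanza’s index name, so multiple indexes can share one volume definition without colliding in the bucket.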
Assuming you’re using Amazon S3, configure a new bucket and enable API access as follows:
1. Navigate to S3 in the AWS Management Console: