As part of a cyber-security programme at one of our larger customers, there was a requirement to monitor email header data from one of the client’s subsidiary organisations. This data needed to be stored in the Splunk deployment of the parent organisation. While it sounds like a reasonably straightforward requirement, there were a lot of moving parts (many of which needed to be built) in order to achieve this.
Background and Requirements
The subsidiary organisation, unlike the parent organisation, uses Google G Suite and Gmail for their user management, authentication, mail and many of the other features that G Suite provides.
There is an existing Splunk add-on (G Suite for Splunk) for monitoring G Suite environments. This add-on is great and was used to meet a number of other monitoring requirements for the cyber-security programme, but it doesn’t do email header monitoring.
One of the other requirements we had was to pull a directory listing of all users and their attributes from the G Suite Directory on a regular basis, and this was another thing that the existing add-on did not support.
Some of the additional cyber-security requirements, beyond simply extracting the available mail headers, were to handle headers for multi-part MIME messages, to extract details (specifically file names and sizes) of any attachments in the message, and finally, to provide a list of any links that were included in the original message.
In order to meet the requirements, we needed a process to accomplish the following three main tasks and to push any data generated into Splunk:
- Retrieve a listing of all users and their attributes from the G Suite Directory;
- Enable auditing on any users that do not already have auditing enabled; and
- Retrieve new audit events and extract relevant information.
Initially this was done through three separate Python scripts, which were cobbled together into a loosely-formed Splunk add-on. While it worked, the biggest issue with this approach was that the credential management wasn’t great, i.e. it relied on a .json file stored on disk holding the credentials, and that it wasn’t user-friendly to setup and configure. There was no user interface and any parameters needed to be updated in the scripts themselves.
Splunk Add-on Builder
To solve the credential management and user-friendliness issues, I decided that in my own time, I would build a new add-on from scratch, using the Splunk Add-on Builder and would incorporate the logic from the existing Python scripts into three modular inputs.
The Splunk add-on builder takes care of generating the user interface for the creation of inputs and to accept input parameters so that no modification of the scripts is necessary when using the add-on in other environments. The add-on builder also provides a number of Splunk-specific helper functions to simplify development, and allows developers to run their add-on through a series of validation and pre-certification tests, ensuring that best-practices are followed, naming conventions are adhered to and that the code does not contain any known security vulnerabilities.
In the next few screenshots, I’ll walk through the builder process at a high level and show some of its features:
Credential Management
I realised that the credential management process in the existing G Suite for Splunk add-on was doing almost exactly what I needed. As a result, if I could incorporate that process into my add-on and have the modular inputs retrieve credentials from the Splunk encrypted credential store, then it would solve both the issue of credentials being stored in a file on disk, and would provide a neat user interface for storing and authorising new sets of credentials. Huge shout-out to Kyle Smith who developed the G Suite for Splunk add-on.
It took some time, but I eventually managed to extract the sections of code from the G Suite for Splunk add-on that handle the input, storage and authorisation of G Suite credentials, incorporate them into my add-on and modify the code to suit my needs.
The fields in the retrieved audit events are mapped to the Email data model of the Splunk Common Information Model (CIM). This means that those events are readily available for premium Splunk applications, like Splunk Enterprise Security, to begin analysing for potential security threats and malicious activity.
Permission
Qantas Airways commissioned the original development of much of this code. We sought their permission to release this add-on to the publicly accessible Splunkbase, which they have kindly given. This means that any bugs found/fixed and any newly developed features will be available to them (and the wider community) at no further cost.
Example Data
The add-on requires a user to be setup within G Suite with appropriate permissions to enable auditing on other users and to read the G Suite Directory. The email inbox of this user is also used as the “compliance inbox” to which audited emails are delivered. When it comes time to retrieve audit events, the script authenticates as that user, reads any unread messages from “compliance-noreply@google.com”, marks processed messages as read and finally loads the data into Splunk.
Benefits
Having the ability to automate retrieval of Gmail audit events in this manner alleviates the need for operators to assume high levels of privilege into G Suite, such as SuperAdmin, for performing this task. A single set of credentials created specifically for this purpose means that access to those credentials can be locked down and monitored.
For companies that are broad users of G Suite, including GCP, this reduces the risk associated with managing G Suite environments as you can now keep SuperAdmin far more locked down, in that you don’t need to provide this level of access to one or more administrator accounts for this purpose.
Access to the data stored in Splunk can be tightly controlled using Splunk’s fine-grained role-based user access controls.
The fields in the retrieved audit events are mapped to the Email data model of the Splunk Common Information Model (CIM). This means that those events are readily available for premium Splunk applications, like Splunk Enterprise Security, to begin analysing for potential security threats and malicious activity.
Conclusion
Using the Splunk add-on builder, some Python knowledge and the credential management process from the G Suite for Splunk add-on, I was able to:
- Improve upon the initial incantation of the Gmail audit process;
- Make the add-on more secure;
- Make the add-on more user-friendly; and
- Publish my first Splunk add-on to Splunkbase.
The add-on is available for download from Splunkbase here: Gmail Audit
Deprecation notice
Functionality provided by the Gmail Audit add-on for Splunk has been built into the newer Splunk Add-on for Google Workspace, which is developed and supported by Splunk.
The new add-on uses Google BigQuery to handle collection of the Gmail audit headers, and the most recent versions also provide the ability to list users from the Google Workspace Directory. The use of BigQuery is far more performant and scalable for collecting audit events than via the Google Audit API method for auditing mail headers. A lot was learned through the development of the Gmail Audit add-on for Splunk and it provided valuable security information to the customer that was not readily available via other means at the time.
Nick Von Korff joined Sourced in 2015 as a Data Visualisation Consultant, specialising in monitoring and data analysis tools, such as Splunk and Datadog. He has worked in IT for over 20 years in a range of industries and roles including systems administration, IVR system programming, network administration, database administration and systems monitoring.