Securing your logs in Confluent Cloud with HashiCorp Vault
Challenge
Logging is an important part of managing service availability, security, and customer experience. It allows Site Reliability Engineers (SREs), developers, security teams, and infrastructure teams to gain insight into how their services are being consumed and to address any issues before they result in service outages or security incidents. Often, logs contain sensitive information that needs to be protected.
Consider a scenario where the applications team and the security operations team require access to the same set of logs. However, the applications team must not be able to see specific fields in the logs, and the security requirement is that those fields be masked or encrypted when presented back to the applications team. Field-level encryption of log data is difficult to achieve because it requires the ability to extract, transform, and load (ETL) the data before it is presented to the end user.
Now, you might be thinking, ETL? Do I need to build a data pipeline? What data formats do I need to use? What encryption libraries do I use? How do I protect the encryption keys? How do I scale the infrastructure to match increased demand in log ingestion and processing? Sounds complex, but it doesn't have to be. This tutorial walks you through how to build a secure data pipeline with Confluent Cloud and HashiCorp Vault.
Architecture
This section walks through an example architecture that can achieve the requirements covered earlier.
After exploring various log aggregation and data streaming services, this architecture uses Confluent Cloud, a cloud-native Apache Kafka® service, because it allows for easy provisioning of fully managed Kafka and provides ease of access, storage, and management of data streams. It also provides many data integration options.
The following covers the components used in this architecture and how they come together. Please note that configurations here are only for demonstration, and not to be used in a production environment.
Application
The application (app-a) is a simple JSON data generator that dumps logs to a specific volume. It is written in Python.
A Fluentd sidecar is configured to ingest the application logs and ship them to Confluent Cloud via a Fluentd Kafka plugin. The Fluentd plugin must have PKI certificates generated to be able to connect successfully to the Confluent Cloud platform; the generation of the certificates is taken care of by HashiCorp Vault.
Confluent Cloud
Log analytics is one of the use cases supported by Confluent, and Confluent Cloud is a core component of this architecture: it accelerates the deployment because you do not have to stand up a Kafka cluster yourself. Confluent Cloud will be set up with two topics:
- app-a-ingress: Kafka topic for ingesting and storing app-a logs.
- app-a-egress-dev: Kafka topic for the storage of the encrypted logs. The topic name has -dev here to represent the topic for transformed logs for the developer team. A managed Confluent connector will be set up to push the encrypted log data to a logging system, Elasticsearch, which is used by the developer team.
Confluent Cloud supports many different types of connectors; this blog sets up two sink connectors, Elasticsearch and Amazon S3. Check out the Confluent Hub for a comprehensive list of sinks.
HashiCorp Vault Enterprise
HashiCorp Vault Enterprise is an identity-based secrets and encryption management system. A secret is anything that you want to tightly control access to, such as API encryption keys, passwords, or certificates. Vault provides encryption services that are gated by authentication and authorization methods.
For encryption, this tutorial utilizes various encryption methods of Vault Enterprise including transit, masking, and format preserving encryption (FPE). For detailed information on the encryption methods, have a look at the How to Choose a Data Protection Method blog.
Transformer
Transformer (app-a-transformer-dev) is a service responsible for encrypting the JSON log data by calling HashiCorp Vault APIs. It is both a Kafka consumer and a Kafka producer: it reads logs from one topic and writes the encrypted JSON logs to another. The transformer is written in Python and uses the hvac Python Vault API client.
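As a rough illustration of the transformer's core step, the sketch below replaces selected fields of a JSON log record, addressed by dot-notation paths, with the output of an encryption callback. The names `transform_record` and `fake_encrypt` and the sample record are hypothetical and are not taken from the example repository; in the real service the callback would call Vault through hvac rather than a local stand-in.

```python
import copy
import json


def transform_record(record, key_paths, encrypt):
    """Return a copy of record with each dot-notation path replaced by encrypt(value)."""
    result = copy.deepcopy(record)
    for path in key_paths:
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        # Skip paths that are missing from this record instead of failing.
        if isinstance(node, dict) and leaf in node:
            node[leaf] = encrypt(node[leaf])
    return result


def fake_encrypt(value):
    # Stand-in for a Vault call; it only marks the value as protected.
    return "enc(" + json.dumps(value) + ")"


log = {"owner": {"email": "a@example.com", "name": "Ann"}, "level": "info"}
protected = transform_record(log, ["owner.email", "owner.missing"], fake_encrypt)
print(json.dumps(protected))
```

Working on a deep copy keeps the original record intact, which makes it easier to compare the ingress and egress versions of a log while debugging.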
Elasticsearch/Kibana
ELK is widely used for analysis of logs and dashboards. Confluent Cloud will push the encrypted logs to Elasticsearch.
Prerequisites
You should have the following installed:
- AWS CLI
- Amazon EKSCTL CLI
- Helm
- Vault CLI
- Kubernetes command-line interface (CLI)
- HashiCorp Vault Enterprise: To test out all the encryption features covered in this blog, you need an Enterprise license key. You can sign up for a free trial. For more information on installing a Vault Enterprise license, see the Vault documentation.
- The Vault Enterprise license key should be in a file named vault.hclic.
- Confluent Cloud subscription: You can sign up for a free trial.
- AWS account
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for an IAM user that can create and destroy EC2 instances, VPCs, and EKS clusters.
Clone example repository
Clone the learn-vault-secure-logs-confluent repo.
Move into the working directory.
Set up Confluent Cloud
Once logged in to Confluent Cloud, you need to set up the following.
After you log in, click Environments on the initial page.
Click +Add cloud environment.
Name the environment confl.
Choose a Stream Governance package. For this tutorial you want the Essentials free tier package. Then choose Begin configuration.
In the Enable Stream Governance Essentials screen, choose AWS as the cloud provider and a region that does not incur extra cost (for example, Ohio us-east-2), then choose Enable.
Add a cluster into the environment through the Create Cluster button.
In the Create Cluster page choose the Basic type and then select Begin configuration.
Choose a cloud provider to deploy the cluster to. This tutorial uses AWS, Singapore (ap-southeast-1), with a single zone. Choose Continue.
When the Enter payment card info page opens, choose Skip payment at the bottom left.
Choose Launch cluster.
In a short while, you will have a cluster up and running.
Configure topics
To configure the topics, select your cluster and choose Topics in the left navigation menu.
Click on Create topic, update the name to app-a-egress-dev, and then click on Create with defaults to use the default settings.
The topic Overview will appear for app-a-egress-dev.
Click on the Topics link on the left navigation panel once more.
Click on Create topic and update the name to app-a-ingress.
Then click on Create with defaults to use the default settings.
The topic Overview will appear.
API keys
To publish to or consume data from a topic, authentication is required. Confluent Cloud provides the ability to generate API keys with role-based access control (RBAC) permissions that control which topics can be consumed from or published to. This setup uses a Global Access API key. To set this up, go to the Confluent Cloud management console:
Under Cluster Overview, select the API Keys option in the left navigation menu.
Select the Create key button.
Select Global access and choose the Next button.
Download the API credentials. The API KEY, API SECRET, and BOOTSTRAP SERVER in this file will be used to configure Vault.
Bootstrap server details
You also need the bootstrap server details, which can be found on the cluster settings page.
Under Cluster Overview, choose Cluster settings in the left navigation menu to open the page.
Copy the Bootstrap server field. Keep a record of this information because it will be used for the application and transformer deployment configurations.
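To show where the bootstrap server and API key end up, here is a minimal sketch of a client configuration for Confluent Cloud. It assumes a librdkafka-based Python client such as confluent-kafka, which this tutorial does not explicitly name; the helper `confluent_cloud_config`, the placeholder host name, and the `group.id` value are illustrative assumptions. The API key and secret are passed as the SASL/PLAIN username and password.

```python
def confluent_cloud_config(bootstrap_server, api_key, api_secret, group_id=None):
    """Build a librdkafka-style configuration dict for Confluent Cloud."""
    config = {
        "bootstrap.servers": bootstrap_server,
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,      # Confluent Cloud API key
        "sasl.password": api_secret,   # Confluent Cloud API secret
    }
    if group_id is not None:
        # Only consumers need a consumer group; producers can omit it.
        config["group.id"] = group_id
        config["auto.offset.reset"] = "earliest"
    return config


# Placeholder values; substitute the details you recorded above.
producer_conf = confluent_cloud_config("pkc-example.confluent.cloud:9092", "KEY", "SECRET")
consumer_conf = confluent_cloud_config(
    "pkc-example.confluent.cloud:9092", "KEY", "SECRET", group_id="app-a-transformer-dev"
)
print(sorted(consumer_conf))
```

In this tutorial the credentials are stored in Vault rather than hardcoded, so a real deployment would fill these arguments from the secret it reads at startup.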
AWS EKS cluster
Set up your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, replacing with the appropriate values below.
Run the eksctl command shown below to create a VPC and a managed AWS EKS cluster. Since this is a temporary environment and to keep costs down, spot instances are used.
Note
This step can take a while (20+ minutes). The following message will be displayed when the EKS cluster is ready: `2022-11-18 11:08:52 [✔] EKS cluster "cluster-1" in "ap-southeast-1" region is ready`.
Create an IAM service account. This will map an AWS IAM role to a Kubernetes service account. The AWS IAM role will use a policy that allows EBS CSI Driver access.
The output should resemble this:
Retrieve and copy down your AWS Account number for use in the next step.
Add the aws-ebs-csi-driver to the EKS cluster. Update the AWS_ACCOUNT_NUMBER with the account number for your AWS account.
Output should resemble the following:
Vault server
Move your copy of a Vault Enterprise license to the current directory. The file should be named vault.hclic.
Start with adding the HashiCorp repo to Helm.
Copy your file with the Vault Enterprise license to the local directory.
Now you will copy the license key to a Kubernetes secret.
Install Vault on your cluster.
This will deploy a Vault Enterprise instance in development mode with the root token set to root.
Verify that Vault is deployed and running:
Note
Problems here are likely due to issues with the enterprise license file. Check that the Kubernetes secret vault-ent-license was successfully created.
Configure Vault
There are a few things you need to configure on Vault, including the Transit and Transform secrets engines and the Kubernetes authentication method.
Now you will connect to the Vault container and confirm you can access it.
Open a new terminal window.
Expose Vault externally to the Kubernetes cluster using port-forwarding:
Back in the original terminal window, set the AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID and run these:
You can check the status of Vault:
You should be able to see the Vault UI by navigating in your browser to http://localhost:8200.
KV secrets engine
The application and transformers will require access to the Confluent Cloud API keys and the bootstrap server details you recorded in the API keys and bootstrap server details steps above.
As part of InfoSec best practices, avoid hardcoding credentials.
Mount the KV secrets engine.
Store Confluent Cloud API keys for the application and transformer. Before running the command, update BOOTSTRAP_SERVER with the bootstrap server, API_key with the Confluent Cloud global API client ID, and API_SECRET with the Confluent Cloud global API client secret.
The results will resemble this:
Store the configuration that lists the JSON values to be encrypted and the encryption method to apply to each. The transformer fetches this configuration.
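For orientation, the following sketch shows how a client could read these stored values back over the Vault HTTP API. It is not the transformer's actual code, which uses hvac; the helper names are hypothetical. Because the KV version of the mount is not stated here, the sketch takes it as a parameter: KV version 2 adds a `data` segment to the URL and nests the secret one level deeper in the response than version 1.

```python
import json
import urllib.request


def kv_read_url(vault_addr, mount, path, kv_version):
    """Return the HTTP API URL for reading a secret from a KV mount."""
    if kv_version == 2:
        # KV v2 inserts "data" between the mount and the secret path.
        return f"{vault_addr}/v1/{mount}/data/{path}"
    return f"{vault_addr}/v1/{mount}/{path}"


def extract_secret(response_body, kv_version):
    """Pull the key/value pairs out of a parsed KV read response."""
    data = response_body["data"]
    # KV v2 wraps the secret in an extra "data" object next to "metadata".
    return data["data"] if kv_version == 2 else data


def read_secret(vault_addr, token, mount, path, kv_version):
    """Fetch and unwrap a secret. Needs a reachable Vault, so it is not called here."""
    request = urllib.request.Request(
        kv_read_url(vault_addr, mount, path, kv_version),
        headers={"X-Vault-Token": token},
    )
    with urllib.request.urlopen(request) as response:
        return extract_secret(json.load(response), kv_version)


print(kv_read_url("http://localhost:8200", "kv", "confluent-cloud", 2))
print(kv_read_url("http://localhost:8200", "kv", "app-a/config", 1))
```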
PKI secrets engine
The PKI secrets engine needs to be set up to provide X.509 certificates for the application, specifically the Fluentd sidecar. The Kafka plugin requires the certificates to make the connection to Confluent Cloud.
Enable PKI secrets engine.
Configure the CA Certificate and private key
Create a new PKI role.
Transit secrets engine
This section walks through the setup of the Vault Transit secrets engine. The requirements specify the need to encrypt the owner.email and choices.places_of_interest fields with the AES encryption method. Below are the Vault CLI commands to set up the secrets engine:
Enable the transit secrets engine.
Create a transit AES256 encryption key.
Create a convergent transit encryption key.
These steps mount the Transit secrets engine and configure two AES-256 encryption keys, which the transformer will use to encrypt the required fields in the logs.
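To make the Transit request format concrete, the sketch below builds the path and JSON body for an encrypt call on the Vault HTTP API. Transit expects the plaintext to be base64-encoded, and a key created with derivation (as convergent keys are) also requires a base64-encoded context. The helper name and the key names `example-key` and `example-convergent-key` are placeholders, not the names used in this tutorial's commands.

```python
import base64


def _b64(text):
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def transit_encrypt_request(key_name, plaintext, context=None, mount="transit"):
    """Return (path, body) for a Transit encrypt call on the Vault HTTP API."""
    body = {"plaintext": _b64(plaintext)}
    if context is not None:
        # Derived (convergent) keys require a base64-encoded context.
        body["context"] = _b64(context)
    return f"/v1/{mount}/encrypt/{key_name}", body


path, body = transit_encrypt_request("example-key", "a@example.com")
print(path, body)

path, body = transit_encrypt_request(
    "example-convergent-key", "Marina Bay", context="example-context"
)
print(path, sorted(body))
```

With a convergent key, sending the same plaintext and context again yields the same ciphertext, which allows equality matching on encrypted values.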
Transform secrets engine
The Transform secrets engine is a Vault Enterprise feature that allows for more advanced encryption capabilities.
To configure the Transform secrets engine, first mount the Transform secrets engine:
NRIC transform configuration
Singaporean security requirements dictate that NRIC (National Registration Identity Card) details must be masked. This template configuration specifies the regex pattern for the NRIC, while the transformation configuration specifies the type of transform (masking or format preserving encryption) to be done.
Create a template for the NRIC pattern.
Create a transformation for NRIC.
Telephone transform configuration
Security requirements also dictate that phone numbers must be encrypted with format preserving encryption (FPE).
Create a template for the phone number pattern.
Create a transformation for the phone number.
A transform role is configured to allow access to the two transformations (sg-nric-mask and sg-phone-fpe) created earlier.
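As a sketch of how a client would use that role, the snippet below builds the request for the Transform secrets engine's encode endpoint and reads the result from a response. The transformation names come from this tutorial; the role name `example-role`, the sample values, and the helper names are assumptions for illustration only.

```python
def transform_encode_request(role_name, value, transformation, mount="transform"):
    """Return (path, body) for a Transform encode call on the Vault HTTP API."""
    path = f"/v1/{mount}/encode/{role_name}"
    # Naming the transformation tells Vault which of the role's transformations to apply.
    body = {"value": value, "transformation": transformation}
    return path, body


def encoded_value(response_body):
    """Extract the encoded (masked or FPE-encrypted) value from a parsed response."""
    return response_body["data"]["encoded_value"]


# Placeholder inputs; the real values must match the template patterns you configured.
print(transform_encode_request("example-role", "S1234567A", "sg-nric-mask"))
print(transform_encode_request("example-role", "+65 9123 4567", "sg-phone-fpe"))
```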
Kubernetes auth method
Since the application and the transformer will be deployed on Kubernetes and require access to HashiCorp Vault, the Kubernetes authentication method is an effective way to enable this. To configure:
Set up an authentication service account on the Kubernetes cluster.
Create a secret used by Kubernetes authentication.
Enable the Kubernetes auth method.
You need to get a few details from the Kubernetes cluster to complete the Vault configuration.
Review the values.
Blank lines indicate a problem, so output should resemble the following:
Configure the Kubernetes auth method.
Kubernetes auth method roles
These roles will be used by the application and transformers to authenticate to Vault.
Create the application role.
Create the transformer role.
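To illustrate what a login against these roles involves, the sketch below builds the Kubernetes auth login request: the pod presents its service account JWT together with the Vault role name, and Vault returns a client token. In this tutorial the Vault Agent sidecar performs this step, so the code is only a hedged illustration; the role name `example-role` and the helper names are placeholders, and the demo writes a dummy token to a temporary file so it can run outside a pod.

```python
import os
import tempfile

# Standard location of the service account token inside a pod.
DEFAULT_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def kubernetes_login_request(role, token_path=DEFAULT_TOKEN_PATH, mount="kubernetes"):
    """Return (path, body) for a Kubernetes auth login on the Vault HTTP API."""
    with open(token_path, encoding="utf-8") as handle:
        jwt = handle.read().strip()
    return f"/v1/auth/{mount}/login", {"role": role, "jwt": jwt}


def client_token(response_body):
    """Extract the Vault token from a parsed login response."""
    return response_body["auth"]["client_token"]


# Demo with a dummy JWT so the sketch runs outside Kubernetes.
with tempfile.NamedTemporaryFile("w", suffix=".jwt", delete=False) as handle:
    handle.write("header.payload.signature\n")
path, body = kubernetes_login_request("example-role", token_path=handle.name)
os.unlink(handle.name)
print(path, body["role"])
```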
Configure Vault policies
The application will require access to the secrets configured earlier in the KV secrets engine section. To allow this, Vault policies need to be configured:
The transformer will require access to the transit and transform secrets engines for encryption.
Transformer
The transformer will retrieve certain configurations stored in Vault as per the steps in the KV secrets engine, specifically in the kv/app-a/config and kv/confluent-cloud paths. Here is a rundown of the configurations:
Configuration | Parameters | Description |
---|---|---|
client_id | string | Confluent Cloud global API client ID set up in API keys |
client_secret | string | Confluent Cloud global API client secret set up in API keys |
connection_string | string | Confluent Cloud bootstrap server found in Bootstrap server details |
keys_of_interest | key | The JSON key path (in . notation) |
keys_of_interest | method | Encryption method to use: aes, aes-converge, or transform (if using transform, the transformation name also needs to be specified) |
keys_of_interest | transformation | Specifies the name of the transformation configuration (masking, FPE, tokenization); these transformations were created in steps NRIC transform configuration and Telephone transform configuration |
transform_mount | string | Transform secrets engine path, configured in Transform secrets engine; default is transform |
transform_role_name | string | Transform role that has permissions to the transformations configured in NRIC transform configuration and Telephone transform configuration |
transit_mount | string | Transit secrets engine path, configured in Transit secrets engine |
transit_key_name | string | Name of Transit encryption key |
convergent_key_name | string | Name of Transit encryption key set with derived as true. Convergent encryption requires a context, which must be provided. Encryption operations yield the same ciphertext for the same plaintext and context when using this key. |
convergent_context_id | string (base64-encoded) | Context used for convergent encryption |
To build and deploy the transformer, run this command (from the learn-vault-secure-logs-confluent git repo directory):
The annotations in the deployment will configure a Vault Agent sidecar (listening on port 8200) and authenticate using the Kubernetes authentication method. Because agent-cache-enable and agent-cache-use-auto-auth-token are set to true, the Transformer can request secrets through the Vault Agent at http://localhost:8200, and the agent attaches its own auto-auth token to those requests.
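Since a mistyped method or a missing transformation name would only surface when a log arrives, it can help to check the keys_of_interest entries up front. The sketch below assumes the entries are a list of objects with `key`, `method`, and `transformation` fields, which is a guess based on the parameter descriptions above rather than the repository's actual schema; the function name and the method assigned to each sample field are also hypothetical.

```python
VALID_METHODS = {"aes", "aes-converge", "transform"}


def validate_keys_of_interest(entries):
    """Return a list of problems found; an empty list means the entries look usable."""
    problems = []
    for index, entry in enumerate(entries):
        method = entry.get("method")
        if not entry.get("key"):
            problems.append(f"entry {index}: missing key")
        if method not in VALID_METHODS:
            problems.append(f"entry {index}: unknown method {method!r}")
        elif method == "transform" and not entry.get("transformation"):
            # The transform method is only meaningful with a named transformation.
            problems.append(f"entry {index}: transform needs a transformation name")
    return problems


sample = [
    {"key": "owner.email", "method": "aes"},
    {"key": "choices.places_of_interest", "method": "aes-converge"},
    {"key": "owner.NRIC", "method": "transform", "transformation": "sg-nric-mask"},
    {"key": "owner.telephone", "method": "transform", "transformation": "sg-phone-fpe"},
]
print(validate_keys_of_interest(sample))
print(validate_keys_of_interest([{"key": "owner.NRIC", "method": "transform"}]))
```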
Once the Transformer is deployed, it will subscribe to the Confluent Cloud app-a-ingress topic and monitor for incoming logs. Logs are processed and then published to the app-a-egress-dev topic.
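The consume, transform, and produce cycle can be sketched as follows. The loop relies only on the `poll`, `error`, `value`, `produce`, and `flush` methods that confluent-kafka's Consumer, Message, and Producer objects provide; whether the real transformer uses that client is an assumption. Small fake classes stand in for Kafka so the example runs without a cluster, and `process_messages` is a hypothetical name. A real consumer would need to be subscribed to app-a-ingress before the loop starts.

```python
import json


def process_messages(consumer, producer, transform, out_topic, max_polls):
    """Poll up to max_polls times, transform each JSON log, and republish it."""
    handled = 0
    for _ in range(max_polls):
        message = consumer.poll(1.0)
        if message is None or message.error():
            continue  # nothing arrived within the timeout, or the message carries an error
        record = json.loads(message.value())
        producer.produce(out_topic, value=json.dumps(transform(record)).encode("utf-8"))
        handled += 1
    producer.flush()
    return handled


class FakeMessage:
    def __init__(self, payload):
        self._payload = payload

    def error(self):
        return None

    def value(self):
        return self._payload


class FakeConsumer:
    def __init__(self, payloads):
        self._payloads = list(payloads)

    def poll(self, timeout):
        return FakeMessage(self._payloads.pop(0)) if self._payloads else None


class FakeProducer:
    def __init__(self):
        self.sent = []

    def produce(self, topic, value=None):
        self.sent.append((topic, value))

    def flush(self):
        pass


producer = FakeProducer()
count = process_messages(
    FakeConsumer([b'{"level": "info"}']),
    producer,
    lambda record: {**record, "processed": True},  # stand-in for the Vault calls
    "app-a-egress-dev",
    max_polls=3,
)
print(count, producer.sent)
```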
Elasticsearch and Kibana
The encrypted logs will be sent to Elasticsearch and viewed in Kibana. This section covers a setup with ECK (Elastic Cloud on Kubernetes) per quickstart instructions.
Some modifications were made to the deployment, including exposing Elasticsearch to the internet with a LoadBalancer.
To install, run the following:
Create the instance of Elastic Cloud.
Apply the operator.
Deploy Elasticsearch and Kibana pods.
Once Elasticsearch is deployed and running, you need to capture a few configurations for the Confluent Cloud connector in the next section, such as the credentials for Elasticsearch. The default username is elastic. To get the password:
Note down the password:
You also need to note down the load balancer details (EXTERNAL-IP):
Confluent Cloud connectors
Confluent Cloud connectors provide fully managed connectivity to multiple data sources and sinks. In this case, you will set up two connectors:
- Elasticsearch Service Sink connector
- Amazon S3 Sink connector
Elasticsearch service sink connector
This connector will subscribe to the app-a-egress-dev topic (containing the encrypted JSON logs) and publish all messages to an instance of Elasticsearch, to be viewed in Kibana.
In the Confluent Cloud portal, select the cluster you created in the Set up Confluent Cloud steps. To set up the connector:
Select Connectors in the left navigation menu.
In the filters, search for Elasticsearch and select Elasticsearch Service Sink.
Choose the topic app-a-egress-dev and select Next.
On the Add Elasticsearch Service Sink connector page, in 2. Kafka credentials, choose Use an existing API key and enter the API key and secret that you downloaded earlier.
In the 3. Authentication section, add the load balancer details you noted down earlier in the Connection URI field and append :9200 to the URI. The Connection user is elastic, and the Connection password is the $PASSWORD value you wrote down earlier.
In 4. Configuration the Input Kafka record value format is JSON.
Open Show advanced configurations.
Both Key ignore and Schema ignore are true.
Data stream type and Data stream dataset are logs.
Everything else can be left with the default settings, and you can choose Continue.
In 5. Sizing, Tasks should be 1 then choose Continue.
In 6. Review and launch, the Connector name is ElasticsearchSink.
Review the settings below against the Connector configuration and if they match select Continue.
Setting | Value |
---|---|
topics | app-a-egress-dev |
Kafka Cluster Authentication mode | KAFKA_API_KEY |
Kafka API Key | Same key created in step API keys |
Kafka API Secret | Same secret created in step API keys |
Connection URI | <<loadbalancer_address>>:9200 |
Connection user | elastic |
Connection password | elastic password retrieved in step Elasticsearch and Kibana |
Enable SSL security | true |
Input messages | JSON |
Key ignore | true |
Schema ignore | true |
Data Stream Type | logs |
Data Stream Dataset | logs |
Number of tasks for this connector | 1 |
Name | ElasticsearchSink |
If there are no errors with the configuration, after a few minutes of provisioning you should now have an operational connector:
Check connector status
On the page that appears, make sure the connector has a status of Running.
Application and Fluentd
The application deployment consists of two components:
- The application (app-a) itself, which is a JSON data generator using the Mimesis data generator. It appends the generated JSON records to /fluentd/log/user.log.
- The Fluentd sidecar, which has the fluent-plugin-kafka installed. It tracks changes in /fluentd/log/user.log and uploads the JSON records to the app-a-ingress topic in Confluent Cloud.
The Fluentd sidecar requires a few configurations to work, including a few secrets:
- X.509 certificates for the fluent-plugin-kafka. The plugin requires the certificates to connect to the Confluent Cloud cluster broker.
- Confluent Cloud API credentials for the fluent-plugin-kafka plugin to authenticate as a producer and push the logs to the app-a-ingress topic.
These secrets will be provided by Vault, and these configurations will be passed as part of the deployment file.
The deployment file is below and makes use of Vault Agent Sidecar Annotations to retrieve the required secrets and render the Fluentd configuration file.
Deploy the application:
Once the application is deployed, it will begin to generate fake JSON data and append it to the /fluentd/log/user.log file.
View logs in Confluent Cloud
It is possible to see the messages being published in the Confluent Cloud topic.
To view them from the Confluent Cloud portal, select the topic you wish to view.
In the app-a-ingress topic, choose the Messages tab. You should see a live stream of JSON logs being pushed by app-a Fluentd sidecar. Below is an example:
Click on a message and look at the details.
In the app-a-egress-dev topic you should see a live stream of encrypted JSON logs being pushed by the Transformer. Below is an example:
Click on a message and look at the details.
The highlighted fields were encrypted successfully.
The owner.telephone field was put through a format preserving encryption transform and the owner.NRIC field was masked.
The owner.email and choices.places_of_interest fields were encrypted with the Vault Transit secrets engine. The secrets engine prefixes the ciphertext with vault:v1, indicating that it was encrypted by Vault using version 1 of the encryption key. This is important because the Vault Transit secrets engine can also perform key rotation; tracking which version of the key was used to encrypt is necessary to be able to decrypt the data.
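As a small illustration of that ciphertext format, the following sketch splits a Transit ciphertext into its key version and payload. The function name and the sample payloads are made up; only the `vault:v<N>:` prefix convention comes from Vault.

```python
def parse_transit_ciphertext(ciphertext):
    """Split 'vault:v<N>:<payload>' into (key_version, payload)."""
    prefix, version, payload = ciphertext.split(":", 2)
    if prefix != "vault" or not version.startswith("v") or not version[1:].isdigit():
        raise ValueError("not a Vault Transit ciphertext")
    return int(version[1:]), payload


print(parse_transit_ciphertext("vault:v1:AbCdEf=="))
print(parse_transit_ciphertext("vault:v3:ZmFrZQ=="))
```

A consumer of the egress topic could use the version number to spot logs encrypted with an older key after a rotation.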
Architecture considerations
Below are some important considerations related to this architecture:
- The Vault configuration is in development mode and should not be used in production. TLS was not enabled on the Vault API; a TLS listener should be configured in Vault.
- The Transformer optimizes encryption requests to HashiCorp Vault in batches using batch_input, which improves the encryption performance significantly.
- HashiCorp Vault Enterprise can be horizontally scaled by adding more nodes, allowing for scaling of encryption/decryption operations.
- Confluent Cloud API keys should be configured to provide least privilege access to resources such as topics. Please see Confluent Cloud API best practices for more details.
- Confluent Cloud has a number of networking options including different private networking options.
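To illustrate the batching consideration above, the sketch below builds a single Transit encrypt body that carries several plaintexts in `batch_input` and reads the ciphertexts back from `batch_results`, which Vault returns in the same order as the inputs. The helper names and the canned response are illustrative assumptions, not the Transformer's actual code.

```python
import base64


def transit_batch_body(plaintexts):
    """Build one Transit encrypt request body covering many plaintexts."""
    return {
        "batch_input": [
            {"plaintext": base64.b64encode(text.encode("utf-8")).decode("ascii")}
            for text in plaintexts
        ]
    }


def ciphertexts_from_batch(response_body):
    """Return ciphertexts from a parsed batch response, in input order."""
    return [item["ciphertext"] for item in response_body["data"]["batch_results"]]


body = transit_batch_body(["a@example.com", "b@example.com"])
print(len(body["batch_input"]))

# Canned response with made-up ciphertexts, shaped like a Transit batch reply.
fake_response = {
    "data": {"batch_results": [{"ciphertext": "vault:v1:AAA="}, {"ciphertext": "vault:v1:BBB="}]}
}
print(ciphertexts_from_batch(fake_response))
```

One request per batch of fields replaces one round trip per field, which is where the performance gain comes from.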
Clean up
Delete the cluster.
Unset all the environment variables.
Go into your AWS account and double check the CloudFormation stacks with "cluster-1" in their names. If they were deleted successfully, no such CloudFormation stacks will be present.
If there were issues deleting the CloudFormation stacks, you can manually delete the load balancer, internet gateway, and VPC associated with "cluster-1".
If you had to manually delete anything, return to CloudFormation and rerun the stack deletion. After a few minutes the stacks should be deleted.
Help and reference
HashiCorp Vault Enterprise and Confluent Cloud can work together to address various data protection requirements. This use case is not limited to just logs, but any data that is managed within Kafka/Confluent Cloud. Vault Enterprise can be deployed across any cloud and on premises, allowing it to stay near your data, minimizing latency and improving performance.
To learn more about Confluent Cloud and HashiCorp Vault, here are a few useful resources: