Security

In the example deployment, for ease of deployment and for demo purposes, all services have SSL security disabled and use the default built-in users and passwords.

From NiFi 1.15 onwards HTTPS is enforced, which requires users to generate their own certificates. Some publicly available default certificates are included in this repo as part of the demo, but users should ALWAYS generate their own for production environments.

The Elasticsearch instances are now also set up with certificates, mainly because this would almost always be a requirement in a production deployment.

IMPORTANT: Please note that the actual security configuration will depend on the requirements of the user/team/organisation planning to use the services stack. The information provided in this README should therefore be considered only as a hint, and should be discussed with the key stakeholders before any production use.

Directory structure

./security
├── certificates_elasticsearch.env <---- env vars for ES certificates, same vars are used for both native ES & Opensearch
├── certificates_general.env <---------- root-ca env vars
├── certificates_nifi.env <------------- NiFi cert env vars
├── create_es_native_certs.sh <--------- Use this to create certificates for Elasticsearch Native (NOT FOR OPENSEARCH!)
├── create_es_native_credentials.sh <--- Use this after starting up the ES containers to create the base users for ES (NOT FOR OPENSEARCH!)
├── create_keystore.sh <---------------- Used for opensearch node cert generation
├── create_opensearch_admin_cert.sh  <-- Admin certs for Opensearch Kibana
├── create_opensearch_client_cert.sh <-- Generates certificates for client apps to access ES
├── create_opensearch_internal_passwords.sh <- Optional way of generating passwords for Opensearch Admin & Kibana accounts
├── create_opensearch_node_cert.sh <---- Use this to create certificates for the OpenSearch ES nodes
├── create_opensearch_users.sh <-------- Script to set up users for Opensearch after start-up, needs manual execution.
├── create_root_ca_cert.sh <------------ Script for generating root CA, used for NiFi/OpenSearch/Jupyterhub/OCR service
├── database_users.env <---------------- DB users env vars, for both production and samples DB
├── elasticsearch_users.env <----------- OpenSearch/ES native users, used in 'deploy/services.yml' and 'elasticsearch.yml' files for Kibana/ES and 'metricbeat.yml'
├── es_certificates <------------------- This is where OpenSearch/Elasticsearch certificates will go once generated.
├── es_native_cert_generator.sh <------- This is the script used to generate native ES certificates (NOT for Opensearch), used in create_es_native_credentials.sh
├── es_roles <-------------------------- This folder stores Elasticsearch native/Opensearch account roles and role_mappings.
├── nginx_users.env <------------------- Nginx users
├── nifi_certificates <----------------- Location of NiFi certificates post-generation.
├── nifi_toolkit_security.sh <---------- Script for generating NiFi certificates
├── root-ca-truststore.key <------------ all `root-ca` files are generated by the `create_root_ca_cert.sh` script
├── root-ca.key <------------------------|
├── root-ca.keystore.jks <---------------|
├── root-ca.p12 <------------------------|
├── root-ca.pem <------------------------|
├── root-ca.srl <------------------------|
└── ssl-extensions-x509.cnf <----------- x509 settings used in OpenSearch admin cert and node cert script(s)

The .env files are used to define local env variables that are used in the services.yml file and for certificate generation. The ones that are used and should be modified depending on the deployment are:

  • certificates_nifi.env - NiFi certificate vars.

  • certificates_elasticsearch.env - ES certificate definitions. An important bit here are the ES_INSTANCE_NAME_1/2/3 vars, which control the location of the certificates in the services.yml file and also the location of the certificates in the es_certificates folder.

  • database_users.env - production and sample DB users; the user should be changed for a production environment.

  • elasticsearch_users.env - all users used for ES native and OpenSearch deployments are declared here.
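As a minimal sketch of how a shell can pick up variables from such a .env file (the repository's own export_env_vars.sh may well do this differently), the snippet below writes a throwaway file and exports its contents with set -a. The file name demo.env, the helper load_env_file and the value are placeholders; ES_INSTANCE_NAME_1 is the only name taken from this README.

```shell
# Sketch: export every variable defined in a .env file into the current shell.
# demo.env and its contents are placeholders, not files from this repo.
load_env_file() {
    # set -a marks every variable assigned while sourcing for export
    set -a
    . "$1"
    set +a
}

tmp_dir=$(mktemp -d)
printf 'ES_INSTANCE_NAME_1=elasticsearch-1\n' > "$tmp_dir/demo.env"

load_env_file "$tmp_dir/demo.env"
echo "ES_INSTANCE_NAME_1 is now: $ES_INSTANCE_NAME_1"

rm -rf "$tmp_dir"
```

Because the variable is exported, child processes such as certificate scripts started from the same shell would see it too.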

IMPORTANT NOTE

IMPORTANT: RUN THE STEP BELOW EVERY TIME YOU UPDATE ANY SECURITY ENV VARIABLES.

Assuming you are in the security folder:

  1. run source ../deploy/export_env_vars.sh <– needed to set the env vars if you have modified them in the above files.
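After sourcing, a small guard can confirm that the variables a script relies on are actually set. The following is only a sketch: require_vars and DEMO_UNSET_VAR are hypothetical names introduced for illustration, and which variables to check depends on your deployment.

```shell
# Sketch: fail early if a variable expected by the certificate scripts is unset.
# The variable names passed in are up to the caller; nothing here is repo-specific.
require_vars() {
    missing=0
    for name in "$@"; do
        # eval-based indirect lookup keeps this POSIX-sh compatible
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: one variable set, one deliberately left unset.
ES_INSTANCE_NAME_1=elasticsearch-1
unset DEMO_UNSET_VAR
require_vars ES_INSTANCE_NAME_1 && echo "ES_INSTANCE_NAME_1 is set"
require_vars DEMO_UNSET_VAR || echo "DEMO_UNSET_VAR still needs a value"
```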

Generation of self-signed certificates

If you need to generate self-signed certificates for the services, the following scripts are provided:

  • create_root_ca_cert.sh - creates root CA key and certificate, used for NiFi, MedCAT service, Jupyterhub, ocr-service etc.

  • create_opensearch_client_cert.sh - creates the client key and certificate for external apps

  • create_keystore.sh - creates the JKS keystore using previously generated (client) certificates, used in create_opensearch_node_cert.sh

  • create_opensearch_users.sh - creates system users for OpenSearch instances, to be used after finishing the container startup(s)

  • create_opensearch_admin_cert.sh - creates certs for OpenSearch Dashboard (Kibana)

  • create_opensearch_node_cert.sh - creates certificates for OpenSearch nodes

  • create_es_native_certs.sh - creates certificates for pure Elasticsearch (ES native) nodes only

Root CA

Running create_root_ca_cert.sh generates the following key files:

  • key: root-ca.key

  • certificate (PEM): root-ca.pem

  • keystore: root-ca.keystore.jks

  • p12 cert: root-ca.p12
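To check quickly that the root CA step produced these files, something along the following lines could be used. This is a sketch: missing_root_ca_files is a hypothetical helper, and the demo runs against a scratch directory with empty stand-in files rather than real certificates.

```shell
# Sketch: report which of the root CA files listed above are absent from a directory.
missing_root_ca_files() {
    dir=$1
    for f in root-ca.key root-ca.pem root-ca.keystore.jks root-ca.p12; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Demo against a scratch directory holding two empty stand-in files.
demo_dir=$(mktemp -d)
touch "$demo_dir/root-ca.key" "$demo_dir/root-ca.pem"
missing_root_ca_files "$demo_dir"
rm -rf "$demo_dir"
```

In the security folder itself, missing_root_ca_files . would print nothing once all four files exist.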

Generating the base certificates for NiFi/Nginx/JupyterHub/OCR-service/Tika/MedCAT service

Configure certificate settings for NiFi in certificates_nifi.env and for the root CA in certificates_general.env.

Assuming you are in the security folder:

  1. run source ../deploy/export_env_vars.sh <– needed to set the env vars if you have modified them in the above files.

  2. run bash create_root_ca_cert.sh

  3. run bash nifi_toolkit_security.sh

You must run them in the above order as the root CA is required by the NiFi toolkit.
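The ordering constraint can be made explicit with a small wrapper that chains the steps and stops at the first failure. This is a sketch only: run_step, generate_base_certs and the DRY_RUN switch are hypothetical names, and by default the commands are only printed so that nothing is generated by accident.

```shell
# Sketch: run the three steps in order and stop at the first failure.
# DRY_RUN=1 (the default here) only prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

generate_base_certs() {
    run_step . ../deploy/export_env_vars.sh &&
    run_step bash create_root_ca_cert.sh &&
    run_step bash nifi_toolkit_security.sh
}

generate_base_certs
```

Setting DRY_RUN=0 from inside the security folder would execute the steps for real, assuming the scripts are present as described above.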

ELK stack

Follow the instructions carefully; there are a few sections detailing the differences between Elastic versions.

Generating Elasticsearch native/OpenSearch + KIBANA/OpenSearch Dashboard CERTS

Elasticsearch/OpenSearch Security Requirements

Each version has its own scripts for generating the necessary certificates. All security variables used within the .sh scripts for CERTIFICATE GENERATION are set in the following files:

  • ./certificates_elasticsearch.env

  • ./certificates_general.env

  • ./certificates_nifi.env

Please pay attention to the following sections; they describe what is needed to secure each type of ES deployment (OpenSearch/native ES).

Common certificates used for all ES types

Certificate naming is now common across ES versions. The deployment requires the following certificates, available in the security folder:

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elastic-stack-ca.crt.pem

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elastic-stack-ca.key.pem

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.p12

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/http-elasticsearch-1.key

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.key

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/http-elasticsearch-2.p12

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.p12

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/http-elasticsearch-3.key

The ${ELASTICSEARCH_VERSION} variable MUST be set in deploy/elasticsearch.env before starting any container! It also determines which certificates are mounted, seamlessly, according to the ES version: for native ES the certificate files are in security/es_certificates/elasticsearch, and for the OpenSearch variant in security/es_certificates/opensearch/.

IMPORTANT NOTE: the es_certificates folder is mounted inside NiFi so that you can load certificates seamlessly without the need to restart the NiFi service.
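The naming scheme in the list above is regular enough to be generated, which can help when checking that nothing is missing. The sketch below assumes three nodes and that ${ELASTICSEARCH_VERSION} takes values such as opensearch or elasticsearch (an assumption based on the folder names mentioned above); expected_es_cert_paths is a hypothetical helper.

```shell
# Sketch: print the certificate paths listed above for a given ES flavour.
# The argument mirrors ${ELASTICSEARCH_VERSION}; three nodes are assumed.
expected_es_cert_paths() {
    base="es_certificates/$1/elasticsearch"
    echo "$base/elastic-stack-ca.crt.pem"
    echo "$base/elastic-stack-ca.key.pem"
    for n in 1 2 3; do
        for ext in crt key p12; do
            echo "$base/elasticsearch-$n/http-elasticsearch-$n.$ext"
        done
    done
}

expected_es_cert_paths opensearch
```

Piping the output through a loop with [ -f "$path" ] from inside the security folder would flag any certificate that has not been generated yet.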


For OpenSearch

For information on OpenSearch security features and their configuration please refer to the official documentation.

Make sure to execute the following commands: bash ./create_opensearch_node_cert.sh elasticsearch-1 && bash ./create_opensearch_node_cert.sh elasticsearch-2 && bash ./create_opensearch_node_cert.sh elasticsearch-3. This generates the certificates for all 3 nodes. Then generate the ADMIN authorization certificate by running bash ./create_opensearch_admin_cert.sh.

The keystore/truststore certificates are also generated when creating the node certificates; these are used in the NiFi workflows.
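For clusters with a different number of nodes, the per-node commands can be produced in a loop. The sketch below only prints the commands; it uses the script names as they appear in the directory listing (create_opensearch_node_cert.sh and create_opensearch_admin_cert.sh), and opensearch_cert_commands is a hypothetical helper.

```shell
# Sketch: build the node-certificate commands for every node, then the admin one.
# Commands are only printed; pipe the output to `sh` to execute them.
opensearch_cert_commands() {
    for node in "$@"; do
        echo "bash ./create_opensearch_node_cert.sh $node"
    done
    echo "bash ./create_opensearch_admin_cert.sh"
}

opensearch_cert_commands elasticsearch-1 elasticsearch-2 elasticsearch-3
```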


For Elasticsearch Native

As part of our deployment we also provide the native Elasticsearch version, since it is used across many organisations in production environments (see the official documentation). Please note that deploying the native ES version requires changing some settings from the current repository state.

To generate the above certificates, all that is needed is to run create_es_native_certs.sh.


There are a few variables related to the certificate names; please read the following carefully:

  • ES_INSTANCE_NAME_1 - usually set to the same name as ELASTICSEARCH_NODE_1_NAME from /deploy/elasticsearch.env. It is used to determine the certificate paths and also appears in the certificate hostname SUBJ lines. There are two more variables with the same name apart from the number, one for each of the other nodes.

  • ES_INSTANCE_ALTERNATIVE_1_NAME - used along with ES_INSTANCE_NAME_1 to provide additional hostnames for the certificate generation; also useful in case the node name differs from the Elasticsearch hostname.

  • ES_HOSTNAMES - set all your hostnames here. They should include the names of the nodes as well as any additional hostnames and DNS names. Please follow the exact indentation used in the .env file. If that does not work, export the variable manually: export ES_HOSTNAMES="- elasticsearch-1
    - elasticsearch-2
    - elasticsearch-3
    "

  • ES_CLIENT_SUBJ_ALT_NAMES and ES_NODE_SUBJ_ALT_NAMES - set these with additional domain names as needed; both client and node should include the node hostnames and the Kibana hostname(s).
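Since the ES_HOSTNAMES value is indentation-sensitive, it may be easier to assemble it from a plain list of hosts than to type it by hand. The sketch below assumes that continuation lines are indented by four spaces, as in the manual export shown above; verify this against your own .env file. build_es_hostnames is a hypothetical helper.

```shell
# Sketch: assemble a YAML-style host list for ES_HOSTNAMES from plain arguments.
# Assumption: continuation lines are indented by four spaces, as in the manual
# export shown above; check the indentation used in your own .env file.
build_es_hostnames() {
    first=1
    for host in "$@"; do
        if [ "$first" = "1" ]; then
            printf '%s\n' "- $host"
            first=0
        else
            printf '%s\n' "    - $host"
        fi
    done
}

ES_HOSTNAMES=$(build_es_hostnames elasticsearch-1 elasticsearch-2 elasticsearch-3)
export ES_HOSTNAMES
printf '%s\n' "$ES_HOSTNAMES"
```

Note that command substitution strips the trailing newline, so the exported value ends directly after the last hostname.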

Kibana

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/elasticsearch-1.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-1/elasticsearch-1.key

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/elasticsearch-2.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-2/elasticsearch-2.key

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/elasticsearch-3.crt

  • es_certificates/${ELASTICSEARCH_VERSION}/elasticsearch/elasticsearch-3/elasticsearch-3.key

  • es_certificates/ca/ca.crt

These certificates are generated by the steps mentioned in the above Elasticsearch Native section.


OpenDashboard (OpenSearch version of Kibana)

OpenDashboard requires:

  • admin.pem

  • admin-key.pem

  • es_kibana_client.pem

  • es_kibana_client.key

As in the Kibana section above, the node certificates are used under the same names, but they come from the es_certificates/opensearch/ folder.

Once generated, the files can be further referenced in services/kibana/config/kibana_opensearch.yml and/or linked directly in the Docker compose file with services configuration.


Users and roles in ElasticSearch/OpenSearch

Generating users

Users and passwords environment variables

The sample users and passwords are specified in the following .env files in the security/ directory:

  • elasticsearch_users.env - contains passwords for ElasticSearch internal users.

  • database_users.env - contains account details for both production and samples DB instances

  • nginx_users.env - nginx account

Setting up OpenSearch

Please see the security/opensearch folder for the role mappings and the internal users' data. You can also use the create_opensearch_users.sh script for this.

On the first run, after changing the default passwords, one should change the default admin and kibanaserver passwords as specified in the OpenSearch documentation.

To do so, one can:

  • run the script create_opensearch_internal_passwords.sh to generate hashes,

  • modify the internal_users.yml file with the generated hashes,

  • restart the stack, using docker-compose down -v when bringing it down so that the volume data is removed.

Next, one should modify the default passwords for the other built-in users (logstash, kibanaro, readall, snapshotrestore) and create custom users (cogstack_pipeline, cogstack_user, nifi), as specified below. The script create_opensearch_users.sh creates and sets up example users and roles in the cluster.

Setting up Elasticsearch

For configuring default users, please see the following env file:

  • ./elasticsearch_users.env - used by the create_es_native_credentials.sh script, which is run after the ES containers have started and creates all the default users. If you wish to add more users, take a look at the official documentation on how to create roles and accounts.

  • This script also creates a SERVICE ACCOUNT TOKEN which can be used for Kibana configuration. Please copy the token manually into the ELASTICSEARCH_SERVICE_ACCOUNT_TOKEN variable in elasticsearch.env.

New roles

Example new roles that will be created after running create_opensearch_users.sh:

  • ingest - used for data ingestion, only cogstack_* and nifi_* indices can be used,

  • cogstack_access - used for read-only access to the data only from cogstack_* and nifi_* indices.

New users

Example new users that will be created after running create_opensearch_users.sh:

  • cogstack_pipeline - uses ingest role (deprecated),

  • nifi - uses ingest role,

  • cogstack_user - uses cogstack_access role.

JupyterHub

As with the ELK stack, one should obtain certificates for JupyterHub to secure access to the exposed endpoint. The certificates generated by create_root_ca_cert.sh can be referenced directly in the services.yml file of the example deployment or in the internal JupyterHub configuration file. The cookie secret is a key used to encrypt browser cookies. Please use the generate_cookie_secret.sh script (./services/jupyter-hub/generate_cookie_secret.sh) to generate a new key, and make sure this is done before starting the container.
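For orientation, a cookie secret of this kind is commonly a random 32-byte value written as 64 hexadecimal characters. The sketch below shows one portable way to produce such a value; it is not taken from generate_cookie_secret.sh, which may work differently, and generate_cookie_secret is a hypothetical function name.

```shell
# Sketch: produce a 32-byte random secret, hex-encoded (64 characters).
# This is one common way to create such a key; the repo script may differ.
generate_cookie_secret() {
    head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

secret=$(generate_cookie_secret)
echo "secret length: ${#secret}"
```

If you store such a value in a file, restrict its permissions (for example chmod 600) so that only the service account can read it.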

One should also configure and set up users. The default user is admin, and its password is set the first time the account is logged in to (be careful: if you make a mistake, delete the jupyter container and its volumes and restart). See example deployment services for more details.

Once the container has started up, you can create your users and also assign them to groups.

You can create users beforehand by adding new lines to the userlist file (services/jupyter-hub/config/userlist). Users with admin roles need to have their role specified on the same line, e.g. user_name admin.

If you want to create shared folders for users, add them to the teamlist file (services/jupyter-hub/config/teamlist). The first column is the shared folder name and the remaining columns are the usernames assigned to it.
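To double-check who will receive admin rights before starting the container, the userlist can be filtered on its second column. This is a sketch that assumes the whitespace-separated "user_name [admin]" format described above; list_admins and the sample user names are made up for the demo.

```shell
# Sketch: list the admin users declared in a userlist-style file.
# Format assumed from the description above: "<user_name> [admin]" per line.
list_admins() {
    awk '$2 == "admin" { print $1 }' "$1"
}

demo_file=$(mktemp)
printf '%s\n' 'alice admin' 'bob' 'carol admin' > "$demo_file"
list_admins "$demo_file"
rm -f "$demo_file"
```

Against the real file this would be list_admins services/jupyter-hub/config/userlist, run from the repository root.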

For more information on JupyterHub security features and their configuration please refer to the official documentation.

Apache NiFi

For securing the Apache NiFi endpoint with self-signed certificates, please refer to the official documentation.

When connecting to services that use self-signed certificates (such as Elasticsearch), these certificates must be in JKS keystore format. The keystores can be generated using create_keystore.sh. Usage: bash create_keystore.sh <cert_name> <jks_store> (the password is optional).

NGINX

Alternatively, one can secure the access to selected services by using NGINX reverse proxy. This may be essential in case some of the web services that need to be exposed to end-users do not offer SSL encryption. See the official documentation for more details on using NGINX for that.

Nginx only requires the root CA certificate by default, so follow the Root CA section above to create it.