Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THREESCALE-11156 add tls env var to zync and system deployments #1026

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

austincunningham
Copy link
Contributor

@austincunningham austincunningham commented Oct 15, 2024

Issue

THREESCALE-11156

What

Add database client side tls to 3scale

Verification

# install crd's
make install
# create project
oc new-project 3scale-test
# get the wildcarddomain
DOMAIN=$(oc get routes console -n openshift-console -o json | jq -r '.status.ingress[0].routerCanonicalHostname' | sed 's/router-default.//')
# create s3 secret
oc apply -f - <<EOF
---
kind: Secret
apiVersion: v1
metadata:
  name: s3-credentials
data:
  AWS_ACCESS_KEY_ID: UkVQTEFDRV9NRQ==
  AWS_BUCKET: UkVQTEFDRV9NRQ==
  AWS_REGION: UkVQTEFDRV9NRQ==
  AWS_SECRET_ACCESS_KEY: UkVQTEFDRV9NRQ==
type: Opaque
EOF

kubectl apply -f - <<EOF                    
---                                    
apiVersion: apps.3scale.net/v1alpha1   
kind: APIManager                     
metadata:
  name: apimanager-sample
spec:
  system:
    systemDatabaseTLSEnabled: true
    database:
      postgresql: {}
    fileStorage:
      simpleStorageService:
        configurationSecretRef:
          name: s3-credentials
  zync:
    zyncDatabaseTLSEnabled: true
  wildcardDomain: $DOMAIN
  externalComponents:
    backend:
      redis: true
    system:
      database: true
      redis: true
    zync:
      database: true
EOF
  • Setup the system-database and zync secrets with sslmode disabled (need to be run from the postgres repo as thats where the certs are created)
oc create secret generic zync \
  --namespace=3scale-test \
  --from-literal=DATABASE_SSL_MODE=verify-ca \
  --from-literal=DATABASE_URL=postgresql://postgres:[email protected]/zync_production \
  --from-literal=ZYNC_DATABASE_PASSWORD=postgres \
  --from-file=DB_SSL_CA=rootCA.crt \
  --from-file=DB_SSL_CERT=client.crt \
  --from-file=DB_SSL_KEY=client.key 
oc create secret generic system-database \
  --from-literal=DATABASE_SSL_MODE=verify-ca \
  --from-literal=URL=postgresql://postgres:[email protected]/system \
  --from-literal=DB_USER=postgres \
  --from-literal=DB_PASSWORD=postgres \
  --from-file=DB_SSL_CA=rootCA.crt \
  --from-file=DB_SSL_CERT=client.crt \
  --from-file=DB_SSL_KEY=client.key 
  • Create the default redis and wait for the deployments to be up
 make cluster/create/system-redis
 make cluster/create/backend-redis
  • Run the operator locally
make run
  • zync-que may crash as the database needs some bootstrapping, exec into the pod and run the bootstrap, Not tls related as happens without tls enabled.
bin/rails db:create
  • wait for the install to complete
  • exec into the zync and system pods to to confirm the env vars have been set e.g.
# get the pod name
 oc get po
NAME                                     READY   STATUS      RESTARTS   AGE
apicast-production-6d4d589848-shlls      1/1     Running     0          3h5m
apicast-staging-767df4cbd8-m2kf5         1/1     Running     0          3h5m
backend-cron-6bbd4d5846-hmc7r            1/1     Running     0          3h5m
backend-listener-9df5df447-842qk         1/1     Running     0          3h5m
backend-redis-b59dd94cb-pksk7            1/1     Running     0          3h5m
backend-worker-64d47694b8-89q2g          1/1     Running     0          3h5m
system-app-85c4d98dd-j9mtr               3/3     Running     0          4m40s
system-app-post-884dp                    0/1     Completed   0          3h3m
system-app-pre-wlzrd                     0/1     Completed   0          3h5m
system-memcache-6b4df58598-q6mzj         1/1     Running     0          3h5m
system-redis-bf8db486d-h5bq9             1/1     Running     0          3h5m
system-searchd-659bffb8f4-swf4g          1/1     Running     0          5m50s
system-searchd-manticore-reindex-5jjb4   0/1     Completed   0          3h5m
system-sidekiq-59cc7dd9ff-l6pxg          1/1     Running     0          5m53s
zync-6f49fb7c5c-2pltl                    0/1     Init:0/2    0          7m51s
zync-que-fdd4d69f7-7gxdx                 1/1     Running     0          7m50s
#e.g. exec and run printenv to check the envars
oc exec -it zync-que-fdd4d69f7-7gxdx -- printenv | grep DATABASE_SSL
Defaulted container "que" out of: que, set-permissions (init)
DATABASE_SSL_KEY=/tls/tls.key
DATABASE_SSL_MODE=verify-ca
DATABASE_SSL_CA=/tls/ca.crt
DATABASE_SSL_CERT=/tls/tls.crt
  • Add that the watch-by label has been applied to zync and system-database secret
oc label secret zync apimanager.apps.3scale.net/watched-by=apimanager -n <operator-namespace>
oc label secret system-database apimanager.apps.3scale.net/watched-by=apimanager -n <operator-namespace>
  • update the zync and system-database secrets key value DATABASE_SSL_MODE=disable

NOTE: side note the flag is different for MySQL which is disabled

  • exec into the zync and system pods to to confirm the env vars have been set e.g.
# get the pod name
 oc get po
NAME                                     READY   STATUS      RESTARTS   AGE
apicast-production-6d4d589848-shlls      1/1     Running     0          3h5m
apicast-staging-767df4cbd8-m2kf5         1/1     Running     0          3h5m
backend-cron-6bbd4d5846-hmc7r            1/1     Running     0          3h5m
backend-listener-9df5df447-842qk         1/1     Running     0          3h5m
backend-redis-b59dd94cb-pksk7            1/1     Running     0          3h5m
backend-worker-64d47694b8-89q2g          1/1     Running     0          3h5m
system-app-85c4d98dd-j9mtr               3/3     Running     0          4m40s
system-app-post-884dp                    0/1     Completed   0          3h3m
system-app-pre-wlzrd                     0/1     Completed   0          3h5m
system-memcache-6b4df58598-q6mzj         1/1     Running     0          3h5m
system-redis-bf8db486d-h5bq9             1/1     Running     0          3h5m
system-searchd-659bffb8f4-swf4g          1/1     Running     0          5m50s
system-searchd-manticore-reindex-5jjb4   0/1     Completed   0          3h5m
system-sidekiq-59cc7dd9ff-l6pxg          1/1     Running     0          5m53s
zync-6f49fb7c5c-2pltl                    0/1     Init:0/2    0          7m51s
zync-que-fdd4d69f7-7gxdx                 1/1     Running     0          7m50s
#e.g. exec and run printenv to check the envars
oc exec -it zync-que-fdd4d69f7-7gxdx -- printenv | grep DATABASE_SSL
Defaulted container "que" out of: que, set-permissions (init)
DATABASE_SSL_KEY=/tls/tls.key
DATABASE_SSL_MODE=disable
DATABASE_SSL_CA=/tls/ca.crt
DATABASE_SSL_CERT=/tls/tls.crt

@austincunningham
Copy link
Contributor Author

/test test-e2e

@austincunningham austincunningham force-pushed the THREESCALE-11156 branch 7 times, most recently from 52d6f47 to de4820b Compare December 13, 2024 15:57
@austincunningham
Copy link
Contributor Author

/test test-e2e

@austincunningham austincunningham changed the title WIP THREESCALE-11156 add tls env var to zync and system deployments THREESCALE-11156 add tls env var to zync and system deployments Jan 10, 2025
@austincunningham austincunningham force-pushed the THREESCALE-11156 branch 9 times, most recently from 8c46579 to f626b13 Compare January 15, 2025 08:03
THREESCALE-11156 add volume mounts for secrets

THREESCALE-11156 fix default internal to work without tls

THREESCALE-11156 set cert permissions

THREESCALE-11156 add apimanger flags for TLS

THREESCALE-11156 fix e2e

THREESCALE-11156 add Database TLS doc

THREESCALE-11156 add watchby logic

THREESCALE-11156 removed internal database changes
helper.EnvVarFromSecretOptional("DB_SSL_CERT", SystemSecretSystemDatabaseSecretName, SystemSecretSslCert),
helper.EnvVarFromSecretOptional("DB_SSL_KEY", SystemSecretSystemDatabaseSecretName, SystemSecretSslKey),
helper.EnvVarFromSecretOptional("DATABASE_SSL_MODE", SystemSecretSystemDatabaseSecretName, SystemSecretDatabaseSslMode),
helper.EnvVarFromValue("DATABASE_SSL_CA", helper.TlsCertPresent("DATABASE_SSL_CA", SystemSecretSystemDatabaseSecretName, system.Options.SystemDbTLSEnabled)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, let's move the creation of empty keys + validation of keys + validation of values here: https://github.com/3scale/3scale-operator/blob/master/pkg/3scale/amp/operator/highavailability_options_provider.go#L29

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree its better if the reconciler catches this.

Copy link
Contributor Author

@austincunningham austincunningham Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the secret validation should check if the secret values are populated in the highavailability reconcile and populate with an empty string if not set and return an error.

Two things here

  • It doesn't look like highavailability_options_provider GetHighAvailabilityOptions() handles zync database at the moment (maybe because spec.highavaibility is deprecated). Can add the functionally but am now doubting this is the correct place to do so.
  • And may still leave the helper to populate the env vars that are not available in the secret .
    i.e. the env vars in the secret have actual certs where as the env var that are used have paths to certs

MountPath: "/var/lib/searchd",
},
},
//Env: s.commonSearchdEnvVars(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it was commented during some debugging . So I would say its required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having reviewed its not required.

@@ -338,6 +338,7 @@ Some examples are available [here](/doc/adding-apicast-custom-environments.md)
| MemcachedPriorityClassName | `memcachedPriorityClassName` | string | No | N/A | If specified, indicates the pod's priority. "system-node-critical" and "system-cluster-critical" are two special keywords which indicate the highest priorities with the former being the highest priority. Any other name must be defined by creating a PriorityClass object with that name. If not specified, the pod priority will be default or zero if there is no default. (see [docs](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/)) |
| MemcachedTopologySpreadConstraints | `memcachedTopologySpreadConstraints` | \[\][v1.TopologySpreadConstraint](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#topologyspreadconstraint-v1-core) | No | `nil` | Specifies how to spread matching pods among the given topology |
| MemcachedLabels | `memcachedLabels` | map[string]string | No | `nil ` | Specifies labels that should be added to component |
| SystemDatabaseTLSEnabled | `systemDatabaseTLSEnabled`| bool | No | false | Use external databases only |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use external databases only - is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point we are only using external from now on.

@@ -528,6 +529,7 @@ Note: Deploying databases internally with this section is meant for evaluation p
| DatabaseTopologySpreadConstraints | `databaseTopologySpreadConstraints` | \[\][v1.TopologySpreadConstraint](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#topologyspreadconstraint-v1-core) | No | `nil` | Specifies how to spread matching pods among the given topology |
| DatabaseLabels | `databaseLabels` | map[string]string | No | `nil ` | Specifies labels that should be added to component |
| DatabaseAnnotations | `databaseAnnotations` | map[string]string | No | `nil ` | Specifies Annotations that should be added to component |
| ZyncDatabaseTLSEnabled | `zyncDatabaseTLSEnabled`| bool | No | false | Use external databases only |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use external databases only - is that correct? Should we mention here what the TLSEnabled is doing rather than what it should be used with?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

| DATABASE_SSL_MODE | [Database SSL Mode](https://github.com/brianmario/mysql2?tab=readme-ov-file#ssltls-options) | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CA | SSL CA certificate | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CERT | SSL CERT certificate | Required to set TLS Database connection. Only for TLS |
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
| DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |

| DATABASE_SSL_MODE | [Database SSL Mode](https://www.postgresql.org/docs/current/libpq-ssl.html#LIBPQ-SSL-PROTECTION) | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CA | SSL CA certificate | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CERT | SSL CERT certificate | Required to set TLS Database connection. Only for TLS |
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
| DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |

| DATABASE_SSL_MODE | [Database SSL Mode](https://www.postgresql.org/docs/current/libpq-ssl.html#LIBPQ-SSL-PROTECTION) | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CA | SSL CA certificate | Required to set TLS Database connection. Only for TLS |
| DB_SSL_CERT | SSL CERT certificate | Required to set TLS Database connection. Only for TLS |
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |
| DB_SSL_KEY | SSL Key | Required to set TLS Database connection. Only for TLS |


It is possible to connect to both the system-database and zync database via TLS provided these databases have TLS enabled. To enable TLS communication to these databases you will need to configure the ApiManager and the database secret.

In ApiManager CR we set the boolean to enable TLS configuration for the respictive databases
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In ApiManager CR we set the boolean to enable TLS configuration for the respictive databases
In ApiManager CR set the boolean to enable TLS configuration for the respective databases

- `spec.zync.zyncDatabaseTLSEnabled: true`
- `spec.system.systemDatabaseTLSEnabled: true`

We pass the cert files in via the respective secret i.e. system-database & zync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We pass the cert files in via the respective secret i.e. system-database & zync
Pass the cert files in via the respective secret i.e. system-database & zync

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, the operator passes the cert files to the respective deployments via the database secrets i.e. system-database & zync

--from-literal=DATABASE_URL=postgresql://postgres:[email protected]/zync_production \
--from-literal=ZYNC_DATABASE_PASSWORD=password \
--from-file=DB_SSL_CA=rootCA.crt \
--from-file=DB_SSL_CERT=client.crt \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was getting the following error in the operator:

Secret field 'DATABASE_SSL_CERT' is required in secret 'system-database'

I think the ha filter needs some tidying up. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea probably using the wrong variable in the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -768,6 +768,22 @@ func (r *ApicastReconciler) getSecretUIDs(ctx context.Context) (map[string]strin
}
}

// system-database

if r.apiManager.IsSystemDatabaseTLSEnabled() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the above seems apicast related only, why have the system/zync db TLS stuff in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just following the add watch secret user guide https://github.com/3scale/3scale-operator/blob/master/doc/development.md#adding-new-watched-secrets . Mentioned it to Carl at the time. Seemed happy for it to live here.


// Check if SSL_KEY, SSL_CERT, and SSL_CA are empty
// checks the cert is populated in the secret, if so populates the path in the volume mount
if sslCert, ok := secret.Data[sslEnvVar]; ok && len(sslCert) > 0 && databaseTLSEnabled {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we are only setting paths if the value is non-empty - and when it is empty, we are returning "".
In case of having 2/3 created successfully (with values) - am I understanding it correctly that we are still going to just mount them, with 2/3 having the right values and 1/3 having empty string as value?
If so, the TLS validation will fail on the component level anyway so I'm trying to understand what is the benefit of doing this check and if it would be better to do it earlier (at ha level), keys exist + values are non-empty = add to volume for component validation of TLS, if keys exist + values are empty - do not even attempt to mount since we know this is going to be incorrect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No Mounts will have been created already with the cert values from the secret.

This called when we are creating the env var , so believe it is the correct place to set them as they don't exist prior to this. System and Zync can only consume the path to the cert and not the cert itself.
The flow secret has the cert, those certs gets mounted , we then create env var that have the path to the mounted cert. These env var are what is used by system and zync to connect to the db with TLS.

As for the empty value check this will be caught In the HA reconciller so may be redundant to have the check here. But it causes no harm

@MStokluska
Copy link
Contributor

Looks good but just few more concerns:

  1. Zync-que doesn't start up with TLS, the solution like you mention is to run "bin/rails db:create" - can we clarify that it is expected for the db to exists when using external dbs for zync?
  2. Setting the tlsEnabled to "false" doesn't remove the envars (probably need the env mutator)
  3. When TLS is on and Keys are missing, the issue can be identified by looking at operator logs, I think it would be nice to have it it APIManager status instead - but let's do it as a follow up Jira
  4. When TLS values are missing, the error is picked up at the component level. That is fine for now I think, but would be nice to create a follow up Jira.

Point 3/4 might be evaluated by preflights for fresh installs but for on-going, I think we need the follow ups created.

Command: []string{
"bash",
"-c",
"bundle exec sh -c \"until rake boot:redis && curl --insecure --output /dev/null --silent --fail --head http://system-master:3000/status; do sleep $SLEEP_SECONDS; done\"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we adding the --insecure here? It seems to be working fine without it...but maybe I'm doing something wrong,.

Command: []string{
"bash",
"-c",
"bundle exec sh -c \"until rake boot:redis && curl --insecure --output /dev/null --silent --fail --head http://system-master:3000/status; do sleep $SLEEP_SECONDS; done\"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants