Add support for X.509 Pod Certificates in GCS Fuse CSI driver#1317
Add support for X.509 Pod Certificates in GCS Fuse CSI driver#1317yangspirit wants to merge 2 commits intoGoogleCloudPlatform:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: yangspirit The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request introduces support for using pod certificates to authenticate with GCS via the Security Token Service (STS). Key changes include the implementation of a certificate-based token source, updates to the CSI driver to handle certificate secrets, and modifications to the mutating webhook to inject the required volumes and environment variables based on pod annotations. The review feedback highlights several improvement opportunities, such as handling errors during annotation parsing, implementing nil checks for Kubernetes Secret data to prevent panics, removing unused function parameters, and lowering the verbosity of diagnostic logs to reduce noise.
| // Inject GCP workload identity credential config configmap and token/certificate. | ||
| usePodCertificate := false | ||
| if usePodCertStr, ok := pod.Annotations[GcsFuseUsePodCertificateAnnotation]; ok { | ||
| usePodCertificate, _ = ParseBool(usePodCertStr) |
There was a problem hiding this comment.
The error from ParseBool is ignored here. It's better to handle it and return an admission error if the annotation value is invalid, ensuring consistency with how other boolean annotations are processed in this file. Additionally, ensure the error message uses a constant format string to satisfy CI requirements.
| usePodCertificate, _ = ParseBool(usePodCertStr) | |
| usePodCertificate, err := ParseBool(usePodCertStr) | |
| if err != nil { | |
| return admission.Errored(http.StatusBadRequest, fmt.Errorf("the acceptable values for %q are 'True', 'true', 'false' or 'False'", GcsFuseUsePodCertificateAnnotation)) | |
| } |
References
- When using fmt.Errorf, ensure the format string is constant to avoid non-constant format string in call to fmt.Errorf errors in CI.
|
|
||
| if usePodCertificate { | ||
| signerName := "kubernetes.io/kube-apiserver-client" // Default signer | ||
| filename, credentialConfig, err = appendPodCertificateVolumes(si.K8SClient, pod, configMapName, signerName) |
There was a problem hiding this comment.
The signerName variable is passed to appendPodCertificateVolumes but is not used within that function. It should be removed from the call site and the function signature to keep the code clean.
| filename, credentialConfig, err = appendPodCertificateVolumes(si.K8SClient, pod, configMapName, signerName) | |
| filename, credentialConfig, err = appendPodCertificateVolumes(si.K8SClient, pod, configMapName) |
| }, | ||
| } | ||
| if usePodCertificate { | ||
| sidecarCredentialConfig.GaccEnv = &corev1.EnvVar{Name: "GOOGLE_API_CERTIFICATE_CONFIG", Value: fmt.Sprintf("%s/%s", SidecarContainerWICredentialConfigMapVolumeMountPath, "certificate_config.json")} |
There was a problem hiding this comment.
| return filename, credConfig, nil | ||
| } | ||
|
|
||
| func appendPodCertificateVolumes(client kubernetes.Interface, pod *corev1.Pod, configMapName string, signerName string) (string, *CredentialConfig, error) { |
There was a problem hiding this comment.
The signerName parameter is unused in the function body. Consider removing it if it's not required for the implementation.
| func appendPodCertificateVolumes(client kubernetes.Interface, pod *corev1.Pod, configMapName string, signerName string) (string, *CredentialConfig, error) { | |
| func appendPodCertificateVolumes(client kubernetes.Interface, pod *corev1.Pod, configMapName string) (string, *CredentialConfig, error) { |
| // prepareStorageService prepares the GCS Storage Service using the Kubernetes Service Account from VolumeContext. | ||
| func (s *nodeServer) prepareStorageService(ctx context.Context, vc map[string]string) (storage.Service, error) { | ||
| ts := s.driver.config.TokenManager.GetTokenSourceFromK8sServiceAccount(vc[VolumeContextKeyPodNamespace], vc[VolumeContextKeyServiceAccountName], vc[VolumeContextKeyServiceAccountToken], "" /*audience*/, false) | ||
| klog.Infof("prepareStorageService called, VolumeContext: %+v", vc) |
There was a problem hiding this comment.
Logging the entire VolumeContext at Infof level can be quite noisy and might inadvertently log sensitive configuration data. It's recommended to use klog.V(4) for this type of diagnostic information.
| klog.Infof("prepareStorageService called, VolumeContext: %+v", vc) | |
| klog.V(4).Infof("prepareStorageService called, VolumeContext: %+v", vc) |
| certData := secret.Data["tls.crt"] | ||
| keyData := secret.Data["tls.key"] |
There was a problem hiding this comment.
It's safer to verify that the Secret's Data map is not nil before accessing its keys to prevent a panic, and then verify that the keys tls.crt and tls.key exist. If they are missing, tls.X509KeyPair will return a generic error later, making it harder to diagnose the configuration issue.
| certData := secret.Data["tls.crt"] | |
| keyData := secret.Data["tls.key"] | |
| if secret.Data == nil { | |
| return nil, fmt.Errorf("secret %s Data is nil", secretName) | |
| } | |
| certData, certOk := secret.Data["tls.crt"] | |
| keyData, keyOk := secret.Data["tls.key"] | |
| if !certOk || !keyOk { | |
| return nil, fmt.Errorf("secret %s is missing tls.crt or tls.key", secretName) | |
| } |
References
- When using the Kubernetes Go client to read a resource like a ConfigMap or Secret, the returned Data map can be nil. Always perform a nil check before accessing its keys to prevent a panic.
- When using fmt.Errorf, ensure the format string is constant to avoid non-constant format string in call to fmt.Errorf errors in CI.
| } | ||
|
|
||
| func (ts *GCPTokenSource) fetchTokenViaCertificate(ctx context.Context) (*oauth2.Token, error) { | ||
| klog.Infof("fetchTokenViaCertificate called with audience: %s", ts.audience) |
There was a problem hiding this comment.
This log message (and others at lines 119 and 171) is emitted during routine token operations. Using klog.V(4) is preferred for informational messages that aren't critical for standard operation logs.
| klog.Infof("fetchTokenViaCertificate called with audience: %s", ts.audience) | |
| klog.V(4).Infof("fetchTokenViaCertificate called with audience: %s", ts.audience) |
What type of PR is this? /kind feature
What this PR does / why we need it: This PR adds support for X.509 Pod Certificates for authentication in the GCS Fuse CSI driver.
Native support for Pod Certificates in OSS Kubernetes (via CSI TokenRequest) is a long-term effort requiring KEPs and substantial development time (6-12 months). This PR implements a workaround to unblock an important customer by leveraging the mutating webhook and refactoring the driver's token exchange logic to use the official STS library.
Which issue(s) this PR fixes:
Fixes b/495760030
Special notes for your reviewer: This PR contains changes in two main areas:
Webhook (pkg/webhook): Added support for annotations gke-gcsfuse/use-pod-certificate and gke-gcsfuse/workload-identity-credential-configmap to inject the necessary environment variables and volume mounts. It also automatically injects volume attributes needed by the driver.
CSI Driver (pkg/cloud_provider/auth): Refactored fetchTokenViaCertificate to use the official Google STS library and resolved issues with 404 (base URL) and 400 (missing certificate chain in payload) errors.
All unit tests for the modified pkg/webhook package passed successfully.
Does this PR introduce a user-facing change?:
release-note
Added support for X.509 Pod Certificates in GCS Fuse CSI driver via new pod annotations
gke-gcsfuse/use-pod-certificateandgke-gcsfuse/workload-identity-credential-configmap.