Enable Certificate Validation in SDDC Manager (VCF 4.5.x)

Recently, I had to use the Asyncpatch tool in SDDC Manager to Patch our vcenter to 7.0U3o due to the Critical Security patch VMSA-2023-0023 and came across this issue when performing the precheck for Management Domain in SDDC Manager.

If you Expand “Sddc Security Configuration”, the error was on the option “VMware Cloud Foundation certificate validation check”

if you come across this issue, perform the following commands to enable the Certificate Validation Check in SDDC Manager

Review the Certificate Validation Setting

Command --
root@sddcmgr1# curl localhost/appliancemanager/securitySettings

Output --
{"fipsMode":false,"certificateValidationEnabled":false}

Enable the Certification Validation

Command --
root@sddcmgr1# curl 'http://localhost/appliancemanager/securitySettings' -X POST -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{"fipsMode":false,"certificateValidationEnabled":true}'

Check the Certificate Validation Setting after Enabling the Certificate Validation

Command --
root@sddcmgr1# curl localhost/appliancemanager/securitySettings

Output --
{"fipsMode":false,"certificateValidationEnabled":true}

You can observe from the above Output that the certificate validation is enabled as true.

Now, you can go ahead and retry the precheck and it will go through.

The final precheck which is green is shown in the screenshot below

How to Change the admin@local account password in VCF 4.5.x (UNOFFICIAL)

You might have come across an issue where the VCF REST API User account admin@local password is either lost or need to reset the password of this account.

NOTE: This is NOT an official way to do it according to VMware and is meant to be done under the supervision of VMware Support.

Login into the SDDC Manager as vcf and go to the root user prompt

sddcmanager#

type the following commands to reset the password of admin@local account on the sddc manager

mkdir -p /etc/security/local
chown root:vcf_services /etc/security/local
chmod 650 /etc/security/local
echo -n "" > /etc/security/local/.localuserpasswd
chown root:vcf_services /etc/security/local/.localuserpasswd
chmod 660 /etc/security/local/.localuserpasswd

type the following command to set a new password -  in this example it is NewP@SSW0rd010

echo -n 'NewP@SSW0rd010' | openssl dgst -sha512 -binary | openssl enc -base64 | tr -d '\n' > /etc/security/local/.localuserpasswd

Once you have changed the password, I would recommend to restart the sddc manager services using the following command

/opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh

Once the services are restarted, you can verify if the password for admin@local has been successfully changed by going to lookup_passwords

in lookup_passwords, use the admin@local to check if it can pull the passwords from the SDDC Password Manger.

Failed to get tasks data. Something went wrong. SDDC Manager 4.5 Password Management shows Error Message on its UI

Recently, after upgrading our SDDC Manager from 3.11.1 to 4.5, I came across an UI message on the SDDC Password Management tab as follows:

Issue Description: The error message shown in the screenshot above is “Failed to get tasks data. Something went wrong. Please retry or contact the service provider and provide the reference token.

Root Cause: As I was digging deeper into this, I found that the functionality of the password manager itself was fine, but it was somehow not able to get the complete task list data and hence throwing this error out.

after looking at the operationsmanager log at

/var/log/vmware/vcf/operationsmanager/operationsmanager.log

we found the following error in the log

2023-08-08T16:02:57.768+0000 ERROR [vcf_om,0000000000000000,0000] [o.a.c.c.C.[.[.[.[dispatcherServlet],http-nio-127.0.0.1-7300-exec-6] Servlet.service() for servlet [dispatcherServlet] in context wit
h path [/operationsmanager] threw exception [Request processing failed; nested exception is java.lang.IllegalArgumentException: No enum constant com.vmware.vcf.passwordmanager.exception.PasswordManag
erErrorCode.PASSWORD_MANAGER_VRA_ENDPOINT_FAILED] with root cause
java.lang.IllegalArgumentException: No enum constant com.vmware.vcf.passwordmanager.exception.PasswordManagerErrorCode.PASSWORD_MANAGER_VRA_ENDPOINT_FAILED

This means that after the SDDC Manager Upgrade, it was not able to determine the error with “PASSWORD_MANAGER_VRA_ENDPOINT_FAILED” in its internal database

Solution: After consulting with VMware Engineering, the issue was resolved by going into the operationsmanager db on the sddc manager and executing the following command to replace the message with something which the SDDC Manager could understand.

Disclaimer: Do this at your own risk, I would highly recommend to contact VMware GSS if you have the same issue to get an official resolution.

operationsmanager=# update passwordmanager.master_password_transaction set diagnostic_message = replace (diagnostic_message, 'PASSWORD_MANAGER_VRA_ENDPOINT_FAILED','PASSWORD_UPDATE_VRA_ENDPOINT_FAILED');

Once this is executed, you will get an message of UPDATE XX (where XX is the update number if it has been updated successfully)

Conclusion: Now, you can refresh the SDDC Manager UI and you will not see the error message on the UI in Password Management as before.

Hope this helps if you see similar message after your VCF Upgrade from 3.11.x to 4.5

SDDC Manager 3.x [3.11.x] Issues & Solutions

Hello All,

Recently I came across a few issues while preparing for an VCF Upgrade from 3.11.x version to 4.x in our environment.

Below are few of the issues and how to resolve them using commands on the SDDC Manager

Issue: SDDC Manager UI shows the message “Password Manager option failed in pre-validation stage” or when you go to the security tab for password management, it shows that one of the password tasks have failed.

Issue: Deployment locked by password manager when you try to rotate the passwords of PSC, VCENTER from the security -> password management tab

Solution/s:

Find the Deployment Lock ID in the sddc manager by logging into sddc manager using SSH, login as VCF and then root user and use the following command

psql --host=localhost -U postgres -d platform -c "select * from lock"

This will display the ID’s of the locked tasks in the SDDC Manager

Delete the locked task by using the following command

psql --host=localhost -U postgres -d platform -c "delete from lock where id=<ID displayed above>"

Once the task is deleted, the lock is released. You can now refresh the SDDC Manager UI and then continue with the password update or rotate options

Issue: Remove Failed Tasks in SDDC Manager

Solution:

Reference https://www.martingustafsson.com/removing-failed-tasks-in-sddc-manager/

Multiple Useful Commands for SDDC 3.x

The Unofficial VCF Troubleshooting Guide v2 – https://www.lab2prod.com.au/2021/03/the-unofficial-vcf-troubleshooting-guide.html

Reference

MANAGING VMWARE CLOUD FOUNDATION – FIRST LOOK

Password Operation Failed to Change the SSO password on an external PSC in VCF 3.11

Recently I came across an issue trying to change the SSO account (administrator@vsphere.local) password from the SDDC Manager using the Rotate password option under Security in VCF 3.11

I tried to Rotate the SSO password using the SDDC Manager, and got the following error:

However, Interesting thing is the sddc manager did change the SSO password in the backend

However, to check on this error, I dug a little deeper and saw the following error in the password rotate task:

I used the following command to check the operationsmanager.log to check the log in SDDC Manager

less /var/log/vmware/vcf/operationsmanager/operationsmanager.log

The log also shows that the sddc manager is trying to change the sso credential (administrator@vsphere.local) on VRA endpoints

I had to open a VMware Support ticket and here is the answer I received:

“As per the Engineering team this issue is due to a misconfiguration of vRA endpoints. SDDC Manager is trying to change the administrator@vsphere.local on the VRA endpoints but VRA endpoints are configured with a different user (vcf-secured-user@vsphere.local).  This issue is addressed in VCF 4.x”

What the VMware Engineering team is saying is that in VCF 3.10.x, 3.11 there is an issue with VRA as it is typically configured using a different tenant admin instead of using administrator@vsphere.local user to configure the endpoints in it. However, the SDDC manager is trying to change the administrator@vsphere.local credential on VRA endpoints. Hence this issue. Looks like this issue has been fixed in VCF 4.x

This resolves the issue at this time as we will be working to upgrade our VCF to 4.x soon.

Check for Passwords in SDDC Manager in VCF 3.x

Recently I had to check the existing passwords in sddc manager in our VCF 3.11 environment and found out there is a simple way. Here it is.

SSH into your SDDC Manager using vcf user and go to the root prompt using su command and use the below command:

root@sddcmgr01 [home/vcf]# lookup_passwords

Screenshot:

This will bring up all the products which sddc manager keeps track of

Select any product and then you will have to provide the sddc secured user credentials which you provided at the time of deploying SDDC manager in the VCF environment. This credential is also used for the backup of SDDC Manager and NSX Components.

in this case ESXI was selected to display the esxi hosts credentials

This way, you can get all the passwords for all the components controlled by SDDC Manager in VCF 3.x

NOTE/Disclaimer: I had to Blur/Pixelate certain components in my screenshots as they are in a live environment.

Great VCF Troubleshooting Guide by my Fellow vExpert

I wanted to ping back one of the great article by one of my fellow vExpert Shank Mohan on his website about an unofficial VCF Troubleshooting guide. I have learned from this article and would like to remember this article and hence posting it back on my blog.

Great VCF Troubleshooting guide by Shank Mohan

LCM Directory Permission Error When pre-checking for SDDC Manager Upgrade with VCF 3.11 Patch

I was getting ready to patch our environment from VCF 3.10.2.2 to VCF 3.11 as VMware has officially released a complete Patch for VCF 3.10.x this month, when I was performing the VCF Upgrade Pre-Check for the Management Domain, I came across this issue

The LCM Pre-Check Failed due to a directory permission issue for one of the lcm directory

Issue is that the pre-check says that the directory “/var/log/vmare/vcf/lcm/upgrades/<long code directory>/lcmAbout” owner is root but the owner needs to be user vcf_lcm

This is how I resolved the issue:

Login into SDDC Manager as user vcf, do su and provide the root password

then go to the following directory “/var/log/vmware/vcf/lcm/upgrades/<long code directory as displayed in the lcm error on sddc manager>

chown vcf_lcm lcmAbout
chmod 750 lcmAbout

The above two commands will change the owner from root to vcf_lcm and also provide the required permissions to the folder so the pre-check can complete.

The full screenshot of what I performed is below:

Commands to change owner to vcf_lcm and to provide the required permissions for the folder lcmAbout

Once you perform the commands above, you can run the pre-check and this time it will proceed successfully as shown below

Hope this article helps if you come across this issue with sddc manager upgrade from VCF 3.10.2.2 to 3.11

VCF 3.x patch 3.11 for Log4J Vulnerability and Other Security Patches included

VMware has finally realeased an patch version for VCF 3.x and the version is 3.11. You can only download this as a patch form from the SDDC Manager. You can Upgrade to version 3.11 from 30.10.2.2 or VCF 3.5 or later.

VMSA-2021-0028.13 (vmware.com)

This Release VCF 3.11 includes the following:

  • Security fixes for Apache Log4j Remote Code Execution Vulnerability: This release fixes CVE-2021-44228 and CVE-2021-45046. See VMSA-2021-0028.
  • Security fixes for Apache HTTP Server: This release fixes CVE-2021-40438. See CVE-2021-40438.
  • Improvements to upgrade prechecks: Upgrade prechecks have been expanded to verify filesystem capacity, file permissions, and passwords. These improved prechecks help identify issues that you need to resolve to ensure a smooth upgrade.
  • This also resolves the following Security Advisory VMSA-2022-0004 which deals with several vulnerabilities in esxi 6.7 hosts
  • This also resolves the vulnerability in VCF SDDC Manager 3.x according to the security advisory VMSA-2022-0003
  • This version also addresses the heap-overflow vulnerability in esxi hosts according to the security advisory VMSA-2022-0001.2

The Updated product versions according to the BOM for VCF 3.11 are

Hope this post helps for the teams who have VCF 3.10.x and waiting for the long awaited log4j patch instead of an workaround.

Workaround instructions to address CVE-2021-44228 in vCenter Server 6.7.x – For VCF 3.10.x

UPDATE: VMware has Updated the KB 87081 to Include the Script to remove log4j_class

I have taken these Workaround Instructions from the KB article 87081 and KB article 87095

For vCenter 6.7.x appliance in an VCF 3.10.x setup, some of the instructions in article 87081 don’t work and also in VCF 3.10.x since there are external PSC’s and the order to execute the instructions is as follows.

I am calling out VMware team to amend the steps for vCenter 6.7.x appliance in an non-HA configuration in the article 87081, especially for VCF 3.10.x installations.

For vCenter 6.7.x ; Steps to execute

vMON Service

  1. Backup the existing java-wrapper-vmon file

cp -rfp /usr/lib/vmware-vmon/java-wrapper-vmon /usr/lib/vmware-vmon/java-wrapper-vmon.bak

  1. Update the java-wrapper-vmon file with a text editor such as vi

vi /usr/lib/vmware-vmon/java-wrapper-vmon

  1. At the very bottom of the file, replace the very last line with 2 new lines
    • Originalexec $java_start_bin $jvm_dynargs “$@”Updated
      log4j_arg=”-Dlog4j2.formatMsgNoLookups=true”
      exec $java_start_bin $jvm_dynargs $log4j_arg “$@” 
  2. Restart vCenter Services

service-control –stop –all
service-control –start –all

Note: If the services do not start, ensure the file permissions are set correctly with these commands:

  • chown root:cis /usr/lib/vmware-vmon/java-wrapper-vmon
  • chmod 754 /usr/lib/vmware-vmon/java-wrapper-vmon

Analytics Service

NOTE:- The below workaround (Analytics service) is applicable for vCenter Server Appliance 6.7 Update 3o and Older versions only. vCenter Server Appliance 6.7 Update 3p is by default covered by vMON Service workaround. 

  1. Back up the log4j-core-2.8.2.jar file

cp -rfp /usr/lib/vmware/common-jars/log4j-core-2.8.2.jar /usr/lib/vmware/common-jars/log4j-core-2.8.2.jar.bak

  1. Run the zip command to disable the class

zip -q -d /usr/lib/vmware/common-jars/log4j-core-2.8.2.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

  1. Restart the Analytics service

service-control –restart vmware-analytics 

CM Service

  1. Back up the log4j-core.jar file

cp -rfp /usr/lib/vmware-cm/lib/log4j-core.jar /usr/lib/vmware-cm/lib/log4j-core.jar.bak

  1. Run the zip command to disable the class

zip -q -d /usr/lib/vmware-cm/lib/log4j-core.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

  1. Restart the CM service

service-control –restart vmware-cm

Run the remove_log4j_class.py script

1. Download the script attached to this KB (remove_log4j_class.py)

2. Login to the vCSA using an SSH Client (using Putty.exe or any similar SSH Client)

3. Transfer the file to /tmp folder on vCenter Server Appliance using WinSCP
Note: It’s necessary to enable the bash shell before WinSCP will work

4. Execute the script copied in step 1:

python remove_log4j_class.py

The script will stop all vCenter services, proceed with removing the JndiLookup.class from all jar files on the appliance and finally start all vCenter services. The files that the script modifies will be reported as “VULNERABLE FILE” as the script runs.

Verify the changes

Once all sections are complete, use the following steps to confirm if they were implemented successfully.

  1. Verify if the stsd, idmd, and vMon controlled services were started with the new -Dlog4j2.formatMsgNoLookups=true parameter:

ps auxww | grep formatMsgNoLookups

Check if the processes include -Dlog4j2.formatMsgNoLookups=true

  1. Verify the Analytics Service changes:

grep -i jndilookup /usr/lib/vmware/common-jars/log4j-core-2.8.2.jar | wc -l
 This should return 0 lines

  1. Verify the CM Service changes:

grep -i jndilookup /usr/lib/vmware-cm/lib/log4j-core.jar | wc -l

This should return 0 lines

The remaining steps for Secure Token Service, Identity Management Service don’t work for vcenter 6.7.x in VCF 3.10.x (3.10.2.1) environment

——– So, after this Step, we will have to SSH into the External PSC and follow the below steps ———-

CM Service

  1. Back up the log4j-core.jar file

cp -rfp /usr/lib/vmware-cm/lib/log4j-core.jar /usr/lib/vmware-cm/lib/log4j-core.jar.bak

  1. Run the zip command to disable the class

zip -q -d /usr/lib/vmware-cm/lib/log4j-core.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

  1. Restart the CM service

service-control –restart vmware-cm


Secure Token Service

  1. Back up and edit the the vmware-stsd file

cp /etc/rc.d/init.d/vmware-stsd /root/vmware-stsd.bakvi /etc/rc.d/init.d/vmware-stsd

  1. Find the section labeled start_service(). Insert a new line near line 266, just before “$DAEMON_CLASS start” with “-Dlog4j2.formatMsgNoLookups=true \” as seen in the example:

start_service()
{
  perform_pre_startup_actions

  local retval
  JAVA_MEM_ARGS=`/usr/sbin/cloudvm-ram-size -J vmware-stsd`
  $JSVC_BIN -procname $SERVICE_NAME \
            -home $JAVA_HOME \
            -server \
            <snip>
            -Dauditlog.dir=/var/log/audit/sso-events  \
            -Dlog4j2.formatMsgNoLookups=true \
            $DAEMON_CLASS start

  1. Restart the vmware-stsd service

service-control –stop vmware-stsd
service-control –start vmware-stsd

Identity Management Service

  1. Back up and edit the the vmware-sts-idmd file

cp /etc/rc.d/init.d/vmware-sts-idmd /root/vmware-sts-idmd.bakvi /etc/rc.d/init.d/vmware-sts-idmd

  1. Insert a new line near line 177 before “$DEBUG_OPTS \” with “-Dlog4j2.formatMsgNoLookups=true \” as seen in the example:

$JSVC_BIN -procname $SERVICE_NAME \
          -wait 120 \
          -server \
          <snip>
          -Dlog4j.configurationFile=file://$PREFIX/share/config/log4j2.xml \
          -Dlog4j2.formatMsgNoLookups=true \
          $DEBUG_OPTS \
          $DAEMON_CLASS

  1. Restart the vmware-sts-idmd service

service-control –stop vmware-sts-idmd
service-control –start vmware-sts-idmd

Verify the changes

Once all sections are complete, use the following steps to confirm if they were implemented successfully.

  1. Verify if the stsd, idmd, psc-client, and vMon controlled services were started with the new -Dlog4j2.formatMsgNoLookups=true parameter:

ps auxww | grep formatMsgNoLookups

Check if the processes include -Dlog4j2.formatMsgNoLookups=true

  1. Verify the CM Service changes:

grep -i jndilookup /usr/lib/vmware-cm/lib/log4j-core.jar | wc -l

This should return 0 lines

The steps in VMware KB Article 87081 is for vCenter with Embedded PSC and the above steps are for the vCenter server 6.7 with an External PSC

Hope this article helps the Engineers who are working on this log4j Vulnerability and if they have VCF 3.10.x you can follow the above steps with an external PSC Configuration.