VMware: vCenter VCSA appliance upgrade fails with “Service cannot be started”

Hi,

Often, when a vCenter VCSA update fails, the cause is a certificate mismatch. In the example below, the update stops because the sso/vmware-stsd (sts) service cannot be started:


[2021-12-12 10:08:11,628] : Running patch script.....
[2021-12-12T10:11:32.346] : Patch command patch failed
[2021-12-12T10:11:32.346] :
Mismatch:
summary: Internal error occurs during execution of update process Traceback (most recent call last):
File "/storage/core/software-packages/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 203, in patch
_patchComponents(ctx, userData, statusAggregator.reportingQueue)
File "/storage/core/software-packages/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 84, in _patchComponents
_startDependentServices(c)
File "/storage/core/software-packages/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 53, in _startDependentServices
serviceManager.start(depService)
File "/storage/core/software-packages/scripts/patches/libs/sdk/service_manager.py", line 901, in wrapper
return getattr(controller, attr)(*args, **kwargs)
File "/storage/core/software-packages/scripts/patches/libs/sdk/service_manager.py", line 794, in start
super(VMwareServiceController, self).start(serviceName)
File "/storage/core/software-packages/scripts/patches/libs/sdk/service_manager.py", line 665, in start
raise IllegalServiceOperation(errorText)
service_manager.IllegalServiceOperation: Service cannot be started. Error: Error executing start on service sts. Details {
"detail": [
{
"id": "install.ciscommon.service.failstart",
"translatable": "An error occurred while starting service '%(0)s'",
"args": [
"sts"
],
"localized": "An error occurred while starting service 'sts'"
}
],

The sts log file hints that the service could not be registered with the Lookup Service, which is a typical indicator of certificate issues.

root@vCenter /var/log/vmware/sso  # cat sts-prestart.log
2021-12-11T21:21:16.388Z INFO     START: Executing STS pre start script...
2021-12-11T21:21:16.414Z INFO     Current value in key StsInstalled is '0'. Action will be taken...
2021-12-11T21:21:16.414Z INFO     Executing STS pre start commands: Register STS with LookupSvc
2021-12-11T21:21:16.415Z INFO     Node type is embedded
2021-12-11T21:21:17.251Z INFO     STS reregistration failed
2021-12-11T21:21:17.252Z ERROR    Failed to register VMware STS with Lookup Service
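To spot this quickly in a longer log, the check can be scripted. The following is only a minimal sketch: it assumes the line layout seen in the excerpt above (timestamp, level, message), and the function name find_registration_failures is made up for illustration. It is not part of any VMware tooling.

```python
import re

# Assumed line layout, taken from the excerpt above: "<timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(\S+)\s+(INFO|WARN|WARNING|ERROR)\s+(.*)$")


def find_registration_failures(log_text):
    """Return (timestamp, message) pairs for ERROR lines that mention the Lookup Service."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        timestamp, level, message = match.groups()
        if level == "ERROR" and "Lookup Service" in message:
            hits.append((timestamp, message))
    return hits


# Sample input copied from the sts-prestart.log excerpt above
SAMPLE_LOG = """\
2021-12-11T21:21:16.415Z INFO     Node type is embedded
2021-12-11T21:21:17.251Z INFO     STS reregistration failed
2021-12-11T21:21:17.252Z ERROR    Failed to register VMware STS with Lookup Service
"""

if __name__ == "__main__":
    for timestamp, message in find_registration_failures(SAMPLE_LOG):
        print(timestamp, message)
```

On the appliance, the text of /var/log/vmware/sso/sts-prestart.log could be read from disk and passed in instead of the sample string.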

If the log shows this registration failure, run the Lookup Service Doctor tool (lsdoctor). Download the zip file, copy it to the appliance, log in via the local shell or SSH, and extract it.

lsdoctor detects certificate issues and can fix them.

A typical error looks like this:

root@vCenter [ ~ ]# unzip lsdoctor.zip
root@vCenter [ ~ ]# cd lsdoctor-master
root@vCenter [ ~/lsdoctor-master ]# python lsdoctor.py -l

    ATTENTION:  You are running a reporting function.  This doesn't make any changes to your environment.
    You can find the report and logs here: /var/log/vmware/lsdoctor

2021-12-12T10:29:26 INFO main: You are reporting on problems found across the SSO domain in the lookup service.  This doesn't make changes.
2021-12-12T10:29:27 INFO live_checkCerts: Checking services for trust mismatches...
2021-12-12T10:29:27 INFO generateReport: Listing lookup service problems found in SSO domain
2021-12-12T10:29:27 ERROR generateReport: site\vCenter.myDomain.org (Update Manager) found SSL Trust Mismatch: Please run python ls_doctor.py --trustfix option on this node.
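If the report is long, the affected nodes and services can be pulled out of the output with a few lines of Python. This is a rough sketch only: the pattern is derived from the single ERROR line shown above, reading the part before the backslash as the site and the part after it as the node is an assumption, and the function name trust_mismatches is made up for illustration. Other lsdoctor versions may format their output differently.

```python
import re

# Assumed shape, based on the one ERROR line above:
#   "<timestamp> ERROR generateReport: <site>\<node> (<service>) found SSL Trust Mismatch: ..."
MISMATCH_RE = re.compile(
    r"ERROR generateReport: (?P<site>[^\\]+)\\(?P<node>\S+) "
    r"\((?P<service>[^)]+)\) found SSL Trust Mismatch"
)


def trust_mismatches(report_text):
    """Return (node, service) pairs for every reported SSL trust mismatch."""
    found = []
    for line in report_text.splitlines():
        match = MISMATCH_RE.search(line)
        if match:
            found.append((match.group("node"), match.group("service")))
    return found


# Raw string so the backslash in "site\vCenter" is kept literally
SAMPLE_REPORT = r"""
2021-12-12T10:29:27 INFO live_checkCerts: Checking services for trust mismatches...
2021-12-12T10:29:27 INFO generateReport: Listing lookup service problems found in SSO domain
2021-12-12T10:29:27 ERROR generateReport: site\vCenter.myDomain.org (Update Manager) found SSL Trust Mismatch: Please run python ls_doctor.py --trustfix option on this node.
"""

if __name__ == "__main__":
    for node, service in trust_mismatches(SAMPLE_REPORT):
        print(node, service)
```

An empty result from this helper only means no line matched the assumed pattern, so the full report should still be read.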

If such errors occur, run lsdoctor with the --trustfix switch:

root@vCenter [ ~/lsdoctor-master ]# python lsdoctor.py --trustfix
2021-12-12T10:32:33 INFO findAndFix: Attempting to reregister ec039d94-9443-416d-a002-fc9e8a8fb96d for vCenter.myDomain.org
2021-12-12T10:32:34 INFO findAndFix: We found 45 mismatch(s) and fixed them :)
2021-12-12T10:32:34 INFO main: Please restart services on all PSC's and VC's when you're done.

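This should solve the problem. As the last log line says, restart the services on all PSCs and vCenters afterwards (for example with service-control --stop --all followed by service-control --start --all), then run the update again.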

Michael
