- SAP system does not start.
We receive many incident reports (also known as messages) with the title “SAP system does not start”.
For SAP support engineers, this title can mean many different things. Because an SAP system is based on client-server technology and consists of many processes and threads, there are many places to check in order to understand exactly which problem you are facing.
What do you mean by “SAP system does not start”?
On reading the descriptions in these incident reports further, we find that the reporters describe the situation in ways such as: “.. the users cannot logon after restart..”, “.. the SUM tool says startup failure..”, “.. my message server didn’t come up..”, “..I cannot see OS processes..”, “..I started my own startup scripts but nothing happened..”, and so forth.
Some of those descriptions are still quite vague, while others point directly to the root cause.
First of all, please check this question:
Are the message server, the dispatcher and the work processes up and running?
Yes:
-> This means that the startup of the SAP system finished successfully. The problem is the logon itself, not the SAP system startup. If this is the case, please avoid using the title “SAP system does not start”.
No: -> This literally means that the SAP system did NOT start up. Its startup sequence ran into some kind of problem.
Let’s look at this startup problem further.
This wiki aims to offer troubleshooting guides for startup problems. With this guide, we hope you can analyze and solve the problem yourself. In some cases, you may still need to raise an incident ticket (also known as a message). In that case, we hope you can narrow down the problem and provide us with detailed information so that we can assist you quickly.
Startup sequence
897933 - Start and stop sequence for SAP systems
In short, the instances of an SAP system are started in the following sequence:
- Database instance
- Central instance or (A)SCS
- Application servers (Dialog Instances)
*Database startup problems are not discussed in this wiki.
The following link explains standard methods to start SAP systems:
http://help.sap.com/saphelp_nw70ehp1/helpdata/en/47/fd7230eca159e8e10000000a421937/frameset.htm
SAP MC, SAPMMC and the startsap script need the “SAP start service”:
http://help.sap.com/saphelp_nw70ehp2/helpdata/en/b3/903925c34a45e28a2861b59c3c5623/frameset.htm
As you can see from the link above, many processes are involved in the startup. Among those processes, the sapstartsrv process plays an important role in controlling an SAP instance.
The types of processes that need to be executed in the system are declared in the instance profile (or start profile). The sapstartsrv process then starts the specified processes.
During this startup sequence, each process writes its trace file to the DIR_HOME directory.
Common problems and solutions
Quite common problems are related to the sapstartsrv process, especially the following two symptoms: “The process doesn’t exist” and “The process exists but does not respond”.
“The process doesn’t exist”
On UNIX/Linux, use the OS command ‘ps’ to check the process:
ps -ef | grep sapstartsrv
When the process exists, the above command shows output like this:
abcadm 7012358 1 0 17:39:33 - 0:05 /usr/sap/ABC/DVEBMGS00/exe/sapstartsrv pf=/usr/sap/ABC/SYS/profile/START_DVEBMGS00_walldorf -D -u abcadm
On Windows, use the Windows Task Manager to see whether the service is running.
If the sapstartsrv process does not exist, please consult the following SAP Note and start the service.
1762827 - Startup of Instance Service failed in UNIX or Linux environment
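The ps check above can also be scripted when several hosts have to be inspected. The following is a minimal Python sketch, not an SAP tool: the function name find_sapstartsrv is hypothetical, the first sample line is the ps output shown above, and the second sample line (the grep process itself) is an illustrative assumption.

```python
import re


def find_sapstartsrv(ps_output: str) -> list[dict]:
    """Return one entry per sapstartsrv process found in `ps -ef` output."""
    found = []
    for line in ps_output.splitlines():
        # Skip unrelated lines and the grep command that searched for the name.
        if "sapstartsrv" not in line or "grep" in line:
            continue
        fields = line.split()
        profile = re.search(r"pf=(\S+)", line)
        found.append({
            "user": fields[0],
            "pid": int(fields[1]),
            "profile": profile.group(1) if profile else None,
        })
    return found


sample = (
    "abcadm 7012358 1 0 17:39:33 - 0:05 "
    "/usr/sap/ABC/DVEBMGS00/exe/sapstartsrv "
    "pf=/usr/sap/ABC/SYS/profile/START_DVEBMGS00_walldorf -D -u abcadm\n"
    "abcadm 7012400 7012300 0 17:40:01 pts/0 0:00 grep sapstartsrv\n"
)
procs = find_sapstartsrv(sample)
print(procs if procs else "sapstartsrv is not running")
```

An empty result corresponds to the symptom “The process doesn’t exist”.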
“The sapstartsrv process exists but does not respond”
The OS command ‘ps’ or the Task Manager shows the process, but sapstartsrv does not respond to any actions. In that case, most likely, the UNIX domain socket (/tmp/.sapstream<port-no>) or the Windows named pipe (\\<host>\pipe\sapcontrol_<xx>) is broken. If it still exists, please delete it manually and start the sapstartsrv service again.
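Before deleting anything, it can help to confirm which socket file belongs to the instance. The sketch below is a hedged Python illustration for UNIX only. It assumes the naming pattern .sapstream5<NN>13 (NN being the two-digit instance number) that this page uses elsewhere; the helper names are hypothetical, and the script only inspects the file, it does not delete it.

```python
import os
import stat


def sapstream_path(instance_nr: int, tmp_dir: str = "/tmp") -> str:
    """Build the expected socket path, assuming the .sapstream5<NN>13 pattern."""
    if not 0 <= instance_nr <= 99:
        raise ValueError("instance number must be between 00 and 99")
    return f"{tmp_dir}/.sapstream5{instance_nr:02d}13"


def socket_exists(path: str) -> bool:
    """True if `path` exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, missing directory or no permission: treat as absent.
        return False


path = sapstream_path(0)
print(path, "exists" if socket_exists(path) else "is missing")
```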
Other cases…
Case 1:
The following error information was found in startsap.log:
exec(): 0509-036 Cannot load program disp+work because of the following
errors:
rtld: 0712-001 Symbol ixml_iXMLParser_IsNormalizingAtt was referenced
from module disp+work(), but a runtime definition
of the symbol was not found.
rtld: 0712-001 Symbol ixml_iXMLParser_SetNormalizingAtt was referenced
from module disp+work(), but a runtime definition
Root cause:
The library path was not configured correctly after the upgrade.
Solution:
Configure the library path.
SAP Note 1265456 SAP system does not start after kernel upgrade
SAP Note 1104735 Upgrade to the new instance-specific directory on UNIX
Case 2:
The sapstartsrv service is up but the dispatcher does not start up.
sapcontrol -nr 00 -function GetProcessList
30.11.2013 18:46:21
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
disp+work, Dispatcher, GRAY, Stopped, , , 26680
igswd_mt, IGS Watchdog, GREEN, Running, 2013 11 30 12:57:32, 5:48:49, 26681
Root cause:
A zombie process of the dispatcher still existed.
Solution:
Kill the zombie process.
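The GetProcessList output shown in this case is comma-separated, so it is easy to scan for processes that are not GREEN. The following Python sketch parses the sample output from this case. The function names are hypothetical, and the parser assumes that the header row starts with "name," and that the pid is always the last field, as in the output above.

```python
def parse_process_list(output: str) -> list[dict]:
    """Parse the comma-separated rows printed by GetProcessList."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    header = next((i for i, ln in enumerate(lines) if ln.startswith("name,")), None)
    if header is None:
        return []
    rows = []
    for ln in lines[header + 1:]:
        fields = [f.strip() for f in ln.split(",")]
        if len(fields) < 7:
            continue  # not a complete process row
        rows.append({
            "name": fields[0],
            "description": fields[1],
            "dispstatus": fields[2],
            "textstatus": fields[3],
            "pid": int(fields[-1]),  # pid is the last column
        })
    return rows


def not_green(rows: list[dict]) -> list[str]:
    """Names of all processes whose dispstatus is not GREEN."""
    return [r["name"] for r in rows if r["dispstatus"] != "GREEN"]


sample = """30.11.2013 18:46:21
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
disp+work, Dispatcher, GRAY, Stopped, , , 26680
igswd_mt, IGS Watchdog, GREEN, Running, 2013 11 30 12:57:32, 5:48:49, 26681
"""
rows = parse_process_list(sample)
print(not_green(rows))
```

A process that is reported as stopped but still has a pid, like disp+work here, is a candidate for the zombie check at OS level.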
Case 3:
On NT (Windows), failovers occur frequently, within a few minutes.
Root cause:
The version of sapstartsrv.exe differed between DIR_CT_RUN and DIR_EXECUTABLE.
Solution:
Copy sapstartsrv.exe from DIR_CT_RUN to DIR_EXECUTABLE.
SAP Note 1043592 MSCS: Cluster Resource Monitor Crashes on W2K3 SP2
SAP Note 1345206 Handling and preventing sapstartsrv.exe corruptions
SAP Note 1375494 SAP system does not start after applying SAP kernel patch
Case 4:
Startsap.log says:
(1159276) Starting: local /usr/sap/ABC/SYS/exe/run/igswd_mt -mode=profile pf=/usr/sap/ABC/SYS/profile/ABC_DVEBMGS00_walldorf
(442404) Waiting for Child Processes to terminate.
(442404) **** 2010/04/10 08:07:11 Child 1155152 terminated with Status 255 . ****
(1155152) **** 2010/04/10 08:07:11 No RestartProgram command for program 2 ****
The information above is very generic. The stderr5 file shows more detailed information:
Command: /usr/sap/ABC/SYS/exe/run/igswd_mt
-mode=profile
pf=/usr/sap/ABC/SYS/profile/ABC_DVEBMGS00_walldorf
(1159276) **** Failed to Start Local Command. No such file or directory
Root cause:
The necessary files weren't copied correctly.
Solution:
Copy igswd_mt to /usr/sap/ABC/SYS/exe/run.
Case 5:
After kernel patching, the system does not come up although the startsap command runs successfully. The dispatcher goes down immediately and no work process is active.
Stderr2 says:
(1114238) New Child Process created.
(1114238) Starting local Command:
Command: dw.sapABC_DVEBMGS01
pf=/usr/sap/ABC/SYS/profile/ABC_DVEBMGS01_walldorf
Fri Mar 5 08:59:56 2010
Profile configuration error detected, use temporary corrected setup
Shared Pool 40: ipc/shm_psize_40 = 2180000000 (too small)
Shared Pool 40: (smaller than min requirement 2186246040)
Shared Pool 40: (estimated size assumed 2190000000)
*** ERROR => Illegal ShmAdm Slot. Key=2104967296.
*** ERROR => ShmGet: Inv PoolKey=2104967296, Key=40, Flag=1, Ptr=1152921504606843304.
***
Root cause:
The shared memory areas may need to be adjusted for the new environment (i.e. after patching).
Solution:
Run sappfpar check pf=<profile> and adjust ipc/shm_psize_xx to the suggested values.
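To see at a glance which pools are reported as too small, the relevant lines of the stderr trace can be extracted. This is only a rough Python sketch based on the message format in the excerpt above. The regular expressions and the function name are assumptions, and sappfpar remains the tool that determines the values to configure.

```python
import re

# Patterns follow the wording of the stderr2 excerpt in this case.
SIZE_RE = re.compile(r"Shared Pool (\d+): ipc/shm_psize_\d+ = (\d+) \(too small\)")
MIN_RE = re.compile(r"Shared Pool (\d+): \(smaller than min requirement (\d+)\)")


def undersized_pools(trace: str) -> dict[int, dict]:
    """Map pool number -> configured size and reported minimum requirement."""
    pools: dict[int, dict] = {}
    for pool, size in SIZE_RE.findall(trace):
        pools.setdefault(int(pool), {})["configured"] = int(size)
    for pool, minimum in MIN_RE.findall(trace):
        pools.setdefault(int(pool), {})["minimum"] = int(minimum)
    return pools


sample = """Profile configuration error detected, use temporary corrected setup
Shared Pool 40: ipc/shm_psize_40 = 2180000000 (too small)
Shared Pool 40: (smaller than min requirement 2186246040)
Shared Pool 40: (estimated size assumed 2190000000)
"""
pools = undersized_pools(sample)
for nr, info in pools.items():
    print(f"pool {nr}: short by {info['minimum'] - info['configured']} bytes")
```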
Case 6:
The dispatcher went down right after startup. The dev_disp shows:
rdisp/http_min_wait_dia_wp : 1 -> 1
***LOG CPS=> DpLoopInit, ICU ( 3.4 3.4 4.1) [dpxxdisp.c 1706]
***LOG Q0K=> DpMsAttach, mscon ( CLMAGMACI) [dpxxdisp.c 12674]
*** DP_FATAL_ERROR => DpMsAttach: local hostname 'SRMAGMA98' is resolved
to loopback address (cf. SAP note 1054467 for details)
*** DISPATCHER EMERGENCY SHUTDOWN ***
Root cause:
Incorrect setup of the loopback address.
Solution:
SAP Note 1054467 Local host name refers to loopback address
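Whether the local host name resolves to a loopback address can be checked quickly before restarting. The following Python sketch uses only the standard library. The function names are hypothetical, and the result depends on the name resolution of the machine where it runs, so it is a first check rather than a substitute for the SAP Note.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """True for 127.0.0.0/8 and ::1."""
    return ipaddress.ip_address(address).is_loopback


def resolves_to_loopback(hostname: str) -> bool:
    """Resolve `hostname` the way the OS does and test the resulting address."""
    return is_loopback(socket.gethostbyname(hostname))


try:
    name = socket.gethostname()
    state = "LOOPBACK (problem)" if resolves_to_loopback(name) else "ok"
    print(f"{name}: {state}")
except OSError as exc:
    # Name resolution itself failed.
    print(f"could not resolve local hostname: {exc}")
```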
Case 7:
The dispatcher went down right after startup. The dev_disp shows:
*** ERROR => gateway (pid 29622344) died [dpxxdisp.c 16763]
*** DP_FATAL_ERROR => Gateway died with status 2 - I better exit now
*** DISPATCHER EMERGENCY SHUTDOWN ***
The gateway trace dev_rd shows:
***LOG Q0I=> NiIBindSocket: bind (67: Address already in use) [nixxi.cpp 3740]
*** ERROR => NiIBindSocket: SiBind failed for hdl 9/sock 12 (SI_EPORT_INUSE/67; I4; ST; 0.0.0.0:3300) [nixxi.cpp 3740]
***LOG S0V=> GwStopGateway, gateway stopped () [gwxxrd.c 13096]
Root Cause:
The service port 3300 was declared twice in the /etc/services file.
sapgw00 3300/tcp
sapgw00s 3300/tcp
Solution:
Make sure the port is assigned to only one service.
sapgw00 3300/tcp
#sapgw00s 3300/tcp
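Duplicate port assignments like this one are easy to overlook in a long services file. The sketch below, in Python, lists every port/protocol pair that is assigned to more than one service name. The function name is hypothetical, and the two sample texts are the entries from this case before and after the fix.

```python
from collections import defaultdict


def duplicate_ports(services_text: str) -> dict[str, list[str]]:
    """Map 'port/proto' -> service names, for ports declared more than once."""
    by_port = defaultdict(list)
    for line in services_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2:
            by_port[fields[1]].append(fields[0])
    return {port: names for port, names in by_port.items() if len(names) > 1}


before = "sapgw00 3300/tcp\nsapgw00s 3300/tcp\n"
after = "sapgw00 3300/tcp\n#sapgw00s 3300/tcp\n"
print(duplicate_ports(before))
print(duplicate_ports(after))
```

To check a real system, read the file first, for example with open("/etc/services").read(), and pass the text to the function.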
Case 8:
The dispatcher went down right after startup. The sapstart.log and dev_disp show:
sapstart.log
(11925) Starting local Command:
Command: dw.sapPIP_DVEBMGS00
pf=/usr/sap/PIP/SYS/profile/PIP_DVEBMGS00_unjspfsappip
(11856) Waiting for Child Processes to terminate.
(11856) **** 2013/09/19 10:31:13 Child 11925 terminated with Status 0 . ****
dev_disp
*** ERROR => NiDgHdlBindName: invalid hostname 'localhost' (rc=-2;hdl 1) [nixx.c 4341]
*** ERROR => DpCommInit: NiDgHdlBindName failed: -8 [dpxxdisp.c 10339]
*** DP_FATAL_ERROR => DpSapEnvInit: DpCommInit
*** DISPATCHER EMERGENCY SHUTDOWN ***
Root Cause:
The “hosts” file was modified and “localhost” was misconfigured.
Solution:
Configure “localhost” in the “hosts” file correctly.
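A quick way to verify the “localhost” entry is to parse the hosts file and confirm that the name maps only to loopback addresses. This is a minimal Python sketch. The function names and the sample file contents (including the address 10.1.2.3) are illustrative assumptions and are not taken from the affected system.

```python
import ipaddress


def addresses_for(hosts_text: str, name: str) -> list[str]:
    """All addresses that a hosts file assigns to `name`."""
    found = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and name in fields[1:]:
            found.append(fields[0])
    return found


def localhost_ok(hosts_text: str) -> bool:
    """True if 'localhost' is defined and maps only to loopback addresses."""
    addrs = addresses_for(hosts_text, "localhost")
    try:
        return bool(addrs) and all(ipaddress.ip_address(a).is_loopback for a in addrs)
    except ValueError:
        return False  # entry is not a valid IP address


good = "127.0.0.1 localhost\n10.1.2.3 unjspfsappip\n"
missing = "#127.0.0.1 localhost\n10.1.2.3 unjspfsappip\n"
wrong = "10.1.2.3 localhost unjspfsappip\n"
for label, text in (("good", good), ("missing", missing), ("wrong", wrong)):
    print(label, localhost_ok(text))
```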
Case 9:
sapstartsrv is up.
When you execute sapcontrol without the -host option, you get an NIECONN_REFUSED (WSAECONNREFUSED) error. However, it succeeds with the -host option.
For example:
sapcontrol -nr 00 -function GetVersionInfo
FAIL: NIECONN_REFUSED (WSAECONNREFUSED: Connection refused),
NiRawConnect failed in plugin_fopen()
but
sapcontrol -nr 00 -host saphost.sap.co:50013 -function GetVersionInfo -> works!
Root Cause:
Please refer to SAP Note 2062508 sapcontrol commands fail with error "NIECONN_REFUSED (WSAECONNREFUSED: Connection refused)" during upgrade
Solution:
Comment out parameter service/hostname for the SAP start service and restart sapstartsrv.
sapstartsrv exists, but -function GetProcessList does not show programs such as the dispatcher, gateway, etc.
In the normal case, you will see something like this:
----
sapcontrol -nr 00 -function GetProcessList
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
msg_server, MessageServer, GREEN, Running, 2014 02 25 05:45:24, 0:00:53, 7326
disp+work, Dispatcher, GREEN, Running, Message Server connection ok, Dialog Queue time: 0.00 sec, 2014 02 25 05:45:24, 0:00:53, 7327
igswd_mt, IGS Watchdog, GREEN, Running, 2014 02 25 05:45:24, 0:00:53, 7328
---
--> Here, you can see that the programs (message server, disp+work and IGS) are up and running.
But in the problematic situation, you see something like this:
----
sapcontrol -nr 00 -function GetProcessList
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
----
--> No programs are listed.
This is because the 'Start_Program_n' command for the program is missing from the profile that sapstartsrv is started with.
For 'Start_Program_n', please refer to the following link:
http://help.sap.com/saphelp_nw74/helpdata/en/9c/ac0d50a40af26be10000000a423f68/frameset.htm
(NOTE: As of kernel release 74x, the 'Start Profile' is obsolete. You can no longer use a 'Start Profile' as in earlier releases. All the information in the Start Profile can be merged into the Instance Profile.)
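To confirm whether the profile used by sapstartsrv contains any Start_Program entries at all, the profile can be scanned for them. The following Python sketch is illustrative only: the function name is hypothetical and the sample profile lines are assumed examples of the usual 'Start_Program_NN = ...' syntax, not taken from a real system.

```python
import re

# Matches active (uncommented) Start_Program_NN lines only.
START_RE = re.compile(r"^[ \t]*Start_Program_(\d+)[ \t]*=[ \t]*(.+)$", re.MULTILINE)


def start_programs(profile_text: str) -> dict[int, str]:
    """Map program number -> command for every active Start_Program entry."""
    return {int(nr): cmd.strip() for nr, cmd in START_RE.findall(profile_text)}


sample_profile = """SAPSYSTEMNAME = ABC
_MS = ms.sap$(SAPSYSTEMNAME)_$(INSTANCE_NAME)
Start_Program_00 = local $(_MS) pf=$(_PF)
#Start_Program_01 = local $(_DW) pf=$(_PF)
Restart_Program_02 = local $(_IG) -mode=profile pf=$(_PF)
"""
programs = start_programs(sample_profile)
print(programs if programs else "no Start_Program entries: GetProcessList will be empty")
```

Commented-out lines are ignored on purpose, because a Start_Program entry that has been commented out has the same effect as a missing one.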
After upgrading to a 74x system, SAP does not start up.
In most cases, the reason is that sapstartsrv is pointing to the Start Profile. The Start Profile is obsolete as of the 74x kernel. You need to reconfigure your profile.
- Delete the Start Profile physically from the 'profile' directory.
- On UNIX, adjust the /usr/sap/sapservices file.
- On Windows, adjust the SAP service.
- Delete the UNIX domain socket or Windows named pipe, and restart sapstartsrv.
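On UNIX, the entries in /usr/sap/sapservices that still point to a Start Profile can be located with a short script. This Python sketch assumes that each entry passes the profile as a pf=<path> argument, as in the ps output shown earlier on this page, and that Start Profile file names begin with START_. The function name and the second sample line are hypothetical.

```python
import re


def start_profile_references(sapservices_text: str) -> list[str]:
    """Return every pf= path whose file name begins with START_."""
    hits = []
    for line in sapservices_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out entries
        for pf in re.findall(r"pf=(\S+)", line):
            if pf.rsplit("/", 1)[-1].startswith("START_"):
                hits.append(pf)
    return hits


sample = (
    "/usr/sap/ABC/DVEBMGS00/exe/sapstartsrv "
    "pf=/usr/sap/ABC/SYS/profile/START_DVEBMGS00_walldorf -D -u abcadm\n"
    "/usr/sap/ABC/SCS01/exe/sapstartsrv "
    "pf=/usr/sap/ABC/SYS/profile/ABC_SCS01_walldorf -D -u abcadm\n"
)
hits = start_profile_references(sample)
print(hits)
```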
UNIX Domain Socket .sapstream5xx13 was not enabled...
Case 1:
The sapstartsrv process exists but the corresponding UNIX domain socket does not exist.
Check sapstartsrv.log. It ends at the CCMS agent initialization phase:
cat sapstartsrv.old
---------------------------------------------------
trc file: "sapstartsrv.log", trc level: 0, release: "741"
---------------------------------------------------
pid 2134
[Thr 01] Mon Sep 15 10:09:13 2014
HistoryLog_Init: logfile (/usr/sap/ABC/SCS00/work/history.glf) already exists, check parameter
No halib defined => HA support disabled
CCMS agent initialization for instance type SCS: return code 0.
--> The log ends here, although sapstartsrv should go on to initialize the SAPControl web service.
Normally, you should see something like:
CCMS agent initialization for instance type SCS: return code 0.
CCMS agent disabled by profile parameter ccms/enable_agent = -1.
Initializing SAPControl Webservice
Auto PSE update thread started
Starting AutoRestart thread
Starting WebService SSL thread
AutoRestart thread started
Starting WebService thread
Webservice thread started, listening on port 55013
Webservice SSL thread started, listening on port 55014
Webservice SSL thread using default SAP SSL credential
Trusted https connect via Unix domain socket '/tmp/.sapstream55014' enabled.
Trusted http connect via Unix domain socket '/tmp/.sapstream55013' enabled.
Root Cause:
The CCMS agent is already running, started by saphostcontrol.
Solution:
Set ccms/enable_agent = 0 in the instance profile.
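The difference between the hung log and the normal log above can be detected automatically by looking for the web service and socket messages. The sketch below is a Python illustration that relies only on the message wording quoted in this case. The function names are hypothetical, and the sample logs are shortened copies of the excerpts above.

```python
import re

PORT_RE = re.compile(r"Webservice(?: SSL)? thread started, listening on port (\d+)")
SOCK_RE = re.compile(r"Unix domain socket '([^']+)' enabled")


def webservice_ports(log_text: str) -> list[int]:
    """Ports on which sapstartsrv reports that its web service is listening."""
    return [int(p) for p in PORT_RE.findall(log_text)]


def enabled_sockets(log_text: str) -> list[str]:
    """UNIX domain sockets that the log reports as enabled."""
    return SOCK_RE.findall(log_text)


hung_log = (
    "No halib defined => HA support disabled\n"
    "CCMS agent initialization for instance type SCS: return code 0.\n"
)
normal_log = hung_log + (
    "Initializing SAPControl Webservice\n"
    "Webservice thread started, listening on port 55013\n"
    "Webservice SSL thread started, listening on port 55014\n"
    "Trusted https connect via Unix domain socket '/tmp/.sapstream55014' enabled.\n"
    "Trusted http connect via Unix domain socket '/tmp/.sapstream55013' enabled.\n"
)
for label, log in (("hung", hung_log), ("normal", normal_log)):
    print(label, webservice_ports(log), enabled_sockets(log))
```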
Case 2:
A trace file called sapstartsrv_ccms.log was generated in the work directory. Its content ends with:
INFO: Checking lock file.....
Root Cause:
Please refer to SAP Note 1916333 sapcontrol error "FAIL: NIECONN_REFUSED" due to agent.log file
Solution:
Set ccms/enable_agent = 0 in the instance profile.