About Me

I am Suresh Chinta, an SAP HANA Cloud, SAP BTP, AWS, and Azure cloud consultant. I have experience in SAP Basis/NetWeaver and in S/4HANA Cloud implementation and support. I am a certified Microsoft Azure and AWS professional. I started this blog to share my knowledge with anyone interested in learning and advancing their career.

Tuesday, April 28, 2020

SAP HANA System Down, HANA not starting


This guide describes the checks to perform when an SAP HANA instance is not starting. At the end of the guide there are frequently asked questions and commonly encountered problems.

Checks to perform

The first thing to determine is whether the SAP HANA database is running. To do this, run:

ps -ef | grep hdb



If the HANA database is running, the following processes will be present:

hdbnameserver

hdbpreprocessor

hdbcompileserver

hdbindexserver

hdbstatisticsserver (this may not be present: after SP7 the statistics server can be merged into the indexserver)



Make sure the processes are run by the correct <SID>adm user, in case several HANA systems are running on the same host.

If you see the running processes then please review the System Hang section.
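The process check above can be sketched as a small shell helper. This is only an illustration: the function name missing_hdb_processes and the user h01adm are made up for this example, and the list of services is simply the one given above (without the optional statistics server).

```shell
# Sketch: report which core HANA services are missing from a process listing.
# Reads `ps -ef`-style text on stdin and prints one missing process per line.
# The function name is hypothetical; the process names come from the list above.
missing_hdb_processes() (
    listing=$(cat)
    for proc in hdbnameserver hdbpreprocessor hdbcompileserver hdbindexserver; do
        # -w matches the name as a whole word, -q only sets the exit status
        if ! printf '%s\n' "$listing" | grep -qw "$proc"; then
            echo "$proc"
        fi
    done
)

# Example on a live host (h01adm is a hypothetical <SID>adm user):
#   ps -ef | grep '^h01adm ' | missing_hdb_processes
```

No output means all four services were found for that user.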



To see whether the HANA database will start, log on to the host (for example via PuTTY), go to /usr/sap/<SID>/HDB<instance#> and run:

HDB start



If this fails, go to /usr/sap/<SID>/HDB<instance#>/exe, where you can try to run the processes manually. Usually you only need to call ./hdbnameserver and then ./hdbindexserver, and continue with the rest if these succeed. If the processes start successfully this way, the issue could be with hdbdaemon or sapstartsrv, and you should check the associated logs.





Check the HANA trace files in /usr/sap/<SID>/HDB<instance#>/<server name>/trace,

or create a full system dump by following SAP Note 1732157 - Collecting diagnosis information for SAP HANA.



Check the trace files in this order: daemon first, then nameserver, indexserver, compileserver, and preprocessor (the statistics server would not prevent the system from starting).

In most cases, once you have checked the indexserver trace, you should be able to see where the error lies.
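As a rough sketch of that checking order, the helper below prints the first suspicious line from each service trace. The function name, the file name pattern <service>_*.trc, and the search strings "error" and "rc=" are assumptions for illustration; adjust them to what your trace directory actually contains.

```shell
# Sketch: print the first error-looking line of each service trace, in the
# order daemon, nameserver, indexserver, compileserver, preprocessor.
# Assumptions: traces are named <service>_*.trc and problem lines contain
# "error" (any case) or "rc=".
first_trace_errors() (
    trace_dir=$1
    for svc in daemon nameserver indexserver compileserver preprocessor; do
        for f in "$trace_dir"/"$svc"_*.trc; do
            [ -f "$f" ] || continue          # glob matched nothing
            hit=$(grep -i -e 'error' -e 'rc=' "$f" | head -n 1)
            if [ -n "$hit" ]; then
                echo "$svc: $hit"
            fi
        done
    done
)

# Example (path placeholders as used above):
#   first_trace_errors /usr/sap/<SID>/HDB<instance#>/<server name>/trace
```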



Issues and Reported Problems

Common issues that we can see are:

Disk Full Error

In this case the trace file will contain 'rc=24 no space left on device' errors. For this, please

review SAP Note 2083715 - Analyzing log volume full situations



Corrupt Log Segments

The trace file will say something like "cannot find" or "cannot read" a log segment at a hexadecimal address. The only

resolution for a corrupt log segment is a recovery that does not involve that log segment.



Missing Log Segments

In the trace we will see 'Cannot open file "/<path_to_missing_logsegment>/logsegment_000_XXXXXXXX.dat", rc=2: No such file or directory'.

For this, please review SAP Note 1788692 - Index Server crash due to missing LogSegment file



Authorization Issues

In the trace we will see the message 'not authorized'. In this scenario, log on as the <SID>adm user and check whether that user can create a file in the location specified in the trace. If you cannot create the file, run the chmod command on the folder to allow reading and writing (e.g. chmod 764).
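The write test described above can be sketched as follows. The function name can_write_dir and the probe file name are made up for this example; run it as the <SID>adm user against the directory named in the trace.

```shell
# Sketch: check whether the current user can create a file in a directory.
# The probe file name is arbitrary; it is removed again if it was created.
can_write_dir() (
    probe="$1/.write_probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable"
    else
        echo "not writable"
    fi
)

# Example (directory placeholder, take the real path from the trace):
#   can_write_dir /<path_from_trace>
```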



Hardware issue

There is no generic line in the trace that would point to hardware, but if the issue is OS related or a disk cannot be mounted, please follow the

hardware portion of the survival guide.



HANA Not Starting after a failed hdblcm rename (hdbrename)

When you try to start HANA it fails with "process hdbdaemon HDB Daemon not running". No daemon, nameserver, or indexserver trace is created which indicates that it hasn't even gotten to the point of trying to start the services.

SAP Note 2142432 - SAP HANA does not start after a failed attempt to rename the HANA SID
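To tie the trace messages above to the matching SAP Notes, a simple classifier might look like this. It is a sketch only: the function name is hypothetical, it matches just the literal strings quoted in this post, and it does not cover corrupt log segments because their message text varies.

```shell
# Sketch: map one trace line to one of the issues described above.
# Only the literal strings quoted in this post are matched.
classify_trace_line() {
    case $1 in
        *'rc=24'*|*'no space left on device'*)
            echo "disk full: see SAP Note 2083715" ;;
        *'logsegment_'*'rc=2'*)
            echo "missing log segment: see SAP Note 1788692" ;;
        *'not authorized'*)
            echo "authorization: check write access as <SID>adm" ;;
        *)
            echo "unclassified" ;;
    esac
}

# Example:
#   classify_trace_line "$(grep -m 1 'rc=' <trace_file>)"
```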







System Crash

Open an SAP incident and attach a full system dump (SAP Note 1732157 - Collecting diagnosis information for SAP HANA).

HANA up but SAP system not starting:

Check if a connection is possible to the database by running

R3trans -d

This ends with a return code: RC < 8 means a successful connection to the database, while rc=12 means a failure.

Check the trans.log file that is produced for further details about why the ABAP side of the SAP system could not connect to the database.
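The return code rule can be sketched in a few lines of shell. The function name r3trans_status is made up, and the handling of codes that are neither below 8 nor exactly 12 is an assumption, since only those two cases are described here.

```shell
# Sketch: interpret the exit code of `R3trans -d` as described above.
# rc < 8 is treated as a successful connection, rc = 12 as a failure;
# any other value is reported as-is (assumption).
r3trans_status() {
    rc=$1
    if [ "$rc" -lt 8 ]; then
        echo "connected"
    elif [ "$rc" -eq 12 ]; then
        echo "failed: check trans.log"
    else
        echo "rc=$rc: check trans.log"
    fi
}

# Example as <sid>adm:
#   R3trans -d; r3trans_status $?
```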

Here are some examples of common issues when R3trans -d results in rc=12:



Your HANA DB rev is SPS9 (rev 90 or higher) and you see something similar to what is listed below:
"

4 ETW000  [     dev trc,00000]  Database release is HDB 1.00.090.00.1413897729                            54  0.055046
4 ETW000  [dbhdbsql.cpp,00000]  *** ERROR => Using non supported HANA version: 1.00.090.00.1413897729
4 ETW000  [dbhdbsql.cpp,00000]  *** ERROR => Min. version for this release must be 1.00.62

"

 Please see SAP Note 1952701  - DBSL supports new SAP HANA SP9 version number





Timezone and DST issues:

The system may come up but have dumps of ZDATE_LARGE_TIME_DIFF

Follow the guidelines at: http://scn.sap.com/docs/DOC-58741  

SAP Note 1932132 - SAP HANA : Large time difference between application server and HANA database

SAP KBA  2137138 - Timezone name incorrect after DST switch

Related Documents

 For DST preparations: http://scn.sap.com/docs/DOC-58741












Related SAP Notes/KBAs

SAP Note 1732157 - Collecting diagnosis information for SAP HANA

SAP Note 2083715 - Analyzing log volume full situations

SAP Note 1788692 - Index Server crash due to missing LogSegment file






If the SAP instance is not starting

How to check why an SAP instance is not starting on Linux – disp+work dispatcher IGS Watchdog Gateway ICM




This section shows how to check and analyze why an SAP instance is not starting.

There are many possible root causes, for example:

SAP buffer memory allocation issue.

Shared memory allocation issue.

Dispatcher or work process is in a stuck/hung state.

Port issue, etc.

Root causes and analysis – SAP instance:

Memory Allocation issues :

If the system sizing and file systems are maintained according to the standard SAP guides and the business process requirements, the system allocates its buffers with default minimum values. Sometimes, however, depending on the installation and workload, the work processes need additional shared memory segments or larger or smaller ABAP buffer sizes. In this case you can analyze the allocation by running the sappfpar command below as the <SID>adm user at OS level.


>/usr/sap/<SID>/SYS/exe/run/sappfpar check pf=/usr/sap/<SID>/SYS/profile/<profile_name> nr=<instance number> name=<SID> | more
Here, the profile name should look like "<SID>_D<instance no>_<hostname>".
After running the command you get a report of all buffer memory allocations and requirements, with errors and warnings. Change the profile parameter values as required and confirm by running the command again.

Dispatcher is stopped :

Most of the time the SAP instance does not start because the respective dispatcher is not running. You can easily check and confirm this with the command below.
> sapcontrol -nr <instance number> -function GetProcessList




If, for example, a network issue has occurred, all services on the server go down. If you then try to start the instance services manually, they may fail to start and the dispatcher stays in stopped state, shown as GRAY rather than GREEN with Running status. Solution:

Find the process IDs (PIDs) of the stopped work processes by running the above command again. Then kill all of those work processes manually.
> kill -9 <WP ID>


Then start the SAP instance again and check the dispatcher status.
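To pick the non-GREEN entries out of the GetProcessList output, a filter along these lines may help. It is a sketch under one assumption: that sapcontrol prints comma-separated rows in the order name, description, dispstatus, textstatus, starttime, elapsedtime, pid. The function name is made up; verify the column layout on your system before relying on it.

```shell
# Sketch: read GetProcessList output on stdin and print "name pid" for every
# process whose dispstatus (3rd column) is not GREEN.
# Assumed row layout:
#   name, description, dispstatus, textstatus, starttime, elapsedtime, pid
non_green_processes() {
    awk -F', ' 'NF >= 7 && $1 != "name" && $3 != "GREEN" { print $1, $NF }'
}

# Example:
#   sapcontrol -nr <instance number> -function GetProcessList | non_green_processes
```

Review the printed PIDs by hand before passing any of them to kill -9.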

Gateway/Dispatcher Ports issue :

Sometimes both instances, ASCS and PAS, are started but the respective dispatcher is not in running status. This happens when the gateway/dispatcher ports 33<nn> and 32<nn> are not free on the server at startup because another process already holds them. You need to find and kill those processes manually using the commands below.
> fuser <port>/tcp or > netstat -nap | grep <port> : to find the process listening on the port
> fuser -k <port>/tcp : to kill the process listening on the port
Otherwise, simply reboot the application server.
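The port numbers follow directly from the instance number, so the lookup can be sketched as below. Both function names are made up, and the second one assumes the usual Linux netstat -nap column layout (local address in column 4, PID/program name last).

```shell
# Sketch: derive the dispatcher (32<nn>) and gateway (33<nn>) ports
# from a two-digit instance number.
instance_ports() {
    echo "dispatcher=32$1 gateway=33$1"
}

# Sketch: read `netstat -nap` output on stdin and print the PID/program
# column of every LISTEN entry bound to the given port.
listeners_on_port() {
    awk -v p=":$1" '/LISTEN/ && $4 ~ (p "$") { print $NF }'
}

# Example for instance 00:
#   instance_ports 00
#   netstat -nap | listeners_on_port 3300
```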


Once the database is up and running, the <SID>adm user on the application server should be able to connect to it. You can verify this with the command below; R3trans should finish with return code '0000'.
#sidadm> R3trans -d


If not, the cause may be that the dispatcher and gateway are not working; you can verify this by repeating Step 2. You can also check trans.log as shown below:
>su - <sid>adm
>cat trans.log



Buffer instance IPC cleanup process :

You can clean up the IPC buffers by running the commands below at instance level.

Switch to the <sid>adm user, then run the command below:

cleanipc <instance no> remove
OR

cleanipc all remove

Note: If you still face issues, check the log files below, which are located in the instance work directory, and take action accordingly.
dev_disp
dev_icm
dev_rd
dev_w0, dev_w1
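
As a starting point for reading these files, the sketch below counts error lines per file. The function name is made up, and it assumes that errors are marked with the string "*** ERROR", as in the trans.log excerpt earlier in this post; pass the instance work directory as the argument.

```shell
# Sketch: count lines containing "*** ERROR" in the developer traces listed
# above and print "file: count" for every file that has at least one.
# Assumption: errors are marked with "*** ERROR" as in the trans.log excerpt.
count_work_errors() (
    work_dir=$1
    for name in dev_disp dev_icm dev_rd dev_w0 dev_w1; do
        f=$work_dir/$name
        [ -f "$f" ] || continue
        # grep -c prints 0 and exits non-zero when nothing matches
        n=$(grep -c -F '*** ERROR' "$f" || true)
        if [ "$n" -gt 0 ]; then
            echo "$name: $n"
        fi
    done
)

# Example:
#   count_work_errors <instance work directory>
```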