A colleague and I were recently working on a Db2 server with a large amount of memory (512 Gb) and were supporting the system administrators in applying operating system maintenance to bring the system from RedHat Enterprise Linux (RHEL) 7.6 to RHEL 7.8.
The operating system upgrade was successful, however, we found that we were unable to activate the database after the system was rebooted. We received an SQL0930N error (“There is not enough storage available to process the statement.”) in response to the ACTIVATE DATABASE command.
Looking in the db2diag.log file, we found the following error message:
2020-08-10-188.8.131.523633-240 I64894E612 LEVEL: Error PID : 24888 TID : 140736762734336 PROC : db2sysc 0 INSTANCE: db2inst1 NODE : 000 DB : SAMPLE APPHDL : 0-7 APPID: *LOCAL.db2inst1.200810155758 AUTHID : DB2INST1 HOSTNAME: db2server EDUID : 48 EDUNAME: db2agent (SAMPLE) 0 FUNCTION: DB2 UDB, buffer pool services, sqlbinit, probe:604 MESSAGE : ZRC=0x850F0005=-2062614523=SQLO_NOSEG "No Storage Available for allocation" DIA8305C Memory allocation failure occurred.
The SQLO_NOSEG error code pointed me to the issue that Db2 was unable to allocate enough shared memory segments to support the database’s memory requirements.
Finding the issue
We followed IBM’s documentation for setting the kernel parameters kernel.shmmni and kernel.msgmni, which states that you should set kernel.shmmni to 256 times the size of RAM (in GB), and kernel.msgmni to 1024 times the size of RAM (also in GB). On this server, with 512 Gb RAM, the values were set to 131072 and 524288, respectively.
However, we found that upon rebooting to the new kernel version included with RHEL 7.8 (kernel version 3.10.0-1127), these values were not set as expected:
$ ipcs -l ------ Messages Limits -------- max queues system wide = 4096 max size of message (bytes) = 2097152 default max size of queue (bytes) = 2097152 ------ Shared Memory Limits -------- max number of segments = 4096 max seg size (kbytes) = 536870912 max total shared memory (kbytes) = 1073741824 min seg size (bytes) = 1 ------ Semaphore Limits -------- max number of arrays = 129024 max semaphores per array = 250 max semaphores system wide = 256000 max ops per semop call = 32 semaphore max value = 32767
Furthermore, attempts to manually increase these values also failed:
# sysctl -w kernel.shmmni=131072 sysctl: setting key "kernel.shmmni": Invalid argument kernel.shmmni = 131072
Experimentation revealed that we were able to increase both kernel.shmmni and kernel.msgmni to 32768, but no further. If you are following IBM’s recommendations, this means that you won’t be able to set kernel.msgmni properly if you have more than 32Gb of RAM on your server.
Fortunately, we were working with a very smart system engineer, and after scratching our heads for a while (and considering rolling the maintenance back), she found this RedHat support article.
The article confirmed our suspicion that RedHat had made some changes to the kernel in RedHat 7.8, limiting the kernel.msgmni, kernel.semmni, and kernel.shmmni to 32768.
To our relief, the article provides a solution: add the ipcmni_extend boot-time command-line option for the kernel. We set this parameter by adding the ipcmni_extend option to GRUB_CMDLINE_LINUX line in the /etc/default/grub file and executing the grubby command. After rebooting the server, we were relieved to discover that ipcs -l was reporting the proper values for kernel.msgmni and kernel.shmmni.
With these values set properly, we were able to activate our database and inform the customer that the operating system patching had been completed successfully.