Resolving SQL0930N Memory Errors after an RHEL 7.8 Upgrade

A colleague and I were recently working on a Db2 server with a large amount of memory (512 Gb) and were supporting the system administrators in applying operating system maintenance to bring the system from RedHat Enterprise Linux (RHEL) 7.6 to RHEL 7.8.

The operating system upgrade was successful, however, we found that we were unable to activate the database after the system was rebooted. We received an SQL0930N error (“There is not enough storage available to process the statement.”) in response to the ACTIVATE DATABASE command.

Looking in the db2diag.log file, we found the following error message:

2020-08-10-11.58.48.813633-240 I64894E612            LEVEL: Error  
PID     : 24888                TID : 140736762734336 PROC : db2sysc 0  
INSTANCE: db2inst1             NODE : 000            DB   : SAMPLE  
APPHDL  : 0-7                  APPID: *LOCAL.db2inst1.200810155758  
AUTHID  : DB2INST1             HOSTNAME: db2server  
EDUID   : 48                   EDUNAME: db2agent (SAMPLE) 0  
FUNCTION: DB2 UDB, buffer pool services, sqlbinit, probe:604  
MESSAGE : ZRC=0x850F0005=-2062614523=SQLO_NOSEG  
          "No Storage Available for allocation"  
          DIA8305C Memory allocation failure occurred.

The SQLO_NOSEG error code pointed me to the issue that Db2 was unable to allocate enough shared memory segments to support the database’s memory requirements.

Finding the issue

We followed IBM’s documentation for setting the kernel parameters kernel.shmmni and kernel.msgmni, which states that you should set kernel.shmmni to 256 times the size of RAM (in GB), and kernel.msgmni to 1024 times the size of RAM (also in GB). On this server, with 512 Gb RAM, the values were set to 131072 and 524288, respectively.

However, we found that upon rebooting to the new kernel version included with RHEL 7.8 (kernel version 3.10.0-1127), these values were not set as expected:

$ ipcs -l  
  
------ Messages Limits --------  
max queues system wide = 4096  
max size of message (bytes) = 2097152  
default max size of queue (bytes) = 2097152  
  
------ Shared Memory Limits --------  
max number of segments = 4096  
max seg size (kbytes) = 536870912  
max total shared memory (kbytes) = 1073741824  
min seg size (bytes) = 1  
  
------ Semaphore Limits --------  
max number of arrays = 129024  
max semaphores per array = 250  
max semaphores system wide = 256000  
max ops per semop call = 32  
semaphore max value = 32767

Furthermore, attempts to manually increase these values also failed:

# sysctl -w kernel.shmmni=131072  
sysctl: setting key "kernel.shmmni": Invalid argument  
kernel.shmmni = 131072

Experimentation revealed that we were able to increase both kernel.shmmni and kernel.msgmni to 32768, but no further. If you are following IBM’s recommendations, this means that you won’t be able to set kernel.msgmni properly if you have more than 32Gb of RAM on your server.

Thanks, RedHat?

Fortunately, we were working with a very smart system engineer, and after scratching our heads for a while (and considering rolling the maintenance back), she found this RedHat support article.

The article confirmed our suspicion that RedHat had made some changes to the kernel in RedHat 7.8, limiting the kernel.msgmni, kernel.semmni, and kernel.shmmni to 32768.

To our relief, the article provides a solution: add the ipcmni_extend boot-time command-line option for the kernel. We set this parameter by adding the ipcmni_extend option to GRUB_CMDLINE_LINUX line in the /etc/default/grub file and executing the grubby command. After rebooting the server, we were relieved to discover that ipcs -l was reporting the proper values for kernel.msgmni and kernel.shmmni.

With these values set properly, we were able to activate our database and inform the customer that the operating system patching had been completed successfully.