GVM, Redis DBs, and resource management

GVM versions

gsad: Greenbone Security Assistant 20.08.0~git
gvmd: Greenbone Vulnerability Manager 20.08.0
openvas-scanner: OpenVAS 20.8.0
gvm-libs: gvm-libs 20.8.0

Environment

Operating system: Debian 10.7 (buster)
Kernel: Linux openvas 4.19.0-13-amd64 #1 SMP Debian 4.19.160-2 (2020-11-28) x86_64 GNU/Linux
Installation method / source: Release tarballs

GVM apparently got OOM Killed doing a scan overnight. Now, I can’t run a scan without encountering the same error as before:

==> /var/log/gvm//ospd-scanner.log <==
OSPD[3663] 2021-01-08 10:25:09,012: INFO: (ospd.ospd) Currently 1 queued scans.
OSPD[3663] 2021-01-08 10:25:10,528: INFO: (ospd.ospd) Starting scan 7225d010-b6d4-4050-988b-768499ba8a20.                                                                      
OSPD[3663] 2021-01-08 10:25:10,731: ERROR: (ospd.ospd) While scanning: 7225d010-b6d4-4050-988b-768499ba8a20                                                                    
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/ospd/ospd.py", line 571, in start_scan
    self.exec_scan(scan_id)
  File "/usr/local/lib/python3.7/dist-packages/ospd_openvas/daemon.py", line 1226, in exec_scan                                                                                
    openvas_scan_id = scan_prefs.prepare_openvas_scan_id_for_openvas()
  File "/usr/local/lib/python3.7/dist-packages/ospd_openvas/preferencehandler.py", line 94, in prepare_openvas_scan_id_for_openvas                                             
    self.kbdb.add_scan_id(self.scan_id, self._openvas_scan_id)
AttributeError: 'NoneType' object has no attribute 'add_scan_id'
OSPD[3663] 2021-01-08 10:25:10,764: INFO: (ospd.ospd) 7225d010-b6d4-4050-988b-768499ba8a20: Scan finished.

As previously mentioned, I think this specific crash is a bug in ospd. However, the bug is triggered when it finds no redis dbs to connect to.

I have configured a redis with max of 128 databases (up from the default 16). When I ran into this issue originally, most of the databases were unused. How, each of the databases have at least some keys in them.

Reading through the source code for ospd_openvas it looks like my problem is likely due to the OOM kill leaving open DBs without properly releasing them, IOW, all my redis DBs are now marked as “in use”:

class KbDB:
    ...
    def try_database(self, index: int) -> bool:
        """ Check if a redis db is already in use. If not, set it
        as in use and return.

        Arguments:
            ctx: Redis object connected to the kb with the
                DBINDEX_NAME key.
            index: Number intended to be used.

        Return True if it is possible to use the db. False if the given db
            number is already in use.
        """
        _in_use = 1
        try:
            resp = self.ctx.hsetnx(DBINDEX_NAME, index, _in_use)
        except:
            raise OspdOpenvasError(
                'Redis Error: Not possible to set %s.' % DBINDEX_NAME
            )

        return resp == 1

So, first of all, how to I clear this in-use flag? Find the key name, and manually update all databases?

Secondly, can someone help me understand how GVM uses redis dbs, and what my expectations of its behavior should be? Why does it matter how many I have configured? I imagine that the max number of databases likely has a strong relationship with system memory use, and my have some implied relationship with max concurrent tasks, or hosts scanned, or tests per host, etc.

It seems that GVM doesn’t self regulate sufficiently, and given that, I need some framework within which I can decide how to best configure my system to operate within its constraints.

I want to configure the minimum number of redis dbs required, reduce memory use to the extent possible in order to avoid OOM conditions. Within those limitations, I’d like to maximize concurrency.

Any guidance?

So, re: issue number one (how to clear the flag):

So, GVMD running, but no active scans, all dbs are currently marked “in use” with the exception of the default database (db index 0), where this global “in use” data is stored.

root@openvas:~# for idx in $(seq 0 128); do redis-cli -n 0 hget GVM.__GlobalDBIndex $idx; done |grep 1 |wc -l                                                                  
127

I shut down all gvm related services…and all databases are still marked as “in use”:

root@openvas:~# systemctl stop gvmd ospd-openvas gsad;
root@openvas:~# for idx in $(seq 0 128); do redis-cli -n 0 hget GVM.__GlobalDBIndex $idx; done |grep 1 |wc -l                                                                  
127

I manually clear the “in use” key:

root@openvas:~# for idx in $(seq 0 128); do redis-cli -n 0 hdel GVM.__GlobalDBIndex $idx; done > op.log

Start gvm services again, and only 1 db is now marked as “in use”:

root@openvas:~# systemctl start gvmd ospd-openvas gsad;                                                                                                                        
root@openvas:~# for idx in $(seq 0 128); do redis-cli -n 0 hget GVM.__GlobalDBIndex $idx; done |grep 1 |wc -l                                                                  
1

And now, the scan at least starts without error:

==> /var/log/gvm/gvmd.log <==
event task:MESSAGE:2021-01-08 12h21.27 EST:11988: Status of task Scan developer workstations now (2cbc544c-0eb2-4cb7-88af-935135042cbf) has changed to Requested               
event task:MESSAGE:2021-01-08 12h21.27 EST:11988: Task Scan developer workstations now (2cbc544c-0eb2-4cb7-88af-935135042cbf) has been requested to start by admin             

==> /var/log/gvm/ospd-scanner.log <==
OSPD[11612] 2021-01-08 12:22:39,972: INFO: (ospd.command.command) Scan cdc8277f-06f2-4123-9f1f-c5ebc2b19a2b added to the queue in position 1.                                  

==> /var/log/gvm/gvmd.log <==
event task:MESSAGE:2021-01-08 12h22.40 EST:11991: Status of task Scan developer workstations now (2cbc544c-0eb2-4cb7-88af-935135042cbf) has changed to Queued                  

==> /var/log/gvm/ospd-scanner.log <==
OSPD[11612] 2021-01-08 12:22:41,319: INFO: (ospd.ospd) Currently 1 queued scans.
OSPD[11612] 2021-01-08 12:22:42,712: INFO: (ospd.ospd) Starting scan cdc8277f-06f2-4123-9f1f-c5ebc2b19a2b.                                                                     

==> /var/log/gvm/gvmd.log <==
event task:MESSAGE:2021-01-08 12h22.45 EST:11991: Status of task Scan developer workstations now (2cbc544c-0eb2-4cb7-88af-935135042cbf) has changed to Running                 

==> /var/log/gvm/openvas.log <==
sd   main:MESSAGE:2021-01-08 17h23.32 utc:12425: openvas 20.8.0 started

So far, this scan has caused 4 additional databases to be marked as “in use”

root@openvas:~# for idx in $(seq 0 128); do redis-cli -n 0 hget GVM.__GlobalDBIndex $idx; done |grep 1 |wc -l                                                                  
5

There is a documentation on how the scanner is using Redis available here:

3 Likes

Thanks, that was helpful.

1 Like