Retrieving large reports using the gvm scripts

altjx · July 9, 2021, 12:37pm

I have managed to put together the following method to retrieve a completed report in both CSV and “Anonymous XML” formats:

import base64

from gvm.protocols.gmp import Gmp


def get_report(gmp: Gmp, report_id, file_name):
    # First, find the IDs of the report formats we want to export
    formats = gmp.get_report_formats()
    for report_format in formats.xpath("//report_format"):
        # Index 1 is kept from the original post
        name = report_format.xpath(".//name")[1].text
        if name == "CSV Results":
            extension = "csv"
        elif name == "XML":
            extension = "xml"
        else:
            # Skip every other format instead of requesting the report for it
            continue
        report_format_id = report_format.xpath("@id")[0]

        # Fetch the report once per wanted format
        resp = gmp.get_report(report_id, report_format_id=report_format_id, ignore_pagination=True)

        if extension == "csv":
            # The CSV arrives base64-encoded in the text of the <report> element
            csv_in_b64 = resp.xpath("report/text()")[0]
            with open("{}.{}".format(file_name, extension), "wb") as out:
                out.write(base64.b64decode(csv_in_b64))
        else:
            with open("{}.{}".format(file_name, extension), "w") as out:
                out.write(print_pretty_xml(resp))

This works perfectly fine for smaller reports, but it fails on larger ones. I have a report with a lot of findings that comes out as a 1.1MB CSV and a 22MB XML file when exported manually. However, when I run the script with pagination ignored, it just fails with a timeout error.

└─# time runuser -u openvas -- gvm-script --gmp-username admin --gmp-password $(cat /home/openvas/.openvas_creds.txt) socket /home/openvas/gvm-script.py --report /home/openvas/openvas-20a4d0cfc753-1424
Timeout while reading the response

real    1m2.808s
user    0m0.337s
sys     0m0.052s

As you can see from above, it waits like a minute or so before it times out. Is there any way that I can just simply increase the timeout window, or is there an “easy” way to implement retrieving the report contents using pagination? I noticed that all of the python gvm-scripts on GitHub ignore pagination, so I’m not sure what this would look like if pagination wasn’t ignored.
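
For reference, here is a minimal sketch of what fetching in pages could look like. It relies on the GMP filter keywords first= and rows= to request one slice of results at a time, and stops when a page comes back shorter than requested. The names iter_result_pages, fetch_page and fake_fetch are hypothetical and not part of python-gvm; fake_fetch only stands in for a real request so the sketch runs without a GVM daemon.

```python
from typing import Callable, Iterator, List


def iter_result_pages(
    fetch_page: Callable[[str], List], rows: int = 500
) -> Iterator[List]:
    """Yield pages of results until a short (or empty) page signals the end."""
    first = 1  # GMP filters count results starting at 1
    while True:
        page = fetch_page(f"first={first} rows={rows}")
        if page:
            yield page
        if len(page) < rows:
            break
        first += rows


# Hypothetical stand-in for a real request; it parses the filter string
# and returns the matching slice of a pretend 1200-result report.
def fake_fetch(filter_string: str) -> List[int]:
    params = dict(part.split("=") for part in filter_string.split())
    start, count = int(params["first"]), int(params["rows"])
    data = list(range(1, 1201))
    return data[start - 1 : start - 1 + count]


pages = list(iter_result_pages(fake_fetch, rows=500))
print([len(page) for page in pages])
```

With python-gvm, fetch_page would presumably wrap gmp.get_report(report_id, filter_string=...) and pull the result elements out of the response. Both the keyword name (older releases appear to call it filter) and the XPath to the results are assumptions to check against the installed version.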

bricks · July 9, 2021, 12:44pm

--help is your friend

  --timeout TIMEOUT     Response timeout in seconds, or -1 to wait indefinitely (default: 60)

altjx · July 9, 2021, 12:49pm

That’s pretty embarrassing, lol. I didn’t notice that initially. I did try it, however, and it still timed out:

└─# runuser -u openvas -- gvm-script --timeout -1 --gmp-username admin --gmp-password $(cat /home/openvas/.openvas_creds.txt) socket /home/openvas/gvm-script.py --report /home/openvas/openvas-20a4d0cfc753-1424
timed out

Am I missing something by chance?

EDIT:

Actually, I don’t think the --timeout option even matters in this scenario. I have a simple Python script that sleeps for 5 minutes, and when I ran it with --timeout 5 it didn’t time out after 5 seconds.

From what I can see, this is handled by DEFAULT_TIMEOUT=60 in the connections.py script and can’t be changed with command line arguments.
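
As a side note, the bare "timed out" text is the message Python's own socket module attaches to a read that exceeds the socket timeout. The sketch below reproduces that with a local socket pair and no gvm involved. The helper name read_with_timeout is made up for illustration, and this only shows where the wording comes from, not where gvm-script sets its timeout.

```python
import socket


def read_with_timeout(timeout):
    """Try to read from a socket that never receives data."""
    reader, writer = socket.socketpair()
    try:
        reader.settimeout(timeout)  # None would block indefinitely
        try:
            reader.recv(1024)
        except socket.timeout as exc:
            # The exception message is what ends up on the console
            return str(exc)
        return "data received"
    finally:
        reader.close()
        writer.close()


print(read_with_timeout(0.1))
```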

altjx · July 11, 2021, 2:43am

Well, as I mentioned in the previous post, the --timeout option doesn’t actually seem to do anything. Setting it to -1 or even 9999 made no difference for me, as it still timed out after 60 seconds.

So here’s my solution:

sed -i"" "s/DEFAULT_TIMEOUT = 60/DEFAULT_TIMEOUT = 999999/g" /usr/lib/python3/dist-packages/gvm/connections.py

This simply replaces “DEFAULT_TIMEOUT = 60” with “DEFAULT_TIMEOUT = 999999” in the connections.py script.

ioannisp · February 20, 2023, 7:51pm

Hi @altjx,
I have one question based on your code.
Which file did you import for function print_pretty_xml?

Thank you

altjx · February 20, 2023, 8:15pm

Hey @ioannisp,

I defined my print_pretty_xml function within the script itself here:

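import xml.dom.minidom
from xml.etree import ElementTree  # assumed; the original post does not show its imports


def print_pretty_xml(data):
    # Serialize the element, then re-parse it with minidom to get indented output
    xmlstr = ElementTree.tostring(data, encoding='utf8', method='xml')
    dom = xml.dom.minidom.parseString(xmlstr)
    pretty_xml_as_string = dom.toprettyxml()
    return pretty_xml_as_string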
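
For a report in the 22MB range, the minidom round trip above builds a second full copy of the document in memory. A lighter alternative, sketched here under the assumption of Python 3.9 or newer and a standard-library element, is to indent the tree in place with ElementTree.indent. The helper name pretty_xml is hypothetical; for the lxml elements that python-gvm returns, lxml's own etree.tostring(element, pretty_print=True) would be the counterpart.

```python
from xml.etree import ElementTree


def pretty_xml(element):
    """Indent the element tree in place and return it as a string."""
    ElementTree.indent(element, space="  ")  # available since Python 3.9
    return ElementTree.tostring(element, encoding="unicode")


# Small stand-in document instead of a real GMP response
root = ElementTree.fromstring("<report><name>demo</name></report>")
print(pretty_xml(root))
```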

bricks · February 21, 2023, 6:06am

There is a pretty_print function in gvm.xml already https://github.com/greenbone/python-gvm/blob/58b36c6c058fad1858d0000282f2b7a3d2c3ef35/gvm/xml.py#L82

ioannisp · February 21, 2023, 3:03pm

Thank you very much altjx and bricks!
You helped me a lot!