Invalid UTF-8 characters in database cause migration to fail ("fix" included)

I was upgrading from GVM 8.0.1 to 9.0.0 and noticed from the logs that the database migration (from 205 to 221) failed at version 214 with error message:

md manage:MESSAGE:2019-12-10 10h33.20 utc:45677: check_db_versions: database version of database: 205
md manage:MESSAGE:2019-12-10 10h33.20 utc:45677: check_db_versions: database version supported by manager: 221
md   main:CRITICAL:2019-12-10 10h33.20 utc:45677: gvmd: database is wrong version
md   main:MESSAGE:2019-12-10 10h35.12 utc:45701:    Greenbone Vulnerability Manager version 9.0.0 (DB revision 221)
md   main:   INFO:2019-12-10 10h35.12 utc:45701:    Migrating database.
md   main:   INFO:2019-12-10 10h35.12 utc:45701:    Migrating to 206
md   main:   INFO:2019-12-10 10h35.13 utc:45701:    Migrating to 207
md   main:   INFO:2019-12-10 10h35.13 utc:45701:    Migrating to 208
md   main:   INFO:2019-12-10 10h35.23 utc:45701:    Migrating to 209
md   main:   INFO:2019-12-10 10h35.23 utc:45701:    Migrating to 210
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 211
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 212
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 213
md   main:   INFO:2019-12-10 10h35.24 utc:45701:    Migrating to 214
md manage:WARNING:2019-12-10 10h38.51 utc:45701: sql_exec_internal: PQexec failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xb8

I straced the process and noticed that it was related to a result content:

sendto(4, “P\0\0\0\373\0SELECT rhd.value FROM report_host_details AS rhd JOIN report_hosts ON report_hosts.id = rhd.report_host WHERE name = ‘SSLDetails:C84C5DA49D7FAE53EECFA72DBDA3140004DA036BF7E7A1D79E278FDA1D56F619’ ORDER BY report_hosts.start_time DESC LIMIT 1;\0\0\0B\0\0\0”…, 289, MSG_NOSIGNAL, NULL, 0) = 289
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “1\0\0\0\0042\0\0\0\4T\0\0\0\36\0\1value\0\3\310`9\0\7\0\0\0\31\377\377\377\377\377\377\0\0D\0\0\0\221\0\1\0\0\0\207issuer:O=ADVUNI\303\202\302\270\303\203\302\274,OU=Storage,CN=ADVUNI\303\202\302\270\303\203\302\274 Storage Root CA 1|serial:669518|notBefore:20020701T000000|notAfter:20220701T000000C\0\0\0\rSELECT 1\0Z\0\0\0\5T”, 16384, 0, NULL, NULL) = 207
sendto(4, “P\0\0\0\225\0SELECT id FROM tls_certificates WHERE sha256_fingerprint = ‘C84C5DA49D7FAE53EECFA72DBDA3140004DA036BF7E7A1D79E278FDA1D56F619’ AND owner = 1\0\0\0B\0\0\0\16\0\0\0\0\0\0\0\1\0\0D\0\0\0\6P\0E\0\0\0\t\0\0\0\0\0S\0\0\0\4”, 187, MSG_NOSIGNAL, NULL, 0) = 187
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “1\0\0\0\0042\0\0\0\4T\0\0\0\33\0\1id\0\3\310a\\0\1\0\0\0\27\0\4\377\377\377\377\0\0C\0\0\0\rSELECT 0\0Z\0\0\0\5T”, 16384, 0, NULL, NULL) = 58
sendto(4, “P\0\0\5\307\0INSERT INTO tls_certificates (uuid, owner, name, comment, creation_time, modification_time, certificate, subject_dn, issuer_dn, trust, activation_time, expiration_time, md5_fingerprint, sha256_fingerprint, serial, certificate_format) SELECT make_”…, 1517, MSG_NOSIGNAL, NULL, 0) = 1517
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, “E\0\0\0sSERROR\0VERROR\0C22021\0Minvalid byte sequence for encoding "UTF8": 0xb8\0Fwchar.c\0L2017\0Rreport_invalid_encoding\0\0Z\0\0\0\5E”, 16384, 0, NULL, NULL) = 122
getpid() = 49282
open(“/usr/local/var/log/gvm/gvmd.log”, O_WRONLY|O_CREAT|O_APPEND, 0666) = 6
fstat(6, {st_mode=S_IFREG|0600, st_size=257, …}) = 0
write(6, “md manage:WARNING:2019-12-10 11h33.05 utc:49282: sql_exec_internal: PQexec failed: ERROR: invalid byte sequence for encoding "UTF8": 0xb8\n (7)\n”, 144) = 144

I checked the database and found out that the entry contained characters that caused the migration script to fail:

Certificate details:
subject …: O=ADVUNI¸ü,OU=Storage,CN=XXXX,CN=10.10.10.10

I ended up removing the results with problematic characters (from results and public.report_host_details table) and the database migration worked. Certificate is base64 encoded in the public.report_host_details table so search for the fingerprint.

I hope this helps someone with a similar issue before it gets fixed.

3 Likes

You could create a bugreport for the development team at Issues · greenbone/gvmd · GitHub to make them aware that a fix might be required.

True. Done:

4 Likes