In the protobuf: Web browser artifacts using Google’s data interchange format

The next goal was parsing these URLs with a Python script so that they could be sent to a reputation-checking API or added to a timeline.

Google provides guidance for working with protobufs in Python, but it’s a touch fiddly and probably too time-consuming for the harried forensicator during a case. Luckily, once the module has been generated for a ‘.proto’ file, it can be reused. So here’s one I prepared earlier, adapting the Google tutorial to my purposes. I also made a file for the resource_prefetch_predictor_host_redirect blobs.
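If you need to regenerate a module yourself, the protocol buffer compiler will emit the corresponding _pb2 file for you. As a minimal sketch, assuming the message definitions have been saved to a file named ‘RPPO.proto’ in the working directory:

protoc --python_out=. RPPO.proto

This writes ‘RPPO_pb2.py’ next to the ‘.proto’ file, ready to import.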

Note: to use these _pb2 files, you will need to install the protobuf library for Python. The easiest way to do this is with pip:

pip install protobuf

All that’s left is importing the respective file and parsing your Network Action Predictor database of choice:

import sqlite3
import RPPO_pb2

sqlite_db_file = "Network Action Predictor"
table = "resource_prefetch_predictor_origin"

con = sqlite3.connect(sqlite_db_file)
cur = con.cursor()
res = cur.execute(f"SELECT * FROM {table}")
records = res.fetchall()

RPPO = RPPO_pb2.OriginData()

for record in records:
    # The serialised protobuf blob sits in the second column of each row
    RPPO.ParseFromString(record[1])
    print(RPPO)

From there, you can work on accessing only the data you need. You may want to print the value in ‘RPPO.host’ and then loop through the repeated ‘origins’ field, printing each URL too:

print(RPPO.host)

for i in RPPO.origins:
    print(i.origin)
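Each entry in ‘origins’ is itself a message, so its other fields can be read in the same way. As a rough sketch, assuming the field names from Chromium’s resource_prefetch_predictor.proto definition of the OriginStat message:

for i in RPPO.origins:
    # number_of_hits / number_of_misses are assumed field names taken from
    # Chromium's resource_prefetch_predictor.proto (OriginStat message)
    print(i.origin, i.number_of_hits, i.number_of_misses)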

You may also want to decode the timestamp (which uses the same epoch as WebKit) with a function such as this:

import datetime

def parse_webkit_timestamp(timestamp):
    # WebKit timestamps are microseconds since 1601-01-01 00:00:00 UTC
    time = datetime.timedelta(microseconds=int(timestamp))
    time = datetime.datetime(1601, 1, 1) + time
    return time
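To put it to use, pass in the raw integer from the record. A minimal sketch, assuming the timestamp is held in the parsed message’s ‘last_visit_time’ field:

# last_visit_time is an assumed field name from the Chromium proto definition
print(parse_webkit_timestamp(RPPO.last_visit_time))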

Head over to ChrisTappin/Make-Resource-Prefetch-Predictor-Happen on GitHub if you want an example script that parses the records from a database, either for reading on screen or for writing out to a CSV.
