Auditlog: Add django-pghistory as audit log (optional for now) #13126
Summary
As part of evaluating and improving import/reimport performance we have selected django-pghistory as the new auditlog for Defect Dojo.
It is entirely based on Postgres triggers to maintain a history/audit log. This makes it run faster and offers new features such as reverting a record to an old version, adding richer context to processes and events, finding events on related records, etc.
This PR doesn't introduce new features yet; it just adds django-pghistory as a replacement for django-auditlog. The latter is still the default as we want to do some tests before changing the default setting.
In a future release django-pghistory will be the auditlog implementation and django-auditlog will no longer be available for audit logging.
HowTo
To switch an instance over to django-pghistory, some steps are needed:
1. Set DD_AUDITLOG_TYPE to django-pghistory.
2. Run docker compose exec -it uwsgi ./manage.py pghistory backfill to load the initial snapshot of data into django-pghistory. (This is NOT a data migration from django-auditlog, but an initial snapshot from the current data in Defect Dojo.)
In the future both will be part of a new release which will perform the backfill automatically.
Once switched over, you cannot switch back (unless you know what you're doing).
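As a rough illustration of the first step, the sketch below shows how a settings module could resolve the audit log type from the environment. Only the DD_AUDITLOG_TYPE variable and its two values come from this description; the helper name `resolve_auditlog_type` and the validation behaviour are assumptions made for the example and may not match the actual Defect Dojo settings code.

```python
import os

# The two values named in this PR description.
AUDITLOG_DJANGO_AUDITLOG = "django-auditlog"    # still the default for now
AUDITLOG_DJANGO_PGHISTORY = "django-pghistory"  # opt-in
VALID_TYPES = {AUDITLOG_DJANGO_AUDITLOG, AUDITLOG_DJANGO_PGHISTORY}


def resolve_auditlog_type(environ=None):
    """Hypothetical helper: return the audit log type chosen via DD_AUDITLOG_TYPE."""
    environ = os.environ if environ is None else environ
    value = environ.get("DD_AUDITLOG_TYPE", AUDITLOG_DJANGO_AUDITLOG).strip()
    if value not in VALID_TYPES:
        # Fail early on typos such as "dhango-pghistory".
        raise ValueError(f"Unsupported DD_AUDITLOG_TYPE: {value!r}")
    return value
```

Rejecting unknown values at startup is a design choice of this sketch: a misspelled value would otherwise fall through silently to whichever backend the code treats as the fallback.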
The action history pages will always display the data from both django-pghistory and django-auditlog. As these data formats are completely different, there are two tables on the action history page.
Some notes about the implementation
django-pghistory works with database triggers. These are created as part of a migration regardless of which audit log is configured.
django-auditlog depending on the chosen auditlog type in settings.
Performance
I did a couple of test runs to compare both audit log types using the JFrog Unified Very Many sample scan that contains ~14000 findings. These test runs show a 20-30% speed improvement on my laptop running docker compose. In a (production) environment with an external database the difference might be bigger due to the increased latency. django-pghistory runs inside the database, so far fewer network round trips are needed.
Scaling
django-pghistory documents two settings or design decisions that affect performance: https://django-pghistory.readthedocs.io/en/3.4.3/performance/
Row level vs Statement level triggers
Defect Dojo doesn't do (m)any bulk inserts/updates on the tracked models. So there's no benefit now to switch to Statement level triggers.
If needed we can switch in the future using a one-time schema-only migration.
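The reasoning above can be checked with a small counting model. In Postgres, a row-level trigger fires once per affected row and a statement-level trigger fires once per statement. The sketch below only counts firings; it does not model the cost of each firing, and the workload shapes are assumptions chosen to match the ~14000 findings mentioned under Performance.

```python
def trigger_invocations(rows_per_statement, level):
    """Count trigger firings for a workload.

    rows_per_statement: one entry per INSERT/UPDATE statement, giving the
    number of rows that statement affects.
    level: "row" (FOR EACH ROW) or "statement" (FOR EACH STATEMENT).
    """
    if level == "row":
        return sum(rows_per_statement)
    if level == "statement":
        return len(rows_per_statement)
    raise ValueError(f"unknown trigger level: {level!r}")


# Assumed workloads: 14000 single-row saves versus one bulk insert of 14000 rows.
single_row_saves = [1] * 14000
one_bulk_insert = [14000]
```

With single-row saves both levels fire the same number of times, so statement-level triggers bring nothing; the gap only opens up once the same rows arrive in bulk statements.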
Denormalizing Context
Context is stored in its own table and events have a foreign key relation to this table. If the Context is intensively used and updated during processes/requests and lots of parallel requests are happening that trigger pghistory events, there could be some contention on that Context table. To "solve" this we could choose to store the context in a column in the Event tables. I don't think the typical use-case in Defect Dojo would really benefit from this; it could even be detrimental, as it would require more storage and filtering/extracting data from the context might be slower.
If needed we can switch in the future using a one-time schema+data migration.
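To make the storage trade-off concrete, here is a minimal sketch that models both layouts with plain Python structures instead of real tables. The context payload, the field names and the JSON-length size measure are all illustrative assumptions; they are not the django-pghistory schema.

```python
import json

# Hypothetical context metadata attached to a request.
context = {"user": "admin", "url": "/api/v2/import-scan/"}


def normalized(n_events, ctx):
    """One shared context row; each event stores only a foreign key to it."""
    context_table = {1: ctx}
    events = [{"id": i, "context_id": 1} for i in range(n_events)]
    return context_table, events


def denormalized(n_events, ctx):
    """No context table; every event carries its own copy of the context."""
    return [{"id": i, "context": dict(ctx)} for i in range(n_events)]


def stored_bytes(obj):
    """Crude size proxy: length of the JSON serialisation."""
    return len(json.dumps(obj))
```

In this model the denormalized layout grows by one full context copy per event, while the normalized layout has a single context row that every event of the request refers to. That single row is also the one that concurrent writers would contend on.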
TODO: