fix: library objects synchronization on library update #2268
Walkthrough

The update introduces a generic synchronization method to the `LibraryUpdater` class.
Actionable comments posted: 2
🧹 Nitpick comments (4)
backend/core/tests/test_library_update.py (4)
`12-22`: **Consider extracting common fixture attributes as constants.**

The fixture creates a proper test library instance, but the hardcoded values could be extracted as module-level constants for better maintainability.

```diff
+# Test constants
+TEST_OLD_LIBRARY_URN = "urn:lib:old:1"
+TEST_LOCALE = "en"
+TEST_OLD_PROVIDER = "TestProvider"
+TEST_VERSION = 1
+
 @pytest.fixture
 def old_library():
     """Fixture to create a real LoadedLibrary instance in the test DB."""
     return LoadedLibrary.objects.create(
-        urn="urn:lib:old:1",
-        locale="en",
+        urn=TEST_OLD_LIBRARY_URN,
+        locale=TEST_LOCALE,
         default_locale=True,
-        provider="TestProvider",
-        version=1,
+        provider=TEST_OLD_PROVIDER,
+        version=TEST_VERSION,
     )
```
`87-115`: **Effective threat synchronization test with duplicate logic.**

The test correctly validates threat synchronization but contains significant code duplication with the reference control test.

Consider extracting a helper method to reduce duplication:

```diff
+    def _create_test_objects(self, model_class, old_library, test_data):
+        """Helper method to create test objects with common attributes."""
+        objects = []
+        for item in test_data:
+            obj = model_class.objects.create(
+                urn=item["urn"],
+                name=item["name"],
+                library=old_library,
+                locale="en",
+                default_locale=True,
+                provider="TestProvider",
+            )
+            objects.append(obj)
+        return objects

     def test_synchronize_threats(self, old_library, new_library):
         """Tests create, update, and unlink operations for Threats in one go."""
-        # Arrange
-        Threat.objects.create(
-            urn="urn:threat:200",
-            name="Threat Two Original",
-            library=old_library,
-            locale="en",
-            default_locale=True,
-            provider="TestProvider",
-        )
-        Threat.objects.create(
-            urn="urn:threat:300",
-            name="Threat Three To Unlink",
-            library=old_library,
-            locale="en",
-            default_locale=True,
-            provider="TestProvider",
-        )
+        test_data = [
+            {"urn": "urn:threat:200", "name": "Threat Two Original"},
+            {"urn": "urn:threat:300", "name": "Threat Three To Unlink"},
+        ]
+        self._create_test_objects(Threat, old_library, test_data)
```
`127-156`: **Effective update-only test with more code duplication.**

The test correctly validates the update scenario but continues the pattern of duplicated object-creation code. This test would also benefit from the helper method suggested earlier to reduce duplication.
`178-210`: **Thorough no-change scenario test with potential improvement.**

The test correctly validates that no changes occur when the database state matches the incoming data. However, there is a minor inconsistency in the provider field that could be addressed.

Consider extracting the provider value to maintain consistency:

```diff
+        provider = new_library.content.get("provider", "NewProvider")
         ReferenceControl.objects.create(
             urn=rc1_data["urn"],
             name=rc1_data["name"],
             library=old_library,
             locale="en",
             default_locale=True,
-            provider="NewProvider",
+            provider=provider,
             is_published=True,
         )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)

- backend/core/models.py (2 hunks)
- backend/core/tests/test_library_update.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)

backend/core/tests/test_library_update.py (1)

- backend/core/models.py (7)
  - `LibraryUpdater` (361-891)
  - `LoadedLibrary` (894-1004)
  - `StoredLibrary` (235-358)
  - `ReferenceControl` (1045-1115)
  - `Threat` (1007-1042)
  - `update_reference_controls` (486-492)
  - `update_threats` (494-500)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Analyze (python)
- GitHub Check: startup-docker-compose-test
- GitHub Check: test (3.12)
- GitHub Check: enterprise-startup-functional-test (3.12)
- GitHub Check: enterprise-startup-docker-compose-test
- GitHub Check: startup-functional-test (3.12)
- GitHub Check: enterprise-functional-tests (3.12, chromium)
- GitHub Check: functional-tests (3.12, chromium)
- GitHub Check: build (3.12)
🔇 Additional comments (8)
backend/core/models.py (1)
`7-7`: **`typing.Self` ties you to Python 3.11+.**

`Self` was added in Python 3.11. If the project’s runtime or tooling (e.g. mypy CI images) is still on ≤ 3.10, this single import will explode at import time. Consider either

```python
from typing import TYPE_CHECKING, Type, Union, List

if TYPE_CHECKING:  # or use typing_extensions
    from typing import Self
```

or simply switch to `typing_extensions.Self` until the minimum Python version is bumped.

backend/core/tests/test_library_update.py (7)
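To make the fallback concrete, here is a runtime-safe version guard (a standalone sketch, not the project's code; `QueryBuilder` is a hypothetical class used only to show the fluent-return pattern `Self` enables):

```python
import sys

# Use the stdlib Self on 3.11+, the typing_extensions backport otherwise
if sys.version_info >= (3, 11):
    from typing import Self
else:
    from typing_extensions import Self


class QueryBuilder:
    """Hypothetical fluent builder; Self makes each method return the caller's type."""

    def limit(self, n: int) -> Self:
        self._limit = n
        return self


qb = QueryBuilder().limit(10)
print(qb._limit)  # -> 10
```

Unlike the `TYPE_CHECKING` variant, this guard also works when annotations are evaluated at runtime (e.g. by pydantic or `typing.get_type_hints`).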
`1-10`: **LGTM! Clean and appropriate imports.**

The imports are well-organized and include all necessary models for testing the LibraryUpdater functionality.

`24-39`: **Good test data structure with clear naming.**

The fixture provides well-structured test data that covers the synchronization scenarios effectively.

`41-51`: **Fixture correctly creates StoredLibrary with proper dependencies.**

The fixture properly sets up the StoredLibrary with content and dependencies, aligning with the LibraryUpdater's expectations.

`53-56`: **Well-structured test class with clear documentation.**

The test class is properly organized with a descriptive docstring explaining its purpose.

`57-86`: **Comprehensive mixed-mode synchronization test with proper assertions.**

This test effectively validates the core synchronization logic by testing create, update, and unlink operations simultaneously. The assertions correctly verify all expected behaviors.

`116-126`: **Clear test for create-only scenario with good assertions.**

The test properly validates the creation of new objects when the database is empty, with appropriate assertions on both count and return values.

`157-177`: **Good edge case test for unlinking all objects.**

The test correctly handles the scenario where all existing objects should be unlinked when incoming data is empty. The approach of modifying the library content and saving it is appropriate.
```diff
+        if not incoming_data:
+            model_class.objects.filter(library=self.old_library).update(library=None)
+            return []
+
+        incoming_ids = {item[unique_field].lower() for item in incoming_data}
+
+        model_class.objects.filter(library=self.old_library).exclude(
+            **{f"{unique_field}__in": incoming_ids}
+        ).update(library=None)
+
+        existing_obj_map = {
+            getattr(obj, unique_field): obj
+            for obj in model_class.objects.filter(
+                **{f"{unique_field}__in": incoming_ids}
+            )
+        }
+
-    def update_reference_controls(self):
-        for reference_control in self.reference_controls:
-            normalized_urn = reference_control["urn"].lower()
-            ReferenceControl.objects.update_or_create(
-                urn=normalized_urn,
-                defaults=reference_control,
-                create_defaults={
-                    **self.referential_object_dict,
-                    **self.i18n_object_dict,
-                    **reference_control,
-                },
-            )
+        to_create, to_update = [], []
+        update_fields = set()
+
+        for item_data in incoming_data:
+            normalized_id = item_data[unique_field].lower()
+            update_fields.update(item_data.keys())
+
+            if normalized_id in existing_obj_map:
+                instance = existing_obj_map[normalized_id]
+                for key, value in item_data.items():
+                    setattr(instance, key, value)
+                instance.library = self.old_library
+                to_update.append(instance)
+            else:
+                create_data = {
+                    **item_data,
+                    unique_field: normalized_id,
+                    "library": self.old_library,
+                }
+                to_create.append(model_class(**create_data))
+
+        created_objects = (
+            list(model_class.objects.bulk_create(to_create)) if to_create else []
+        )
+        if to_update:
+            update_fields.add("library")
+            model_class.objects.bulk_update(to_update, list(update_fields))
```
**Generic sync helper can silently hijack rows & violate uniqueness.**

Key issues spotted:

- `existing_obj_map` is built with `model_class.objects.filter(**{f"{unique_field}__in": incoming_ids})`. There is no `library` filter, so objects that belong to another library are pulled in and later re-assigned via `instance.library = self.old_library`. Result: cross-library objects get “stolen”, breaking referential integrity.
- `update_fields` contains every key plus `library` plus `unique_field` (e.g. `urn`). Updating primary identifiers is useless and risks hitting unique constraints; keep them immutable.
- The union of arbitrary `item_data.keys()` goes straight into `bulk_update`. A rogue field in the YAML means a runtime `FieldError`.

Suggested minimal fix:

```diff
@@
-existing_obj_map = {
-    getattr(obj, unique_field): obj
-    for obj in model_class.objects.filter(
-        **{f"{unique_field}__in": incoming_ids}
-    )
-}
+existing_obj_map = {
+    getattr(obj, unique_field): obj
+    for obj in model_class.objects.filter(
+        **{
+            f"{unique_field}__in": incoming_ids,
+            "library": self.old_library,  # prevent cross-library hijack
+        }
+    )
+}
@@
-        update_fields.update(item_data.keys())
+        update_fields.update(
+            k for k in item_data.keys() if k != unique_field  # keep id immutable
+        )
@@
-    if to_update:
-        update_fields.add("library")
-        model_class.objects.bulk_update(to_update, list(update_fields))
+    if to_update:
+        model_class.objects.bulk_update(
+            to_update,
+            fields=list(update_fields | {"library"}),
+        )
```
(You may also want to wrap the whole sync in a transaction to keep old ↔ new state consistent.)
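The transaction point matters because the unlink pass mutates rows before the create/update passes run; a crash in between leaves objects detached. In Django this would be `transaction.atomic()`; the rollback behavior can be demonstrated standalone with stdlib `sqlite3` (a sketch, not the project's code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obj (urn TEXT PRIMARY KEY, library TEXT)")
conn.execute("INSERT INTO obj VALUES ('urn:threat:200', 'old-lib')")
conn.commit()

try:
    with conn:  # one transaction: both statements commit, or neither does
        # "unlink" pass
        conn.execute("UPDATE obj SET library = NULL WHERE library = 'old-lib'")
        # "create" pass fails on a primary-key clash
        conn.execute("INSERT INTO obj VALUES ('urn:threat:200', 'new-lib')")
except sqlite3.IntegrityError:
    pass

# The UPDATE was rolled back together with the failed INSERT
row = conn.execute("SELECT library FROM obj").fetchone()
print(row[0])  # -> old-lib
```

Without the `with conn:` block, the `UPDATE` would have persisted on its own and the row would be left orphaned with `library = NULL`.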
🤖 Prompt for AI Agents
In backend/core/models.py lines 437 to 483, the sync helper lacks a library
filter when building existing_obj_map, causing cross-library objects to be
incorrectly reassigned. Fix this by adding a filter for library=self.old_library
in the queryset used for existing_obj_map. Also, exclude the unique_field from
update_fields to avoid updating primary identifiers, and sanitize update_fields
to include only valid model fields to prevent runtime FieldErrors. Optionally,
wrap the entire sync operation in a transaction to maintain consistency.
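The field-sanitization step the prompt mentions can be sketched in plain Python (`sanitize_update_fields` is a hypothetical helper; in Django the concrete field names would come from `model_class._meta.get_fields()`):

```python
def sanitize_update_fields(requested, concrete_fields, immutable=("urn",)):
    """Keep only fields that exist on the model and are not identifiers.

    Intersecting with the model's real fields prevents bulk_update from
    raising FieldError on rogue YAML keys; dropping immutable identifiers
    avoids pointless updates and unique-constraint collisions.
    """
    return sorted((set(requested) & set(concrete_fields)) - set(immutable))


# A rogue YAML key and the immutable urn are both dropped
fields = sanitize_update_fields(
    ["name", "urn", "rogue_yaml_key", "provider"],
    ["urn", "name", "library", "provider", "locale"],
)
print(fields)  # -> ['name', 'provider']
```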
Summary by CodeRabbit
Refactor
Tests