
Conversation


nas-tabchiche (Collaborator) commented on Jul 9, 2025

  • detach reference controls that are no longer used from the library on library update
  • use UPSERT strategy
  • factorize sync methods and synchronize threats on library update
  • add reasonable test coverage for LibraryUpdater's sync methods

Summary by CodeRabbit

  • Refactor

    • Improved efficiency and maintainability of library synchronization by consolidating duplicate update logic into a single reusable method.
  • Tests

    • Added comprehensive tests to verify correct synchronization of threats and reference controls, including creation, updating, unlinking, and no-change scenarios.


coderabbitai bot commented Jul 9, 2025

Walkthrough

The update introduces a generic synchronization method to the LibraryUpdater class, consolidating logic for updating related objects such as threats and reference controls. The methods now return lists of created instances. A comprehensive test suite is added to verify synchronization scenarios, including creation, update, and unlinking of related entities.

Changes

  • backend/core/models.py: Refactored LibraryUpdater to use a generic _synchronize_related_objects method for updating threats and reference controls. Updated method signatures to return lists of created instances.
  • backend/core/tests/test_library_update.py: Added a new test suite for LibraryUpdater, covering creation, update, unlink, and no-change scenarios for related entities. Introduced relevant fixtures and test data.
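For orientation, the delegation described above might look roughly like the following sketch. The parameter names and exact signatures are assumptions inferred from the review snippets further down, not the PR's verbatim code:

def update_threats(self) -> list[Threat]:
    # Thin wrapper: hand the incoming threat dicts to the shared sync helper
    # and surface the newly created Threat rows to the caller.
    return self._synchronize_related_objects(
        model_class=Threat,
        incoming_data=self.threats,
        unique_field="urn",
    )

def update_reference_controls(self) -> list[ReferenceControl]:
    # Same pattern for reference controls.
    return self._synchronize_related_objects(
        model_class=ReferenceControl,
        incoming_data=self.reference_controls,
        unique_field="urn",
    )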

Poem

In the warren of code, we tidy our nest,
With bulk updates and tests, we ensure only the best.
Threats and controls now sync with one clever call,
While fixtures and checks keep us safe from a fall.
A rabbit’s delight—refactored and neat,
Our code hops ahead, nimble and fleet! 🐇✨


coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (4)
backend/core/tests/test_library_update.py (4)

12-22: Consider extracting common fixture attributes as constants.

The fixture creates a proper test library instance, but the hardcoded values could be extracted as module-level constants for better maintainability.

+# Test constants
+TEST_OLD_LIBRARY_URN = "urn:lib:old:1"
+TEST_LOCALE = "en"
+TEST_OLD_PROVIDER = "TestProvider"
+TEST_VERSION = 1

 @pytest.fixture
 def old_library():
     """Fixture to create a real LoadedLibrary instance in the test DB."""
     return LoadedLibrary.objects.create(
-        urn="urn:lib:old:1",
-        locale="en",
+        urn=TEST_OLD_LIBRARY_URN,
+        locale=TEST_LOCALE,
         default_locale=True,
-        provider="TestProvider",
-        version=1,
+        provider=TEST_OLD_PROVIDER,
+        version=TEST_VERSION,
     )

87-115: Effective threat synchronization test with duplicate logic.

The test correctly validates threat synchronization but contains significant code duplication with the reference control test.

Consider extracting a helper method to reduce duplication:

+    def _create_test_objects(self, model_class, old_library, test_data):
+        """Helper method to create test objects with common attributes."""
+        objects = []
+        for item in test_data:
+            obj = model_class.objects.create(
+                urn=item["urn"],
+                name=item["name"],
+                library=old_library,
+                locale="en",
+                default_locale=True,
+                provider="TestProvider",
+            )
+            objects.append(obj)
+        return objects

     def test_synchronize_threats(self, old_library, new_library):
         """Tests create, update, and unlink operations for Threats in one go."""
-        # Arrange
-        Threat.objects.create(
-            urn="urn:threat:200",
-            name="Threat Two Original",
-            library=old_library,
-            locale="en",
-            default_locale=True,
-            provider="TestProvider",
-        )
-        Threat.objects.create(
-            urn="urn:threat:300",
-            name="Threat Three To Unlink",
-            library=old_library,
-            locale="en",
-            default_locale=True,
-            provider="TestProvider",
-        )
+        test_data = [
+            {"urn": "urn:threat:200", "name": "Threat Two Original"},
+            {"urn": "urn:threat:300", "name": "Threat Three To Unlink"},
+        ]
+        self._create_test_objects(Threat, old_library, test_data)

127-156: Effective update-only test with more code duplication.

The test correctly validates the update scenario but continues the pattern of duplicated object creation code.

This test would also benefit from the helper method suggested earlier to reduce code duplication.


178-210: Thorough no-change scenario test with potential improvement.

The test correctly validates that no changes occur when the database state matches the incoming data. However, there's a minor inconsistency in the provider field that could be addressed.

Consider extracting the provider value to maintain consistency:

+        provider = new_library.content.get("provider", "NewProvider")
         ReferenceControl.objects.create(
             urn=rc1_data["urn"],
             name=rc1_data["name"],
             library=old_library,
             locale="en",
             default_locale=True,
-            provider="NewProvider",
+            provider=provider,
             is_published=True,
         )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a05e144 and 7a1f264.

📒 Files selected for processing (2)
  • backend/core/models.py (2 hunks)
  • backend/core/tests/test_library_update.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
backend/core/tests/test_library_update.py (1)
backend/core/models.py (7)
  • LibraryUpdater (361-891)
  • LoadedLibrary (894-1004)
  • StoredLibrary (235-358)
  • ReferenceControl (1045-1115)
  • Threat (1007-1042)
  • update_reference_controls (486-492)
  • update_threats (494-500)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Analyze (python)
  • GitHub Check: startup-docker-compose-test
  • GitHub Check: test (3.12)
  • GitHub Check: enterprise-startup-functional-test (3.12)
  • GitHub Check: enterprise-startup-docker-compose-test
  • GitHub Check: startup-functional-test (3.12)
  • GitHub Check: enterprise-functional-tests (3.12, chromium)
  • GitHub Check: functional-tests (3.12, chromium)
  • GitHub Check: build (3.12)
🔇 Additional comments (8)
backend/core/models.py (1)

7-7: typing.Self ties you to Python 3.11+.

Self was added in Python 3.11. If the project’s runtime or tooling (e.g. mypy CI images) is still on ≤ 3.10, this single import will explode at import time. Consider either

from typing import TYPE_CHECKING, Type, Union, List
if TYPE_CHECKING:           # or use typing_extensions
    from typing import Self

or simply switch to typing_extensions.Self until the minimum Python version is bumped.
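For the second option, a version-guarded fallback is a common form (a sketch, not the project's code):

import sys

# Use the stdlib Self on 3.11+, fall back to typing_extensions on older interpreters.
if sys.version_info >= (3, 11):
    from typing import Self
else:
    from typing_extensions import Self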

backend/core/tests/test_library_update.py (7)

1-10: LGTM! Clean and appropriate imports.

The imports are well-organized and include all necessary models for testing the LibraryUpdater functionality.


24-39: Good test data structure with clear naming.

The fixture provides well-structured test data that covers the synchronization scenarios effectively.


41-51: Fixture correctly creates StoredLibrary with proper dependencies.

The fixture properly sets up the StoredLibrary with content and dependencies, aligning with the LibraryUpdater's expectations.


53-56: Well-structured test class with clear documentation.

The test class is properly organized with a descriptive docstring explaining its purpose.


57-86: Comprehensive mixed-mode synchronization test with proper assertions.

This test effectively validates the core synchronization logic by testing create, update, and unlink operations simultaneously. The assertions correctly verify all expected behaviors.


116-126: Clear test for create-only scenario with good assertions.

The test properly validates the creation of new objects when the database is empty, with appropriate assertions on both count and return values.


157-177: Good edge case test for unlinking all objects.

The test correctly handles the scenario where all existing objects should be unlinked when incoming data is empty. The approach of modifying the library content and saving it is appropriate.

Comment on lines +437 to +483
Added (the new generic sync helper):

if not incoming_data:
    model_class.objects.filter(library=self.old_library).update(library=None)
    return []

incoming_ids = {item[unique_field].lower() for item in incoming_data}

model_class.objects.filter(library=self.old_library).exclude(
    **{f"{unique_field}__in": incoming_ids}
).update(library=None)

existing_obj_map = {
    getattr(obj, unique_field): obj
    for obj in model_class.objects.filter(
        **{f"{unique_field}__in": incoming_ids}
    )
}

to_create, to_update = [], []
update_fields = set()

for item_data in incoming_data:
    normalized_id = item_data[unique_field].lower()
    update_fields.update(item_data.keys())

    if normalized_id in existing_obj_map:
        instance = existing_obj_map[normalized_id]
        for key, value in item_data.items():
            setattr(instance, key, value)
        instance.library = self.old_library
        to_update.append(instance)
    else:
        create_data = {
            **self.referential_object_dict,
            **self.i18n_object_dict,
            **item_data,
            unique_field: normalized_id,
            "library": self.old_library,
        }
        to_create.append(model_class(**create_data))

created_objects = (
    list(model_class.objects.bulk_create(to_create)) if to_create else []
)
if to_update:
    update_fields.add("library")
    model_class.objects.bulk_update(to_update, list(update_fields))

Removed (the old per-item update_or_create logic, interleaved in the same diff hunk):

def update_reference_controls(self):
    for reference_control in self.reference_controls:
        normalized_urn = reference_control["urn"].lower()
        ReferenceControl.objects.update_or_create(
            urn=normalized_urn,
            defaults=reference_control,
            create_defaults={
                **self.referential_object_dict,
                **self.i18n_object_dict,
                **reference_control,
                "library": self.old_library,
            },
        )


⚠️ Potential issue

Generic sync helper can silently hijack rows & violate uniqueness.

Key issues spotted:

  1. existing_obj_map is built with

    model_class.objects.filter(**{f"{unique_field}__in": incoming_ids})

    No library filter ⇒ objects that belong to another library are pulled in and later

    instance.library = self.old_library

    re-assigned.
    Result: cross-library objects get “stolen”, breaking referential integrity.

  2. update_fields contains every key plus library plus unique_field (e.g. urn).
    Updating primary identifiers is useless and risks hitting unique constraints; keep them immutable.

  3. Union of arbitrary item_data.keys() goes straight into bulk_update.
    A rogue field in YAML ⇒ runtime FieldError.

Suggested minimal fix:

@@
-existing_obj_map = {
-    getattr(obj, unique_field): obj
-    for obj in model_class.objects.filter(
-        **{f"{unique_field}__in": incoming_ids}
-    )
-}
+existing_obj_map = {
+    getattr(obj, unique_field): obj
+    for obj in model_class.objects.filter(
+        **{
+            f"{unique_field}__in": incoming_ids,
+            "library": self.old_library,   # prevent cross-library hijack
+        }
+    )
+}
@@
-        update_fields.update(item_data.keys())
+        update_fields.update(
+            k for k in item_data.keys() if k != unique_field  # keep id immutable
+        )
@@
-    if to_update:
-        update_fields.add("library")
-        model_class.objects.bulk_update(to_update, list(update_fields))
+    if to_update:
+        model_class.objects.bulk_update(
+            to_update,
+            fields=list(update_fields | {"library"}),
+        )

(You may also want to wrap the whole sync in a transaction to keep old ↔ new state consistent.)
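A rough sketch of that transactional wrapping, assuming the helper keeps the signature seen in the snippet above (the class shell is illustrative only, not the PR's verbatim code):

from django.db import transaction

class LibraryUpdater:
    # Illustrative shell; the real class lives in backend/core/models.py.
    @transaction.atomic
    def _synchronize_related_objects(self, model_class, incoming_data, unique_field):
        # Either the whole detach + bulk upsert pass commits, or none of it does,
        # so a failure cannot leave the library half-synchronized.
        ...  # detach stale rows, then bulk_create / bulk_update as shown above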

🤖 Prompt for AI Agents
In backend/core/models.py lines 437 to 483, the sync helper lacks a library
filter when building existing_obj_map, causing cross-library objects to be
incorrectly reassigned. Fix this by adding a filter for library=self.old_library
in the queryset used for existing_obj_map. Also, exclude the unique_field from
update_fields to avoid updating primary identifiers, and sanitize update_fields
to include only valid model fields to prevent runtime FieldErrors. Optionally,
wrap the entire sync operation in a transaction to maintain consistency.
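One possible shape for the field sanitization mentioned above, as a hedged sketch (the helper name sanitize_update_fields is hypothetical):

def sanitize_update_fields(model_class, update_fields, unique_field):
    """Drop names that are not concrete model fields and keep the unique
    identifier immutable, so bulk_update never hits a FieldError or rewrites
    the URN."""
    allowed = {field.name for field in model_class._meta.concrete_fields}
    return [name for name in update_fields if name in allowed and name != unique_field]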

nas-tabchiche changed the title from "CA 1034 when updating a library with a framework containing reference controls these are not updated correctly" to "fix: library objects synchronization on library update" on Jul 9, 2025
nas-tabchiche marked this pull request as draft on July 28, 2025 at 08:48