Skip to content

Conversation

heliocastro
Copy link
Contributor

The Scanoss server provides endpoints to retrieve the related snippet; however, the API call depends on the file hash, which is not currently available.
Added the file hash and URL server origin in case of on-premises installation.

@heliocastro heliocastro self-assigned this Sep 5, 2025
@heliocastro heliocastro requested a review from a team as a code owner September 5, 2025 10:08
@heliocastro heliocastro force-pushed the fix/scanoss_source_url branch 2 times, most recently from 22d3bcd to 742e15a Compare September 5, 2025 10:14
Copy link

codecov bot commented Sep 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.39%. Comparing base (3bc08bb) to head (e1b1c6f).

Additional details and impacted files
@@            Coverage Diff            @@
##               main   #10811   +/-   ##
=========================================
  Coverage     57.39%   57.39%           
  Complexity     1674     1674           
=========================================
  Files           346      346           
  Lines         12764    12764           
  Branches       1210     1210           
=========================================
  Hits           7326     7326           
  Misses         4972     4972           
  Partials        466      466           
Flag Coverage Δ
funTest-docker 71.11% <ø> (-0.05%) ⬇️
funTest-non-docker 32.99% <ø> (+0.03%) ⬆️
test-ubuntu-24.04 41.92% <ø> (ø)
test-windows-2025 41.90% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sschuberth
Copy link
Member

@isasmendiagus, this might be interesting to you, and I'd appreciate your opinion here as well!

@heliocastro heliocastro force-pushed the fix/scanoss_source_url branch 4 times, most recently from 471c79b to ec115f8 Compare September 5, 2025 14:02
@isasmendiagus
Copy link
Contributor

@isasmendiagus, this might be interesting to you, and I'd appreciate your opinion here as well!

Hi @sschuberth, I've rebased this branch and split the original commit into two smaller, logical commits to align with the project's commit history standards. However, I'm unable to push the new commits to this branch due to a permissions error.

I've already sent an email to @heliocastro with the .patch files so he can push them, or he can grant me write access to the branch.

@sschuberth
Copy link
Member

However, I'm unable to push the new commits to this branch due to a permissions error.

Yeah, only maintainers can do this. But I believe the cariad-tech org has disabled the feature altogether.

Anyway, I wasn't actually expecting you to do any code changes 😅 But you could always just push to a new branch / PR.

@heliocastro heliocastro force-pushed the fix/scanoss_source_url branch 2 times, most recently from ae8cfc8 to f398281 Compare September 5, 2025 18:10
@heliocastro
Copy link
Contributor Author

@isasmendiagus Just merged your changes in current branch and added you as co-author
@sschuberth FYI

@@ -158,7 +158,11 @@ private fun createSnippetFindings(details: ScanFileDetails, localFilePath: Strin
val vcsInfo = VcsHost.parseUrl(url.takeUnless { it == "none" }.orEmpty())
val provenance = RepositoryProvenance(vcsInfo, ".")

val additionalData = purls.associateWith { "" }
val additionalData = mutableMapOf(
"related_purls" to purls.joinToString(", ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rather call this "additional_purls" as it's part of additionalData? Or maybe just "purls"? What's the exact relation of these to the primary purl?

Copy link
Contributor

@isasmendiagus isasmendiagus Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relationship between these PURLs is that they represent the same component but from different perspectives.
When the engine matches a component, it can find related PURLs because the same software often exists in multiple forms:

Primary PURL (pkg:github/scanoss/scanoss.js): This represents the source code repository where the component is developed and maintained
Related PURLs (pkg:npm/scanoss): This represents the same component as distributed through package managers

So regarding your naming question, "additional_purls" makes sense since these are additional package identifiers for the same component beyond the primary match. "purls" alone might be too generic.

The key thing is that these represent related/alternative package identifiers for the same underlying software component.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Given your NPM package example, that looks a bit like source / binary artifacts are given separate purls in addition to the generic purl for the "abstract" package. Something to keep in mind for the ORT data model maybe... essentially these properties would get their own purls:

/**
* The remote artifact where the binary package can be downloaded.
*/
val binaryArtifact: RemoteArtifact,
/**
* The remote artifact where the source package can be downloaded.
*/
val sourceArtifact: RemoteArtifact,
/**
* Original VCS-related information as defined in the package's metadata.
*/
val vcs: VcsInfo,

Copy link
Member

@sschuberth sschuberth Sep 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"additional_purls" makes sense since these are additional package identifiers for the same component beyond the primary match.

So, should we stick to "related_purls", or change to "additional_purls"? Either is fine with me, we just need to sort this out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with any as well. I just followed @isasmendiagus suggestion to change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push the change @heliocastro? The current code still seems to say "related_purls".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sschuberth This is the patch that @isasmendiagus sent to me. My original one was additional_data.

Copy link
Member

@sschuberth sschuberth Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But above @isasmendiagus wrote:

"additional_purls" makes sense

So I was assuming that he also prefers "additional_purls" now. Let's wait for @isasmendiagus to clarify.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just stick to "related_purls" for now to move forward with this.

I've created #10825 which gets rid of the merge commit in the PR again, addresses this pending comment, and does some other minor clean-ups.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sschuberth for creating PR #10825 and moving this forward.

I'm fine with either "related_purls" or "additional_purls", both accurately describe what these purls represent. As I mentioned before, "additional_purls" makes sense since they are additional package identifiers for the same component, and "related_purls" also works well to show the relationship.

@sschuberth
Copy link
Member

In the commit messages, there should be no blank line between these footer lines:

Co-authored-by: Agustin Isasmendi <agustin.isasmendi@scanoss.com>

Signed-off-by: Helio Chissini de Castro <helio.chissini.de.castro@cariad.technology>

@heliocastro heliocastro force-pushed the fix/scanoss_source_url branch from f398281 to 041b2f3 Compare September 6, 2025 08:42
@heliocastro
Copy link
Contributor Author

@isasmendiagus @sschuberth Changes requested done.
funtest is failing becuase ClearDefined

Replace individual PURL keys with empty values with a single
"related_purls" key containing comma-separated values. This is
a preparation for adding more structured metadata fields to
additionalData while improving JSON readability.

BREAKING CHANGE: additionalData format changed from individual
PURL keys to CSV format in "related_purls" key.

Co-authored-by: Agustin Isasmendi <agustin.isasmendi@scanoss.com>

Signed-off-by: Helio Chissini de Castro <helio.chissini.de.castro@cariad.technology>
SCANOSS server provides endpoints to retrieve the related snippet found,
but API call depends on the file hash, which is not provided on
current status.
Added file hash, file_url and source_hash into additionalData
instalation.

Co-authored-by: Agustin Isasmendi <agustin.isasmendi@scanoss.com>

Signed-off-by: Helio Chissini de Castro <helio.chissini.de.castro@cariad.technology>
@heliocastro heliocastro force-pushed the fix/scanoss_source_url branch from 041b2f3 to 01d4aea Compare September 8, 2025 06:47
@sschuberth
Copy link
Member

Superseded by #10825.

@sschuberth sschuberth closed this Sep 9, 2025
@heliocastro heliocastro deleted the fix/scanoss_source_url branch September 10, 2025 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants