
Conversation


@cisprague cisprague commented May 15, 2025

This pull request improves the handling of Gaussians in AutoEmulate and standardizes the outputs that active learners can expect from emulators.

Implemented covariance structures:

  • Full covariance
  • Block-diagonal
  • Diagonal
  • Separable
  • Dirac
  • Empirical
  • Ensemble
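For intuition, the structures above correspond to familiar covariance shapes. A minimal sketch in plain torch (not the PR's classes; the shapes here are illustrative):

```python
import torch

torch.manual_seed(0)
k, d = 1000, 4                      # k samples in d dimensions
samples = torch.randn(k, d)

# Full covariance: a dense d x d empirical covariance matrix
centered = samples - samples.mean(dim=0)
full = centered.T @ centered / (k - 1)

# Diagonal: keep only per-dimension variances, discard correlations
diag = torch.diag(samples.var(dim=0, unbiased=True))

# Block-diagonal: independent 2x2 blocks along the diagonal
block = torch.block_diag(full[:2, :2], full[2:, 2:])
```

Presumably Dirac corresponds to a zero covariance (a point mass) and Separable to a Kronecker-factored covariance, but the sketch above only covers the dense, diagonal, and block-diagonal cases.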

@cisprague cisprague linked an issue May 15, 2025 that may be closed by this pull request

@cisprague
Collaborator Author

We need to discuss how to handle empirically constructed Gaussians. Currently we have explicit specialised classes for them, e.g. Empirical_Block_Diagonal, but we could instead add a from_samples classmethod to each class, e.g. Block_Diagonal.from_samples. Either way, there needs to be a way to convert the classes to Dense so that ensembles can be constructed with Ensemble; the to_dense method accomplishes this.

Lastly, regarding compatibility with GPyTorch: it internally uses LinearOperator to handle different kernel specialisations, which looks good but might be more than we need.
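For concreteness, the from_samples alternative could look something like this (class names mirror the PR; the method bodies are illustrative stand-ins, not the actual implementation):

```python
import torch

class Dense:
    """Stand-in for the PR's dense Gaussian: a mean and a full covariance."""
    def __init__(self, mean, cov):
        self.mean, self.cov = mean, cov

class Diagonal:
    def __init__(self, mean, var):
        self.mean, self.var = mean, var

    # Replaces a separate Empirical_Diagonal class
    @classmethod
    def from_samples(cls, samples):
        return cls(samples.mean(dim=0), samples.var(dim=0, unbiased=True))

    # Conversion needed so Ensemble can aggregate mixed structures
    def to_dense(self):
        return Dense(self.mean, torch.diag_embed(self.var))

samples = torch.rand(100, 3)
dist = Diagonal.from_samples(samples)
dense = dist.to_dense()
```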

@codecov-commenter

codecov-commenter commented May 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.40%. Comparing base (365fc39) to head (14aafeb).
Report is 187 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #471       +/-   ##
===========================================
- Coverage   90.53%   80.40%   -10.14%     
===========================================
  Files          96      104        +8     
  Lines        5983     7057     +1074     
===========================================
+ Hits         5417     5674      +257     
- Misses        566     1383      +817     


Contributor

github-actions bot commented May 17, 2025

Coverage report

This PR does not seem to contain any modification to coverable code.

@cisprague
Collaborator Author

cisprague commented May 19, 2025


We now have .to_dense() and .from_dense() for most of the structured classes. These methods allow for the aggregation of structured distributions into an Ensemble class.

import torch
# Empirical, Diagonal, and Ensemble are the covariance classes added in this PR

# Anisotropic empirical distribution from k samples at n sampling locations, each with d dimensions
k, n, d = 1000, 50, 3
samples = torch.rand(k, n, d)
dist0 = Empirical(samples)

# Isotropic empirical distribution (off-diagonal elements set to zero)
samples = torch.rand(k, n, d)
dist1 = Diagonal.from_dense(Empirical(samples))

# Round trip, just to demonstrate the .to_dense() method
dist1 = Diagonal.from_dense(dist1.to_dense())

# Combine them into an ensemble
dist2 = Ensemble([dist0, dist1])
print(dist2.logdet(), dist2.trace(), dist2.max_eig())
# Output: tensor(-375.8140) tensor(12.4797) tensor(0.1201)

@radka-j
Member

radka-j commented May 20, 2025

This is looking great! I just want to confirm we all agree with where this code is meant to sit and how it is meant to be used. My understanding is that it will be within the active learning module and these classes are used to reshape the covariance matrix of the torch distribution returned by AutoEmulate (GaussianLike) to enable efficient metrics computation. Is this correct?
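To check my understanding, the kind of reshaping I have in mind looks roughly like this (plain torch; the dense covariance below stands in for whatever the GaussianLike distribution carries):

```python
import torch
from torch.distributions import MultivariateNormal

torch.manual_seed(0)
n = 4
A = torch.randn(n, n)
cov = A @ A.T + 1e-3 * torch.eye(n)            # stand-in predictive covariance
mvn = MultivariateNormal(torch.zeros(n), cov)  # what the emulator returns

# Keeping only the diagonal turns trace/log-det into O(n) reductions
# instead of O(n^3) dense factorisations
var = mvn.covariance_matrix.diagonal()
trace = var.sum()
logdet_diag = var.log().sum()
```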

@radka-j
Member

radka-j commented May 20, 2025

It would also help me if we had some examples to illustrate when/how the different covariance structures emerge.
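For example, the cases I keep running into (a sketch with plain torch, not the PR's classes):

```python
import torch
from torch.distributions import MultivariateNormal

torch.manual_seed(0)
n = 5
x = torch.linspace(0, 1, n)

# Full covariance: a GP posterior correlates all n sampling locations
# (RBF kernel with a small jitter for numerical stability)
cov_full = torch.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2)
cov_full = cov_full + 1e-4 * torch.eye(n)
gp_like = MultivariateNormal(torch.zeros(n), cov_full)

# Block-diagonal: two outputs that are independent of each other,
# but each correlated across the n locations
cov_block = torch.block_diag(cov_full, cov_full)

# Diagonal: fully independent per-point predictive variances
cov_diag = torch.diag(cov_full.diagonal())
```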


Successfully merging this pull request may close these issues.

Improve handling of Gaussians
3 participants