Labels: question (further information is requested)
Description
I am new to neural differential equations and have been going through some tutorials to better understand them. I noticed that Python's Diffrax tutorial uses a batching scheme for training, where every gradient step uses 32 trajectories. This runs surprisingly fast, but when I tried to implement the same batched training in Julia, either via Optimization.jl (setting `maxiters=1` in `solve`) or via Lux.Training directly, it takes forever.

Am I totally misunderstanding something from the tutorial, or is this simply not a feature that any of the Julia packages that use DiffEqFlux have been optimised for? Thank you in advance!
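To make the comparison concrete, here is roughly the Lux.Training version of what I am attempting. This is a stripped-down sketch rather than my actual script: the model, the toy 32-trajectory batches, the `batch_loss` function, and the adjoint settings are placeholders I put in just to show the structure of one gradient step per mini-batch.

```julia
# Rough sketch of the mini-batched training loop I am attempting.
# Toy model and data; all names and sizes below are placeholders.
using ADTypes, ComponentArrays, Lux, Optimisers, OrdinaryDiffEq, Random, SciMLSensitivity, Zygote

rng = Random.default_rng()

# Small MLP defining the vector field of the neural ODE (2-dimensional toy state).
model = Chain(Dense(2 => 16, tanh), Dense(16 => 2))
ps, st = Lux.setup(rng, model)
ps = ComponentArray(ps)

ts = range(0.0f0, 1.0f0; length = 10)  # times at which trajectories are observed

# Loss over one mini-batch: u0_batch is 2×32 (32 initial conditions),
# target_batch is 2×32×10 (the observed trajectories).
function batch_loss(model, ps, st, (u0_batch, target_batch))
    dudt(u, p, t) = first(model(u, p, st))
    prob = ODEProblem(dudt, u0_batch, (ts[begin], ts[end]), ps)
    sol = solve(prob, Tsit5(); saveat = ts,
                sensealg = InterpolatingAdjoint(autojacvec = ZygoteVJP()))
    return sum(abs2, Array(sol) .- target_batch), st, (;)
end

# Fake data: 10 mini-batches of 32 trajectories each.
batches = [(randn(rng, Float32, 2, 32), randn(rng, Float32, 2, 32, length(ts)))
           for _ in 1:10]

function train(tstate, batches; epochs = 100)
    for _ in 1:epochs, batch in batches
        # One gradient step per mini-batch of 32 trajectories.
        _, loss, _, tstate = Lux.Training.single_train_step!(
            AutoZygote(), batch_loss, batch, tstate)
    end
    return tstate
end

tstate = Lux.Training.TrainState(model, ps, st, Adam(0.001f0))
tstate = train(tstate, batches)
```

And the Optimization.jl variant I mentioned, where the only way I saw to take a single gradient step per batch was to rebuild the problem and call `solve` with `maxiters=1` each time (again just a sketch, reusing the placeholder `batch_loss` from above):

```julia
using Optimization, OptimizationOptimisers

function train_optimization(ps, batches)
    for batch in batches
        optf = OptimizationFunction((p, _) -> first(batch_loss(model, p, st, batch)),
                                    AutoZygote())
        prob = OptimizationProblem(optf, ps)
        ps = solve(prob, Adam(0.001f0); maxiters = 1).u  # one gradient step per batch
    end
    return ps
end
```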