- In the weather forecasting example you chose `sum(abs2)` as the loss function, but in Sebastian Callh's personal blog he uses `Flux.mse`. The resulting losses differ by orders of magnitude, and the forecasts are also less satisfactory than in the original. Is this because of the different loss functions?
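To quantify part of that gap: `sum(abs2, ŷ .- y)` is the sum of squared errors, while `Flux.mse` averages them, so the two differ by a factor of `length(y)` even for identical predictions. A minimal sketch (the arrays are made up purely for illustration):

```julia
using Statistics

# Hypothetical predictions and targets, just to illustrate the scale difference.
ŷ = [1.0 2.0; 3.0 4.0]
y = [1.5 2.5; 2.0 5.0]

loss_sse = sum(abs2, ŷ .- y)    # sum of squared errors, as in the tutorial
loss_mse = mean(abs2, ŷ .- y)   # what Flux.mse computes

# loss_sse == loss_mse * length(y): with long time series length(y) is
# large, which alone accounts for an orders-of-magnitude gap in the
# reported loss values, independent of forecast quality.
```

So the raw loss numbers are not directly comparable between the two versions, though the scale factor by itself does not explain a difference in forecast quality.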
- The callback function always returns `false`. Can we set a different criterion for each feature, so that training terminates once the loss is small enough?
- In the original example all raw data was pre-processed as a whole, whereas here you split it into train and test sets and then standardized each separately. This produces slightly different training data despite starting from the same dataset. How much impact does this have on training and on the final test outcome?
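For what it's worth, the usual compromise between the two approaches is to fit the standardization statistics on the training split only and reuse them on the test split: no leakage, but both splits live on the same scale. A sketch with made-up data (the `4 × 100` shape and the 80/20 split are arbitrary):

```julia
using Statistics

# Hypothetical data: rows are features, columns are time steps.
data = randn(4, 100)
train, test = data[:, 1:80], data[:, 81:end]

# Compute mean/std on the training split only, then reuse them on the
# test split, so both splits are standardized consistently.
μ = mean(train, dims = 2)
σ = std(train, dims = 2)
train_std = (train .- μ) ./ σ
test_std  = (test  .- μ) ./ σ
```

With this scheme the test data never influences the preprocessing, yet the model sees train and test inputs on one consistent scale.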