Replies: 1 comment
Hello,

Thank you very much for your insightful observations and valuable suggestions. Your analysis is spot on.

Yes, in theory, the MuZero and Stochastic MuZero algorithms share a great deal of common code. From a code-reuse perspective, using a configuration option (like However, our current decision to keep them separate is primarily driven by considerations for code readability and ease of use.

If we were to merge them, we would need to introduce numerous conditional statements throughout several critical parts of the codebase (e.g., in the policy, the model, the search tree, buffer, etc.). This would significantly increase the complexity and reduce the readability of the code. For a user who wants to study or use one specific algorithm, the cost of understanding and debugging the code would become higher.

Therefore, to maintain the clarity, independence, and maintainability of each algorithm's implementation, our current recommendation is to keep them in separate files for the time being.

Thank you again for your deep thinking and proactive willingness to contribute! Your suggestions are incredibly valuable for the improvement of the project. We look forward to more discussions with you in the future.
-
The only theoretical difference between MuZero and Stochastic MuZero lies in the afterstates and their estimation. Therefore, 99% of the code should be the same between the two.

However, in their implementation in LightZero's `policy` folder, differences between `muzero.py` and `stochastic_muzero.py` have accumulated, mostly because MuZero was regularly updated, unlike the Stochastic version. The Stochastic version was designed using inheritance, but most of the code redefined in the subclass is still the same as what MuZero's code was at the time Stochastic MuZero was created.

To both solve this issue and prevent it from happening again, wouldn't it be better to unify classical and Stochastic MuZero in a single `muzero.py` file, with some kind of `stochastic_variant=True` property set to activate the Stochastic variant if the user wants it?

If this convinces you, I can try to take care of this merge. This would mean merging `stochastic_muzero.py` into `muzero.py`; merging `stochastic_muzero_model.py` into `muzero_model.py`; and merging `game_buffer_stochastic_muzero.py` into `game_buffer_muzero.py`.

I don't want to start working on this before having your blessing since :

I look forward to hearing from you on the matter! If any details or clarifications are needed, feel free to reach out to me.
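
To make the proposal concrete, here is a minimal toy sketch of what a single class with a `stochastic_variant` flag could look like. It is not LightZero code: `ToyMuZeroConfig`, `ToyMuZeroModel`, `chance_dynamics`, and the integer "states" are all hypothetical names and simplifications chosen for illustration, and the flag itself is only the option suggested in this post, not an existing setting.

```python
from dataclasses import dataclass


@dataclass
class ToyMuZeroConfig:
    # Hypothetical flag from the proposal; not an existing LightZero option.
    stochastic_variant: bool = False


class ToyMuZeroModel:
    """Toy stand-in for a learned model; states are plain integers."""

    def __init__(self, cfg: ToyMuZeroConfig) -> None:
        self.cfg = cfg

    def dynamics(self, state: int, action: int) -> int:
        # Transition shared by both variants. In the stochastic variant
        # its result is interpreted as the afterstate.
        return state + action

    def chance_dynamics(self, afterstate: int, chance: int) -> int:
        # Stochastic-only step: apply a chance outcome to the afterstate.
        return afterstate + chance

    def unroll(self, state, actions, chances=None):
        """Apply each action in turn and return the list of visited states."""
        if self.cfg.stochastic_variant and chances is None:
            raise ValueError("stochastic_variant=True requires chance outcomes")
        visited = []
        for step, action in enumerate(actions):
            state = self.dynamics(state, action)
            if self.cfg.stochastic_variant:
                # The only branch that differs between the two variants.
                state = self.chance_dynamics(state, chances[step])
            visited.append(state)
        return visited


classic = ToyMuZeroModel(ToyMuZeroConfig())
stochastic = ToyMuZeroModel(ToyMuZeroConfig(stochastic_variant=True))
print(classic.unroll(0, [1, 2]))
print(stochastic.unroll(0, [1, 2], chances=[1, 0]))
```

In this toy version the flag is checked in one method only. In a real merge the same kind of check would presumably also be needed in the policy, the search tree, and the game buffer, so the sketch shows the shape of the idea rather than its full cost.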