@tAnGjIa520 tAnGjIa520 commented Aug 12, 2025

atari_unizero_multitask_segment_ddp_config_debug_naive.py: single-task training from scratch
atari_unizero_multitask_segment_ddp_config_finetune_SpaceInvaders_full.py: full fine-tuning
atari_unizero_multitask_segment_ddp_config_finetune_SpaceInvaders_head_back_encoder_lora.py: fine-tune head + encoder (LoRA) + backbone (LoRA)
atari_unizero_multitask_segment_ddp_config_finetune_SpaceInvaders_head_back_lora.py: fine-tune head + backbone (LoRA)
atari_unizero_multitask_segment_ddp_config_finetune_SpaceInvaders_head.py: fine-tune head only
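
For quick comparison, the fine-tuning variants above can be summarized as a per-component update mode. This is only a sketch: the short variant labels, `VARIANTS`, and `update_mode` are hypothetical names that do not appear in the configs. It assumes the model splits into encoder, backbone, and head, that "full" updates all three, and that any component a variant does not mention stays frozen.

```python
# Hypothetical summary of the fine-tuning variants listed above.
# "train"  = all weights of the component are updated
# "lora"   = only LoRA adapter weights are updated
# "frozen" = not updated (ASSUMED for components a variant does not mention)
COMPONENTS = ("encoder", "backbone", "head")

VARIANTS = {
    # ASSUMPTION: "full" means every component is trained.
    "full": {"encoder": "train", "backbone": "train", "head": "train"},
    "head_back_encoder_lora": {"encoder": "lora", "backbone": "lora", "head": "train"},
    "head_back_lora": {"backbone": "lora", "head": "train"},
    "head": {"head": "train"},
}


def update_mode(variant: str, component: str) -> str:
    """Return how `component` is updated under `variant`."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return VARIANTS[variant].get(component, "frozen")


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, {c: update_mode(name, c) for c in COMPONENTS})
```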

# finetune_components = ['transformer'] # load-enc-trans_finetune-trans-head
finetune_components = [] # load-enc-trans_finetune-encoder-head

for seed in [3]:
Collaborator

Set the seed to 0 for both the ScaleZero run that loads the ckpt for full fine-tuning and the ScaleZero run trained from scratch.

Author

Fixed.

n_episode = 8
evaluator_env_num = 3
# num_simulations = 50
num_simulations = 25
Collaborator

@puyuan1996 puyuan1996 Aug 13, 2025


This needs to be changed to collect_num_simulations = 25 and eval_num_simulations = 50. If everything is set to 25, eval performance will drop.
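
A minimal sketch of the split suggested here, keeping a smaller search budget for collection and a larger one for evaluation. The names `collect_num_simulations` and `eval_num_simulations` come from the comment above; the `SearchBudget` class and its `for_phase` helper are hypothetical and only illustrate the idea, not how the actual config consumes these values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchBudget:
    """Hypothetical container for the per-phase MCTS simulation counts."""
    collect_num_simulations: int = 25  # cheaper search while collecting data
    eval_num_simulations: int = 50     # larger budget so eval scores do not drop

    def for_phase(self, phase: str) -> int:
        """Return the simulation count for 'collect' or 'eval'."""
        if phase == "collect":
            return self.collect_num_simulations
        if phase == "eval":
            return self.eval_num_simulations
        raise ValueError(f"unknown phase: {phase!r}")


if __name__ == "__main__":
    budget = SearchBudget()
    print("collect:", budget.for_phase("collect"))
    print("eval:", budget.for_phase("eval"))
```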

Author

Fixed.

num_segments = collector_env_num
n_episode = 8
evaluator_env_num = 3
num_simulations = 25
Collaborator

This also needs to be changed in the same way as above.

Author

Fixed.
