To-Do List (someday in this lifetime...)

Papers to read:

Software engineering survey papers:

LLM-Based Multi-Agent Systems for Software Engineering: Literature Review, Vision and the Road Ahead

Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead

A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

Deep Learning Library Testing: Definition, Methods and Challenges

Towards an Understanding of Context Utilization in Code Intelligence

Papers for learning RL and other fundamentals:

https://arxiv.org/pdf/2503.02951

https://arxiv.org/pdf/2508.05170

https://arxiv.org/pdf/2509.17325

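Hands-on work to complete:

Use the RL framework verl (https://github.com/volcengine/verl) to run two algorithms separately: GRPO and DAPO.

First, reproduce the paper's results on its own dataset: https://huggingface.co/datasets/BytedTsinghua-SIA/DAPO-Math-17k

Then try the same setup on a new dataset: https://huggingface.co/datasets/zwhe99/DeepMath-103K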
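
Before touching verl, a note to self on what the two algorithms roughly do with rewards. This is a minimal sketch in plain Python, not verl's API: the function names, the toy reward values, and the choice of population standard deviation with a small `eps` are all my own assumptions for illustration. GRPO scores each sampled response relative to the other responses for the same prompt, and one of DAPO's additions (dynamic sampling) drops groups where every response received the same reward, since they carry no gradient signal.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: normalize each reward against its own group.

    Assumption: population std plus a small eps to avoid division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def has_learning_signal(rewards):
    """DAPO-style dynamic sampling check: keep only groups with mixed rewards."""
    return len(set(rewards)) > 1


# Hypothetical toy data: four sampled answers per prompt, reward 1.0 if correct.
groups = {
    "prompt_a": [1.0, 0.0, 0.0, 1.0],  # mixed outcomes, useful for training
    "prompt_b": [0.0, 0.0, 0.0, 0.0],  # all wrong, every advantage is zero
}

for name, rewards in groups.items():
    if has_learning_signal(rewards):
        print(name, group_relative_advantages(rewards))
    else:
        print(name, "skipped (all rewards identical)")
```

The real training runs should go through verl's own configs and scripts; this is only meant to make the reward handling concrete before reading the code.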


Melon_Musk

