tag
#RL
1 post
ReFT: The Future of LLM Fine-Tuning
A new method called Reinforced Fine-Tuning (ReFT) uses reinforcement learning to help large language models solve complex math problems more effectively than traditional supervised fine-tuning.