This article evaluates the ability of large language models (LLMs), such as GPT-4 and GPT-3.5 Turbo, to support learning, particularly in solving mathematical problems. A comparative experiment was conducted with models both with and without fine-tuning, in which GPT-4 achieved the best performance, solving 94% of the questions. The results reveal persistent difficulties with algebraic expressions and fractions, underscoring the importance of task-specific fine-tuning.