• Optimize model choice: not every job needs a 70B model.
• The summary currently runs on the whole chat:
• Make the summary from the last 5 iterations only, not the full chat history?
• Clear the chat after the summary has been made?
• Save a grade for each answer, and pass those grades to the summary instead of the raw chat history:
[0.7, 0.3, 1] = question 3 answered correctly, question 2 was not
• ChainChat Supervisor doesn't supervise. Even when the supervision is right, re_llama doesn't correct its answer. ChainChat needs to move to another model; LLaMa3.3 (or llama in general) can't be fully trusted.
• Add evaluation aspects. It's important to track not just right/wrong, but also:
- How long did it take? How many hints? etc.
- Maybe a log list for each day: how many questions, hints, time?
• Add Topics and questions
?• Add a "ToGenerate" field to the database JSON, so questions like lim x→0 sin(x)/x (= 1) will not be changed.
• Make everything in Hebrew
?• Use JSON for the summary and the data extraction, so the JSON will have summary and user_data
• Level 3 gets only the answer from DeepSeekR1? It does not always provide a full answer, and llama cannot always reach R1's answer by itself.
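A minimal sketch of the grade-based history, the last-5 summary window, and the per-day log ideas from the list above. Every name here (`AnswerRecord`, `summary_window`, `format_for_summary`, `daily_totals`) and the 0.5 pass mark are assumptions made for illustration; none of them exist in the project yet.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnswerRecord:
    """One answered question (hypothetical structure)."""
    question_id: int
    grade: float              # 0.0 (wrong) .. 1.0 (correct)
    hints_used: int = 0
    seconds_spent: float = 0.0


def summary_window(records: List[AnswerRecord], last_n: int = 5) -> List[AnswerRecord]:
    """Keep only the most recent `last_n` records for the summary."""
    return records[-last_n:]


def format_for_summary(records: List[AnswerRecord], pass_mark: float = 0.5) -> str:
    """Render records as compact text for the summary model.

    `pass_mark` is an assumed threshold, not a value taken from the notes.
    """
    lines = []
    for r in records:
        status = "correct" if r.grade >= pass_mark else "incorrect"
        lines.append(
            f"Q{r.question_id}: grade={r.grade:.1f} ({status}), "
            f"hints={r.hints_used}, time={r.seconds_spent:.0f}s"
        )
    return "\n".join(lines)


def daily_totals(records: List[AnswerRecord]) -> Dict[str, float]:
    """Aggregate one day's records: number of questions, hints, and time."""
    return {
        "questions": len(records),
        "hints": sum(r.hints_used for r in records),
        "seconds": sum(r.seconds_spent for r in records),
    }


if __name__ == "__main__":
    grades = [0.7, 0.3, 1.0]
    history = [AnswerRecord(i + 1, g) for i, g in enumerate(grades)]
    print(format_for_summary(summary_window(history)))
```

Passing this compact text to the summary model instead of the full chat would keep the prompt short, at the cost of losing the wording of the user's answers.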
ChainChat: (WolframAlpha addition)
- When a RAG question is generated, wolfram_runner creates a Wolfram answer and stores it in a file.
- It is stored in a file because the code runs in an iteration for each input. ("None" is stored if no answer could be created.)
- If an answer was created, it is used as the supervisor for llama (llama + wolfram).
- Otherwise, "None" is passed to the ChainChat function and DeepSeekR1 runs instead (llama + DeepSeekR1).
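The fallback described above could look roughly like this. It is only a sketch: `parse_stored_answer`, `read_wolfram_answer`, `pick_supervisor`, and the `run_deepseek` callback are hypothetical names, and the real wolfram_runner and ChainChat interfaces may differ.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def parse_stored_answer(text: str) -> Optional[str]:
    """Treat an empty file or the literal "None" as "no Wolfram answer"."""
    text = text.strip()
    return None if text in ("", "None") else text


def read_wolfram_answer(path: Path) -> Optional[str]:
    """Read the answer stored by wolfram_runner, or None if unavailable."""
    if not path.exists():
        return None
    return parse_stored_answer(path.read_text(encoding="utf-8"))


def pick_supervisor(
    wolfram_answer: Optional[str],
    run_deepseek: Callable[[], str],
) -> Tuple[str, str]:
    """Return (supervisor_name, reference_answer) for the llama step.

    DeepSeekR1 is only called when no Wolfram answer exists.
    """
    if wolfram_answer is not None:
        return ("wolfram", wolfram_answer)
    return ("deepseek_r1", run_deepseek())


if __name__ == "__main__":
    # Assumed file name, for illustration only.
    answer = read_wolfram_answer(Path("wolfram_answer.txt"))
    name, reference = pick_supervisor(answer, lambda: "R1 answer placeholder")
    print(name, reference)
```

Passing DeepSeekR1 as a callback means the slower model is only invoked when the Wolfram answer is missing.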
Add a paragraph about the user's personality (confident, anxious, disrespectful...). This gives the evaluator a way to choose the form of its feedback (a kind of first impression): anxious + good work = give encouraging feedback; confident + bad work = give a wake-up call.
Reference for Idea: The user's behavior in this session, such as claiming they did not need hints and providing a final answer without using the L'Hopital rule when it was not applicable, may indicate overconfidence or a lack of understanding of the underlying concepts. This behavior will be monitored in future sessions to ensure that the user is making progress and not developing bad habits.
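The personality idea could be prototyped as a small lookup from (personality, work quality) to a feedback tone. This is a sketch only: the function name `feedback_tone` and the "neutral" default for unlisted combinations are assumptions, and only the two pairings named in the notes are filled in.

```python
from typing import Dict, Tuple

# Only the two combinations named in the notes; others are left undecided.
TONE_RULES: Dict[Tuple[str, bool], str] = {
    ("anxious", True): "encouraging",
    ("confident", False): "wake-up call",
}


def feedback_tone(personality: str, work_is_good: bool) -> str:
    """Pick a feedback tone; falls back to an assumed "neutral" default."""
    return TONE_RULES.get((personality, work_is_good), "neutral")


if __name__ == "__main__":
    print(feedback_tone("anxious", True))
    print(feedback_tone("confident", False))
```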