Pr/multilingual#121
katstankiewicz
left a comment
can you also add ensure_ascii=False to AuditLog save()
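For context, a minimal sketch of what the requested flag changes. The `save_audit_log` helper below is a hypothetical stand-in for `AuditLog.save()`, whose real signature is not shown in this thread; only the standard-library `json` behaviour is certain.

```python
import json

record = {"lang": "es", "utterance": "¡Hola! ¿En qué puedo ayudarte hoy?"}

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the characters are written as-is, so the saved
# log stays readable for non-English conversations.
readable = json.dumps(record, ensure_ascii=False)


def save_audit_log(record: dict, path: str) -> None:
    """Hypothetical stand-in for AuditLog.save(); the real method may differ."""
    # An explicit UTF-8 encoding matters once ensure_ascii is off, because the
    # platform default encoding may not be able to represent every character.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, ensure_ascii=False, indent=2)
```

Both forms decode to the same object; the difference is only in how readable the file on disk is.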
…behavioral_fidelity judge prompt
de: Hallo! Wie kann ich dir heute helfen?
en: Hello! How can I help you today?
es: ¡Hola! ¿En qué puedo ayudarte hoy?
fr: Bonjour ! Comment puis-je vous aider aujourd'hui ?
Easy to add, will do. The existing languages are just ones I saved from testing, because why waste them?
"downtown",
"engineering center",
Since we have these aliases in English, I think we should also add translations of these two in the other languages.
Also, can you sort the translations by language, please?
What do you mean by the first part? There are translations in other languages?
"Main Garage"
"a garage",
"main garage",
"garage a",
Wouldn't that match with the name already?
Not sure what you mean
"name_aliases": [
"Downtown",
"Engineering Center"
"downtown",
Also, since we have all these aliases in each file, shouldn't we put them all in a separate common file?
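One way the common-file idea could look, as a rough sketch only. The layout (canonical name, then language code, then alias list), the sample translations, and the `aliases_for` helper are all hypothetical and not taken from this PR.

```python
import json

# Hypothetical contents of a shared aliases file, keyed by canonical name and
# then by language code. The translations are illustrative placeholders.
COMMON_ALIASES_JSON = """
{
  "Downtown": {
    "en": ["downtown"],
    "es": ["centro"]
  },
  "Engineering Center": {
    "en": ["engineering center"],
    "es": ["centro de ingeniería"]
  }
}
"""


def load_common_aliases(raw: str) -> dict:
    """Parse the shared file and sort each entry's languages alphabetically."""
    data = json.loads(raw)
    return {name: dict(sorted(by_lang.items())) for name, by_lang in data.items()}


def aliases_for(name: str, lang: str, common: dict) -> list:
    """Return the aliases for one canonical name in one language (empty if none)."""
    return list(common.get(name, {}).get(lang, []))


common = load_common_aliases(COMMON_ALIASES_JSON)
```

Each data file would then reference canonical names only, and the per-language aliases would live in one place.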
initial multilingual version
Easily extendable to many languages using the add_culture_data script. The script handles translation, gender-consistent naming, name suggestions, data extension, etc., so anyone who wants to run a language not committed in the EVA data can do so with little effort.
README section showing the basics of adding a language.
This adds:
Multilingual data schema and content (initial utterances, system prompt, name aliases)
Multilingual support in code
Prompt updates to support multiple languages
Script to "add a language" with high degree of automation
WER metric normalization rules, set dynamically per language and creatable via LLM through the add-language script
Automatic .env.example adjustments (maintains config app accuracy)
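To illustrate the per-language WER normalization item above, here is a minimal sketch. The rule format (a list of regex pattern and replacement pairs per language code), the sample rules, and the function names are assumptions for illustration; the PR's actual schema and metric code may differ.

```python
import re
import unicodedata

# Hypothetical per-language rules: (regex pattern, replacement) pairs.
NORMALIZATION_RULES = {
    "en": [(r"\bcan't\b", "cannot"), (r"\bi'm\b", "i am")],
    "de": [(r"ß", "ss")],
}


def normalize(text: str, lang: str) -> list:
    """Lowercase, apply the language's rules, strip punctuation, split into words."""
    text = unicodedata.normalize("NFKC", text).lower()
    for pattern, replacement in NORMALIZATION_RULES.get(lang, []):
        text = re.sub(pattern, replacement, text)
    # \w is Unicode-aware for str patterns, so accented letters are kept.
    text = re.sub(r"[^\w\s]", " ", text)
    return text.split()


def edit_distance(ref: list, hyp: list) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                            # deletion
                           cur[j - 1] + 1,                         # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str, lang: str) -> float:
    """Word error rate computed after language-specific normalization."""
    ref_words = normalize(reference, lang)
    hyp_words = normalize(hypothesis, lang)
    if not ref_words:
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

Keeping the rules as plain data is what would let a script generate them for a new language without touching the metric code.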
Still TODO: