Agentic Translation with validation

In the previous post, I wrote about using an LLM for translation. That was the most basic way to use a model to translate text. However, the translation quality is not always good. In this post, I use the model to generate the translation and then validate it with a judge, using the SocietyOfMindAgent from AutoGen.

Autogen

AutoGen is a framework created by Microsoft for building multi-agent AI applications that can act autonomously or work alongside humans. In this article, I use it to build a translation service in which a judge validates the translation quality, using the SocietyOfMindAgent.

SocietyOfMindAgent

In its paper Improving Factuality and Reasoning in Language Models through Multiagent Debate, DeepMind proposed the “society of mind” approach, inspired by Marvin Minsky’s theory of the same name. This approach is also called language generation through multi-agent debate.
The society of mind is a collection of agents that work together to solve a problem. Each agent has its own expertise and can communicate with other agents to solve the problem.

Example from Autogen’s documentation:

        import asyncio
        from autogen_agentchat.ui import Console
        from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
        from autogen_ext.models.openai import OpenAIChatCompletionClient
        from autogen_agentchat.teams import RoundRobinGroupChat
        from autogen_agentchat.conditions import TextMentionTermination


        async def main() -> None:
            model_client = OpenAIChatCompletionClient(model="gpt-4o")

            agent1 = AssistantAgent("assistant1", model_client=model_client, system_message="You are a writer, write well.")
            agent2 = AssistantAgent(
                "assistant2",
                model_client=model_client,
                system_message="You are an editor, provide critical feedback. Respond with 'APPROVE' if the text addresses all feedbacks.",
            )
            inner_termination = TextMentionTermination("APPROVE")
            inner_team = RoundRobinGroupChat([agent1, agent2], termination_condition=inner_termination)

            society_of_mind_agent = SocietyOfMindAgent("society_of_mind", team=inner_team, model_client=model_client)

            agent3 = AssistantAgent(
                "assistant3", model_client=model_client, system_message="Translate the text to Spanish."
            )
            team = RoundRobinGroupChat([society_of_mind_agent, agent3], max_turns=2)

            stream = team.run_stream(task="Write a short story with a surprising ending.")
            await Console(stream)


        asyncio.run(main())

In this example, the society of mind agent holds two agents: assistant1 (the writer) and assistant2 (the editor). Both are basic text generation agents.

A round-robin group chat alternates between the two agents. A termination condition is attached to the group chat to stop the conversation when the editor approves the text. If no termination condition is defined, the conversation continues indefinitely. This group chat is the team used by the SocietyOfMindAgent.
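To make the turn-taking concrete, here is a minimal sketch of a round-robin loop with a text-mention stop condition, written in plain Python. It is not Autogen's implementation: `run_round_robin`, the `Agent` type, and the toy `writer` and `editor` functions are hypothetical stand-ins, and real agents would call a model instead of returning canned text.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for an agent: it receives the transcript so far
# and returns its reply as plain text.
Agent = Callable[[List[Tuple[str, str]]], str]


def run_round_robin(
    agents: List[Tuple[str, Agent]],
    task: str,
    stop_word: str = "APPROVE",
    max_turns: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Let the agents speak in a fixed order until a reply mentions stop_word."""
    transcript = [("user", task)]
    turn = 0
    # Without max_turns, only the stop word can end the loop.
    while max_turns is None or turn < max_turns:
        name, agent = agents[turn % len(agents)]
        reply = agent(transcript)
        transcript.append((name, reply))
        turn += 1
        if stop_word in reply:
            break
    return transcript


def writer(transcript: List[Tuple[str, str]]) -> str:
    # Toy writer: produces a new numbered draft each time it speaks.
    drafts = sum(1 for name, _ in transcript if name == "writer")
    return f"draft {drafts + 1}"


def editor(transcript: List[Tuple[str, str]]) -> str:
    # Toy editor: asks for one revision, then approves the second draft.
    last_draft = transcript[-1][1]
    return "APPROVE" if last_draft == "draft 2" else "Please tighten the ending."


if __name__ == "__main__":
    team = [("writer", writer), ("editor", editor)]
    for name, text in run_round_robin(team, "Write a short story."):
        print(f"{name}: {text}")
```

With the toy agents above, the editor rejects the first draft and approves the second, so the loop ends after four turns. Passing `max_turns` gives the loop a hard upper bound, which is useful when the reviewer might never approve.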
The society of mind agent is itself an agent that processes the whole discussion generated by the group chat. It wraps the discussion with an instruction and a response prompt in order to generate a response based on that discussion.
The default instruction is:

Earlier you were asked to fulfill a request. 
You and your team worked diligently to address that request. 
Here is a transcript of that conversation: 
{conversation}

The default response prompt is:

Output a standalone response to the original request, without mentioning any of the intermediate discussion.

Finally, the society of mind agent is added to the outer group chat as a single agent to generate the final response.

To summarize, the society of mind agent acts as the representative of a group of agents and generates the final response on their behalf.
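The wrap-up step can be pictured as sandwiching the inner team's transcript between the instruction and the response prompt. The sketch below is a simplified illustration in plain Python: `build_summary_prompt` is a hypothetical helper, and flattening everything into one string is my assumption for readability, not necessarily how Autogen passes the messages to the model.

```python
from typing import List, Tuple

# The two default texts quoted above.
INSTRUCTION = (
    "Earlier you were asked to fulfill a request. "
    "You and your team worked diligently to address that request. "
    "Here is a transcript of that conversation:"
)
RESPONSE_PROMPT = (
    "Output a standalone response to the original request, "
    "without mentioning any of the intermediate discussion."
)


def build_summary_prompt(
    transcript: List[Tuple[str, str]],
    instruction: str = INSTRUCTION,
    response_prompt: str = RESPONSE_PROMPT,
) -> str:
    """Place the inner team's transcript between the instruction and the response prompt."""
    conversation = "\n".join(f"{speaker}: {text}" for speaker, text in transcript)
    return f"{instruction}\n{conversation}\n{response_prompt}"


if __name__ == "__main__":
    inner_transcript = [
        ("assistant1", "Once upon a time."),
        ("assistant2", "APPROVE"),
    ]
    print(build_summary_prompt(inner_transcript))
```

Seen this way, the response prompt is the natural place to steer what the final answer looks like, since it is the last thing the model reads before it responds.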

The experiment

My case is a lot simpler than the example above. I just need to translate a text and have the translation validated by a judge. This can take several translate-and-refine turns. Once the judge approves the translation, the conversation stops and the society of mind agent generates the final response, which is the translation.

import asyncio
from autogen_agentchat.ui import Console
from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination


async def main():  # returns the messages produced by the run
    model_client = OpenAIChatCompletionClient(
        model="qwen2.5:7b",
        api_key="ff",
        base_url="http://192.168.88.251:11434/v1",
        model_capabilities={
            "vision": False,
            "function_calling": False,
            "json_output": True,
        },
    )

    translator = AssistantAgent(
        "translator",
        model_client=model_client,
        system_message="You are a translator good at translating texts. You translate the text with faithfulness to the original text."
    )
    
    translator_reviewer = AssistantAgent(
        "translator_reviewer",
        model_client=model_client,
        system_message="You are a reviewer good at reviewing translations. You review the translation for faithfulness, grammar and naming consistency. You provide feedback for the translation. Respond with 'APPROVE' if the translation meets all the requirements.")
    translation_termination = TextMentionTermination("APPROVE")
    translation_team = RoundRobinGroupChat(
        [translator, translator_reviewer], termination_condition=translation_termination, max_turns=3
    )
    response_prompt = "Output a standalone translation to the original request, without mentioning any of the intermediate discussion and the APPROVE message"
    society_of_mind_agent = SocietyOfMindAgent(
        "society_of_mind", team=translation_team, model_client=model_client, response_prompt=response_prompt
    )
    task = """
    Translate the following text to English:
    ### 
    Si vous arrêtez d'utiliser REFRESH, collyre en récipient unidose
    ### 
    """
    stream = society_of_mind_agent.run_stream(task=task)
    result = await Console(stream)
    return result.messages

if __name__ == "__main__":
    result = asyncio.run(main())
    print("-----------------")
    print(result[-1].content)

The code above is a simple translation service with a judge. The translation team holds two agents: translator and translator_reviewer. The conversation stops when the reviewer approves the translation or after three turns (max_turns=3), whichever comes first. The society of mind agent then generates the final response, which is the translation.
My model is far smaller than the one in the example. I started with the Qwen2.5 1.5B model. It works well most of the time, but the translation is not always consistent and it sometimes adds information that does not exist in the original text. Worse, the judge failed to spot the problem and approved the wrong translation. The 3B model also fell short on both consistency and judging. In the end, I stopped the experiment at the 7B model, which generates high-quality translations and, as the judge, can spot the problems before approving a translation.
Even so, it only worked as expected after a minor change. I changed the response prompt so that it outputs only the translation, without mentioning the intermediate discussion or the APPROVE message. With the original response prompt, the final response contained the “APPROVE” message.
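A prompt is only a request to the model, so a small post-processing guard could back it up. The helper below is a hypothetical sketch, not part of Autogen and not something used in the runs shown here. It assumes that a leaked approval marker sits on a line of its own.

```python
def strip_approval(text: str, marker: str = "APPROVE") -> str:
    """Drop lines that contain only the approval marker, then trim the result."""
    kept = [
        line
        for line in text.splitlines()
        # Treat "APPROVE", "APPROVE." and "APPROVE!" as marker-only lines.
        if line.strip().strip(".!") != marker
    ]
    # Blank lines inside the translation are kept; only the outer whitespace goes.
    return "\n".join(kept).strip()


if __name__ == "__main__":
    leaked = "APPROVE\n\nIf you stop using REFRESH, one-dose eye drops"
    print(strip_approval(leaked))
```

A marker that appears inside a sentence is left alone on purpose, because removing words from the middle of a translation could change its meaning.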

The final response is:

---------- user ----------

    Translate the following text to English:
    ### 
    Si vous arrêtez d'utiliser REFRESH, collyre en récipient unidose
    ### 
    
---------- translator ----------
If you stop using REFRESH, one-dose eye drops
[Prompt tokens: 69, Completion tokens: 13]
---------- translator_reviewer ----------
APPROVE

The provided translation is accurate and conveys the meaning of the original French text correctly. It uses appropriate English terminology for "REFRESH" (presumably a product name) and properly translates "collyre en récipient unidose" to "one-dose eye drops." The sentence structure is clear and grammatically correct in English.
[Prompt tokens: 105, Completion tokens: 74]
---------- society_of_mind ----------
If you stop using REFRESH, one-dose eye drops
[Prompt tokens: 148, Completion tokens: 13]
---------- Summary ----------
Number of messages: 4
Finish reason: None
Total prompt tokens: 322
Total completion tokens: 100
Duration: 2.78 seconds
-----------------
If you stop using REFRESH, one-dose eye drops

A second example:

---------- user ----------

    Translate the following text to English:
    ###
    Faites attention avec DOLIPRANE 100 mg :
· Si la douleur persiste plus de 5 jours, ou la fièvre plus de 3 jours, ou en cas d’efficacité insuffisante ou de survenue de tout autre signe, ne continuez pas le traitement sans l’avis de votre médecin.

· La prise de paracétamol peut entraîner des troubles du fonctionnement du foie.

· Vous devez demander l’avis de votre médecin avant de donner ce médicament à votre enfant :

· s’il a une maladie du foie ou une maladie grave des reins,

· s’il souffre de déshydratation,

· s’il souffre par exemple de malnutrition chronique, s’il est en période de jeûne, s’il a perdu beaucoup de poids récemment, s’il est atteint du virus du SIDA ou d’une hépatite virale chronique, s’il souffre de mucoviscidose (maladie génétique et héréditaire caractérisée notamment par des infections respiratoires graves), ou encore s’il est atteint de la maladie de Gilbert (maladie héréditaire associée à une augmentation du taux de bilirubine dans le sang),

· si votre enfant est allergique à l’aspirine et/ou aux anti-inflammatoires non stéroïdiens.

· À titre informatif : la consommation de boissons alcoolisées pendant le traitement est déconseillée. En cas de sevrage récent d’un alcoolisme chronique, le risque d’atteinte hépatique est majoré

· En cas d’administration chez un enfant, la dose dépend de son poids (voir rubrique « Comment utiliser DOLIPRANE 100 mg, suppositoire sécable ? »).
###
    
---------- translator ----------
Be careful when using DOLIPRANE 100 mg:
- If pain persists for more than 5 days or fever for more than 3 days, or if the medication is not effective or other symptoms appear, do not continue treatment without your doctor's advice.

- Taking paracetamol can cause liver function problems.

- Before giving this medicine to your child, consult your doctor:

- If he has a liver disease or serious kidney illness,

- If he suffers from dehydration,

- For example, if he is chronically malnourished, fasting, has recently lost significant weight, has the HIV virus or chronic viral hepatitis, has mucoviscidosis (a genetic hereditary condition characterized by severe respiratory infections), or suffers from Gilbert's syndrome (a hereditary disease associated with increased bilirubin levels in the blood),

- If your child is allergic to aspirin and/or nonsteroidal anti-inflammatory drugs.

- For information: Alcoholic beverages should be avoided during treatment. The risk of liver damage after chronic alcoholism withdrawal is increased.

- When administering to a child, the dose depends on his weight (see "How to use DOLIPRANE 100 mg, cuttable suppository?" section).
[Prompt tokens: 464, Completion tokens: 254]
---------- translator_reviewer ----------
APPROVE

The translation accurately conveys all the information from the original text. It maintains the correct medical terminology and warning messages pertinent to using DOLIPRANE 100 mg responsibly. The sentence structure in English matches the intended meaning effectively while also ensuring that it is understandable to the target audience. Naming consistency has been preserved, and there are no grammatical errors noted within the provided translation.
[Prompt tokens: 740, Completion tokens: 83]
---------- society_of_mind ----------
Be careful when using DOLIPRANE 100 mg:
- If pain persists for more than 5 days or fever for more than 3 days, or if the medication is not effective or other symptoms appear, do not continue treatment without your doctor's advice.

- Taking paracetamol can cause liver function problems.

- Before giving this medicine to your child, consult your doctor:

- If he has a liver disease or serious kidney illness,

- If he suffers from dehydration,

- For example, if he is chronically malnourished, fasting, has recently lost significant weight, has the HIV virus or chronic viral hepatitis, has mucoviscidosis (a genetic hereditary condition characterized by severe respiratory infections), or suffers from Gilbert's syndrome (a hereditary disease associated with increased bilirubin levels in the blood),

- If your child is allergic to aspirin and/or nonsteroidal anti-inflammatory drugs.

- For information: Alcoholic beverages should be avoided during treatment. The risk of liver damage after chronic alcoholism withdrawal is increased.

- When administering to a child, the dose depends on his weight (see "How to use DOLIPRANE 100 mg, cuttable suppository?" section).
[Prompt tokens: 397, Completion tokens: 254]
---------- Summary ----------
Number of messages: 4
Finish reason: None
Total prompt tokens: 1601
Total completion tokens: 591
Duration: 19.30 seconds
-----------------
Be careful when using DOLIPRANE 100 mg:
- If pain persists for more than 5 days or fever for more than 3 days, or if the medication is not effective or other symptoms appear, do not continue treatment without your doctor's advice.

- Taking paracetamol can cause liver function problems.

- Before giving this medicine to your child, consult your doctor:

- If he has a liver disease or serious kidney illness,

- If he suffers from dehydration,

- For example, if he is chronically malnourished, fasting, has recently lost significant weight, has the HIV virus or chronic viral hepatitis, has mucoviscidosis (a genetic hereditary condition characterized by severe respiratory infections), or suffers from Gilbert's syndrome (a hereditary disease associated with increased bilirubin levels in the blood),

- If your child is allergic to aspirin and/or nonsteroidal anti-inflammatory drugs.

- For information: Alcoholic beverages should be avoided during treatment. The risk of liver damage after chronic alcoholism withdrawal is increased.

- When administering to a child, the dose depends on his weight (see "How to use DOLIPRANE 100 mg, cuttable suppository?" section).

Final thoughts

I had a lot of fun with this agent. In this very short exploration, I only checked that the inner team produces a consistent result. I did not verify that the society of mind agent's final response is consistent as well.
The agent is somewhat overkill for this use case. Most of the time, we can expect the model to output a very good translation, even with the 1.5B model. The judge could be more useful if we provided much more context for the translation: for example, the domain of the text, the target audience, etc.
There is still a lot to explore with the agent. For example, I could add more agents to the society of mind agent to generate the final response. Or I could have several judges vote on the translation, instead of having a single judge approve it, to make the final translation more reliable.
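The voting idea can be sketched in a few lines of plain Python. Everything here is hypothetical: the `Judge` type, `approved_by_majority`, and the three toy judges are illustrations only. In a real setup each judge would be an LLM agent with its own system message rather than a hand-written rule.

```python
import re
from collections import Counter
from typing import Callable, List

# Hypothetical judge: takes (source, translation) and answers "APPROVE" or "REJECT".
Judge = Callable[[str, str], str]


def approved_by_majority(judges: List[Judge], source: str, translation: str) -> bool:
    """Return True when strictly more than half of the judges answer APPROVE."""
    votes = Counter(judge(source, translation) for judge in judges)
    return votes["APPROVE"] > len(judges) / 2


def lenient_judge(source: str, translation: str) -> str:
    # Toy judge that approves everything, like the small model did.
    return "APPROVE"


def length_judge(source: str, translation: str) -> str:
    # Flags translations much longer than the source, a rough signal of added content.
    return "APPROVE" if len(translation) <= 2 * len(source) else "REJECT"


def number_judge(source: str, translation: str) -> str:
    # Numbers such as doses and durations should survive translation unchanged.
    same = sorted(re.findall(r"\d+", source)) == sorted(re.findall(r"\d+", translation))
    return "APPROVE" if same else "REJECT"


if __name__ == "__main__":
    judges = [lenient_judge, length_judge, number_judge]
    source = "DOLIPRANE 100 mg"
    bad = "DOLIPRANE 10 mg, take it twice a day with plenty of water"
    print(approved_by_majority(judges, source, "DOLIPRANE 100 mg"))
    print(approved_by_majority(judges, source, bad))
```

With a single lenient judge, the second translation would pass. With three voters, the two rule-based judges outvote it, which is the kind of robustness the voting setup is meant to add.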
This kind of generation by debate is very powerful. There are apps, like ChatDev, that build a whole IT company out of agents this way. It is a very interesting approach to generating the final response.
I will try to build a more complex agent for other use cases in the future.
