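1. Why use the ActionNode data structure?

As Agent requirements grow more complex, so do SOPs. Action execution is no longer strictly linear and needs to support techniques such as CoT, ToT, and GoT; ActionNode provides a unified abstraction for them.

Previously, an Action used a structured prompt: a long block of markdown text was sent to the LLM, and the Action's run method then parsed whatever the LLM returned. That approach is not very flexible.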

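An LLM request, however, can always be decomposed into structured slot filling: specifying the context of the problem, the requirements, the goal, and the output format.
When writing a complete document that follows an SOP, a human typically breaks the document into subheadings, then orchestrates the actions and fills the slots for each subheading. Put plainly, an SOP should be planned and executed step by step rather than hard-coded into a prompt that asks the LLM to return everything in one shot. When we use an LLM to write an article, the reply tends to give every section roughly equal weight. For code, an LLM can usually produce about 60-75 lines of detailed code, but for a complex system-level requirement it often delivers only a skeleton and a few comments.

ActionNode implements structured slot filling, which makes both LLM requests and LLM responses more standardized.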
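
The idea of planning one slot-filled request per subheading, instead of a single one-shot prompt, can be sketched as follows. This is only an illustration: `SlotRequest` and `plan_requests` are hypothetical names invented here and are not part of MetaGPT.

```python
from dataclasses import dataclass


@dataclass
class SlotRequest:
    """One LLM request expressed as four slots (hypothetical helper)."""
    context: str
    requirement: str
    goal: str
    output_format: str

    def to_prompt(self) -> str:
        # Render each slot under its own heading so every request looks the same.
        return (
            f"## context\n{self.context}\n\n"
            f"## requirement\n{self.requirement}\n\n"
            f"## goal\n{self.goal}\n\n"
            f"## output format\n{self.output_format}\n"
        )


def plan_requests(topic: str, subheadings: list[str]) -> list[SlotRequest]:
    """Build one request per subheading instead of a single one-shot prompt."""
    return [
        SlotRequest(
            context=f"We are writing a tutorial about {topic}.",
            requirement=f"Write the section titled '{title}'.",
            goal="Cover this section in depth; other sections are handled separately.",
            output_format="Markdown, starting with a level-2 heading.",
        )
        for title in subheadings
    ]


section_requests = plan_requests("ActionNode", ["Motivation", "Basics", "Source code"])
prompts = [request.to_prompt() for request in section_requests]
```

Each prompt can then be sent to the LLM on its own, so a long section is not squeezed to make room for the others.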

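2. ActionNode basics

Version 0.5 of the MetaGPT (MG) framework added the ActionNode class, which gives an Agent's action execution more capability.

An ActionNode can be viewed as an action tree. By the class definition, a parent node can access all of its child action nodes; in other words, once a complete action tree is defined, each child action node can be executed in tree order starting from the parent. Action execution is therefore no longer bound by the restriction in the 0.4 framework that it must loop inside the Role's _react, which gives better CoT results.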
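
A minimal sketch of the tree idea, using a hypothetical `Node` class rather than the real ActionNode: the parent holds its children in a dict (as in the source excerpt's `children` field), and a depth-first walk visits them in the order they were added. The `add_child` and `walk` helpers are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an action node: a key plus named children."""
    key: str
    instruction: str
    children: dict[str, "Node"] = field(default_factory=dict)

    def add_child(self, child: "Node") -> "Node":
        # Register the child under its key and return self to allow chaining.
        self.children[child.key] = child
        return self

    def walk(self):
        """Yield this node, then every descendant, depth-first in insertion order."""
        yield self
        for child in self.children.values():
            yield from child.walk()


root = Node("Tutorial", "Write a tutorial")
root.add_child(Node("Directory", "List the section titles"))
root.add_child(
    Node("Content", "Write each section").add_child(
        Node("Summary", "Summarize the section")
    )
)

order = [node.key for node in root.walk()]
# order == ["Tutorial", "Directory", "Content", "Summary"]
```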

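ActionNode is still built on top of the Action class: after defining an ActionNode action tree, you pass the tree as a parameter to an Action subclass, and then give that Action to a Role as its action. In this sense, an ActionNode action tree can be regarded as an Action with built-in CoT reasoning.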
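
The relationship between the tree and the Action can be sketched with stand-in classes. Everything below (`SketchNode`, `SketchAction`, `fake_llm`) is hypothetical and synchronous for simplicity; it is not MetaGPT's API. It only shows an Action that owns a node tree and, when run, fills the children in order while feeding earlier answers into later prompts.

```python
from typing import Callable, Optional


class SketchNode:
    """Hypothetical stand-in for an action tree; not MetaGPT's ActionNode."""

    def __init__(self, key: str, instruction: str,
                 children: Optional[dict[str, "SketchNode"]] = None):
        self.key = key
        self.instruction = instruction
        self.children = children if children is not None else {}

    def fill(self, context: str, ask: Callable[[str], str]) -> dict[str, str]:
        """Ask about each child in order, feeding earlier answers into later prompts."""
        answers: dict[str, str] = {}
        for key, child in self.children.items():
            known = [f"{k}: {v}" for k, v in answers.items()]
            prompt = "\n".join([context, *known, child.instruction])
            answers[key] = ask(prompt)
        return answers


class SketchAction:
    """Hypothetical Action that owns a node tree and runs it as a single step."""

    def __init__(self, name: str, node: SketchNode, ask: Callable[[str], str]):
        self.name = name
        self.node = node
        self.ask = ask

    def run(self, context: str) -> dict[str, str]:
        return self.node.fill(context, self.ask)


seen_prompts: list[str] = []


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call: records the prompt and echoes its last line.
    seen_prompts.append(prompt)
    return "reply to: " + prompt.splitlines()[-1]


tree = SketchNode("Tutorial", "Write a tutorial", {
    "Directory": SketchNode("Directory", "List the section titles"),
    "Content": SketchNode("Content", "Write each section"),
})
result = SketchAction("WriteTutorial", tree, fake_llm).run("Topic: ActionNode")
```

From the Role's point of view there is still one Action to run; the step-by-step reasoning happens inside it.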

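The ActionNode base class also comes with more format-checking and format-normalization tools, so that content is passed along in a more structured way during CoT execution. This serves the goal of having the MetaGPT framework generate better, longer, and less buggy code.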
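
To make "format checking" concrete, here is a rough sketch using only the standard library. The source excerpt types `instruct_content` as a `BaseModel`, so the real class presumably validates through a model; the `check_fields` helper below is a simplified assumption, not the framework's mechanism.

```python
import json


def check_fields(raw: str, expected: dict[str, type]) -> dict:
    """Parse a JSON reply and verify that each expected key has the expected type."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for key, expected_type in expected.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"{key} should be {expected_type.__name__}")
    return data


schema = {"Title": str, "Directory": list}

good = check_fields('{"Title": "MetaGPT basics", "Directory": ["Intro"]}', schema)

try:
    check_fields('{"Title": "MetaGPT basics", "Directory": "Intro"}', schema)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Rejecting a malformed reply at this point keeps a badly typed value from flowing into the next node's prompt.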

3. ActionNode source code

For the actual implementation of ActionNode, see metagpt/actions/action_node.py

SIMPLE_TEMPLATE = """
## context
{context}

-----

## format example
{example}

## nodes: "<node>: <type>  # <comment>"
{instruction}

## constraint
{constraint}

## action
Fill in the above nodes based on the format example.
"""

class ActionNode:
    """ActionNode is a tree of nodes."""
    mode: str

    # Action Context
    context: str  # all the context, including all necessary info
    llm: BaseGPTAPI  # LLM with aask interface
    children: dict[str, "ActionNode"]

    # Action Input
    key: str  # Product Requirement / File list / Code
    expected_type: Type  # such as str / int / float etc.
    # context: str  # everything in the history.
    instruction: str  # the instructions should be followed.
    example: Any  # example for In Context-Learning.

    # Action Output
    content: str
    instruct_content: BaseModel

    def __init__(self, key: str, expected_type: Type, instruction: str, example: str, content: str = "",
                 children: dict[str, "ActionNode"] = None):
        self.key = key
        self.expected_type = expected_type
        self.instruction = instruction
        self.example = example
        self.content = content
        self.children = children if children is not None else {}

    def compile(self, context, to="json", mode="children", template=SIMPLE_TEMPLATE) -> str:
        """
        mode: all/root/children
            mode="children": compile all child nodes into one unified template, including instruction and example
            mode="all": NotImplemented
            mode="root": NotImplemented
        """

        # FIXME: a json instruction causes formatting problems, e.g. "Project name": "web_2048  # use underscores in the project name",
        self.instruction = self.compile_instruction(to="markdown", mode=mode)
        self.example = self.compile_example(to=to, tag="CONTENT", mode=mode)
        prompt = template.format(
            context=context, example=self.example, instruction=self.instruction, constraint=CONSTRAINT
        )
        return prompt

class Action(ABC):
    def __init__(self, name: str = "", context=None, llm: LLM = None):
        self.name: str = name
        if llm is None:
            llm = LLM()
        self.llm = llm
        self.context = context
        self.prefix = ""  # added as the system_message when calling aask*
        self.profile = ""  # FIXME: USELESS
        self.desc = ""  # for skill manager
        self.nodes = ...

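The key thing to study here is the structured slot filling in ActionNode.
SIMPLE_TEMPLATE defines a structured template whose slots ({context}, {example}, {instruction}, {constraint}) carry the context of the problem, the requirements, the goal, and the output format.
An ActionNode receives some parameters when it is initialized (key, expected_type, instruction, example); calling compile later supplies the rest (context, to, mode, template) and produces the complete prompt.
One of Action's attributes (nodes) is an ActionNode, which is how the two are linked. An Action can also call an ActionNode directly without keeping it as an attribute.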
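
The two-stage filling described above can be imitated with a small stand-in. `MiniNode` is hypothetical; its template is a shortened copy of SIMPLE_TEMPLATE without the constraint slot, and the way the CONTENT tag wraps the example is an assumption, since the excerpt does not show `compile_instruction` or `compile_example`.

```python
import json
from typing import Optional

# Shortened copy of SIMPLE_TEMPLATE; the constraint slot is left out of this sketch.
TEMPLATE = """## context
{context}

## format example
{example}

## nodes: "<node>: <type>  # <comment>"
{instruction}

## action
Fill in the above nodes based on the format example.
"""


class MiniNode:
    """Hypothetical stand-in that mimics ActionNode.compile with mode="children"."""

    def __init__(self, key: str, expected_type: type, instruction: str, example,
                 children: Optional[dict[str, "MiniNode"]] = None):
        self.key = key
        self.expected_type = expected_type
        self.instruction = instruction
        self.example = example
        self.children = children if children is not None else {}

    def compile_instruction(self) -> str:
        # One "<node>: <type>  # <comment>" line per child.
        return "\n".join(
            f"{c.key}: {c.expected_type.__name__}  # {c.instruction}"
            for c in self.children.values()
        )

    def compile_example(self) -> str:
        # Assumed rendering: a JSON object of child examples wrapped in CONTENT tags.
        body = json.dumps({c.key: c.example for c in self.children.values()}, indent=4)
        return f"[CONTENT]\n{body}\n[/CONTENT]"

    def compile(self, context: str) -> str:
        # Second stage: the context arrives here and the remaining slots are filled.
        return TEMPLATE.format(
            context=context,
            example=self.compile_example(),
            instruction=self.compile_instruction(),
        )


root = MiniNode("Tutorial", str, "", "", {
    "Title": MiniNode("Title", str, "The tutorial title", "MetaGPT basics"),
    "Language": MiniNode("Language", str, "Language of the tutorial", "English"),
})
prompt = root.compile(context="Write a tutorial about ActionNode.")
```

The instruction and example slots come from the nodes themselves, while the context is supplied only at compile time, which matches the split between `__init__` and `compile` in the excerpt.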

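4. Chapter assignment

"Chapter 3: Introduction to MetaGPT Framework Components" implemented a technical documentation assistant based on Action. The assignment for this chapter is to implement a technical documentation assistant with the same functionality based on ActionNode.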


Building a subscription agent with MetaGPT: Chapter 5, ActionNode
