Get Started

SEIMEI – Search-Enhanced Interface for Multi-Expertise Inference

SEIMEI is KyotoAI’s agent system. It uses reward-model-based search (RMSearch) to guide thousands of reasoning agents, keeping long chains of thought accurate and grounded in your data.

Alpha – API stable, internals evolving

What the READMEs highlight

This page condenses the two READMEs in the repository (README.md and seimei/README.md). Together they describe SEIMEI as a search-enhanced orchestrator where RMSearch picks the next reasoning step and lightweight agents execute the plan.

  • Reward-model guidance – follow the process illustrated in the main README's image gallery: search a large pool of experts and favor the most promising thought chain.
  • Composable orchestrator – the `seimei` class (documented in seimei/README.md) loads agents, enforces token limits, and logs every run.
  • Dataset logging – each run writes artifacts under seimei_runs/ so you can train RMSearch or replay experiments.

Key idea from README.md

Train a smaller RMSearch model to choose the next branch instead of constantly fine-tuning the base LLM. The orchestrator stays fast, reusable, and far cheaper to adapt to new domains.
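
To make the idea concrete, here is a minimal sketch of reward-model-guided branch selection. The function names and the toy scorer are illustrative assumptions, not SEIMEI internals: a small reward model scores candidate next steps, and the orchestrator follows the highest-rated one instead of fine-tuning the base LLM.

```python
def select_next_branch(state: str, candidates: list[str], score) -> str:
    """Pick the candidate step the reward model rates highest."""
    return max(candidates, key=lambda step: score(state, step))

# Toy stand-in for a reward model: prefer steps that mention
# the words of the current reasoning state.
def toy_score(state: str, step: str) -> float:
    return sum(word in step for word in state.split())

best = select_next_branch(
    "optimize ETL pipeline",
    ["profile the ETL stages", "write a poem", "optimize the pipeline loop"],
    toy_score,
)
print(best)  # → "optimize the pipeline loop"
```

In a real deployment the scorer would be the trained RMSearch model; only the selection loop stays this simple.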

Architecture at a glance

Borrowing the "Search the Best Agent" and "The Most Intelligent Search Engine" sections from the main README, SEIMEI flows through four repeating steps:

1. Ingest & index

Pull papers, repos, experiment logs, or meeting notes into an index. Combine your own vector store with the RMSearch training data mentioned in the README walkthrough.

2. Reward rerank

RMSearch scores "agent × question" pairs and reasoning states, similar to the comparison plots shown in the README. High scorers get scheduled next.

3. Agent execution

The orchestrator from seimei/README.md loads planners, tool users, and synthesis agents, applying allowlists and shared context.

4. Final response

When an agent returns final_output, SEIMEI summarizes the reasoning traces, saves logs, and (optionally) feeds the run back into the knowledge base.
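
The four steps above can be sketched as one loop. Every name here (`index.retrieve`, `rmsearch.best_agent`, `agent.run`) is a hypothetical placeholder for the behavior the README describes, not SEIMEI's actual API:

```python
def run_seimei(question, index, rmsearch, agents, max_steps=10):
    """Illustrative sketch of the ingest → rerank → execute → respond loop."""
    state = {"question": question, "trace": []}
    for _ in range(max_steps):
        # 1. Ingest & index: retrieve candidate context for this step.
        docs = index.retrieve(state)
        # 2. Reward rerank: score agent × question pairs, schedule the best.
        agent = rmsearch.best_agent(agents, state, docs)
        # 3. Agent execution: run the chosen agent with shared context.
        result = agent.run(state, docs)
        state["trace"].append(result)
        # 4. Final response: stop when an agent returns final_output.
        if "final_output" in result:
            return result["final_output"], state["trace"]
    return None, state["trace"]
```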

Typical workflow

  1. Sync your corpus and agents into a directory.
  2. Train or plug in RMSearch (see the comparison charts in the main README).
  3. Instantiate the `seimei` orchestrator with the config shown in seimei/README.md.
  4. Let RMSearch rank instructions/agents on every step, keeping reasoning grounded and affordable.
  5. Persist the run for later evaluation or dataset generation under seimei_runs/.

Quick Start

Follow the same steps documented in README.md and seimei/README.md: install the package, export your API keys, and run either the CLI or a small Python script. Everything below is a lightly formatted version of those instructions.

Install from source (README.md)

Clone the repository locally and install it in editable mode so the CLI and Python import paths stay in sync with your edits.

git clone https://github.com/kyotoai/SEIMEI.git
cd SEIMEI
pip install -e .

Optional: pip install duckduckgo_search requests to enable the web-search agent mentioned in the READMEs.

Set API keys (both READMEs)

Export your OpenAI-compatible key (for LLM calls) and the KyotoAI key (for RMSearch). The CLI inherits them automatically.

export OPENAI_API_KEY="your-openai-api-key"
export KYOTOAI_API_KEY="your-kyotoai-api-key"

Add these export lines to your shell profile (for example ~/.zshrc or ~/.bashrc) if you want the keys available in every session.
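
Before a run it can help to fail fast when a key is missing. This small helper is an illustrative convenience, not part of the SEIMEI API; the variable names match the export commands above.

```python
import os

def require_keys(*names: str) -> dict:
    """Return the requested environment variables, failing fast if any is missing."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {n: os.environ[n] for n in names}

# keys = require_keys("OPENAI_API_KEY", "KYOTOAI_API_KEY")
```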

Example commands

Use the CLI for a fast smoke test or the Python orchestrator for a full-featured run. Both snippets are lifted directly from the READMEs.

# Start the CLI that mirrors seimei/README.md
seimei

# Sample prompt:
# "Analyze the files inside this folder and explain what SEIMEI is."

Log every run to knowledge (README.md)

Set generate_knowledge=True to append retrospectives into seimei_knowledge/. The helper seimei.knowledge.generate_from_runs mirrors the snippet in the root README.

result = await orchestrator(
    messages=[{"role": "user", "content": "Find ways to speed up the ETL pipeline."}],
    generate_knowledge=True,
    save_knowledge_path="seimei_knowledge/knowledge.csv",
    knowledge_prompt_path="seimei/knowledge/prompts/generate_from_runs.md",
)
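
Once runs accumulate, the knowledge file can be read back for inspection. The column layout of the CSV is an assumption here (the READMEs do not document it), so this reader simply loads whatever header the file declares:

```python
import csv

def load_knowledge(path: str) -> list[dict]:
    """Load accumulated knowledge rows as dictionaries keyed by the CSV header."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```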

Code Act agent

Reproduce the built-in seimei/agents/code_act.py example from seimei/README.md. It executes whitelisted shell commands, streams logs, and can write knowledge after every run.

  • Turn on allow_code_exec=True and restrict allowed_commands to keep the sandbox tight.
  • Pass a system message if you need stricter execution etiquette (e.g., "never run unasked commands").
  • Inspect result["msg_history"][-2] for the agent’s raw reply, exactly as depicted in the README snippet.

import asyncio
from seimei import seimei

async def demo_code_act():
    orchestrator = seimei(
        agent_config=[{"file_path": "seimei/agents/code_act.py"}],
        llm_kwargs={"model": "gpt-4o-mini"},
        allow_code_exec=True,
        allowed_commands=["ls", "cat", "python"],
        agent_log_head_lines=1,
        max_tokens_per_question=2000,
    )

    result = await orchestrator(
        messages=[
            {"role": "system", "content": "You are an execution assistant that never runs unasked commands."},
            {"role": "user", "content": "List the repo root and summarize the files."},
        ],
    )
    print(result["msg_history"][-2]["content"])

asyncio.run(demo_code_act())
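
The README exposes allowed_commands but not its enforcement. As a sketch of how such an allowlist can work client-side (illustrative, not SEIMEI's internal check), permit a command only when its executable is on the list:

```python
import shlex

def is_allowed(command: str, allowed: set[str]) -> bool:
    """Permit a command only when its first token (the executable) is allowlisted."""
    try:
        parts = shlex.split(command)
    except ValueError:
        # Malformed shell input (e.g. unclosed quote) is rejected outright.
        return False
    return bool(parts) and parts[0] in allowed

allowed = {"ls", "cat", "python"}
print(is_allowed("ls -la", allowed))    # → True
print(is_allowed("rm -rf /", allowed))  # → False
```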

Custom agent skeleton

Extend SEIMEI by following the “Create an agent” guide inside seimei/README.md. Drop new files anywhere and add them to agent_config.

from seimei import Agent

class prioritise_docs(Agent):
    """Rank documentation chunks that should be read next."""

    description = "Select document chunks relevant to the latest user request."

    async def inference(self, messages, shared_ctx, **kwargs):
        search = shared_ctx.get("search")
        if not search:
            return {"content": "search helper unavailable", "log": {}}

        question = next((m["content"] for m in reversed(messages) if m.get("role") == "user"), "")
        candidates = [{"key": text, "section": name} for name, text in kwargs.get("docs", [])]

        ranked = await search(
            query=question,
            keys=candidates,
            k=3,
            context={"purpose": "doc_ranking"},
        )
        plan = "\n".join(f"- {item['payload']['section']}" for item in ranked if item.get("payload"))
        return {"content": f"Review next:\n{plan}", "log": {"query": question, "sections": ranked}}

Use agent_config=[{"dir_path": "my_agents"}] to load every Python file in that directory. The shared context exposes the RMSearch helper, instruction list, and run-scoped LLM proxy.
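
A dir_path loader presumably has to discover agent files first. This is a guess at that step, not SEIMEI's actual loader: collect every .py file under the directory, sorted so load order is deterministic.

```python
from pathlib import Path

def discover_agent_files(dir_path: str) -> list[str]:
    """Return all .py files under dir_path, sorted for deterministic loading."""
    return sorted(str(p) for p in Path(dir_path).rglob("*.py"))
```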

Rate limiting & usage units

SEIMEI and RMSearch are billed and rate-limited using UTF-8 bytes, not tokens. This makes costs predictable across languages and models.

Per-minute soft limit (example)

Suppose we allow 2,000,000 bytes/minute per API key:

Total bytes for one RMSearch call =
  overhead_bytes
  + len(query.encode("utf-8"))
  + sum(len(doc_i.encode("utf-8")) for each document)
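
The accounting rule above translates directly into a helper. overhead_bytes is a stand-in for whatever fixed per-request cost applies to your deployment:

```python
def rmsearch_request_bytes(query: str, documents: list[str], overhead_bytes: int = 0) -> int:
    """Total UTF-8 bytes counted against the per-minute limit for one RMSearch call."""
    return (
        overhead_bytes
        + len(query.encode("utf-8"))
        + sum(len(doc.encode("utf-8")) for doc in documents)
    )

# Note that multibyte text counts per byte, not per character:
# a CJK query consumes roughly three bytes per character.
```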

When you exceed the soft limit, requests may still succeed but be throttled into a higher-latency mode. For production deployments we recommend aggregating queries and using streaming where possible.
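
One way to follow the aggregation advice client-side is to greedily pack small payloads into batches that fit the byte budget. This is an illustrative sketch, not a SEIMEI feature:

```python
def batch_by_budget(payloads: list[bytes], budget: int) -> list[list[bytes]]:
    """Greedily group payloads so each batch stays within the byte budget.

    A payload larger than the budget still gets its own batch; the server-side
    limit would then throttle that single oversized call.
    """
    batches, current, used = [], [], 0
    for p in payloads:
        if current and used + len(p) > budget:
            batches.append(current)
            current, used = [], 0
        current.append(p)
        used += len(p)
    if current:
        batches.append(current)
    return batches
```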

Pricing & deployment model

SEIMEI is designed to work in both open-source and managed modes. The exact pricing will depend on your deployment, but a typical setup looks like:

  • Open-source core – the SEIMEI orchestration code and RMSearch reference models under an Apache-2.0-style license.
  • Managed API – KyotoAI hosts SEIMEI and RMSearch for you with SLAs, monitoring, and scaling.
  • Custom deployments – on-prem or VPC setups for sensitive scientific workloads.

Contract note

If you are providing a project-specific search model or SEIMEI deployment under a non-exclusive contract, make sure your terms explicitly distinguish between:

  • the trained model weights and interface, and
  • any customer-owned data or analysis deliverables.

That way you can open-source SEIMEI and RMSearch while keeping customer data private and respecting their usage rights.