Code2MCP: Transforming Code Repositories into MCP Services


The Model Context Protocol (MCP) aims to create a standard for how Large Language Models use tools. However, most current research focuses on selecting tools from an existing pool. A more fundamental, yet largely overlooked, problem is how to populate this pool by converting the vast number of existing software projects into MCP-compatible services. To bridge this gap, we introduce Code2MCP, an agent-based framework that automatically transforms a GitHub repository into a functional MCP service with minimal human intervention. Code2MCP employs a multi-agent workflow for code analysis, environment setup, tool function design, and service generation, enhanced by a self-correcting loop to ensure reliability. We demonstrate that Code2MCP successfully transforms open-source computing libraries in scientific fields such as bioinformatics, mathematics, and fluid dynamics that are not available in existing MCP servers. By providing a novel automated pathway to unlock GitHub, the world’s largest code repository, for the MCP ecosystem, Code2MCP serves as a catalyst to significantly accelerate the protocol’s adoption and practical application. The code is public at https://github.com/DEFENSE-SEU/Code2MCP.

💡 Research Summary

The paper addresses a critical bottleneck in the emerging Model Context Protocol (MCP) ecosystem: the supply‑side problem of populating the protocol’s tool pool with standardized services. Prior work has largely focused on how large language models (LLMs) consume tools, ranging from teaching models to invoke a handful of pre‑defined APIs to scaling up to the large tool collections used by systems such as HuggingGPT or Gorilla. Very little attention has been paid to automatically converting the millions of existing open‑source repositories into MCP‑compliant services. To fill this gap, the authors introduce Code2MCP, an agent‑based framework that transforms an arbitrary GitHub repository into a fully functional, documented MCP service with minimal human effort.

Core Architecture

Code2MCP orchestrates seven specialized agents in a multi‑stage pipeline:

Download Agent clones the target repository into an isolated workspace.
Environment Agent parses dependency files (requirements.txt, environment.yml, Dockerfiles) and builds a reproducible execution environment (container or virtual env), mitigating common setup failures.
Analysis Agent performs deep code comprehension using a “Deep Wiki” semantic tool, extracting high‑level functionalities, intent from comments, and generating a Code Report that lists candidate APIs.
Generation Agent leverages an LLM (e.g., GPT‑4) to design MCP‑compatible interfaces, create adapter code linking the original library to the MCP API, and produce a basic test suite.
Run Agent executes the generated tests in the prepared environment. On success the pipeline proceeds; on failure it captures the full traceback.
Review Agent diagnoses the failure by correlating the traceback with the generated code, the Code Report, and the test definition. It produces a correction plan that pinpoints which files or code blocks need modification.
Finalize Agent packages the validated MCP service, writes a README, arranges all artifacts under an mcp_output directory, and optionally opens a pull request against the original repository.
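
To make the Generation Agent's output more concrete, the following is a minimal sketch of what an adapter layer could look like: a library function is wrapped as a named tool, its parameters are described by introspection, and calls return a uniform result envelope. The registry `TOOLS`, the decorator `tool`, and the helper `call_tool` are hypothetical names chosen for illustration, and Python's standard `statistics` module stands in for the repository being wrapped. This is not the code Code2MCP generates, and a real service would sit on top of an MCP server SDK rather than a plain dictionary.

```python
import inspect
import statistics
from typing import Any, Callable, Dict

# Hypothetical in-memory registry; a real MCP server SDK would replace this.
TOOLS: Dict[str, Dict[str, Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function and record a simple description of its parameters."""
    signature = inspect.signature(func)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in signature.parameters.items()
    }
    TOOLS[func.__name__] = {
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "handler": func,
    }
    return func


@tool
def mean(values: list) -> float:
    """Arithmetic mean, delegating to the wrapped library (here: statistics)."""
    return statistics.mean(values)


def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Invoke a registered tool and wrap the outcome in a uniform envelope."""
    try:
        result = TOOLS[name]["handler"](**arguments)
        return {"ok": True, "result": result}
    except Exception as exc:  # report failures instead of crashing the service
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    print(TOOLS["mean"]["parameters"])
    print(call_tool("mean", {"values": [1, 2, 3]}))
    print(call_tool("mean", {"values": []}))  # library error becomes ok=False
```

Returning errors as data rather than raising them keeps a single bad call from taking down the service, and gives a test suite a stable shape to assert against.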

The Run‑Review‑Fix loop is what distinguishes Code2MCP from naïve retry strategies. Instead of re‑generating the entire service after each failure, the loop performs targeted, diagnosis‑driven repairs: only the files or code blocks named in the correction plan are modified, so already‑validated components are preserved and each iteration is shorter.
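
The difference between a targeted repair and a full regeneration can be sketched in a few lines. In the sketch below, `run` plays the Run Agent (returning a failure or `None`), `review` plays the Review Agent (returning a corrected version of the one offending file), and everything else is left untouched. The `Failure` type, the function names, the toy file contents, and the default of three rounds are illustrative assumptions, not the framework's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Files = Dict[str, str]  # file name -> source text


@dataclass
class Failure:
    file: str     # file the diagnosis points at
    message: str  # captured error text


def run_review_fix(
    files: Files,
    run: Callable[[Files], Optional[Failure]],
    review: Callable[[Failure, Files], str],
    max_rounds: int = 3,
) -> Tuple[bool, Files, int]:
    """Repair only the file named in each failure; return (ok, files, repairs)."""
    files = dict(files)  # work on a copy so the caller's input is preserved
    for repairs in range(max_rounds):
        failure = run(files)
        if failure is None:
            return True, files, repairs
        # Targeted repair: replace the single offending file, keep the rest.
        files[failure.file] = review(failure, files)
    return run(files) is None, files, max_rounds


# Toy stand-ins for the Run and Review agents.
def toy_run(files: Files) -> Optional[Failure]:
    if "return a + b" not in files["adapter.py"]:
        return Failure("adapter.py", "AssertionError: add(1, 2) != 3")
    return None


def toy_review(failure: Failure, files: Files) -> str:
    return files[failure.file].replace("a - b", "a + b")


initial = {
    "adapter.py": "def add(a, b):\n    return a - b\n",
    "server.py": "# already validated, must stay unchanged\n",
}
ok, fixed, repairs = run_review_fix(initial, toy_run, toy_review)

if __name__ == "__main__":
    print(ok, repairs)
```

In the toy run, one repair is enough and `server.py` is never rewritten, which is the property that separates this loop from regenerating the whole service on every failure.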

Security Model

Because Code2MCP executes third‑party code, the authors run all conversions inside sandboxed workers with restricted filesystem access and bounded runtime. Outbound network access is disabled by default; for repositories that require external services, a whitelist mechanism enables selective connectivity. The paper does not claim adversarial robustness but discusses operational safeguards to prevent accidental execution of malicious payloads.
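
Two of these safeguards, bounded runtime and a host whitelist, can be illustrated with the standard library alone. The sketch below is an assumption-laden illustration and not the authors' sandbox: `run_bounded`, `host_allowed`, and the whitelist entry `api.example.org` are hypothetical, and a child process with a scratch working directory does not by itself restrict filesystem or network access. A real deployment would need containers or OS-level isolation for that.

```python
import subprocess
import sys
import tempfile
from urllib.parse import urlparse

# Hypothetical whitelist; outbound hosts not listed here are rejected.
ALLOWED_HOSTS = {"api.example.org"}


def host_allowed(url: str) -> bool:
    """Return True only if the URL's host is on the whitelist."""
    return urlparse(url).hostname in ALLOWED_HOSTS


def run_bounded(code: str, timeout_s: float = 5.0) -> dict:
    """Run Python code in a child process with a scratch cwd and a time limit."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,  # the child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "returncode": None}
    return {
        "status": "finished",
        "stdout": proc.stdout,
        "returncode": proc.returncode,
    }


if __name__ == "__main__":
    print(run_bounded("print(2 + 2)"))
    print(run_bounded("while True: pass", timeout_s=0.5))
    print(host_allowed("https://api.example.org/v1"))
```

The time limit guards against runaway or hanging code, while the whitelist check mirrors the default-deny stance described above: connectivity is opted into per host rather than opted out of.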

Evaluation

The authors evaluate Code2MCP on 50 open‑source repositories spanning ten scientific and engineering domains (bioinformatics, mathematics, fluid dynamics, etc.), selecting five repositories per domain to expose diverse dependency graphs, API styles, and execution environments. The evaluation metric is binary: a conversion succeeds if the generated MCP service passes all tests and a pull request can be created.

Key findings:

Overall success rate: 86 % (43/50 repositories). Most failures were due to extremely complex build systems or missing licensing information.
Iteration count: 1–3 Run‑Review‑Fix cycles were sufficient for the majority of successful cases, indicating the effectiveness of targeted debugging.
Time efficiency: Average end‑to‑end conversion time was ~12 minutes, a 42 % reduction compared to a baseline that used GPT‑4 to fill a handcrafted template and required manual debugging.
Human effort: Compared with fully manual implementation, Code2MCP reduced required developer hours by over 70 %, demonstrating substantial productivity gains.
Baseline comparison: A GPT‑4 template‑based approach achieved only 58 % success and required roughly twice the time per repository.
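
The arithmetic behind these figures can be checked directly. The sketch below encodes the binary success criterion from the evaluation setup, recomputes the overall success rate, and inverts the reported 42 % reduction to estimate the baseline's conversion time. The baseline time is not stated in the summary, so the value computed here is an inference from the approximate "~12 minutes" figure, not a reported result. The function names are illustrative.

```python
def conversion_succeeded(tests_passed: bool, pr_creatable: bool) -> bool:
    """Binary metric: all tests pass AND a pull request can be created."""
    return tests_passed and pr_creatable


def success_rate(successes: int, total: int) -> float:
    return successes / total


def implied_baseline(reduced_value: float, reduction: float) -> float:
    """Invert 'reduced = baseline * (1 - reduction)' to recover the baseline."""
    return reduced_value / (1.0 - reduction)


rate = success_rate(43, 50)                     # reported: 43 of 50
baseline_minutes = implied_baseline(12.0, 0.42)  # inferred, not reported
ratio = baseline_minutes / 12.0

if __name__ == "__main__":
    print(f"success rate: {rate:.0%}")                          # 86%
    print(f"implied baseline time: {baseline_minutes:.1f} min")  # 20.7 min
    print(f"baseline / Code2MCP: {ratio:.2f}x")                  # 1.72x
```

Under these assumptions the baseline takes about 20.7 minutes per repository, or about 1.7 times as long as Code2MCP, which is the sense in which "roughly twice the time" should be read.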

Limitations and Future Work

The authors acknowledge several constraints:

GUI‑centric or interactive tools are not yet supported because the current pipeline assumes a programmatic API surface.
Heavy external‑service dependencies (e.g., cloud APIs) require manual whitelist configuration; fully automated handling remains an open problem.
License and copyright verification is performed only superficially; automated legal compliance checks are needed for large‑scale deployment.
Domain generalization: While the evaluation covers a broad scientific spectrum, extending to typical web frameworks, database systems, or large monolithic applications may demand domain‑specific prompts and richer test generation strategies.

Future directions include integrating static analysis for security scanning, automating license compliance, supporting multi‑modal artifacts (e.g., Jupyter notebooks), and scaling the framework to handle millions of repositories via distributed orchestration.

Conclusion

Code2MCP presents a practical, end‑to‑end solution for converting arbitrary GitHub repositories into MCP‑compatible services. By combining specialized agents with a self‑correcting Run‑Review‑Fix loop, it achieves a high success rate, shortens conversion time, and cuts manual effort. This work opens the vast GitHub ecosystem to MCP, paving the way for a richer, plug‑and‑play tool pool that can be readily consumed by LLM‑driven autonomous agents.