The MuseCoco Text-to-MIDI Service is a refactored version of the MuseCoco repository, designed as a deployable service module. This service adapts to new data, manages the history of its checkpoints, and abstracts away the underlying implementation details to provide a seamless interface for generating MIDI files from textual inputs. Detailed comments are included to facilitate easy navigation and understanding of the codebase.
This service requires a CUDA-compatible NVIDIA GPU. Development and testing were performed with:
Minimum Requirements:
The repository is organized as follows:
musecoco-text2midi-service/
├── src/
│   └── musecoco_text2midi_service/
│       ├── control/  # Controllers for orchestrating service logic
│       │   ├── __init__.py
│       │   ├── _musecoco/  # MuseCoco model implementation
│       │   │   ├── attribute2music_dataprepare/
│       │   │   ├── attribute2music_model/
│       │   │   ├── evaluation/
│       │   │   ├── text2attribute_dataprepare/
│       │   │   ├── text2attribute_model/
│       │   │   ├── __init__.py
│       │   │   └── view.py
│       │   └── _text2midi.py
│       ├── dao/  # Data Access Objects for configuration management
│       │   ├── __init__.py
│       │   └── _config_manager.py
│       ├── model/  # Models representing the structure and workflow of MIDI generation
│       │   ├── __init__.py
│       │   └── _config_model.py
│       ├── utils/  # Utility functions for common tasks
│       │   ├── __init__.py
│       │   └── _watch_dog.py
│       ├── view/  # Views for API or CLI outputs
│       │   └── __init__.py
│       ├── __init__.py
│       └── main.py  # CLI entry point for the service
├── storage/
│   ├── checkpoints/  # Model checkpoints
│   │   └── linear_mask-1billion/
│   │       ├── checkpoint_2_280000.pt
│   │       └── README.md  # Instructions for managing checkpoints
│   ├── config/  # Configuration files
│   │   ├── main_config.yaml  # Main configuration file
│   │   ├── att_key.json
│   │   └── num_labels.json
│   ├── data/  # Training/evaluation data
│   ├── generation/  # Generated output files
│   ├── input/  # Input files for predictions
│   │   ├── predict_backup.json  # Example input format for predictions
│   │   └── predict.json
│   ├── log/  # Log files
│   └── tmp/  # Temporary files and outputs
├── tests/
│   └── test_text2midi.py  # Test modules for various components
├── docs/
│   └── openapi.yaml  # OpenAPI specification for the REST API
├── .gitignore  # Specifies files and directories to ignore in version control
├── .python-version  # Python version specification
├── fastapi_server.py  # FastAPI REST API server
├── inference.ipynb  # Jupyter notebook for interactive inference
├── LICENSE  # License file
├── pyproject.toml  # Project metadata and dependencies (uv/pip)
├── README.md  # Project description and instructions
└── uv.lock  # Locked dependencies for reproducible builds
To install the MuseCoco Text-to-MIDI Service, follow these steps:
Clone the Repository:
git clone https://github.com/yhbcode000/musecoco-text2midi-service.git
cd musecoco-text2midi-service
Install Dependencies with uv (GPU Required):
This project uses uv for fast, reliable Python package management and requires a CUDA-compatible GPU. Install uv if you haven't already:
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
Important - Check Your CUDA Version First:
# Check your system CUDA version
nvcc --version
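The version check above can also be scripted. The following is a minimal sketch, assuming that nvcc prints a "release X.Y" token in its version banner; the function names parse_nvcc_release and detect_cuda_version are illustrative and are not part of this repository.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' token in nvcc output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_version() -> Optional[Tuple[int, int]]:
    """Run `nvcc --version` if nvcc is on PATH; return None otherwise."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    version = detect_cuda_version()
    if version is None:
        print("nvcc not found or version not recognized")
    else:
        print(f"CUDA toolkit {version[0]}.{version[1]}")
```

Keeping the parsing separate from the subprocess call makes the check easy to exercise on machines without a CUDA toolkit installed.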
Two-Step Installation (required due to pytorch-fast-transformers build dependency):
# Step 1: Install base dependencies (includes PyTorch)
uv sync
# Step 2: Install pytorch-fast-transformers (requires torch to be installed first)
uv pip install pytorch-fast-transformers --no-build-isolation
Note: pytorch-fast-transformers must be installed separately because it requires PyTorch to be present during its build process.
The service uses YAML configuration files located in storage/config/. The main configuration file is main_config.yaml, which is used by the main.py script. You can modify these files to configure parameters such as model checkpoints, logging settings, and API keys.
Obtain the model checkpoints by following the instructions in storage/checkpoints/linear_mask-1billion/README.md, and save them in that same directory.
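Before starting the service, it can help to confirm that the configuration and checkpoint files are in place. The sketch below uses the paths from the repository layout shown earlier; treating all four files as required at startup is an assumption, and find_missing_files is a hypothetical helper, not part of the package.

```python
from pathlib import Path
from typing import List

# Paths taken from the repository layout in this README.
# Assumption: all of these must exist before the service starts.
REQUIRED_FILES = (
    "storage/config/main_config.yaml",
    "storage/config/att_key.json",
    "storage/config/num_labels.json",
    "storage/checkpoints/linear_mask-1billion/checkpoint_2_280000.pt",
)


def find_missing_files(repo_root: str = ".") -> List[str]:
    """Return the required files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected files are present.")
```

Run it from the repository root so that the relative paths resolve against the storage/ directory.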
To run the terminal-based demo application:
python src/musecoco_text2midi_service/main.py
The src/musecoco_text2midi_service/main.py file provides a terminal-based app demo.
Refer to storage/input/predict_backup.json for examples of acceptable input formats for the service. This file contains sample data that illustrates how to structure text input for the MIDI generation process.
To start the FastAPI REST API server on port 8001:
# Using uv to run the FastAPI server
uv run python fastapi_server.py --port 8001
# Or activate the environment first
source .venv/bin/activate # On Linux/macOS
python fastapi_server.py --port 8001
The server accepts the following arguments:
--host - Host address to bind (default: 0.0.0.0)
--port - Port number (default: 8001)
--reload - Enable auto-reload for development
--workers - Number of worker processes (default: 1)
The FastAPI server provides the following REST API endpoints:
/ - API information and documentation links
/health - Health check endpoint
/submit-text - Submit text for MIDI generation (returns a job ID)
/check-status/{job_id} - Check the status of a MIDI generation job
/get-result/{job_id} - Get the metadata of a completed MIDI generation
/download-midi/{job_id} - Download the generated MIDI file
FastAPI provides automatic interactive API documentation:
http://localhost:8001/docs - Interactive API explorer with request/response examples
http://localhost:8001/redoc - Clean, responsive API documentation
# Submit a text for MIDI generation
curl -X POST http://localhost:8001/submit-text \
-H "Content-Type: application/json" \
-d '{"text": "This music uses a major key, with grand piano and cello, conveying edginess."}'
# Response:
# {
# "jobId": "abc-123",
# "status": "submitted",
# "message": "Job submitted successfully. Use the job_id to check status."
# }
# Check job status
curl http://localhost:8001/check-status/abc-123
# Response: {"jobId": "abc-123", "status": "completed"}
# Get result metadata
curl http://localhost:8001/get-result/abc-123
# Response:
# {
# "jobId": "abc-123",
# "status": "completed",
# "metaData": {...}
# }
# Download MIDI file
curl -O http://localhost:8001/download-midi/abc-123
import time

import requests

# Submit job
response = requests.post(
    "http://localhost:8001/submit-text",
    json={"text": "A peaceful piano melody in C major"},
)
response.raise_for_status()  # Stop early on an HTTP error
job_id = response.json()["jobId"]

# Poll for completion (loops until the job reports "completed")
while True:
    status = requests.get(f"http://localhost:8001/check-status/{job_id}")
    if status.json()["status"] == "completed":
        break
    time.sleep(1)

# Download MIDI file
midi_file = requests.get(f"http://localhost:8001/download-midi/{job_id}")
midi_file.raise_for_status()
with open("output.mid", "wb") as f:
    f.write(midi_file.content)
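The polling loop in the client example runs until the job reports "completed", so a job that never finishes would block forever. One way to bound the wait is sketched below. The helper wait_for_job and its parameters are hypothetical; the only status values documented here are "submitted" and "completed", so any failure status names must be supplied by the caller.

```python
import time
from typing import Callable, Sequence


def wait_for_job(
    fetch_status: Callable[[str], str],
    job_id: str,
    timeout_s: float = 300.0,
    interval_s: float = 1.0,
    failed_statuses: Sequence[str] = (),
) -> str:
    """Poll fetch_status(job_id) until it returns 'completed'.

    Raises TimeoutError once timeout_s has elapsed, and RuntimeError if the
    status is one of failed_statuses (empty by default, because the
    service's failure status names are not documented in this README).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status(job_id)
        if status == "completed":
            return status
        if status in failed_statuses:
            raise RuntimeError(f"Job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not completed after {timeout_s}s")
        time.sleep(interval_s)
```

With the requests library, fetch_status could be a small function that calls the /check-status/{job_id} endpoint and returns the "status" field of the JSON response. Passing the fetcher in as a callable keeps the timeout logic independent of the HTTP client.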
You can also import the package into your own Python project:
from musecoco_text2midi_service.control import Text2Midi
from musecoco_text2midi_service.dao import load_config_from_file
config = load_config_from_file("storage/config/main_config.yaml")
text2midi = Text2Midi(config)
input_text = "This music's use of major key creates a distinct atmosphere, with a playtime of 1 ~ 15 seconds. The rhythm in this song is very pronounced, and the music is enriched by grand piano, cello and drum. Overall, the song's length is around about 6 bars. The music conveys edginess."
midi_data, meta_data = text2midi.text_to_midi(input_text, return_midi=True)
To run the test suite, use:
pytest tests/
This command runs all test cases in the tests directory and reports the results. Make sure the project dependencies are installed (see the installation steps above) before running the tests.
We welcome contributions from the community. Please follow these steps to contribute:
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
pytorch-fast-transformers requires a two-step installation process (see Installation section)
Thank you for your interest in the MuseCoco Text-to-MIDI Service! If you have any questions or need further assistance, feel free to open an issue or contact us.