musecoco-text2midi-service

🎵 MuseCoco Text-to-MIDI Service

The MuseCoco Text-to-MIDI Service is a refactored version of the MuseCoco repository, packaged as a deployable service module. It adapts to new data, manages its checkpoint history, and hides the underlying implementation behind a simple interface for generating MIDI files from text input. The code is commented in detail to make the codebase easier to navigate and understand.

✨ Features

💻 System Requirements

This service requires a CUDA-compatible NVIDIA GPU. Development and testing were performed with:

Minimum Requirements:

📂 Directory Structure

The repository is organized as follows:

musecoco-text2midi-service/
├── src/
│   └── musecoco_text2midi_service/
│       ├── control/                   # Controllers for orchestrating service logic
│       │   ├── __init__.py
│       │   ├── _musecoco/             # MuseCoco model implementation
│       │   │   ├── attribute2music_dataprepare/
│       │   │   ├── attribute2music_model/
│       │   │   ├── evaluation/
│       │   │   ├── text2attribute_dataprepare/
│       │   │   ├── text2attribute_model/
│       │   │   ├── __init__.py
│       │   │   └── view.py
│       │   └── _text2midi.py
│       ├── dao/                       # Data Access Objects for configuration management
│       │   ├── __init__.py
│       │   └── _config_manager.py
│       ├── model/                     # Models representing the structure and workflow of MIDI generation
│       │   ├── __init__.py
│       │   └── _config_model.py
│       ├── utils/                     # Utility functions for common tasks
│       │   ├── __init__.py
│       │   └── _watch_dog.py
│       ├── view/                      # Views for API or CLI outputs
│       │   └── __init__.py
│       ├── __init__.py
│       └── main.py                    # CLI entry point for the service
├── storage/
│   ├── checkpoints/                   # Model checkpoints
│   │   └── linear_mask-1billion/
│   │       ├── checkpoint_2_280000.pt
│   │       └── README.md              # Instructions for managing checkpoints
│   ├── config/                        # Configuration files
│   │   ├── main_config.yaml           # Main configuration file
│   │   ├── att_key.json
│   │   └── num_labels.json
│   ├── data/                          # Training/evaluation data
│   ├── generation/                    # Generated output files
│   ├── input/                         # Input files for predictions
│   │   ├── predict_backup.json        # Example input format for predictions
│   │   └── predict.json
│   ├── log/                           # Log files
│   └── tmp/                           # Temporary files and outputs
├── tests/
│   └── test_text2midi.py              # Test modules for various components
├── docs/
│   └── openapi.yaml                   # OpenAPI specification for the REST API
├── .gitignore                         # Specifies files and directories to ignore in version control
├── .python-version                    # Python version specification
├── fastapi_server.py                  # FastAPI REST API server
├── inference.ipynb                    # Jupyter notebook for interactive inference
├── LICENSE                            # License file
├── pyproject.toml                     # Project metadata and dependencies (uv/pip)
├── README.md                          # Project description and instructions
└── uv.lock                            # Locked dependencies for reproducible builds

βš™οΈ Installation

To install the MuseCoco Text-to-MIDI Service, follow these steps:

  1. Clone the Repository:

    git clone https://github.com/yhbcode000/musecoco-text2midi-service.git
    cd musecoco-text2midi-service
    
  2. Install Dependencies with uv (GPU Required):

    This project uses uv for fast, reliable Python package management and requires a CUDA-compatible GPU. Install uv if you haven't already:

    # Install uv (if not already installed)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    

    Important - Check Your CUDA Version First:

    # Check your system CUDA version
    nvcc --version
    

    Two-Step Installation (required due to pytorch-fast-transformers build dependency):

    # Step 1: Install base dependencies (includes PyTorch)
    uv sync
    
    # Step 2: Install pytorch-fast-transformers (requires torch to be installed first)
    uv pip install pytorch-fast-transformers --no-build-isolation
    

    Note: pytorch-fast-transformers must be installed separately because it requires PyTorch to be present during its build process.
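The CUDA version check in step 2 can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `parse_cuda_release` and `detect_cuda_release` are hypothetical, and the sketch assumes that `nvcc --version` prints a line containing `release X.Y`, as current CUDA toolkits do.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'X.Y' release number from `nvcc --version` output, if present."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release() -> Optional[str]:
    """Return the CUDA toolkit release reported by nvcc, or None if nvcc is missing."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = detect_cuda_release()
    print(f"CUDA toolkit release: {release or 'nvcc not found'}")
```

Running the script before `uv sync` gives an early signal when the CUDA toolkit is not on the PATH.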

🔧 Configuration

The service uses YAML configuration files located in storage/config/. The main configuration file is main_config.yaml, which is used by the main.py script. You can modify these files to configure parameters such as model checkpoints, logging settings, and API keys.

Obtain checkpoints by following the instructions in storage/checkpoints/linear_mask-1billion/README.md, and save them in the same directory as that README.md file.
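Before starting the service, it can help to confirm that a checkpoint file is actually in place. The sketch below is an illustration only: `find_checkpoints` is a hypothetical helper, not a function of this package, and it assumes checkpoints use the `.pt` extension shown in the directory structure.

```python
from pathlib import Path
from typing import List


def find_checkpoints(checkpoint_dir: str) -> List[Path]:
    """Return all .pt files found under checkpoint_dir, sorted by path."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        # A missing directory is reported as "no checkpoints" rather than an error.
        return []
    return sorted(root.rglob("*.pt"))


if __name__ == "__main__":
    found = find_checkpoints("storage/checkpoints")
    if found:
        for path in found:
            print(f"found checkpoint: {path}")
    else:
        print("no .pt checkpoint found under storage/checkpoints")
```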

🚀 Usage

Option 1: Command-Line Demo

To run the terminal-based demo application:

python src/musecoco_text2midi_service/main.py

The src/musecoco_text2midi_service/main.py file provides a terminal-based app demo.

Refer to storage/input/predict_backup.json for examples of acceptable input formats for the service. This file contains sample data that illustrates how to structure text input for the MIDI generation process.

Option 2: FastAPI REST API Server

To start the FastAPI REST API server on port 8001:

# Using uv to run the FastAPI server
uv run python fastapi_server.py --port 8001

# Or activate the environment first
source .venv/bin/activate  # On Linux/macOS
python fastapi_server.py --port 8001

The server accepts the following arguments:

API Endpoints

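The FastAPI server provides REST API endpoints, including those used in the examples below:

  - POST /submit-text: submit a text description and receive a jobId
  - GET /check-status/{jobId}: check the status of a job
  - GET /get-result/{jobId}: get the status and metadata of a job
  - GET /download-midi/{jobId}: download the generated MIDI file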

Interactive API Documentation

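FastAPI provides automatic interactive API documentation. By default, FastAPI serves Swagger UI at /docs and ReDoc at /redoc (for example, http://localhost:8001/docs), unless the server configuration overrides these paths.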

Example API Usage

# Submit a text for MIDI generation
curl -X POST http://localhost:8001/submit-text \
  -H "Content-Type: application/json" \
  -d '{"text": "This music uses a major key, with grand piano and cello, conveying edginess."}'

# Response:
# {
#   "jobId": "abc-123",
#   "status": "submitted",
#   "message": "Job submitted successfully. Use the job_id to check status."
# }

# Check job status
curl http://localhost:8001/check-status/abc-123
# Response: {"jobId": "abc-123", "status": "completed"}

# Get result metadata
curl http://localhost:8001/get-result/abc-123
# Response:
# {
#   "jobId": "abc-123",
#   "status": "completed",
#   "metaData": {...}
# }

# Download MIDI file
curl -O http://localhost:8001/download-midi/abc-123

Python Client Example

import time

import requests

BASE_URL = "http://localhost:8001"

# Submit job
response = requests.post(
    f"{BASE_URL}/submit-text",
    json={"text": "A peaceful piano melody in C major"},
)
response.raise_for_status()  # Fail early on HTTP errors
job_id = response.json()["jobId"]

# Poll for completion
while True:
    status = requests.get(f"{BASE_URL}/check-status/{job_id}")
    status.raise_for_status()
    if status.json()["status"] == "completed":
        break
    time.sleep(1)

# Download MIDI file
midi_file = requests.get(f"{BASE_URL}/download-midi/{job_id}")
midi_file.raise_for_status()
with open("output.mid", "wb") as f:
    f.write(midi_file.content)
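The polling loop in the client example runs until the job reports "completed", so it never returns if a job fails or stalls. One possible way to bound the wait is sketched below. The helper `wait_for_job` and its parameters are hypothetical and not part of this service. The status lookup is passed in as a callable so the sketch runs without a server.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 1.0,
    done_status: str = "completed",
) -> bool:
    """Poll get_status() until it returns done_status or timeout_s elapses.

    Returns True if the job completed, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == done_status:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With the requests-based client, `get_status` could be a lambda that calls the check-status endpoint and returns the "status" field of the JSON response.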

Option 3: Python Package Import

You can also import the package into your own Python project:

from musecoco_text2midi_service.control import Text2Midi
from musecoco_text2midi_service.dao import load_config_from_file

config = load_config_from_file("storage/config/main_config.yaml")
text2midi = Text2Midi(config)

input_text = "This music's use of major key creates a distinct atmosphere, with a playtime of 1 ~ 15 seconds. The rhythm in this song is very pronounced, and the music is enriched by grand piano, cello and drum. Overall, the song's length is around about 6 bars. The music conveys edginess."

midi_data, meta_data = text2midi.text_to_midi(input_text, return_midi=True)

🧪 Running Tests

To run the test suite, use:

pytest tests/

This command runs all test cases in the tests directory and reports the results. Make sure the installation steps have completed successfully before running the tests.

🤝 Contributing

We welcome contributions from the community. Please follow these steps to contribute:

  1. Fork the repository.
  2. Create a new branch for your feature or bugfix.
  3. Commit your changes with descriptive commit messages.
  4. Push your changes to your forked repository.
  5. Create a pull request with a detailed description of your changes.

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Reference:

Recommendation

Notice


Thank you for your interest in the MuseCoco Text-to-MIDI Service! If you have any questions or need further assistance, feel free to open an issue or contact us.