The general graph processing/training pipeline for symbolic music scores in GraphMuse involves the following steps:
- Preprocess the database of scores to generate input graphs; GraphMuse can do this for you quickly and easily;
- Sample the input graphs to create memory-efficient batches; once more, GraphMuse has got your back;
- Form a batch as a new graph with nodes and edges from various sampled input graphs. For each graph, a set of nodes is selected, which we call target nodes. The neighbors of the target nodes can also be fetched on demand in a process called node-wise sampling;
- Update the target nodes’ representations through graph convolution to create node embeddings. GraphMuse provides some models that you can use; otherwise, PyTorch Geometric can also be your friend;
- Use these embeddings for task-specific applications. This part is on you, but I’m sure you can make it!
Note that target nodes may include all or a subset of the batch nodes, depending on the sampling strategy.
Now that the process is graphically explained, let’s take a closer look at how GraphMuse handles sampling notes from each score.
Sampling process per score.
- A randomly selected note (in yellow) is first sampled.
- The boundaries of the target notes are then computed with a budget of 15 notes in this example (pink and yellow notes).
- Then the k-hop neighbors are fetched for the targets (light blue for 1-hop and darker blue for 2-hop). The k-hop neighbors are computed with respect to the input graph (depicted with colored edges connecting noteheads in the figure above).
- We can also extend the sampling process to the beat and measure elements. Note that the k-hop neighbors need not be strictly related to a time window.
To make the most of the available computational resources (i.e. memory), the above process is repeated for many scores at once to create one batch. Using this process, GraphMuse ensures that every sampled segment has the same number of target notes. All sampled segments are then combined into a new graph, which will be of size at most #_scores x #_target_notes. This new graph constitutes the batch for the current training step.
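The per-score sampling steps can be sketched in a few lines of plain Python. This is only an illustration of the idea, not GraphMuse's implementation: the helper names `sample_targets` and `k_hop_neighbors`, the toy edge list, and the choice to treat edges as undirected are all assumptions made for this example. It also fetches every neighbor at each hop, whereas a neighbor loader typically caps how many neighbors are sampled per node.

```python
import random


def sample_targets(num_notes, budget, rng):
    """Pick a random note, then return a contiguous window of `budget` target notes."""
    start = rng.randrange(num_notes)
    # Clamp the window so that it stays inside the score
    start = min(start, max(0, num_notes - budget))
    return list(range(start, min(start + budget, num_notes)))


def k_hop_neighbors(targets, edges, num_hops):
    """Return, for each hop, the nodes first reached from the targets along graph edges."""
    adjacency = {}
    for src, dst in edges:
        # Assumption for this sketch: edges are followed in both directions
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    seen = set(targets)
    frontier = set(targets)
    hops = []
    for _ in range(num_hops):
        reached = set()
        for node in frontier:
            reached |= adjacency.get(node, set())
        reached -= seen  # keep only nodes not sampled in an earlier step
        hops.append(sorted(reached))
        seen |= reached
        frontier = reached
    return hops


# Toy "score" with 40 notes: each note is linked to the next note and to the note three steps ahead
num_notes = 40
edges = [(i, i + 1) for i in range(num_notes - 1)] + [(i, i + 3) for i in range(num_notes - 3)]

rng = random.Random(0)
targets = sample_targets(num_notes, budget=15, rng=rng)
hop1, hop2 = k_hop_neighbors(targets, edges, num_hops=2)

# Upper bound on the number of target notes in a batch built from several scores
num_scores, num_targets = 16, 15
max_batch_targets = num_scores * num_targets
print(len(targets), hop1, hop2, max_batch_targets)
```

Because every score contributes the same number of target notes, the number of targets in a batch is bounded by the product of the two settings, which is what keeps memory use predictable.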
For the hands-on part, let’s try to use GraphMuse and use a model for pitch spelling. The pitch spelling task is about inferring the note name and accidentals when they are absent from the score. An example of this application is when we have a quantized MIDI file and want to create a score such as the example in the figure below:
Before installing GraphMuse, you will need to install PyTorch and PyTorch Geometric. Check the appropriate version for your system on the official PyTorch and PyTorch Geometric installation pages.
After this step, to install GraphMuse, open your preferred terminal and type:
pip install graphmuse
After installation, let’s read a MIDI file from a URL and create the score graph with GraphMuse.
import graphmuse as gm

midi_url_raw = "https://github.com/CPJKU/partitura/raw/refs/heads/main/tests/data/midi/bach_midi_score.mid"
graph = gm.load_midi_to_graph(midi_url_raw)
The underlying process reads the file with Partitura and then feeds it through GraphMuse.
To train our model to handle pitch spelling, we first need a dataset of musical scores where the pitch spelling has already been annotated. For this, we’ll be using the ASAP Dataset (licensed under CC BY-NC-SA 4.0), which will serve as the foundation for our model’s learning. To get the ASAP Dataset, you can download it using git or directly from GitHub:
git clone https://github.com/cpjku/asap-dataset.git
The ASAP dataset includes scores and performances of various classical piano pieces. For our use case, we will use only the scores, which end in .musicxml.
As we load this dataset, we’ll need two essential utilities: one to encode pitch spelling and another to handle key signature information, both of which will be converted into numerical labels. Fortunately, these utilities are available within the pre-built pitch spelling model in GraphMuse. Let’s begin by importing all the necessary packages and loading the first score to get started.
import graphmuse as gm
import partitura as pt
import os
import torch
import numpy as np

# Directory containing the dataset, change this to the location of your dataset
dataset_dir = "/your/path/to/the/asap-dataset"
# Find all the score files in the dataset (they are all named 'xml_score.musicxml')
score_files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(dataset_dir) for f in filenames if f == 'xml_score.musicxml']
# Use the first 30 scores, change this number to use more or fewer scores
score_files = score_files[:30]
# Probe the first score file
score = pt.load_score(score_files[0])
# Extract features and note array
features, f_names = gm.utils.get_score_features(score)
na = score.note_array(include_pitch_spelling=True, include_key_signature=True)
# Create a graph from the score features
graph = gm.create_score_graph(features, score.note_array())
# Get input feature size and metadata from the first graph
in_feats = graph["note"].x.shape[1]
metadata = graph.metadata()
# Create a model for pitch spelling prediction
model = gm.nn.models.PitchSpellingGNN(
    in_feats=in_feats, n_hidden=128, out_feats_enc=64, n_layers=2, metadata=metadata, add_seq=True
)
# Create encoders for pitch and key signature labels
pe = model.pitch_label_encoder
ke = model.key_label_encoder
Next, we’ll load the remaining score files from the dataset to continue preparing our data for model training.
# Add encoded labels to the first graph, which was created before the encoders were available
graph["note"].y_pitch = torch.from_numpy(pe.encode(na)).long()
graph["note"].y_key = torch.from_numpy(ke.encode(na)).long()
# Initialize the list of graphs with the first graph
graphs = [graph]
# Process each remaining score file
for score_file in score_files[1:]:
    # Load the score
    score = pt.load_score(score_file)
    # Extract features and note array
    features, f_names = gm.utils.get_score_features(score)
    na = score.note_array(include_pitch_spelling=True, include_key_signature=True)
    # Encode pitch and key signature labels
    labels_pitch = pe.encode(na)
    labels_key = ke.encode(na)
    # Create a graph from the score features
    graph = gm.create_score_graph(features, score.note_array())
    # Add encoded labels to the graph
    graph["note"].y_pitch = torch.from_numpy(labels_pitch).long()
    graph["note"].y_key = torch.from_numpy(labels_key).long()
    # Append the graph to the list
    graphs.append(graph)
Once the graph structures are ready, we can move on to creating the data loader, which is conveniently provided by GraphMuse. At this stage, we’ll also define standard training components like the loss function and optimizer to guide the learning process.
# Create a DataLoader to sample subgraphs from the graphs
loader = gm.loader.MuseNeighborLoader(graphs, subgraph_size=100, batch_size=16, num_neighbors=[3, 3])
# Define loss functions for pitch and key prediction
loss_pitch = torch.nn.CrossEntropyLoss()
loss_key = torch.nn.CrossEntropyLoss()
# Define the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
Let me comment a bit more on gm.loader.MuseNeighborLoader.
This is the core dataloader in GraphMuse and it contains the sampling that was explained in the previous section. subgraph_size refers to the number of target nodes per input graph, batch_size is the number of sampled graphs per batch, and finally, num_neighbors refers to the number of neighbors sampled per sampled node in each layer.
With everything in place, we’re finally ready to train the model. So, let’s dive in and start the training process!
# Train the model for 5 epochs
for epoch in range(5):
    loss = 0
    i = 0
    for batch in loader:
        # Zero the gradients
        optimizer.zero_grad()
        # Get neighbor masks for nodes and edges for more efficient training
        neighbor_mask_node = {k: batch[k].neighbor_mask for k in batch.node_types}
        neighbor_mask_edge = {k: batch[k].neighbor_mask for k in batch.edge_types}
        # Forward pass through the model
        pred_pitch, pred_key = model(
            batch.x_dict, batch.edge_index_dict, neighbor_mask_node, neighbor_mask_edge,
            batch["note"].batch[batch["note"].neighbor_mask == 0]
        )
        # Compute loss for pitch and key prediction on the target notes only
        loss_pitch_val = loss_pitch(pred_pitch, batch["note"].y_pitch[batch["note"].neighbor_mask == 0])
        loss_key_val = loss_key(pred_key, batch["note"].y_key[batch["note"].neighbor_mask == 0])
        # Total loss
        loss_val = loss_pitch_val + loss_key_val
        # Backward pass and optimization
        loss_val.backward()
        optimizer.step()
        # Accumulate loss
        loss += loss_val.item()
        i += 1
    # Print average loss for the epoch
    print(f"Epoch {epoch} Loss {loss / i}")
Hopefully, we’ll soon see the loss decreasing, a positive sign that our model is effectively learning how to perform pitch spelling. Fingers crossed!
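A detail of the training loop worth spelling out is the neighbor_mask == 0 selection: the loss is computed only on the target notes, while the sampled neighbors serve as context. Here is a minimal sketch of that selection using plain Python lists instead of tensors. The toy values are made up, and the use of 1 and 2 to mark neighbors is an assumption for illustration; the loop itself only relies on target notes carrying the value 0.

```python
# Hypothetical batch of 6 notes: 0 marks a target note, larger values mark sampled neighbors (assumed)
neighbor_mask = [0, 0, 1, 0, 2, 1]
# One encoded pitch-spelling label per note in the batch
labels = [3, 7, 1, 3, 0, 5]
# Hypothetical predicted classes, one per target note, in batch order
target_preds = [3, 2, 3]

# Keep only the positions of the target notes, mirroring `neighbor_mask == 0`
target_idx = [i for i, m in enumerate(neighbor_mask) if m == 0]
target_labels = [labels[i] for i in target_idx]

# Accuracy on target notes only; neighbor notes never contribute
accuracy = sum(p == y for p, y in zip(target_preds, target_labels)) / len(target_idx)
print(target_idx, target_labels, accuracy)
```

The same masking idea applies if you track accuracy alongside the loss: compare predictions only against the labels of the target notes.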
GraphMuse is a framework that tries to make the training and deployment of graph models for symbolic music processing easier.
For those who want to retrain, deploy, or finetune previous state-of-the-art models for symbolic music analysis, GraphMuse contains some of the necessary components to rebuild and retrain your model faster and more efficiently.
GraphMuse retains its flexibility through its simplicity, for those who want to prototype, innovate, and design new models. It aims to provide a simple set of utilities rather than complex chained pipelines that can block the innovation process.
For those who want to learn, visualize, and get hands-on experience, GraphMuse is a great way to get started. It offers an easy introduction to basic functions and pipelines with just a few lines of code. GraphMuse is also linked with MusGViz, which allows graphs and scores to be easily visualized together.
We cannot talk about the positive aspects of any project without discussing the negative ones as well.
GraphMuse is a newborn project and, in its current state, it is quite simple. It is focused on covering the essential parts of graph learning rather than being a holistic framework that covers all possibilities. Therefore, it still relies a lot on user-based implementation for many parts of the aforementioned pipeline.
Like every open-source project in development, GraphMuse needs help to grow. So please, if you find bugs or want more features, don’t hesitate to report, request, or contribute to the GraphMuse GitHub project.
Last but not least, GraphMuse uses C libraries such as torch-sparse and torch-scatter and has its own C bindings to accelerate graph creation; therefore, installation is not always straightforward. The Windows installation is more challenging, judging from our user testing and user interaction reports, although not impossible (I am running it on Windows myself).
Future plans include:
- Making installation easier;
- Adding more support for models and dataloaders for specific tasks;
- Growing the open-source community around GraphMuse to keep graph coding for music growing.
GraphMuse is a Python library that makes working with music graphs a little bit easier. It focuses on the training aspect of graph-based models for music but aims to retain flexibility when research-based projects require it.
If you would like to support the development and future growth of GraphMuse, please star the repo on GitHub.
Happy graph coding!!!
[all images are by the author]