musicaiz.models.transformer_composers

Transformer Composers

This submodule provides a GPT2 model that generates music.

Tokenization is done beforehand with musicaiz's MMMTokenizer() class.
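
A minimal tokenization sketch. The exact MMMTokenizer API may differ between musicaiz versions, so treat the constructor argument and the tokenize_file() method as assumptions and check the musicaiz.tokenizers documentation:

from musicaiz.tokenizers import MMMTokenizer

# Tokenize a single MIDI file into MMM tokens (method name assumed;
# verify against your installed musicaiz version).
tokenizer = MMMTokenizer("path/to/file.mid")
tokens = tokenizer.tokenize_file()

# Write the tokens to a txt file so the dataloaders below can read them.
with open("dataset/file.txt", "w") as f:
    f.write(tokens)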

Installation

To train these models you should install PyTorch with CUDA support. We recommend torch 1.11.0 with CUDA 11.3:

>>> pip3 install torch==1.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
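
After installing, you can check that PyTorch sees your GPU:

>>> python -c "import torch; print(torch.__version__, torch.cuda.is_available())"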

Apart from that, NVIDIA Apex is also required. To install it properly, follow the instructions at https://github.com/NVIDIA/apex

Configurations

GPTConfigs()

...

TrainConfigs()

Dataloaders

build_torch_loaders(dataset_path, ...[, ...])

Builds the train and validation dataloaders.

get_vocabulary(dataset_path)

Reads a txt file and retrieves the vocabulary.
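
A hedged sketch of combining the two helpers above, assuming both are importable from this package; any keyword argument other than dataset_path is an assumption, so adapt the calls to the actual signatures:

from musicaiz.models.transformer_composers import build_torch_loaders, get_vocabulary

dataset_path = "path/to/tokenized/dataset"

# Read the tokenized txt data and build the vocabulary.
vocabulary = get_vocabulary(dataset_path)

# Build the train and validation dataloaders (unpacking into two
# loaders follows the description above and is otherwise assumed).
train_loader, val_loader = build_torch_loaders(dataset_path)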

Model

self_attention(query, key, value, device[, ...])

MultiheadAttention(n_heads, embed_dim, device)

PositionalEncoding(d_model[, dropout, max_len])

Embedding(vocab_size, embedding_dim, device)

ResidualConnection(size, dropout)

FeedForward(dim[, mult, dropout])

Decoder(d_model, n_head, device[, causal, ...])

GPT2(vocab_size, embedding_dim, n_decoders, ...)
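
For reference, the self_attention helper listed above computes standard scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, optionally with a causal mask. The following is a standalone PyTorch sketch of that computation, not the module's exact code:

import math
import torch

def scaled_dot_product_attention(query, key, value, causal=False):
    # query, key, value: (batch, heads, seq_len, d_k)
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    if causal:
        # Mask out future positions so each token only attends to itself
        # and the tokens before it.
        seq_len = scores.size(-1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, device=scores.device), diagonal=1
        ).bool()
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ value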

Train

train(dataset_path[, sequence_length, ...])

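A hedged sketch of calling the training entry point from Python; keyword arguments beyond dataset_path are not shown because only their names are documented above. See also the train.py CLI in the Examples section below:

from musicaiz.models.transformer_composers import train  # import path assumed

# Train the GPT2 model on a tokenized dataset.
train("path/to/tokenized/dataset")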

Generation

sample_sequence(dataset_path[, ...])

This function generates a sequence from a pretrained model.
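
A hedged sketch of sampling from a pretrained model in Python; arguments other than dataset_path are omitted because only their names are documented above. See also the generate.py CLI in the Examples section below:

from musicaiz.models.transformer_composers import sample_sequence  # import path assumed

# Generate a token sequence from a pretrained checkpoint trained on
# the tokenized dataset at dataset_path.
tokens = sample_sequence("path/to/tokenized/dataset")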

Gradio App

There’s a simple app for this model built with Gradio. To try the demo locally, run:

>>> python models/transformer_composers/app.py

Examples

Train a model:

>>> python models/transformer_composers/train.py --dataset_path="..." --is_splitted True

Generate a sequence:

>>> python models/transformer_composers/generate.py --dataset_path H:/GitHub/musicaiz-datasets/jsbchorales/mmm/all_bars --dataset_name jsbchorales --save_midi True --file_path ../midi