musicaiz.tokenizers.MMMTokenizerArguments

class musicaiz.tokenizers.MMMTokenizerArguments(prev_tokens: str = '', windowing: bool = True, time_unit: str = 'THIRTY_SECOND', num_programs: Optional[List[int]] = None, shuffle_tracks: bool = True, track_density: bool = False, window_size: int = 4, hop_length: int = 1, time_sig: bool = False, velocity: bool = False, quantize: bool = False, tempo: bool = True)
prev_tokens: str

tokens to insert after the PIECE_START token and before the first TRACK_START token (for example, for conditioning).

windowing: bool

if True, the method tokenizes each file by applying bar windowing.

time_unit: str

the note length in VALID_TIME_UNITS that one TIME_DELTA unit is equal to. This makes it possible to tokenize at a wide variety of note lengths for different purposes. Choose this value carefully: notes whose duration is shorter than the chosen time_unit value won't be tokenized.
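The effect of time_unit on short notes can be illustrated with a small sketch. It is not musicaiz's implementation: the UNIT_LENGTHS table and the time_delta_units helper are hypothetical, and the sketch assumes a duration is converted to TIME_DELTA units by flooring, so that a note shorter than one unit maps to zero units and is dropped.

```python
from fractions import Fraction

# Hypothetical lookup table: note lengths as fractions of a whole note.
# Only 'THIRTY_SECOND' appears in the documented signature; the other
# key is an assumption made for illustration.
UNIT_LENGTHS = {
    "SIXTEENTH": Fraction(1, 16),
    "THIRTY_SECOND": Fraction(1, 32),
}


def time_delta_units(duration: Fraction, time_unit: str = "THIRTY_SECOND") -> int:
    """Return how many whole time units fit in `duration` (floor division)."""
    return int(duration // UNIT_LENGTHS[time_unit])


# A quarter note spans 8 thirty-second units.
print(time_delta_units(Fraction(1, 4)))
# A sixty-fourth note is shorter than one unit, so it maps to 0 units
# and, under this sketch's assumption, would not be tokenized.
print(time_delta_units(Fraction(1, 64)))
```

A finer time_unit keeps more short notes at the cost of larger TIME_DELTA values for the same duration.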

num_programs: Optional[List[int]]

the list of MIDI programs to tokenize. If None, the method tokenizes all the tracks.

shuffle_tracks: bool

if True, shuffles the order of the tracks in each window (PIECE).

track_density: bool

if True, a DENSITY token is added at the beginning of each track.

window_size: int

the number of bars per track to tokenize.

hop_length: int

the number of bars to advance between consecutive windows when tokenizing. If a MIDI file contains 5 bars, the window size is 4 and the hop length is 1, the file is split into 2 PIECE windows: one from bar 1 to bar 4 and the other from bar 2 to bar 5 (similar to the overlapping frames of a short-time FFT in audio).
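The 5-bar case can be reproduced with a short sketch of the window arithmetic. The bar_windows helper is hypothetical and not part of musicaiz; it also assumes that only complete windows are kept, which the description above does not state.

```python
from typing import List, Tuple


def bar_windows(num_bars: int, window_size: int = 4, hop_length: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (first_bar, last_bar) ranges for each window."""
    if window_size < 1 or hop_length < 1:
        raise ValueError("window_size and hop_length must be positive")
    windows = []
    start = 1
    # Keep sliding while a full window still fits inside the file
    # (assumption: incomplete trailing windows are discarded).
    while start + window_size - 1 <= num_bars:
        windows.append((start, start + window_size - 1))
        start += hop_length
    return windows


print(bar_windows(5, window_size=4, hop_length=1))  # [(1, 4), (2, 5)]
```

With hop_length equal to window_size the windows do not overlap; smaller hops produce overlapping windows and therefore more PIECE samples per file.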

time_sig: bool

if True, the time signature is included in the samples. The time signature is added at the piece level, that is, before the first track starts.

velocity: bool

if True, velocity tokens are added. Velocities range between 1 and 128 (ints).

quantize: bool

if True, the symbolic music data is quantized for tokenizing.

__init__(prev_tokens: str = '', windowing: bool = True, time_unit: str = 'THIRTY_SECOND', num_programs: Optional[List[int]] = None, shuffle_tracks: bool = True, track_density: bool = False, window_size: int = 4, hop_length: int = 1, time_sig: bool = False, velocity: bool = False, quantize: bool = False, tempo: bool = True) → None

Methods

__init__([prev_tokens, windowing, ...])

save(args, out_dir[, file])

Saves the configs as a json file.
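As a rough illustration of what saving such a configuration as JSON can look like, the sketch below uses a stand-in dataclass. SketchArgs, save_sketch and the default file name are all hypothetical; they mirror the documented save(args, out_dir[, file]) call shape but are not the library's code, and the real output format may differ.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SketchArgs:
    # Hypothetical stand-in holding a few of the documented fields.
    window_size: int = 4
    hop_length: int = 1
    time_unit: str = "THIRTY_SECOND"
    tempo: bool = True


def save_sketch(args: SketchArgs, out_dir: str, file: str = "configs.json") -> Path:
    """Write the dataclass fields to `out_dir/file` as JSON and return the path."""
    out_path = Path(out_dir) / file
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w") as f:
        # asdict() turns the dataclass into a plain dict that json can serialize.
        json.dump(asdict(args), f, indent=2)
    return out_path
```

Storing the arguments next to a tokenized dataset makes it possible to recover the window size, hop length and time unit that produced it.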

Attributes

hop_length

num_programs

prev_tokens

quantize

shuffle_tracks

tempo

time_sig

time_unit

track_density

velocity

window_size

windowing