mftools package
Submodules
mftools.barcodes module
A collection of functions to work with the barcodes decoded by MERlin.
Functions
- process_merlin_barcodes
Determines error bit/type for barcodes.
- expand_codebook
Creates a new codebook with additional barcodes representing all possible single bit flips for all genes. Used to assign error correction information to barcodes.
- normalize_codebook
Returns a codebook with L2 normalized barcodes.
- set_barcode_stats
Adds barcode statistics to the global stats object (see stats.py).
- make_table
Given a MERlin analysis folder and codebook, creates a table of all decoded barcodes with error correction information.
- calculate_global_coordinates
Adds the global coordinates to a barcode table.
- assign_to_cells
Determines the cell IDs for each barcode.
- link_cell_ids
Renames cell IDs to unify overlapping cells in adjacent FOVs.
- mark_barcodes_in_overlaps
Sets the status of barcodes to “edge” for those barcodes which are in the overlapping regions of FOVs.
- create_cell_by_gene_table
Given a barcode table with assigned cell IDs, returns a cell by gene matrix. Barcodes marked with “edge” status are not counted.
- count_unfiltered_barcodes
Gets the number of barcodes MERlin decoded before applying adaptive filtering.
- get_per_bit_stats
Calculates the per-bit error rates for a given gene.
- get_per_gene_error
Calculates the overall error rates for each gene.
- per_bit_error
Calculates the average error rate for each bit across all genes.
- per_fov_error
Calculates the error rate within each FOV.
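The codebook expansion described for expand_codebook can be sketched in plain Python as follows. This is a minimal illustration rather than the package's implementation: the function name expand_codebook_sketch, the dictionary layout, and the error type labels ("exact", "1->0", "0->1") are assumptions made for the example. It relies on codewords being at least Hamming distance 3 apart, so that single-bit flips of different genes never collide.

```python
def expand_codebook_sketch(codebook):
    """Map each codeword and each of its single-bit flips to
    (gene, error_bit, error_type).

    error_bit is 1-based; 0 means the barcode matches the codebook exactly.
    The labels used for error_type are illustrative assumptions.
    """
    expanded = {}
    for gene, bits in codebook.items():
        expanded[tuple(bits)] = (gene, 0, "exact")
        for i, bit in enumerate(bits):
            flipped = list(bits)
            flipped[i] = 1 - bit
            # A codebook 1 measured as 0 is a "1->0" error, and vice versa.
            error_type = "1->0" if bit == 1 else "0->1"
            expanded[tuple(flipped)] = (gene, i + 1, error_type)
    return expanded


# Toy 8-bit codebook with two genes (hypothetical data).
codebook = {
    "GeneA": [1, 1, 1, 1, 0, 0, 0, 0],
    "GeneB": [0, 0, 0, 0, 1, 1, 1, 1],
}
expanded = expand_codebook_sketch(codebook)

# A decoded barcode with the second bit dropped out.
observed = (1, 0, 1, 1, 0, 0, 0, 0)
gene, error_bit, error_type = expanded[observed]
```

Looking up a decoded barcode in the expanded table yields both the gene and which bit was corrected, which is the information the error-rate functions in this module summarize.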
- mftools.barcodes.assign_to_cells(barcodes, masks, drifts=None, transpose=False, flip_x=True, flip_y=True)
- mftools.barcodes.calculate_global_coordinates(barcodes: DataFrame, positions: DataFrame) -> None
Add global_x and global_y columns to barcodes.
- mftools.barcodes.count_unfiltered_barcodes(merlin_result: MerlinOutput) -> int
Count the total number of barcodes decoded by MERlin before adaptive filtering.
- mftools.barcodes.expand_codebook(codebook: DataFrame) -> DataFrame
Add codes for every possible bit flip to a codebook.
- mftools.barcodes.make_table(merlin_result: MerlinOutput, codebook: DataFrame) -> DataFrame
Create a table of all barcodes with error correction information.
- mftools.barcodes.normalize_codebook(codebook: DataFrame) -> DataFrame
L2 normalize a codebook.
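As a rough illustration of L2 normalization, each barcode is divided by its Euclidean norm so that it has unit length. The sketch below works on a single barcode given as a plain list; the function name l2_normalize and the zero-norm guard are assumptions for the example, and the actual function operates on a codebook DataFrame.

```python
import math


def l2_normalize(bits):
    """Scale a barcode so that its Euclidean (L2) norm is 1."""
    norm = math.sqrt(sum(b * b for b in bits))
    if norm == 0:
        # An all-zero barcode cannot be normalized; return it unchanged.
        return list(bits)
    return [b / norm for b in bits]


code = [1, 0, 1, 0, 1, 1, 0, 0]
unit = l2_normalize(code)
```

Each set bit becomes 0.5 here because the barcode has four set bits and therefore a norm of 2.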
- mftools.barcodes.per_bit_error(barcodes, colors) -> DataFrame
Get barcode error statistics per bit.
- mftools.barcodes.process_merlin_barcodes(barcodes: DataFrame, neighbors: IndexFlatL2, expanded_codebook: DataFrame) -> DataFrame
Process the barcodes for a single field of view.
The error type and bit are determined using an expanded codebook and returned as a DataFrame with extraneous columns removed.
- mftools.barcodes.set_barcode_stats(merlin_result: MerlinOutput, bcs: DataFrame, colors: list) -> DataFrame
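The counting step described for create_cell_by_gene_table, where barcodes marked “edge” are skipped, might look like the following in plain Python. The record keys ("gene", "cell_id", "status") and the "good" status value are assumptions for illustration; only the “edge” status comes from the description above.

```python
from collections import Counter


def cell_by_gene(barcodes):
    """Count barcodes per (cell, gene), ignoring barcodes with "edge" status."""
    counts = Counter()
    for bc in barcodes:
        if bc["status"] == "edge":
            continue  # barcodes in FOV overlaps are not counted
        counts[(bc["cell_id"], bc["gene"])] += 1
    cells = sorted({cell for cell, _ in counts})
    genes = sorted({gene for _, gene in counts})
    # Nested dict: rows are cells, columns are genes, missing pairs are 0.
    return {cell: {gene: counts[(cell, gene)] for gene in genes} for cell in cells}


# Hypothetical barcode records.
barcodes = [
    {"gene": "GeneA", "cell_id": 1, "status": "good"},
    {"gene": "GeneA", "cell_id": 1, "status": "good"},
    {"gene": "GeneB", "cell_id": 1, "status": "edge"},
    {"gene": "GeneB", "cell_id": 2, "status": "good"},
    {"gene": "GeneA", "cell_id": 2, "status": "edge"},
]
table = cell_by_gene(barcodes)
```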
mftools.cellgene module
- mftools.cellgene.adjust_spatial_coordinates(adata, flip_horizontal=False, flip_vertical=False, transpose=False)
- mftools.cellgene.create_scanpy_object(analysis, name=None, positions=None, codebook=None, keep_empty_cells=True) -> AnnData
- mftools.cellgene.find_cell_communities(adata: AnnData, labels: str, radius: int = 150) -> None
Group cells based on the cell types present in their vicinity.
mftools.config module
Global configuration settings for the pipeline.
Settings
- merlin_folder
The folder containing the output from MERlin.
- image_folder
The folder containing the raw image files.
- segmentation_folder
The folder containing the segmentation files.
- output_folder
The folder to save output to.
- minimum_cell_volume
- maximum_cell_volume
- barcode_colors
- mask_size
The size (width and height) in pixels of the segmentation masks.
- scale
The scale of the images compared to segmentation masks. For example, if the masks were generated by downsampling the images by a factor of 4, scale should be set to 4.
- omit_fovs
IDs of FOVs to skip during processing (currently not implemented).
- reference_counts
The name of a file containing gene counts to correlate with. TODO: add more details about the format this file should be in.
- flip_barcodes
If true, the positions of barcodes within a FOV should be flipped horizontally and vertically before assigning to the segmentation mask.
- transpose_barcodes
If true, the x and y positions of barcodes within a FOV should be swapped.
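To make the interaction of mask_size, scale, flip_barcodes, and transpose_barcodes concrete, here is a small sketch of converting a barcode position in image pixels to a segmentation mask pixel. The function name to_mask_coords and the order of operations (scale, then transpose, then flip) are assumptions; the pipeline may apply these steps differently.

```python
def to_mask_coords(x, y, mask_size, scale, flip=False, transpose=False):
    """Convert an image-pixel position to a mask-pixel position."""
    # Masks were downsampled by `scale`, so divide image coordinates by it.
    mx, my = int(x / scale), int(y / scale)
    if transpose:
        mx, my = my, mx  # swap x and y
    if flip:
        # Flip horizontally and vertically within the mask.
        mx, my = mask_size - 1 - mx, mask_size - 1 - my
    return mx, my


# Example: 2048-pixel images downsampled by 4 give 512-pixel masks.
plain = to_mask_coords(400, 1200, mask_size=512, scale=4)
flipped = to_mask_coords(400, 1200, mask_size=512, scale=4, flip=True)
swapped = to_mask_coords(400, 1200, mask_size=512, scale=4, transpose=True)
```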
mftools.fileio module
Module for loading and saving files and data related to MERFISH experiments.
- class mftools.fileio.DaxFile(filename: str, num_channels: int | None = None)
Bases: object
Loads data from a DAX image file.
- block(center: Sequence[int], volume: Sequence[int], channel: int | None = None)
Return a 3D block of the image specified by the given center and volume.
The volume specifies the radius of the block in each dimension.
- block_range(xr: Sequence[int] | None = None, yr: Sequence[int] | None = None, zr: Sequence[int] | None = None, channel: int | None = None)
Return a 3D block of the image specified by the given x, y, and z ranges.
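The relationship between block and block_range can be sketched as follows: a center plus a per-dimension radius is turned into one range per dimension. The helper name block_ranges is hypothetical, and whether the real method includes both endpoints or clips to the image bounds is an assumption not confirmed here.

```python
def block_ranges(center, volume):
    """Turn a center and per-dimension radius into half-open (start, stop) ranges.

    Assumes both endpoints are included, so each range spans 2 * radius + 1
    pixels. No clipping to the image bounds is done in this sketch.
    """
    return [(c - r, c + r + 1) for c, r in zip(center, volume)]


xr, yr, zr = block_ranges(center=(100, 50, 5), volume=(10, 10, 2))
```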
- class mftools.fileio.ImageDataset(folderpath: str, data_organization: str | None = None)
Bases: object
- load_image(fov: int, zslice: int | None = None, channel: str | None = None, max_projection: bool = False, fiducial: bool = False) -> ndarray
Load an image from the dataset.
The image to load can be specified by passing either the bit or the hybridization round and color channel. If the zslice to be loaded is not specified, then either a 3D image containing all z-slices, or a 2D max projection along the z-axis is returned, depending on the max_projection parameter.
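The choice between a single z-slice, a full 3D stack, and a max projection can be illustrated with nested lists standing in for image arrays. The function names below are hypothetical and the sketch ignores file loading entirely; it only shows the selection logic described above.

```python
def max_project(stack):
    """Collapse a stack indexed as stack[z][y][x] to 2D by taking the
    maximum over z at every (y, x) position."""
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*stack)]


def select_image(stack, zslice=None, max_projection=False):
    """Mimic the zslice / max_projection behavior on an in-memory stack."""
    if zslice is not None:
        return stack[zslice]       # a single 2D z-slice
    if max_projection:
        return max_project(stack)  # 2D max projection along z
    return stack                   # the full 3D stack


# Two z-slices of a 2x2 image (toy data).
stack = [
    [[1, 5], [3, 0]],
    [[4, 2], [1, 7]],
]
projected = select_image(stack, max_projection=True)
```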
- class mftools.fileio.MerfishAnalysis(folderpath: str, save_to_subfolder: str = '')
Bases: object
A class for saving and loading results from this software package.
- class mftools.fileio.MerlinOutput(folderpath: str)
Bases: object
A class for loading results from a MERlin output folder.
- count_raw_barcodes(fov: int) -> int
Count the number of barcodes for an fov in the Decode folder.
- load_codebook() -> DataFrame
Get the codebook used for this MERFISH experiment.
The ‘name’ and ‘id’ columns are identical, and both contain the name of the gene or blank barcode encoded by that row. The ‘bit1’ through ‘bitN’ columns contain the 0s or 1s of the barcode.
- load_drift_transformations(fov: int)
Get the drifts calculated between hybridization rounds for the given FOV.
Returns a numpy array containing scikit-image SimilarityTransform objects. The load_hyb_drifts function can be used instead to convert this to a pandas DataFrame.
- load_filtered_barcodes(fov: int) -> DataFrame
Load detailed barcode metadata from the AdaptiveFilterBarcodes folder.
- load_fov_positions() -> DataFrame
Get the global positions of the FOVs.
The coordinates indicate the top-left corner of the FOV.
- load_hyb_drifts(fov: int) -> DataFrame
Get the drifts calculated between hybridization rounds for the given FOV.
The ‘X drift’ and ‘Y drift’ columns indicate the translation required to align coordinates in the FOV and hybridization round to the first hybridization round for that FOV. These drifts are calculated by MERlin.
- mftools.fileio.load_data_organization(filename: str) -> DataFrame
Load a data organization file into a pandas DataFrame.
- Parameters:
filename – The path to the data organization file.
- Returns:
A pandas DataFrame of the data organization.
- mftools.fileio.load_fov_positions(path: Path) -> DataFrame
Get the global positions of the FOVs.
The coordinates indicate the top-left corner of the FOV.
- Parameters:
path – A pathlib.Path to the FOV positions file.
- Returns:
A pandas DataFrame containing the FOV positions.
- mftools.fileio.load_mask(segmask_dir: Path, fov: int) -> ndarray
Load the segmentation mask for the given FOV.
- Parameters:
segmask_dir – The directory containing segmentation masks.
fov – The field of view to load the mask for.
- Returns:
The segmentation mask for the specified field of view.
- mftools.fileio.save_mask(filename: Path, mask: ndarray) -> None
Save a mask in cellpose format.
- Parameters:
filename – A pathlib.Path for the save location.
cellpose_data – A tuple containing the segmentation image and the masks, flows, and diams returned by cellpose.
- mftools.fileio.search_for_mask_file(segmask_dir: Path, fov: int) -> Path
Find the filename for the segmentation mask of the given FOV.
This function searches the given directory for a file matching one of several possible patterns for segmentation mask filenames. The patterns are based on cellpose output naming and on various other scripts used for segmentation internally at the UCSD Center for Epigenomics.
- Parameters:
segmask_dir – The directory containing segmentation masks.
fov – The field of view to find the mask file for.
- Returns:
The mask file found for the specified field of view.
- Raises:
FileNotFoundError – If no mask file could be found.
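A pattern-based search of this kind could be sketched as below. The filename patterns are hypothetical and chosen only for illustration (the `_seg.npy` and `_cp_masks.png` suffixes follow cellpose output naming, but where the FOV number appears is an assumption); the function name find_mask_file is likewise not part of the package.

```python
import tempfile
from pathlib import Path

# Hypothetical filename patterns, tried in order. The patterns actually
# used by search_for_mask_file are not listed in this documentation.
PATTERNS = ["*_{fov:03d}_seg.npy", "*_{fov:03d}_cp_masks.png", "mask_{fov}.npy"]


def find_mask_file(segmask_dir: Path, fov: int) -> Path:
    """Return the first file in segmask_dir matching any known pattern."""
    for pattern in PATTERNS:
        matches = sorted(segmask_dir.glob(pattern.format(fov=fov)))
        if matches:
            return matches[0]
    raise FileNotFoundError(f"No mask file found for FOV {fov} in {segmask_dir}")


# Usage with a temporary folder containing one cellpose-style file.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "image_007_seg.npy").touch()
    found = find_mask_file(folder, 7).name
```

Trying patterns in a fixed order keeps the result deterministic when a folder contains masks from more than one segmentation run.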
mftools.images module
Functions for working with fluorescent microscopy images.
- class mftools.images.FOVPositions(positions: DataFrame | None = None, filename: str | None = None, merlin: MerlinOutput | None = None)
Bases: object
- find_fov_overlaps(fovsize: int = 220, get_trim: bool = False) -> List[list]
Identify overlaps between FOVs.
- property overlaps
- class mftools.images.Overlap(fov, xslice, yslice)
Bases: tuple
- fov
Alias for field number 0
- xslice
Alias for field number 1
- yslice
Alias for field number 2
- mftools.images.flat_field_correct(image: ndarray, sigma: float, filter_size: int | None = None) -> ndarray
- mftools.images.get_median_image(imageset: ImageDataset, bit: int, sample_size: int | None = None) -> ndarray
- mftools.images.get_slice(diff: float, fovsize: int = 220, get_trim: bool = False) -> slice
Get a slice for the region of an image overlapped by another FOV.
- Parameters:
diff – The amount of overlap in the global coordinate system.
fovsize – The width/length of a FOV in the global coordinate system, defaults to 220.
get_trim – If True, return the half of the overlap closest to the edge. This is for determining in which region the barcodes should be trimmed to avoid duplicates.
- Returns:
A slice in the FOV coordinate system for the overlap.
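One possible way to turn an overlap measured in global units into a pixel slice is sketched below. Several details are assumptions made for the example and are not confirmed by this documentation: the image size of 2048 pixels, the sign convention (a positive diff places the overlap at the high end of the axis, a negative diff at the low end), and the function name overlap_slice.

```python
def overlap_slice(diff, fovsize=220, image_size=2048, get_trim=False):
    """Return the pixel slice of a FOV covered by a neighboring FOV.

    diff is the overlap in global units; its sign selects the side of the
    image (an assumed convention). image_size is an assumed pixel width.
    """
    # Convert the overlap width from global units to pixels.
    width = int(round(abs(diff) / fovsize * image_size))
    if get_trim:
        width //= 2  # keep only the half of the overlap nearest the edge
    if diff > 0:
        return slice(image_size - width, image_size)
    return slice(0, width)


full = overlap_slice(27.5)
trim = overlap_slice(27.5, get_trim=True)
low_edge = overlap_slice(-27.5)
```

Trimming barcodes only in the outer half of each overlap means each FOV of a neighboring pair keeps the half nearer its own center, so a barcode in the overlap is counted once.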
mftools.plotting module
- mftools.plotting.create_color_image(red: ndarray | None = None, green: ndarray | None = None, blue: ndarray | None = None, vmax=100) -> ndarray
mftools.segmentation module
Provides the CellSegmentation class for working with segmentation masks.
- class mftools.segmentation.CellSegmentation(mask_folder: str | None = None, output: MerfishAnalysis | None = None, positions: DataFrame | None = None, imagedata: ImageDataset | None = None, channel: str = 'PolyT', zslice: int | None = None)
Bases: object
A collection of segmentation masks from all FOVs.
- find_overlapping_cells() -> List[set]
Identify cells that appear in the overlapping regions of multiple FOVs and are the same cell.
- property metadata: DataFrame
Get the cell metadata table.
When the metadata table is accessed for the first time, it will first attempt to load a saved metadata table if output was given. If the file doesn’t exist, the metadata table will be created and stored in memory. If output was given, the table will be saved to disk so it can be loaded in the future.
- Returns:
The cell metadata table.
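The load-or-create behavior described for the metadata property follows a common lazy caching pattern, sketched below. The class name, the JSON file format, the file name cell_metadata.json, and the placeholder _build method are all assumptions for illustration; the real property returns a DataFrame and its storage format is not specified here.

```python
import json
import tempfile
from pathlib import Path


class LazyMetadata:
    """Toy stand-in showing the load-or-create caching of a metadata table."""

    def __init__(self, output_dir=None):
        self.output_dir = Path(output_dir) if output_dir is not None else None
        self._metadata = None
        self.build_count = 0  # tracks how often the table is recomputed

    def _build(self):
        # Placeholder for the expensive computation over all masks.
        self.build_count += 1
        return {"cell_ids": [1, 2, 3]}

    @property
    def metadata(self):
        if self._metadata is not None:
            return self._metadata  # already in memory
        path = None
        if self.output_dir is not None:
            path = self.output_dir / "cell_metadata.json"
        if path is not None and path.exists():
            self._metadata = json.loads(path.read_text())  # load saved table
        else:
            self._metadata = self._build()
            if path is not None:
                path.write_text(json.dumps(self._metadata))  # save for later
        return self._metadata


with tempfile.TemporaryDirectory() as tmp:
    first = LazyMetadata(tmp)
    first.metadata            # built and saved to disk
    first.metadata            # served from memory
    second = LazyMetadata(tmp)
    reloaded = second.metadata  # loaded from disk, not rebuilt
    builds = (first.build_count, second.build_count)
```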
- mftools.segmentation.match_cells_in_overlap(strip_a: ndarray, strip_b: ndarray) -> Set[tuple]
Find cells in overlapping regions of two FOVs that are the same cells.
- Parameters:
strip_a – The overlapping region of the segmentation mask from one FOV.
strip_b – The overlapping region of the segmentation mask from another FOV.
- Returns:
A set of pairs of ints (tuples) representing the mask labels from each mask that are the same cell. For example, the tuple (23, 45) means mask label 23 from the mask given by strip_a is the same cell as mask label 45 in the mask given by strip_b.
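A simple way to produce such label pairs is to count how often each label in strip_a coincides pixel-for-pixel with each label in strip_b. The sketch below is one such approach under stated assumptions, not necessarily the criterion used by match_cells_in_overlap: label 0 is treated as background, and a pair is kept when the shared pixels cover at least min_fraction of the cell in strip_a (both the threshold and the function name match_labels are hypothetical).

```python
from collections import Counter


def match_labels(strip_a, strip_b, min_fraction=0.5):
    """Pair mask labels from two aligned strips that cover the same pixels."""
    pair_counts = Counter()  # pixels shared by each (label_a, label_b) pair
    size_a = Counter()       # pixels per label in strip_a
    for row_a, row_b in zip(strip_a, strip_b):
        for a, b in zip(row_a, row_b):
            if a:
                size_a[a] += 1
            if a and b:
                pair_counts[(a, b)] += 1
    return {
        (a, b)
        for (a, b), shared in pair_counts.items()
        if shared / size_a[a] >= min_fraction
    }


# Toy strips: label 23 in strip_a lies on label 45 in strip_b, while
# label 7 in strip_a has no counterpart.
strip_a = [[23, 23, 0], [23, 23, 0], [0, 0, 7]]
strip_b = [[45, 45, 0], [45, 45, 45], [0, 0, 0]]
pairs = match_labels(strip_a, strip_b)
```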
mftools.stats module
Calculates and stores various statistics and quality metrics.
This module should generally not be interacted with directly, but rather through an instance of the MerfishExperiment class (see experiment.py).
mftools.util module
Module contents
MERFISH analysis in Python.