Info
openproblems_
Reference: Luecken et al. (2021)
Size: 2.2 GiB
Created: 14-02-2024
Dimensions: 90261 × 13953
Used in: No related benchmarks found.
Single-cell CITE-Seq (GEX+ADT) data collected from bone marrow mononuclear cells of 12 healthy human donors.
Single-cell CITE-Seq data collected from bone marrow mononuclear cells of 12 healthy human donors using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding, in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support the Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data were then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout: some donors were measured at multiple sites, while others were measured at a single site.
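The nested batch layout described above can be illustrated with a small sketch. The site and donor labels below are hypothetical placeholders, not the values stored in `obs['batch']`, and the helper name `sites_per_donor` is made up for this example.

```python
from collections import defaultdict

# Hypothetical (site, donor) pairs; the real labels live in obs['batch']
# and may be encoded differently.
samples = [
    ("site1", "donor1"),
    ("site2", "donor1"),
    ("site3", "donor1"),
    ("site1", "donor2"),
    ("site2", "donor3"),
    ("site4", "donor3"),
]


def sites_per_donor(pairs):
    """Map each donor to the sorted list of sites where it was measured."""
    sites = defaultdict(set)
    for site, donor in pairs:
        sites[donor].add(site)
    return {donor: sorted(found) for donor, found in sites.items()}


layout = sites_per_donor(samples)
# Donors seen at several sites let site effects be separated from donor effects.
shared = sorted(d for d, found in layout.items() if len(found) > 1)
single = sorted(d for d, found in layout.items() if len(found) == 1)

print("measured at multiple sites:", shared)
print("measured at a single site:", single)
```

In a layout like this, the donors shared across sites are the ones that make it possible to tell site-to-site variation apart from donor-to-donor variation.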
dataset_mod1
is an AnnData object with n_obs × n_vars = 90261 × 13953 with slots:
obs: size_factors, cell_type, batch
var: feature_name, feature_id, hvg, hvg_score
obsm: X_svd
layers: counts, normalized
uns: dataset_description, dataset_id, dataset_name, dataset_organism, dataset_reference, dataset_summary, dataset_url, normalization_id
dataset_mod2
is an AnnData object with n_obs × n_vars = 90261 × 134 with slots:
obs: cell_type, batch, size_factors
var: feature_name, feature_id, hvg, hvg_score
obsm: X_svd
layers: counts, normalized
uns: dataset_description, dataset_id, dataset_name, dataset_organism, dataset_reference, dataset_summary, dataset_url, normalization_id
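Both objects list the same slot names, so a loaded file could be checked against that list before use. The following is a minimal sketch: `EXPECTED_SLOTS`, `missing_slots`, and `make_stub` are hypothetical names, and the check is run on a dict-based stand-in rather than a real AnnData object. It relies only on `in` membership tests, so the intent is that it would also apply to an object whose slots behave like mappings, though that is not exercised here.

```python
from types import SimpleNamespace

# Slot names copied from the two listings above (identical for both modalities).
EXPECTED_SLOTS = {
    "obs": ["size_factors", "cell_type", "batch"],
    "var": ["feature_name", "feature_id", "hvg", "hvg_score"],
    "obsm": ["X_svd"],
    "layers": ["counts", "normalized"],
    "uns": [
        "dataset_description", "dataset_id", "dataset_name",
        "dataset_organism", "dataset_reference", "dataset_summary",
        "dataset_url", "normalization_id",
    ],
}


def missing_slots(adata, expected=EXPECTED_SLOTS):
    """Return the expected entries that are absent from the given object."""
    missing = []
    for slot, keys in expected.items():
        container = getattr(adata, slot)
        for key in keys:
            if key not in container:
                missing.append(f"{slot}['{key}']")
    return missing


def make_stub(drop=None):
    """Build a dict-based stand-in holding every expected key, minus `drop`."""
    stub = SimpleNamespace(
        **{slot: {key: None for key in keys}
           for slot, keys in EXPECTED_SLOTS.items()}
    )
    if drop is not None:
        slot, key = drop
        del getattr(stub, slot)[key]
    return stub


complete = make_stub()
incomplete = make_stub(drop=("obsm", "X_svd"))

print(missing_slots(complete))
print(missing_slots(incomplete))
```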
dataset_mod1

Name | Description | Type | Data type | Size |
---|---|---|---|---|
obs | ||||
batch | A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc. | vector | category | 90261 |
cell_type | Classification of the cell type based on its characteristics and function within the tissue or organism. | vector | category | 90261 |
size_factors | The size factors created by the normalisation method, if any. | vector | float32 | 90261 |
var | ||||
feature_id | Unique identifier for the feature, usually an ENSEMBL gene ID. | vector | object | 13953 |
feature_name | A human-readable name for the feature, usually a gene symbol. | vector | object | 13953 |
hvg | Whether or not the feature is considered to be a ‘highly variable gene’. | vector | bool | 13953 |
hvg_score | A ranking of the features by hvg. | vector | float64 | 13953 |
obsm | ||||
X_svd | The resulting SVD embedding. | densematrix | float32 | 90261 × 100 |
layers | ||||
counts | Raw counts. | sparsematrix | float32 | 90261 × 13953 |
normalized | Normalised expression values. | sparsematrix | float32 | 90261 × 13953 |
uns | ||||
dataset_description | Long description of the dataset. | atomic | str | 1 |
dataset_id | A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived. | atomic | str | 1 |
dataset_name | A human-readable name for the dataset. | atomic | str | 1 |
dataset_organism | The organism of the sample in the dataset. | atomic | str | 1 |
dataset_reference | Bibtex reference of the paper in which the dataset was published. | atomic | str | 1 |
dataset_summary | Short description of the dataset. | atomic | str | 1 |
dataset_url | Link to the original source of the dataset. | atomic | str | 1 |
normalization_id | Which normalization was used. | atomic | str | 1 |
dataset_mod2

Name | Description | Type | Data type | Size |
---|---|---|---|---|
obs | ||||
batch | A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc. | vector | category | 90261 |
cell_type | Classification of the cell type based on its characteristics and function within the tissue or organism. | vector | category | 90261 |
size_factors | The size factors created by the normalisation method, if any. | vector | float32 | 90261 |
var | ||||
feature_id | Unique identifier for the feature, usually an ENSEMBL gene ID. | vector | object | 134 |
feature_name | A human-readable name for the feature, usually a gene symbol. | vector | object | 134 |
hvg | Whether or not the feature is considered to be a ‘highly variable gene’. | vector | bool | 134 |
hvg_score | A ranking of the features by hvg. | vector | float64 | 134 |
obsm | ||||
X_svd | The resulting SVD embedding. | densematrix | float32 | 90261 × 100 |
layers | ||||
counts | Raw counts. | sparsematrix | float32 | 90261 × 134 |
normalized | Normalised expression values. | sparsematrix | float32 | 90261 × 134 |
uns | ||||
dataset_description | Long description of the dataset. | atomic | str | 1 |
dataset_id | A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived. | atomic | str | 1 |
dataset_name | A human-readable name for the dataset. | atomic | str | 1 |
dataset_organism | The organism of the sample in the dataset. | atomic | str | 1 |
dataset_reference | Bibtex reference of the paper in which the dataset was published. | atomic | str | 1 |
dataset_summary | Short description of the dataset. | atomic | str | 1 |
dataset_url | Link to the original source of the dataset. | atomic | str | 1 |
normalization_id | Which normalization was used. | atomic | str | 1 |
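The layers are typed as sparse matrices while X_svd is dense. A back-of-the-envelope calculation suggests why: stored densely as float32, a single 90261 × 13953 layer would need roughly 4.7 GiB, more than the 2.2 GiB reported for the dataset. The sketch below only does that arithmetic; the actual storage cost depends on sparsity and compression, which this page does not state, and `dense_bytes` is a name chosen for illustration.

```python
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def dense_bytes(n_rows, n_cols, itemsize=BYTES_PER_FLOAT32):
    """Bytes needed to hold an n_rows x n_cols matrix with no sparsity."""
    return n_rows * n_cols * itemsize


# Shapes taken from the tables above.
shapes = {
    "dataset_mod1 layer": (90261, 13953),
    "dataset_mod2 layer": (90261, 134),
    "X_svd embedding": (90261, 100),
}

for name, (n_rows, n_cols) in shapes.items():
    size_gib = dense_bytes(n_rows, n_cols) / GIB
    print(f"{name}: {size_gib:.2f} GiB if stored densely")
```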
dataset_mod1.layers['counts']
In R: dataset_mod1$layers[["counts"]]
Type: sparsematrix, data type: float32, shape: 90261 × 13953
Raw counts.

dataset_mod1.layers['normalized']
In R: dataset_mod1$layers[["normalized"]]
Type: sparsematrix, data type: float32, shape: 90261 × 13953
Normalised expression values.

dataset_mod1.obs['size_factors']
In R: dataset_mod1$obs[["size_factors"]]
Type: vector, data type: float32, shape: 90261
The size factors created by the normalisation method, if any.

dataset_mod1.obs['cell_type']
In R: dataset_mod1$obs[["cell_type"]]
Type: vector, data type: category, shape: 90261
Classification of the cell type based on its characteristics and function within the tissue or organism.

dataset_mod1.obs['batch']
In R: dataset_mod1$obs[["batch"]]
Type: vector, data type: category, shape: 90261
A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc.

dataset_mod1.obsm['X_svd']
In R: dataset_mod1$obsm[["X_svd"]]
Type: densematrix, data type: float32, shape: 90261 × 100
The resulting SVD embedding.

dataset_mod1.uns['dataset_description']
In R: dataset_mod1$uns[["dataset_description"]]
Type: atomic, data type: str, shape: 1
Long description of the dataset.

dataset_mod1.uns['dataset_id']
In R: dataset_mod1$uns[["dataset_id"]]
Type: atomic, data type: str, shape: 1
A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived.

dataset_mod1.uns['dataset_name']
In R: dataset_mod1$uns[["dataset_name"]]
Type: atomic, data type: str, shape: 1
A human-readable name for the dataset.

dataset_mod1.uns['dataset_organism']
In R: dataset_mod1$uns[["dataset_organism"]]
Type: atomic, data type: str, shape: 1
The organism of the sample in the dataset.

dataset_mod1.uns['dataset_reference']
In R: dataset_mod1$uns[["dataset_reference"]]
Type: atomic, data type: str, shape: 1
Bibtex reference of the paper in which the dataset was published.

dataset_mod1.uns['dataset_summary']
In R: dataset_mod1$uns[["dataset_summary"]]
Type: atomic, data type: str, shape: 1
Short description of the dataset.

dataset_mod1.uns['dataset_url']
In R: dataset_mod1$uns[["dataset_url"]]
Type: atomic, data type: str, shape: 1
Link to the original source of the dataset.

dataset_mod1.uns['normalization_id']
In R: dataset_mod1$uns[["normalization_id"]]
Type: atomic, data type: str, shape: 1
Which normalization was used.

dataset_mod1.var['feature_name']
In R: dataset_mod1$var[["feature_name"]]
Type: vector, data type: object, shape: 13953
A human-readable name for the feature, usually a gene symbol.

dataset_mod1.var['feature_id']
In R: dataset_mod1$var[["feature_id"]]
Type: vector, data type: object, shape: 13953
Unique identifier for the feature, usually an ENSEMBL gene ID.

dataset_mod1.var['hvg']
In R: dataset_mod1$var[["hvg"]]
Type: vector, data type: bool, shape: 13953
Whether or not the feature is considered to be a ‘highly variable gene’.

dataset_mod1.var['hvg_score']
In R: dataset_mod1$var[["hvg_score"]]
Type: vector, data type: float64, shape: 13953
A ranking of the features by hvg.
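The hvg flag and hvg_score could be combined to pick a short list of features, for example the top few flagged ones. The sketch below uses toy lists in place of var['feature_name'], var['hvg'] and var['hvg_score']. It assumes that a higher score means a more variable feature, which this page does not confirm, and `top_hvgs` is a hypothetical helper.

```python
# Toy stand-ins for var['feature_name'], var['hvg'] and var['hvg_score'].
feature_name = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
hvg = [True, False, True, True]
hvg_score = [0.7, 0.1, 2.3, 1.2]


def top_hvgs(names, flags, scores, k):
    """Return up to k flagged feature names, highest score first.

    Assumption: larger hvg_score means more variable.
    """
    flagged = [(score, name)
               for name, flag, score in zip(names, flags, scores)
               if flag]
    ranked = sorted(flagged, key=lambda pair: pair[0], reverse=True)
    return [name for _, name in ranked[:k]]


selected = top_hvgs(feature_name, hvg, hvg_score, k=2)
print(selected)
```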
dataset_mod2.layers['counts']
In R: dataset_mod2$layers[["counts"]]
Type: sparsematrix, data type: float32, shape: 90261 × 134
Raw counts.

dataset_mod2.layers['normalized']
In R: dataset_mod2$layers[["normalized"]]
Type: sparsematrix, data type: float32, shape: 90261 × 134
Normalised expression values.

dataset_mod2.obs['cell_type']
In R: dataset_mod2$obs[["cell_type"]]
Type: vector, data type: category, shape: 90261
Classification of the cell type based on its characteristics and function within the tissue or organism.

dataset_mod2.obs['batch']
In R: dataset_mod2$obs[["batch"]]
Type: vector, data type: category, shape: 90261
A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc.

dataset_mod2.obs['size_factors']
In R: dataset_mod2$obs[["size_factors"]]
Type: vector, data type: float32, shape: 90261
The size factors created by the normalisation method, if any.

dataset_mod2.obsm['X_svd']
In R: dataset_mod2$obsm[["X_svd"]]
Type: densematrix, data type: float32, shape: 90261 × 100
The resulting SVD embedding.

dataset_mod2.uns['dataset_description']
In R: dataset_mod2$uns[["dataset_description"]]
Type: atomic, data type: str, shape: 1
Long description of the dataset.

dataset_mod2.uns['dataset_id']
In R: dataset_mod2$uns[["dataset_id"]]
Type: atomic, data type: str, shape: 1
A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived.

dataset_mod2.uns['dataset_name']
In R: dataset_mod2$uns[["dataset_name"]]
Type: atomic, data type: str, shape: 1
A human-readable name for the dataset.

dataset_mod2.uns['dataset_organism']
In R: dataset_mod2$uns[["dataset_organism"]]
Type: atomic, data type: str, shape: 1
The organism of the sample in the dataset.

dataset_mod2.uns['dataset_reference']
In R: dataset_mod2$uns[["dataset_reference"]]
Type: atomic, data type: str, shape: 1
Bibtex reference of the paper in which the dataset was published.

dataset_mod2.uns['dataset_summary']
In R: dataset_mod2$uns[["dataset_summary"]]
Type: atomic, data type: str, shape: 1
Short description of the dataset.

dataset_mod2.uns['dataset_url']
In R: dataset_mod2$uns[["dataset_url"]]
Type: atomic, data type: str, shape: 1
Link to the original source of the dataset.

dataset_mod2.uns['normalization_id']
In R: dataset_mod2$uns[["normalization_id"]]
Type: atomic, data type: str, shape: 1
Which normalization was used.

dataset_mod2.var['feature_name']
In R: dataset_mod2$var[["feature_name"]]
Type: vector, data type: object, shape: 134
A human-readable name for the feature, usually a gene symbol.

dataset_mod2.var['feature_id']
In R: dataset_mod2$var[["feature_id"]]
Type: vector, data type: object, shape: 134
Unique identifier for the feature, usually an ENSEMBL gene ID.

dataset_mod2.var['hvg']
In R: dataset_mod2$var[["hvg"]]
Type: vector, data type: bool, shape: 134
Whether or not the feature is considered to be a ‘highly variable gene’.

dataset_mod2.var['hvg_score']
In R: dataset_mod2$var[["hvg_score"]]
Type: vector, data type: float64, shape: 134
A ranking of the features by hvg.
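For readers unfamiliar with how obs['size_factors'] can relate layers['counts'] to layers['normalized'], here is a toy sketch of one common scheme: divide each cell's counts by its size factor, then apply log1p. This is an illustration only. The method actually used for this dataset is whatever uns['normalization_id'] records, which may differ, and both function names are hypothetical.

```python
import math

# Toy raw counts: 2 cells x 3 features (a stand-in for layers['counts']).
counts = [
    [4, 0, 2],
    [1, 1, 0],
]


def size_factors_from_totals(rows):
    """One size factor per cell: its total count divided by the mean total."""
    totals = [sum(row) for row in rows]
    mean_total = sum(totals) / len(totals)
    return [total / mean_total for total in totals]


def log_normalize(rows, factors):
    """Scale each cell by its size factor, then apply log(1 + x)."""
    return [
        [math.log1p(value / factor) for value in row]
        for row, factor in zip(rows, factors)
    ]


factors = size_factors_from_totals(counts)
normalized = log_normalize(counts, factors)

print(factors)
for row in normalized:
    print([round(value, 3) for value in row])
```

Zero counts stay zero under this transform, which is one reason a normalised layer of this kind can remain sparse.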