fiftyone.zoo.datasets.base.FIWDataset

class documentation

class FIWDataset(FiftyOneDataset):

Families in the Wild (FIW) is a public benchmark for recognizing families via facial images. The dataset contains over 26,642 images of 5,037 faces collected from 978 families. A unique Family ID (FID) is assigned per family, ranging from F0001 to F1018 (some families have been merged or removed since the first release in 2016, so not every ID in this range is in use). The dataset is a work in progress, and contributions are welcome and appreciated!

Faces were cropped from imagery of various photo types (mostly family photos, along with several profile pictures, i.e., facial shots of individuals) using the five-point face detector MTCNN. The number of members per family varies from 3 to 26, with the number of faces per subject ranging from 1 to more than 10.

Various levels and types of labels are associated with samples in this dataset. Family-level labels contain a list of members, each assigned a member ID (MID) unique to that respective family (e.g., F0011.MID2 refers to member 2 of family 11). Each member has annotations specifying gender and relationship to all other members in that respective family.
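As a minimal sketch of the identifier convention described above, the snippet below splits a string such as F0011.MID2 into its family and member numbers. The helper name `parse_member_identifier` and the exact pattern (four-digit family number, unpadded member number) are assumptions inferred from the single example in the text, not part of the FiftyOne API.

```python
import re

# Assumed layout, inferred from the example "F0011.MID2":
# "F" + four-digit family number, then ".MID" + member number.
_MEMBER_RE = re.compile(r"^F(\d{4})\.MID(\d+)$")


def parse_member_identifier(identifier):
    """Split an identifier such as "F0011.MID2" into (family, member) integers."""
    match = _MEMBER_RE.match(identifier)
    if match is None:
        raise ValueError(f"Unrecognized FIW identifier: {identifier!r}")
    return int(match.group(1)), int(match.group(2))


family, member = parse_member_identifier("F0011.MID2")
print(family, member)  # 11 2
```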

The relationships in FIW are:

ID	Type
0	not related or self
1	child
2	sibling
3	grandchild
4	parent
5	spouse
6	grandparent
7	great grandchild
8	great grandparent
9	TBD
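The relationship IDs can be held in a small lookup table when decoding labels. The sketch below copies the IDs from the table; the `INVERSE` mapping is an assumption that pairs each type with its reciprocal by name (for example, child with parent) and is not taken from the FiftyOne source.

```python
# Relationship IDs and types, copied from the table.
RELATIONSHIPS = {
    0: "not related or self",
    1: "child",
    2: "sibling",
    3: "grandchild",
    4: "parent",
    5: "spouse",
    6: "grandparent",
    7: "great grandchild",
    8: "great grandparent",
    9: "TBD",
}

# Assumed reciprocal of each ID, paired by type name:
# child <-> parent, grandchild <-> grandparent, and so on.
INVERSE = {0: 0, 1: 4, 4: 1, 2: 2, 5: 5, 3: 6, 6: 3, 7: 8, 8: 7, 9: 9}


def inverse_relationship(relationship_id):
    """Return the ID of the same relationship seen from the other member."""
    return INVERSE[relationship_id]


print(RELATIONSHIPS[inverse_relationship(4)])  # child
```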

Within FiftyOne, each sample corresponds to a single face image and contains primitive labels of the Family ID, Member ID, etc. The relationship labels are stored as multi-label classifications, where each classification represents one relationship that the member has with another member in the family. The number of relationships will differ from one person to the next, but all faces of one person will have the same relationship labels.

The labels for the Kinship Verification task are also loaded into this dataset through FiftyOne. These labels are stored as classifications, just like relationships, but the kinship labels differ from the relationship types defined above. For example, rather than Parent, the label might be fd, representing a Father-Daughter kinship, or md for Mother-Daughter.
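To illustrate how a kinship label refines a generic Parent relationship, the sketch below builds a two-letter code from the genders of a parent and child. Only fd and md appear in the text; the letter s for a son, the gender letters m and f, and the helper name `parent_child_kinship` are assumptions made by analogy, so treat the result as illustrative rather than as the dataset's definitive label set.

```python
# First letter of the code: the parent's role, chosen by the parent's gender.
# Note that "f" as a gender means female, while "f" in a code means father.
PARENT_ROLE = {"m": "f", "f": "m"}  # male -> father, female -> mother

# Second letter: the child's role. "d" is confirmed by fd/md in the text;
# "s" for son is an assumption.
CHILD_ROLE = {"m": "s", "f": "d"}  # male -> son, female -> daughter


def parent_child_kinship(parent_gender, child_gender):
    """Build a hypothetical two-letter kinship code for a parent-child pair."""
    return PARENT_ROLE[parent_gender] + CHILD_ROLE[child_gender]


print(parent_child_kinship("m", "f"))  # fd
print(parent_child_kinship("f", "f"))  # md
```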

In order to make it easier to browse the dataset in the FiftyOne App, each sample also contains a face_id field containing a unique integer for each face of a member, always starting at 0. This allows you to filter the face_id field to 0 in the App to show only a single image of each person.
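The effect of face_id can be sketched in plain Python. The records below are hypothetical, and `assign_face_ids` is an illustrative helper rather than the loader's actual implementation; it simply numbers each member's faces from 0, so that keeping face_id == 0 yields one image per person.

```python
from collections import defaultdict

# Hypothetical face records: (family ID, member ID, image path).
faces = [
    ("F0001", 1, "F0001/MID1/a.jpg"),
    ("F0001", 1, "F0001/MID1/b.jpg"),
    ("F0001", 2, "F0001/MID2/a.jpg"),
    ("F0002", 1, "F0002/MID1/a.jpg"),
]


def assign_face_ids(records):
    """Number the faces of each (family, member) pair, starting at 0."""
    counters = defaultdict(int)
    samples = []
    for family, member, path in records:
        key = (family, member)
        samples.append(
            {
                "family": family,
                "member": member,
                "filepath": path,
                "face_id": counters[key],
            }
        )
        counters[key] += 1
    return samples


samples = assign_face_ids(faces)

# Keeping only face_id == 0 leaves exactly one image per person.
one_per_person = [s for s in samples if s["face_id"] == 0]
print(len(one_per_person))  # 3
```

On a loaded FiftyOne dataset, a comparable view can usually be built in code with `dataset.match(F("face_id") == 0)`, where `F` is `fiftyone.ViewField`.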

For your reference, the relationship labels are stored on disk in a matrix that provides the relationship of each member to the other members of the family, as well as names and genders. The i-th row represents the i-th family member's relationship to each j-th member.

For example, FID0001.csv contains:

MID	1	2	3	Name	Gender
1	0	4	5	name1	f
2	1	0	1	name2	f
3	5	4	0	name3	m

Here we have three family members, as listed under the MID column (far left). Each MID reads across its row. We can see that MID1 is related to MID2 by 4 -> 1 (Parent -> Child); read in the opposite direction, MID2 -> MID1 is 1 -> 4 (Child -> Parent). It can also be seen that MID1 and MID3 are spouses of one another, i.e., 5 -> 5.
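The row-wise reading described above can be sketched as a small parser. The example embeds the FID0001.csv contents as whitespace-separated text; the on-disk delimiter, the assumption that names contain no spaces, and the helper name `read_family_matrix` are all illustrative choices rather than details confirmed by the source.

```python
# Example matrix from FID0001.csv (delimiter on disk is an assumption).
FID0001 = """\
MID 1 2 3 Name Gender
1 0 4 5 name1 f
2 1 0 1 name2 f
3 5 4 0 name3 m
"""

# Only the relationship types that occur in this example.
RELATIONSHIPS = {0: "not related or self", 1: "child", 4: "parent", 5: "spouse"}


def read_family_matrix(text):
    """Return (members, relations) parsed from a family relationship matrix.

    members maps MID -> {"name", "gender"}; relations maps (i, j) -> the
    relationship ID found in row i, column j.
    """
    lines = text.strip().splitlines()
    header = lines[0].split()
    column_mids = [int(cell) for cell in header[1:-2]]  # skip MID, Name, Gender

    members, relations = {}, {}
    for line in lines[1:]:
        cells = line.split()
        mid = int(cells[0])
        members[mid] = {"name": cells[-2], "gender": cells[-1]}
        for other, relationship_id in zip(column_mids, cells[1:-2]):
            relations[(mid, other)] = int(relationship_id)
    return members, relations


members, relations = read_family_matrix(FID0001)
print(RELATIONSHIPS[relations[(1, 2)]])  # parent
print(RELATIONSHIPS[relations[(2, 1)]])  # child
print(RELATIONSHIPS[relations[(1, 3)]])  # spouse
```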

Note

The spouse label will likely be removed in a future version of this dataset. It adds no value to the problem of kinship.

For more information on the data (e.g., statistics, task evaluations, benchmarks, and more), see the recent journal article:

Robinson, J. P., M. Shao, and Y. Fu. "Survey on the Analysis and Modeling of Visual Kinship: A Decade in the Making." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2021.

Note

For your convenience, FiftyOne provides get_pairwise_labels() and get_identifier_filepaths_map() utilities for FIW.

Example usage:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("fiw", split="test")

session = fo.launch_app(dataset)

Dataset size: 173.00 MB
Source: https://web.northeastern.edu/smilelab/fiw

Property	`license`	The license (or comma-separated list of licenses) under which the dataset is distributed, or None if unknown.
Property	`name`	The name of the dataset.
Property	`supported_splits`	A tuple of supported splits for the dataset, or None if the dataset does not have splits.
Property	`tags`	A tuple of tags for the dataset.
Method	`_download_and_prepare`	Internal implementation of downloading the dataset and preparing it for use in the given directory.

Inherited from ZooDataset (via FiftyOneDataset):

Static Method	`get_info_path`	Returns the path to the `ZooDatasetInfo` for the dataset.
Static Method	`has_info`	Determines whether the directory contains `ZooDatasetInfo`.
Static Method	`load_info`	Loads the `ZooDatasetInfo` from the given dataset directory.
Method	`download_and_prepare`	Downloads the dataset and prepares it for use.
Method	`get_split_dir`	Returns the directory for the given split of the dataset.
Method	`has_split`	Whether the dataset has the given split.
Method	`has_tag`	Whether the dataset has the given tag.
Property	`has_patches`	Whether the dataset has patches that may need to be applied to already downloaded files.
Property	`has_splits`	Whether the dataset has splits.
Property	`has_tags`	Whether the dataset has tags.
Property	`importer_kwargs`	A dict of default kwargs to pass to this dataset's `fiftyone.utils.data.importers.DatasetImporter`.
Property	`is_remote`	Whether the dataset is remotely-sourced.
Property	`parameters`	An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
Property	`requires_manual_download`	Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
Property	`supports_partial_downloads`	Whether the dataset supports downloading partial subsets of its splits.
Method	`_get_splits_to_download`	Undocumented
Method	`_is_dataset_ready`	Undocumented
Method	`_is_split_ready`	Undocumented
Method	`_patch_if_necessary`	Internal method called when an already downloaded dataset may need to be patched.

@property
license

overrides fiftyone.zoo.datasets.ZooDataset.license

The license (or comma-separated list of licenses) under which the dataset is distributed, or None if unknown.

@property
name

overrides fiftyone.zoo.datasets.ZooDataset.name

The name of the dataset.

@property
supported_splits

overrides fiftyone.zoo.datasets.ZooDataset.supported_splits

A tuple of supported splits for the dataset, or None if the dataset does not have splits.

@property
tags

overrides fiftyone.zoo.datasets.ZooDataset.tags

A tuple of tags for the dataset.

def _download_and_prepare(self, dataset_dir, scratch_dir, split):

overrides fiftyone.zoo.datasets.ZooDataset._download_and_prepare

Internal implementation of downloading the dataset and preparing it for use in the given directory.

Parameters
dataset_dir	the directory in which to construct the dataset. If a `split` is provided, this is the directory for the split
scratch_dir	a scratch directory to use to download and prepare any required intermediate files
split	the split to download, or None if the dataset does not have splits
Returns
tuple of
dataset_type	the `fiftyone.types.Dataset` type of the dataset
num_samples	the number of samples in the split. For datasets that support partial downloads, this can be `None`, which indicates that all content was already downloaded
classes	an optional list of class label strings