class documentation

Families in the Wild (FIW) is a public benchmark for recognizing families via facial images. The dataset contains over 26,642 images of 5,037 faces collected from 978 families. A unique Family ID (FID) is assigned per family, ranging from F0001 to F1018 (the range is wider than the family count because some families were merged or removed since the first release in 2016). The dataset is a continued work in progress. Contributions are welcome and appreciated!

Faces were cropped from imagery of various photo types (mostly family photos, along with several profile pictures of individuals, i.e., facial shots) using the five-point face detector MTCNN. The number of members per family varies from 3 to 26, with the number of faces per subject ranging from 1 to more than 10.

Various levels and types of labels are associated with samples in this dataset. Family-level labels contain a list of members, each assigned a member ID (MID) unique to that respective family (e.g., F0011.MID2 refers to member 2 of family 11). Each member has annotations specifying gender and relationship to all other members in that respective family.
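As a small illustration of the identifier convention described above, the following sketch splits a string such as F0011.MID2 into its family and member numbers. The helper name parse_member_id and the exact pattern are assumptions inferred from that single example; they are not part of FiftyOne or of the FIW tooling.

```python
import re

# Pattern assumed from the example "F0011.MID2": the letter "F" and a
# zero-padded family number, followed by ".MID" and a member number.
_MEMBER_RE = re.compile(r"^F(\d+)\.MID(\d+)$")


def parse_member_id(identifier):
    """Split an identifier such as "F0011.MID2" into (family, member) integers.

    Hypothetical helper for illustration only.
    """
    match = _MEMBER_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a FID.MID identifier: {identifier!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_member_id("F0011.MID2"))  # (11, 2)
```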

The relationships in FIW are:

ID Type
0 not related or self
1 child
2 sibling
3 grandchild
4 parent
5 spouse
6 grandparent
7 great grandchild
8 great grandparent
9 TBD
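
When working with the raw integer codes, it can help to give them names. The sketch below mirrors the table above as a Python enumeration. The class name Relationship and the helper relationship_name are illustrative choices, not identifiers from FiftyOne or FIW.

```python
from enum import IntEnum


class Relationship(IntEnum):
    """Relationship codes copied from the FIW relationship table."""

    NOT_RELATED_OR_SELF = 0
    CHILD = 1
    SIBLING = 2
    GRANDCHILD = 3
    PARENT = 4
    SPOUSE = 5
    GRANDPARENT = 6
    GREAT_GRANDCHILD = 7
    GREAT_GRANDPARENT = 8
    TBD = 9


def relationship_name(code):
    """Return a readable name for an integer relationship code."""
    return Relationship(code).name.lower().replace("_", " ")


print(relationship_name(4))  # parent
```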

Within FiftyOne, each sample corresponds to a single face image and contains primitive labels of the Family ID, Member ID, etc. The relationship labels are stored as multi-label classifications, where each classification represents one relationship that the member has with another member in the family. The number of relationships will differ from one person to the next, but all faces of one person will have the same relationship labels.

Additionally, the labels for the Kinship Verification task are also loaded into this dataset through FiftyOne. These labels are stored as classifications just like relationships, but the labels of kinship differ from those defined above. For example, rather than Parent, the label might be fd representing a Father-Daughter kinship or md for Mother-Daughter.

In order to make it easier to browse the dataset in the FiftyOne App, each sample also contains a face_id field containing a unique integer for each face of a member, always starting at 0. This allows you to filter the face_id field to 0 in the App to show only a single image of each person.
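
To make the face_id semantics concrete, here is a plain-Python sketch that numbers the faces of each member from 0 and then keeps only the faces with face_id equal to 0. It is an illustration of the idea under assumed inputs, not FiftyOne's implementation; the function name assign_face_ids, the dict keys, and the file names are hypothetical.

```python
from collections import defaultdict


def assign_face_ids(faces):
    """Number the faces of each member consecutively, starting at 0.

    `faces` is an iterable of (member_identifier, filepath) pairs.
    Hypothetical helper for illustration only.
    """
    counters = defaultdict(int)
    samples = []
    for member, filepath in faces:
        samples.append(
            {"member": member, "filepath": filepath, "face_id": counters[member]}
        )
        counters[member] += 1
    return samples


# Made-up file names for two members of one family
faces = [
    ("F0001.MID1", "a.jpg"),
    ("F0001.MID1", "b.jpg"),
    ("F0001.MID2", "c.jpg"),
]
samples = assign_face_ids(faces)

# Keeping face_id == 0 leaves exactly one image per person
one_per_person = [s for s in samples if s["face_id"] == 0]
print([s["filepath"] for s in one_per_person])  # ['a.jpg', 'c.jpg']
```

On a loaded dataset, a comparable view can presumably be built with dataset.match(F("face_id") == 0), where F is fiftyone.ViewField.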

For your reference, the relationship labels are stored on disk as a matrix that provides the relationship of each member to the other members of the family, along with names and genders. The i-th row gives the i-th family member's relationship to each j-th member.

For example, FID0001.csv contains:

MID  1  2  3  Name   Gender
1    0  4  5  name1  f
2    1  0  1  name2  f
3    5  4  0  name3  m

Here we have three family members, as listed under the MID column (far-left). Each MID reads across its row. We can see that MID1 is related to MID2 by 4 -> 1 (Parent -> Child), which of course can be viewed as the inverse, i.e., MID2 -> MID1 is 1 -> 4. It can also be seen that MID1 and MID3 are spouses of one another, i.e., 5 -> 5.
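
The inverse reading described above can be checked mechanically. The following sketch parses an inline copy of the example matrix and verifies that every pair of entries is mutually consistent. The comma delimiter, the function names, and the INVERSE table are assumptions: only the parent/child and spouse pairings are stated in the text, and the remaining pairings are inferred from the meaning of the labels.

```python
import csv
import io

# Inline stand-in for FID0001.csv; the comma delimiter is an assumption.
CSV_TEXT = """MID,1,2,3,Name,Gender
1,0,4,5,name1,f
2,1,0,1,name2,f
3,5,4,0,name3,m
"""

# Code seen from one side -> code expected from the other side.
# Only 1 <-> 4 and 5 <-> 5 come from the text; the rest are inferred.
# Code 9 (TBD) has no inverse here, so any pair using it is reported.
INVERSE = {0: 0, 1: 4, 2: 2, 3: 6, 4: 1, 5: 5, 6: 3, 7: 8, 8: 7}


def load_relationships(text):
    """Return {(i, j): code} for every ordered pair of member IDs."""
    rows = list(csv.DictReader(io.StringIO(text)))
    mids = [int(row["MID"]) for row in rows]
    return {
        (int(row["MID"]), j): int(row[str(j)])
        for row in rows
        for j in mids
    }


def inconsistent_pairs(matrix):
    """List member pairs whose two entries are not inverses of each other."""
    return [
        (i, j)
        for (i, j), code in matrix.items()
        if i < j and INVERSE.get(code) != matrix[(j, i)]
    ]


matrix = load_relationships(CSV_TEXT)
print(matrix[(1, 2)], matrix[(2, 1)])  # 4 1
print(inconsistent_pairs(matrix))      # []
```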

Note

The spouse label will likely be removed in a future version of this dataset. It adds no value to the problem of kinship recognition.

For more information on the data (e.g., statistics, task evaluations, and benchmarks), see the recent journal article:

J. P. Robinson, M. Shao, and Y. Fu. "Survey on the Analysis and Modeling of Visual Kinship: A Decade in the Making." IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2021.

Note

For your convenience, FiftyOne provides get_pairwise_labels() and get_identifier_filepaths_map() utilities for FIW.

Example usage:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("fiw", split="test")

session = fo.launch_app(dataset)
Dataset size
173.00 MB
Source
https://web.northeastern.edu/smilelab/fiw
Property name The name of the dataset.
Property supported_splits A tuple of supported splits for the dataset, or None if the dataset does not have splits.
Property tags A tuple of tags for the dataset.
Method _download_and_prepare Internal implementation of downloading the dataset and preparing it for use in the given directory.

Inherited from ZooDataset (via FiftyOneDataset):

Static Method get_info_path Returns the path to the ZooDatasetInfo for the dataset.
Static Method has_info Determines whether the directory contains ZooDatasetInfo.
Static Method load_info Loads the ZooDatasetInfo from the given dataset directory.
Method download_and_prepare Downloads the dataset and prepares it for use.
Method get_split_dir Returns the directory for the given split of the dataset.
Method has_split Whether the dataset has the given split.
Method has_tag Whether the dataset has the given tag.
Property has_patches Whether the dataset has patches that may need to be applied to already downloaded files.
Property has_splits Whether the dataset has splits.
Property has_tags Whether the dataset has tags.
Property importer_kwargs A dict of default kwargs to pass to this dataset's fiftyone.utils.data.importers.DatasetImporter.
Property is_remote Whether the dataset is remotely-sourced.
Property parameters An optional dict of parameters describing the configuration of the zoo dataset when it was downloaded.
Property requires_manual_download Whether this dataset requires some files to be manually downloaded by the user before the dataset can be loaded.
Property supports_partial_downloads Whether the dataset supports downloading partial subsets of its splits.
Method _get_splits_to_download Undocumented
Method _is_dataset_ready Undocumented
Method _is_split_ready Undocumented
Method _patch_if_necessary Internal method called when an already downloaded dataset may need to be patched.

@property
name

The name of the dataset.

@property
supported_splits

A tuple of supported splits for the dataset, or None if the dataset does not have splits.

@property
tags

A tuple of tags for the dataset.

def _download_and_prepare(self, dataset_dir, scratch_dir, split):

Internal implementation of downloading the dataset and preparing it for use in the given directory.

Parameters
dataset_dir: the directory in which to construct the dataset. If a split is provided, this is the directory for the split
scratch_dir: a scratch directory to use to download and prepare any required intermediate files
split: the split to download, or None if the dataset does not have splits
Returns
tuple of
  • dataset_type: the fiftyone.types.Dataset type of the dataset
  • num_samples: the number of samples in the split. For datasets that support partial downloads, this can be None, which indicates that all content was already downloaded
  • classes: an optional list of class label strings