ArXiv
Generative Fractional Diffusion Models
Gabriel Nobis, Marco Aversa, Maximilian Springenberg, Michael Detzel, Stefano Ermon, Shinichi Nakajima, and 5 more authors
We generalize the continuous-time framework for score-based generative models from an underlying Brownian motion (BM) to an approximation of fractional Brownian motion (FBM). We derive a continuous reparameterization trick and the reverse-time model by representing FBM as a stochastic integral over a family of Ornstein-Uhlenbeck processes, defining generative fractional diffusion models (GFDM) whose driving noise converges to a non-Markovian process of infinite quadratic variation. The Hurst index H ∈ (0,1) of FBM enables control of the roughness of the distribution-transforming path. To the best of our knowledge, this is the first attempt to build a generative model on a stochastic process with infinite quadratic variation.
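To make the role of the Hurst index concrete, here is a minimal illustrative sketch — not the paper's Ornstein-Uhlenbeck-based approximation — that samples FBM exactly on a time grid via a Cholesky factorisation of its covariance (a standard simulation method); the function name `fbm_paths` and all parameter values are invented for illustration:

```python
import numpy as np

# Exact FBM sampling on a grid via Cholesky factorisation of the covariance
# Cov(B_H(s), B_H(u)) = (s^{2H} + u^{2H} - |s-u|^{2H}) / 2.
# This is a standard method, not the OU-based construction used in the paper.
def fbm_paths(hurst, n_steps=256, n_paths=3, t_max=1.0, seed=0):
    rng = np.random.default_rng(seed)
    t = np.linspace(t_max / n_steps, t_max, n_steps)  # avoid t = 0
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s**(2 * hurst) + u**(2 * hurst) - np.abs(s - u)**(2 * hurst))
    chol = np.linalg.cholesky(cov + 1e-12 * np.eye(n_steps))  # jitter for stability
    return chol @ rng.standard_normal((n_steps, n_paths))

# Smaller H gives rougher paths: E[(increment)^2] scales like dt^{2H}.
rough_ms = np.mean(np.diff(fbm_paths(hurst=0.2), axis=0) ** 2)
smooth_ms = np.mean(np.diff(fbm_paths(hurst=0.8), axis=0) ** 2)
print(f"mean squared increment  H=0.2: {rough_ms:.4f}  H=0.8: {smooth_ms:.6f}")
```

The printed statistics illustrate the abstract's claim: decreasing H makes the path's increments larger and the trajectory rougher.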
We present DiffInfinite, a hierarchical diffusion model that generates arbitrarily large histological images while preserving long-range correlations and structural information. Our approach first generates synthetic segmentation masks, which are subsequently used as conditions for a high-fidelity generative diffusion process. The proposed sampling method scales to any desired image size while requiring only small patches for fast training. Moreover, it can be parallelized more efficiently than previous large-content generation methods while avoiding tiling artefacts. Training leverages classifier-free guidance to augment a small, sparsely annotated dataset with unlabelled data. Our method alleviates unique challenges in histopathological imaging practice: large-scale information, costly manual annotation, and protective data handling. The biological plausibility of DiffInfinite data is validated in a survey by ten experienced pathologists as well as in a downstream segmentation task. Furthermore, the model scores strongly on anti-copying metrics, which is beneficial for the protection of patient data.
ICML
Data Models for Dataset Drift Controls in Machine Learning With Optical Images
Luis Oala, Marco Aversa, Gabriel Nobis, Kurt Willis, Yoan Neuenschwander, Michèle Buck, and 5 more authors
International Conference on Machine Learning, Differentiable Almost Everything Workshop, 2023
ICML
Data Models for Dataset Drift Controls in Machine Learning With Optical Images
Luis Oala, Marco Aversa, Gabriel Nobis, Kurt Willis, Yoan Neuenschwander, Michèle Buck, and 5 more authors
International Conference on Machine Learning, Spurious Correlations, Invariance, and Stability Workshop, 2023
TMLR
Data Models for Dataset Drift Controls in Machine Learning With Optical Images
Luis Oala*, Marco Aversa*, Gabriel Nobis, Kurt Willis, Yoan Neuenschwander, Michèle Buck, and 5 more authors
Camera images are ubiquitous in machine learning research. They also play a central role in the delivery of important public services spanning medicine and environmental surveying. However, the application of machine learning models in these domains has been limited because of robustness concerns. A primary failure mode is a performance drop caused by differences between the training and deployment data. While there are methods to prospectively validate the robustness of machine learning models to such dataset drifts, existing approaches do not account for explicit models of machine learning’s primary object of interest: the data. This limits our ability to study and understand the relationship between data generation and downstream machine learning model performance in a physically accurate manner. In this study, we demonstrate how to overcome this limitation by pairing traditional machine learning with physical optics to obtain explicit and differentiable data models. We demonstrate how such data models can be constructed for image data and used to control downstream machine learning model performance related to dataset drift. The findings are distilled into three applications. First, drift synthesis enables the controlled generation of physically faithful drift test cases to power model selection and targeted generalization. Second, the gradient connection between the machine learning task model and the data model allows advanced, precise tolerancing of task model sensitivity to changes in the data generation. These drift forensics can be used to precisely specify the acceptable data environments in which a task model may be run. Third, drift optimization opens up the possibility of creating drifts that help the task model learn better and faster, effectively optimizing the data-generating process itself to support the downstream machine vision task.
This is an interesting upgrade to existing imaging pipelines which traditionally have been optimized to be consumed by human users but not machine learning models. Alongside the data model code we release two datasets to the public that we collected as part of this work. In total, the two datasets, Raw-Microscopy and Raw-Drone, comprise 1,488 scientifically calibrated reference raw sensor measurements, 8,928 raw intensity variations as well as 17,856 images processed through twelve data models with different configurations. A guide to access the open code and datasets is available at https://github.com/aiaudit-org/raw2logit.
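As a rough illustration of the drift-synthesis idea, the sketch below sweeps a single physical parameter of a toy, hypothetical data model (exposure scaling plus sensor clipping — the function, parameter values, and data are all invented for illustration, not the paper's calibrated data models) and measures how a fixed task model degrades under the resulting drift:

```python
import numpy as np

rng = np.random.default_rng(1)

def data_model(raw, exposure):
    # Hypothetical one-parameter data model: exposure scaling + sensor clipping.
    return np.clip(raw * exposure, 0.0, 1.0)

# Toy "raw sensor" data: two classes differing in mean intensity.
n = 500
raw = np.concatenate([rng.normal(0.3, 0.05, n), rng.normal(0.6, 0.05, n)])
labels = np.concatenate([np.zeros(n), np.ones(n)])

# "Task model": a threshold fit once on data rendered at nominal exposure.
threshold = data_model(raw, exposure=1.0).mean()

# Drift synthesis: sweep the physical parameter, measure task performance.
accuracy = {}
for exposure in [0.5, 0.75, 1.0, 1.25, 1.5]:
    preds = data_model(raw, exposure) > threshold
    accuracy[exposure] = np.mean(preds == labels)
    print(f"exposure={exposure:.2f}  accuracy={accuracy[exposure]:.2f}")
```

Sweeping the exposure generates controlled, physically interpretable test cases: performance is high at the nominal setting and collapses under strong under-exposure, which is the kind of tolerance boundary drift forensics aims to make explicit.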
2022
NeurIPS Contributed Talk
Physical Data Models in Machine Learning Imaging Pipelines
Marco Aversa, Luis Oala, Christoph Clausen, Roderick Murray-Smith, and Bruno Sanguinetti
Advances in Neural Information Processing Systems, Machine Learning and the Physical Science Workshop, 2022
Light propagates from the object through the optics up to the sensor to create an image. Once the raw data is collected, it is processed through a complex image signal processing (ISP) pipeline to produce an image compatible with human perception. However, this processing is rarely considered in machine learning modelling because available benchmark data sets are generally not in raw format. This study shows how to embed the forward acquisition process into the machine learning model. We consider the optical system and the ISP separately. Following the acquisition process, we start from a drone and airship image dataset to emulate realistic satellite raw images with on-demand parameters. The end-to-end process is built to resemble the optics and sensor of the satellite setup. These parameters are satellite mirror size, focal length, pixel size and pattern, exposure time and atmospheric haze. After raw data collection, the ISP plays a crucial role in neural network robustness. We jointly optimize a parameterized differentiable image processing pipeline with a neural network model. This can speed up and stabilize classifier training, with gains of up to 20% in validation accuracy.
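The joint optimization described above can be sketched in miniature — under toy assumptions, with a one-parameter "ISP" (a gain) feeding a logistic classifier on scalar intensities, and analytic gradients written out by hand; none of this is the paper's actual pipeline or parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
raw = np.concatenate([rng.normal(0.3, 0.05, n), rng.normal(0.6, 0.05, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(gain, w, b):
    x = gain * raw                      # differentiable "ISP" stage
    p = sigmoid(w * x + b)              # task model: logistic classifier
    pc = np.clip(p, 1e-12, 1 - 1e-12)   # numerical safety for the log terms
    loss = -np.mean(y * np.log(pc) + (1 - y) * np.log(1 - pc))
    return x, p, loss

gain, w, b, lr = 1.0, 0.1, 0.0, 0.5
_, _, loss_start = forward(gain, w, b)
for _ in range(2000):
    x, p, _ = forward(gain, w, b)
    err = p - y                         # dLoss/dlogit for cross-entropy
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)
    gain -= lr * np.mean(err * w * raw) # gradient also flows into the ISP stage
_, p, loss_end = forward(gain, w, b)
acc = np.mean((p > 0.5) == y)
print(f"loss {loss_start:.3f} -> {loss_end:.3f}, accuracy {acc:.2f}")
```

The point of the sketch is the last gradient line: because the ISP stage is differentiable, the task loss updates the acquisition parameter and the classifier weights in the same descent loop.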
OBPDC
Data-centric AI workflow based on compressed raw images
Marco Aversa, Ziad Malik, Phillip Geier, Fabien Droz, Andres Upegui, Roderick Murray-Smith, and 2 more authors
Proceedings of OBPDC 2022, the 8th International Workshop on On-Board Payload Data Compression, 2022
In order to extract the full potential of the high volume of image data coming from earth observation, image compression is needed for transfer and storage, and artificial intelligence (AI) is needed for analysis. The promise of AI is to perform complex operations with low programming effort, naturally shifting the focus of the development of machine learning systems from the code, i.e. the implementation of the neural network, to the training process, and in particular to the acquisition, selection and preparation of training data. Lossy compression (like many other image processing methods), however, was developed primarily to compress already processed images for visual inspection, without regard for damage to invisible image properties that play an important role in machine learning, such as higher-order statistics, correlations and bias. The Jetraw image format, in contrast, was designed to compress raw image data, preserving its statistics and embedding the camera calibration profile and noise model. These features facilitate the generation of accurate raw synthetic data. They allow for “Jetraw functions” to take a Jetraw image as an argument and return another Jetraw image, complete with its newly computed calibration profile and noise model. Several of these functions can be chained to build complex operations while always maintaining metrologically correct data, i.e. values that have independent errors, are unbiased and have a well-defined noise model. Jetraw images and functions may be used in end-to-end models to generate synthetic data with statistics matching those of genuine raw images, and play an important role in data-centric AI methodologies. Here we show how these features are used for a machine-learning task: the segmentation of cars in urban, suburban and rural environments.
Starting from a drone and airship image dataset in the Jetraw format (with calibrated sensor and optics), we use an end-to-end model to emulate realistic satellite raw images with on-demand parameters. First, we study the effect of various satellite parameters on the task’s performance as well as on the compressed image size. These parameters are satellite mirror size, focal length, pixel size and pattern, exposure time and atmospheric haze. Then, we discuss characterising and improving the performance and tolerances of the neural network through the use of on-the-fly generation of data that accurately reflects the statistics of the target system.
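The idea of chaining functions that preserve a well-defined noise model can be illustrated with a toy stand-in (this `CalibratedImage` class and its operations are invented for illustration and are not the real Jetraw API): each operation returns new values together with a propagated per-pixel variance, assuming independent errors.

```python
import numpy as np
from dataclasses import dataclass

# Toy stand-in for an "image + noise model" container (NOT the Jetraw API):
# every operation returns a new image with a correctly propagated variance map,
# so chains of operations keep a well-defined noise model throughout.
@dataclass
class CalibratedImage:
    values: np.ndarray
    variance: np.ndarray  # per-pixel variance, errors assumed independent

def apply_gain(img, gain):
    # Linear scaling: variance scales with the square of the gain.
    return CalibratedImage(img.values * gain, img.variance * gain**2)

def add_images(a, b):
    # Sum of independent measurements: variances add.
    return CalibratedImage(a.values + b.values, a.variance + b.variance)

raw = CalibratedImage(np.full((4, 4), 100.0), np.full((4, 4), 100.0))
out = add_images(apply_gain(raw, 2.0), raw)  # two chained operations
print(out.values[0, 0], out.variance[0, 0])  # 300.0 500.0
```

Because every step propagates the variance analytically, the chained result remains metrologically interpretable in the sense the abstract describes.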
NeurIPS
Bessel Equivariant Networks for Inversion of Transmission Effects in Multi-Mode Optical Fibres
Joshua Mitton, Simon Mekhail, Miles Padgett, Daniele Faccio, Marco Aversa, and Roderick Murray-Smith
Advances in Neural Information Processing Systems, 2022
We perform percolation analysis of crossed-polarizer transmission images in a biased nanodisordered bulk KTN:Li perovskite. Two distinct percolative transitions are identified at two electric field thresholds. The low-field transition involves a directional fractal chain of dimension D = 1.65, while the high-field transition has a dimension D > 2. Direct cluster imaging in the volume is achieved using high-resolution orthographic 3D projections based on giant refraction. Percolation is attributed to a full-3D domain reorientation that mediates the transition from a ferroelectric supercrystal state to a disordered domain mosaic.