The CAMELS project

Cosmology is the branch of astronomy that studies the constituents and laws of the Universe. In the late 1990s a discovery transformed this centuries-old discipline: the expansion of the Universe is accelerating. The nature and properties of the substance responsible for this effect, the so-called dark energy, are among the biggest mysteries in modern physics. According to cosmologists, the Universe is a very strange place: around 95% of its content is made up of things that we have never seen, such as dark energy and dark matter.

To unveil the nature and properties of these dark components, the scientific community has spent billions of dollars building large telescopes to survey gigantic volumes of the Universe. A standard method to find hints about the composition of the Universe is to study the pattern generated by the distribution of galaxies. Cosmologists investigate that pattern over colossal scales, the smallest ones being on the order of tens of millions of light years. Why don’t they look at the cosmic pattern on smaller scales? The reason is that the theoretical models cosmologists use are only accurate on such enormous scales. On smaller scales, astrophysical processes, such as supernova explosions and the energy emitted by supermassive black holes, affect the distribution of galaxies in a poorly understood manner.

The formation and evolution of galaxies, and the physical processes involved in them, are studied by a different branch of astronomy. Typically, there is very little communication between scientists working in the two branches. For instance, cosmologists restrict their analyses to scales that are not affected by astrophysical processes. However, cosmology and the astrophysical processes involved in galaxy formation are tightly interlaced: for example, cosmologists observe and work with galaxies, objects dominated by astrophysical processes. Building bridges between the two branches will enrich both and may become the key to solving some mysteries.

Unfortunately, the interplay between cosmology and astrophysics is very complex. Scientists typically rely on sophisticated numerical simulations to model and investigate the interactions between cosmological and astrophysical processes. When a simulation is run, scientists need to choose a particular model, e.g. what fraction of the Universe is in the form of dark energy and how much energy is released by supermassive black holes.

However, we do not know the values of these quantities with infinite precision; in fact, some of them are barely constrained at all. For that reason, it would be desirable to run numerical simulations with different values for these quantities. Unfortunately, the number of possibilities is so large that a gazillion simulations would be needed to cover all possible combinations. Fortunately, recent advances in artificial intelligence allow researchers to train computers to learn complex patterns from data.
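To get a feeling for how quickly the number of required simulations grows, here is a small Python sketch. It assumes, purely for illustration, that every quantity is sampled on a regular grid with the same number of values; the function name and the numbers used are hypothetical and are not taken from the CAMELS setup.

```python
def grid_size(values_per_parameter: int, n_parameters: int) -> int:
    """Number of simulations needed to cover every combination on a regular grid."""
    return values_per_parameter ** n_parameters


# Illustrative choice: 10 trial values for each uncertain quantity.
for n_parameters in (2, 4, 6):
    n_sims = grid_size(10, n_parameters)
    print(f"{n_parameters} parameters -> {n_sims:,} simulations")
```

Each extra quantity multiplies the cost by the number of values tried, which is why covering all combinations by brute force quickly becomes unaffordable.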

The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was designed to tackle the problems outlined above. CAMELS contains a collection of 4,233 numerical simulations, the largest set of its kind ever run. These simulations are designed to build bridges between cosmology and astrophysics and to serve as a massive dataset for machine learning techniques. For instance, instead of running a simulation for every possible combination of values of the cosmological and astrophysical parameters, one may use machine learning to train a computer to learn the relation between the parameters and the desired quantity, e.g. the abundance of some particular type of galaxy.
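The idea of learning the relation between parameters and a desired quantity can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the CAMELS pipeline: `toy_simulation` is a made-up formula standing in for an expensive simulation, the two parameters (`omega_m` and `feedback`) and their ranges are hypothetical, and a simple polynomial least-squares fit with NumPy is used in place of the more sophisticated machine learning models a real analysis would employ.

```python
import numpy as np


def toy_simulation(params):
    """Hypothetical stand-in for an expensive simulation (made-up formula)."""
    omega_m, feedback = params[:, 0], params[:, 1]
    return 2.0 * omega_m - 0.5 * feedback + 1.5 * omega_m * feedback


def features(params):
    """Quadratic polynomial features of the two input parameters."""
    a, b = params[:, 0], params[:, 1]
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])


rng = np.random.default_rng(0)
low, high = [0.1, 0.25], [0.5, 4.0]  # assumed parameter ranges

# "Run" a modest number of simulations at random points in parameter space.
train_params = rng.uniform(low, high, size=(50, 2))
train_output = toy_simulation(train_params)

# Learn the parameters -> quantity relation with a least-squares fit.
coeffs, *_ = np.linalg.lstsq(features(train_params), train_output, rcond=None)


def emulate(params):
    """Predict the quantity for new parameter values without a new simulation."""
    return features(np.atleast_2d(params)) @ coeffs


# Compare the emulator against the toy simulation at unseen parameter values.
test_params = rng.uniform(low, high, size=(5, 2))
max_error = np.max(np.abs(emulate(test_params) - toy_simulation(test_params)))
print(f"largest emulator error: {max_error:.2e}")
```

Once trained, `emulate` answers in microseconds what would otherwise require a fresh simulation. In this toy case the fit is essentially exact because the made-up formula is itself a polynomial; real simulations are far messier, which is why large training sets are needed.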

CAMELS is a gigantic collection of different Universes, each with a different cosmology and different astrophysics. The simulations of the CAMELS project were run on supercomputers in New York and San Diego, using thousands of cores for months. Hundreds of terabytes of disk space were needed to store the data that CAMELS produced, which contains billions of particles and millions of galaxies. All that data is publicly available for anyone to explore.

CAMELS is the first of its kind. By combining cosmology and astrophysics through thousands of numerical simulations and artificial intelligence, CAMELS will help researchers investigate the properties of the Universe in a completely different way than before. Many researchers worldwide have either analyzed parts of the data or used it for different tasks. In other posts we will describe their findings in detail.


Further reading:

CAMELS introductory paper:

CAMELS data release paper:

CAMELS website:

Interested in joining the CAMELS collaboration?

Please send us an email to

Post author:

Francisco Villaescusa-Navarro

Research Scientist, Simons Foundation

160 5th Avenue, New York, NY, 10010, USA