Calculating the expansion rate of the universe

Ever since the proposal of the Big Bang model, scientists have been puzzling over the expansion rate of the universe. What is the current speed of expansion? Is our universe expanding at an accelerating or decelerating rate? What does this tell us about the substances in the universe and our ultimate fate?

In this project, the redshifts of distant astronomical objects are calculated through emission spectrum analysis of galaxies, from which their recessional velocities are deduced. The expansion rate of the universe is then determined locally using the distances and recessional velocities of two types of ‘standard candles’, which are objects with known absolute brightness: Type Ia Supernovae and Cepheid variables.

The data is partly primary and partly collected from online astronomical databases; statistical analysis of the data plays an important role in obtaining the numerical value of the Hubble parameter. The analysis also suggests an acceleration in the expansion of the universe; however, the uncertainties in the data are too large to determine a specific model for the future evolution of the universe. Unlike the value of the Hubble parameter determined theoretically by the Planck Satellite team [1], the values in this paper are determined locally.


Emission spectra and spectral line series

Emission spectra are maps of the frequencies of electromagnetic radiation emitted by particular elements, caused by excited electrons jumping from higher energy levels to lower energy levels.

Inside an atom, only certain electron energy levels are allowed. Because of that, light is only emitted at specific frequencies. Thus, the spectrum of a chemical element has a fixed pattern, with spectral lines at fixed wavelengths.

The pattern of spectral lines of a certain element is called a spectral line series; for instance, the spectral lines in the visible light range of hydrogen are called the Balmer series. Due to the invariant property of the spectral line series in distant astronomical objects, they can be identified, and their observed wavelengths provide us with information about distant astronomical objects.

Hubble’s Law and the Hubble Parameter

In 1929, Edwin Hubble observed the emission spectra of distant galaxies and found a linear relationship between the distances of the galaxies and their recessional velocities [2]. The only plausible explanation of this discovery is that the fabric of space itself is expanding. This led to the development of the Big Bang model, which predicted that the universe started in a hot, dense singularity around 13.8 billion years ago, and has since been expanding.

Using his observations, Edwin Hubble formulated Hubble’s Law:

$$v = H_0 D \qquad (1)$$

where H0 stands for the Hubble constant, which expresses the metric expansion rate of the universe and will therefore be used interchangeably with that phrase, D stands for the distance to the object, and v stands for the recessional velocity of the object. The Hubble constant can therefore be derived from the velocity and distance of an object.
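
As a minimal sketch of this last point, the snippet below rearranges Hubble’s Law, v = H0 × D, to H0 = v / D for a single object. The function name and the example numbers are hypothetical and are not taken from the data used in this project.

```python
def hubble_parameter(velocity_km_s: float, distance_mpc: float) -> float:
    """Return H0 in km/s/Mpc from one object's velocity and distance.

    Rearranges Hubble's Law v = H0 * D to H0 = v / D.
    """
    if distance_mpc <= 0:
        raise ValueError("distance must be positive")
    return velocity_km_s / distance_mpc


# Hypothetical object: receding at 1400 km/s at a distance of 20 Mpc.
print(hubble_parameter(1400.0, 20.0))  # 70.0
```

In practice a single object is not enough, since peculiar velocities and distance errors scatter the individual ratios; this is why a trendline through many objects is used later in the article.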

It was later discovered that the name “Hubble constant” is a misnomer, since H0 is constant only in space and varies in time. Thus, it would be more appropriate to call it the Hubble parameter. Considering its change in time, another way to describe the Hubble parameter is $H(t) = \dot{a}(t)/a(t)$, where a(t) is the time-dependent scale factor of the universe and $\dot{a}(t)$ is the rate of change of the scale factor [3]. The scale factor indicates the size of a standard unit of space; therefore, an increase in the scale factor indicates an expansion of the fabric of space.

Measuring the recessional velocity

To calculate the value of the Hubble parameter, one piece of data needed is the velocity at which an object is receding from us. The recessional velocities of distant astronomical objects cannot be measured directly because of the large distances between the Earth and other galaxies, which are measured in light-years. Thus, the recessional velocity is calculated from the redshift of the objects’ emission spectra. Redshift, in the context of cosmology, is the lengthening of the wavelengths of the electromagnetic radiation emitted by a source object. It is caused by the expansion of the fabric of space itself while the radiation travels to the Earth, a different mechanism from relative motion, which causes Doppler effects such as the change in tone of a passing ambulance’s siren [4].

The redshift z is defined as:

$$z = \frac{\lambda_{\text{observed}} - \lambda_{\text{rest}}}{\lambda_{\text{rest}}} \qquad (2)$$

where λrest is the rest wavelength and λobserved is the observed wavelength of the spectral lines. The calculation of redshift is only possible because specific elements only emit light at particular wavelengths, with known patterns that can be identified. Hence, so long as the emission spectrum of an element can be recognized, λrest is known.
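
The definition can be sketched in a few lines of code, assuming the usual form z = (λobserved − λrest) / λrest. The observed wavelength in the example is hypothetical; the rest wavelength is the hydrogen alpha value quoted later in this article.

```python
def redshift(lambda_observed_nm: float, lambda_rest_nm: float) -> float:
    """Redshift z = (observed - rest) / rest for one spectral line."""
    if lambda_rest_nm <= 0:
        raise ValueError("rest wavelength must be positive")
    return (lambda_observed_nm - lambda_rest_nm) / lambda_rest_nm


# Hypothetical reading: the hydrogen alpha line (rest 656.11 nm)
# observed at 662.67 nm corresponds to a redshift of about 0.01.
print(round(redshift(662.67, 656.11), 4))
```

A positive z means the line has moved towards longer wavelengths (redshift); a negative z would indicate a blueshift.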

Generally speaking, the recessional velocity v can be approximated by the linear relation v ≈ z×c, where c is the speed of light in vacuum. At low velocities, this approximation gives results similar to those of the more complex original formula. However, if the recessional velocity of an object is near the speed of light, the result of this linear approximation diverges from the original solution using general relativity, which accounts for the geometry of the expanding space using different cosmological models, as shown in Graph 1 [5].

For redshifts under 1.0, it is evident that values calculated using special relativity are close to the values calculated using general relativity, since they lie within the grey band of possible values calculated using general relativity, while the linear approximation starts to deviate from the original solution at a redshift of 0.5. As a result, a formula that accounts for the effect of special relativity would give a better approximation of the recessional velocities than the linear approximation. The formula below demonstrates the special relativistic approximation for recessional velocity:

$$v = c\,\frac{(1+z)^2 - 1}{(1+z)^2 + 1} \qquad (3)$$

which takes into account the upper limit of speed in the universe as predicted by special relativity: the speed of light, c.
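
To illustrate how the two approximations separate, the sketch below compares v ≈ z×c with the standard special-relativistic Doppler relation v = c((1+z)² − 1)/((1+z)² + 1). It is assumed here that this is the special-relativistic formula intended; the function names are illustrative.

```python
C_KM_S = 299_792.458  # speed of light in vacuum, km/s


def velocity_linear(z: float) -> float:
    """Linear approximation v = z * c."""
    return z * C_KM_S


def velocity_special_relativistic(z: float) -> float:
    """Special-relativistic Doppler relation; never exceeds c."""
    s = (1.0 + z) ** 2
    return C_KM_S * (s - 1.0) / (s + 1.0)


for z in (0.01, 0.1, 0.5, 1.0):
    lin = velocity_linear(z) / C_KM_S
    rel = velocity_special_relativistic(z) / C_KM_S
    print(f"z = {z:<4}  linear = {lin:.3f} c  relativistic = {rel:.3f} c")
```

At z = 1 the linear approximation gives exactly c, while the relativistic expression gives 0.6 c; at z = 0.01 the two differ by about half a percent.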

Calculating the distances using standard candles

Measuring the distance to very distant astronomical objects can be difficult, since the standard technique using parallax requires extremely precise instruments and takes too long. Hence astronomers use what are known as “standard candles”, which are objects whose absolute luminosity is known.

Two types of standard candles are used in this paper: Cepheid variables, which are stars that pulsate at regular periods, and Type Ia supernovae, whose light curve shape correlates with their intrinsic luminosity. Type Ia supernovae are used in addition to Cepheids because Cepheid variables are only visible over a limited range of distances.

The absolute luminosity of a Cepheid variable can be deduced from its period of pulsation, since there is a reliable direct relation between the two [6], as discovered in 1908 by Henrietta Swan Leavitt.

Type Ia supernovae occur when a white dwarf siphons mass from its binary companion until it reaches a critical mass and explodes in a runaway nuclear fusion reaction. Their luminosity can be standardized using the correlation between the width of their light curves and their intrinsic brightness [7].

Using the absolute luminosity and the observed flux of a standard candle, its distance can be deduced fairly accurately from the formula $r = \sqrt{L/(4\pi F)}$, where r is the distance, L is the luminosity, and F is the flux. Although concerns have recently been raised that intergalactic gas absorbs light coming from standard candles, making the objects appear farther away than they actually are, standard candles are currently still one of our best methods of measuring the distances to objects far from us.
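
As a quick sanity check of the inverse-square relation, the sketch below assumes the form r = √(L / (4πF)) and applies it to the Sun, whose luminosity and flux at the Earth are well known. The function name is illustrative.

```python
import math


def distance_from_flux_m(luminosity_w: float, flux_w_m2: float) -> float:
    """Distance r = sqrt(L / (4 * pi * F)) from the inverse-square law."""
    if luminosity_w <= 0 or flux_w_m2 <= 0:
        raise ValueError("luminosity and flux must be positive")
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))


SOLAR_LUMINOSITY_W = 3.828e26      # nominal solar luminosity
SOLAR_FLUX_AT_EARTH_W_M2 = 1361.0  # approximate solar constant

r = distance_from_flux_m(SOLAR_LUMINOSITY_W, SOLAR_FLUX_AT_EARTH_W_M2)
print(f"{r:.3e} m")  # roughly 1.5e11 m, about one astronomical unit
```

The same function applies to a standard candle once its absolute luminosity has been inferred, for example from a Cepheid’s pulsation period.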

Historical studies

A large number of studies measuring the Hubble constant have been published, using a variety of methods. Each study tends to give a different value of the Hubble constant, potentially because of the uncertainty of the measurements or because of fundamental flaws in current physics.

Following the original paper of Hubble published in 1929 as mentioned earlier, which proposed the method of using a velocity-distance diagram in order to elucidate the expansion history of the universe, the observations of the distances and redshifts of distant objects have been refined. Hubble’s results were close to ten times the modern value, due to systematic errors in measuring the distances to galaxies. In this project, the value produced from statistical analysis has the modern values within its range of uncertainty.

In 2018, the Planck team used the Planck Satellite’s measurements of the cosmic microwave background radiation to calculate a value for the Hubble constant using the concordance model of cosmology, which gave a value of 67.4 ± 0.5 km s⁻¹ Mpc⁻¹ [1]. The data from Planck depends largely on the theoretical models of cosmology, whereas in this project, the results are derived locally.

In March 2019, Adam Riess and his team published a local determination of the Hubble constant using the data of 70 Cepheid variables in the Large Magellanic Cloud obtained by the Hubble Space Telescope (HST): 74.03 ± 1.42 km s⁻¹ Mpc⁻¹ [8]. In this paper, Type Ia supernovae are used in addition to Cepheid variables.

Although the measurements have become more accurate since Hubble’s time, there is still a large discrepancy between the values of the Hubble constant measured using different methods; thus, it is valuable to derive a primary value. In this project, the linear trend of redshift against distance is first tested using distances estimated from galaxy clusters; then, both Cepheid variables and Type Ia supernovae are used to derive the value of the Hubble constant.


How are redshift and recessional velocity related to the distance?

The emission spectrum chosen for measuring the redshift is the hydrogen Balmer series, due to the prevalence of hydrogen in astronomical objects and the availability of spectral data in the visible range of the electromagnetic spectrum. The Balmer series is characterised by electron transitions from a higher energy level with principal quantum number n > 2 to the energy level with principal quantum number 2. The theoretical wavelengths of the emission lines can be calculated from the Balmer series equation, $\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{n^2}\right)$, using the value 10973732 m⁻¹ for the Rydberg constant RH. The spectral line is called hydrogen alpha when n = 3, hydrogen beta when n = 4, hydrogen gamma when n = 5, and hydrogen delta when n = 6. This gives λHα rest ≈ 656.11 nm, λHβ rest ≈ 486.01 nm, λHγ rest ≈ 433.94 nm, and λHδ rest ≈ 410.07 nm, where λrest refers to the theoretical rest wavelength.
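
The four rest wavelengths can be reproduced with a short calculation, assuming the standard Balmer form 1/λ = RH (1/2² − 1/n²) and the Rydberg value quoted above. The function name is illustrative.

```python
RYDBERG_PER_M = 10_973_732.0  # Rydberg constant value quoted in the text, m^-1


def balmer_wavelength_nm(n: int) -> float:
    """Rest wavelength (nm) of the Balmer line for a transition n -> 2."""
    if n <= 2:
        raise ValueError("the Balmer series requires n > 2")
    inverse_wavelength_per_m = RYDBERG_PER_M * (1.0 / 2**2 - 1.0 / n**2)
    return 1e9 / inverse_wavelength_per_m  # convert metres to nanometres


for name, n in (("H-alpha", 3), ("H-beta", 4), ("H-gamma", 5), ("H-delta", 6)):
    print(f"{name}: {balmer_wavelength_nm(n):.2f} nm")
# H-alpha: 656.11 nm
# H-beta: 486.01 nm
# H-gamma: 433.94 nm
# H-delta: 410.07 nm
```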

Since online sources of experimental data give λrest values for the Balmer series that disagree with one another, the theoretical values were verified first. To verify the theoretical values of λrest, primary data was collected in the laboratory.

Two methods were evaluated and compared: the more accessible method of using a diffraction grating, and the use of a digital spectrometer. The experiment was first designed around a diffraction grating. A diffraction grating acts like numerous slits, creating an interference pattern of light from which the wavelength of the light source can be deduced, as shown in Figures 2, 3, and 4.
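
The deduction of wavelength from the interference pattern can be sketched with the standard grating equation, d sin θ = mλ, where d is the slit spacing, θ is the angle of the m-th order maximum, and λ is the wavelength. The grating density and angle below are hypothetical and are not the values used in the experiment.

```python
import math


def grating_wavelength_nm(lines_per_mm: float, angle_deg: float,
                          order: int = 1) -> float:
    """Wavelength from the grating equation d * sin(theta) = m * lambda."""
    if lines_per_mm <= 0 or order < 1:
        raise ValueError("need a positive line density and order >= 1")
    slit_spacing_nm = 1e6 / lines_per_mm  # 1 mm = 1e6 nm
    return slit_spacing_nm * math.sin(math.radians(angle_deg)) / order


# Hypothetical reading: 500 lines/mm grating, first-order maximum at 30 degrees.
print(round(grating_wavelength_nm(500, 30.0), 1))  # 1000.0
```

Because the wavelength depends on sin θ, a small uncertainty in the measured angle translates directly into an uncertainty in the wavelength, which is consistent with the large error bars reported for this method.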

After evaluation, several problems with using a diffraction grating to measure the rest wavelength surfaced. After calculating the range of uncertainty of the results, it was found that the error bars were too large to obtain a sufficiently accurate value. Furthermore, this method only works well for monochromatic light; once more than one wavelength is present, the pattern may be too blurry for distinct emission lines to be recognized.

Therefore, the second approach of using a digital spectrometer was adopted. By placing a hydrogen discharge tube in a dark room, then using a digital spectrometer connected to the computer to generate a spectrum, the values of the wavelengths of the emission lines could be read off. Within the limit of precision, the results agreed with the theoretical predictions calculated using the Balmer series formula in the previous paragraph, verifying them. Therefore, when calculating the redshift, the predicted values were used.

The first approach to obtaining the observed wavelength, λobserved, of astronomical objects was to use the ESO (European Southern Observatory) Science Archive Facility, where archived spectra of some distant objects in the visible range of EM radiation can be accessed [9]. The emission line series of hydrogen can be identified through the pattern of the series, so λobserved can be read off.

Spectral lines with considerable widths can occur, possibly due to the overlapping of shifted spectral lines, present in different parts of the galaxy that are red-shifted slightly differently. Random decisions on where in the interval the wavelength reading is taken could introduce further inaccuracy to the data. In order to improve the consistency of the readings, the value of the wavelength corresponding to the middle of the dip was taken, as illustrated in Graph 8.

Next, for each astronomical object’s spectrum, the redshifts of all four hydrogen spectral lines were measured, and the average of the four was calculated to minimize the effect of anomalous data.
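
A minimal sketch of this averaging step is given below. It uses the theoretical Balmer rest wavelengths quoted earlier; the observed wavelengths are hypothetical and the function name is illustrative.

```python
from statistics import mean

# Theoretical rest wavelengths of the Balmer lines, in nm.
REST_NM = {"H-alpha": 656.11, "H-beta": 486.01,
           "H-gamma": 433.94, "H-delta": 410.07}


def mean_redshift(observed_nm: dict) -> float:
    """Average the redshift over every Balmer line present in observed_nm."""
    redshifts = [(observed_nm[line] - rest) / rest
                 for line, rest in REST_NM.items() if line in observed_nm]
    if not redshifts:
        raise ValueError("no recognised Balmer lines in the input")
    return mean(redshifts)


# Hypothetical readings from one spectrum, in nm.
observed = {"H-alpha": 669.3, "H-beta": 495.6,
            "H-gamma": 442.7, "H-delta": 418.2}
print(round(mean_redshift(observed), 4))
```

Averaging over four lines means that a single misread line shifts the result by only a quarter of its own error.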

However, a problem surfaced later when gathering the distances to the galaxies. Since the objects in the ESO database are mainly from the NGC (New General Catalogue), most of them are not well known, and their distances could not be found in the ESO database or in any other database. After a round of searching, only three objects, all in the Messier Catalogue, had distance data. Calculating the redshifts and then plotting distance against redshift gives a weakly positive correlation, as shown in Graph 9; however, the sample with known distances is too small, and the trend could easily be wrong due to the effect of anomalous data.

Since it was hard to establish the absolute distances of the distant objects, the approach of estimating their relative distances was used at first to find the relationship between redshift (roughly proportional to recessional velocity) and distance.

There are a few other approaches available: comparing the apparent magnitudes of galaxies, which measure their apparent brightness, or comparing the apparent sizes of galaxies, which are roughly inversely proportional to their distances.

However, both approaches assume that different galaxies have identical absolute luminosities and sizes, which is not true. To minimize the effect of this assumption, another approach was taken: using galaxy clusters. Statistically, it is likely that the medium-sized galaxies in different clusters have approximately similar sizes [10].

As a result, to estimate the relative distances of galaxies, only the apparent brightness of the 5th-ranked galaxy in each galaxy cluster was used, since these galaxies are statistically more likely to have similar absolute brightness than randomly selected galaxies. The SDSS-C4 DR2 Galaxy Cluster Catalog, a list of galaxy clusters generated using the C4 algorithm, was used as the sampling frame [11]. Thirty-one samples were randomly selected; the u-magnitude of the fifth-ranked galaxy in each sampled cluster was then recorded from the Sloan Digital Sky Survey (SDSS) SkyServer, Data Release 15. Their relative distances, d, were calculated from the data using the formula

$$d \propto \frac{1}{\sqrt{F}}, \qquad F \propto 10^{-0.4m} \qquad (4)$$

where m stands for magnitude and F is the radiant flux. Plotting redshift against relative distance generated Graph 10. The correlation is strongly positive and evidently linear, which supports Hubble’s discovery of a linear relationship between the distances and redshifts of astronomical objects, hence supporting Hubble’s Law at relatively close distances.
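
The conversion from magnitude to relative distance can be sketched as follows. This assumes the standard magnitude relation F ∝ 10^(−0.4m) together with the inverse-square law d ∝ 1/√F for objects of equal absolute luminosity, which combine to d ∝ 10^(m/5). The normalisation is arbitrary and the function name is illustrative, so only ratios of the returned values are meaningful.

```python
def relative_distance(magnitude: float) -> float:
    """Relative distance in arbitrary units, assuming equal luminosities.

    F is proportional to 10**(-0.4 * m) and d to 1 / sqrt(F),
    so d is proportional to 10**(m / 5).
    """
    return 10.0 ** (magnitude / 5.0)


# A galaxy 5 magnitudes fainter is 100 times dimmer, hence 10 times farther.
print(relative_distance(25.0) / relative_distance(20.0))
```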

To further extend the data set size and increase the reliability of the conclusions, a more computational approach was adopted. A large dataset could be used to show general trends despite being based on the assumption of equal absolute luminosity magnitudes; as a result, the use of galaxy clusters is not necessary.

Using SQL (Structured Query Language) in the SDSS DR15 database, a list of the redshifts and u-magnitudes of 700 galaxies within the constraint 20.8 < u < 25 was generated [12]. The data was then cleaned: data sets with redshift z < 0.015 were deleted, since such small redshifts are more likely to be caused by random errors and the Doppler effect due to peculiar velocities than by the expansion of space. Data points with a u-magnitude u > 26.0 or redshift z > 1.0 were also removed, since there are too few data sets in this range to be representative of any general trend.

Then, relative distances were calculated from the u-magnitudes using formula 4, and recessional velocities were calculated from the redshifts using formula 3. Graph 11 was generated from the remaining 635 sets of data.
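
The cleaning and conversion steps can be sketched as a small pipeline. The cut-off values are those stated above; the conversions assume d ∝ 10^(u/5) for relative distance and the special-relativistic Doppler relation for velocity, and the sample rows are hypothetical rather than actual SDSS records.

```python
C_KM_S = 299_792.458  # speed of light in vacuum, km/s


def clean(rows, z_min=0.015, z_max=1.0, u_max=26.0):
    """Keep (z, u) pairs inside the redshift and magnitude cuts."""
    return [(z, u) for z, u in rows if z_min <= z <= z_max and u <= u_max]


def to_distance_velocity(rows):
    """Convert (z, u) pairs to (relative distance, velocity in km/s)."""
    points = []
    for z, u in rows:
        distance = 10.0 ** (u / 5.0)             # assumes equal luminosities
        s = (1.0 + z) ** 2
        velocity = C_KM_S * (s - 1.0) / (s + 1.0)  # special-relativistic
        points.append((distance, velocity))
    return points


# Hypothetical (redshift, u-magnitude) rows.
sample = [(0.010, 21.0), (0.20, 22.0), (1.20, 23.0), (0.30, 26.5)]
kept = clean(sample)
print(kept)  # [(0.2, 22.0)]
print(to_distance_velocity(kept))
```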

The Pearson Product Moment Correlation Coefficient (PPMCC), a measure of the strength of linear correlation between two sets of data, is 0.664. This indicates a fairly strong positive linear correlation between recessional velocities and relative distances and further supports the theory of the metric expansion of space, consistent with the prediction of the Big Bang model. The strongest pattern of positive linear correlation is shown by the data sets with relative distances below 50000, where most of the data is concentrated. The linear pattern is marked by the red line in the graph, which obeys Hubble’s Law.
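
For reference, the PPMCC can be computed directly from its definition, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²). The sketch below is a plain implementation with a small hypothetical data set, not the 635-point sample used in the article.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: an imperfect but clearly positive trend.
print(round(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]), 3))  # 0.8
```

A value of 1 would mean a perfect positive linear relationship and 0 no linear relationship at all, so 0.664 sits in the moderately strong range.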

If we account for the pattern of the loosely distributed data sets at the higher end, however, the trend appears to become increasingly non-linear with distance, as shown by the blue trendline. Since the gradient here is the expansion rate of the universe, a changing gradient could be seen as evidence for the acceleration of the expansion, hence supporting the existence of dark energy. The farther away an object is, the older the light we receive from it, so more distant objects show the expansion rate at earlier times. The gradient decreases as the distance increases, meaning that the universe expanded more slowly at earlier times and indicating an acceleration in the expansion.

This is only an estimation of the trend since it was assumed when making this graph that all the objects have the same absolute luminosity, which is not the case in reality. Consequently, this trend may be affected by differences in the absolute luminosities of galaxies.

Calculating the Hubble parameter

The gradients of graphs derived in the previous sections could not be used to calculate the exact value of the expansion rate because relative distances were utilized; data of absolute distances was needed to derive a numerical value of the Hubble Parameter. Data of standard candles was utilized, since precise measurements of their absolute distances are available.

Data of Cepheid variables was located in the NED master list of galaxy distances (NED-1D), a collection of distance data from the NASA/IPAC Extragalactic Database, compiled by Barry F. Madore and Ian P. Steer [13]. When there were multiple distance measurements for the same object (measured by different instruments), the average of three values (the largest, the smallest, and the median) was taken to reduce the effect of anomalies. Using the data, the scatter graph shown was plotted.
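
The rule for combining multiple distance measurements can be sketched as below. The function name and the example distances are hypothetical.

```python
from statistics import median


def combined_distance(measurements_mpc):
    """Average of the largest, smallest, and median measurement."""
    if not measurements_mpc:
        raise ValueError("need at least one measurement")
    return (max(measurements_mpc) + min(measurements_mpc)
            + median(measurements_mpc)) / 3.0


# Hypothetical measurements of one galaxy, in Mpc.
print(combined_distance([10.0, 12.0, 20.0]))  # 14.0
```

Compared with a plain mean, this rule gives the same weight to the two extremes and the central value regardless of how many measurements cluster in between.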

There is a strong positive linear correlation for data between 0 and 14 Mpc, marked by the black trendline in Graph 13. Following the clear initial linear trend, the gradient, or the expansion rate, was calculated to be . Using a second approach of averaging the maximum and minimum gradients (marked by the red trendlines), the Hubble Parameter was then calculated to be . Both values of the Hubble Parameter derived here are reasonable, since the exact value of the Hubble Parameter is still a matter of great debate, with published values ranging from  . The great range of uncertainty could be caused by sources of random error in the experimental designs of the various methods, for instance the absorption of light by intergalactic gas.
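
The two ways of obtaining a gradient can be sketched as follows. The article does not state how the trendlines were fitted, so this sketch makes two assumptions: the main trendline is a least-squares line forced through the origin (as Hubble’s Law implies), and the maximum and minimum gradients are approximated by the steepest and shallowest lines from the origin to individual data points. The data are hypothetical.

```python
def gradient_through_origin(distances_mpc, velocities_km_s):
    """Least-squares slope of v = H0 * D with no intercept."""
    numerator = sum(d * v for d, v in zip(distances_mpc, velocities_km_s))
    denominator = sum(d * d for d in distances_mpc)
    if denominator == 0:
        raise ValueError("need at least one non-zero distance")
    return numerator / denominator


def mean_of_extreme_gradients(distances_mpc, velocities_km_s):
    """Average of the steepest and shallowest point-to-origin slopes."""
    slopes = [v / d for d, v in zip(distances_mpc, velocities_km_s) if d > 0]
    if not slopes:
        raise ValueError("need at least one positive distance")
    return (max(slopes) + min(slopes)) / 2.0


# Hypothetical Cepheid-host galaxies within 14 Mpc.
distances = [4.0, 7.0, 10.0, 14.0]
velocities = [300.0, 470.0, 720.0, 960.0]
print(round(gradient_through_origin(distances, velocities), 1))
print(round(mean_of_extreme_gradients(distances, velocities), 1))
```

Both functions return a gradient in km/s/Mpc; the gap between the steepest and shallowest slopes also gives a rough indication of the uncertainty in the result.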

However, there is a limitation to the method using cepheid variables, namely the range of data it can be used to measure: usually up to 30-40 Mpc away from Earth. This restriction results in a greater concentration of data in the range below 24 Mpc. As a result, the very few data sets at the higher distance range could not be used to deduce a reliable relationship.

This problem was tackled through the use of Type Ia supernovae, which have much brighter absolute luminosities and could be seen from larger distances. In the NED-4D data collection, 128 sets of data were randomly selected, consisting of the recessional velocities and distances of Type Ia supernovae. Plotting their recessional velocities against distances, Graph 14 was generated.

The initial trend is positively linear (as marked in red), after which the trend is better approximated by a non-linear trendline marked in blue. The Hubble Parameter was calculated from the red trendline: . This value is lower than the current best estimates of the Hubble Parameter, possibly because of the large uncertainty in the dataset: the NED-4D dataset includes supernova data from 1990 onwards, so the older measurements it contains could have had significant uncertainties that skewed the derived value of the Hubble parameter.

The decrease in gradient at larger distances, again, could be interpreted as evidence for the accelerating expansion of the universe, which in turn implies the existence of dark energy.


In this investigation, the theoretical rest wavelengths of the hydrogen Balmer series, as predicted by the Balmer series formula, were verified experimentally. The observed wavelengths were calculated from the electromagnetic spectrum of the galaxies taken from online databases. Redshifts were then calculated.

Plotting redshift against the relative distances estimated statistically using galaxy clusters shows a strong positive correlation, which supports Hubble’s Law and in turn the Big Bang model of cosmology. Then, the recessional velocities of a large sample of 635 galaxies were plotted against their relative distances, demonstrating a fairly strong positive correlation with a PPMCC of 0.664. The trend evidently becomes non-linear at larger distances with a decreasing gradient, which could be interpreted as evidence for the existence of dark energy.

The value of the Hubble Parameter derived from Cepheid variables, following the linear trend of closer galaxies, is , whereas averaging the maximum and minimum values gives . This is close to the results obtained by Riess et al. [8] using Cepheid variables, but has a larger uncertainty. The difference in uncertainty is due to the classification of two types of Cepheid variables in Riess et al. [8] and their calibration of Cepheids using other methods of distance measurement, which improves the accuracy of the value derived.

To look at trends at larger distances, data of Type Ia Supernovae were used, giving , which is within the published range of values. The trend exhibited by the data of Type Ia Supernovae at larger distances is again non-linear, with gradient decreasing with distance, indicating a potential acceleration of the expansion.

The range of uncertainty of the expansion rate indicates that we could not determine an exact model of the universe from this data alone, with multiple models of the fate of the universe all being possible. However, the existence of the acceleration in the expansion does suggest the existence of dark energy, which indicates that we are likely in the dark energy dominated era of expansion. From the observed matter density in the universe, it is highly likely that our universe would not stop expanding – it would not end in a “Big Crunch”, since the gravitational effect of normal matter is not enough to stop the expansion, and space will just keep on expanding forever. However, accounting for the effect of yet-to-be-discovered dark matter, everything is still possible.

To improve this project, one could investigate the instrumentation used to measure the data in the databases and estimate a numerical value for the uncertainty. This estimation would allow error bars to be drawn on the diagrams, which would in turn facilitate the fitting of a trendline. Currently, since the trendlines are drawn without error bars, they are only a subjective estimation; error bars could give an objective limit of the range of possible trendlines, which would improve the accuracy of the estimation.


I would like to thank Mr Gerald Skym for giving me advice on potential areas to investigate and some resources available. I would also like to thank Zoe Fong for the support she generously offered me.


  1. Planck Collaboration, N. Aghanim, Y. Akrami, M. Ashdown, J. Aumont, C. Baccigalupi, M. Ballardini, et al. 2020. “Planck 2018 Results. VI. Cosmological Parameters.” arXiv.org.
  2. Hubble, Edwin. 1929. “A Relation between Distance and Radial Velocity among Extra-Galactic Nebulae.” PNAS.
  3. Susskind, Leonard. 2013. Cosmology Lecture 1. Stanford, winter.
  4. Swinburne University of Technology. n.d. Cosmological redshift. Accessed October 30, 2019.
  5. Davis, Tamara M., and Charles H. Lineweaver. 2000. “Superluminal Recession Velocities.”
  6. Feast, M. W., and A. R. Walker. 1987. “Cepheids as distance indicators.”
  7. Phillips, M. M. 1993. “The absolute magnitudes of Type IA supernovae.”
  8. Riess, Adam G., Stefano Casertano, Wenlong Yuan, Lucas M. Macri, and Dan Scolnic. 2020. “Large Magellanic Cloud Cepheid Standards Provide a 1% Foundation for the Determination of the Hubble Constant and Stronger Evidence for Physics Beyond ΛCDM.”
  9. European Southern Observatory. n.d. ESO Science Archive Facility.
  10. N.d. The Hubble diagram.
  11. Miller, Christopher. 2005. “The SDSS-C4 DR2 Galaxy Cluster Catalog.”
  12. SDSS. n.d. SkyServer search form.
  13. Madore, Barry F., and Ian P. Steer. n.d. NASA/IPAC Extragalactic Database master list of galaxy distances.

About the Author

Hyaline Chen is currently in year 12 at the Perse School, Cambridge. She takes double maths, physics and philosophy. Hyaline is always enthusiastic about ‘the big questions’, such as the nature of reality and the start of everything, and she admires the certainty with which physics seeks to answer those questions. Of all areas, cosmology and particle physics interest her the most.
