Hubble’s Constant

It is difficult to imagine that barely 90 years ago, we were ignorant about what our universe looks like. In fact, it was believed that our universe was limited to the Milky Way. This did not change until American astronomer Edwin Hubble used his observations of Cepheid variable stars in spiral nebulae to calculate the distance to these objects. He did so by utilizing the relation between the period of a Cepheid (i.e., the time it takes for its brightness to complete one oscillation) and its luminosity. Comparing the absolute luminosity of the Cepheids to their measured brightness, Hubble was able to obtain an estimate of the distance to these objects. The nebulae where the Cepheids were located were found to be well outside our galaxy. This finally settled the debate on the nature of these nebulae (which were initially named ‘island universes’), as it was agreed that they were galaxies just like the one we live in.

Despite this incredible achievement, Hubble's most famous contribution to cosmology was yet to come. He continued his study of distant galaxies, and more specifically of their distances, by making use of the previously mentioned Cepheid method. In 1929 Hubble published one of the most iconic papers in the history of astrophysics: "A relation between distance and radial velocity among extra-galactic nebulae". In this paper, he studied the link between the velocity at which galaxies are moving away from (or towards) us and the distance that separates us from them. The results presented evidence for one of the greatest discoveries in science: the expansion of the universe. Hubble showed that most galaxies are moving away from us at a velocity that is proportional to the distance between us and the galaxy. A plot of these results can be found in Figure 1.
The discovery of the expansion of the universe is one of the greatest achievements of 20th-century astrophysics, as it reveals a much deeper secret. One would expect that, due to the gravitational force between galaxies, the expansion of the universe would slow down (and eventually reverse). However, in the late 1990s it was found that the cosmos is not just expanding, but doing so at an accelerated rate. In other words, as the universe becomes larger, it grows faster. This led to the conclusion that 'something' had to be providing the energy to overcome the gravitational pull. Due to its unknown nature, this 'substance' was named dark energy.
Figure 1. Hubble's original plot of radial velocity against distance for extragalactic nebulae. "Radial velocities, corrected for solar motion, are plotted against distances estimated from involved stars and mean luminosities of nebulae in a cluster. The black discs and full line represent the solution for solar motion using the nebulae individually; the circles and broken line represent the solution combining the nebulae into groups; the cross represents the mean velocity corresponding to the mean distance of 22 nebulae whose distances could not be estimated individually" (Note: the velocity should be in kilometers per second)
Thanks to the discovery of dark energy, along with that of dark matter, it became very clear that our universe is made of many substances other than baryonic matter (i.e., the matter we are made of). In reality, regular matter only makes up about 5% of the cosmos, a number that rises to ~25% for dark matter, and ~70% for dark energy. The study of the composition of the universe led to the creation of the Standard Model of Cosmology, which summarizes our current understanding of the origin and evolution of the cosmos.

Theoretical Background:
In this section we shall discuss a variety of theoretical concepts (that are not necessarily related to each other) which will be of key importance to understand future discussions.
• Hubble's Law: In his 1929 paper, Edwin Hubble noticed a linear relation between the distance to galaxies ($d$) and their radial velocity ($v$) (see Figure 1). The general formula for this law is:
$$v = H(t)\, d \qquad (1)$$
where $H(t)$ is the constant of proportionality, known as the Hubble constant. Despite its name, its value is not constant but changes with time, as expected for a universe whose rate of expansion evolves. Normally, we use another form of Hubble's law:
$$v = H_0\, d \qquad (2)$$
where $H_0$ is the current value of $H(t)$. This form gives insight into the rate of expansion of the universe at the present time.
In terms of units, in equation (2) $v$ is usually expressed in km s$^{-1}$ and $d$ in Mpc, so that $H_0$ has units of km s$^{-1}$ Mpc$^{-1}$.
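As a quick illustration of equation (2), the following minimal sketch evaluates Hubble's law in both directions. The function names and the value `H0_EXAMPLE = 70.0` are assumptions chosen for illustration only, not results from this experiment; velocities are taken in km/s and distances in Mpc.

```python
# Hubble's law, v = H0 * d, with v in km/s, d in Mpc and H0 in km/s/Mpc.

def recession_velocity(distance_mpc: float, h0: float) -> float:
    """Radial velocity (km/s) predicted by Hubble's law at a given distance."""
    return h0 * distance_mpc


def hubble_constant(velocity_km_s: float, distance_mpc: float) -> float:
    """H0 (km/s/Mpc) implied by a single velocity-distance pair."""
    return velocity_km_s / distance_mpc


# Illustrative value only (an assumption, not a measurement).
H0_EXAMPLE = 70.0

v = recession_velocity(100.0, H0_EXAMPLE)  # hypothetical galaxy at 100 Mpc
h0_back = hubble_constant(v, 100.0)        # recover H0 from that single pair
print(f"v = {v:.0f} km/s, recovered H0 = {h0_back:.1f} km/s/Mpc")
```

In practice a single galaxy gives a poor estimate, because peculiar velocities add scatter; this is why a fit over many galaxies is used.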

• Scientific Context
At present, there are two main methods to determine the value of the constant of proportionality $H_0$.
On the one hand, the first method relies on a phenomenon called redshift, which is a consequence of the Doppler effect. This effect is named after Christian Doppler, an Austrian mathematician who discovered that the frequency (and thus the wavelength) of sound waves changes with the relative motion of the source with respect to the observer. This phenomenon applies to any wave, including light. More specifically, when the source moves away from the observer, the wavelength of the wave increases, which in the case of light means that the spectrum is shifted towards redder colors. The redshift, $z$, is defined as:
$$z = \frac{\Delta\lambda}{\lambda_0} = \frac{\lambda - \lambda_0}{\lambda_0} \qquad (5)$$
where $\Delta\lambda$ is the difference between the wavelength of the observed light, $\lambda$, and that of the emitted light, $\lambda_0$. Additionally, for speeds much smaller than that of light ($v \ll c$), we can write $z$ as:
$$z \approx \frac{v}{c} \qquad (6)$$
Using the concept of redshift, we can compare something whose appearance we know (like the emission spectrum of hydrogen, whose $\lambda_0$ we know) to what we measure ($\lambda$), to determine $z$. This way, we can get a reasonably accurate value for the speed at which a certain galaxy is moving away from us. To figure out the distance we can use a variety of methods. Among these, we can highlight the use of Cepheid variables, Type Ia supernovae (both of which will be discussed later), or any other technique from the distance ladder. Once both the distance and the speed have been worked out, it is possible to infer a value of the Hubble constant. The current consensus for this value is about […].
On the other hand, the second procedure for finding a value of the Hubble constant is built on a much more fundamental idea: our understanding of the Universe. Here the Lambda-CDM model (ΛCDM) comes into play, which is used to describe what the Universe is made of and in what proportions. First, the Greek letter Λ stands for a cosmological constant associated with dark energy, a substance which we believe is intrinsic to space and makes its expansion accelerate.
Secondly, CDM means Cold Dark Matter, a component of the Universe we cannot see but whose gravitational effects can be measured. Finally, ΛCDM also considers baryonic matter (i.e., ordinary matter). ΛCDM is a cosmological model that has been used to accurately predict many things, including the large-scale structures of our Universe and the amount of helium that was formed in the early Universe.
Astrophysicists realized that, depending on how much of each substance the primordial plasma (i.e., the high-energy content of the early Universe) had before it started expanding, the final light this plasma would emit would be different. This light, which is essentially a snapshot of the early Universe, is called the Cosmic Microwave Background (CMB). By measuring the actual CMB, we can estimate the proportions of the three main components of the Universe: dark energy, dark matter and baryonic matter, whose abundances are about 70%, 25% and 5%, respectively. Using these values, we can work out the rate of expansion of the Universe at the present epoch, which yields a value of the Hubble constant of 67.4 ± 1.4 km s$^{-1}$ Mpc$^{-1}$.
Historically, when these two methods started to be used to measure $H_0$, they produced very high uncertainties, on the order of 5 […]. On the other hand, ΛCDM supporters argue that for the redshift method we have not measured a large enough number of galaxies, or that we are not taking into account possible gravitational interactions between galaxies, which would affect the value of the Hubble constant.
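The redshift step described above can be sketched in a few lines. This is a minimal illustration, not the survey's actual pipeline: the function names are hypothetical, the observed wavelength is a made-up value, and the rest wavelength used is that of the hydrogen H-alpha line (656.28 nm). The velocity uses the low-speed approximation, so it only makes sense for $z \ll 1$.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def redshift(lambda_observed: float, lambda_rest: float) -> float:
    """Redshift z = (lambda_observed - lambda_rest) / lambda_rest."""
    return (lambda_observed - lambda_rest) / lambda_rest


def radial_velocity(z: float) -> float:
    """Low-speed approximation v = c * z (km/s); valid only for z << 1."""
    return C_KM_S * z


H_ALPHA_REST_NM = 656.28     # rest wavelength of the hydrogen H-alpha line
lambda_observed_nm = 660.00  # hypothetical measured wavelength of the same line

z = redshift(lambda_observed_nm, H_ALPHA_REST_NM)
v = radial_velocity(z)
print(f"z = {z:.5f}, v = {v:.0f} km/s")
```

A positive `z` corresponds to a receding source; a negative value (blueshift) would indicate a galaxy approaching us.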
• Distance Measuring - Cepheid Variables: Here we shall discuss the use of Cepheid variable stars to measure distances, as this method will become important later in the experiment. Cepheids are a type of variable star, i.e., a star whose luminosity oscillates periodically with time. As mentioned above, the period of oscillation of the brightness of a Cepheid is related to its average luminosity or absolute magnitude.
This period-luminosity link is known as Leavitt's law, named after Henrietta Leavitt, its discoverer. For the visible part of the spectrum, this relation becomes:
$$M_V = 1.371 \pm 0.095 - (2.986 \pm 0.094)\log_{10}(P) \qquad (3)$$
where $M_V$ is the absolute magnitude in visible light, and $P$ is the period measured in days. If one measures the time difference between consecutive peaks (or troughs) in brightness, a value for the period can be obtained. Plugging this value into equation (3), the average absolute magnitude is found. Now all that must be done is to compare it to the apparent magnitude $m$, i.e., the magnitude of the object as measured from the Earth. To do so, we use the following formula:
$$m - M_V = 5\log_{10}(d) - 5 \qquad (4)$$
where $d$ is the distance to the star in parsecs. Solving for $d$, an estimate of the distance to a Cepheid can be obtained. This is the method used by Edwin Hubble in the 1920s to find the distance to nearby galaxies.
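The last two steps of this procedure (measuring the period from consecutive peaks, then inverting the distance modulus) can be sketched as follows. The function names and all numerical inputs are hypothetical, and the absolute magnitude is simply assumed to be already known from the period-luminosity calibration; the sketch only illustrates solving $m - M = 5\log_{10}(d) - 5$ for $d$.

```python
def period_from_peaks(peak_times_days: list[float]) -> float:
    """Mean time difference (days) between consecutive brightness peaks."""
    gaps = [t2 - t1 for t1, t2 in zip(peak_times_days, peak_times_days[1:])]
    return sum(gaps) / len(gaps)


def distance_parsecs(apparent_mag: float, absolute_mag: float) -> float:
    """Invert the distance modulus m - M = 5 log10(d) - 5 for d in parsecs."""
    return 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


# Hypothetical light-curve peak times (days) for one Cepheid.
period = period_from_peaks([2.0, 12.5, 22.5, 33.0])

# Hypothetical magnitudes: M would come from the period-luminosity relation.
d_pc = distance_parsecs(apparent_mag=20.0, absolute_mag=-5.0)
print(f"period = {period:.2f} days, distance = {d_pc / 1e6:.2f} Mpc")
```

Because the distance depends exponentially on $m - M$, an error of 0.1 magnitudes in either quantity changes the distance by roughly 5%.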
• Distance Measuring - Type Ia Supernovae: A supernova is the massive explosion of a star, and supernovae are considered the biggest explosions that take place in space, as they can be even brighter than a galaxy. They occur at the latest stages of a star's life. Most supernovae happen when the star runs out of fuel to feed the nuclear fusion reactions that take place at its core. These reactions exert an outward pressure that counters gravity. However, when there is no more fuel for these reactions, the lack of outward pressure leads to the collapse of the core of the star, which eventually results in the explosion of the star itself.
Nevertheless, this process is not always the cause of the explosion of stars.
There is a type of supernova that only takes place in very specific conditions: type Ia supernovae. These are thought to originate in binary systems consisting of a white dwarf and a moderately massive star (but more massive than the white dwarf). If these two are too close, the tidal forces exerted by the white dwarf can become stronger than the gravitational force keeping the companion star together. If this happens, the former will strip material from the latter, and this material will be accreted onto the white dwarf. If the mass of the dwarf then exceeds the Chandrasekhar limit (i.e., the maximum mass a white dwarf can have before it becomes unstable due to gravity overcoming the outward electron degeneracy pressure), the star will go supernova. These explosions are the brightest of any kind of supernovae, reaching an absolute magnitude of about −19.5 at peak luminosity.

Methods:
• Data Collection: Galactic Surveys: The Center for Astrophysics (CfA) is an ongoing collaboration between the Smithsonian Astrophysical Observatory and Harvard College Observatory founded in 1973 in Cambridge, Massachusetts. This joint project had the objective of mapping the large-scale structure of the universe.
From 1977 to 1982, the first major galactic survey was made, aiming to measure the radial velocities of the brightest galaxies (those with apparent magnitudes below 14.5) in the nearby universe: "This survey produced the first large area and moderately deep maps of large-scale structures in the nearby universe, as well as the first crude but truly quantitative measurements of the 3-D clustering properties of galaxies". The procedure followed was to use the redshift of the observed light to calculate the radial velocity of the galaxies (equations 5 and 6) and to link this to the distance to the galaxies using Hubble's law (equation 2). Thankfully for us, this survey initially looked at some nearby galaxies to find a value of the Hubble constant with which to work out the distances to the farthest galaxies (whose distances cannot be measured with conventional methods such as the use of Cepheid variables or Type Ia supernovae).
The data consisted of a list of observed galaxies with their respective radial velocities. For some of these galaxies a distance value was also included, providing all we needed to find $H_0$.

• Data Analysis:
In this section we will explore how the data from the survey was analyzed to find a value for the Hubble constant and obtain a plot of the galaxies' radial velocities versus their distance from us.
Firstly, because the data from the survey was incomplete (i.e., for some galaxies it was not specified how far away they are), we had to filter out those galaxies whose distance was unknown. From this new set of galaxies, we just had to extract their respective values of velocity and distance.
Figure 6. Plot of raw galactic data of distance and radial velocities. Note that there are several clear outliers, e.g., a few galaxies with negative velocities.
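The selection of galaxies with known distances can be sketched with a boolean mask in NumPy. The arrays below are hypothetical stand-ins for the survey catalogue, and representing a missing distance as NaN is an assumption about how the data was stored.

```python
import numpy as np

# Hypothetical catalogue excerpt: NaN marks a galaxy with no listed distance.
velocity_km_s = np.array([1200.0, 850.0, -250.0, 3100.0, 4700.0])
distance_mpc = np.array([17.0, np.nan, 0.8, np.nan, 65.0])

# Keep only the galaxies for which a finite distance is available.
has_distance = np.isfinite(distance_mpc)
v_known = velocity_km_s[has_distance]
d_known = distance_mpc[has_distance]

print(f"{has_distance.sum()} of {distance_mpc.size} galaxies have a distance")
```

The same mask is applied to both arrays so that each remaining velocity still lines up with its own distance.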
However, there was a problem with our data: there were several outliers. In other words, some galaxies had abnormal velocities that could not be explained with Hubble's law. Most of these were galaxies that were very close and had negative radial velocities (because the gravitational pull from the Milky Way was able to overcome the Hubble expansion). Nevertheless, there were some other galaxies that were very far away and still had abnormal velocities. This is most likely explained by gravitational interactions with neighboring galaxies. Because these datapoints were not useful for studying Hubble's law, they had to be discarded. We shall discuss how this was done in a moment.
With a set of velocities and distances, a fit could now be performed. To do this, we used the NumPy library in Python, more specifically the function numpy.polyfit(), which fits a polynomial of a specified degree through the data. The $n$th-degree fit function is thus:
$$f(d) = a_0 + a_1 d + a_2 d^2 + \dots + a_n d^n$$
In our case, since the relation between the velocity and the distance is linear, a first-degree polynomial was expected to fit the data. So, in reality:
$$f(d) = a_0 + a_1 d$$
Therefore, the expected output of the fit is two parameters: the y-offset $a_0$ and the slope $a_1$ of the linear function, though we are only interested in the latter.
numpy.polyfit() works by minimization of the squares (also known as a least-squares fit). Essentially, it minimizes the sum of the squared differences between each datapoint and the fit function's value at that point, over all datapoints:
$$S = \sum_i \left[ v_i - f(d_i) \right]^2 \qquad (7)$$
where $v_i$ represents each velocity value, and $f(d_i)$ represents the polynomial evaluated at the corresponding distance value $d_i$. In other words, numpy.polyfit() finds the parameters $a_0$ and $a_1$ that minimize equation (7). Additionally, this NumPy function can also return the covariance matrix of the parameters, from which their uncertainties can be obtained.
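To make the least-squares idea concrete, the sketch below fits a straight line to synthetic data with numpy.polyfit() and checks that perturbing the fitted parameters increases the sum of squared differences. The data are randomly generated for illustration (an assumed slope of 70 km/s/Mpc with Gaussian scatter) and are not the survey data; the helper `sum_of_squares` is a hypothetical name.

```python
import numpy as np

# Synthetic, illustrative data: v = 70 * d plus Gaussian scatter (not survey data).
rng = np.random.default_rng(seed=0)
d = np.linspace(5.0, 100.0, 40)                        # distances in Mpc
v = 70.0 * d + rng.normal(0.0, 300.0, size=d.size)     # velocities in km/s

# For deg=1, polyfit returns the coefficients highest power first: [slope, offset].
slope, offset = np.polyfit(d, v, deg=1)


def sum_of_squares(a0: float, a1: float) -> float:
    """Sum of squared differences between the data and the line a0 + a1 * d."""
    return float(np.sum((v - (a0 + a1 * d)) ** 2))


s_best = sum_of_squares(offset, slope)
s_perturbed = sum_of_squares(offset, slope * 1.05)     # slope changed by 5%
print(f"slope = {slope:.1f} km/s/Mpc, S_best = {s_best:.3g}, S_perturbed = {s_perturbed:.3g}")
```

Any other choice of offset and slope gives a sum of squares at least as large as `s_best`, which is what defines the least-squares solution.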
With a fit through all the data now available, the outliers had to be filtered out. To do this, all points where the absolute value of the difference between the velocity and the fit at that point was greater than 0.5 times the velocity were removed. In mathematical terms, all points where the following condition was met were deleted:
$$\left| v_i - f(d_i) \right| > 0.5\, v_i \qquad (8)$$
With all outliers removed, a second linear fit was performed, giving more accurate results for the slope and for the error on the slope. This was done following the same procedure explained earlier.
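The two-pass procedure (fit, reject, refit) might look like the following sketch. The function name `fit_with_outlier_rejection` and the small dataset are hypothetical; the rejection rule is the one stated in the text, i.e., a point is dropped when the absolute difference between its velocity and the first fit exceeds 0.5 times its velocity.

```python
import numpy as np


def fit_with_outlier_rejection(d, v, threshold=0.5):
    """Linear fit, removal of points with |v - fit| > threshold * v, then refit."""
    slope, offset = np.polyfit(d, v, deg=1)            # first pass on all points
    deviation = np.abs(v - (offset + slope * d))
    # Points with negative velocity always fail this test, since threshold * v < 0.
    keep = deviation <= threshold * v
    slope2, offset2 = np.polyfit(d[keep], v[keep], deg=1)  # second pass
    return slope2, offset2, keep


# Hypothetical data: ten galaxies following v = 70 * d exactly, plus a nearby
# galaxy with negative velocity and one distant galaxy with an abnormal velocity.
d_good = np.arange(10.0, 101.0, 10.0)
d = np.concatenate(([0.8, 30.0], d_good))
v = np.concatenate(([-250.0, 600.0], 70.0 * d_good))

slope2, offset2, keep = fit_with_outlier_rejection(d, v)
print(f"kept {keep.sum()} of {d.size} points, refitted slope = {slope2:.2f}")
```

Because the threshold scales with the velocity itself, the test is very permissive for slow, nearby galaxies with positive velocities, which is worth keeping in mind when interpreting the filtered sample.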

• Error Estimation:
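One of the main advantages of using the NumPy function polyfit() is that it can be asked to return a covariance matrix for the fit parameters.
In the case of a linear fit, i.e., $f(d) = a_1 d + a_0$, the covariance matrix looks like this:
$$C = \begin{pmatrix} \sigma_{a_1}^2 & \mathrm{cov}(a_1, a_0) \\ \mathrm{cov}(a_1, a_0) & \sigma_{a_0}^2 \end{pmatrix}$$
However, we are only interested in the error on the slope. This is calculated the following way:
$$\sigma_{a_1} = \sqrt{C_{11}}$$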
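A minimal sketch of how the slope uncertainty can be read off the covariance matrix is given below. It relies on the `cov=True` option of numpy.polyfit(); the data are synthetic and purely illustrative (an assumed slope of 70 km/s/Mpc with Gaussian scatter), not the survey data.

```python
import numpy as np

# Synthetic, illustrative data (not survey data).
rng = np.random.default_rng(seed=1)
d = np.linspace(5.0, 100.0, 50)                        # distances in Mpc
v = 70.0 * d + rng.normal(0.0, 300.0, size=d.size)     # velocities in km/s

# With cov=True, polyfit returns the coefficients and their covariance matrix.
# Both are ordered highest power first, so index 0 refers to the slope.
coeffs, cov = np.polyfit(d, v, deg=1, cov=True)
slope, offset = coeffs

slope_err = float(np.sqrt(cov[0, 0]))    # standard error on the slope
offset_err = float(np.sqrt(cov[1, 1]))   # standard error on the y-offset
print(f"H0 = {slope:.1f} +/- {slope_err:.1f} km/s/Mpc")
```

By default NumPy scales this covariance using the scatter of the residuals, so the quoted error reflects only the statistical spread around the line and not systematic effects such as errors in the distances.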

Results:
As explained earlier, the parameters of the fit were estimated by minimizing the squared error (equation 7). In this case, we are interested in the parameter corresponding to the first-order term of the polynomial, i.e., the slope. One can better understand how this is done by looking at the following figure:
Having determined a reasonable value of the slope, a plot of the datapoints and the linear fit was made, resulting in the following figure:
Figure 8. Plot of the galaxies' data obtained from the survey (in black), along with the linear fit (in red) that minimizes the sum of the squared differences.
As one can see, in this plot the outliers are no longer present, and the remaining datapoints fall in the vicinity of the linear function very well. However, because of how the filter was constructed (see equation 8), at the left end of the plot there are many datapoints that are rather far away from the fit. This is due to the fact that in this region the velocity is very small, making it easy for some points to get past the initial filter. This is partially the reason why the bulk of the galaxies on the plot is in this region.
We must now take a look at the residuals plot in Figure 9, which represents the difference between the velocity value and the fit at each point. This plot further supports the point made earlier that the filter is not very effective at low velocities, as one can see that this is where the highest residuals are actually located.
Figure 9. Plot of the residuals against distance for all galaxies.

Discussion:

Results and Procedure:
As outlined in the previous section, the results were quite satisfying, since the datapoints fall closely along our linear fit. […] The resulting uncertainty is very small, which places our experimental value 5.97 standard deviations away from the accepted value. According to a normal distribution, the likelihood of getting a difference of 5 standard deviations or higher is of order $10^{-7}$. This clearly indicates that the uncertainty has been severely underestimated. Therefore, we concluded that the method described above to determine the error on the Hubble constant should be replaced by some other procedure that does not yield such an underestimate.
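The quoted probability can be checked with a short calculation. The sketch below assumes a two-sided test on a normal distribution, for which the probability of a deviation of at least $n$ standard deviations is $\mathrm{erfc}(n/\sqrt{2})$; the function names are hypothetical.

```python
import math


def n_sigma(measured: float, reference: float, sigma: float) -> float:
    """Number of standard deviations separating a measurement from a reference."""
    return abs(measured - reference) / sigma


def two_sided_p_value(n: float) -> float:
    """Probability that a normal variable deviates by at least n sigma (either side)."""
    return math.erfc(n / math.sqrt(2.0))


p_5 = two_sided_p_value(5.0)      # the 5-sigma threshold mentioned in the text
p_597 = two_sided_p_value(5.97)   # the 5.97-sigma discrepancy reported here
print(f"P(>= 5 sigma) = {p_5:.2e}, P(>= 5.97 sigma) = {p_597:.2e}")
```

A discrepancy this unlikely is far more plausibly explained by an underestimated uncertainty than by chance, which is the conclusion drawn in the text.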
Additionally, it is worth mentioning that, as can be seen in Figure 9, the residuals (the difference between measured velocity and fit velocity) are a lot bigger at the left end of the plot than they are at the right end. This is, as discussed earlier, most likely due to the way the outliers were filtered out. Because of the procedure that numpy.polyfit() uses to calculate the error on the slope, this may have affected the resulting uncertainty on the Hubble constant. For all these reasons, it would be advisable to use some other method to identify and remove the outliers if the experiment is to be repeated.

Potential Improvements:
There are several changes that could be introduced to obtain better and more precise results. These include the ones already mentioned: using a different method to estimate the uncertainty on $H_0$, and using a different filtering method for the outliers. However, there are several more ways we could improve our results.
The most obvious potential enhancement is utilizing more datasets. That is, obtaining more galaxies' data from different galactic surveys. An instance of this could be the Hubble Legacy Archive (HLA). This platform provides access to most observations made by the Hubble Space Telescope, in several types of file formats. Additionally, it also provides a tool to calculate the radial velocity of each galaxy based on the observed wavelength of the light coming from the galaxy. More information on the HLA can be found on: https://hla.stsci.edu/.
Another potential improvement would have been to calculate the distance to the galaxies ourselves, instead of relying on the distances provided by the CfA galactic survey. To do this we could have made use of a variety of methods, one of which is main sequence fitting. This procedure relies on the Hertzsprung-Russell (HR) diagram. Without going into much detail, this is a representation of several star types. An instance of the HR diagram is the following figure:
Figure 10. Example of the HR diagram.
This diagram plots the absolute magnitude of stars (or, in the case of Figure 10, their luminosity on a logarithmic scale, which is equivalent) against their spectral type. Evolutionary patterns have been shown to relate to the mass, age and composition of the star, which allows us to classify stars into several types. The principal one of these is the main sequence, formed by stars in their hydrogen-burning phase. If a star falls in this group and its spectral properties are measured, its absolute magnitude can be estimated, which can then be compared to its measured apparent magnitude (using equation 4) to work out the distance to the star.

Importance of the Experiment:
This experiment was performed in hopes of finding a value of the Hubble constant that agrees with the theoretical value. And while the result was closer to the value predicted by the ΛCDM model than to the accepted experimental one, it is logical to think that if all the improvements explained above were implemented, we would have obtained a result within the range of the experimental redshift $H_0$.
Nevertheless, there is some hope, as new techniques for measuring $H_0$ are being developed, such as the tip of the red-giant branch, megamasers or even gravitational waves. Hopefully, these alternative methods will yield new values of the Hubble constant which will help us determine which of the two current values is more accurate.
If the Standard Model of Cosmology (ΛCDM) turns out to provide a correct value of the Hubble constant, astrophysicists will probably have to improve and change the experimental procedure for determining $H_0$, or account for external phenomena (some of which we may not even know about). On the other hand, if the redshift method is the one that is in the right, this might mean that we must rethink our understanding of the Universe. This could involve changing some parameters of the distributions of dark energy, dark matter, and baryonic matter; or maybe finding new ways the universe expands; or even something we simply cannot imagine right now.
Even though this current situation might seem quite frustrating, it shows how little we know about the place we inhabit. Furthermore, we must remember that pretty much every single major scientific discovery has had a wave of confusion and disagreement as precedent (such as the origin of species and evolution; the nature of atoms; quantum mechanics and relativity, etc.). Therefore, for all we know, this situation could very well be the precursor to another outstanding scientific revolution.