Sunday, November 1, 2009

Astronomical Optical Interferometry

A Literature Review by Bob Tubbs

St John's College Cambridge

Abstract

This report documents the development of optical interferometry and provides a physical explanation of the processes involved. It is based upon scientific papers published over the last 150 years, and I have included references to the ones which are most relevant. The reader is assumed to have an understanding of modern optical theory up to undergraduate level - References 28 and 29 give explanations at a more basic level. The formation of images from interferometric measurements is discussed and several example images are included.

Introduction

Fizeau first suggested that optical interferometry might be used for the measurement of stellar diameters at the Academie des Sciences in 1867. The short wavelength of light and the absence of sensitive calibrated detectors precluded more sophisticated interferometric measurements in the optical spectrum for over a century. After the Second World War most researchers instead turned to the radio spectrum, where macroscopic wavelengths and electronic detection greatly simplified the measurement of interferometric quantities. Modern computers, lasers, optical detectors and the data processing techniques developed for radio interferometry have recently enabled astronomers to produce high resolution images with optical arrays. At present only a few optical interferometer arrays are capable of image formation but many more are planned or under construction. The basic principles underlying the operation of optical interferometers have not changed, so I begin with a look at some of the earliest instruments.

Notes:

  • Superscript numbers (for example ¹) link to the References section of this report and give the relevant reference numbers.
For a more detailed description of astronomical optical interferometry I would recommend this review article by John Monnier (68 pages).

Early Optical Interferometry

The American physicist A. A. Michelson demonstrated the practicability of measuring light sources using optical interferometry² in 1890 with the experimental apparatus shown in Figure 1.

Figure 1 - Michelson's experimental apparatus

Various masks were placed in front of incoherent light sources, acting as "artificial stars" for the experiment. Light from a distant artificial star passed through slits O and O' and was then focused by a lens of focal length y to form an image on the screen. In a mathematical analysis of this experiment it is easier to first consider a monochromatic point source at Q on the optic axis. Spherical wavefronts will radiate from the source, reaching slits O and O' simultaneously. Light passing through slit O will interfere with light passing through slit O', forming intensity fringes on the screen either side of point P. The optical path length from Q to point P on the screen is the same for rays travelling through either slit. This will not generally be the case for light rays travelling from Q to an arbitrary point on the screen. The difference in optical path length between light rays travelling via slit O and those travelling via O' will then be xv/y to a first approximation, where x is the slit separation and v is the co-ordinate on the screen shown in Figure 1. When light rays from the two slits are combined on the screen they will interfere, producing intensity proportional to 1+cos[kxv/y], where k is the wavenumber defined as k=2π/λ and λ is the wavelength. Light rays from a point source offset from Q by an angle θ as shown in Figure 1 give light intensity on the screen proportional to 1+cos[kx(v/y+θ)]. An extended incoherent source placed at Q can be considered as a distribution of many such point sources. A polychromatic source can be considered as the superposition of many monochromatic sources of different frequency. The intensity observed on the screen will be the sum of the intensities produced by each point on the source.

Michelson was not able to make quantitative measurements of the visibility of interference fringes on the screen but did make measurements of the slit separation x which gave minimum fringe visibility. The size of the artificial star can be calculated from this measurement provided its shape and distance are known. With modern photodiode detectors it is possible to make accurate intensity measurements and hence calculate fringe visibilities. The viewing screen is replaced by four light intensity detectors as shown in Figure 2. Detector 1 is positioned so that the optical path lengths from the detector to slit O and from the detector to slit O' are equal. Detector 2 is positioned so that the optical path lengths to O and O' differ by 1/4 of the mean wavelength. For detectors 3 and 4 the path differences are 1/2 of a wavelength and 3/4 of a wavelength respectively. Consider a point source at a small angle θ to the optic axis. If A is the complex amplitude of the light arriving at detector 1 along the path through slit O, the amplitude of the light arriving via slit O' will be Aexp[-iθkx], giving a total amplitude of A+Aexp[-iθkx]. The intensity at detector 1 will be:

I1=(A+Aexp[-iθkx])(A+Aexp[-iθkx])*=2AA*(1+cos[θkx])

Similarly if A is the amplitude of the light arriving at detector 2 along the path through slit O, the intensity at the detector will be:

I2=2AA*(1+sin[θkx])

For detector 3:

I3=2AA*(1-cos[θkx])

For detector 4:

I4=2AA*(1-sin[θkx])

I have defined the complex fringe intensity I as (I1-I3)+i(I2-I4) where I1 to I4 are the intensities shown above, and i is the square root of -1. In the case of the point source shown in Figure 2

I=4AA*(cos[θkx]+isin[θkx])

=4AA*exp[iθkx]
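
The four-detector measurement can be checked numerically. The short Python sketch below is an illustration only: it assumes an ideal monochromatic point source and noise-free detectors, and the names detector_intensities and complex_fringe_intensity are hypothetical. The argument phase stands for the fringe phase θkx, and each extra quarter wavelength of path difference is modelled as a phase shift of π/2 on the beam through slit O'.

```python
import cmath
import math


def detector_intensities(amplitude, phase):
    """Return [I1, I2, I3, I4] for a point source with fringe phase `phase`.

    Detector n (counting from 0) adds a path difference of n/4 wavelength
    between the beams, modelled as a phase shift of n*pi/2 on the O' beam.
    """
    intensities = []
    for n in range(4):
        beam_o = amplitude
        beam_o_prime = amplitude * cmath.exp(-1j * (phase - n * math.pi / 2))
        intensities.append(abs(beam_o + beam_o_prime) ** 2)
    return intensities


def complex_fringe_intensity(i1, i2, i3, i4):
    """Combine the four intensities as (I1 - I3) + i(I2 - I4)."""
    return complex(i1 - i3, i2 - i4)


# Hypothetical example values: amplitude 1.5 and a fringe phase of 0.7 rad.
fringe = complex_fringe_intensity(*detector_intensities(1.5, 0.7))
print(abs(fringe), cmath.phase(fringe))
```

The modulus of the result is four times the squared amplitude and its argument is the fringe phase, so both quantities are recovered from intensity readings alone.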

Figure 2 - Visibility measurement

Figure 3 - Alternative optical arrangement

As the complex intensity I is a linear combination of intensities, the complex intensity of an extended incoherent source can be calculated by summing the contributions from each point on the source. The amplitude A(θ) of the light received from points between θ and θ+dθ on the source will depend on the source brightness distribution B(θ) in the following manner:

A(θ)A*(θ)=B(θ)dθ (assuming dθ is small)

The complex intensity for light received between θ and θ+dθ will be I(θ)=4B(θ)exp[iθkx]dθ. Integrating over all θ gives:

ITOTAL=4∫B(θ)exp[iθkx]dθ

If the variable u is defined as u=kx, then ITOTAL is proportional to the Fourier transform of the one dimensional source brightness distribution B(θ) with respect to u. If this Fourier transform is normalised by the total intensity of the source we obtain the complex visibility:

V(u)=∫B(θ)exp[iθu]dθ / ∫B(θ)dθ
Michelson did not have sensitive electronic detectors so his measurements relied on human eyesight. He succeeded in calculating the diameters of Jupiter's satellites³ using an aperture mask with two slits of adjustable separation placed over the objective of a 12-inch telescope. He measured the slit separations at which the fringes were least visible, and calculated the diameters of the satellites by assuming them to be circular disks with uniform illumination. His results agreed well with visual estimations of the satellite diameters which had been made using large optical telescopes.
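
The link between the slit separation of minimum fringe visibility and the source diameter can be illustrated with a numerical sketch. The Python below is an illustration under stated assumptions, not Michelson's own calculation: it treats the source as a uniformly illuminated disc, projects it onto one dimension, evaluates the normalised Fourier integral for the visibility by the midpoint rule, and locates the first null by bisection. All function names and sample values are hypothetical.

```python
import cmath
import math


def visibility(brightness, theta_min, theta_max, u, samples=2000):
    """Normalised Fourier integral of a 1-D brightness profile (midpoint rule)."""
    step = (theta_max - theta_min) / samples
    numerator = 0j
    total = 0.0
    for n in range(samples):
        theta = theta_min + (n + 0.5) * step
        b = brightness(theta)
        numerator += b * cmath.exp(1j * theta * u)
        total += b
    return numerator / total


def disc_profile(diameter):
    """Uniform disc projected onto one axis: proportional to the chord length."""
    radius = diameter / 2
    return lambda theta: math.sqrt(max(radius ** 2 - theta ** 2, 0.0))


def first_null_separation(diameter, wavelength):
    """Slit separation x at which the visibility first passes through zero."""
    k = 2 * math.pi / wavelength
    profile = disc_profile(diameter)

    def real_visibility(x):
        return visibility(profile, -diameter / 2, diameter / 2, k * x).real

    # Assumption: the first sign change lies between 0.5 and 1.5 wavelength/diameter.
    low, high = 0.5 * wavelength / diameter, 1.5 * wavelength / diameter
    for _ in range(40):
        mid = (low + high) / 2
        if real_visibility(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Hypothetical example: angular diameter 1e-5 rad observed at 500 nm.
x_null = first_null_separation(diameter=1e-5, wavelength=5e-7)
print(x_null, x_null * 1e-5 / 5e-7)
```

For a uniform disc the first null falls close to x = 1.22λ/diameter, so a measured null separation converts directly into an angular diameter once the disc shape is assumed.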

With the optical arrangement of Figure 2 a large objective lens or mirror is required for measurements with large slit separations and much of the light that passes through the slits in the aperture mask is wasted. Figure 3 shows an alternative optical arrangement which uses separate optical elements for the two beams. The incident light is from a distant point source at angle . Light entering each of the slits is split into four equal beams which are then directed to the detectors. The path differences between rays travelling through O and O' to each of the detectors are the same as in Figure 2, but in this arrangement all the light entering the apparatus is used efficiently. In practice glass blocks might produce reflections within the apparatus and would probably not be used. Instead, the appropriate difference in optical path length from the detectors to each of the slits could be produced by careful adjustment of the mirror positions. By varying the optical path length of one of the beams it is possible to calculate the complex visibility with just one detector. As the optical path length is varied the interference fringes will be scanned past the detector. The amplitude and phase of the intensity variations at the detector will be linearly related to the amplitude and phase of the complex visibility. In most modern interferometers the intensity variation with time is Fourier transformed to give an amplitude and phase for the complex visibility.
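
The single-detector scan can be sketched as follows. This Python example is an illustration only: it assumes the optical path is scanned linearly over a whole number of fringes, that the detector is noise-free, and that the fringe frequency is known. The function names are hypothetical. The intensity record is Fourier transformed at the fringe frequency and divided by the mean intensity to give the amplitude and phase of the complex visibility.

```python
import cmath
import math


def scanned_intensity(mean, vis_amplitude, vis_phase, fringes, samples):
    """Intensity record as the path difference is scanned across `fringes` fringes."""
    return [
        mean * (1 + vis_amplitude * math.cos(2 * math.pi * fringes * n / samples + vis_phase))
        for n in range(samples)
    ]


def visibility_from_scan(intensities, fringes):
    """Estimate the complex visibility from one Fourier component of the scan."""
    count = len(intensities)
    mean = sum(intensities) / count
    component = sum(
        value * cmath.exp(-2j * math.pi * fringes * n / count)
        for n, value in enumerate(intensities)
    )
    # The factor of 2 accounts for the cosine splitting into two exponentials.
    return 2 * component / (count * mean)


# Hypothetical scan: 5 fringes sampled at 64 points.
record = scanned_intensity(mean=100.0, vis_amplitude=0.6, vis_phase=0.4, fringes=5, samples=64)
estimate = visibility_from_scan(record, fringes=5)
print(abs(estimate), cmath.phase(estimate))
```

Because the scan covers a whole number of fringes, the constant term and the mirror-frequency term sum to zero and the estimate reproduces the visibility amplitude and phase used to generate the record.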

In 1891 Michelson⁴ discussed the possibility of obtaining information about the brightness distribution within a source from interferometric measurements. He conceded that this was not practicable as it would require accurate measurements of fringe visibility at many different slit separations. Over the next sixty years most of the work on optical interferometry concentrated instead on the measurement of stellar diameters and the separation of binary stars⁵. In 1920 A. A. Michelson and F. G. Pease⁶ constructed a separate-element Michelson stellar interferometer as shown in Figure 4. The separation of the siderostat mirrors was equivalent to the slit separation in Michelson's earlier interferometers. Separations of over 20ft were possible, enabling measurements of the diameters of several large stars to be performed. An interferometer with a 50ft siderostat separation⁷ was built in 1930, with mirrors attached to 9 tons of steel girderwork on the front of a 40 inch optical telescope. Very few astronomical measurements were made with this instrument due to the difficulty of operating it. With both of these interferometers atmospheric fluctuations produced phase variations which caused the fringes to "shimmer", making observation extremely difficult. R. Hanbury Brown⁸ estimated that atmospheric fluctuations may have led to errors of between ten and twenty percent in Michelson and Pease's stellar diameter calculations. Hanbury Brown produced more accurate measurements using an intensity interferometer at Narrabri⁸. Intensity interferometers look at the statistical relationship between the intensities at two separated detectors observing a distant source. Quantum mechanics suggests that this is related to the amplitude of the complex visibility function, allowing measurements of visibility with large detector separations. Unfortunately the phase of the complex visibility cannot be determined, and accurate visibility amplitudes can only be calculated for bright astronomical sources.

Figure 4 - Simple separate element interferometer

Development of Radio Interferometry

Much of the early work in interferometric imaging was done by radio astronomers. Cosmic radio emissions were discovered in the 1930s⁹ and radio interferometry developed after the Second World War. In 1946 Ryle and Vonberg¹⁰ constructed a radio analogue of the Michelson interferometer and soon located a number of new cosmic radio sources. The signals from two radio antennas were added electronically to produce interference. Ryle and Vonberg's telescope used the rotation of the Earth to scan the sky in one dimension. Fringe visibilities could be calculated from the variation of intensity with time. Later interferometers included a variable delay between one of the antennas and the detector as shown in Figure 5.
Figure 5 - Radio interferometer
In Figure 5 radio waves from a source at an angle θ to the vertical must travel a distance δl further in order to reach the left-hand antenna. These signals are thus delayed relative to the signals received at the right-hand antenna by a time δt=δl/c=asin[θ]/c, where a is the actual telescope separation and c is the speed of the radio waves. The signal from the right-hand antenna must be delayed artificially by the same length of time for constructive interference to occur. Interference fringes will be produced by sources with angles in a small range either side of θ, determined by the coherence time of the radio source. Altering the delay time δt varies the angle θ at which a source will produce interference fringes. It should be noted that the effective baseline of this interferometer will be given by the projection of the telescope positions onto a plane perpendicular to the source direction. The length of the effective baseline, shown at the bottom of Figure 5, will be x=acos[θ].
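
The geometry of the delay and of the effective baseline can be written out in a few lines. This Python sketch is illustrative: it assumes the waves travel at the vacuum speed of light and that the source angle is measured from the vertical, and the function names and sample values are hypothetical. The extra path to the far antenna is the separation multiplied by the sine of the angle, and the effective baseline is the separation multiplied by the cosine.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second, vacuum value assumed


def geometric_delay(separation, theta):
    """Extra travel time (s) to the far antenna for a source at angle theta (rad)."""
    extra_path = separation * math.sin(theta)
    return extra_path / SPEED_OF_LIGHT


def effective_baseline(separation, theta):
    """Separation projected onto the plane perpendicular to the source direction."""
    return separation * math.cos(theta)


# Hypothetical example: antennas 1 km apart, source 30 degrees from the vertical.
angle = math.radians(30)
print(geometric_delay(1000.0, angle), effective_baseline(1000.0, angle))
```

The same delay must be inserted electronically in the other arm, which is why changing the inserted delay steers the direction in which fringes are obtained.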

An interferometer constructed from two antennas with separation variable in one direction can only provide information about the sky brightness distribution in one dimension. However, a two dimensional map of the sky can be produced if the separation vector is varied in two dimensions. In Figure 6 the separation between two radio antennas is described by the vector (a,b) constructed from two cartesian co-ordinates. The position of the source in the sky is described using the angles θ in the plane of the a axis and φ in the plane of the b axis. As in Figure 5, the effective baseline (x,y) will be the projection of the separation vector onto a plane perpendicular to the source direction: (x,y)=(acos[θ],bcos[φ]). Measurements of complex visibility are usually plotted in the Fourier transform plane of the sky brightness distribution using the dimensionless variables u conjugate to angle θ and v conjugate to angle φ. These can be calculated as u=kx and v=ky, where k is the wavenumber of the radio source defined as k=2π/λ. Either the phase of the signals from the left-hand antenna can be measured relative to those from the right-hand antenna, or the phase of the signals from the right-hand antenna can be measured relative to those from the left. A measurement of complex visibility for an antenna separation (a,b) can thus provide values of the complex visibility function at two points in the u-v plane:

(u,v)=(kx,ky)=(akcos[θ],bkcos[φ]) and (u,v)=(-kx,-ky)=(-akcos[θ],-bkcos[φ])

Figure 6 - The telescope separation vector (a,b)

In order to produce a perfect map of the sky brightness distribution the complex visibility would have to be known for all points in the u-v plane (Fourier transform plane). The complex visibility must be known at all points in an n×m rectangular array in the u-v plane for a portion of the sky to be mapped with resolution equivalent to n×m pixels. The radio source brightness distribution B(θ,φ) is reconstructed by Fourier transforming the array of complex visibility measurements. Figure 7 shows a typical cosmic radio source with brightness distribution B(θ,φ). Fourier transforming a 40×40 array of complex visibility measurements in the u-v plane gives a relatively accurate model of the source brightness distribution, as shown in Figure 8. Figure 9 shows the cruder model formed from a 9×9 array of complex visibility measurements.
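
The effect of the number of Fourier components on the reconstructed map can be illustrated in one dimension. The Python sketch below is a simplified stand-in for the two dimensional case described above, not the procedure used to make the figures: it assumes a sky of 41 pixels containing two hypothetical point sources, computes visibilities on a regular grid of spatial frequencies with a direct Fourier sum, and transforms them back. All names and values are illustrative.

```python
import cmath
import math


def visibilities(brightness, max_freq):
    """Fourier components of a sampled 1-D sky for frequencies -max_freq..max_freq."""
    n = len(brightness)
    return {
        f: sum(b * cmath.exp(2j * math.pi * f * p / n) for p, b in enumerate(brightness))
        for f in range(-max_freq, max_freq + 1)
    }


def reconstruct(vis, n_pixels):
    """Inverse Fourier sum over the measured components only."""
    return [
        sum(v * cmath.exp(-2j * math.pi * f * p / n_pixels) for f, v in vis.items()).real
        / n_pixels
        for p in range(n_pixels)
    ]


def rms_error(image, truth):
    """Root mean square difference between a reconstruction and the true sky."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(image, truth)) / len(truth))


# Hypothetical sky: two point sources on a 41-pixel strip.
sky = [0.0] * 41
sky[15], sky[24] = 1.0, 0.5

full = reconstruct(visibilities(sky, 20), 41)   # all 41 components
coarse = reconstruct(visibilities(sky, 4), 41)  # only 9 components
print(rms_error(full, sky), rms_error(coarse, sky))
```

With every component present the sky is recovered to rounding error, while the nine-component map is a smoothed approximation with ripples around each source.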

Figure 7 - Source brightness distribution

Figure 8 - Brightness distribution with 40×40 Fourier components

Figure 9 - Brightness distribution with 9×9 components

Axes and brightness key

For direct measurement of the complex visibility at a rectangular array of points in the u-v plane a large number of different baselines is required. The cost of radio antennas soon led astronomers to try and find methods for calculating the complex visibility throughout the u-v plane using measurements from only a small number of antennas. The most important of these is the Earth rotation aperture synthesis technique.

If an interferometer is constructed from two antennas with a separation which is not parallel to the Earth's axis of rotation, the effective baseline of the interferometer will rotate. Figure 10 shows an interferometer in the northern hemisphere with antennas located at A and B. During the day antenna A will move to A' and then A'' whilst B moves to B' and B''. Only the relative positions of the two antennas are relevant when constructing a map of complex visibility in the u-v plane. To a non-rotating observer standing beside antenna A, antenna B would appear to move in a circle, and vice-versa. In a twelve hour period the complex visibility can be measured at all points on an ellipse in the u-v plane. If one of the antennas is mobile, the antenna separation can be altered every day so as to measure complex visibilities in a different part of the u-v plane. A mathematical function which approximates the complex visibility is created by interpolation from the measurements made. This can then be Fourier transformed to give an approximation to the source brightness distribution.
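
The elliptical track can be sketched for one simple geometry. The Python below is an illustration under assumptions that are not taken from the text: the two antennas lie on an east-west line, the source has declination `declination`, and the standard projection for that case is used, in which the effective baseline components are a·cos(H) and a·sin(declination)·sin(H) for hour angle H. The function name and sample values are hypothetical. Each measurement also supplies the point reflected through the origin, which is why twelve hours of observation complete the ellipse.

```python
import math


def uv_track(separation, wavelength, declination, hour_angles):
    """(u, v) points sampled by an east-west baseline, with u = kx and v = ky."""
    k = 2 * math.pi / wavelength
    points = []
    for h in hour_angles:
        u = k * separation * math.cos(h)
        v = k * separation * math.sin(declination) * math.sin(h)
        points.append((u, v))
        points.append((-u, -v))  # same measurement with the antennas swapped
    return points


# Hypothetical example: 1 km baseline, 21 cm wavelength, declination 40 degrees,
# sampled once an hour from 6 hours before to 6 hours after transit
# (the sky turns through 15 degrees of hour angle per hour).
hours = [math.radians(15 * h) for h in range(-6, 7)]
track = uv_track(1000.0, 0.21, math.radians(40), hours)
print(len(track), track[6])
```

The semi-axes of the ellipse are k·a and k·a·sin(declination), so a source near the celestial equator gives a very flattened track with this kind of baseline.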

Figure 10 - Rotation of the Earth

Information about the fine structural detail of a radio source is found at large values of u and v due to the reciprocal nature of the Fourier transform plane. In order to produce a radio map of high angular resolution it is therefore necessary to measure fringe visibilities over very long baselines. The radio signal received at an antenna cannot be sent further than a few tens of kilometers by electrical cable due to the signal loss incurred. Electronic amplification en route introduces delays and distortion to the signal. The most effective method for measuring the complex visibility for very long baseline interferometry (VLBI) is to first record the signals received by each antenna along with timing signals from a local atomic clock. The recorded signals from each antenna can then be sent to a laboratory where they are replayed to produce interference. Figure 11 shows the received signals from three antennas being recorded onto magnetic tapes along with timing signals from local atomic clocks. From these tapes the complex visibility can be calculated at six points in the u-v plane corresponding to the antenna separations a1, -a1, a2, -a2, a3 and -a3 in Figure 11.

Figure 11 - Recording radio signals for very long baseline interferometry

Each antenna will be a different distance from the radio source, and as with the short baseline radio interferometer (Figure 5) the delays incurred by the extra distance to one antenna must be added artificially to the signals received at each of the other antennas. The approximate delay required can be calculated from the geometry of the problem. The tapes are played back in synchrony using the recorded signals from the atomic clocks as time references, as shown in Figure 12. If the positions of the antennas are not known to sufficient accuracy or atmospheric effects are significant, fine adjustments to the delays must be made until interference fringes are detected. If the signal from antenna A is taken as the reference, inaccuracies in the delay will lead to errors eB and eC in the phases of the signals from tapes B and C respectively. As a result of these errors the phase of the complex visibility cannot be measured with a very long baseline interferometer.


Figure 12 - Visibility measurements in very long baseline interferometry

The phase of the complex visibility depends on the symmetry of the source brightness distribution. Any brightness distribution B(θ,φ) can be written as the sum of a symmetric component BS(θ,φ)=[B(θ,φ)+B(-θ,-φ)]/2 and an anti-symmetric component BA(θ,φ)=[B(θ,φ)-B(-θ,-φ)]/2. The symmetric component BS of the brightness distribution only contributes to the real part of the complex visibility, while BA only contributes to the imaginary part. To demonstrate the dependence of the phase of the complex visibility on the symmetry of the source I separated the 9×9 array of complex visibility used to produce Figure 9 into real and imaginary parts. Figure 13 was produced using only the real component of the visibility, with the imaginary component set to zero. As the complex visibility is then purely real (its phase is 0 or π) throughout the u-v plane, the image is symmetric about its centre. In Figure 14 the real component was removed instead, giving an anti-symmetric image. As the phase of each complex visibility measurement cannot be determined with a very long baseline, the symmetry of the corresponding contribution to the source brightness distribution is not known.
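
The connection between symmetry and the real and imaginary parts of the visibility can be checked with a one dimensional example. This Python sketch is illustrative only: it assumes a brightness profile sampled at an odd number of equally spaced angles centred on zero, and it uses an unnormalised Fourier sum because the anti-symmetric part has zero total intensity. The function names and sample values are hypothetical.

```python
import cmath


def split_symmetry(brightness):
    """Split samples centred on zero angle into symmetric and anti-symmetric parts."""
    mirrored = brightness[::-1]
    symmetric = [(b + m) / 2 for b, m in zip(brightness, mirrored)]
    antisymmetric = [(b - m) / 2 for b, m in zip(brightness, mirrored)]
    return symmetric, antisymmetric


def fourier_component(samples, u):
    """Unnormalised Fourier sum, with the middle sample at zero angle."""
    centre = len(samples) // 2
    return sum(b * cmath.exp(1j * u * (n - centre)) for n, b in enumerate(samples))


# Hypothetical lopsided source sampled at five angles.
source = [0.0, 1.0, 3.0, 2.0, 0.5]
sym, anti = split_symmetry(source)
print(fourier_component(sym, 0.7), fourier_component(anti, 0.7))
```

The first printed value has no imaginary part and the second has no real part (to rounding error), and their sum is the Fourier component of the full source.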

Figure 13 - Symmetric components

Figure 14 - Anti-symmetric components

R. C. Jennison developed a novel technique for obtaining information about visibility phases when delay errors are present, using an observable called the closure phase. Although his initial laboratory measurements of closure phase had been done at optical wavelengths, he foresaw greater potential for his technique in radio interferometry. In 1958¹¹ he demonstrated its effectiveness with a radio interferometer, but it only became widely used for long baseline radio interferometry in 1974¹². A minimum of three antennas is required. I will initially look at the simplest case, with three antennas in a line separated by the distances a1 and a2 shown in Figure 11. The radio signals received are recorded onto magnetic tapes and sent to a laboratory as described above. The effective baselines for a source at an angle θ will be x1=a1cos[θ], x2=a2cos[θ] and x3=(a1+a2)cos[θ]. The phases of the complex visibility of the radio source corresponding to baselines x1, x2 and x3 I will call Φ1, Φ2 and Φ3 respectively. The phase of interference fringes on each baseline will contain errors resulting from eB and eC in the signal phases. The measured phases for baselines x1, x2 and x3, denoted ψ1, ψ2 and ψ3, will be:

ψ1=Φ1+eB-eC

ψ2=Φ2-eB

ψ3=Φ3-eC

Jennison defined the quantity C for the three antennas as:

C=ψ1+ψ2-ψ3=Φ1+Φ2-Φ3

C is often called the closure phase¹².
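
The cancellation of the station errors in the closure phase can be verified directly. The Python sketch below is a minimal illustration that follows the three phase relations given above; the function names and the sample numbers are hypothetical, and phases are kept in radians without wrapping.

```python
def measured_phases(true_phases, err_b, err_c):
    """Corrupt the three true visibility phases with the errors from tapes B and C."""
    p1, p2, p3 = true_phases
    return (p1 + err_b - err_c, p2 - err_b, p3 - err_c)


def closure_phase(phase1, phase2, phase3):
    """Jennison's combination of the phases on the three baselines."""
    return phase1 + phase2 - phase3


# Hypothetical true phases and large, unknown station errors (radians).
true = (0.3, -1.1, 0.8)
measured = measured_phases(true, err_b=2.5, err_c=-0.9)
print(measured, closure_phase(*measured), closure_phase(*true))
```

Each measured phase differs from its true value, yet the closure phase computed from the measured phases matches the one computed from the true phases for any choice of the two errors.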

The contributions to C from errors eB and eC in the signal phases cancel out, allowing accurate measurement. Using measurements of C, Φ3 can be written in terms of Φ1 and Φ2, the unknown phases. If many closure phase measurements are made the complex visibility can be written as a function of several unknown phases. In order to produce an image of the sky the unknown phases must be estimated so that the complex visibility function can be calculated. This is usually done using iterative algorithms¹³,¹⁴,¹⁵ which attempt to minimise unphysical properties of the image, such as areas of negative brightness (black areas above and below the source in Figures 8 and 9) and large fluctuations in the background radio noise well away from the known location of the source. In radio astronomy visibilities are typically measured on more than three baselines simultaneously, providing more information about the source than Jennison's closure phase technique. The mapping algorithms are designed to retrieve the maximum amount of information from the measurements performed without adding artificial detail. Images have been produced with baselines of many thousands of kilometers and angular resolution better than one milliarcsecond.