Non-Periodic Phenomena in Variable Stars
IAU Colloquium, Budapest, 1968

SPECTRAL ANALYSIS OF THE LIGHT CURVES OF T TAURI STARS AND OTHER OBJECTS

STEPHEN PLAGEMANN
Institute of Theoretical Astronomy, Cambridge, England

LIST OF SYMBOLS

(1) f frequency in units of cycles per day.
(2) f_N Nyquist frequency, where f_N = 1/(2 Delta t).
(3) omega angular frequency in radians, omega_N = pi/Delta t.
(4) Delta t sampling interval.
(5) n number of observations.
(6) lambda(k) lag window in the time domain, also known as the covariance averaging kernel.
(7) k number of lags in the time domain.
(8) m the total number of lags.
(9) X_i(t) observed time series (light curve).
(10) sigma^2 variance.
(11) mu_i mean of time series.
(12) gamma_11(k) ensemble autocovariance of lag k uncorrected for the mean.
(13) gamma_12(k) crosscovariance of X_1(t) and X_2(t).
(14) rho_11(k) autocorrelation coefficient.
(15) c_11(k) sample estimate of the autocovariance.
(16) r_12(k) ordinary product-moment correlation between X_1(t) and X_2(t).
(17) U_k series used as a filter in the time domain by convolution.
(18) T(omega) spectral window or kernel.
(19) b(omega) bias.
(20) nu degrees of freedom.
(21) B_omega bandwidth.
(22) A_11(k) autocorrelation function of X_1(t).
(23) A_21(k) crosscorrelation function of X_2(t) with X_1(t).
(24) I_11(omega) spectral estimate for harmonic analysis.
(25) eta(omega - y) spectral window in the frequency domain.
(26) f(omega) spectral density function of X_1(t); the cosine transform of the autocovariance of X_1(t).
(27) R_j^2 contribution to the sample variance from the jth harmonic.
(28) E(k) the energy per unit interval of f, centred upon k = 0, 1, ..., m.
(29) Z(k) the co-spectrum; the cosine transform of the crosscorrelation.
(30) W(k) the quadrature spectrum; the sine transform of the crosscorrelation.
(31) R(k) the coherence.
(32) theta(k) phase difference.
(33) G(omega) the gain.
(34) T(i omega) transfer function.

I.
INTRODUCTION

The purpose of computing power spectra from the light curves of irregular stars is both theoretical and practical. A power spectrum is an ideal tool for studying a complicated light or velocity curve whose inherent periodicities may be obscured by noise. If one lacks theoretical light or velocity curves, power spectrum analysis of the empirical data can reveal the presence of peaks above the noise level, giving an exact value for the periodicities and their amplitudes, and the distribution of energy at different vibrational frequencies. We can also study the changes in time of the period, amplitude and energy distribution. This information can in turn be compared with other properties of the T Tauri variables, such as their peculiar absorption and emission spectra, their spectral type, and their age and mass when compared to a theoretical H-R diagram.

The preparation of power spectra of T Tauri stars involved a research scheme carried out in three stages. Stage I began with an extensive search of the literature of the past fifty years for published light curves of all the 130 T Tauri stars given in a table by Herbig (1962), plus some additions. This task was eased by references in two bibliographical works, "Geschichte und Literatur des Lichtwechsels der Veränderlichen Sterne" and the "Astronomischer Jahresbericht". It was necessary to check through each reference to discover which contained useful data. The light curves were then put on to punched cards, each card holding a single Julian Date and magnitude. After the light curves were punched on IBM cards (some 27,000 readings in all), they were sorted serially and counted on an IBM 1401 computer; the light curves were then printed out and checked visually for the presence of flares. The criterion used was an increase in brightness of more than 0.4 mag in the course of one hour.
Stage II involved the calculation of pilot power spectra of T Tauri stars for different sampling intervals Delta t and lags k in order to select stable spectral estimates. Checks for stationarity and normality were made by breaking long data sequences into halves and comparing the power spectra of each. Stage III, the estimation of the final power spectra of T Tauri stars, was done using optimal sampling intervals and lags, i.e. those for which all the essential details appeared in the spectrum. An important factor in revealing the data was a prewhitening or filtering of the data in order to reduce the effects of power leakage from higher frequencies, which tends to distort the power spectrum. As a further check of the power spectrum's stability, the light curves were converted to a new time series by the addition of Gaussian-distributed random numbers within one standard deviation of the mean, and the resulting power spectrum was then computed. Definitions and procedural details will be presented below. All of the computations involving time series used a system of programmes (BOMM) developed by Bullard et al. (1966) for geophysical applications. The calculations were carried out on the IBM 7090 computer at Imperial College, London. Total computation times for each light curve analysed and other details will be presented in a later paper.

II. TIME DOMAIN AUTOCORRELATION

In the following pages, two forms of spectral analysis will be discussed: a) auto-spectral analysis of a single time series, and b) cross-spectral analysis of pairs of time series, for the purpose of comparing sets of empirical data with theoretical models. We must first consider in the time domain some statistical limitations upon the data. The spectrum of the discrete series X_1(t) is only defined for frequencies up to f_N = 1/(2 Delta t) cycles per day. As we shall see, some arrangement must be made so that the process contains negligible power at frequencies higher than f_N.
If there are no obvious linear trends in the data we can assume stationarity, a type of statistical equilibrium which implies that the random variables X_1 and X_2 have the same probability density function, i.e. f(X_1) and f(X_2) are independent of t. In practice we can estimate f_1(X_1) and f_2(X_2) by forming histograms of the observed values of X_1(t) and X_2(t). A small visual difference is sufficient to confirm the stationarity assumption (Jenkins, 1965). Tukey (1961) observed that unless one is sure about the processes, it is wise not to average the spectrum over too long a time, since stationarity may hold only for a short time. In fact we are frequently more interested in changes of the energy distribution at each frequency than in the power spectrum itself, since these may give physical insight into the mechanism producing the irregular light curves. Another way of formulating stationarity is to insist that all multivariate distributions involving X_1 and X_2 depend only on time differences t - t' and not on t and t' separately. The joint distribution of X_1(t) and X_2(t) is then the same for time points k lags apart, allowing us to construct a scatter diagram of X_1(t) and X_2(t - k). If this joint distribution can be approximated reasonably closely by a multivariate normal distribution, then it is characterized by its means and its covariance matrix. If the errors are Gaussian or normal, then mu_i and sigma^2 characterize the distribution completely, and the condition of normality is fulfilled. We must now define more carefully the covariance matrix, its components and its relation to other statistical quantities. The variance sigma^2 is: (2.1) where (2.2) and also (2.3) where p(x) is the probability density function of the errors and mu_i is the mean value of X under p(x). We can also write the sample estimate c_11(k) of the autocovariance of lag k uncorrected for the mean.
(2.4) Since c_11(k) is even (2.5) If the number of observations n increases indefinitely, c_11(k) becomes the ensemble autocovariance gamma_11(k). Priestley (1965) points out that c_11(k) has a bias b(omega) of order m/n, but since most estimates of the spectral density use only the first m (<< n) values of c_11(k), the bias is unimportant. If we wish to compute only the autocorrelation, then we use an unbiased estimate of c_11(k). However, calculating the power spectrum of X_1(t) requires a biased estimate of c_11(k) (see equation (3.14) and the discussion immediately following), so a weighted autocorrelation becomes: (2.6) The autocorrelation coefficients are rho_11(k) = gamma_11(k)/gamma_11(0); these are the same as the regression coefficients of X_1(t) on X_1(t - k). The cross correlation coefficients rho_12(k) become (2.7) Thus if the time series is stationary and normal, its statistical process, i.e. the joint distribution of X_1(t) for times t_1, t_2, ..., t_n, is characterized sufficiently by mu_i, sigma^2 and rho_12(k) (Jenkins, 1961). More formally, the estimators r_12(k) for rho_12(k) are the ordinary product-moment correlations between sequences (2.8) where mu_1 and mu_2 are means calculated from the first n - k and last n - k terms, respectively. The plot of r_11(k) against the lag k is called the autocorrelation function A_11(k), and that of r_12(k) against k the cross-correlation function. It is worth noting that the means mu_1 and mu_2 must be subtracted to produce proper autocorrelation coefficients. If we do not remove the sample mean before calculating the autocovariances, it will dominate the contribution to the cosine transform not only at zero frequency, but also at frequencies close to zero. After the mean is removed, its contribution to the Fourier transform of rho_11(k) is confined to zero frequency; but if a linear trend is present as well, there will be an additional contribution to the power at all frequencies, predominantly at low frequencies.
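The quantities defined above can be sketched in a few lines. This is an illustrative reconstruction of the biased sample autocovariance (2.4) and the autocorrelation coefficients, under the usual definitions, and not the BOMM code itself; the function names are the author's own.

```python
import numpy as np

def autocovariance(x, max_lag):
    """Biased sample autocovariance c_11(k), k = 0 .. max_lag.
    The 1/n normalisation (rather than 1/(n - k)) gives the
    biased estimate the text requires for power-spectrum work."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()                      # remove the sample mean first
    return np.array([np.dot(x[:n - k], x[k:]) / n
                     for k in range(max_lag + 1)])

def autocorrelation(x, max_lag):
    """rho_11(k) = c_11(k) / c_11(0)."""
    c = autocovariance(x, max_lag)
    return c / c[0]

# For a pure sinusoid the autocorrelation itself oscillates at the
# same frequency, damped slightly by the 1/n bias.
t = np.arange(500)
rho = autocorrelation(np.sin(0.3 * t), 20)
```

Note that with the 1/n normalisation the coefficients always satisfy |rho_11(k)| <= 1, which the unbiased 1/(n - k) estimate does not guarantee.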
We note also that if the lag k becomes too large, r_12(k) becomes erratic in appearance, since fewer observations fall into each group, so that the variance of the estimate increases. Thus resolution, which grows with the number of lags k, works in the opposite direction to the variance, and a compromise between the two must be effected to estimate autocorrelation coefficients accurately.

III. HARMONIC ANALYSIS OF STRICTLY PERIODIC TIME SERIES

The sample autocorrelation function is a difficult quantity to interpret physically, since neighbouring correlations tend to be correlated. We shall therefore look at the time series transformed into the frequency domain. We shall first decompose a time series in a manner that is valid only for strictly periodic phenomena. Afterwards we shall show how certain types of noisy spectra can be represented, i.e. we shall estimate the power spectrum of a time series containing both periodic signals and noise. Following Jenkins (1965), we take N = 2n sampling intervals of a strictly periodic time series (light curve), which will be fitted to a harmonic regression of the form (3.1) or (3.2) where The observations are represented by a mixture of sine and cosine waves whose frequencies are multiples of the fundamental frequency 2 pi/N. Then (3.3) (3.4) and (3.5) If only a few of the harmonics are used, and the remainder are regarded as error, then (3.4) and (3.5) give the usual least squares estimates of the constants. The total sum of squares (variance) about the mean can be decomposed: (3.6) If we assume no harmonic terms are present, then (N/2) R_j^2 is distributed as chi-squared with two degrees of freedom.
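Since equations (3.3)-(3.5) are not reproduced in this text, the sketch below assumes the standard least-squares estimates of the harmonic coefficients, a_j = (2/N) sum x_t cos(2 pi j t/N) and b_j = (2/N) sum x_t sin(2 pi j t/N). By orthogonality they recover an injected harmonic exactly, and (a_j^2 + b_j^2)/2 is that harmonic's contribution to the sample variance.

```python
import numpy as np

def harmonic_coefficients(x, j):
    """Least-squares cosine and sine amplitudes of the jth harmonic
    of the fundamental frequency 2*pi/N (assumed standard form of
    the unprinted equations (3.4)-(3.5))."""
    N = len(x)
    t = np.arange(N)
    a = (2.0 / N) * np.sum(x * np.cos(2 * np.pi * j * t / N))
    b = (2.0 / N) * np.sum(x * np.sin(2 * np.pi * j * t / N))
    return a, b

# A strictly periodic series: a single harmonic (j = 5) with known amplitudes.
N = 240
t = np.arange(N)
x = 1.5 * np.cos(2 * np.pi * 5 * t / N) + 0.8 * np.sin(2 * np.pi * 5 * t / N)
a5, b5 = harmonic_coefficients(x, 5)
# (a5**2 + b5**2) / 2 is this harmonic's share of the sample variance
```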
If we define the sample estimate of the autocovariance of lag k uncorrected for the mean, c_11(k), then the sample spectral density function is (3.7) We can now carry out a harmonic analysis for strictly periodic phenomena as (3.8) If the series X_1(t) is stationary, and if the number of observations N increases indefinitely, the histogram of the R_j^2 becomes a continuous curve known as the spectral density function f(omega), and c_11(k) is replaced by gamma_11(k), so that (3.9) It is worth noting that f(omega) and gamma_11(k) are Fourier transforms of each other. We have (3.10) and we write rho_11(k) = gamma_11(k)/sigma^2 (3.11) The Wiener-Khinchin theorem states this more formally: a function F(omega), the integrated spectrum of X_1(t), has the physical interpretation that follows from the Fourier expansion of X_1(t), i.e. (3.12) where dM(omega) is an orthogonal stochastic process in the interval (-infinity, infinity) with the estimate (3.13) Thus dF(omega) represents the average "power" associated with the components of X_1(t) whose frequencies lie between omega and omega + d omega. We shall now see why this easy progression in the proof does not yield valid results for physically meaningful spectra: simply, the tenets of the basic assumption of a strictly periodic X_1(t) are violated for stars with irregularly varying light curves. As we have stated, harmonic analysis works only for strictly periodic phenomena. For mixed time series harmonic analysis does not give a natural estimate of the spectrum. In fact, engineers and statisticians know that harmonic analysis of noise or random numbers produces a highly spiked spectrum. For large sample sizes the mean of the periodogram I_11(omega) does tend to the spectral density, but the variance of the fluctuations of I_11(omega) about f(omega) does not tend to zero as n -> infinity. For large n, the distribution of I_11(omega) is a multiple of a chi-squared distribution with two degrees of freedom, independently of n.
Another way to see that for mixed spectra harmonic estimates of the periodogram I_11(omega) are spurious is that (3.14) where As we see, the weights tend to zero as k tends to N, giving erratic and incorrect estimates. We shall now turn to a sound method of estimating time series spectra as developed by Wiener (1967), Blackman and Tukey (1958), and others.

IV. ESTIMATION OF MIXED AUTOSPECTRA

We define a mixed spectrum by decomposing F(omega), the integrated spectrum of X_1(t), as (4.1) where F_1(omega) is an absolutely continuous function having a derivative f(omega), also called the spectral density function, and F_2(omega) is a step function with jumps at certain frequencies omega_n, n = 1, 2, ... If a_1 = 1, a_2 = 0, the spectrum is purely continuous; if a_1 = 0, a_2 = 1, it is purely discrete. If a_1 ≠ 0 and a_2 ≠ 0, the time series is said to possess a mixed spectrum. If a process contains periodic terms plus a residual process which itself has a continuous spectrum, then we have to estimate (1) the number of sine waves, (2) their amplitudes and frequencies, and (3) the spectral density function of the residual process. To estimate mixed spectra one defines both a series lambda(k) as a time domain window and a spectral window tau(omega) as a window in the frequency domain. We choose for mixed spectra to define spectral estimates in the form (4.2) where we assume that the width of the spectral window associated with lambda(k) is so small that the spectrum varies little over the window. We can also write for gamma_11(k) (4.3) Another expression of (4.2) is (4.4) where (4.5) with (4.6) and (4.7) Here tau(omega - y) is the spectral window. In changing omega we move the slit through the entire frequency range, with a bandwidth of +-2 pi/N about y = omega. Jenkins (1961) gives several physical definitions of bandwidth which approximate a rectangular band-pass window.
In general we replace the autocovariances gamma_11(k) by the autocorrelations rho_11(k), so that (4.8) For the choice of moving weights lambda(k), we can choose from a number of cosine-like series, all nearly the same. We have used (4.9) This weight gives variance f(omega) ~ 3m/4n; thus making m small increases the bandwidth, hence decreasing the variance, for this choice of lambda(k). This operation can be regarded as a filtering process using a rectangular-like band pass with small side lobes in order to increase the accuracy of the spectral density f(omega). Unless one is faced with a peaky or steeply slanted spectrum, the exact shape of the window is not too critical. This window (4.9) and others like it tend to a normal distribution, hence the transform has the same shape as the window itself: it retains its shape in both time and frequency domains. For this particular window, the eta(omega - y) appearing in the spectral estimate is given by Jenkins (1961) as (4.10) Most windows are similar to those generated from the window lambda(k) = 1, giving a shape similar to the probability density function of a rectangular random variable. This particular window (4.9) has both small bandwidth and small side lobes, cutting the leakage of spectral power from frequencies distant from y = omega, which would distort the true picture at y = omega_0. It must be emphasized that there are a number of spectral windows, each with different bandwidths. We shall see later that filters similar to lambda(k) are used to suppress certain undesirable features in the spectrum such as power leakage. More important than the choice of window shape is the choice of the total number of lag steps k, which we call m. The nature of the analysis attempted here is to use a fixed run of observations and a fixed sampling interval Delta t. Priestley (1962) has attempted to establish quantitative expressions for designing an optimum spectral estimate, involving the relation of variance sigma^2 and bias b(omega).
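As an illustration of the lag-window estimate, here is a minimal Blackman-Tukey spectrum. The window (4.9) itself is not reproduced in this text, so the Tukey-Hanning form lambda(k) = (1/2)(1 + cos(pi k/m)) is assumed; it is a cosine-like window with variance ratio of order 3m/4n, as quoted above.

```python
import numpy as np

def bt_spectrum(x, m, dt=1.0):
    """Blackman-Tukey spectral estimate at frequencies
    f_j = j / (2 m dt), j = 0..m, using the (assumed)
    Tukey-Hanning lag window lambda(k) = 0.5*(1 + cos(pi k / m))."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    k = np.arange(m + 1)
    # biased autocovariances c_11(k), k = 0..m
    c = np.array([np.dot(x[:n - j], x[j:]) / n for j in k])
    lam = 0.5 * (1 + np.cos(np.pi * k / m))
    spec = np.array([2 * dt * (c[0] / 2 +
                     np.sum(lam[1:] * c[1:] * np.cos(np.pi * j * k[1:] / m)))
                    for j in k])
    freqs = k / (2.0 * m * dt)
    return freqs, spec

# A sinusoid at 0.1 cycles per unit time should give a peak there
# (index j = 8 of 0..40, since 8/80 = 0.1).
t = np.arange(600)
freqs, spec = bt_spectrum(np.sin(2 * np.pi * 0.1 * t), m=40)
```

Increasing m narrows the bandwidth and sharpens the peak, at the cost of a larger variance, exactly the trade-off discussed in the text.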
As we have mentioned earlier, one must first fix the form of the weight sequence and the number of lags m. A not always realizable suggestion by Priestley (1962) is to have the bandwidth B_omega not greater than the width of the narrowest peak of f(omega); more specifically, B_omega is the distance between the 'half-power' points in the main lobe. Blackman and Tukey (1958) discuss an optimal design in terms of sampling errors of f(omega), assuming a chi-squared distribution with nu degrees of freedom, including a measure of f(omega) in terms of variance, but not of bias b(omega). We shall discuss these points later when considering the errors inherent in any estimate of f(omega). These considerations of sigma^2 and b(omega) are not strictly applicable to this particular type of power-spectrum analysis of light curves of irregular variable stars taken from the astronomical literature. Unless one studies only short period fluctuations of variable stars, or has several observatories around the world keeping constant watch on certain stars, one's sampling rate will always be disturbed by the day-night variation. The weather will also confound the most diligent observer, and tend to frustrate any decrease of Delta t even for very long records of light curves. Jenkins (1965) suggests an empirical method which has been followed in this work, and which will be illustrated by a few examples and diagrams. Those who try to determine an optimal m always find that the answer depends on the spectral density f(omega) itself. We therefore start with a small m (a large bandwidth). We then increase m, decreasing the bandwidth and improving the resolution, stopping when we are satisfied about the detail of the spectrum (following the suggestion of Priestley). The matter may be judged in the light of the consideration that the variance tends to increase with m; we must in fact balance the accuracy of the spectral estimate (the variance) against the number of lags m, i.e.
the resolution of fine structure in the spectrum. The ideal method to increase resolution at low frequencies would be to increase the total length of the data, giving a range of possible periods of up to 1,000 days. Along with this increase in n we could safely increase m and still retain a high level of confidence in our statistical estimates of the spectra. The diurnal variation of the Earth, infrequent bad weather and the new Moon put a very real limitation on an accurate f(omega), short of an enormous increase in the time and effort of the observational astronomers. The irregular spacing of the observations that make up a light curve is dealt with by a subroutine of BOMM that interpolates the time series, using second differences, at regular intervals Delta t over the range of observations that one chooses. In pilot studies, we usually choose several values of Delta t. We conclude this section with the specific formulation of the spectral density as computed by a BOMM subroutine. The actual cosine transform of the autocorrelation is written: (4.11) where epsilon_i = 1/2 if i = 0, epsilon_i = 1 otherwise; j = 0, 1, ..., l', where l' is the greatest integer such that omega' + l'/(2 Delta t) <= omega_N, omega' being the low frequency limit (usually 0) and omega_N the Nyquist frequency. Jenkins (1961) observes that the logarithmic transformation of the spectral density f(omega) produces a distribution whose variance is nearly independent of the frequency. We see that (4.12) where beta_i depends on the window used for the spectral estimate and n is the total number of observations. If we then write an expression independent of the frequency (4.13) log_10 f(omega) produces estimates that do not rely so heavily upon normality; in fact it produces distributions closer to normality. It is assumed that the errors here are produced by the variance and not by leakage of power from other frequencies or by aliasing.
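The interpolation step can be sketched as follows. BOMM's actual subroutine uses second differences (quadratic interpolation); as that code is not reproduced here, plain linear interpolation via numpy's np.interp stands in, purely to show the resampling of an irregular light curve onto a regular grid of spacing Delta t. The toy Julian Dates and magnitudes are invented.

```python
import numpy as np

def resample(jd, mag, dt):
    """Resample an irregularly observed light curve (Julian Dates jd,
    magnitudes mag) onto a regular grid of spacing dt.  Linear
    interpolation is a stand-in for BOMM's second-difference scheme."""
    jd = np.asarray(jd, dtype=float)
    mag = np.asarray(mag, dtype=float)
    t = np.arange(jd[0], jd[-1] + 0.5 * dt, dt)   # regular grid over the record
    return t, np.interp(t, jd, mag)

# A toy light curve: six irregular observations, resampled at dt = 1 day.
jd = np.array([0.0, 1.3, 2.1, 4.0, 5.2, 7.0])
mag = np.array([10.0, 10.4, 10.1, 9.8, 10.2, 10.0])
t, m = resample(jd, mag, dt=1.0)
```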
It is worth noting that this leads us easily to the method engineers use for expressing spectral power, since (4.14) We see that an increase of power by a factor of 2 is equivalent to a change of 3 decibels.

V. ESTIMATION OF MIXED CROSS SPECTRA

Often we need to compare two empirical time series X_1(t) and X_2(t) that arise in similar fashions. To do this, we shall show how one estimates the cross spectrum. We shall also show that if X_1(t) and X_2(t) do not have similar origins, then one can estimate the gain of a system, and perhaps the coherent energy. Let one of the time series X_1(t) be a theoretical complex input and X_2(t) the actual recording of the light curve. We can write (5.1) where n(t) is the noise term which arises because other input variables are not well controlled. If the fluctuations in X_1(t) are not too large, then we linearize the above equation to (5.2) where n(t) now contains quadratic and higher terms. This relationship is a linear dynamic equation; the regression or impulse response w(u) can be estimated in several ways, by changing X_1(t) in a well defined step pattern or sinusoidally. In the latter case we let X_1(t) = delta cos omega t and (5.3) then the transfer function becomes (5.4) T(i omega) is the transfer function, G(omega) the gain and Phi(omega) the phase shift. If n(t) is small, the gain G(omega) may be obtained from the ratio of the amplitudes of the output and input, and Phi(omega) by matching up the two waves. In this case we would use a sine wave generator to actuate X_1(t) and sweep out a range of frequencies to generate all the information. We shall now estimate w(u) from the existing noisy fluctuations in X_1(t) and X_2(t).
These two series are now taken to be stationary time series, with X_1(t) and n(t) uncorrelated, so we can write the crosscovariance by multiplying (5.2) by X_1(t - k) and averaging: (5.5) where gamma_12(k) is the crosscovariance function of lag k between X_1(t) and X_2(t), and gamma_11(k - u) is the autocovariance function. If gamma_12(k) and gamma_11(k - u) are known, then w(u) may be estimated by solving (5.5), usually called the Wiener-Hopf equation. If we take the Fourier transform of both sides, T(omega) is then (5.6) where f_12(omega) is the cross spectral density function, and sigma_1 and sigma_2 are the standard deviations of X_1(t) and X_2(t). The spectral density f_12(omega) is given by (5.7) We have shown that the frequency response T(omega) is given by the ratio of the cross spectral density and the input auto spectrum. We now write gamma_12(k) as: (5.8) We have already defined gamma_11(k) and f_11(omega) in an earlier section of the paper. Since gamma_12(k) ≠ gamma_21(k) in general, (5.7) gives both a cosine and a sine transform of the crosscorrelation (5.9) (5.10) where (5.11) (5.12) and (5.13) Here W_12(omega) is the quadrature or out-of-phase spectrum, and Z_12(omega) is called the co- or in-phase spectrum. Using (5.6) and (5.8) we can write the gain G(omega) and phase Phi(omega) as (5.14) (5.15) tan Phi(omega) here is the phase difference. This formulation is due to Munk et al. (1959). If we are examining the covariance between two stationary time series X_1(t) and X_2(t), and not treating them as input and output of some linear system, one usually plots the coherence R^2(omega), which is normalized to run between 0 and 1, so that (5.16) The coherence R^2(omega) is analogous to the correlation coefficient, calculated at each frequency omega. The spectrum of the noise term n(t) in (5.2) is now (5.17) which is analogous to the residual sum of squares in linear regression.
(5.18) We see that if R^2(omega) is large, the noise spectral density is small relative to the output spectral density, and vice versa. We can also write R^2(omega) as (5.19) If G(omega) is specified, the coherence is large if the signal-to-noise ratio S(omega)/N(omega) is large (5.20) Munk, Snodgrass and Tucker (1959) point out that if X_1(t) and X_2(t) are identical and simultaneous, Z_12(omega) = +1, W_12(omega) = 0. If they are identical but there is a time lag in one record corresponding to a phase difference Phi(omega), then the coherence is still +1. Jenkins (1965) points out that positive cross correlations will result in a cross amplitude spectrum with most of its power at low frequencies; negative cross correlations will result in a high-frequency cross amplitude spectrum. We follow Jenkins (1963, 1965) in choosing w(u) in (5.2) so that the integrated squared error (5.21) is minimized and is small compared to some epsilon, (5.22) Here the ensemble cross covariances gamma_12(k) are replaced by the sample cross covariances c_12(k). Since X_1(t) and X_2(t) are not phase shifted sine waves, but stationary time series, we must introduce a weight function lambda(k) as in the univariate spectral analysis. We then write the estimates for the co-spectra, quadrature spectra and auto-spectra as (5.23) (5.24) (5.25) where alpha_12(k) and beta_12(k) are estimates of alpha_12 and beta_12 as defined earlier in (5.12) and (5.13). It has been pointed out by Jenkins (1965) that coherences may sometimes be low because of the influence of the individual autospectra. One can also take two independent time series and produce a large spurious coherence between them, arising from peaks in the individual spectra or from too small a bandwidth. However, we can calculate the autospectra first, then pre-filter each series to make their spectra more nearly uniform or white. This leads to a large increase in accuracy in the estimation of coherence.
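A coherence estimate can be sketched as follows. Instead of the lag-window smoothing of (5.23)-(5.25), this illustration averages periodograms over segments, a substitute smoothing route chosen for brevity; some smoothing is essential, since raw unsmoothed periodograms give R^2(omega) = 1 identically for any pair of series.

```python
import numpy as np

def coherence(x, y, nseg):
    """Squared coherence R^2 at the Fourier frequencies of one segment,
    estimated by averaging cross- and auto-periodograms over nseg
    segments (a stand-in for the lag-window smoothing in the text)."""
    n = (len(x) // nseg) * nseg
    xs = np.asarray(x[:n], dtype=float).reshape(nseg, -1)
    ys = np.asarray(y[:n], dtype=float).reshape(nseg, -1)
    xs -= xs.mean(axis=1, keepdims=True)
    ys -= ys.mean(axis=1, keepdims=True)
    fx = np.fft.rfft(xs, axis=1)
    fy = np.fft.rfft(ys, axis=1)
    pxx = np.mean(np.abs(fx) ** 2, axis=0)     # smoothed auto-spectrum of x
    pyy = np.mean(np.abs(fy) ** 2, axis=0)     # smoothed auto-spectrum of y
    pxy = np.mean(fx * np.conj(fy), axis=0)    # smoothed cross-spectrum
    denom = np.maximum(pxx * pyy, np.finfo(float).tiny)
    return np.abs(pxy) ** 2 / denom

rng = np.random.default_rng(1)
x = rng.normal(size=1024)
r2_same = coherence(x, x, nseg=8)                   # identical series
r2_indep = coherence(x, rng.normal(size=1024), nseg=8)
```

Two identical series give R^2 = 1 at every frequency, while two independent noise series give values scattered near 1/nseg, illustrating why an unsmoothed (nseg = 1) estimate is useless.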
If one likes to think of an input function into the atmosphere of a star as some type of periodic wave, then following Munk and Cartwright (1966) we can separate the estimate of the energy into two parts, namely (5.26) and (5.27) The latter will be effectively noise energy, but may contain energy coherent with other input functions.

VI. FILTERS

Once the basic spectrum has been created and stabilized for a fixed m and Delta t, it becomes important to examine further the processes that tend to distort the true nature of f(omega). We shall show that it is possible to construct a digital filter U_k such that convolution upon the data in the time domain will eliminate unwanted spectral energy at some specified frequency. Given our light curve X_1(t) and a series which we shall term our filter, we create a new series W_j(t) by the process described. We can write W_j(t) as (6.1) where p' is the number of terms by which the weight factors are advanced for unit increase in j, and 1/(l' + 1) is a normalization factor. If we want weights that are symmetrical, then W_j assumes the following form (6.2) where We can also construct a frequency response function T(omega) to accept or reject any range of frequencies, so that (6.3) The BOMM programme generates these series of moving weights either for use as lag windows or as filters. We have the choice of symmetrical filters or those fading to the left or right. If we want a high-pass or low-pass filter, then for some frequency f the terms are multiplied by a factor which gives a filter whose transmission there is down 6 db (1/2 amplitude). Thus: (6.4) (6.5) If we want a low pass filter to be convolved as in equation (6.1), then we write (6.6) and for a high pass filter (6.7) where epsilon_j = 1/2 for j = 0 and epsilon_j = 1 otherwise; delta(j) = 1 for j = 0 and 0 otherwise. As we stated earlier, the filter type is determined by the sharpness of cut-off and by the side band characteristics.
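A symmetric low-pass filter of the kind described, with its half-amplitude (6 db) point at a chosen cut-off, can be sketched as a cosine-tapered truncated sinc series. The exact BOMM weights (6.6)-(6.7) are not reproduced in this text, so the construction below is an assumption, checked by the cosine-transform amplitude response.

```python
import numpy as np

def lowpass_weights(fc, l, dt=1.0):
    """Symmetric low-pass filter weights U_k, k = -l..l, with its
    half-amplitude point near the cut-off frequency fc (cycles per
    unit dt).  A cosine-tapered truncated sinc series; an assumed
    construction, not the BOMM formula (6.6)."""
    k = np.arange(-l, l + 1)
    u = 2.0 * fc * dt * np.sinc(2.0 * fc * dt * k)   # ideal low-pass weights
    u *= 0.5 * (1.0 + np.cos(np.pi * k / (l + 1)))   # cosine taper cuts side lobes
    return u / u.sum()                               # unit transmission at f = 0

def amplitude_response(u, f, dt=1.0):
    """Cosine transform of the (even) weights = amplitude response at f."""
    l = (len(u) - 1) // 2
    k = np.arange(-l, l + 1)
    return float(np.sum(u * np.cos(2.0 * np.pi * f * k * dt)))

u = lowpass_weights(fc=0.1, l=20)
# filtering a light curve x is then convolution: np.convolve(x, u, mode='valid')
```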
The amplitude response of any filter can be obtained by taking its cosine transform. It may be worthwhile explaining some of the conditions which necessitate the use of filters, such as the removal of trends and the leakage of power from high frequencies due to aliasing. Often non-stationarity of a time series may be due to the presence of a linear trend. In general, one can use a high-pass digital filter to suppress low frequency trends and then go on to perform a spectral analysis of the residual series. Another use of filters in time series analysis is where one suspects that a series consists of a mixture of stationary series, i.e. differing frequency bands may have different physical origins. In such a case, we can construct a series of digital band pass filters to separate the original spectrum into several distinct spectra. A more complete discussion of the use of filters of all types will come in the next section with a discussion of the errors involved in calculating the spectral density.

VII. ERRORS IN SPECTRAL DENSITY ESTIMATES

Munk, Snodgrass and Tucker (1959) have given several quantitative estimates of errors in spectral density. We shall deal with those errors that are due to random errors, the properties of the spectral window lambda(k), and leakage of power due to aliasing, and give some treatment of systematic errors. Within the range of this examination is the problem of detecting signals in the presence of noise, i.e. estimation of the range of error allowable for any peak that appears in a mixed spectrum. Parzen (1961) gives some design relations in terms of the signal-to-noise estimate, i.e. (7.1) We see that SNR[f(omega)] is the mean divided by the standard deviation, i.e. the reciprocal of the coefficient of variation, of the random variable f(omega). If f(omega) is normally distributed and we seek a 95% confidence level for our spectral estimates, then we can estimate SNR[f(omega)] in terms of equivalent degrees of freedom.
If X_1(t) is a random variable (i.e. Gaussian noise), so that the spectral estimate has a chi-squared distribution with nu degrees of freedom, then (7.2) and (7.3) If X_1(t) is a positive random variable, then nu is just the equivalent number of degrees of freedom. Several authors have put forward views on design relations to maximize SNR[f(omega)] under various circumstances (Parzen, 1961; Priestley, 1962, 1965), but we shall have to outline this in more detail later because of the peculiar nature of the light curves of T Tauri stars and the fixed amount and nature of the data available. The assumption of a chi-squared distribution is approximately correct where f(omega) can be expressed as a weighted sum of chi^2 variables. A similar method of treatment is to sum the squares of the amplitudes of the components of a Fourier series (7.4) where n = 0, 1, 2, ..., and the energy density varies slowly with the frequency separation 1/T of the harmonics. Then p_n and q_n are normal and p_n^2 = q_n^2 = true energy density/T, i.e. (7.5) Thus for p harmonics the number of degrees of freedom is nu = 2p, and from the chi-squared distribution "confidence limits" can be calculated. This is because the chi-squared distribution gives (7.6) The effective number of degrees of freedom of a record is approximately nu. These ideas can be used with tables given in Blackman and Tukey (1958) to calculate the random errors based on the assumptions above. Thus (7.7) It seems likely that one must consider both the bias and the variance when carrying out practical estimates of the spectral density f(omega), but it is difficult to formulate any practical numerical criteria due to the presence in f(omega) of non-random statistical errors, i.e. side band leakage, for which f(omega) differs with the type of spectral window chosen, and aliasing. Leakage of power from peaks at frequencies > omega_N confuses the estimate at neighbouring frequencies omega_N +- Delta omega.
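The relations (7.2)-(7.3) can be illustrated numerically. For a chi-squared variable with nu degrees of freedom the mean is nu and the variance 2 nu, so SNR = nu/sqrt(2 nu) = sqrt(nu/2); combining this with the variance ratio var[f]/f^2 ~ 3m/4n quoted in Section IV gives nu ≈ 8n/3m for that window. The particular n and m below are hypothetical.

```python
import math

def equiv_dof(n, m):
    """Equivalent degrees of freedom for the lag-window estimate:
    a chi-squared_nu variable has var/mean^2 = 2/nu, and the text's
    window gives var[f]/f^2 ~ 3m/(4n), hence nu = 8n/(3m)."""
    return 8.0 * n / (3.0 * m)

def snr(nu):
    """SNR[f(omega)] = mean / standard deviation of a chi-squared_nu
    variable = nu / sqrt(2 nu) = sqrt(nu / 2)."""
    return math.sqrt(nu / 2.0)

# e.g. a 1000-point light curve analysed with m = 50 lags:
nu = equiv_dof(n=1000, m=50)     # about 53 degrees of freedom
ratio = snr(nu)                  # SNR of roughly 5
```

Doubling m halves nu and so cuts the SNR by a factor sqrt(2): the resolution-versus-variance compromise again, in quantitative form.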
Parzen (1961) and Priestley (1962) both use a formula (7.8) to give an estimate at a particular frequency that does not have more than a 10% proportional error at the 95% confidence level. Here the variance is defined by (7.9) and the bias is given as (7.10) where lambda(theta-omega) is the weight function. Implementation of these formulae for eta(omega) and b(omega) rests critically on one's choice of the width of the weight function lambda(k). In fact, since both n and Delta t are fixed for the light curves, it is possible to calculate the optimum number of lags m for a given spectrum if we accept the mean square error at frequency omega as (7.11) But other sources of error creep in, as we have mentioned earlier, and the chi-squared error estimates account for only one source of error, those of a random nature. The difficulty is that both eta(omega) and b(omega) are asymptotically proportional to f(omega). The method we have used is to vary the resolution until increasing m any further does not reveal any more features in the spectra, while at the same time attempting to prewhiten the spectra as much as possible, as suggested by Blackman and Tukey (1958). This is done during Stage II of the estimation process, which is mostly exploratory. Pre-whitening is useful in eliminating the systematic errors that occur due to the curvature of f(omega). If f(omega) is flat, or varies linearly with frequency, then E[f(omega)] = f(omega). If not, then spectral energy diffuses down toward lower spectral levels (Munk et al., 1959) and one must worry about leakage of power from higher to lower frequencies. We shall now discuss the question of leakage in the estimates of spectral density. The window lambda(k) we choose is usually some kind of polynomial which vanishes beyond a finite number of lags.
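The procedure of varying the resolution m can be sketched as follows. This is an illustrative modern reconstruction of a Blackman-Tukey type estimate, not the author's actual program (which was BOMM; Bullard et al., 1966): the autocovariance is computed out to lag m, weighted by a hanning-type lag window lambda(k), and cosine-transformed. The test series, a 10-day sinusoid sampled daily, is an assumed example:

```python
import math

def autocov(x, k):
    """Sample autocovariance c_11(k), corrected for the mean."""
    n = len(x)
    mu = sum(x) / n
    return sum((x[i] - mu) * (x[i + k] - mu) for i in range(n - k)) / n

def bt_spectrum(x, m, dt=1.0):
    """Blackman-Tukey estimate: cosine transform of the first m lags of
    the autocovariance through a hanning lag window lambda(k)."""
    c = [autocov(x, k) for k in range(m + 1)]
    lam = [0.5 * (1.0 + math.cos(math.pi * k / m)) for k in range(m + 1)]
    spec = []
    for j in range(m + 1):
        f = j / (2.0 * m * dt)                 # grid from 0 to Nyquist
        s = c[0] + 2.0 * sum(
            lam[k] * c[k] * math.cos(2.0 * math.pi * f * k * dt)
            for k in range(1, m + 1))
        spec.append(2.0 * dt * s)
    return spec

# sinusoid of period 10 days: the peak sharpens as the lag number m grows,
# while its frequency stays near 0.1 cycles per day
x = [math.sin(2.0 * math.pi * t / 10.0) for t in range(200)]
for m in (10, 20, 40):
    s = bt_spectrum(x, m)
    peak = max(range(len(s)), key=lambda j: s[j])
    print(m, peak / (2.0 * m))                 # peak frequency, c/d
```

Increasing m narrows the bandwidth (more resolution) but raises the variance of each estimate, which is the mean-square-error trade-off of (7.11); one stops when a larger m reveals no new features.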
Its main lobe, centred on the band of frequencies over which we wish to estimate f(omega), is inevitably accompanied by side bands or minor lobes, which allow leakage from the part of the spectrum outside the desired band to affect our estimate inside the band. This can be cured by taking two estimates, one with the side lobes positive, the other negative. One of the more critical problems recognised by Tukey is aliasing, the problem of sampling at discrete intervals Delta t. Sampling is equivalent to multiplying the record by a train of spikes of frequency 1/Delta t, so that sum and difference frequencies are produced. An analysis for energy density at any frequency f_0 will also include the densities near the frequencies 1/Delta t - f_0, 1/Delta t + f_0, 2/Delta t - f_0, 2/Delta t + f_0, etc. One of the major uses of filters comes in dealing with the leakage of power associated with aliasing, caused by the loss of information at frequencies above the Nyquist frequency, namely omega_N = pi/Delta t, or f_N = 1/(2 Delta t) cycles per day. (Figs. 1 and 2.) Since there is no information available about X_1(t) between the data points, there is no means of estimating the amplitude of frequencies higher than the Nyquist frequency. In fact, at the Nyquist frequency itself we are unable to estimate, since f(omega_N) is confused with all the frequencies that are indistinguishable from omega_N. If f*(omega) is the spectral density corresponding to X(t), then the spectral density of the sampled trace is (7.12) We see the sampled spectrum is obtained by folding the unsampled spectrum about the even multiples (2 pi k)/(Delta t) of the Nyquist frequency and adding these contributions into the range (0, omega_N). Thus, to be able to measure f*(omega) in (0, omega_N), we must hope that f(omega) ~ 0 for omega > omega_N. In our case there is certainly a need to use a low-pass filter to cut any peaks that may occur near f_N.
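The folding of frequencies about the Nyquist frequency can be demonstrated directly. In the sketch below (a modern illustration, not from the paper, assuming Delta t = 1 day so f_N = 0.5 cycles per day), a 0.7 c/d signal sampled daily yields exactly the same data points, up to sign, as a 0.3 c/d signal, so the two are indistinguishable in the sampled spectrum:

```python
import math

def alias(f, dt=1.0):
    """Frequency in (0, f_N) onto which f is folded by sampling at
    interval dt, where f_N = 1/(2 dt)."""
    fs = 1.0 / dt                  # sampling frequency
    f = f % fs                     # fold about multiples of 1/dt
    return fs - f if f > fs / 2.0 else f

dt = 1.0
x_hi = [math.sin(2.0 * math.pi * 0.7 * t * dt) for t in range(50)]
x_lo = [math.sin(2.0 * math.pi * 0.3 * t * dt) for t in range(50)]

print(alias(0.7, dt))              # folds to ~0.3 c/d
# the daily samples of the two sinusoids agree up to sign: x_hi = -x_lo
print(max(abs(a + b) for a, b in zip(x_hi, x_lo)))
```

This is why a pilot spectrum with a smaller Delta t (or a low-pass filter applied before decimation) is needed to be sure that peaks near f_N are not aliases of power above the Nyquist frequency.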
The details of this process and the practical results will be shown when we present the power spectra for the particular T Tauri stars. The use of different Delta t's in calculating pilot spectra will tell us if we are in danger of confusing f(omega) through higher energies at higher frequencies.

REFERENCES

Blackman, R. B. and Tukey, J. W., 1958, The Measurement of Power Spectra. Dover Publications.
Bullard, E. C. et al., 1966, A User's Guide to BOMM. A Series of Programs for the Analysis of Time Series. Institute of Geophysics and Planetary Physics, La Jolla.
Herbig, G. H., 1962, Adv. Astr. Astrophys. 1, 47.
Jenkins, G. M., 1961, Technometrics, 3, 167.
Jenkins, G. M., 1963, Technometrics, 5, 227.
Jenkins, G. M., 1965, Applied Statistics, 14, 205.
Munk, W. H., Snodgrass, F. E. and Tucker, M. J., 1959, Bull. Scripps Inst. Oceanography, 7, 283.
Munk, W. H. and Cartwright, D. E., 1966, Phil. Trans. R. Soc. London, A 259, 533.
Parzen, E., 1961, Technometrics, 3, 167.
Priestley, M. B., 1962, Technometrics, 4, 551.
Priestley, M. B., 1965, Applied Statistics, 14, 33.
Tukey, J. W., 1961, Technometrics, 3, 191.
Wiener, N., 1967, Time Series, Dover Publications.

DISCUSSION

Detre: How is it possible that analyses of the same observational material, by the same method, executed by two different teams - as I mentioned in my introductory report - lead to opposite results for mu Cephei?
Plagemann: I have not read these contributions, but I shall study them soon.
Herbig: Does your technique recover the periods of a few days found by Hoffmeister when you analyze the observations of southern RW Aurigae-type variables made by him?
Plagemann: I have not yet analyzed his data, although the material is already on computer cards. It will be done when I return to Cambridge.
Penston: Do I understand that you find Kinman's 80 day period for 3C 345 is not statistically significant?
Plagemann: That is correct - my preliminary results do not show his 80d period.