The Internet Archive discovers and captures web pages through many different web crawls.
At any given time several distinct crawls are running, some for months, and some every day or longer.
View the web archive through the Wayback Machine.
- Top ranked pages (up to a max of 100) from every linked-to domain using the Wide00012 inter-domain navigational link graph
-- a ranking of all URLs that have more than one incoming inter-domain link (rank was determined by number of incoming links using Wide00012 inter domain links)
-- up to a maximum of 100 most highly ranked URLs per domain
The seed list contains a total of 431,055,452 URLs The seed list was further filtered to exclude known porn, and link farm, domains The modified seed list contains a total of 428M URLs
TIMESTAMPS
The Wayback Machine - https://web.archive.org/web/20160808082407/https://en.wikipedia.org/wiki/Recursive_least_squares
The Recursive least squares (RLS) is an adaptive filter which recursively finds the coefficients that minimize a weighted linear least squarescost function relating to the input signals. This is in contrast to other algorithms such as the least mean squares (LMS) that aim to reduce the mean square error. In the derivation of the RLS, the input signals are considered deterministic, while for the LMS and similar algorithm they are considered stochastic. Compared to most of its competitors, the RLS exhibits extremely fast convergence. However, this benefit comes at the cost of high computational complexity.
RLS was discovered by Gauss but laid unused or ignored until 1950 when Plackett rediscovered the original work of Gauss from 1821. In general, the RLS can be used to solve any problem that can be solved by adaptive filters. For example, suppose that a signal d(n) is transmitted over an echoey, noisy channel that causes it to be received as
where represents additive noise. We will attempt to recover the desired signal by use of a -tap FIR filter, :
where is the vector containing the most recent samples of . Our goal is to estimate the parameters of the filter , and at each time n we refer to the new least squares estimate by . As time evolves, we would like to avoid completely redoing the least squares algorithm to find the new estimate for , in terms of .
The benefit of the RLS algorithm is that there is no need to invert matrices, thereby saving computational power. Another advantage is that it provides intuition behind such results as the Kalman filter.
The idea behind RLS filters is to minimize a cost function by appropriately selecting the filter coefficients , updating the filter as new data arrives. The error signal and desired signal are defined in the negative feedback diagram below:
The error implicitly depends on the filter coefficients through the estimate :
The weighted least squares error function —the cost function we desire to minimize—being a function of e(n) is therefore also dependent on the filter coefficients:
where is the "forgetting factor" which gives exponentially less weight to older error samples.
The cost function is minimized by taking the partial derivatives for all entries of the coefficient vector and setting the results to zero
Next, replace with the definition of the error signal
Rearranging the equation yields
This form can be expressed in terms of matrices
where is the weighted sample covariance matrix for , and is the equivalent estimate for the cross-covariance between and . Based on this expression we find the coefficients which minimize the cost function as
The smaller is, the smaller is the contribution of previous samples to the covariance matrix. This makes the filter more sensitive to recent samples, which means more fluctuations in the filter co-efficients. The case is referred to as the growing window RLS algorithm. In practice, is usually chosen between 0.98 and 1.[1] By using type-II maximum likelihood estimation the optimal can be estimated from a set of data.[2]
The discussion resulted in a single equation to determine a coefficient vector which minimizes the cost function. In this section we want to derive a recursive solution of the form
where is a correction factor at time . We start the derivation of the recursive algorithm by expressing the cross covariance in terms of
where is the dimensional data vector
Similarly we express in terms of by
In order to generate the coefficient vector we are interested in the inverse of the deterministic auto-covariance matrix. For that task the Woodbury matrix identity comes in handy. With
To come in line with the standard literature, we define
where the gain vector is
Before we move on, it is necessary to bring into another form
Subtracting the second term on the left side yields
With the recursive definition of the desired form follows
Now we are ready to complete the recursion. As discussed
The second step follows from the recursive definition of . Next we incorporate the recursive definition of together with the alternate form of and get
With we arrive at the update equation
where is the a priori error. Compare this with the a posteriori error; the error calculated after the filter is updated:
That means we found the correction factor
This intuitively satisfying result indicates that the correction factor is directly proportional to both the error and the gain vector, which controls how much sensitivity is desired, through the weighting factor, .
Lattice recursive least squares filter (LRLS)[edit]
The Lattice Recursive Least Squaresadaptive filter is related to the standard RLS except that it requires fewer arithmetic operations (order N). It offers additional advantages over conventional LMS algorithms such as faster convergence rates, modular structure, and insensitivity to variations in eigenvalue spread of the input correlation matrix. The LRLS algorithm described is based on a posteriori errors and includes the normalized form. The derivation is similar to the standard RLS algorithm and is based on the definition of . In the forward prediction case, we have with the input signal as the most up to date sample. The backward prediction case is , where i is the index of the sample in the past we want to predict, and the input signal is the most recent sample.[4]
The algorithm for a LRLS filter can be summarized as
Initialization:
For i = 0,1,...,N
(if x(k) = 0 for k < 0)
End
Computation:
For k ≥ 0
For i = 0,1,...,N
Feedforward Filtering
End
End
Normalized lattice recursive least squares filter (NLRLS)[edit]
The normalized form of the LRLS has fewer recursions and variables. It can be calculated by applying a normalization to the internal variables of the algorithm which will keep their magnitude bounded by one. This is generally not used in real-time applications because of the number of division and square-root operations which comes with a high computational load.
^Emannual C. Ifeacor, Barrie W. Jervis. Digital signal processing: a practical approach, second edition. Indianapolis: Pearson Education Limited, 2002, p. 718
^Welch, Greg and Bishop, Gary "An Introduction to the Kalman Filter", Department of Computer Science, University of North Carolina at Chapel Hill, September 17, 1997, accessed July 19, 2011.