for k=1:n, a(k)=0; b(k)=0; end; b(1)=2;

for k=2:n, b(k)=1/(4-1/(k-1)^2); end

Program 83 - coeflagu : Coefficients of Laguerre polynomials

function [a, b] = coeflagu(n)

if (n <= 1), disp(' n must be > 1 '); return; end

a=zeros(n,1); b=zeros(n,1); a(1)=1; b(1)=1;

for k=2:n, a(k)=2*(k-1)+1; b(k)=(k-1)^2; end
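As a quick cross-check of these coefficients (a hypothetical illustration, not part of the book's programs): the monic Laguerre polynomials satisfy the three-term recurrence p_{j+1}(x) = (x − α_j) p_j(x) − β_j p_{j−1}(x) with α_j = 2j + 1 and β_j = j², which is exactly what coeflagu stores in a(k), b(k) (MATLAB index k corresponds to j = k − 1). A pure-Python sketch:

```python
# Pure-Python transcription of coeflagu and a recurrence check
# (a hedged sketch; names and the test polynomial are illustrative).

def coeflagu(n):
    """Recurrence coefficients for monic Laguerre polynomials."""
    a = [2 * (k - 1) + 1 for k in range(1, n + 1)]              # a(k) = 2(k-1)+1
    b = [1.0] + [float((k - 1) ** 2) for k in range(2, n + 1)]  # b(1) = 1
    return a, b

def monic_laguerre(n, x):
    """Evaluate the monic Laguerre polynomial of degree n at x."""
    if n == 0:
        return 1.0
    a, b = coeflagu(n + 1)
    p_prev, p = 1.0, x - a[0]            # degrees 0 and 1
    for k in range(1, n):
        p_prev, p = p, (x - a[k]) * p - b[k] * p_prev
    return p

# Degree 3: the recurrence must reproduce x^3 - 9x^2 + 18x - 6.
x = 0.7
explicit = x**3 - 9 * x**2 + 18 * x - 6
assert abs(monic_laguerre(3, x) - explicit) < 1e-12
```

The same pattern (evaluate the recurrence, compare against a known low-degree polynomial) works for the Legendre and Hermite coefficient routines as well.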

Program 84 - coefherm : Coefficients of Hermite polynomials

function [a, b] = coefherm(n)

if (n <= 1), disp(' n must be > 1 '); return; end

a=zeros(n,1); b=zeros(n,1); b(1)=sqrt(4.*atan(1.));

for k=2:n, b(k)=0.5*(k-1); end

Program 85 - zplege : Coefficients of Gauss-Legendre formulae

function [x,w]=zplege(n)

if (n <= 1), disp(' n must be > 1 '); return; end

[a,b]=coeflege(n);

JacM=diag(a)+diag(sqrt(b(2:n)),1)+diag(sqrt(b(2:n)),-1);

[w,x]=eig(JacM); x=diag(x); scal=2; w=w(1,:)'.^2*scal;

[x,ind]=sort(x); w=w(ind);

Program 86 - zplagu : Coefficients of Gauss-Laguerre formulae

function [x,w]=zplagu(n)

if (n <= 1), disp(' n must be > 1 '); return; end

[a,b]=coeflagu(n);

JacM=diag(a)+diag(sqrt(b(2:n)),1)+diag(sqrt(b(2:n)),-1);

[w,x]=eig(JacM); x=diag(x); w=w(1,:)'.^2;

Program 87 - zpherm : Coefficients of Gauss-Hermite formulae

function [x,w]=zpherm(n)

if (n <= 1), disp(' n must be > 1 '); return; end


[a,b]=coefherm(n);

JacM=diag(a)+diag(sqrt(b(2:n)),1)+diag(sqrt(b(2:n)),-1);

[w,x]=eig(JacM); x=diag(x); scal=sqrt(pi); w=w(1,:)'.^2*scal;

[x,ind]=sort(x); w=w(ind);
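These three programs all implement the same Golub-Welsch idea: the quadrature nodes are the eigenvalues of the symmetric tridiagonal Jacobi matrix built from the recurrence coefficients, and each weight is the squared first component of the corresponding normalized eigenvector, scaled by the integral of the weight function. A minimal pure-Python sketch (a hypothetical illustration, worked out by hand for n = 2 in the Legendre case; for general n one would call an eigensolver, as eig does above):

```python
# Golub-Welsch for the 2-point Gauss-Legendre rule, done by hand.
import math

# coeflege gives a = [0, 0] and b(2) = 1/(4 - 1/1^2) = 1/3, so the
# 2x2 Jacobi matrix is [[0, c], [c, 0]] with c = sqrt(1/3).
c = math.sqrt(1.0 / 3.0)

# Eigenvalues of [[0, c], [c, 0]] are +/- c: these are the nodes.
nodes = [-c, c]

# Normalized eigenvectors are (1, -/+1)/sqrt(2); each weight is
# scal * (first component)^2 with scal = 2 = integral of w(x) = 1 on [-1, 1].
scal = 2.0
weights = [scal * (1.0 / math.sqrt(2.0)) ** 2] * 2   # both equal 1

# The 2-point rule integrates polynomials of degree <= 3 exactly:
# check it on f(x) = x^3 + x^2 over [-1, 1], whose exact integral is 2/3.
approx = sum(w * (x**3 + x**2) for x, w in zip(nodes, weights))
assert abs(approx - 2.0 / 3.0) < 1e-14
```

The Laguerre and Hermite versions differ only in the Jacobi matrix entries and in the scaling factor (1 and sqrt(pi), respectively, as in zplagu and zpherm).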

10.7 Approximation of a Function in the Least-Squares Sense

Given a function f ∈ L2(a, b), we look for a polynomial rn of degree ≤ n
that satisfies

‖f − rn‖w = min_{pn ∈ Pn} ‖f − pn‖w,

where w is a fixed weight function in (a, b). Should it exist, rn is called a
least-squares polynomial. The name derives from the fact that, if w ≡ 1, rn
is the polynomial that minimizes the mean-square error E = ‖f − rn‖_{L2(a,b)}
(see Exercise 8).

As seen in Section 10.1, rn coincides with the truncation fn of order
n of the Fourier series (see (10.2) and (10.3)). Depending on the choice
of the weight w(x), different least-squares polynomials arise with different
convergence properties.

Analogous to Section 10.1, we can introduce the discrete truncation f*n
(10.4) of the Chebyshev series (setting pk = Tk) or the Legendre series
(setting pk = Lk). If the discrete scalar product induced by the Gauss-
Lobatto quadrature rule (10.28) is used in (10.5), then the f˜k's coincide with
the coefficients of the expansion of the interpolating polynomial Π^GL_{n,w} f (see
(10.29) in the Chebyshev case, or (10.38) in the Legendre case).
Consequently, f*n = Π^GL_{n,w} f, i.e., the discrete truncation of the (Cheby-
shev or Legendre) series of f turns out to coincide with the interpolating
polynomial at the n + 1 Gauss-Lobatto nodes. In particular, in such a case
(10.6) is trivially satisfied, since ‖f − f*n‖n = 0.

10.7.1 Discrete Least-Squares Approximation

Several applications require representing in a synthetic way, using elemen-

tary functions, a large set of data that are available at a discrete level, for

instance, the results of experimental measurements. This approximation

process, often referred to as data fitting, can be satisfactorily solved using

the discrete least-squares technique that can be formulated as follows.

Assume we are given m + 1 pairs of data

{(xi , yi ), i = 0, . . . , m} (10.43)


where yi may represent, for instance, the value of a physical quantity mea-
sured at the position xi. We assume that all the abscissae are distinct.

We look for a polynomial pn(x) = Σ_{i=0}^{n} ai φi(x) such that

Σ_{j=0}^{m} wj |pn(xj) − yj|^2 ≤ Σ_{j=0}^{m} wj |qn(xj) − yj|^2   ∀qn ∈ Pn,   (10.44)

for suitable coefficients wj > 0. If n = m the polynomial pn clearly co-
incides with the interpolating polynomial of degree n at the nodes {xi}.
Problem (10.44) is called a discrete least-squares problem since a discrete
scalar product is involved, and is the discrete counterpart of the contin-
uous least-squares problem. The solution pn is therefore referred to as a
least-squares polynomial. Notice that

|||q||| = { Σ_{j=0}^{m} wj [q(xj)]^2 }^{1/2}   (10.45)

is an essentially strict seminorm on Pn (see Exercise 7). By definition,
a discrete norm (or seminorm) ‖·‖* is essentially strict if ‖f + g‖* =
‖f‖* + ‖g‖* implies there exist nonnull α, β such that αf(xi) + βg(xi) = 0
for i = 0, . . . , m. Since ||| · ||| is an essentially strict seminorm, problem
(10.44) admits a unique solution (see [IK66], Section 3.5). Proceeding as
in Section 3.13, we find

Σ_{k=0}^{n} ak Σ_{j=0}^{m} wj φk(xj) φi(xj) = Σ_{j=0}^{m} wj yj φi(xj),   ∀i = 0, . . . , n,

which is called a system of normal equations, and can be conveniently
written in the form

B^T B a = B^T y,   (10.46)

where B is the (m+1) × (n+1) rectangular matrix of entries bij = φj(xi), i =
0, . . . , m, j = 0, . . . , n, a ∈ R^{n+1} is the vector of the unknown coefficients
and y ∈ R^{m+1} is the vector of data.
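The assembly and solution of the normal equations can be sketched in a few lines. The following pure-Python fragment is a hedged illustration (the function name, the monomial basis φ_j(x) = x^j, unit weights, and the inlined Gaussian elimination are all choices made here for self-containedness; in practice one would use a linear-algebra library, and for ill-conditioned B a QR factorization is preferable to forming B^T B):

```python
# Assemble and solve the normal equations B^T B a = B^T y (10.46)
# with phi_j(x) = x^j and w_j = 1 (an illustrative sketch).

def normal_equations_fit(xs, ys, n):
    m1 = len(xs)                                     # m + 1 data points
    B = [[x**j for j in range(n + 1)] for x in xs]   # b_ij = phi_j(x_i)
    # Assemble the (n+1) x (n+1) matrix B^T B and the vector B^T y.
    A = [[sum(B[i][r] * B[i][c] for i in range(m1)) for c in range(n + 1)]
         for r in range(n + 1)]
    rhs = [sum(B[i][r] * ys[i] for i in range(m1)) for r in range(n + 1)]
    # Gaussian elimination with partial pivoting, then back-substitution.
    for col in range(n + 1):
        piv = max(range(col, n + 1), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, n + 1):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    a = [0.0] * (n + 1)
    for r in range(n, -1, -1):
        a[r] = (rhs[r] - sum(A[r][c] * a[c] for c in range(r + 1, n + 1))) / A[r][r]
    return a

# Data lying exactly on y = 1 + 2x + 3x^2 must be reproduced
# (m = 4 > n = 2, so this is a genuine least-squares fit).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1 + 2 * x + 3 * x**2 for x in xs]
a = normal_equations_fit(xs, ys, 2)
assert all(abs(ai - bi) < 1e-9 for ai, bi in zip(a, [1.0, 2.0, 3.0]))
```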

Notice that the system of normal equations obtained in (10.46) is of
the same nature as that introduced in Section 3.13 in the case of over-
determined systems. Actually, if wj = 1 for j = 0, . . . , m, the above system
can be regarded as the solution in the least-squares sense of the system

Σ_{k=0}^{n} ak φk(xi) = yi,   i = 0, 1, . . . , m,


which would not admit a solution in the classical sense, since the number of
rows is greater than the number of columns. In the case n = 1, the solution
to (10.44) is a linear function, called the linear regression for the data fitting
of (10.43). The associated system of normal equations is

Σ_{k=0}^{1} Σ_{j=0}^{m} wj φi(xj) φk(xj) ak = Σ_{j=0}^{m} wj φi(xj) yj,   i = 0, 1.

Setting (f, g)m = Σ_{j=0}^{m} f(xj) g(xj), the previous system becomes

(φ0, φ0)m a0 + (φ1, φ0)m a1 = (y, φ0)m,
(φ0, φ1)m a0 + (φ1, φ1)m a1 = (y, φ1)m,

where y(x) is a function that takes the value yi at the nodes xi, i =
0, . . . , m. After some algebra, we get this explicit form for the coefficients

a0 = [(y, φ0)m (φ1, φ1)m − (y, φ1)m (φ1, φ0)m] / [(φ1, φ1)m (φ0, φ0)m − (φ0, φ1)m^2],

a1 = [(y, φ1)m (φ0, φ0)m − (y, φ0)m (φ1, φ0)m] / [(φ1, φ1)m (φ0, φ0)m − (φ0, φ1)m^2].
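The explicit formulas above transcribe directly into code. A hedged pure-Python sketch (with φ0(x) = 1, φ1(x) = x, and the unweighted discrete scalar product (f, g)m used in the text; the function name is illustrative):

```python
# Linear regression via the explicit least-squares formulas for a0, a1.

def linear_regression(xs, ys):
    def dot(f, g):                       # (f, g)_m = sum_j f(x_j) g(x_j)
        return sum(f(x) * g(x) for x in xs)
    phi0 = lambda x: 1.0
    phi1 = lambda x: x
    yv = dict(zip(xs, ys))               # abscissae assumed distinct
    y = lambda x: yv[x]                  # y(x) takes the value y_i at x_i
    d = dot(phi1, phi1) * dot(phi0, phi0) - dot(phi0, phi1) ** 2
    a0 = (dot(y, phi0) * dot(phi1, phi1) - dot(y, phi1) * dot(phi1, phi0)) / d
    a1 = (dot(y, phi1) * dot(phi0, phi0) - dot(y, phi0) * dot(phi1, phi0)) / d
    return a0, a1

# Points on the line y = 3 - 2x are fitted exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 - 2.0 * x for x in xs]
a0, a1 = linear_regression(xs, ys)
assert abs(a0 - 3.0) < 1e-12 and abs(a1 + 2.0) < 1e-12
```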

Example 10.3 As already seen in Example 8.2, small changes in the data can
give rise to large variations on the interpolating polynomial of a given function
f. This does not happen for the least-squares polynomial where m is much larger
than n. As an example, consider the function f(x) = sin(2πx) in [−1, 1] and
evaluate it at the 22 equally spaced nodes xi = −1 + 2i/21, i = 0, . . . , 21, setting
fi = f(xi). Then, suppose we add to the data fi a random perturbation of the
order of 10^−3 and denote by p5 and p̃5 the least-squares polynomials of degree
5 approximating the data fi and f̃i, respectively. The maximum norm of the
difference p5 − p̃5 over [−1, 1] is of the order of 10^−3, i.e., it is of the same order
as the perturbation on the data. For comparison, the same difference in the case
of Lagrange interpolation is about equal to 2, as can be seen in Figure 10.2. •

10.8 The Polynomial of Best Approximation

Consider a function f ∈ C0([a, b]). A polynomial p*n ∈ Pn is said to be the
polynomial of best approximation of f if it satisfies

‖f − p*n‖∞ = min_{pn ∈ Pn} ‖f − pn‖∞,   (10.47)

where ‖g‖∞ = max_{a≤x≤b} |g(x)|. This problem is referred to as a minimax
approximation, as we are looking for the minimum error measured in the
maximum norm.


FIGURE 10.2. The perturbed data (circles), the associated least-squares polynomial of degree 5 (solid line) and the Lagrange interpolating polynomial (dashed line)

Property 10.1 (Chebyshev equioscillation theorem) For any n ≥ 0,
the polynomial of best approximation p*n of f exists and is unique. More-
over, in [a, b] there exist n + 2 points x0 < x1 < . . . < xn+1 such that

f(xj) − p*n(xj) = σ(−1)^j E*n(f),   j = 0, . . . , n + 1,

with σ = 1 or σ = −1 depending on f and n, and E*n(f) = ‖f − p*n‖∞.

(For the proof, see [Dav63], Chapter 7.) As a consequence, there exist n + 1
points x̃0 < x̃1 < . . . < x̃n, with xk < x̃k < xk+1 for k = 0, . . . , n, to be
determined in [a, b] such that

p*n(x̃j) = f(x̃j),   j = 0, 1, . . . , n,

so that the best approximation polynomial is a polynomial of degree n that
interpolates f at n + 1 unknown nodes.

The following result yields an estimate of E*n(f) without explicitly com-
puting p*n (we refer for the proof to [Atk89], Chapter 4).

Property 10.2 (de la Vallée-Poussin theorem) Let n ≥ 0 and let x0 <
x1 < . . . < xn+1 be n + 2 points in [a, b]. If there exists a polynomial qn of
degree ≤ n such that

f(xj) − qn(xj) = (−1)^j ej,   j = 0, 1, . . . , n + 1,

where all the ej have the same sign and are nonnull, then

min_{0≤j≤n+1} |ej| ≤ E*n(f).
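A small numerical illustration of this lower bound (an example constructed here, not from the book): for f(x) = x^2 on [−1, 1] with n = 1, the best approximation out of P1 is the constant p*(x) = 1/2, so E*1(f) = 1/2, and the error x^2 − 1/2 equioscillates at the n + 2 = 3 points −1, 0, 1. Any trial polynomial whose errors alternate in sign at those points yields a lower bound:

```python
# de la Vallee-Poussin lower bound for f(x) = x^2, n = 1 on [-1, 1].

f = lambda x: x * x
q = lambda x: 0.4            # an arbitrary trial polynomial of degree <= 1
pts = [-1.0, 0.0, 1.0]       # n + 2 points
errs = [f(x) - q(x) for x in pts]        # [0.6, -0.4, 0.6]: alternating signs

# Consecutive errors alternate in sign, so min |e_j| bounds E*_1(f) below.
assert all(errs[j] * errs[j + 1] < 0 for j in range(len(errs) - 1))
E_star = 0.5                 # known exactly for this f (best approx is 1/2)
assert min(abs(e) for e in errs) <= E_star
```

Here the bound gives 0.4 ≤ E*1(f) = 0.5, consistent with the theorem.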

We can now relate E*n(f) with the interpolation error. Indeed,

‖f − Πn f‖∞ ≤ ‖f − p*n‖∞ + ‖p*n − Πn f‖∞.


On the other hand, using the Lagrange representation of p*n we get

‖p*n − Πn f‖∞ = ‖Σ_{i=0}^{n} (p*n(xi) − f(xi)) li‖∞ ≤ ‖p*n − f‖∞ ‖Σ_{i=0}^{n} |li|‖∞,

from which it follows that

‖f − Πn f‖∞ ≤ (1 + Λn) E*n(f),

where Λn is the Lebesgue constant (8.11) associated with the nodes {xi}.

Thanks to (10.25), we can conclude that the Lagrange interpolating poly-
nomial on the Chebyshev nodes is a good approximation of p*n. The above
results yield a characterization of the best approximation polynomial, but
do not provide a constructive way for generating it. However, starting from
the Chebyshev equioscillation theorem, it is possible to devise an algorithm,
called the Remes algorithm, that is able to construct an arbitrarily good
approximation of the polynomial p*n (see [Atk89], Section 4.7).

10.9 Fourier Trigonometric Polynomials

Let us apply the theory developed in the previous sections to a particular
family of orthogonal polynomials which are no longer algebraic polynomials
but rather trigonometric. The Fourier polynomials on (0, 2π) are defined
as

φk(x) = e^{ikx},   k = 0, ±1, ±2, . . . ,

where i is the imaginary unit. These are complex-valued periodic functions
with period equal to 2π. We shall use the notation L2(0, 2π) to denote the
complex-valued functions that are square integrable over (0, 2π). Therefore

L2(0, 2π) = { f : (0, 2π) → C such that ∫_0^{2π} |f(x)|^2 dx < ∞ },

with scalar product and norm defined respectively by

(f, g) = ∫_0^{2π} f(x) ḡ(x) dx,   ‖f‖_{L2(0,2π)} = √(f, f).

If f ∈ L2(0, 2π), its Fourier series is

Ff = Σ_{k=−∞}^{∞} f̂k φk,  with  f̂k = (1/2π) ∫_0^{2π} f(x) e^{−ikx} dx = (1/2π)(f, φk).   (10.48)
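The coefficient integrals in (10.48) can be approximated numerically; a hedged pure-Python sketch (the composite trapezoidal rule, which on a periodic interval reduces to an equal-weight sum and is spectrally accurate for smooth periodic f; the function name is illustrative). For f(x) = cos(x) = (e^{ix} + e^{−ix})/2 one must recover f̂1 = f̂−1 = 1/2 and f̂k = 0 otherwise:

```python
# Trapezoidal approximation of the Fourier coefficients (10.48).
import cmath
import math

def fourier_coeff(f, k, M=64):
    """Approximate (1/2pi) * int_0^{2pi} f(x) e^{-ikx} dx with M nodes."""
    h = 2 * math.pi / M
    # On a periodic interval the trapezoidal rule is an equal-weight sum.
    return sum(f(j * h) * cmath.exp(-1j * k * j * h) for j in range(M)) / M

f = math.cos
assert abs(fourier_coeff(f, 1) - 0.5) < 1e-12
assert abs(fourier_coeff(f, -1) - 0.5) < 1e-12
assert abs(fourier_coeff(f, 3)) < 1e-12
```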

If f is complex-valued we set f(x) = α(x) + iβ(x) for x ∈ [0, 2π], where
α(x) is the real part of f and β(x) is the imaginary one. Recalling that


e^{−ikx} = cos(kx) − i sin(kx) and letting

ak = (1/2π) ∫_0^{2π} [α(x) cos(kx) + β(x) sin(kx)] dx,

bk = (1/2π) ∫_0^{2π} [−α(x) sin(kx) + β(x) cos(kx)] dx,

the Fourier coefficients of the function f can be written as

f̂k = ak + i bk,   ∀k = 0, ±1, ±2, . . . .   (10.49)

We shall assume henceforth that f is a real-valued function; in such a case
f̂−k is the complex conjugate of f̂k for any k.

Let N be an even positive integer. Analogously to what was done in
Section 10.1, we call the truncation of order N of the Fourier series the
function

f*N(x) = Σ_{k=−N/2}^{N/2−1} f̂k e^{ikx}.

The use of capital N instead of small n is to conform with the notation usu-
ally adopted in the analysis of discrete Fourier series (see [Bri74], [Wal91]).
To simplify the notations we also introduce an index shift so that

f*N(x) = Σ_{k=0}^{N−1} f̂k e^{i(k−N/2)x},

where now

f̂k = (1/2π) ∫_0^{2π} f(x) e^{−i(k−N/2)x} dx = (1/2π)(f, φk),   k = 0, . . . , N − 1.   (10.50)
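The effect of the index shift in (10.50) can be seen numerically (a sketch constructed here, with the shifted coefficients again approximated by the trapezoidal rule): for f(x) = cos(x) with N = 8, the shifted coefficient f̂k carries frequency k − N/2, so the two nonzero coefficients sit at k = N/2 − 1 = 3 and k = N/2 + 1 = 5 (frequencies −1 and +1), each equal to 1/2.

```python
# Shifted Fourier coefficients (10.50), approximated by the trapezoidal rule.
import cmath
import math

def shifted_coeff(f, k, N, M=64):
    """Approximate (1/2pi) * int_0^{2pi} f(x) e^{-i(k - N/2)x} dx."""
    h = 2 * math.pi / M
    return sum(f(j * h) * cmath.exp(-1j * (k - N // 2) * j * h)
               for j in range(M)) / M

N = 8
coeffs = [shifted_coeff(math.cos, k, N) for k in range(N)]
# Frequencies -1 and +1 land at shifted indices k = 3 and k = 5.
assert abs(coeffs[3] - 0.5) < 1e-12 and abs(coeffs[5] - 0.5) < 1e-12
assert all(abs(coeffs[k]) < 1e-12 for k in range(N) if k not in (3, 5))
```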