Possible Problems for Exams

Chapter ONE

Construct the floating point system where the numbers

$\displaystyle 1000,\frac{1}{2}, \frac{1}{3},\frac{1}{4},\frac{1}{5},\frac{1}{6},\frac{1}{2048}$

may be exactly representable.

Consider the floating point system with $N=5$

digits with base $\beta=2$ and lower and upper values of exponents being $U=5$

and

. Subnormals are allowed.

How many numbers are there between the number $\frac{1}{4}$ and number (not counting $\frac{1}{4}$ and .
What is the number closest to $\frac{1}{5}$ ?
What is the smallest INTEGER which is not representable in this floating point system?
How would your answer change if and were to be and ?

Let

and

be two adjacent positive Normalized Single Precision floating point numbers.

What is the minimal possible distance between and ?
What is the maximum possible distance between and ?
How many Double Precision Numbers are there between and (for double numbers ).

IEEE Single Precision has $\beta=2$ ,

, and subnormals are allowed. What is the maximum integer value of

for which the number

$\displaystyle x = 2^k+2^{-k}$

can be represented exactly in Single Precision?

IEEE Single Precision has $\beta=2$ , , , , and subnormals are allowed.

In class we have shown that if is a real number, and $\bar x$ is its floating point representation, then the maximum possible relative error in representing this number is given by

$\displaystyle \vert\frac{x-\bar{x}}{x}\vert\le\frac{\epsilon}{2},$

(1)

where $\epsilon$ is machine precision, $\epsilon = \beta^{1-N}.$
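
Bound (1) can be checked numerically with a short Python sketch. It assumes IEEE single precision with $N=24$ significand bits and rounding to nearest; the helper `to_single` is a hypothetical name that rounds a double to single precision through the standard `struct` module. The sample values are illustrative only.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest IEEE single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

beta, N = 2, 24                 # assumed: IEEE single precision
eps = float(beta) ** (1 - N)    # machine precision, eps = beta^(1-N) = 2^-23

for x in (1.0 / 3.0, 0.1, 75.0 + 1.0 / 3.0):
    rel_err = abs(x - to_single(x)) / abs(x)
    # For normalized numbers the relative error should not exceed eps/2.
    print(x, rel_err, rel_err <= eps / 2)
```

The check covers normalized numbers only; the subnormal range is the subject of the questions that follow.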

How does equation (1) change for subnormals?

To answer this question,

First, consider $y = 2^{-128}$ . How is going to be represented in IEEE SP?
What is the floating point number that follows number on the computer number line?
What is the maximum possible absolute error in representing numbers between and ?
Consider number such that

$\displaystyle y<z<2y.$
What is the maximum possible relative error in representing the number ?
Extra credit: Generalize the result of the previous question to any number $x$ with $2^{-149}<x<2^{-126}$ . You can guess the correct result by contemplating equation (1).

In single precision, what is the floating point number that follows “32”? In other words, what is the smallest positive floating point number

such that

$\displaystyle x>32?$

Please write the result the way it is stored in the computer, i.e. in binary floating point form.

Consider a floating-point arithmetic with base , precision and exponent range . In other words, each number in this system can be represented as

$\displaystyle x = \left( d_0 + \frac{d_1}{2}+\frac{d_2}{4}+\frac{d_3}{8}+\frac{d_4}{16}+\frac{d_5}{32}+\frac{d_6}{64} +\frac{d_7}{128}\right)\times 2^{E}, \quad -16\le E\le 16.$

Here

is an integer, and all $d_i$, $i =0,\dots, 7$, are either 0 or 1.

Show that addition is not necessarily associative. I.e., give an example of 3 numbers and such that is not equal to
write down two adjacent normalized numbers and such that $\vert x-y\vert$ is maximal.
In this floating point system, what is the range of possible relative errors in representing a given number by a machine number?

(a) Find the largest open interval around so that all real numbers from the interval are rounded to . That is, find the smallest value of and the largest value of with so that any number from the interval is rounded to the floating point number . Assume double precision is used (53 binary digits).

(b)

Redo the part (a) for the , that is, find the interval that rounds to the floating point number

$\displaystyle x_f = 50 = \left(1+\frac{1}{2}+\frac{1}{2^4}\right)\times 2^5.$

IEEE SP has $\beta=2$ , , , .

Calculate , and $\epsilon_{\rm mach}$ for this system. Assume rounding by chopping.

How many floating point numbers are there between any successive powers of ? For example, how many floating point numbers are there between 2 and 4?

Consider the floating point system with

$\displaystyle \beta=2, p=4, L=-5,U=5.$

(a) What is the distance from number to the next largest floating point number in this floating point system?

(b) What is the distance from number to the next smallest floating point number in this floating point system?

(c) What distance is larger, (a) or (b)? Why? What is the relation between these distances and machine precision $\epsilon_{\rm machine}$ ?

Consider IEEE SP, which has $\beta=2$ ,

. What is the closest floating point number to the number

$\displaystyle x=75+\frac{1}{3}$

in IEEE SP?

Consider IEEE SP, which has $\beta=2$ ,

. What is the absolute error in representing the number

$\displaystyle x=75+\frac{1}{3}$

in IEEE SP?

What is the spacing of the floating point numbers between

and

, i.e. $16 \le x \le 32$ ?

Define machine epsilon and explain its significance

What would be the output in MATLAB for the ratio $\epsilon /\epsilon$ ?

Consider

$\displaystyle f(x) = (e^x - 1) / x.$

a) Approximate using third degree Taylor polynomial expanded about . Use this expansion to show that

$\displaystyle \lim\limits_{x\to 0}f(x) = 1.$

b) Explain why MATLAB would compute the limit of to be 0.
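
The behaviour described in part b) can be reproduced in any IEEE double precision environment. The Python sketch below is an illustration under that assumption; the names `f_naive` and `f_stable` are hypothetical, and `math.expm1` is used only as a reference that avoids the subtraction.

```python
import math

def f_naive(x):
    """Direct evaluation of (e^x - 1)/x, as written in the problem."""
    return (math.exp(x) - 1.0) / x

def f_stable(x):
    """Reference evaluation: expm1 computes e^x - 1 without the subtraction."""
    return math.expm1(x) / x

for x in (1e-1, 1e-8, 1e-17):
    print(x, f_naive(x), f_stable(x))
```

Comparing the two columns for the smallest `x` shows where the direct formula loses all of its accuracy.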

What is the largest value of

such that

$\displaystyle {\rm {float}}(19+2^{-k})> {\rm {float}}(19)$

in IEEE SP system? Here ${\rm float(x)}$ is a floating point representation of the number

We have studied in class that the maximum possible relative error for normalized numbers is equal to

$\displaystyle \vert\frac{x - {\rm float}(x)}{x}\vert\le\frac{\epsilon_{\rm mach}}{2}.$

What is the range of possible errors for the subnormal floating point numbers?

Consider

$\displaystyle f(x)=\frac{1-(1-x)}{x}$

function. The following values were obtained in MATLAB:



$\frac{\epsilon}{4}(1+10^{-10})$

${\epsilon\over 4}(1+10^{-10})+{\epsilon\over 2}$

${\epsilon\over 4}(1+10^{-10})+\epsilon$

$-{\epsilon\over 2}(1+10^{-10})-\epsilon$

Please explain in detail the values in the right column of this table.

For which positive integers

can the number $5+2^{-k}$ be represented exactly, with no rounding error in IEEE SP floating point system?

Consider IEEE SP that has binary numbers $\beta=2$ , with digits, and lower and upper values of the exponents of , . As we discussed in class, the number $\frac{1}{3}$ is not representable in IEEE SP. What is the Floating Point Number that precedes $x=\frac{1}{3}$ ? In other words, find the largest Floating Point Number that is less than $x=\frac{1}{3}$ .

Hint: your answer should contain 24 binary digits of the mantissa and the value of the exponent.

(a)Which of the following operations of two positive floating point numbers can produce overflow?

— addition

— subtraction

— multiplication

— division

If you answered “yes” to any of the questions, please give one example of two Single Precision numbers that produce overflow by given operation.

(b) Which of the following operations of two positive floating point numbers can produce underflow?

— addition

— subtraction

— multiplication

— division

If you answered “yes” to any of the questions, please give one example of two Single Precision numbers that produce underflow by given operation.

What is the number that follows the number “zero”, ${\rm zero}\equiv 0$ . In other words, find the smallest possible positive number that is representable in this system. Write the result in both binary and decimal format.

Consider the following expression:

$\displaystyle \frac{1}{1-x}-\frac{1}{1+x},$

assuming $x\ne\pm 1.$

(a) For what values of $x$ is it difficult to calculate this expression accurately in floating point arithmetic?

(b) Give a rearrangement of the terms such that, for the range of $x$ in part (a), the computation is more accurate in a floating point system.

Consider IEEE SP that has binary numbers $\beta=2$ , with

digits, and lower and upper values of the exponents of

. What is the Floating Point Number that is before

? In other words, what is the largest Floating Point Number that is less than

Consider the IEEE floating point system, where the binary numbers $\beta=2$ , with

digits, and lower and upper values of the exponents of

are used. Also, assume that the “rounding to nearest” rule is used, and if there is a tie, the smaller number is chosen.

— For what numbers will the computer claim that inequality is true?

— For what real numbers will a computer claim that ?

— Suppose it is claimed that the solution of is exactly representable in this system. Why is this not possible? What is the distance between the two floating point numbers that are right above and right below the solution of in this system?

Consider the following toy floating point system: base $\beta=2$ , with

digits, with lower exponent of

and upper exponent

Consider the following claim: If the two positive binary floating point numbers and in this toy floating point systems are such that

$\displaystyle \frac{1}{2}\le \frac{x}{y}\le 2,$

then their difference,

is exactly representable number in the floating point system.

Is this claim true or false? If it is true, explain why; if it is false, find a counterexample.

Consider the following function:

$\displaystyle f(x)=\frac{1-(1-x)}{x}.$

The value of this function for any

is equal to one. However, when we calculate

on the computer for small value of

, result is not equal to

. Here is a computer generated graph of

for small value of

Please explain why this graph looks the way it does.

In particular, answer the following questions:

Why is zero for $0<x<0.555 \times 10^{-16}$ ?
Why is zero for $-1.1\times 10^{-16} <x<0$ ? Why is the interval of zero values on the left twice as long as the one on the right?
Why, after being zero, does jump to the value of at $x\simeq 0.5556 \times 10^{-16}$ ?
Why does oscillate around for $0.5556 \times 10^{-16} <x< 5\times 10^{-16}$ ?
Why do the oscillations diminish as becomes larger?
Why are the oscillations twice as frequent for positive than for negative ?
Explain why the second jump appears at $x=1.665 \times 10^{-16}$ .
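
Since the graph is not reproduced here, the values it shows can be regenerated with a small Python sketch. It assumes IEEE double precision with rounding to nearest (the same arithmetic MATLAB uses by default); the sample points are illustrative choices, not taken from the original figure.

```python
def f(x):
    """Evaluate (1 - (1 - x)) / x exactly as written, in double precision."""
    return (1.0 - (1.0 - x)) / x

samples = (1e-17, 5e-17, 6e-17, 1e-16, 2e-16, 1e-15, -1e-16, -2e-16)
for x in samples:
    print(f"{x: .3e}  {f(x):.6f}")
```

Tabulating more points between $-5\times 10^{-16}$ and $5\times 10^{-16}$ gives the staircase and oscillation pattern the questions above refer to.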

Assume a normalized floating point system with base $\beta=10$ , three digits of accuracy

and the lowest possible exponent of

What is the smallest possible positive floating point number that is representable in this system (also called “UFL” for underflow level)?
If $x=6.87\times 10^{-97}$ and $y=6.81\times 10^{-97}$ , what is the result of computing ?
If the subnormal numbers were to be allowed, what would be the result of ?

IEEE SP has $\beta=2$ , , , . In the single precision floating point system, write down the floating point number that follows the number . (In other words, find the minimal value of , that is exactly representable in this floating point system.)

IEEE SP has $\beta=2$ , , , . What is the smallest possible positive integer that is not a single precision number?

In a floating point system with precision

decimal digits, $\beta=10$ , let

and

How many significant digits does the difference contain?
If the floating point system is normalized, what is the minimum exponent range for which and are exactly representable?
Is the difference exactly representable, regardless of exponent range, if gradual underflow is allowed? Why?

Suppose one calculates using computer arithmetic the following number:

$\displaystyle A=3*(\frac{4}{3}-1)-1.$

We have shown in class that one can estimate the machine precision by the number

, so that

$\displaystyle \epsilon_{\rm machine}\simeq A.$

Determine whether the following examples may be used to determine machine precision:

$\displaystyle B=(\frac{7}{3} - \frac{4}{3}) - 1,$

and

$\displaystyle C= (\frac{4}{3} - \frac{1}{3}) - 1.$

Explain your reasoning by using the system with two digits of precision.

You may find it helpful to use calculator to gain intuition for this problem.
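
The estimate $A$ from this problem can be evaluated directly. The sketch below assumes Python floats, which are IEEE double precision, and compares the magnitude of $A$ with the double precision machine epsilon reported by the standard library.

```python
import sys

# A = 3*(4/3 - 1) - 1, evaluated in double precision
A = 3.0 * (4.0 / 3.0 - 1.0) - 1.0

print(A)                                   # the sign of A may be negative
print(abs(A) == sys.float_info.epsilon)    # compare |A| with machine epsilon
```

The same three-line pattern can be reused to experiment with the expressions $B$ and $C$.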

Consider a floating-point number system with base , precision and exponent range I.e., in this system any number can be written as $1.a_{1}a_{2}a_{3}a_{4} \ast 2^{L}.$

(a) Write down two adjacent normalized numbers and such that $\left\vert x-y\right\vert$ is minimal.

(b) In this floating-point system, what is the maximal possible error of representing $\pi$ by a machine number? What is the possible relative error in representing $\pi$ by a machine number?

Assume a decimal (base 10) floating point system having machine precision $\epsilon_{mach}=10^{-5}$ and an exponent range of $\pm 20$ . What is the result of each of the following floating point arithmetic operations?

$\displaystyle 1+ 10^{-7}=$

$\displaystyle 1+10^3=$

$\displaystyle 1+ 10^7=$

$\displaystyle 10^{10}+10^3=$

$\displaystyle 10^{10}/10^{-15}=$

$\displaystyle 10^{-10}\times 10^{-15}=$

Chapter TWO

Suppose you are solving

by iterations, and you have obtained $(x_{k-1},f(x_{k-1}))$ and

. Now use linear interpolation to find $(x_{k+1},f(x_{k+1}))$ such that $f(x_{k+1})\simeq 0$ . What is the resulting method of solving

that you have obtained?

For the secant, Newton, and bisection methods: Looking at iterative error when solving

, which of these methods converges to an error of $10^{-16}$ the fastest?

True or False:
If Newton's method converges to a solution

for a particular choice of

, then it will converge to

for any starting point between

and

.
To get credit for this problem, you will need to give comprehensive explanation of your answer.

For computing the midpoint

of an interval

, which of the following two equations is preferable in a floating point system? Why? When? Devise an example in which the midpoint given by the equation lies outside of the

interval.

Solve with three digits accuracy an equation

$\displaystyle x^2 + 3 x -8\times 10^{-14}=0.$

Note that you have to find two roots. Which versions of the quadratic formula are you going to use to find these roots?

The “divide and average” method for computing a square root

of a given number

can be formulated as follows:

$\displaystyle x_{n+1} = \frac{x_n+ \frac{a}{x_n}}{2}.$

Show that this method converges to $\sqrt a$ , i.e. that

$\displaystyle \lim\limits_{n\to\infty}x_n = \sqrt{a},$

and calculate the rate of convergence.
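
The iteration can be tried out numerically before analysing it. The sketch below uses an illustrative value $a=2$ and starting guess $x_0=1$ (both assumptions, not part of the problem) and prints the error at each step, which gives a feel for the convergence rate.

```python
import math

def divide_and_average(a, x0, steps):
    """Return the iterates x_0, ..., x_steps of x_{n+1} = (x_n + a/x_n)/2."""
    xs = [x0]
    for _ in range(steps):
        xs.append((xs[-1] + a / xs[-1]) / 2.0)
    return xs

a = 2.0                      # illustrative value, not from the problem
iterates = divide_and_average(a, x0=1.0, steps=6)
for n, x in enumerate(iterates):
    print(n, x, abs(x - math.sqrt(a)))
```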

Show that if

$\displaystyle f(x) =\frac{1}{x}-b,$

for nonzero

, then the Newton method, converging to the root

can be implemented without performing any divisions.

Newton method for solving a scalar nonlinear equation requires computation of the derivative of at each iteration. Suppose that we instead replace the true derivative with a constant value , that is we use the iteration scheme

$\displaystyle x_{k+1} = x_k - f(x_k)/d.$

(a) Under what condition on value of will this scheme be locally convergent?

(b) What is the convergence rate of this scheme?

(c) Is there any value of to give a quadratic convergence?

In class we have shown that Newton method

$\displaystyle x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)},$

(2)

has quadratic rate of convergence for simple root (i.e. when

and $f'(x_*)\ne 0$ ). In the homework you have shown that convergence rate is linear for the double root.

Prove that the Newton method has a linear rate of convergence for a triple root, i.e. when $f(x_*)=f'(x_*)=f''(x_*)=0, f'''(x_*)\ne 0$ .

Extra Credit Prove it for roots of multiplicity $n$, i.e. when

$\displaystyle f(x_*)=0, \quad \left(\frac{d}{d x}\right)^k f(x_*)= 0, \ k = 1,2, \dots, n-1, \quad \left(\frac{d}{d x}\right)^{n} f(x_*)\ne 0 .$

Suppose you came to the Ice Age, and computer in your Time Machine is broken. Suppose to come back to 2016 to complete the numerical computing exam, you need to calculate “two to the power one third”, i.e.

$\displaystyle 2^\frac{1}{3}$

. You can only use

and

. Set up the nonlinear equation that has $2^\frac{1}{3}$ as the solution.

Describe how to use the bisection method to make this calculation. What is your initial bracket? How many iteration steps do you need to perform to get the solution with $10^{-3} \simeq 2^{-10}$ accuracy?
Describe how to use the Newton method for this calculation. How many steps do you need to do to get the root with $10^{-3}$ accuracy?

Consider the following iteration methods

$\displaystyle x_{n+1} = \frac{20 x_n + \frac{21}{x_n^2}}{21},$
$\displaystyle x_{n+1} = x_n - \frac{x_n^3-21}{3 x_n^2},$
$\displaystyle x_{n+1} = \sqrt{\Big(\frac{21}{x_n}\Big)}.$
$\displaystyle x_{n+1} = \frac{ 21 + 2 x_n^3}{ 3 x_n^2}.$

Assume that all iterations start from

4 points Verify that each of these fixed-point iterations converge to $\sqrt[3]{21}$ , i.e.

$\displaystyle \lim\limits_{n\to\infty}x_n=X=\sqrt[3]{21},$
and
rank the methods in order based on their apparent speed of convergence (i.e. find the fastest method to converge, the second fastest, the third and the last to converge).

In class we have shown that fixed point iterations

$\displaystyle x_{k+1}= g(x_k), x^*=g(x^*),$

converge if

$\displaystyle \vert g'(x^*)\vert<1.$

Under what conditions do fixed point iterations converge if

$\displaystyle g'(x^*)=1?$

To gain an intuition on this problem, consider the fixed point iterations

$\displaystyle x_{k+1}= g(x_k), g(x)=x-x^2/2.$

For this problem the fixed point is

. Start with

. Then

$\displaystyle x_1=g(x_0)=1/2,$

$\displaystyle x_2=g(x_1)=3/8=0.375,$

$\displaystyle x_3=g(x_2)=39/128=0.304688.$

It seems to converge to the fixed point

, yet

. Why is that?
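
The listed iterates can be reproduced exactly with rational arithmetic. The sketch below assumes the starting value $x_0=1$, which is inferred from $x_1=g(x_0)=1/2$ and is not stated explicitly above.

```python
from fractions import Fraction

def g(x):
    """Fixed point map g(x) = x - x^2/2 from the example."""
    return x - x * x / 2

x = Fraction(1)      # assumed starting value x_0 = 1 (inferred from x_1 = 1/2)
iterates = [x]
for _ in range(3):
    x = g(x)
    iterates.append(x)

print([str(v) for v in iterates])
```

Running the loop for many more steps (with floats instead of fractions) shows how slowly the iterates approach the fixed point.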

Consider the function

$\displaystyle f(x) = e^x - 2 x^2.$

The graph of the function is given by

As you see from the graph, this function has three roots, given by , and .

The following two functions are defined as

$\displaystyle g_1(x) = -\sqrt{\frac{e^x}2}$
$\displaystyle g_2(x) = \sqrt{\frac{e^x} 2}$			(3)

Consider the following fixed point iterations:

$\displaystyle x_{k+1} = g_1(x_k)$			(4)
$\displaystyle x_{k+1} = g_2(x_k)$			(5)

Show that fixed point iterations (4) with converges to the root .
Show that fixed point iterations (5) with converges to the root .
Show that iterations with neither nor converge to the root , regardless of the starting point.
Propose a fixed point iteration scheme that will converge to the root .

We have studied Secant method

$\displaystyle x_{k+1} = x_k - f(x_k)\frac{x_k-x_{k-1}}{f(x_k)-f(x_{k-1})}.$

Show that it can be equivalently rewritten as

$\displaystyle x_{k+1}= \frac{x_{k-1} f(x_k)- x_k f(x_{k-1})}{f(x_k)-f(x_{k-1})}.$
List two advantages and two disadvantages of Secant method relative to Newton method.
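
For reference, the first form of the secant update can be sketched in Python as follows. The function name `secant`, the stopping tolerance, and the test function $x^2-2$ are illustrative assumptions.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant iteration x_{k+1} = x_k - f(x_k)(x_k - x_{k-1})/(f(x_k) - f(x_{k-1}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:             # flat secant line: the update is undefined
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1          # shift the window: only one new f-evaluation
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1

root = secant(lambda x: x * x - 2.0, 1.0, 2.0)   # illustrative test function
print(root)
```

Note that each step needs one new function evaluation and no derivative.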

Suppose you are solving

by iterations, and you have obtained $(x_{k-1},f(x_{k-1}))$ and

. Now use linear interpolation to interpolate a straight line through these two points, and choose $x_{k+1}$ so that $f(x_{k+1})\simeq 0$ . What is the resulting method of solving

that you have derived?

Express the Newton iteration method for solving the following system of nonlinear equations:

$\displaystyle x_1^2 + x_1 x_2 ^3 =9,$
$\displaystyle 3 x_1^2 x_2 - x_2^3 = 4,$

and carry out one iteration starting from the starting point

Carry out one iteration of Newton's method applied to the system

$\displaystyle x_{1}^{2}-x_{2}^{2} = 0,$
$\displaystyle x_{1}x_{2} = 1,$

with starting value ${\bf x}_{0}=[0,1]^{T}.$

The following values for the solution of were computed using Matlab. What method was used (bisection, Newton or secant)? Make sure you explain in detail why it is the method you claim:

X=50.25125628140704

X=25.378140640072242

X=12.944094811287638

X=6.7320923307183715

X=3.636103634563207

X=2.1079101939699143

X=1.3816957571715662

X=1.0826201384421688

X=1.0058580941730362

X=1.0000339198559194

X=1.0000000011504786

X=1.0000000000000000
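
One way to examine a list of iterates like this is to compare successive errors. The sketch below takes the last listed value as the limit (an assumption) and prints, for the final few iterates, the ratios $e_{k+1}/e_k$ and $e_{k+1}/e_k^2$. If the first ratio settles near a constant the convergence looks linear; if the second one does, it looks quadratic.

```python
# Last few iterates copied from the list above.
xs = [1.3816957571715662, 1.0826201384421688, 1.0058580941730362,
      1.0000339198559194, 1.0000000011504786]
limit = 1.0                                # assumed: the last listed iterate

errors = [abs(x - limit) for x in xs]
linear_ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]
quadratic_ratios = [e1 / e0 ** 2 for e0, e1 in zip(errors, errors[1:])]

for lin, quad in zip(linear_ratios, quadratic_ratios):
    print(lin, quad)
```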

Chapter THREE

Show that for arbitrary square matrices

and

the following is true

$\displaystyle (A\cdot B)^{-1} = B^{-1}\cdot A^{-1}.$

Show that for arbitrary square matrices

and

the following is true

$\displaystyle (A\cdot B)^{\rm T} = B^{\rm T}\cdot A^{\rm T},$

where ${\rm T}$ denotes the transposition

Prove or give counterexample: if

is a nonsingular matrix, then

$\displaystyle \vert\vert A^{-1}\vert\vert=\vert\vert A\vert\vert^{-1}.$

Consider the following systems of equations:

$\displaystyle x^2+y^2=4, y = x^6.$

$\displaystyle x^3+4 y^3=4, \ \ 2 x^2+y^2y = 7.$

Sketch the two curves and explain where approximately the solutions are located
Set up and explain the Newton method to find the solutions. Calculate the Jacobian and the RHS ${\bf f}$ of the Newton method.
What would be the good starting point for your calculations?
Write down in all details the system of equations to make the first iteration. Do not solve the resulting equations.

Suppose that you use Newton method

$\displaystyle x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)},$

(6)

to find the solution of the equation

$\displaystyle f(x^*)=0$

for which

$\displaystyle f'(x^*)\ne 0, \rm {\ and \ } f''(x^*)=f'''(x^*)=f''''(x^*)=0.$

What would be the convergence rate for the Newton method for such a case? HINT: convergence of the Newton method is not necessarily quadratic.

(a) Let ${\bf A}$ be an arbitrary square matrix, and let

be an arbitrary scalar.

Prove or disprove the following statements:

(i)

$\displaystyle \vert\vert c {\bf A}\vert\vert = \vert c\vert \cdot \vert\vert A\vert\vert$

(ii)

$\displaystyle {\rm cond} ( c \cdot {\bf A}) = \vert c\vert \cdot {\rm cond} ({\bf A})$

(b)

Let ${\bf A}$ be an $n\times n$ diagonal matrix with all its diagonal entries equal to .

(i) What is the value of $det( {\rm A})$ ?

(ii) What is the value of ${\rm cond (A)}$ ?

Consider the linear system

$\displaystyle A x =b,$
$\displaystyle A=\left(\begin{array}{cc} 1 & 1.1 \\ 0.9 & 1\end{array}\right),$
$\displaystyle b = \left(\begin{array}{c} 1.11 \\ 1\end{array}\right).$

Solve this system by any method you like using exact arithmetic.
Solve this system by computing an inverse of using two decimal digit machine arithmetic.
Hint If

$\displaystyle A=\left(\begin{array}{cc} a & b \\ c & d\end{array}\right),$
then

$\displaystyle A^{-1}=\frac{1}{a d - b c} \left(\begin{array}{cc} d & -b \\ -c & a\end{array}\right).$
Also

$\displaystyle x= A^{-1}b.$
Now solve this system by LU factorization, using two decimal digit machine arithmetic.
Compare and explain the difference between results obtained in items 1, 2 and 3. What conclusion can you make about LU factorization and computing the solution of this system by using the inverse of a matrix?
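
The $2\times 2$ inverse formula from the hint can be sketched in exact rational arithmetic as follows. The matrix and right-hand side below are hypothetical and chosen only to illustrate the formula; they are not the system from this problem.

```python
from fractions import Fraction as F

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

A_inv = inverse_2x2(F(2), F(1), F(1), F(3))    # hypothetical matrix [[2, 1], [1, 3]]
x = matvec(A_inv, [F(3), F(5)])                # hypothetical right-hand side
print(x)
```

Replacing `Fraction` with values rounded to two decimal digits after every operation is one way to mimic the machine arithmetic asked for above.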

In this problem

$\displaystyle A = \left[ \begin{array}{rr} 1 & -2 \\ -2 & 3 \end{array}\right]$

(a) Find a vector

such that $\vert\vert A x\vert\vert _\infty = \vert\vert A\vert\vert _\infty$ .

(b) Find a vector such that $\vert\vert A x\vert\vert _1 = \vert\vert A\vert\vert _1$

Consider the

problem with $\alpha>0$ and

$\displaystyle A = \left[ \begin{array}{rr} -1 & 1 \\ 0 & \alpha \end{array}\right], \quad b = \left[ \begin{array}{r} 1 \\ 1 \end{array}\right]$

Suppose that the error ${\bf e} = {\bf x } - {\bf x_c}$ in computed solution is small, but nonzero. Here ${\bf x}$ is an exact solution, and ${\bf x}_c$ is a computed solution. For what values of $\alpha$ , if any, the residual will be large?

HINT For what values of $\alpha$ , if any, the matrix will be ill-conditioned?

HINT If

$\displaystyle A=\left(\begin{array}{cc} a & b \\ c & d\end{array}\right),$

then

$\displaystyle A^{-1}=\frac{1}{a d - b c} \left(\begin{array}{cc} d & -b \\ -c & a\end{array}\right).$

The Kronecker delta (named after Leopold Kronecker) is a function of two variables, usually two integers, defined as

$\displaystyle \delta^i_j =\left\{ \begin{array}{ll} 1 & {\rm if\ } i=j\\ 0 & {\rm if\ } i\ne j \end{array}\right.$

(7)

Consider the $n\times n$ matrix defined as

$\displaystyle A = (a_{ij}), a_{ij} = i\cdot \delta^i_j.$

In other words, $A$ is a diagonal matrix with diagonal entries being equal to

$\displaystyle 1, 2, 3, \dots (n-1), n.$

What is the one norm of this matrix?
What is the two norm of this matrix?
What is the $\infty$ norm of this matrix?
What is the Condition Number of this matrix?

For the matrix

$\displaystyle {\bf A}=\left[ \begin{array}{rrr} 2 & -4 & 2 \\ 1 & 0 & 5 \\ 2 & -2 & 2 \end{array}\right]$

find the

-factorization (show both

and

matrices explicitly).

Consider the system

$\displaystyle {\bf A} \cdot x = b,$

(8)

where

$\displaystyle {\bf A}=\left[ \begin{array}{rrr} 5 & 6 & 7 \\ 10 & 20 & 23 \\ 15 & 50 & 58 \end{array}\right]$

and

$\displaystyle b =\left[ \begin{array}{r} 6\\ 13\\ 23 \end{array}\right]$

$\displaystyle {\bf A}=\left[ \begin{array}{rrr} 1 & 3 & 5 \\ 2 & 9 & 15 \\ 2 & 18 & 37 \end{array}\right]$

and

$\displaystyle b =\left[ \begin{array}{r} 22\\ 65\\ 149 \end{array}\right]$

(a) Find explicitly factorization of ${\bf A}$ .

(b) Use this decomposition to solve (8).

Consider the matrix

$\displaystyle {\bf B}=\left[ \begin{array}{rrr} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 2 & 1 \end{array}\right]$

Calculate the inverse of this matrix $B^{-1}$ by representing the matrix as a product of two elementary Gauss elimination matrices.

In this problem $\alpha$ is a small positive number. Sketch the two lines in the

plane, and describe how they change, including the point of intersection, as $\alpha$ approaches zero. Also, calculate the condition number for the matrix, and describe how it changes, when $\alpha$ approaches zero.

$\displaystyle x-y = -1, -x+ (1+\alpha) y =1.$
$\displaystyle 2 x + 4 y = 1, (1-\alpha)x+2 y = -1.$

Prove that the one norm is the maximum absolute column sum; use $2\times 2$ matrices.

Consider

$\displaystyle {\bf A}=\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 &-2&2\\ 0 & 2& -1\end{array}\right)$

(9)

a) Find the LU factorization of the given matrix

b) Calculate the norms $\vert\vert A\vert\vert _2$ , $\vert\vert A\vert\vert _\infty$ , and $\vert\vert A\vert\vert _1$ for this matrix

c) Calculate the condition number for the matrix A.

Consider

$\displaystyle {\bf A}=\left( \begin{array}{cc} -1 & 1 \\ 2 & 2 \end{array}\right)$

Find all vectors satisfying $\vert\vert x\vert\vert _\infty=1$ and $\vert\vert{\bf A} x\vert\vert _\infty=\vert\vert{\bf A}\vert\vert _\infty$ .
Find a vector satisfying $\vert\vert{\bf A} x\vert\vert _1=\vert\vert{\bf A}\vert\vert _1.$
Find a vector satisfying $\vert\vert{\bf A} x\vert\vert _2=\vert\vert{\bf A}\vert\vert _2.$

The $\infty$ norm of a vector is defined as a modulus of a maximum component of a vector. As we learned in class, matrix norms are defined through the vector norms as

$\displaystyle \vert\vert A\vert\vert _\infty = {\rm max} \frac{ \vert\vert A x\vert\vert _\infty}{\vert\vert x\vert\vert _\infty}.$

Use this definition of the matrix $\infty$ norm through the vector norm to prove that the $\infty$ norm of a matrix is the maximum absolute row sum.

Complete the proof for $2\times 2$ matrices.

b Extra credit 10% Prove this statement for general $n\times n$ matrices.
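
Before attempting the proof, the statement can be sanity-checked numerically. The sketch below compares the maximum absolute row sum with the largest ratio $\vert\vert Ax\vert\vert_\infty/\vert\vert x\vert\vert_\infty$ found over the sign vectors $x$ with entries $\pm 1$. The matrix is an illustrative choice, and restricting the search to sign vectors is an assumption of this sketch, not part of the definition.

```python
from itertools import product

def inf_norm_vec(v):
    """Infinity norm of a vector: largest absolute component."""
    return max(abs(c) for c in v)

def max_abs_row_sum(A):
    """Maximum absolute row sum of a matrix given as a list of rows."""
    return max(sum(abs(a) for a in row) for row in A)

def sampled_ratio_max(A):
    """Largest ||Ax||_inf / ||x||_inf over sign vectors x with entries +-1."""
    best = 0.0
    for x in product((-1.0, 1.0), repeat=len(A)):
        Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        best = max(best, inf_norm_vec(Ax) / inf_norm_vec(x))
    return best

A = [[1.0, -4.0], [2.0, 1.0]]          # illustrative matrix
print(max_abs_row_sum(A), sampled_ratio_max(A))
```

A numerical agreement like this is only a check; it does not replace the argument that no other vector can give a larger ratio.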

Consider the system

$\begin{displaymath}\left[ \begin{array}{ll} 4 & 2 \\ 2+\varepsilon & 1+\varepsi... ...y}\right] =\left[ \begin{array}{l} 2 \\ 1 \end{array}\right] .\end{displaymath}$

(10)

where $\varepsilon$ is small. The system has two approximate solutions ${\bf\hat{x}=}[0,1]^{T}$ and ${\bf\tilde{x}=}[1-5\varepsilon ,-1]^{T}.$ Find the norms of the respective residuals. Which one is smaller? Find the condition number of the matrix. Explain, why you should not use residuals in this case to determine the quality of a solution.

Hint: Find an exact solution to (10).

Hint

$\displaystyle \left[ \begin{array}{rr} a & b \\ c & d \end{array}\right]^{-1} = \frac{1}{a d - b c}\left[ \begin{array}{rr} d & -b \\ -c & a \end{array}\right]$

Calculate an inverse of the following matrix:

$\begin{displaymath}{\bf A=}\left( \begin{array}{cccccc} 1 & 0 &0 &0 &0 &0 \\ 0 ... ... 0 & 0 &3 &0 &1 &0 \\ 0 & 0 &4 &0 &0 &1 \\ \end{array}\right)\end{displaymath}$

(11)

Calculate the inverse of this matrix, ${\bf A}^{-1}$ .

Let a $40\times 40$ matrix ${\bf A}$ be factored as follows:

$\begin{displaymath}{\bf A=LU=}\left( \begin{array}{cccccc} 1 & & & & & \\ -1 & ... ...s & \vdots \\ & & & & 1 & 1 \\ & & & & & 1 \end{array}\right)\end{displaymath}$

(12)

where the blank spaces stand for zeros. Use the LU factorization to solve the system ${\bf Ax=[}0,1,-1,1,...,-1,1]^{T}.$

Consider the matrix

$\displaystyle A = \left( \begin{array}{cc} 1 & 1+\epsilon\\ 1-\epsilon & 1 \end{array}\right)$

What is the determinant of ?
In floating point arithmetic, for what range of $\epsilon$ will the computed value of the determinant be nonzero
What is the factorization of ?
In floating point arithmetic, for what value of $\epsilon$ will the computed value of be singular?

In a computer with no built-in function for floating point division, one might instead use multiplication by the reciprocal of the divisor. Apply Newton's method to produce an iterative scheme to approximate the reciprocal of a number

, to solve the equation

$\displaystyle f(x)=\frac{1}{x}-y=0,$

given $y$. Considering the intended application, your scheme should not contain any divisions.

Consider the following system of equations:

$\displaystyle x^2+y^2=4, y = x^6.$

Sketch the two curves and explain where approximately the solutions are located
Set up and explain the Newton method to find the solutions. Calculate the Jacobian and the RHS ${\bf f}$ of the Newton method.
What would be the good starting point for your calculations?
Write down in all details the system of equations to make the first iteration. Do not solve the resulting equations.

Let a $6\times 6$ matrix ${\bf A}$ be factored as follows:

$\begin{displaymath}{\bf A=} \left( \begin{array}{ccccccc} 1 & 1 & 1 & 1 &1 &1 \\... ...& 0 & 0 & 0 &1 &1 \\ 0 & 0 & 0 & 0 &0 &1\\ \end{array}\right)\end{displaymath}$

Using this factorization, solve for

the following equation:

$\displaystyle {\bf A x} = \left( \begin{array}{c} 21 \\ 41 \\ 59 \\ 74 \\ 85 \\ 91 \end{array}\right)$

Chapter FOUR

The Kronecker delta (named after Leopold Kronecker) is a function of two variables, usually two integers, defined as

$\displaystyle \delta^i_j =\left\{ \begin{array}{ll} 1 & {\rm if\ } i=j\\ 0 & {\rm if\ } i\ne j \end{array}\right.$

(13)

Consider the $n\times n$ matrix defined as

$\displaystyle A = (a_{ij}), a_{ij} = i\cdot \delta^i_j.$

In other words, $A$ is a diagonal matrix with diagonal entries being equal to

$\displaystyle 1, 2, 3, \dots (n-1), n.$

Find eigenvalues of .
Find eigenvectors of .
To which eigenvector would power iterations converge? Explain how to set up such power iterations.
To which eigenvector would inverse power iterations converge? Explain how to set up such inverse power iterations.
What algorithm would you use to find the second largest eigenvalue and the corresponding eigenvector? Please explain in detail.
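
As a reminder of the mechanics, a normalized power iteration can be sketched in a few lines of Python. The $2\times 2$ symmetric matrix, the starting vector, and the number of steps are illustrative assumptions; the matrix is deliberately not the one from this problem.

```python
import math

def power_iteration(A, x, steps):
    """Normalized power iteration; returns (Rayleigh quotient, unit vector)."""
    n = len(A)
    for _ in range(steps):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in y))
        x = [c / norm for c in y]          # rescale to avoid overflow/underflow
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(Ax, x)), x

A = [[2.0, 1.0], [1.0, 2.0]]               # illustrative matrix, eigenvalues 1 and 3
lam, v = power_iteration(A, [1.0, 0.0], steps=60)
print(lam, v)
```

Inverse iteration follows the same loop with the multiplication by $A$ replaced by a linear solve with $A$ (or with a shifted matrix).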

Let the $n\times n$ matrix

have eigenvalues $\lambda_1\dots\lambda_n$ , and eigenvectors $X_1\dots X_n$ . Let the $n\times n$ matrix

have eigenvalues $\mu_1\dots\mu_n$ and same eigenvectors $X_1\dots X_n$ . What are eigenvectors and eigenvalues of the matrix $C$

given by

$\displaystyle C = 2 A^3+ 4 B^{-4}.$

$\displaystyle {\bf A}= \left(\begin{array}{ccc} 1 & 2 & -4\\ 3 &-2 & 2\\ 6 & 2 & 4 \end{array}\right)$

(14)

a) Calculate the Eigenvalues and Eigenvectors for the given matrix

b) What Eigenvalue would the inverse power method converge to? Why?

c) What Eigenvalue would the power method converge to? Why?

d) Define . For this matrix, what Eigenvalue would the power method converge to? Why?

Consider the $6\times 6$ matrix

$\displaystyle A = \begin{bmatrix} 5 & 0 & 0 & 0 & 1 & 0 \\ 0 & 5 & 0 & 0 & 0... ... 0 & 0 \\ 1 & 0 & 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 5\\ \end{bmatrix}$

Find the eigenvalues and eigenvectors of this matrix.
Find the LU decomposition of this matrix.
To what eigenvalue and eigenvector will the power iteration converge?
To what eigenvalue and eigenvector will the inverse power iteration converge?

Suppose that ${\bf A}$ is a symmetric $4\times 4$ matrix with eigenvalues

$\displaystyle -4, 1, 2, 3.$

a) To which of these eigenvalues will the power method converge? Why?

b) To which of these eigenvalues will the inverse power iteration converge? Why?

c) To what eigenvalue will the power iteration method converge when it is applied to the matrix

$\displaystyle {\bf B}=2 {\bf A}+3 I$

? Why?

Consider

$\displaystyle A=\left(\begin{array}{cc} 4 & 8 \\ 1 & 6\end{array}\right),$

(a) Find the eigenvalues and eigenvectors of $A$.

(b) Find eigenvalues and eigenvectors of

$\displaystyle B=\left(\begin{array}{cc} 5 & 8 \\ 1 & 7\end{array}\right)^3 + \left(\begin{array}{cc} 3 & 8 \\ 1 & 5\end{array}\right)^{-1}.$

Note that you do not have to calculate $B$ explicitly.

Chapter FIVE

Given

$\displaystyle f(x) = x^5,$

on the interval $-1\le x\le 2$ , use the points

and

a) Find the piecewise linear interpolation function

b) Find the quadratic interpolation function

Consider

for $-1\le x\le 0$

for $0 \le x\le 1$ .

Solve for so that the given function is a natural cubic spline from $-1\le x\le 1$ .

Use the theorems we studied in class to calculate the maximum error in interpolating the function $\sin(t)$ by a polynomial of degree four using five equally spaced points on the interval $[0,2\pi]$ .

Suppose you want to interpolate the function $\sin(t)$ by a polynomial of degree

by using

equally spaced points on the interval $[0,2\pi]$ . How many points should you use so that the difference between the sine function and your interpolation is less than $10^{-16}$ ?

For a set of $n$ given data points $t_1, \dots, t_n$ , define the function

$\displaystyle \pi(t)= (t-t_1)(t-t_2)\dots (t-t_n).$

Show that

$\displaystyle \pi^\prime(t_j) = (t_j-t_1)(t_j-t_2)\dots (t_j-t_{j-1})(t_j-t_{j+1}) \dots (t_j-t_n).$
Show that the $j$'th Lagrange basis function can be expressed as

$\displaystyle L_j(t) = \frac{\pi(t)}{(t-t_j)\pi^\prime(t_j)}.$
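Before attempting the proof, the two identities can be checked numerically. The sketch below is only a sanity check under assumed data: the nodes, the evaluation point `t`, and all function names are hypothetical choices for illustration, and indices are 0-based in the code.

```python
# Numerical check of pi'(t_j) = prod_{i != j} (t_j - t_i) and of
# L_j(t) = pi(t) / ((t - t_j) * pi'(t_j)) on hypothetical nodes.

def pi_poly(t, nodes):
    """pi(t) = (t - t_1)(t - t_2)...(t - t_n)."""
    p = 1.0
    for ti in nodes:
        p *= t - ti
    return p

def pi_prime_at_node(j, nodes):
    """Claimed closed form for pi'(t_j): product over all i != j."""
    p = 1.0
    for i, ti in enumerate(nodes):
        if i != j:
            p *= nodes[j] - ti
    return p

def lagrange_direct(j, t, nodes):
    """Standard definition: prod_{i != j} (t - t_i) / (t_j - t_i)."""
    p = 1.0
    for i, ti in enumerate(nodes):
        if i != j:
            p *= (t - ti) / (nodes[j] - ti)
    return p

def lagrange_via_pi(j, t, nodes):
    """Form stated in the problem; valid for t different from t_j."""
    return pi_poly(t, nodes) / ((t - nodes[j]) * pi_prime_at_node(j, nodes))

nodes = [0.0, 0.5, 1.5, 2.0]   # assumed example nodes
t = 0.8                        # assumed evaluation point, not a node
eps = 1e-6                     # step for the central difference

basis_gap = max(abs(lagrange_direct(j, t, nodes) - lagrange_via_pi(j, t, nodes))
                for j in range(len(nodes)))
deriv_gap = max(abs((pi_poly(tj + eps, nodes) - pi_poly(tj - eps, nodes)) / (2 * eps)
                    - pi_prime_at_node(j, nodes))
                for j, tj in enumerate(nodes))
```

Both gaps should be at the level of rounding and finite-difference error, which is consistent with (but of course does not prove) the identities.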

In general, is it possible to interpolate

data points by a piecewise quadratic polynomial, with knots at the given data points, such that the interpolant is

once continuously differentiable?
twice continuously differentiable?

Explain your answer as best as you can.

What is the maximum number of points

that can be interpolated by a piecewise quadratic polynomial that is twice continuously differentiable?

Consider the following data

$\displaystyle (0,1), (1,2), (2,9).$

Interpolate this data by a quadratic polynomial by using monomial interpolation
Interpolate this data by using Lagrange interpolation
Show that Lagrange interpolation reduces to monomial interpolation
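The comparison between the monomial and Lagrange forms for the data $(0,1), (1,2), (2,9)$ can be sketched in exact rational arithmetic. The small Gaussian elimination routine and all names below are illustrative helpers introduced for this example, not part of the problem statement.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic (illustrative helper)."""
    n = len(b)
    M = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

ts = [0, 1, 2]
ys = [1, 2, 9]

# Monomial basis: solve the Vandermonde system for c0 + c1 t + c2 t^2.
coeffs = solve([[1, t, t * t] for t in ts], ys)

def monomial(t):
    return sum(c * Fraction(t) ** k for k, c in enumerate(coeffs))

def lagrange(t):
    total = Fraction(0)
    for j, tj in enumerate(ts):
        term = Fraction(ys[j])
        for i, ti in enumerate(ts):
            if i != j:
                term *= Fraction(t - ti, tj - ti)
        total += term
    return total

# Two quadratics that agree at more than three points are identical.
agree = all(monomial(t) == lagrange(t) for t in range(-3, 6))
```

Here `coeffs` holds the monomial coefficients in increasing order of power, and `agree` confirms that both forms describe the same polynomial.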

The Gamma function $\Gamma(x)$ has the following known values:

$\displaystyle \Gamma(0.5)=\sqrt{\pi}, \Gamma(1)=1, \Gamma(1.5) = \sqrt{\pi}/2.$

Use quadratic interpolation to determine the approximate value of $x$ such that $\Gamma(x)=1.5$ . HINT: Instead of using the data $(0.5,\sqrt{\pi}), (1,1), (1.5, \sqrt{\pi}/2),$ you may find it easier to use the data $(\sqrt{\pi},0.5), (1,1), (\sqrt{\pi}/2, 1.5).$

Suppose that some measurements had produced the following data:

$\displaystyle (0,5), (1,9), (2,15)$

(i) Write down the second-degree polynomial passing through all three points by using Lagrange interpolation.

(ii) Write down the second-degree polynomial passing through all three points by using Newton interpolation.

(iii) Show that the two polynomials obtained in (i) and (ii) are equivalent.

Use an appropriate Lagrange interpolating polynomial to interpolate the following data:

$\displaystyle (0,-3), (1,0), (2,5), (3,12) .$

What is the degree of the interpolating polynomial? There is a catch in this question.
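One way to inspect the degree without expanding the Lagrange form is a table of Newton divided differences, since the highest nonzero coefficient gives the degree. The sketch below swaps in Newton's form purely as a checking device; the function name is a hypothetical helper.

```python
from fractions import Fraction

def divided_differences(ts, ys):
    """Return Newton coefficients f[t0], f[t0,t1], ..., computed in place."""
    coef = [Fraction(y) for y in ys]
    n = len(ts)
    for order in range(1, n):
        # Work from the end so that coef[i - 1] still holds the lower-order value.
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (ts[i] - ts[i - order])
    return coef

ts = [0, 1, 2, 3]
ys = [-3, 0, 5, 12]
newton = divided_differences(ts, ys)

# Degree = index of the last nonzero Newton coefficient.
degree = max(k for k, c in enumerate(newton) if c != 0)
```

Four points allow a cubic in general, so a vanishing highest coefficient in `newton` is exactly the kind of "catch" worth looking for.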

Consider the following data:

$\displaystyle (0,3), (1,1), (2,3).$

Interpolate this data by a quadratic polynomial by using monomial interpolation
Interpolate this data by using Lagrange interpolation
Show that Lagrange interpolation reduces to monomial interpolation
Interpolate this data by piecewise linear interpolation

Determine the parabola (interpolating polynomial of degree two) that interpolates the values of $\sin x$ for $x=0,\pi/2,\pi.$

Find the clamped cubic spline $s(x)$ which goes through the points

$\displaystyle (0,1), (1,3), (2,4)$

with the first derivative equal to $2$ and $3$ at the beginning and the end of the domain, respectively, i.e.

$\displaystyle s'(0)=2 \quad {\rm and} \quad s'(2)=3.$

Find the clamped cubic spline $s(x)$ which goes through the points

$\displaystyle (0,1), (1,3), (2,5)$

with the first derivative equal to $2$ and $-2$ at the beginning and the end of the domain, respectively, i.e.

$\displaystyle s'(0)=2 \quad {\rm and} \quad s'(2)=-2.$

Consider the following function

$\begin{displaymath}g(x) = \left[ \begin{array}{lll} x^3+1, & {\rm for }& 0\le x\... ...x^3+6 x^2 -6 x + 1 & {\rm for} &1\le x\le 2 \end{array}\right. \end{displaymath}$

Is $g(x)$ a cubic spline for $0\le x\le 2$ ? Make sure you justify your answer.

Consider the logarithmic function $\ln(x)$ evaluated at the points 1,2 and 3:

$\displaystyle (1, \ln(1)), (2, \ln(2)), (3,\ln(3)).$

Write down the entries of the matrix and right hand side of the linear system that determines the coefficients for the cubic not-a-knot spline (variation: natural) interpolating these three points.

HINT: not-a-knot spline requires that the third derivative is continuous at the first and last points. Do not solve this system.

Suppose you were to define a piecewise quadratic spline that interpolates the $n+1$ given values

$\displaystyle (x_1, y_1), (x_2, y_2), \dots (x_n, y_n), (x_{n+1}, y_{n+1}).$

Write down, in general form, the quadratic polynomials that interpolate these points such that the resulting piecewise quadratic polynomial has a continuous first derivative. How many additional conditions are required to make a square system for the coefficients of this quadratic spline?

This problem considers the function

$\displaystyle g(x) = \left\{ \begin{array}{lll} x^3-1, & {\rm if} & 0\le x\le 1 \\ -x^3+6x^2-6x+1, & {\rm if} & 1\le x\le 2 \end{array}\right.$

(15)

Is $g(x)$ a cubic spline for $0\le x\le 2$ ?
If it is a spline, is it natural, clamped, or neither?

Make sure to justify your answers.
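The spline conditions for the function in (15) can be checked mechanically: both pieces must agree in value, first derivative and second derivative at the interior knot $x=1$, and a natural spline additionally has zero second derivative at both ends. The sketch below stores each cubic as a list of coefficients in increasing powers; the helper names are assumptions for the example.

```python
# Pieces of g from (15), coefficients listed from x^0 up to x^3.
left = [-1, 0, 0, 1]     # x^3 - 1            on [0, 1]
right = [1, -6, 6, -1]   # -x^3 + 6x^2 - 6x + 1 on [1, 2]

def derivative(p):
    """Coefficients of the derivative of polynomial p."""
    return [k * c for k, c in enumerate(p)][1:]

def evaluate(p, x):
    """Horner evaluation of polynomial p at x."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def nth_derivative(p, order):
    for _ in range(order):
        p = derivative(p)
    return p

knot = 1
# (left value, right value) for derivative orders 0, 1, 2 at the knot.
matches = [(evaluate(nth_derivative(left, k), knot),
            evaluate(nth_derivative(right, k), knot)) for k in range(3)]
is_cubic_spline = all(a == b for a, b in matches)

# Natural end conditions: second derivative vanishes at x = 0 and x = 2.
end_second_derivs = (evaluate(nth_derivative(left, 2), 0),
                     evaluate(nth_derivative(right, 2), 2))
is_natural = end_second_derivs == (0, 0)
```

A written justification should still state these matching conditions explicitly; the code only confirms the arithmetic.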

This problem considers the function

$\displaystyle g(x) = \left\{ \begin{array}{lll} 2+3x^2 + \alpha x^3, & {\rm if} & -1\le x\le 0 \\ 2+\beta x^2 -x^3, & {\rm if} & 0\le x\le 1 \end{array}\right.$

(16)

For what values of $\alpha$ and $\beta$ , if any, is $g(x)$ a cubic spline for $-1\le x\le 1$ ? These values are to be used for the rest of this problem.
What were the data points that give rise to this cubic spline?
For what values of $\alpha$ and $\beta$ is $g(x)$ a natural cubic spline?
For what values of $\alpha$ and $\beta$ is $g(x)$ a clamped cubic spline?

Suppose you are given 4 points:

$\displaystyle (t_1,y_1), (t_2, y_2), (t_3, y_3), (t_4, y_4).$

Is it possible to interpolate these points by a piecewise cubic polynomial with continuous first, second and third derivatives?
Is it possible to interpolate these points by a piecewise cubic polynomial with continuous first, second, third and fourth derivatives?

Please explain your reasoning as accurately as you can.

Suppose that you would like to obtain a quadratic spline that interpolates the function $\cos(x)$ between the nodes 0, and .

Write down the form of the spline interpolation function. How many coefficients need to be determined?
Write down the conditions that the spline must satisfy, and the corresponding equations for the coefficients. Do not solve the resulting equations.
Are there enough conditions? If not, what extra condition(s) would you recommend to make a best possible fit of $\cos(x)$ ?

(b) Given a function on a discrete set of data points, explain the difference between interpolation and approximation. What is the difference between an interpolating polynomial of degree $n$ and an approximating polynomial of the same degree?

Is it possible to interpolate three points

$\displaystyle (x_1,y_1), (x_2,y_2), (x_3, y_3)$

by a piecewise quadratic (second order) polynomial that is continuous and has continuous first and second derivatives?

Is it possible to interpolate four points

$\displaystyle (x_1,y_1), (x_2,y_2), (x_3, y_3),(x_4,y_4),$

by a piecewise quadratic (second order) polynomial that is continuous and has continuous first and second derivatives?

Suppose that you would like to obtain a quadratic spline that interpolates the function $\cos(x)$ between the nodes .

Write down the form of the spline interpolation function. How many coefficients need to be determined?
Write down the conditions that the spline must satisfy, and the corresponding equations for the coefficients. Do not solve the resulting equations.
Are there enough conditions? If not, what extra condition(s) would you recommend?

Suppose that you are given 4 experimental points: .

Is it possible to interpolate these 4 data points by a piecewise quadratic polynomial with knots at the given data points, such that the interpolant is

Once continuously differentiable?
Twice continuously differentiable?
Three times continuously differentiable?

In each case, if the answer is “yes” explain why, and outline the procedure to find the interpolating function (you may use short form of the equations, do not solve the resulting equations); if the answer is “no”, explain why.

In class we have studied cubic splines, i.e. interpolation by a piecewise cubic polynomial with continuous first and second derivatives. It is also possible to introduce a quadratic spline, i.e. a piecewise quadratic polynomial with a continuous first derivative. Such a quadratic spline is the focus of this problem.

Consider the same data:

$\displaystyle (0,3), (1,1), (2,3).$

Interpolate this data by a piecewise quadratic polynomial with continuous first and second derivatives at the middle point.

Extra Credit, 30 percent Suppose that one additional point is added to the above data, so that there are four points.

$\displaystyle (0,3), (1,1), (2,3), (3,9).$

Is it possible to interpolate this data by piecewise quadratic interpolation with continuous first and second derivatives at the interior points? Is it possible to do it in general?

Chapter SIX

In class we have studied quadrature rules of the form

$\displaystyle \int\limits_a^b f(x) d x \simeq \sum\limits_{i=0}^{n}w_i f(x_i).$

It appears that sometimes it is advantageous to derive quadratures which also use the derivatives of the function, in addition to the values of the function at selected points. Find the weights for the quadrature

$\displaystyle \int\limits_{-1}^1f(x) d x \simeq \omega_1 f(-1) + \omega_2 f''(0) + \omega_3 f(1).$

Choose the weights $\omega_1,\omega_2,\omega_3$ to maximize the precision of the quadrature. What is the order of the resulting quadrature?

Derive the error term of this quadrature.
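For the three-weight rule $\omega_1 f(-1) + \omega_2 f''(0) + \omega_3 f(1)$, the method of undetermined coefficients can be sketched as follows: require exactness for $1, x, x^2$, solve the resulting $3\times 3$ system, then test higher monomials to find the degree of precision. The solver and function names are illustrative helpers, and exact rational arithmetic is an implementation choice made here.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic (illustrative helper)."""
    n = len(b)
    M = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def functionals(k):
    """Values f(-1), f''(0), f(1) for the monomial f(x) = x^k."""
    second_derivative_at_zero = 2 if k == 2 else 0
    return [(-1) ** k, second_derivative_at_zero, 1]

def exact_integral(k):
    """Integral of x^k over [-1, 1]."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

# Exactness for 1, x, x^2 gives three equations for the three weights.
weights = solve([functionals(k) for k in range(3)],
                [exact_integral(k) for k in range(3)])

def rule(k):
    return sum(w * v for w, v in zip(weights, functionals(k)))

# Degree of precision: largest k such that all monomials up to x^k are exact.
degree = 0
while rule(degree + 1) == exact_integral(degree + 1):
    degree += 1
```

The first monomial for which `rule` and `exact_integral` disagree is also the natural starting point for deriving the error term.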

In class we have studied quadrature rules of the form

$\displaystyle \int\limits_a^b f(x) d x \simeq \sum\limits_{i=0}^{n}w_i f(x_i).$

$\displaystyle \int\limits_{-1}^1f(x) d x \simeq \omega_1 f(-1) + \omega_2 f'(-1) + \omega_3 f'(1)+\omega_4 f(1).$

Choose the weights $\omega_1,\omega_2,\omega_3, \omega_4$ to maximize the precision of the quadrature. What is the order of the resulting quadrature?

HINT Solution of this problem will be much easier if you guess the relationship between $\omega_1$ , $\omega_2$ , $\omega_3$ and $\omega_4$ .

Derive the error term of this quadrature.

Consider the quadrature rule of the form

$\displaystyle \int\limits_{0}^1 f(x) d x \simeq a f(\frac{1}{2}) + b f(1).$

Choose $a$ and $b$ to maximize the accuracy of the resulting quadrature. Calculate the truncation error of the resulting quadrature.

Is it more accurate or less accurate than Midpoint quadrature?

Given the points:

$\displaystyle (0,3), (1,1), (2,0), (4,-2), (6,1).$

a) Evaluate the integral of the function using the midpoint rule

b) Evaluate the integral using Simpson's rule

c) Evaluate the integral using the trapezoid rule

(a) Consider the integration rule of the form

$\displaystyle \int\limits_{0}^{1} f(x) d x \simeq a_1 f(\frac{1}{3})+ a_2 f(\frac{2}{3}).$

Choose $a_1$ and $a_2$ to maximise the precision of this quadrature rule. (16 points)

(b) Calculate truncation error of this quadrature.

This problem concerns using numerical methods to calculate the integral

$\displaystyle I=\int\limits_0^1 x^4 d x.$

Note that the exact value is $I =\frac{1}{5}.$

We are going to compare different ways to calculate this integral by using the value of the function at only three points, $x=0$, $x=\frac{1}{2}$ and $x=1$.

Using the composite trapezoidal rule, and 2 subintervals, find an approximate value for the integral. What is the error?
Using the Simpson rule on the interval $[0,1]$, find the approximate value of this integral. What is the error?
Out of the two methods used above, which one gives the more accurate answer, and why? Make sure you justify your answer.
Extra Credit (10 percent): Do the errors you obtained with Simpson and composite trapezoid above agree with the predictions given by the theorems we studied in class?

Let us denote

$\displaystyle I = \int\limits_{-1}^{1}\cos(x) d x.$

As you know, we can calculate the value of this integral analytically:

$\displaystyle \int\limits_{-1}^{1}\cos(x) d x = 2 \sin (1)\simeq 1.6829419696157930133.$

Calculate the value of this integral numerically by using

Midpoint method
Trapezoid method
Simpson method
Two point Gauss quadrature

Compare the results of your calculations with the exact value. Which of these four methods gives the most accurate result? Is this consistent with your expectations?
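The four rules listed above can be compared directly for $\int_{-1}^{1}\cos(x)\,dx$. The sketch below applies each rule once on the whole interval (no subdivision, which is an assumption about what the problem intends) and records the absolute errors.

```python
import math

f = math.cos
a, b = -1.0, 1.0
exact = 2.0 * math.sin(1.0)

# Single-panel versions of each rule on [a, b].
midpoint = (b - a) * f((a + b) / 2)
trapezoid = (b - a) / 2 * (f(a) + f(b))
simpson = (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

# Two-point Gauss quadrature on [-1, 1]: nodes +-1/sqrt(3), unit weights.
g = 1.0 / math.sqrt(3.0)
gauss2 = f(-g) + f(g)

errors = {
    "midpoint": abs(midpoint - exact),
    "trapezoid": abs(trapezoid - exact),
    "simpson": abs(simpson - exact),
    "gauss2": abs(gauss2 - exact),
}
ranking = sorted(errors, key=errors.get)  # most accurate first
```

Note that Simpson's rule uses three function evaluations while two-point Gauss uses only two, which is worth keeping in mind when interpreting `ranking`.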

(a) Consider the integration rule of the form

$\displaystyle \int\limits_{0}^{1} f(x) d x \simeq a_1 f(\frac{1}{2})+ a_2 f(\frac{3}{4}).$

Choose $a_1$ and $a_2$ to maximise the precision of this quadrature rule. (16 points)

(b) Calculate truncation error of this quadrature.

(a) Find the 3-point Gaussian rule for $\int_{0}^{1}F(x)dx$, if the standard 3-point Gaussian rule is given by

$\displaystyle \int_{-1}^{1}g(x)dx\cong \frac{1}{9}\left( 5g(-\sqrt{3/5})+8g(0)+5g(\sqrt{3/5})\right)$

(b) The trapezoid rule has the error estimate

$\displaystyle \int_{a}^{a+h}g(x)dx=\frac{h}{2}\left[ g(a)+g(a+h)\right] -\frac{1}{12}h^{3}g^{\prime \prime }(s),$

where $a\le s\le a+h$. If the interval $[a,b]$ is divided into $n$ equal panels, show that the error of the composite trapezoid rule is bounded by

$\displaystyle \frac{(b-a)^{3}}{12n^{2}}\max_{a\leq x\leq b}\left\vert g^{\prime \prime }(x)\right\vert$

(c) Use the result in part (b) to determine the number of panels sufficient to approximate $\int_{0}^{1}\sin x\,dx$ to within $\frac{1}{3}\cdot 10^{-4}$ by using the composite trapezoid rule.
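The bound from part (b) turns into a simple inequality for the number of panels. The sketch below takes $\max|g''| = \sin(1)$ on $[0,1]$; using the cruder bound $1$ instead would be equally valid and gives a slightly larger count. The composite trapezoid helper is an illustrative addition used only to confirm that the actual error respects the tolerance.

```python
import math

a, b = 0.0, 1.0
tol = 1e-4 / 3.0
M = math.sin(1.0)   # max of |g''(x)| = |sin x| on [0, 1]

# Smallest n with (b - a)^3 * M / (12 n^2) <= tol.
n = math.ceil(math.sqrt((b - a) ** 3 * M / (12.0 * tol)))
bound = (b - a) ** 3 * M / (12.0 * n * n)

def composite_trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total

approx = composite_trapezoid(math.sin, a, b, n)
actual_error = abs(approx - (1.0 - math.cos(1.0)))
```

Because the bound is a worst-case estimate, `actual_error` is expected to be smaller than `bound`, not equal to it.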

(a) (10 points) In the three point quadrature rule

$\displaystyle \int_{-1}^{1} f (x) d x = a_1 f(x_1) + a_2 f(x_2) + a_3 f(x_3),$

choose the weights $a_1, a_2, a_3$ and the nodes $x_1, x_2, x_3$ to maximize the precision of the quadrature rule. (The result will be three-point Gauss quadrature.)

(b) (5 points) What is the degree of the resulting scheme? Demonstrate this by showing that the scheme correctly integrates a polynomial of that degree, and that it does not correctly integrate a polynomial of degree one higher.

Find $a_{1},a_{2},a_{3},x_1,x_2,x_3$ in

$\displaystyle \int\limits_{-1}^1 f(x) d x = a_1 f(x_1)+ a_2 f(x_2) + a_3 f(x_3)$

to maximize the precision of the quadrature. Find the error term of this quadrature.

In class we have studied the two point Gauss quadrature

$\displaystyle \int\limits_{-1}^{1}f(x) d x\simeq f(-\frac{1}{\sqrt{3}})+f(\frac{1}{\sqrt{3}}).$

Calculate the error term for this quadrature.

Hint Derivation is similar to calculation of the trapezoid error term we did in class. You will need to express $f(\pm\frac{1}{\sqrt{3}})$ via and its derivatives.

In class we have studied two point Gauss quadrature. In this problem you are to derive one point Gauss quadrature.

Consider the quadrature rule of the form

$\displaystyle \int\limits_{-1}^1 f(x) d x \simeq a f(z).$

Choose $a$ and $z$ to maximize the order (accuracy) of the resulting quadrature. What is the truncation order of this quadrature?

Suppose that you have a tabular data, that is to say that the function

is given only on the

equidistant points

, $i=1\dots n$ such that

. Propose a way to numerically evaluate

$\displaystyle g(x) = \int\limits_0^x f(x) d x.$

Chapter SEVEN

Find an

approximation of

that utilizes

, $y(t_{j+3})$ , and $y(t_{j-1})$ .

You are producing a final project for your Master's Degree and need to solve an initial value problem numerically. You prefer to have an accurate numerical solution. Out of the Forward Euler, Backward Euler, Trapezoidal, Heun (RK2), and Runge-Kutta (RK4) methods, which method would you choose and why?

Consider RK2 (Heun's) method:

$\displaystyle k_{1}=f(x_{k},y_{k});\;k_{2}=f(x_{k}+h,y_{k}+hk_{1});\;y_{k+1}=y_{k}+\frac{h}{2}(k_{1}+k_{2}).$

Show that this method is second order accurate, i.e. it finds the exact solution to the initial value problem

$\displaystyle y'(x)=2 x, \quad y(0)=0.$

This can be done, for example, by showing that if at step $k$ the numerical solution is

$\displaystyle y_k = x_k^2,$

then at step $k+1$ the numerical solution is equal to

$\displaystyle y_{k+1} = (x_k+h)^2,$

which agrees with the exact analytical solution

$\displaystyle y(x)=x^2.$
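The claim can be checked experimentally before proving it by induction. The sketch below runs Heun's method on $y'=2x$, $y(0)=0$ in exact rational arithmetic; the step size $h=1/10$ and the number of steps are arbitrary choices for the illustration.

```python
from fractions import Fraction

def f(x, y):
    """Right-hand side of y' = 2x (independent of y)."""
    return 2 * x

def heun_step(x, y, h):
    k1 = f(x, y)
    k2 = f(x + h, y + h * k1)
    return y + h / 2 * (k1 + k2)

h = Fraction(1, 10)              # assumed step size
x, y = Fraction(0), Fraction(0)
exact_everywhere = True
for _ in range(10):
    y = heun_step(x, y, h)
    x += h
    # With exact arithmetic the comparison involves no rounding error.
    exact_everywhere = exact_everywhere and (y == x ** 2)
```

Algebraically, each step adds $\frac{h}{2}\,(2x_k + 2(x_k+h)) = 2hx_k + h^2$, which is exactly $(x_k+h)^2 - x_k^2$.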

Derive an

finite difference approximation to

that uses $y(t_{j+1})$ , $y(t_{j-1})$ and $y(t_{j+2})$ . Calculate the truncation error term of the resulting finite difference approximation.

Find approximation of the first derivative that uses , and . What is the error term of your approximation?

For the equation $y^{\prime }=f(y)$ consider the following numerical method

$\displaystyle k_{1}=f(y_{k});\;k_{2}=f(y_{k}+hk_{1});\;y_{k+1}=y_{k}+\frac{h}{2}% (k_{1}+k_{2})$

(a) Is this method explicit or implicit? Is it one-step or multi-step?
(b) Perform one step of the method for the equation $y^{\prime }=\lambda y.$
(c) Find the order of the method.

Consider the following -step method for solving $y^{\prime }=f(y)$

$\displaystyle y_{k+1}=y_{k}+af(y_{k-1})+bf(y_{k+1})$

(17)

What is the “number of steps” for this method? Is this method explicit or implicit?
Determine $a$ and $b$ for which the method has the highest possible accuracy.
Determine the order of the method of the highest possible accuracy in (17).
SOLUTION Method of undetermined coefficients, is satisfied automatically, gives , gives . Solving equations we get second order accurate

Determine whether the method

$\displaystyle y_{k+1}=y_{k}+hf(t_{k},y_{k})+\frac{h^{2}}{2}\left[ \frac{\partia... ...{\partial t}+f(t_{k},y_{k})\frac{\partial f(t_{k},y_{k}% )}{\partial y}\right]$

with

is stable for the equation $y^{\prime}=\lambda y$ with $\lambda=-30$ .

Suppose you want to solve numerically $y'(t)=e^{-y(t)} + t^5,$ for using 100 time steps (so, ). The methods to be tried are (i) the Euler method, (ii) the backward Euler method, (iii) the trapezoidal method, (iv) the RK2 (Heun) method, and (v) the RK4 method.

(a) Which method do you expect to finish the calculation the fastest? Why?

(b) Which would be the second fastest method? Why?

(d) If stability is a concern, which method would be the best? Why?

This is a general question. It is sufficient to give an answer without proof.

Consider the following initial value problem

$\displaystyle y'(t) = f(t,y(t)), y(t=0)=y_0.$

The following algorithm is proposed for its numerical solution:

$\displaystyle y_{n+1} = y_n + \frac{h}{2}\left[ f(t_n,y_n)+f(t_{n+1}, y_{n+1})\right].$

Define the term stability for a numerical algorithm in the context of initial value problems for ODE's.
Determine if the above algorithm is stable and, if it is, what restrictions, if any, are implied on the step size for stability.
List one advantage and one disadvantage of the above algorithm over the Euler method.
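A common way to study stability, assumed here because the problem does not fix one, is to apply the method to the test equation $y'=\lambda y$ and look at the amplification factor $y_{n+1}/y_n$ as a function of $z=h\lambda$. The sketch compares the trapezoidal algorithm above with the Euler method; the value $\lambda=-30$ and the step sizes are hypothetical.

```python
def euler_factor(z):
    """Forward Euler on y' = lambda*y: y_{n+1} = (1 + z) y_n, z = h*lambda."""
    return 1 + z

def trapezoid_factor(z):
    """Trapezoidal rule: y_{n+1} = (1 + z/2) / (1 - z/2) * y_n."""
    return (1 + z / 2) / (1 - z / 2)

lam = -30.0                    # assumed decaying test problem
steps = [0.01, 0.05, 0.1, 1.0]

table = [(h, abs(euler_factor(h * lam)), abs(trapezoid_factor(h * lam)))
         for h in steps]

# A factor of magnitude below 1 means the numerical solution decays,
# matching the behavior of the exact solution for negative lambda.
euler_stable = [h for h, e, _ in table if e < 1]
trapezoid_stable = [h for h, _, t in table if t < 1]
```

In this sketch the Euler factor exceeds 1 in magnitude once $h > 2/|\lambda|$, while the trapezoidal factor stays below 1 for every listed step; the price is that each trapezoidal step requires solving an equation for $y_{n+1}$.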

Consider the following initial value problem:

$\displaystyle y'(t) = t e^{-3 t} - y, y(0)=1, 0\le t\le 1, {\rm with} h=0.5$

Using the Euler method with $\Delta t = 0.5$ calculate and 10 pt.

Is the proposed method numerically stable for the proposed time step? 2 pt

Extra credit - 2 pt compare the result with the exact analytical solution.
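The two Euler steps for $y' = t e^{-3t} - y$, $y(0)=1$, $h=0.5$ can be sketched as below. The closed-form solution used for comparison, $y(t) = \frac{5}{4}e^{-t} - \left(\frac{t}{2}+\frac{1}{4}\right)e^{-3t}$, was obtained here with an integrating factor and should be verified by substitution rather than taken as given.

```python
import math

def f(t, y):
    """Right-hand side of y' = t e^{-3t} - y."""
    return t * math.exp(-3.0 * t) - y

def exact(t):
    """Candidate analytical solution with y(0) = 1 (derived via integrating factor)."""
    return 1.25 * math.exp(-t) - (0.5 * t + 0.25) * math.exp(-3.0 * t)

h = 0.5
t, y = 0.0, 1.0
history = [(t, y)]
for _ in range(2):                 # two steps cover 0 <= t <= 1
    y = y + h * f(t, y)            # forward Euler update
    t = t + h
    history.append((t, y))

errors = [abs(yn - exact(tn)) for tn, yn in history]

# For the homogeneous part y' = -y the Euler factor is 1 + h*(-1).
amplification = abs(1.0 - h)
```

The list `history` holds the pairs $(t_n, y_n)$, and `amplification` is the quantity to compare against 1 when discussing stability for this step size.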

Compute the solution of

$\displaystyle \frac{d y} { d t} = e^{-2 y} - 4 t^3$

for $0\le t\le 1$ using 100 time steps (i.e. with $h=0.01$). The methods to be tried are

The 2nd order Taylor method
RK4
Trapezoid method

Please answer the following questions:

Which one would you expect to complete the calculation the fastest? Why?
Which one would you expect to complete the calculation last? Why?
Which one would you expect to be the most accurate? Why?
If stability is a concern, which method should be used? Why?

(a) Which method do you expect to finish the calculation the fastest? Why?

(b) Which would be the second fastest method? Why?

(d) If stability is a concern, which method would be the best? Why?

Is Heun's method

$\displaystyle k_{1}=f(t_{k},y_{k});\;k_{2}=f(t_{k}+h,y_{k}+hk_{1});\;y_{k+1}=y_{k}+\frac{h}{2}(k_{1}+k_{2}),$

stable for the equation $y^{\prime }=-20y$ with

Consider RK2 (Heun's) method:

$\displaystyle k_{1}=f(t_{k},y_{k});\;k_{2}=f(t_{k}+h,y_{k}+hk_{1});\;y_{k+1}=y_{k}+\frac{h}{2}(k_{1}+k_{2}).$

Show that this method is second order accurate, i.e. it finds the exact solution to the initial value problem

$\displaystyle y'(x)=2 x, \quad y(0)=0.$

This can be done, for example, by showing that if at step $k$ the numerical solution is

$\displaystyle y_k = x_k^2,$

then at step $k+1$ the numerical solution is equal to

$\displaystyle y_{k+1} = (x_k+h)^2,$

which agrees with the exact analytical solution

$\displaystyle y(x)=x^2.$

Consider the following initial value problem:

$\displaystyle y'(x) = 2 x y , y(0)=1.$

Using the second order Taylor scheme, calculate $y(0.1)$ with $\Delta x= 0.1$ (i.e. calculate one time step).

(4 points) Compare the result with the exact analytical solution.
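One step of the second order Taylor scheme for $y'=2xy$, $y(0)=1$ can be sketched as below. The second derivative follows from differentiating the equation, $y'' = 2y + 2xy'$, and the exact solution $y=e^{x^2}$ is used for comparison; the function name is a hypothetical helper.

```python
import math

def taylor2_step(x, y, h):
    """One step of y_{k+1} = y_k + h y' + (h^2 / 2) y'' for y' = 2 x y."""
    dy = 2.0 * x * y               # y'  from the differential equation
    d2y = 2.0 * y + 2.0 * x * dy   # y'' = 2y + 2x y' by the product rule
    return y + h * dy + 0.5 * h * h * d2y

h = 0.1
y1 = taylor2_step(0.0, 1.0, h)
exact = math.exp(h ** 2)           # exact solution y(x) = exp(x^2) at x = 0.1
error = abs(y1 - exact)
```

The size of `error` can be compared with the next term of the Taylor series to see whether the observed discrepancy is what the method's order predicts.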

State whether the following methods are (i) explicit or implicit (ii) single step or multi-step (iii) selfstarting or not. Cross out wrong statements and underline correct ones:

$\displaystyle y_{k+1} = y_k + \frac{h}{24}(55 y'_k - 59 y'_{k-1}- 9 y'_{k-3}),$

(i)explicit/implicit (ii)single step/multi-step (iii)selfstarting/not selfstarting.

$\displaystyle y_{k+1}=\frac{1}{11}(18 y_k - 9 y_{k-1} + 2 y_{k-2}) + \frac{6 h}{11}y'_{k+1}$

(i)explicit/implicit (ii)single step/multi-step (iii)selfstarting/not selfstarting.

$\displaystyle y_{k+1} = y_k + \frac{h}{2} (y'_{k}+ y'_{k+1})$

(i)explicit/implicit (ii)single step/multi-step (iii)selfstarting/not selfstarting.

The Bernoulli equation is

$\displaystyle y'(t)+y^3(t)=\frac{y(t)}{1+t}.$

If the Forward Euler method is used to solve this equation, what is the resulting finite difference equation (i.e. the equation expressing $y_{k+1}$ through $y_k$)?
If the trapezoid method is used to solve this equation, what is the resulting finite difference equation?
If the RK2 method is used, what is the resulting finite difference equation?
If the RK4 method is used, what is the resulting finite difference equation?

Consider the fourth order RK method:

$\displaystyle y_{n+1} = y_n + \frac{1}{6}(K_1+2K_2 + 2K_3 + K_4),$	(18)
$\displaystyle x_{n+1} = x_n + h,$	(19)
$\displaystyle K_1 = h f(x_n, y_n),$	(20)
$\displaystyle K_2 = h f(x_n + \frac{1}{2}h, y_n+ \frac{1}{2}K_1),$	(21)
$\displaystyle K_3 = h f(x_n+\frac{1}{2}h, y_n+\frac{1}{2}K_2),$	(22)
$\displaystyle K_4 = h f(x_n+h, y_n+K_3).$	(23)

Apply this RK4 method for solving the equation

$\displaystyle y'(x) = 4 x^3,$

with initial condition

$\displaystyle y(x=0)=0,$

and show that the RK4 method is indeed at least fourth order accurate. This can be done, for example, by showing that if at step $k$ the numerical solution is

$\displaystyle y_k = x_k^4,$

then at step $k+1$ the numerical solution is equal to

$\displaystyle y_{k+1} = (x_k+h)^4,$

which agrees with the exact analytical solution

$\displaystyle y(x)=x^4.$

Consider the fourth order RK method:

$\displaystyle y_{n+1} = y_n + \frac{1}{6}(K_1+2K_2 + 2K_3 + K_4),$	(24)
$\displaystyle t_{n+1} = t_n + h,$	(25)
$\displaystyle K_1 = h f(t_n, y_n),$	(26)
$\displaystyle K_2 = h f(t_n + \frac{1}{2}h, y_n+ \frac{1}{2}K_1),$	(27)
$\displaystyle K_3 = h f(t_n+\frac{1}{2}h, y_n+\frac{1}{2}K_2),$	(28)
$\displaystyle K_4 = h f(t_n+h, y_n+K_3).$	(29)

Apply this RK4 method for solving the equation

$\displaystyle y'(x) = y (x),$

and show that this method is indeed fourth order accurate. This can be done, for example, by computing the amplification factor and comparing it to the analytic value.
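A numerical experiment can accompany the derivation. The sketch below performs one RK4 step of (24)-(29) for $y'=y$ starting from $y=1$, so the result is the amplification factor itself; it is compared with the degree-four Taylor polynomial of $e^h$ and with $e^h$. The step sizes are arbitrary choices for the illustration.

```python
import math

def rk4_step(f, t, y, h):
    """One step of the classical RK4 scheme in the K_i = h*f(...) form."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 2, y + k1 / 2)
    k3 = h * f(t + h / 2, y + k2 / 2)
    k4 = h * f(t + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6

def amplification(h):
    """y_1 / y_0 for y' = y with y_0 = 1."""
    return rk4_step(lambda t, y: y, 0.0, 1.0, h)

def taylor4(h):
    """Taylor polynomial of exp(h) up to and including h^4."""
    return 1 + h + h ** 2 / 2 + h ** 3 / 6 + h ** 4 / 24

h = 0.1
gap_to_taylor = abs(amplification(h) - taylor4(h))

# One-step error against the analytic factor exp(h); halving h should
# reduce it by roughly 2^5 if the local error is O(h^5).
err_h = abs(math.exp(h) - amplification(h))
err_half = abs(math.exp(h / 2) - amplification(h / 2))
ratio = err_h / err_half
```

A local error that scales like $h^5$ corresponds to a global error of order $h^4$, which is the sense in which the method is called fourth order.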

Chapter EIGHT

Set up the linear least squares problem for fitting the model
$f(t,\mathbf{x})=x_{1}t+x_{2}e^{-t}$ to the four data points , , , .

Set up and solve the linear least squares system

$\displaystyle A x \simeq b$

for fitting the model function

$\displaystyle f(t,x) = x_1 t + x_2 e^t, x=(x_1,x_2)^T$

to the three data points

$\displaystyle (1,2), (2,3), (3,5)$
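The setup for the model $f(t,x) = x_1 t + x_2 e^t$ and the data $(1,2), (2,3), (3,5)$ can be sketched through the normal equations $A^T A x = A^T b$. Solving via the normal equations with Cramer's rule is a choice made for this small example (a QR factorization would be the more robust route in general), and the variable names are illustrative.

```python
import math

ts = [1.0, 2.0, 3.0]
b = [2.0, 3.0, 5.0]

# Columns of A: the basis functions t and e^t evaluated at the data points.
col1 = ts
col2 = [math.exp(t) for t in ts]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Normal equations (A^T A) x = A^T b, a 2x2 system.
s11, s12, s22 = dot(col1, col1), dot(col1, col2), dot(col2, col2)
r1, r2 = dot(col1, b), dot(col2, b)

det = s11 * s22 - s12 * s12
x1 = (r1 * s22 - s12 * r2) / det
x2 = (s11 * r2 - s12 * r1) / det

# Residual r = b - A x; at the least squares solution it is orthogonal
# to both columns of A.
residual = [bi - (x1 * c1 + x2 * c2) for bi, c1, c2 in zip(b, col1, col2)]
orthogonality = (dot(col1, residual), dot(col2, residual))
```

Checking that `orthogonality` is zero up to rounding is a quick test that the computed $x$ really satisfies the normal equations.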

(a) Suppose you would like to fit the data points

$\displaystyle (-1,0), (0,1) {\rm and} (1,2)$

by the fitting function

$\displaystyle y(x) = a \cos(x) + b x^2.$

Find $a$ and $b$ by setting up a linear (overdetermined) least squares problem and solving it.

(b) Also calculate the residual.

Suppose you measure $x$ as a function of $t$ and you get the following:

$t$	$x$
0	2
1	1
2	-2
3	-1

Suppose you would like to fit this data by

$\displaystyle x(t) = a \sin \left(\frac{\pi t}{2}\right) +b \cos \left(\frac{\pi t}{2}\right)$

Find $a$ and $b$ by setting up a linear (overdetermined) least squares problem and solving it. Also calculate the residual.

Consider the data

$\displaystyle (1,2), (2, 3), (3, 4)$

Approximate this data by a constant, i.e. find $x$ such that

$\displaystyle y(t) = x,$

is a good fit for this data.

Use the least squares method. As we have shown in class, the least squares method minimizes the square of the second norm of the residual, i.e. $\vert\vert r\vert\vert^2_2$ , where
Now minimize the fourth power of the fourth norm of a residual.
Compare the results and explain the difference.

Suppose that an experiment produced the following data:

$\displaystyle (1,3), (2,2).$

You are to fit this data by using the linear fit

$\displaystyle y(t) = x \cdot t.$

a Calculate the value of $x$ which minimizes the square of the second norm $\vert\vert r\vert\vert^2_2$ of the residual.

b Calculate and . Sketch ${\rm Span}(A)$ , , and . Verify that $r\perp A x$ . Is it so on your graph?

c Extra credit (5 points): Obtain the equation for $x$ that minimizes $\vert\vert r\vert\vert^4_4$ instead of the traditional $\vert\vert r\vert\vert^2_2$ . Do not solve the resulting equation. What are the advantages and disadvantages of using $\vert\vert r\vert\vert^4_4$ instead of $\vert\vert r\vert\vert^2_2$ ?