Chapter 5 Estimating Coefficients From Real Data

regression approach

This is basically using least square to estimate the coefficients of spline functions.

$y_ j = x(t_j) + \epsilon_j = \phi'(t_j) \mathbf{c} + \epsilon_j,$ Then $\mathbf{y} = \mathbf{\Phi} \mathbf{c} + \mathbf{\epsilon}$ and $\hat{\mathbf{c}} = (\Phi'\Phi)^{-1} \Phi' \mathbf{y}.$

We can use simple R functions lsfit to estimate the $\Phi$ as follows:

#  define the range
Rng  = c(1,18)
age     = growth$age
agefine = seq(1,18,len=501)
#  set up order 6 spline basis with 12 basis functions for
#  fitting the growth data so as to estimate acceleration
nbasis = 12;
norder =  6;
heightbasis12 = create.bspline.basis(ageRng, nbasis, norder)
#  fit the data by least squares
basismat   = eval.basis(age, heightbasis12)
heightmat  = growth$hgtf
heightcoef = lsfit(basismat, heightmat, intercept=FALSE)$coef

# heightcoef is a 12 * 54 matrix. Each column represents the coefficients for 12 basis functions for each boy. 

Or there is a function in the fda package: smooth.basis. This function takes three arguments, $t_j, y_j$ and $\Phi$. That is, the support of the responses, the responses and the bspline basis system. It returns a fd object and several other things.

heightList = smooth.basis(age, heightmat[,1:5], heightbasis12)
(heightfd   = heightList$fd)
(height.df  = heightList$df)
[1] 12
(height.gcv = heightList$gcv)
   girl01    girl02    girl03    girl04    girl05
0.2513923 0.7977237 0.3340652 0.3085981 0.3479272

roughness penalty

A measure of a function’s roughness is the integrated square second derivative or total curvature

$PEN_2 (x) = \int [D^2x(t)]^2dt$

Useful R Functions

Some useful R functions:

`dput`: save R

temp  = data.frame(a = rnorm(10),b = LETTERS[1:10])
temp
            a b
1   1.9106530 A
2   0.5568481 B
3  -2.0955271 C
4  -0.4132244 D
5   0.4280722 E
6  -0.5981550 F
7   0.4110932 G
8  -0.8469106 H
9  -0.3415771 I
10  0.2930876 J
dput(temp)
structure(list(a = c(1.91065303013136, 0.556848072515929, -2.09552705843547,
-0.413224404461701, 0.428072223420949, -0.598155043699695, 0.411093231611976,
-0.846910593628984, -0.341577057278408, 0.293087562336535), b = c("A",
"B", "C", "D", "E", "F", "G", "H", "I", "J")), .Names = c("a",
"b"), row.names = c(NA, -10L), class = "data.frame")

Chapter 4 Bulid Functional Data Objects

book, fda, r, research

object of the functional data class `fd` in R

An object of fd combines the coefficients with the basis system. In the previous post, we learned that create.bspline.basis entable one create a system of bsplines. The object class is fdbasis.

add coefficients to the fdbasis to obtain a functional data object fd.

Use the function fd() in R.

tempfd = fd(coefmat,bspline_temp)

Note that adding coefficients in is different from evaluate a bspline system at a given time point $t$. Suppose $f(x)$ is approximated by a bspline system with basis functions denoted by $\phi_i(x)$. That is, $f(x) \approx \sum_{i=1}^m \beta_i \phi_i(x).$

Thus, evaluating $f(t)$ means that given $t = t_0$ compute $f(t_0)$, which is essentially a vector. Each element in the vector stands for the basis function value when $t = t_0$, i.e., $\phi_i(t_0)$. On the other hand, supply coefficients into the bspline system amounts to given the value of $\beta_i$. Once the coefficients are given, one can evaluate the smoothing system as a single function given $t=t_0$ using R function eval.fd().

unitRng = c(0,1)
bspl2 = create.bspline.basis(unitRng, norder=2)
tstFn1 = fd(c(-1, 2), bspl2)
t0 = seq(0,1,by=0.2)
eval.fd(t0,tstFn1)
     reps 1
[1,]   -1.0
[2,]   -0.4
[3,]    0.2
[4,]    0.8
[5,]    1.4
[6,]    2.0
> eval.fd(t0,tstFn1,1) # evaluate the function data object tstFn1 first derivative. 
     reps 1
[1,]      3
[2,]      3
[3,]      3
[4,]      3
[5,]      3
[6,]      0

Linear Differential Operator or Lfd Class

The notation $Lx$ refers to the application of a linear differential operator L to a function $x$. In general, $Lx(t) = \beta_0(t) x(t) + \beta_1(t) Dx(t) + \ldots + \beta_{m-1}(t) D^{m-1} x(t)+ D^{m} x(t)$

Chapter 3 fdaWithRMat

r, research, spline, statistics

create bspline basis functions

There are normally four relevant arguments that defines a bspline basis system.

rangeval: gives the range of the bspline functions
nbasis: the number of basis functions in total
norder: the order of the bspline, which equals one plus the degree. the default is set to be 4, i.e., cubic splines
breaks: the breaking points. It must satisfy that the first and last breaks are the boundary of the rangeval. For example, the code below specifies a bspline system with 13 basis functions. Each basis function is a cubic polynomial function.
In fact, one only needs to gives breaks and norder, then the function will know the nbasis, since nbasis = order + number of internal knots.

bspline_temp = create.bspline.basis(rangeval = c(0,10), nbasis = 13,norder=4,breaks = 0:10)

one property of bspline basis system is that the sum of B-spline basis function at any given time t sum up to 1.

evaluate bspline system at given time points

Once the bspline system is created, one might be interested in checking each basis function values at a given time t. This can be achieved by eval.basis function in R. It also computes derivatives when a third argument Lfdobj is given. In R, predict function can also do the same thing.

eval.basis(evalarg=1:3,bspline_temp)
eval.basis(evalarg=1:3,bspline_temp)[,1:6]
     bspl4.1 bspl4.2   bspl4.3   bspl4.4   bspl4.5   bspl4.6
[1,]       0    0.25 0.5833333 0.1666667 0.0000000 0.0000000
[2,]       0    0.00 0.1666667 0.6666667 0.1666667 0.0000000
[3,]       0    0.00 0.0000000 0.1666667 0.6666667 0.1666667
rowSums(eval.basis(evalarg=1:3,bspline_temp))
[1] 1 1 1
eval.basis(evalarg=1:3,bspline_temp,Lfdobj=1)
     bspl4.1 bspl4.2 bspl4.3 bspl4.4 bspl4.5 bspl4.6 bspl4.7 bspl4.8
[1,]       0   -0.75    0.25     0.5     0.0     0.0       0       0
[2,]       0    0.00   -0.50     0.0     0.5     0.0       0       0
[3,]       0    0.00    0.00    -0.5     0.0     0.5       0       0
# using predict function
> predict(bspline_temp,1:3,1)
     bspl4.1 bspl4.2 bspl4.3 bspl4.4 bspl4.5 bspl4.6 bspl4.7 bspl4.8
[1,]       0   -0.75    0.25     0.5     0.0     0.0       0       0
[2,]       0    0.00   -0.50     0.0     0.5     0.0       0       0
[3,]       0    0.00    0.00    -0.5     0.0     0.5       0       0

Markdown Syntax

markdown

Markdown Cheet Sheet

Markdown is a powerful and simple approach to document blog postings. It enables one to write in a plain text format and transfers them into clean, web-standard HTMLs.

Headings

Headings start with a # symbol. The largest one comes with one # and the second large one with 2 #.

# Markdown Syntax
## Headings

Paragraphs

Like latex, new paragraph is seperated with the previous one using a blank row in between.

This is another paragraph.

Bold and itshape

Bold on one word or serveral words together can be achieved by adding ** on two sides.

itshape is done by adding one * on each side of the word.

this is a **bold** 
this is a *itshape*

Latex

this is a post for using latex code in octopress $\begin{align} \mbox{Union: } & A\cup B = \{x\mid x\in A \mbox{ or } x\in B\} \\ \int_{t=1}^\alpha & \bigg \{xy\mid x\in A \mbox{ and } y\in B\bigg\} \\ \mbox{Star: } & A^\star = \{x_1x_2\ldots x_k \mid k\geq 0 \mbox{ and each } x_i\in A\} \\ \end{align}$

$$
\begin{align}
\mbox{Union: } & A\cup B = \{x\mid x\in A \mbox{ or } x\in B\} \\
\int_{t=1}^\alpha & \bigg \{xy\mid x\in A \mbox{ and } y\in B\bigg\} \\
\mbox{Star: } & A^\star  = \{x_1x_2\ldots x_k \mid  k\geq 0 \mbox{ and each } x_i\in A\} \\
\end{align}
$$

Start

nothing

This is the first post. I will try to maintain this blog as often as I can. Hopefully, it grows as my PhD goes.