5. Taylor’s approximation

In what follows, fix some $x,\delta\in\mathbb{U}$ with $x>\mathbb{R}$ and $x+\delta>\mathbb{R}$. We shall abbreviate $\frac{f^{\prime}}{f}$ with $f^{\dagger}$, provided $f\neq 0$. Note that $f^{\dagger}=(\log|f|)^{\prime}$.

Proposition 5.1.

For all non-zero $f\in\mathbb{R}\langle\!\langle T\rangle\!\rangle$, if $f\not\asymp 1$, $x+\delta\asymp x$, and $(f^{\dagger}\circ x)\delta\preceq 1$, then

\[
f\circ(x+\delta)\asymp f\circ x\quad\text{and}\quad f\circ(x+\delta)-f\circ x\asymp(f^{\prime}\circ x)\delta.
\]

In particular, if $(f^{\dagger}\circ x)\delta\prec 1$, then $f\circ(x+\delta)\sim f\circ x$.

Proof.

Note that $x+\delta\asymp x$ implies $\delta\preceq x$. We work by induction on $\mathrm{ER}(f)$. For the base case, suppose that $f=\log^{\circ n}(T)$ for some $n\in\mathbb{N}$. Both conclusions are trivial for $n=0$. For $n=1$, we know that

\[
\log(x+\delta)-\log(x)=\log\left(1+\frac{\delta}{x}\right)\asymp\frac{\delta}{x}=(f^{\prime}\circ x)\delta.
\]

Note that $\frac{\delta}{x}\preceq 1\prec\log(x)$, so in particular $\log(x+\delta)\sim\log(x)$. For $n>1$, recall that the Taylor series of $\log$ implies that $\log(y+\varepsilon)-\log(y)\sim\frac{\varepsilon}{y}$ when $\varepsilon\prec y$. Using $y=\log^{\circ n-1}(x)$, $\varepsilon=\log^{\circ n-1}(x+\delta)-y$, we can verify inductively that $\varepsilon\prec y$, or in other words $\log^{\circ n-1}(x+\delta)\sim\log^{\circ n-1}(x)$, and

\[
\begin{split}
\log(\log^{\circ n-1}(x+\delta))-\log(\log^{\circ n-1}(x))&\sim\frac{\log^{\circ n-1}(x+\delta)-\log^{\circ n-1}(x)}{\log^{\circ n-1}(x)}\\
&\sim\frac{\log(x+\delta)-\log(x)}{\log(x)\cdots\log^{\circ n-1}(x)}\asymp\frac{\delta}{x\log(x)\cdots\log^{\circ n-1}(x)}=(f^{\prime}\circ x)\delta.
\end{split}
\]

For $\mathrm{ER}(f)>0$, we first prove the conclusions for $f=e^{\gamma}$ with $\gamma\in\mathbb{J}^{\neq 0}$. Suppose that $(f^{\dagger}\circ x)\delta\preceq 1$; since $f^{\dagger}=\frac{e^{\gamma}\gamma^{\prime}}{e^{\gamma}}=\gamma^{\prime}$, we have $(\gamma^{\prime}\circ x)\delta\preceq 1$, and in particular $(\gamma^{\dagger}\circ x)\delta\prec 1$, as $\gamma\succ 1$. By the inductive hypothesis, $\gamma\circ(x+\delta)-\gamma\circ x\asymp(\gamma^{\prime}\circ x)\delta\preceq 1$. Since $e^{y}-1\asymp y$ for $y\preceq 1$, we get

\[
e^{\gamma}\circ(x+\delta)-e^{\gamma}\circ x=e^{\gamma\circ x}\left(e^{\gamma\circ(x+\delta)-\gamma\circ x}-1\right)\asymp e^{\gamma\circ x}(\gamma^{\prime}\circ x)\delta=((e^{\gamma})^{\prime}\circ x)\delta.
\]

Since $e^{y}\asymp 1$ for $y\preceq 1$, we similarly find that

\[
e^{\gamma}\circ(x+\delta)=e^{\gamma\circ x}e^{\gamma\circ(x+\delta)-\gamma\circ x}\asymp e^{\gamma\circ x}=e^{\gamma}\circ x.
\]

For general $f$, let $re^{\gamma}$ be the leading term of $f$. Note that by assumption $\gamma\neq 0$. Since $f\sim re^{\gamma}$, we have

\[
f\circ(x+\delta)\sim re^{\gamma}\circ(x+\delta)\asymp re^{\gamma}\circ x\sim f\circ x.
\]

Since $f\not\asymp 1$, we also have $f^{\prime}\sim(re^{\gamma})^{\prime}$ and $f^{\dagger}\sim{(re^{\gamma})}^{\dagger}=\gamma^{\prime}$, hence by Corollary 4.4,

\[
f\circ(x+\delta)-f\circ x\sim re^{\gamma}\circ(x+\delta)-re^{\gamma}\circ x\asymp r((e^{\gamma})^{\prime}\circ x)\delta\sim(f^{\prime}\circ x)\delta.
\]

When $(f^{\dagger}\circ x)\delta\prec 1$, or in other words $(f^{\prime}\circ x)\delta\prec f\circ x$, we find that indeed $f\circ(x+\delta)\sim f\circ x$. ∎
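
As a quick sanity check of Proposition 5.1 (an illustration added here, not part of the original argument), take $f=e^{T}$ and $\delta=1$. Then $f^{\dagger}=1$, so $(f^{\dagger}\circ x)\delta=1\preceq 1$, and $x+1\asymp x$ because $x>\mathbb{R}$. Assuming only that composition with $e^{T}$ is the exponential, both conclusions can be verified by hand:

\[
e^{x+1}=e\cdot e^{x}\asymp e^{x},\qquad e^{x+1}-e^{x}=(e-1)e^{x}\asymp e^{x}=(f^{\prime}\circ x)\delta.
\]

Here $(f^{\dagger}\circ x)\delta\not\prec 1$, and correspondingly $e^{x+1}\not\sim e^{x}$, which suggests that the strict hypothesis in the last clause of the proposition cannot simply be dropped.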

To prove Taylor’s theorem, we now want to iterate the above approximation. To do that, we need to control the assumption $(f^{\dagger}\circ x)\delta\preceq 1$ when we replace $f$ with its derivatives.

Lemma 5.2.

Let $f\in\mathbb{R}\langle\!\langle T\rangle\!\rangle$ be non-zero such that $f\not\asymp T^{k}$ for all $k\in\mathbb{N}$.

  (1) If $f^{\dagger}\succ\frac{1}{T}$, then ${(f^{(n)})}^{\dagger}\sim f^{\dagger}$ for all $n$; in particular, $f^{(n)}\sim{(f^{\dagger})}^{n}f$.

  (2) Otherwise, for some $\ell\in\mathbb{N}$, ${(f^{(n)})}^{\dagger}\asymp\frac{1}{T}$ for all $n\neq\ell-1$, and if $\ell>0$, then ${(f^{(\ell-1)})}^{\dagger}\prec\frac{1}{T}$; in particular, $f^{(n)}\preceq\frac{f}{T^{n}}$ and $f^{(\ell+n)}\asymp\frac{f^{(\ell)}}{T^{n}}$ for all $n$.

Proof.

(1) We have $\frac{1}{f^{\dagger}}=\frac{f}{f^{\prime}}\prec T$, hence

\[
T^{\prime}=1\succ\left(\frac{f}{f^{\prime}}\right)^{\prime}=1-\frac{ff^{\prime\prime}}{{(f^{\prime})}^{2}}=1-\frac{{(f^{\prime})}^{\dagger}}{f^{\dagger}}.
\]

This says that ${(f^{\prime})}^{\dagger}\sim f^{\dagger}$, and in particular also that ${(f^{\prime})}^{\dagger}\succ\frac{1}{T}$. By induction, ${(f^{(n)})}^{\dagger}\sim f^{\dagger}$ for all $n\in\mathbb{N}$. Since $f^{(n+1)}={(f^{(n)})}^{\dagger}f^{(n)}\sim f^{\dagger}f^{(n)}$, we also find $f^{(n)}\sim{(f^{\dagger})}^{n}f$.

(2) Let $r$ be the unique real number such that $f^{\dagger}-\frac{r}{T}\prec\frac{1}{T}$. Then $f^{\prime}T-rf\prec f$. Since $f\not\asymp 1$, we find

\[
f^{\prime\prime}T+f^{\prime}-rf^{\prime}\prec f^{\prime},\quad\text{or in other words}\quad{(f^{\prime})}^{\dagger}-\frac{r-1}{T}\prec\frac{1}{T}.
\]

In particular, ${(f^{\prime})}^{\dagger}\preceq\frac{1}{T}$. Since $f\not\asymp T^{k}$ for all $k$, we have $f^{(k)}\not\asymp 1$ for all $k$, so we deduce by induction that

\[
{\left(f^{(n)}\right)}^{\dagger}-\frac{r-n}{T}\prec\frac{1}{T}.
\]

In turn, ${(f^{(n)})}^{\dagger}\asymp\frac{1}{T}$ for $n\neq r$, and ${(f^{(n)})}^{\dagger}\prec\frac{1}{T}$ if $n=r$, which can only happen if $r\in\mathbb{N}$. Let $\ell=r+1$ if $r\in\mathbb{N}$ and $\ell=0$ otherwise to recover the desired conclusion. For $n\in\mathbb{N}$ we have $f^{(\ell+n+1)}={(f^{(\ell+n)})}^{\dagger}f^{(\ell+n)}\asymp\frac{f^{(\ell+n)}}{T}$, so by induction $f^{(\ell+n)}\asymp\frac{f^{(\ell)}}{T^{n}}$. Similarly, $f^{(n+1)}={(f^{(n)})}^{\dagger}f^{(n)}\preceq\frac{f^{(n)}}{T}$, hence $f^{(n)}\preceq\frac{f}{T^{n}}$. ∎

Two illustrative examples are the following. Take $f=e^{e^{T}}$. In this case, $f^{\dagger}=e^{T}\succ\frac{1}{T}$, and Lemma 5.2 predicts that $f^{(n)}\sim{(f^{\dagger})}^{n}f$. Indeed,

\[
f^{\prime}=e^{T}e^{e^{T}}=f^{\dagger}f,\quad f^{\prime\prime}=e^{T}e^{e^{T}}+e^{2T}e^{e^{T}}\sim{\left(f^{\dagger}\right)}^{2}f,\quad f^{\prime\prime\prime}\sim e^{3T}e^{e^{T}}={\left(f^{\dagger}\right)}^{3}f\quad\dotsc
\]

Now take $f=T\log(T)$. Here $f^{\dagger}=\frac{\log(T)+1}{T\log(T)}\sim\frac{1}{T}$, so we expect to see $f^{(n+1)}\asymp\frac{f^{(n)}}{T}$, with at most one exceptional $f^{(n+1)}\prec\frac{f^{(n)}}{T}$. Indeed,

\[
f^{\prime}=\log(T)+1\sim\frac{f}{T},\quad f^{\prime\prime}=\frac{1}{T}\prec\frac{f^{\prime}}{T},\quad f^{\prime\prime\prime}=-\frac{1}{T^{2}}\asymp\frac{f^{\prime\prime}}{T},\quad f^{\prime\prime\prime\prime}=\frac{2}{T^{3}}\asymp\frac{f^{\prime\prime\prime}}{T}\quad\dotsc
\]
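
A further case, worked here only as an illustration and not taken from the source, is a non-integer exponent, where no exceptional index occurs. Take $f=\sqrt{T}$, so that $f^{\dagger}=\frac{1}{2T}$, and in the notation of the proof of Lemma 5.2(2) we get $r=\frac{1}{2}\notin\mathbb{N}$ and $\ell=0$. A direct computation gives

\[
f^{(n)}=\tfrac{1}{2}\left(\tfrac{1}{2}-1\right)\cdots\left(\tfrac{1}{2}-n+1\right)T^{\frac{1}{2}-n}\asymp\frac{f}{T^{n}},\qquad{\left(f^{(n)}\right)}^{\dagger}=\frac{\frac{1}{2}-n}{T}\asymp\frac{1}{T}\quad\text{for all }n\in\mathbb{N}.
\]

The numerical coefficient never vanishes because $\frac{1}{2}-j\neq 0$ for every integer $j$, which is exactly why no derivative drops below the expected size.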

Corollary 5.3.

Let $f\in\mathbb{R}\langle\!\langle T\rangle\!\rangle$ be non-zero such that $f\not\asymp T^{k}$ for all $k\in\mathbb{N}$. Suppose that $\delta\neq 0$. Then the sequence $(f^{(n)}\circ x)\delta^{n}$ is strictly $\prec$-decreasing when $\delta\prec x$ and $(f^{\dagger}\circ x)\delta\prec 1$, weakly $\prec$-decreasing when $\delta\preceq x$ and $(f^{\dagger}\circ x)\delta\preceq 1$, and eventually strictly $\prec$-increasing otherwise.

Proof.

We apply Lemma 5.2 to the ratio between two successive elements of the sequence:

\[
\frac{(f^{(n+1)}\circ x)\delta^{n+1}}{(f^{(n)}\circ x)\delta^{n}}=\left(\left(\frac{f^{(n+1)}}{f^{(n)}}\right)\circ x\right)\delta=\left({\left(f^{(n)}\right)}^{\dagger}\circ x\right)\delta.
\]

When $f^{\dagger}\succ\frac{1}{T}$, we have ${(f^{(n)})}^{\dagger}\sim f^{\dagger}$ (Lemma 5.2(1)), so

\[
\left({\left(f^{(n)}\right)}^{\dagger}\circ x\right)\delta\sim\left(f^{\dagger}\circ x\right)\delta.
\]

The sequence is then strictly $\prec$-decreasing when $(f^{\dagger}\circ x)\delta\prec 1$, which implies $\delta\prec x$ since $f^{\dagger}\circ x\succ\frac{1}{x}$, weakly $\prec$-decreasing when $(f^{\dagger}\circ x)\delta\preceq 1$, which also implies $\delta\prec x$, and it is strictly $\prec$-increasing otherwise.

When $f^{\dagger}\preceq\frac{1}{T}$, we have ${(f^{(n)})}^{\dagger}\preceq\frac{1}{T}$, with ${(f^{(n)})}^{\dagger}\asymp\frac{1}{T}$ for all but possibly one value of $n$ (Lemma 5.2(2)), hence

\[
\left({\left(f^{(n)}\right)}^{\dagger}\circ x\right)\delta\preceq\frac{\delta}{x},
\]

with equivalence for all but possibly one $n$. The sequence is then strictly $\prec$-decreasing when $\delta\prec x$, which implies $(f^{\dagger}\circ x)\delta\prec 1$ since $f^{\dagger}\circ x\preceq\frac{1}{x}$, weakly $\prec$-decreasing when $\delta\preceq x$, which implies $(f^{\dagger}\circ x)\delta\preceq 1$, and it is eventually strictly $\prec$-increasing otherwise. ∎
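
The three regimes are easy to see in a simple instance (an illustration added here, assuming $f=e^{T}$, for which $f^{\dagger}=1\succ\frac{1}{T}$ and $f^{(n)}=f$ for all $n$). The sequence and the ratio of successive terms are

\[
(f^{(n)}\circ x)\delta^{n}=e^{x}\delta^{n},\qquad\frac{e^{x}\delta^{n+1}}{e^{x}\delta^{n}}=\delta=(f^{\dagger}\circ x)\delta.
\]

So the terms strictly decrease when $\delta\prec 1$, all have the same size when $\delta\asymp 1$ (the weakly decreasing case), and strictly increase when $\delta\succ 1$, for instance when $\delta=\sqrt{x}$.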

Corollary 5.4.

Let $f\in\mathbb{R}\langle\!\langle T\rangle\!\rangle$ be non-zero such that $f\not\asymp T^{k}$ for all $k\in\mathbb{N}$. If $x+\delta\asymp x$ and $(f^{\dagger}\circ x)\delta\preceq 1$, then $f^{(n)}\circ(x+\delta)\asymp f^{(n)}\circ x$ for all $n\in\mathbb{N}$.

Proof.

Note that $\delta\preceq x$. By Corollary 5.3, for every $n\in\mathbb{N}$ we have

\[
{\left({\left(f^{(n)}\right)}^{\dagger}\circ x\right)}\delta=\frac{\left(f^{(n+1)}\circ x\right)\delta^{n+1}}{\left(f^{(n)}\circ x\right)\delta^{n}}\preceq 1,
\]

thus the conclusion follows from Proposition 5.1 applied to each $f^{(n)}$. ∎

Proof of Theorem C.

When $\mathrm{ER}(f)=0$, we have $f=\log^{\circ k}(T)$, and the conclusion follows directly from the Taylor expansion of $\log$. For $\mathrm{ER}(f)>0$, fix some $x$, $\delta$ as in the assumptions. Note that $\delta\preceq x$. Write $f=\sum_{\gamma}r_{\gamma}e^{\gamma}$ for $\gamma$ ranging in $\mathbb{J}$, where by construction $r_{\gamma}\neq 0$ implies $\mathrm{ER}(\gamma)<\mathrm{ER}(f)$. Split $f$ as follows:

\[
f_{0}=\sum_{(\gamma^{\prime}\circ x)\delta\preceq 1}r_{\gamma}e^{\gamma},\quad f_{1}=f-f_{0}=\sum_{(\gamma^{\prime}\circ x)\delta\succ 1}r_{\gamma}e^{\gamma}.
\]

We first show that $f_{1}$ can be ignored. Suppose that $f_{1}\neq 0$. On the one hand, by construction $(f_{1}^{\dagger}\circ x)\delta\sim(\gamma^{\prime}\circ x)\delta\succ 1$, where $e^{\gamma}$ is the leading monomial of $f_{1}$, hence $f_{1}^{\dagger}\succ f^{\dagger}$, thus necessarily $f_{1}\prec f$ and $f_{1}\prec 1$, whence $f_{1}\not\asymp T^{k}$ for all $k\in\mathbb{N}$; since $f\not\asymp T^{k}$ for all $k$, $f^{(n)}\succ f_{1}^{(n)}$. On the other hand, $(f_{1}^{\dagger}\circ x)\succ\frac{1}{\delta}\succeq\frac{1}{x}$, so we cannot have $f_{1}^{\dagger}\preceq\frac{1}{T}$, hence $f_{1}^{\dagger}\succ\frac{1}{T}$, and so $f_{1}^{(n)}\sim{(f_{1}^{\dagger})}^{n}f_{1}$ by Lemma 5.2(1). It follows that

\[
f_{1}\circ x\sim\frac{f_{1}^{(n)}\circ x}{{(f_{1}^{\dagger}\circ x)}^{n}}\prec\frac{f^{(n)}\circ x}{{(f_{1}^{\dagger}\circ x)}^{n}}\prec(f^{(n)}\circ x)\delta^{n}.
\]

Similarly, consider $f_{1}\circ(x+\delta)$. Note that $(f_{1}^{\dagger}\circ(x+\delta))\delta\succ 1$: if not, by Corollary 5.4 applied to $f_{1}$ at $x+\delta$, with $-\delta$ in place of $\delta$, we would get

\[
1\prec(f_{1}^{\dagger}\circ x)\delta=\frac{f_{1}^{\prime}\circ x}{f_{1}\circ x}\delta\asymp\frac{f_{1}^{\prime}\circ(x+\delta)}{f_{1}\circ(x+\delta)}\delta=(f_{1}^{\dagger}\circ(x+\delta))\delta\preceq 1,
\]

a contradiction. Therefore, just as in the previous argument, we find

\[
f_{1}\circ(x+\delta)\sim\frac{f_{1}^{(n)}\circ(x+\delta)}{{(f_{1}^{\dagger}\circ(x+\delta))}^{n}}\prec\frac{f^{(n)}\circ(x+\delta)}{{(f_{1}^{\dagger}\circ(x+\delta))}^{n}}\prec(f^{(n)}\circ(x+\delta))\delta^{n}.
\]

Since $f^{(n)}\circ(x+\delta)\asymp f^{(n)}\circ x$ by Corollary 5.4,

\[
f_{1}\circ(x+\delta)\prec(f^{(n)}\circ x)\delta^{n}.
\]

Therefore,

\[
f\circ(x+\delta)-f\circ x=f_{0}\circ(x+\delta)-f_{0}\circ x+o\left((f^{(n)}\circ x)\delta^{n}\right).
\]

Since $f\asymp f_{0}$, and hence $f^{(n)}\asymp f_{0}^{(n)}$, it is now enough to prove the conclusion with $f_{0}$ in place of $f$. Suppose that $e^{\gamma}$ is a monomial in the support of $f_{0}$, thus $\gamma\in\mathbb{J}$, $(\gamma^{\prime}\circ x)\delta\preceq 1$, and $\mathrm{ER}(\gamma)<\mathrm{ER}(f_{0})\leq\mathrm{ER}(f)$. We distinguish two cases.

If $\gamma\asymp T^{k}$ for some $k\in\mathbb{N}$, then $k>0$, hence $\gamma^{\dagger}\asymp\frac{1}{T}$. For each monomial $\mathfrak{m}$ in the support of $\gamma$, we have $\gamma\succeq\mathfrak{m}\succ 1$, so in particular $\mathfrak{m}^{\dagger}\preceq\gamma^{\dagger}\asymp\frac{1}{T}$, hence $(\mathfrak{m}^{\dagger}\circ x)\delta\preceq\frac{\delta}{x}\preceq 1$. When $\mathfrak{m}\not\asymp T^{d}$ for all $d\in\mathbb{N}$, since $\mathrm{ER}(\mathfrak{m})\leq\mathrm{ER}(\gamma)<\mathrm{ER}(f)$, we may apply the inductive hypothesis to deduce

\[
\mathfrak{m}\circ(x+\delta)=\sum_{i=0}^{n-1}\frac{\mathfrak{m}^{(i)}\circ x}{i!}\delta^{i}+O\left((\mathfrak{m}^{(n)}\circ x)\delta^{n}\right).
\]

When $\mathfrak{m}\asymp T^{d}$ for some $d\in\mathbb{N}$, then in fact $\mathfrak{m}=T^{d}$, hence the above approximation still holds by the binomial theorem and the fact that $\delta\preceq x$. Moreover, in either case $\mathfrak{m}^{(i)}\preceq\frac{\mathfrak{m}}{T^{i}}\preceq T^{k-i}$ (by Lemma 5.2(2) in the former case, trivially in the latter), so $(\mathfrak{m}^{(i)}\circ x)\delta^{i}\preceq 1$ for all $i>0$. By strong linearity of composition and derivation, we can sum all the terms of $\gamma$ to deduce that $(\gamma^{(i)}\circ x)\delta^{i}\preceq 1$ for all $i>0$, and that

\[
\gamma\circ(x+\delta)=\sum_{i=0}^{n-1}\frac{\gamma^{(i)}\circ x}{i!}\delta^{i}+O\left((\gamma^{(n)}\circ x)\delta^{n}\right).
\]

If $\gamma\not\asymp T^{k}$ for all $k\in\mathbb{N}$, then we simply observe that $(\gamma^{\dagger}\circ x)\delta\prec 1$, since $\gamma\succ 1$, so the above equality holds by inductive hypothesis, and $(\gamma^{(i)}\circ x)\delta^{i}\preceq 1$ for $i>0$ by Corollary 5.3.

Therefore,

\[
\begin{split}
f_{0}\circ(x+\delta)&=\sum_{(\gamma^{\prime}\circ x)\delta\preceq 1}r_{\gamma}e^{\gamma\circ(x+\delta)}=\sum_{(\gamma^{\prime}\circ x)\delta\preceq 1}r_{\gamma}e^{\sum_{i=0}^{n-1}\frac{\gamma^{(i)}\circ x}{i!}\delta^{i}+O\left((\gamma^{(n)}\circ x)\delta^{n}\right)}\\
&=\sum_{(\gamma^{\prime}\circ x)\delta\preceq 1}r_{\gamma}e^{\gamma\circ x}\exp\left(\sum_{i=1}^{n-1}\frac{\gamma^{(i)}\circ x}{i!}\delta^{i}+O\left((\gamma^{(n)}\circ x)\delta^{n}\right)\right)\\
&=\sum_{(\gamma^{\prime}\circ x)\delta\preceq 1}\left(\sum_{i=0}^{n-1}\frac{{(r_{\gamma}e^{\gamma})}^{(i)}\circ x}{i!}\delta^{i}+O\left(({(r_{\gamma}e^{\gamma})}^{(n)}\circ x)\delta^{n}\right)\right)\\
&=\sum_{i=0}^{n-1}\frac{f_{0}^{(i)}\circ x}{i!}\delta^{i}+O\left((f_{0}^{(n)}\circ x)\delta^{n}\right),
\end{split}
\]

where on the second line, the argument of $\exp$ is $\preceq 1$, since $(\gamma^{(i)}\circ x)\delta^{i}\preceq 1$ for $i>0$, and so we may use the fact that $e^{y}=1+y+\dotsb+\frac{y^{n-1}}{(n-1)!}+O(y^{n})$ for any $y\preceq 1$ to proceed to the following step. ∎
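
The last chain of equalities can be traced in its simplest instance (an illustration added here, assuming $f_{0}=e^{T}$, that is, a single term with $\gamma=T$ and $r_{\gamma}=1$, and $n\geq 1$). The condition $(\gamma^{\prime}\circ x)\delta\preceq 1$ then reads $\delta\preceq 1$, the derivatives $\gamma^{(i)}$ vanish for $i\geq 2$, and the expansion of $e^{y}$ with $y=\delta$ gives

\[
e^{x+\delta}=e^{x}e^{\delta}=e^{x}\left(\sum_{i=0}^{n-1}\frac{\delta^{i}}{i!}+O(\delta^{n})\right)=\sum_{i=0}^{n-1}\frac{f_{0}^{(i)}\circ x}{i!}\delta^{i}+O\left((f_{0}^{(n)}\circ x)\delta^{n}\right),
\]

where the last step only uses $f_{0}^{(i)}=e^{T}$ for every $i$.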

Remark 5.5.

The conclusion of Theorem C loses its significance at the boundary: if $f^{\dagger}\succ\frac{1}{T}$ and $(f^{\dagger}\circ x)\delta\asymp 1$, then the error terms all have the same size by Lemma 5.2, and the conclusion collapses to $f\circ(x+\delta)=O(f\circ x)$; a comparable remark can be made for $f^{\dagger}\preceq\frac{1}{T}$ and $\delta\succeq x$, where the error terms can get smaller at most once.

Error terms even increase in size if $(f^{\dagger}\circ x)\delta\succ 1$ or $\delta\succ x$. In those cases, the conclusion of Theorem C may or may not be valid depending on $f$. Consider the ‘first-order approximation’

\[
f\circ(x+\delta)-f\circ x=O((f^{\prime}\circ x)\delta).
\]

When $(f^{\dagger}\circ x)\delta\succ 1$, consider $f=e^{T}$, thus assume $\delta\succ 1$. Then the first-order approximation is valid for $f=e^{T}$ if and only if $\delta<0$. More generally, the approximation remains valid if $f\succ 1$ and $\delta<0$, or if $f\prec 1$ and $\delta>0$, since Theorem A implies $f\circ(x+\delta)\preceq f\circ x$.

For $\delta\succ x$, the first-order approximation is valid for $f=\log(T)$ and it fails for $f=\sqrt{T^{3}}$. Analogy with real functions suggests that the approximation is valid for $\delta\succ x$ exactly when $f\preceq T$. This is related to whether $f^{\prime\prime}\geq 0$ implies that $f$ is convex, namely $f\circ(x+\delta)-f\circ x\geq(f^{\prime}\circ x)\delta$ for $\delta\geq 0$. As alluded to in the introduction, this does not seem to follow in a direct way from Theorem A.

The boundary $x+\delta\prec x$ is more subtle: the error terms do not increase in size, but the Taylor approximation may still hold or fail. For example, the first-order approximation fails for $f=\frac{1}{T}$ and $f=\log(T)$ (note that $\delta\asymp x$, so $(f^{\dagger}\circ x)\delta\preceq 1$ in both examples):

\[
\frac{1}{x+\delta}-\frac{1}{x}=-\frac{\delta}{(x+\delta)x}\succ-\frac{\delta}{x^{2}},\quad\log(x+\delta)-\log(x)=\log\left(\frac{x+\delta}{x}\right)\succ 1\asymp\frac{\delta}{x}.
\]

On the other hand, the approximation is valid for $f=\sqrt{T}$:

\[
\sqrt{x+\delta}-\sqrt{x}=\frac{\delta}{\sqrt{x+\delta}+\sqrt{x}}\asymp-\frac{\delta}{2\sqrt{x}}.
\]

We can in fact give a full classification. Assume $x+\delta\prec x$ and $f^{\dagger}\preceq\frac{1}{T}$ (we ignore $f^{\dagger}\succ\frac{1}{T}$, as in that case $(f^{\dagger}\circ x)\delta\preceq 1$ implies $\delta\prec x$). Then the first-order approximation is valid if and only if $f^{\dagger}\asymp\frac{1}{T}$ and $f\succ 1$.

Indeed, suppose that $f^{\dagger}\asymp\frac{1}{T}$. Note that the first-order approximation collapses to $f\circ(x+\delta)=O(f\circ x)$. We have $\log(f)\sim r\log(T)$ for some non-zero $r\in\mathbb{R}$, so

\[
\log(f\circ(x+\delta))-\log(f\circ x)\sim r(\log(x+\delta)-\log(x))\succ 1
\]

by Corollary 4.4. Note that $\log(x+\delta)-\log(x)$ is negative infinite. When $f\prec 1$, we have $r<0$, so $f\circ(x+\delta)\succ f\circ x$, hence the approximation fails. When $f\succ 1$, we have $r>0$, so $f\circ(x+\delta)\prec f\circ x$, hence the approximation is valid.

Now suppose that $f^{\dagger}\prec\frac{1}{T}$. Let $\varepsilon=\frac{f\circ(x+\delta)-f\circ x}{f\circ x}$. We claim that $f$ satisfies the first-order approximation if and only if $\log|f|$ does. Since $(f^{\prime}\circ x)\delta\asymp(f^{\prime}\circ x)x\prec f\circ x$, the approximation for $f$ implies $\varepsilon\prec 1$, while the one for $\log|f|$ implies $\log(1+\varepsilon)\prec 1$. Crucially, in either case $\varepsilon\sim\log(1+\varepsilon)$. The claim follows at once from $(\log|f|)^{\prime}=\frac{f^{\prime}}{f}$. Moreover, ${(\log|f|)}^{\dagger}=\frac{f^{\dagger}}{\log|f|}\prec f^{\dagger}\prec\frac{1}{T}$, since $f\not\asymp 1$. Therefore, to check whether $f$ satisfies the first-order approximation, we may replace $f$ with $\log|f|$ until $f\sim\log^{\circ k}(T)$. Since $f^{\prime}\sim(\log^{\circ k}(T))^{\prime}$, Corollary 4.4 implies that we may further replace $f$ with $\log^{\circ k}(T)$. Applying the same argument in reverse, we may replace $\log^{\circ k}(T)$ with $\log(T)$. The approximation fails for $\log(T)$, hence it fails for the starting $f$.

Corollary 5.6.

Let $f\in\mathbb{R}\langle\!\langle T\rangle\!\rangle$ and $x,\delta\in\mathbb{U}$ with $x>\mathbb{R}$, $x+\delta>\mathbb{R}$, $x+\delta\asymp x$. If $f\asymp T^{k}$ for some $k\in\mathbb{N}$ and $f^{(k+1)}\neq 0$, suppose that $({(f^{(k+1)})}^{\dagger}\circ x)\delta\preceq 1$; otherwise, if $f\neq 0$, suppose that $(f^{\dagger}\circ x)\delta\preceq 1$. Then for all $n\geq 0$,

\[
f\circ(x+\delta)=\sum_{i=0}^{n-1}\frac{f^{(i)}\circ x}{i!}\delta^{i}+O\left((f^{(n)}\circ x)\delta^{n}\right).
\]

Proof.

If $f\not\asymp T^{k}$ for all $k\in\mathbb{N}$ and $f\neq 0$, this is just Theorem C. If $f=0$, the conclusion is trivial.

Now suppose that $f\asymp T^{k}$ for some $k\in\mathbb{N}$. Let $p$ be the sum of all the terms of $f$ that are $\asymp T^{d}$ for some $d\in\mathbb{N}$, where necessarily $d\leq k$. Note that $p\asymp f\asymp T^{k}$. Then $p$ is a polynomial in $T$ of degree $k$, and by construction $g=f-p$ satisfies $g\not\asymp T^{d}$ for all $d\in\mathbb{N}$. In particular, $p^{(k+1)}=0$, so $f^{(k+1)}=g^{(k+1)}$.

The conclusion of Theorem C is valid for $p$ by the binomial theorem and the fact that $\delta\preceq x$ (and for $n>k$, it is true even for $\delta\succ x$, as the error term becomes $0$). If $g=0$, we are done. If $g\neq 0$, we distinguish two cases. If $g^{\dagger}\preceq\frac{1}{T}$, then $\delta\preceq x$ implies $(g^{\dagger}\circ x)\delta\preceq\frac{\delta}{x}\preceq 1$. If $g^{\dagger}\succ\frac{1}{T}$, then ${(f^{(k+1)})}^{\dagger}={(g^{(k+1)})}^{\dagger}\sim g^{\dagger}$ by Lemma 5.2(1), so the assumptions guarantee that $(g^{\dagger}\circ x)\delta\preceq 1$. In either case, we can apply Theorem C to $g$. The conclusion now follows immediately from $f=p+g$ and the observations $f^{(n)}\asymp p^{(n)}\succ g^{(n)}$ for $n\leq k$, $f^{(n)}=g^{(n)}$ for $n>k$. ∎
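
The decomposition $f=p+g$ can be made concrete as follows (an illustration added here, assuming $f=T^{2}+\log(T)$, which is not an example from the source). Then $k=2$, $p=T^{2}$, $g=\log(T)$, and $g^{\dagger}=\frac{1}{T\log(T)}\preceq\frac{1}{T}$, so we are in the first of the two cases. Moreover $f^{\prime\prime\prime}=g^{\prime\prime\prime}=\frac{2}{T^{3}}$ and ${(f^{\prime\prime\prime})}^{\dagger}=-\frac{3}{T}$, so the hypothesis of the corollary reduces to $\delta\preceq x$. For $n=2$ the statement reads

\[
(x+\delta)^{2}+\log(x+\delta)=x^{2}+\log(x)+\left(2x+\frac{1}{x}\right)\delta+O\left(\left(2-\frac{1}{x^{2}}\right)\delta^{2}\right),
\]

which can be checked directly: the polynomial part contributes exactly $\delta^{2}$, while $\log\left(1+\frac{\delta}{x}\right)-\frac{\delta}{x}\preceq 1\prec\delta^{2}$ when $\delta\asymp x$ and is $\sim-\frac{\delta^{2}}{2x^{2}}\prec\delta^{2}$ when $\delta\prec x$.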