Mathematical and statistical software often relies on sequential computations. Examples are **likelihood evaluations**, where it typically is necessary to loop over the rows of the data, or solving **ordinary differential equations**, where numerical approximations are based on looping over the evolving time. When using high-level languages such as `R` or `python`, such calculations can be very slow unless the algorithms can be vectorized. Fortunately, it is straightforward to implement the computations in `C/C++` and subsequently make an interface to `R` and `python` (using `Rcpp` and `pybind11`, respectively).
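The cost of interpreted loops is easy to demonstrate; below is a small sketch comparing a row-by-row Gaussian log-likelihood with its vectorized numpy equivalent (the function names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

def loglik_loop(x, mu=0.0, sigma=1.0):
    """Gaussian log-likelihood via an explicit Python loop over the rows."""
    total = 0.0
    for xi in x:
        total += -0.5 * np.log(2 * np.pi * sigma**2) - (xi - mu)**2 / (2 * sigma**2)
    return total

def loglik_vectorized(x, mu=0.0, sigma=1.0):
    """Same computation as a single vectorized numpy expression."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2))

print(np.isclose(loglik_loop(x), loglik_vectorized(x)))  # True
```

The two functions return the same value, but on large data the loop is orders of magnitude slower, which is what motivates moving such loops to `C/C++`.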

Emacs is my preferred tool for development, statistical computing, and writing. The killer feature is without doubt https://orgmode.org/, which allows for powerful literate programming, handling of bibliographies, organizing note collections, and much more.

I finally decided to make the move from my vanilla Emacs configuration to Doom Emacs. This gives an optimized Emacs experience with fast load times due to lazy loading of packages, and, more importantly, the maintainer does an amazing job of adapting to new features and changes in Emacs and the various Lisp packages, a task which I found increasingly time-consuming.

The **Cantor ternary set** is a remarkable subset of the real numbers
named after German mathematician Georg Cantor who described the set
in 1883. It has the same cardinality as \(\mathbb{R}\), yet it
has zero Lebesgue measure.

The set can be constructed recursively by first considering the unit
interval \(C_{0}=[0,1]\). In the next step, we divide the set into
three equal parts and discard the open middle set. This leads to the
new set \(C_{1}=[0,\tfrac{1}{3}]\cup[\tfrac{2}{3},1]\). This procedure
is repeated on the two remaining subintervals \([0,\tfrac{1}{3}]\) and
\([\tfrac{2}{3},1]\), and iteratively on all remaining subsets such
that \(C_n = \tfrac{1}{3}C_{n-1} \cup (\tfrac{2}{3} +
\tfrac{1}{3}C_{n-1})\). The **Cantor set** is defined by the limit as
\(n\to\infty\), i.e., \(\mathcal{C} = \cap_{n=0}^{\infty} C_{n}\).

The recursion can be illustrated in Python in the following way:

```python
import numpy as np

def transform_interval(x, scale=1.0, translation=0.0):
    return tuple(map(lambda z: z * scale + translation, x))

def Cantor(n):
    if n == 0:
        return {(0, 1)}
    Cleft = set(map(lambda x: transform_interval(x, scale=1/3), Cantor(n-1)))
    Cright = set(map(lambda x: transform_interval(x, translation=2/3), Cleft))
    return Cleft.union(Cright)
```
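As a quick sanity check (restating the functions above so the snippet runs on its own): \(C_n\) consists of \(2^n\) intervals with total length \((2/3)^n\), consistent with the Cantor set having Lebesgue measure zero.

```python
def transform_interval(x, scale=1.0, translation=0.0):
    return tuple(z * scale + translation for z in x)

def Cantor(n):
    if n == 0:
        return {(0, 1)}
    Cleft = {transform_interval(x, scale=1/3) for x in Cantor(n - 1)}
    return Cleft | {transform_interval(x, translation=2/3) for x in Cleft}

def total_length(intervals):
    """Sum of the lengths of a collection of (a, b) interval tuples."""
    return sum(b - a for (a, b) in intervals)

print(len(Cantor(3)))          # 2**3 = 8 intervals
print(total_length(Cantor(3))) # (2/3)**3 = 8/27 ≈ 0.296
```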

ML inference in non-linear SEMs is complex. Computationally intensive methods based on numerical integration are needed, and results are sensitive to distributional assumptions.

In a recent paper, **A two-stage estimation procedure for non-linear structural equation models** by *Klaus Kähler Holst & Esben Budtz-Jørgensen* (https://doi.org/10.1093/biostatistics/kxy082), we consider two-stage estimators as a **computationally simple alternative to MLE**. Here both steps are based on linear models: first we predict the non-linear terms, and then these are related to latent outcomes in the second step.
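The two-stage idea can be sketched on simulated toy data (my own illustration; it uses a crude indicator average in place of proper stage-one factor-score predictions and omits the paper's bias corrections):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Toy non-linear SEM: latent eta, two noisy indicators, and an outcome
# depending on eta and the non-linear term eta^2.
eta = rng.normal(size=n)
x1 = eta + 0.3 * rng.normal(size=n)
x2 = eta + 0.3 * rng.normal(size=n)
y = 1.0 + 0.5 * eta + 0.5 * eta**2 + 0.5 * rng.normal(size=n)

# Stage 1 (linear): predict the latent variable; here a simple average of
# the indicators stands in for the factor-score predictions.
eta_hat = (x1 + x2) / 2

# Stage 2 (linear): regress the outcome on the predicted non-linear terms.
Z = np.column_stack([np.ones(n), eta_hat, eta_hat**2])
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(beta_hat)  # roughly [1, 0.5, 0.5], attenuated by measurement error
```

Both stages are ordinary linear regressions, which is what makes the procedure computationally simple compared to numerically integrated ML.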

Logic gates are the building blocks of digital electronics. Simple logic gates are efficiently implemented in various IC packages such as the 74HCXX series. However, it is educational to have a look at the implementation using just NPN transistors.

The 74HC165 and 74HC595 are useful integrated circuits for dealing with multiple digital inputs and outputs. In a recent project, I used the following prototype based on a variant of the simple circuit schema above:

The four 74HC595 ICs give access to 2 x 16 output bits through IDC connectors, controllable using just three pins on the microcontroller (DATA, CLOCK, LATCH). Similarly, two 74HC165 ICs give access to 16 input bits through another IDC connector.

A small illustration of using the `armadillo` C++ linear algebra library for solving an ordinary differential equation of the form
\[ X'(t) = F(t,X(t),U(t)).\]

The abstract super class `Solver` defines the methods `solve` (for approximating the solution in user-defined time points) and `solveint` (for interpolating user-defined input functions on a finer grid). As an illustration, a simple Runge-Kutta solver is derived in the class `RK4`.

The first step is to define the ODE, here a simple one-dimensional ODE \(X'(t) = \theta\cdot\{U(t)-X(t)\}\) with a single input \(U(t)\):

```cpp
// X'(t) = theta*(U(t) - X(t))
rowvec dX(const rowvec &input,   // time (first element) and additional input variables
          const rowvec &x,       // state variables
          const rowvec &theta) { // parameters
  rowvec res = { theta(0)*(input(1) - x(0)) };
  return res;
}
```

The ODE may then be solved using the following syntax:

```cpp
odesolver::RK4 MyODE(dX);
arma::mat res = MyODE.solve(input, init, theta);
```

with the **step size** defined implicitly by `input` (the first column is the time variable and the following columns are the optional input variables) and the **boundary conditions** defined by `init`.
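Before committing to C++, the same scheme is easy to prototype; below is a minimal classical fourth-order Runge-Kutta sketch in Python for the example ODE (my own illustration, not the `odesolver` API), checked against the closed-form solution for a constant input:

```python
import numpy as np

def rk4(f, t, x0, theta):
    """Classical RK4 on the time grid t; the step size comes from the grid."""
    x = np.empty(len(t))
    x[0] = x0
    for i in range(len(t) - 1):
        h = t[i+1] - t[i]
        k1 = f(t[i],       x[i],          theta)
        k2 = f(t[i] + h/2, x[i] + h*k1/2, theta)
        k3 = f(t[i] + h/2, x[i] + h*k2/2, theta)
        k4 = f(t[i] + h,   x[i] + h*k3,   theta)
        x[i+1] = x[i] + h*(k1 + 2*k2 + 2*k3 + k4)/6
    return x

U = lambda t: 1.0                          # constant input
f = lambda t, x, theta: theta*(U(t) - x)   # X'(t) = theta*(U(t) - X(t))

t = np.linspace(0, 5, 501)
x = rk4(f, t, x0=0.0, theta=2.0)
exact = 1.0 - np.exp(-2.0*t)               # closed form for constant input
print(np.max(np.abs(x - exact)))           # small global error, O(h^4)
```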

Assume that two positive numbers are given, \(X\) and \(Y\), with unknown joint
probability distribution \(P\), and \(X\neq Y\) *a.s.*

A player draws one of the numbers at random and has to guess whether it is smaller or larger than the other, unrevealed number; i.e., let \(U\sim \mathrm{Bernoulli}(\tfrac{1}{2})\) be independent of \(X, Y\); then the player sees \(Z_{1} = UX + (1-U)Y,\) while \(Z_{2} = (1-U)X + UY\) remains unseen.

A random guess (coin flip) would, due to the sampling of \(U\) independently of \(P\), have probability \(\tfrac{1}{2}\) of being correct. The question is whether we can find a better strategy.
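One classical answer (due to Thomas Cover) compares the revealed number with an independent random threshold with full support, guessing "larger" exactly when the revealed number exceeds the threshold. A quick simulation under one arbitrary choice of \(P\) and threshold distribution (all distributional choices here are mine, for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# An arbitrary joint distribution P for (X, Y) with X != Y a.s.
x = rng.exponential(1.0, size=n)
y = rng.exponential(2.0, size=n)

u = rng.integers(0, 2, size=n)       # U ~ Bernoulli(1/2), independent of (X, Y)
z1 = np.where(u == 1, x, y)          # the revealed number
z2 = np.where(u == 1, y, x)          # the unseen number

# Strategy: draw an independent threshold T with support on (0, inf)
# and guess that Z1 is the larger number iff Z1 > T.
T = rng.exponential(1.0, size=n)
guess_larger = z1 > T
correct = np.where(guess_larger, z1 > z2, z1 < z2)
print(correct.mean())                # strictly above 1/2
```

The strategy wins whenever the threshold falls between the two numbers (and otherwise performs like a coin flip), so its success probability exceeds \(\tfrac{1}{2}\) for any \(P\) with \(X \neq Y\) a.s.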

The 74HC595 is an 8-bit serial-in, serial- or parallel-out shift register with a storage register and 3-state outputs.

If a higher output load is required, there is also the TPIC6C595 (e.g., for driving LEDs), or the 74HC595 can be paired with, for example, a ULN2803 Darlington array or similar. For multiple inputs, see the 74HC165.

The basic usage is to serially transfer a byte from a microcontroller to the IC. When latched, the byte becomes available in parallel on the output pins QA-QH (Q0-Q7).
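The protocol is simple enough to model in a few lines; here is a software model of the shift/latch behavior (my own sketch, with pin names as in the datasheet):

```python
class HC595:
    """Minimal model of a 74HC595: shift on clock, copy to outputs on latch."""

    def __init__(self):
        self.shift_reg = [0] * 8   # internal shift register (QA..QH stages)
        self.storage = [0] * 8     # storage register driving the output pins

    def clock(self, data_bit):
        """Rising edge on the shift clock: DATA enters at QA, bits move toward QH."""
        self.shift_reg = [data_bit & 1] + self.shift_reg[:-1]

    def latch(self):
        """Rising edge on the storage clock: copy the shift register to the outputs."""
        self.storage = list(self.shift_reg)

    def outputs(self):
        """Current state of the output pins QA..QH."""
        return list(self.storage)

def shift_out(ic, byte):
    """Send one byte MSB first, then latch (the MSB ends up at QH)."""
    for i in range(7, -1, -1):
        ic.clock((byte >> i) & 1)
    ic.latch()

ic = HC595()
shift_out(ic, 0b10110001)
print(ic.outputs())  # [1, 0, 0, 0, 1, 1, 0, 1], i.e. QA..QH with the MSB at QH
```

Sending MSB first puts the most significant bit at QH, which is also what makes daisy-chaining via the serial output pin work naturally.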

$
\newcommand{\pr}{\mathbb{P}}\newcommand{\E}{\mathbb{E}}
$
**Relative risks** (and risk differences) are **collapsible**
and generally considered easier to interpret than odds ratios. In a
recent publication, Richardson et al. (JASA, 2017) proposed a new
regression model for a binary exposure which solves the computational
problems associated with using, for example, binomial regression with
a log link function (or identity link for the risk difference) to
obtain such parameter estimates.

Let \(Y\) be the **binary response**, \(A\) **binary exposure**, and \(V\) a **vector of covariates**, then the target parameter is

\begin{align*} &\mathrm{RR}(v) = \frac{\pr(Y=1\mid A=1, V=v)}{\pr(Y=1\mid A=0, V=v)}. \end{align*}

Let \(p_a(V) = \pr(Y=1 \mid A=a, V)\), \(a\in\{0,1\}\); the idea is then to
posit a linear model for \[ \theta(v) = \log \big(\mathrm{RR}(v)\big) \] and a
**nuisance model** for the log odds-product \[ \phi(v) =
\log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right), \]
noting that these two parameters are **variation independent**, as can be seen from the L'Abbé plot below. Similarly, a model can be constructed for the
risk difference on the scale
\[\theta(v) = \mathrm{arctanh} \big(\mathrm{RD}(v)\big).\]
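Variation independence means that any pair \((\theta,\phi)\) corresponds to valid risks \((p_0, p_1)\); the inverse map has no closed form but is easy to compute by bisection, since \(\phi = \operatorname{logit}(p_0) + \operatorname{logit}(p_1)\) is strictly increasing in \(p_0\) for fixed \(\theta\) (a sketch of the reparametrization, not the authors' implementation):

```python
import numpy as np

def risks_from(theta, phi, iterations=100):
    """Recover (p0, p1) from the log relative risk theta and log odds-product phi.

    Uses that logit(p0) + logit(p0*exp(theta)) is strictly increasing in p0,
    so phi determines a unique root in (0, min(1, exp(-theta))).
    """
    logit = lambda p: np.log(p) - np.log(1 - p)
    lo, hi = 0.0, min(1.0, np.exp(-theta))   # keeps p1 = p0*exp(theta) in (0, 1)
    for _ in range(iterations):              # bisection on the midpoints only
        p0 = (lo + hi) / 2
        if logit(p0) + logit(p0 * np.exp(theta)) < phi:
            lo = p0
        else:
            hi = p0
    return p0, p0 * np.exp(theta)

# Round-trip check with p0 = 0.2, p1 = 0.3:
theta = np.log(0.3 / 0.2)
phi = np.log((0.2 * 0.3) / (0.8 * 0.7))
print(risks_from(theta, phi))  # approximately (0.2, 0.3)
```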