Quite a while ago, I wrote a post on how to compute an arbitrarily large Fibonacci number in essentially $O(\log n)$ steps. It turns out that this technique generalizes to other linear recurrence relations, of any order.
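As a quick sketch of the underlying idea (my own illustration, not the original post's code), the Fibonacci recurrence can be written as a power of a $2\times 2$ matrix, and repeated squaring computes that power in $O(\log n)$ matrix multiplications:

```python
# Fibonacci via 2x2 matrix exponentiation by repeated squaring:
# [[1,1],[1,0]]^n = [[F(n+1), F(n)], [F(n), F(n-1)]]

def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]),
        (a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]),
    )

def fib(n):
    """Return F(n), with F(0)=0 and F(1)=1, in O(log n) matrix products."""
    result = ((1, 0), (0, 1))       # identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n & 1:                   # this bit of n is set:
            result = mat_mul(result, base)
        base = mat_mul(base, base)  # square for the next bit
        n >>= 1
    return result[0][1]             # F(n)
```

The same scheme works for any order-$k$ linear recurrence by replacing the $2\times 2$ matrix with the recurrence's $k\times k$ companion matrix.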
We don’t usually think of linear algebra as being compatible with derivatives, but it is very useful to know the derivative with respect to certain elements in order to adjust them, basically using gradient descent.
We might therefore ask ourselves questions like: how do I compute the derivative of a scalar product with respect to one of the vector’s elements? Of a matrix-vector product? Through a matrix inverse? Through a product of matrices? Let’s answer the first two questions for now.
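For the first two questions, the standard results (stated here for reference, writing the products in summation form) are

$$\frac{\partial}{\partial x_k}\,\mathbf{a}^\top\mathbf{x}=\frac{\partial}{\partial x_k}\sum_i a_i x_i = a_k, \qquad \frac{\partial}{\partial x_k}\,(A\mathbf{x})_i=\frac{\partial}{\partial x_k}\sum_j A_{ij}x_j = A_{ik},$$

since in each sum only the term containing $x_k$ survives differentiation.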
A Taylor series for a function $f$ that is $n$ times differentiable around $x=a$ is given by

$$f(x)=\sum_{k=0}^{n}\frac{f^{(k)}(a)}{k!}(x-a)^k+R_n(x),$$

where $f^{(k)}(a)$ is the $k$th derivative of $f$ at $a$, and $R_n(x)$ is the remainder.
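As a quick numerical sanity check (a sketch of mine, not from the original post), we can evaluate the truncated series for $e^x$ around $a=0$, where every derivative satisfies $f^{(k)}(0)=1$:

```python
import math

def taylor_exp(x, n):
    """Truncated Taylor series of e^x around 0: sum_{k=0}^{n} x^k / k!."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

# The truncation error shrinks rapidly as n grows.
print(taylor_exp(1.0, 15))  # close to e = 2.71828...
```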
Have you ever wondered where the coefficients in a Taylor series come from? Well, let’s see!
Getting good text data for language-model training isn’t as easy as it sounds. First, you have to find a large corpus. Second, you must clean it up!
Von Neumann proposed the middle square method of generating pseudo-random numbers in 1949, in a paper published a bit later. The method is simple: you take a seed, say 4 digits long, you square it, zero-pad the square to 8 digits, and extract the middle 4 digits, which become the next seed. For example, squaring 1234 gives 1522756, or 01522756 zero-padded, whose middle four digits are 5227.
While it seems random enough, is it?
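The iteration is easy to sketch (my own illustration, with 4-digit seeds and the square zero-padded to 8 digits):

```python
def middle_square(seed):
    """One step of von Neumann's middle square method for 4-digit seeds:
    square the seed, zero-pad the square to 8 digits, keep the middle 4."""
    sq = str(seed * seed).zfill(8)
    return int(sq[2:6])

# A few steps starting from 1234: 1234 -> 5227 -> 3215 -> ...
x = 1234
for _ in range(5):
    x = middle_square(x)
```

Note that some seeds degenerate quickly: 0 maps to 0 forever, and short cycles are common, which already hints at the answer.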
Last week, we noticed a fun connection between lattices and fractions, which helped us get rational approximations to real numbers. Since only lattice points close to the (real-sloped) line are of interest, only the fractions those points represent are candidates for rational approximation, and the closer a point is to the line, the better the approximation.
But what if we find a point real close to the line? What information can we use to refine our guess?
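One classical refinement (a sketch using the Stern-Brocot mediant; the original post may proceed differently) keeps two lattice points bracketing the line and replaces the bracket with their mediant, the lattice point obtained by adding coordinates, which always lies strictly between them:

```python
import math

def mediant_approx(x, steps):
    """Refine a bracket lo < x < hi by repeated mediants (Stern-Brocot
    descent); fractions are (numerator, denominator) pairs."""
    lo, hi = (int(x), 1), (int(x) + 1, 1)     # integer bracket around x
    visited = []
    for _ in range(steps):
        med = (lo[0] + hi[0], lo[1] + hi[1])  # mediant lies strictly between
        visited.append(med)
        if med[0] / med[1] < x:
            lo = med                          # mediant below x: tighten from the left
        else:
            hi = med                          # mediant above x: tighten from the right
    return lo, hi, visited

# 40 mediant steps around pi visit 22/7, 333/106 and 355/113 along the way.
lo, hi, visited = mediant_approx(math.pi, 40)
```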
Finding rational approximations to real numbers may help us simplify calculations in everyday life, because using a simple fraction (say, $\pi\approx 22/7$) makes back-of-the-envelope estimations much easier. It may also have applications in programming, when your CPU is kind of weak and does not deal well with floating point numbers. Floating point emulated in software is very slow, so if we can dispense with it and use integer arithmetic instead, all the better.
However, finding good rational approximations to an arbitrary constant is not quite as trivial as it may seem. Indeed, we may think that using something like

$$\pi \approx \frac{3141592}{1000000}$$

will be quite sufficient, as it gives us 6 digits of precision, but why use 3141592/1000000 when 355/113 gives you better precision? Certainly, we must find a better way of finding approximations that are simultaneously precise and… well, let’s say cute. Let’s see what we could do.
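One classical answer, sketched here with my own code rather than the post's, is the continued fraction expansion: its convergents are best rational approximations, and both 22/7 and 355/113 show up among the convergents of $\pi$:

```python
import math

def convergents(x, n):
    """First n continued-fraction convergents of x, as (numerator,
    denominator) pairs, via the recurrence h_k = a_k*h_{k-1} + h_{k-2}."""
    h_prev, h = 0, 1   # h_{-2}, h_{-1}
    k_prev, k = 1, 0   # k_{-2}, k_{-1}
    out = []
    for _ in range(n):
        a = int(x)                  # next partial quotient
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        out.append((h, k))
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac                # recurse into the fractional part
    return out

# Convergents of pi: 3/1, 22/7, 333/106, 355/113, ...
print(convergents(math.pi, 4))
```

Each convergent is closer to the target than any fraction with a smaller denominator, which is exactly the "precise and cute" property we are after.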