Recently on Freenode channel ##cpp, I saw some code using an include-all-you-can header. The idea was to help beginners to the language, help them start programming without having to remember which header was which, and which headers are needed.
Is that really a good idea?
We don’t usually think of linear algebra being compatible with derivatives, but it very useful to be able to know the derivative of certain elements in order to adjust them—basically using gradient-descent.
We might therefore ask ourselves questions like, how do I compute the derivative of a scalar product relative to one of the vector elements? Of a matrix-vector product? Through a matrix inverse? Through a product of matrices? Let’s answer the first two questions for now.
A Taylor series for a function around that is times differentiable is given by
where is the th derivative of at .
Have you ever wondered where the coefficients in a Taylor series come from? Well, let’s see!
Getting good text data for language-model training isn’t as easy as it sounds. First, you have to find a large corpus. Second, you must clean it up!
About a week ago, some dude drops on IRC that he’s beat memcpy “by a lot”. That’d be interesting, except that we couldn’t get neither code nor test methodology out of him. But, how hard can making a better memcpy be? Turns out, harder than you think!
If you think this is a typical case of “reinventing the wheel”, I mostly agree with you. But while reinventing will be hard, can improvements be made?